The Database Design Resource Center
The Database Design Resource Center
 
 FAQ   Search    Memberlist   Usergroups    Register    Profile   Log in to check your private messages   Log in   The Website
Unique ID as Text or Number

 
Post new topic   Reply to topic    Database Design Resource Forum: Forum Index -> Design for performance: How can we create high-performing databases?
 Display posts from previous 
Display posts from previous:   

 Unique ID as Text or Number 
Author Message
JeanWM
User


Joined: 07 Nov 2007
Posts: 208
Location: Yuc, Mx

PostPosted: Thu Jan 14, 2010 5:14 am    Post subject: Unique ID as Text or Number Reply with quote

As far as I can see:

Numeric Serial ID Fields:
Allows for more accurate sorting --- depends what you call 'accurate'
Can be used in > or < type calculation relationships. --- so can text as long as you understand what is being compared
Does not allow for multi-line entries (multi-line keys, etc) in Relationships. --- but you can coerce a multi-line text at one end to relate to a single line number at the other

Text Serial ID Fields:
Allows for Alpha-Numeric keys. --- you can also do that in a number field
Allows multi-line entries (multi-line keys, etc) in relationships. --- see above
Can use alpha characters to identify table or file of the ID (ORD000001, is from the Order table, etc). --- see above
Needs to be padded with prefixed 0's to allow for correct sorting (or can use a calculated field to GetAsNumber). --- true

One big difference between text and number fields is that the index of a number field is half the size of that of a text field. That may be a consideration for very large record counts.

This may not seem very helpful (but it is only the start of the discussion) - the "possible issues or problems of using a text field as a unique serial ID" depend on what you are going to be using it for.

Can I have some insight from the forum members?
_________________
JeanWM - Moderator
Dar síntomas erróneos, resulta en soluciónes erróneas.
View user's profile Send private message Visit poster's website

 Re: Unique ID as Text or Number 
Author Message
alf-admin
User


Joined: 07 Nov 2007
Posts: 229
Location: Norway

PostPosted: Sun Jan 17, 2010 2:47 am    Post subject: Re: Unique ID as Text or Number Reply with quote

JeanWM wrote:
As far as I can see:

Numeric Serial ID Fields:
Allows for more accurate sorting --- depends what you call 'accurate'
Can be used in > or < type calculation relationships.

Why do we want IDs?: For identification of rows purposes: Nothing else.

Quote:
Text Serial ID Fields:
Allows for Alpha-Numeric keys. --- you can also do that in a number field
Allows multi-line entries (multi-line keys, etc) in relationships. --- see above
Can use alpha characters to identify table or file of the ID (ORD000001, is from the Order table, etc). --- see above
Needs to be padded with prefixed 0's to allow for correct sorting (or can use a calculated field to GetAsNumber). --- true

So you want to transform the ID column to something else? (Adding info to the column?)

Quote:
One big difference between text and number fields is that the index of a number field is half the size of that of a text field. That may be a consideration for very large record counts.

So size would be considered in order to change the meaning of the DB? I think not. Especially today where I can buy 1000 GB storage for 100 USD

Quote:
Can I have some insight from the forum members?

Just my 2 cents Wink
_________________
All the best,
Alf
Forum Admin
Light travels faster than sound: That's why so many people appear bright until you hear them speak...
www.databasedesign-resource.com
The Database Normalization eBook
View user's profile Send private message Visit poster's website

  
Author Message
JeanWM
User


Joined: 07 Nov 2007
Posts: 208
Location: Yuc, Mx

PostPosted: Sun Jan 17, 2010 2:34 pm    Post subject: Reply with quote

Thanks for your 2 cents Alf.

The discussion we had is which type of field to use for the unique identifier of a record or column.

Text or number.

In an application with several tables, should we use a reference to the given table? And why (not).

Or do we have to use a more complicated data structure and use a sort of calculated result.

And how about the label of the field:
should we incorporate a
'p': for primary identifier
'f' : for foreign identifier
'a' : for alternate identifier
'c' : for compound/calculated/concatenated identifier
'm' : for multi-line identifier

As you can see we enlarge a little the concept of 'key' to 'identifier'.

An example could be the identifier of an invoice.
Something like: INV yyy, where INV stands for 'invoice' and yyy a given number construction, which could hold the 'year' and a serial incremented number.
_________________
JeanWM - Moderator
Dar síntomas erróneos, resulta en soluciónes erróneas.
View user's profile Send private message Visit poster's website

  
Author Message
alf-admin
User


Joined: 07 Nov 2007
Posts: 229
Location: Norway

PostPosted: Mon Jan 18, 2010 8:53 am    Post subject: Reply with quote

JeanWM wrote:
An example could be the identifier of an invoice.
Something like: INV yyy, where INV stands for 'invoice' and yyy a given number construction, which could hold the 'year' and a serial incremented number.


Hmm, which is actually a composite key consisting of:

Document type: INV
Year: 4 digits
Serial no. x digits.

But do they need to be part of the key? Not necessarily. The serial no. should be sufficient: According to accounting rules, an inv.no should be unique over the years of the business (at least where I am).

Document type and year are really just two forms of lookup info.

I admit that it could be handy with a composite key for underlying table searches, but not necessary, and ... well, there could be a long discussion about this Laughing
_________________
All the best,
Alf
Forum Admin
Light travels faster than sound: That's why so many people appear bright until you hear them speak...
www.databasedesign-resource.com
The Database Normalization eBook
View user's profile Send private message Visit poster's website

  
Author Message
peter
User


Joined: 09 Nov 2007
Posts: 81
Location: Berlin, Germany

PostPosted: Tue Jan 19, 2010 12:19 am    Post subject: Reply with quote

Jean,

basically it doesn't matter what kind of key you use as long as it is unique. And a serial value should do the job, whether it is numeric or alpha-numeric.

From a data modelling resp. the user's perspective it might be useful to use a (more or less) natural key as your invoice number. Why to introduce an additional ID column if you still have a unique one? And modern database systems can handle alpha-numeric values well (if your tables haven't millions of rows).

I agree with Alf that the serial no. in your INV example is sufficient. But if it gives more meaning to the user if you include INV and the year, why not. I know, this could be a basis for a long debate about avoiding redundancy ...

@Alf: Jean's invoice number scheme is unique over the years, even if it starts counting every year.

From a performance perspective I suggest a numeric value. E.g. for OLAP systems with billions of rows an alpha-numeric key wouldn't be a good idea. String operations are definitely slower than e.g. an integer comparison.

But lately it is also a matter of the amount of working memory (RAM). If your system runs well with alpha-numeric keys and the difference to numeric keys are only some milliseconds then it really doesn't matter. In this case the user requirements should be the decisive factor.

Back to your example: of course you could concatenate the INV and the year to the serial no. in the application only. But to split semantically connected information between database and application is not a good idea, because you have to implement this connection in every application based on the same database. And the semantics can only be recovered if you use the database AND the corresponding application. This violates Codd's integrity independence rule.

The question how to label a field is a matter of taste. E.g. some modelling software generate foreign keys with a "FK_" prefix. If it helps in application programming, why not. But to be honest I never used such prefixes, because a database application developer knows resp. has to know its data model. So he/she knows if an identifier is primary, foreign, compound, etc.
_________________
Peter
Moderator
View user's profile Send private message Visit poster's website

  
Author Message
alf-admin
User


Joined: 07 Nov 2007
Posts: 229
Location: Norway

PostPosted: Tue Jan 19, 2010 9:46 pm    Post subject: Reply with quote

Jean,

Peter is right to the point. Let me complement with a few viewpoints, because the decision on primary keys is a very important one. Also, though I hate that part, due to performance sometimes (I don't like cluttering logic with performance considerations only because of the current physical state of computer performance: Not logical. But I understand I can be taken for a purist, which I am, I guess). Peter has very valid points around the issue.

But in the logical DB design phase, the only thing that is important, is to identify candidate keys. My best experience is when I can identify a natural, single attribute candidate that holds as a primary. Why?

1. A natural key has a natural place in the tuple (row). Very good.
2. A single attribute as key will ensure that your relation is up in the 5th Normal Form directly, provided it is already in 3rd Normal Form: 4th and 5th Normal Forms only deal with the problems of composite primary keys. Great design.

That being said: If you cannot find a single-attribute natural primary key and therefore proceed to assign a sequence, then normalisation must be executed with special care, as it (the sequence) is only introduced because you miss that natural, and dependencies must be vigorously examined. Hope this makes sense, if not, give me a hint.

Invoice numbers are the kind of attributes that can well be assigned by a sequence. Like Order no's etc. It's just a number, with the sole purpose of uniquely identifying something. A manually assigned serial no. did the trick in the old, pre-computer days: Nothing's changed.

peter wrote:
@Alf: Jean's invoice number scheme is unique over the years, even if it starts counting every year.


If that is true (new inv.no start each year), then year must be concatenated with inv.no. I hope this is not the case.

As for naming conventions, I couldn't care less. So many people scream about conventions, but as long as you have consistent and understandable naming rules, all should be fine. Just be sure you don't design badly with badly naming: I once revised a system and there was an attribute, type BLOB, that was named VG. I couldn't figure out what the heck this was, but when I investigated, I was told it was reserved for Various Garbage... Not exactly normalised Rolling Eyes Shocked Laughing

The explanation for this was as follows: They created the VG column to store various info that was not thought of in the initial design, so the column could contain whatever. Since they didn't know what it would be used for, they made it a BLOB. And they hard-coded each meaning in the application.

Hey, I could actually write a whole section on the website about how bad this is Idea Idea

Oh well: It was good money in redesigning the "garbage" system Laughing

Edit: Jean, I assume you are running a discussion with your pupils? If so, just invite them here Wink
_________________
All the best,
Alf
Forum Admin
Light travels faster than sound: That's why so many people appear bright until you hear them speak...
www.databasedesign-resource.com
The Database Normalization eBook
View user's profile Send private message Visit poster's website

  
Author Message
JeanWM
User


Joined: 07 Nov 2007
Posts: 208
Location: Yuc, Mx

PostPosted: Thu Jan 21, 2010 5:06 pm    Post subject: Reply with quote

@ Alf & Peter

Thank you both for the insight.

It's good to have some ideas 'outside' the normal rules.

For Alf: most of my students speak Spanish and have little knowledge of English.

They read the information on your site, but to have them write something on the forum....
They prefer me to 'translate' and post.
Anyway, we're working on it....

You both know perhaps that our core engine is not SQL but FileMaker Pro.
Some details are a little different.

In FM we use keys to build relationships, but we can also use 'identifiers' for a bunch of other things, like scripting.
Another FM specific engine.

And we do not show keys to users, but sometimes we need a meaningful identifier to show to the user, hence the INV...

Can the identifier be the same as the key?
Now that is again open for discussion.
A yes can agree with redundancy....
A no can trigger the 'why not'?

@Peter
Quote:
...because a database application developer knows resp. has to know its data model. So he/she knows if an identifier is primary, foreign, compound, etc.


I agree. But...what if another developer 'has to go in' when the 'original' developer is not available for whatever reason?

When I revise some applications, I stumble more than once over things that I have absolutely no clue what they're doing or why they are there for.
Making whatever change to such application can break something that will not be visible right away and will show up later, usually when it is too late.

And where do we find the data model or/and the database model of an application?

Right, nowhere....or only in the head of the original developer....

Do we, as instructors/teachers/lectures have a task to change attitude ?
To force our students to document, document and document what they're doing?
_________________
JeanWM - Moderator
Dar síntomas erróneos, resulta en soluciónes erróneas.
View user's profile Send private message Visit poster's website

  
Author Message
peter
User


Joined: 09 Nov 2007
Posts: 81
Location: Berlin, Germany

PostPosted: Tue Feb 09, 2010 12:25 am    Post subject: Reply with quote

3 weeks later ... sorry for the delay ... audit period

JeanWM wrote:
I agree. But...what if another developer 'has to go in' when the 'original' developer is not available for whatever reason?

When I revise some applications, I stumble more than once over things that I have absolutely no clue what they're doing or why they are there for.
Making whatever change to such application can break something that will not be visible right away and will show up later, usually when it is too late.

And where do we find the data model or/and the database model of an application?

Right, nowhere....or only in the head of the original developer....

Well, well, it's the old discussion about having a well-documented system, isn't it? Wink But if you worry about this issue some naming conventions won't help either. There is no need to follow naming standards -- if they might exist.

But there is a light at the end of the tunnel. Modern software development methods and tools support the documentation process more and more.

JeanWM wrote:
Do we, as instructors/teachers/lectures have a task to change attitude ?
To force our students to document, document and document what they're doing?


Yes, I think it is our duty to change this attitude. From time to time I have long discussions with my students about the "Being Irreplaceable" attitude, which might be one reason for the lack of documentation. And as experience teaches software developers are not poets. Wink

But it requires practical experience to recognise that the "Being Irreplaceable" attitude has counter-productive effects. Unfortunately I haven't found the silver bullet how to convince students. In my lessons, if students have to develop software, I often act as the client. So the students have to learn that the client also demands a well-documented system.

Of course it would be fine if (potential) clients in real-life would also learn to demand it from their software developers. I think this are two sides of the same coin. The software developers will document the software better and better if the clients demand it more and more. It is a learning process on each side.
_________________
Peter
Moderator
View user's profile Send private message Visit poster's website

  
Author Message
JeanWM
User


Joined: 07 Nov 2007
Posts: 208
Location: Yuc, Mx

PostPosted: Tue Feb 09, 2010 2:57 pm    Post subject: Reply with quote

peter wrote:


Yes, I think it is our duty to change this attitude. From time to time I have long discussions with my students about the "Being Irreplaceable" attitude, which might be one reason for the lack of documentation. And as experience teaches software developers are not poets.


Then we are both on the same line.
I too have those discussions, where I call it the 'Job protection' attitude.
They can move on from one job to another, leaving a not-documented mess behind.
Doing so can mean they will sooner or later land into a same kind of mess they once left behind.
And then that attitude will bite them in their own behind.
Now thŕt will teach them.....

peter wrote:

But it requires practical experience to recognise that the "Being Irreplaceable" attitude has counter-productive effects. Unfortunately I haven't found the silver bullet how to convince students. In my lessons, if students have to develop software, I often act as the client. So the students have to learn that the client also demands a well-documented system.


Indeed, the convincing part is hard.
Therefor I sometimes divide the class in two teams. One team gets the requirements and have to pass it to the other team. That team has to build the application and pass it to the first team.

The assignment for the first team is to make changes to the program.
And I am the one asking specific changes.... Twisted Evil

Both teams have to comment the weaknesses....
It's sometimes amazing what an eye opener that is....

peter wrote:

Of course it would be fine if (potential) clients in real-life would also learn to demand it from their software developers. I think this are two sides of the same coin. The software developers will document the software better and better if the clients demand it more and more. It is a learning process on each side.

Most of the clients just want the software to work. They don't even want to know the how, why and when.
Then we see multi $ projects miserably failing while the client says: 'Oh well....'

We still have a long way to go.....

or, to use a paraphrase of saying by Zen sage Ekai

When an ignorant person understands he may become a teacher. But a teacher may understand only by becoming an ignorant person again.
_________________
JeanWM - Moderator
Dar síntomas erróneos, resulta en soluciónes erróneas.
View user's profile Send private message Visit poster's website

Post new topic   Reply to topic    Database Design Resource Forum: Forum Index -> Design for performance: How can we create high-performing databases? All times are GMT
Page 1 of 1
View previous topic :: View next topic  

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Site Build It!

Copyright © 2004-2008 www.databasedesign-resource.com / Alf A. Pedersen
All rights reserved.
All information contained on this website is for informational purposes only.
*Disclaimer: www.databasedesign-resource.com does not warrant any company, product, service or any content contained herein.

Copyright acknowledgement note:

The name Oracle is a trademark of Oracle Corporation.
The names MS Access/MS SQL Server are trademarks of Microsoft Corporation.
Any other names used on this website may be trademarks of their respective owners, which I fully respect.


Powered by phpBB © 2001, 2005 phpBB Group :: Theme & Graphics by GHS :: phpbb Templates available at GemViper.com :: Contents © Database Design Resource Forum: