I was wondering peoples opinions on the naming of ID columns in database tables.
If I have a table called Invoices with a primary key of an identity column I would call that column InvoiceID so that I would not conflict with other tables and it's obvious what it is.
Where I am workind current they have called all ID columns ID.
So they would do the following:
Select
i.ID
, il.ID
From
Invoices i
Left Join InvoiceLines il
on i.ID = il.InvoiceID
Now, I see a few problems here:
1. You would need to alias the columns on the select
2. ID = InvoiceID does not fit in my brain
3. If you did not alias the tables and referred to InvoiceID is it obvious what table it is on?
What are other peoples thoughts on the topic?
-
We use
InvoiceID
, notID
. It makes queries more readable -- when you seeID
alone it could mean anything, especially when you alias the table toi
.From Jason Cohen -
I do hate the plain id name. I strongly prefer to always use the invoice_id or a variant thereof. I always know which table is the authoritative table for the id when I need to, but this confuses me
SELECT * from Invoice inv, InvoiceLine inv_l where inv_l.InvoiceID = inv.ID SELECT * from Invoice inv, InvoiceLine inv_l where inv_l.ID = inv.InvoiceLineID SELECT * from Invoice inv, InvoiceLine inv_l where inv_l.ID = inv.InvoiceID SELECT * from Invoice inv, InvoiceLine inv_l where inv_l.InvoiceLineID = inv.ID
What's worst of all is the mix you mention, totally confusing. I've had to work with a database where almost always it was foo_id except in one of the most used ids. That was total hell.
Kevin : I have read the word "invoice" too many times in this post. It looks funny nowFrom Vinko Vrsalovic -
For the sake of simplicity most people name the column on the table ID. If it has a foreign key reference on another table, then they explicity call it InvoiceID (to use your example) in the case of joins, you are aliasing the table anyway so the explicit inv.ID is still simpler than inv.InvoiceID
From Mike Brown -
I definitely agree with including the table name in the ID field name, for exactly the reasons you give. Generally, this is the only field where I would include the table name.
From DOK -
You could use the following naming convention. It has its flaws but it solves your particular problems.
- Use short (3-4 characters) nicknames for the table names, i.e. Invoice -
inv
, InvoiceLines -invl
- Name the columns in the table using those nicknames, i.e.
inv_id
,invl_id
- For the reference columns use
invl_inv_id
for the names.
this way you could say
SELECT * FROM Invoice LEFT JOIN InvoiceLines ON inv_id = invl_inv_id
James Curran : ick! I'd vote against using short nicknames for tables (or any other object). With nicknames, you'll never know exactly what the short name is. Remember, there are many way to spell it wrong; there's only one way to spell it right.Kevin : James, I disagree. If you have a short name that isn't descriptive and you can't remember what it is, then you've picked the wrong name or don't understand the naming convention that someone else chose.kristian : I've worked on databases structured like this and, once you get past the ugliness, it's a pretty good system.David B : Nicknames, maybe. Underscores? Just say no.Justice : Use aliases to achieve the same effect. select * from Invoice inv left join InvoiceLines invl on inv.ID = invl.InvoiceIDAdrian : NO no no no. alias the table with a abrievation in the query. But the table name should be a full.mP : Why are so many programmers lazy and seem the answer to everything is to type the least possible just because its too hard to type a bit more.From Ilya Kochetov - Use short (3-4 characters) nicknames for the table names, i.e. Invoice -
-
It's not really important, you are likely to run into simalar problems in all naming conventions.
But it is important to be consistent so you don't have to look at the table definitions every time you write a query.
From Nir -
For the column name in the database, I'd use "InvoiceID".
If I copy the fields into a unnamed struct via LINQ, I may name it "ID" there, if it's the only ID in the structure.
If the column is NOT going to be used in a foreign key, so that it's only used to uniquely identify a row for edit editing or deletion, I'll name it "PK".
From James Curran -
I always prefered ID to TableName + ID for the id column and then TableName + ID for a foreign key. That way all tables have a the same name for the id field and there isn't a redundant description. This seems simpler to me because all the tables have the same primary key field name.
As far as joining tables and not knowing which Id field belongs to which table, in my opinion the query should be written to handle this situation. Where I work, we always prefece the fields we use in a statement with the table/table alias.
From Kevin -
I always try to make id columns descriptive of what they are an ID for usually tablename_ID for a primary key. Then where possible I always use the same id name anywhere that it is a foreign key. The only exceptions are where you might have two fields in a subordinate table that relate to different records in the main table. Then I would add another qualifier to the field to indicate the difference, so if in my Meeting table I have the sales rep and the speaker and both their ids would come from the person table, I would have Rep_Person_ID and Speaker_Person_ID. Under no circumstances would I ever use just ID. That is way too confusing when you are trying to maintain old code.
The legacy database I work with did not do this and it's a pain to write queries because you are always trying to remember what the field is called in each table as there are at least 7 or 8 differnt names that relate to the id in the main table. And at least 5 differnt versions of another common id field. Naming consistency is important for ease of maintenance.
From HLGEM -
If you give each key a unique name, e.g. "invoices.invoice_id" instead of "invoices.id", then you can use the "natural join" and "using" operators with no worries. E.g.
SELECT * FROM invoices NATURAL JOIN invoice_lines SELECT * FROM invoices JOIN invoice_lines USING (invoice_id)
instead of
SELECT * from invoices JOIN invoice_lines ON invoices.id = invoice_lines.invoice_id
SQL is verbose enough without making it more verbose.
Arry : Do you know if SQL Server supports natural join?Steven Huwig : I don't think that it does. According to http://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=153679 it appears that the syntax is slated to be added in some version after SQL Server 2005. I know it works in PostgreSQL and in Oracle.dland : heh, that sounds like the sound of real-life experience :)Steven Huwig : I would only use natural join for ad hoc queries.Arry : SQL Server 2008 still does not have natural join.From Steven Huwig -
I think you can use anything for the "ID" as long as you're consistent. Including the table name is important to. I would suggest using a modeling tool like Erwin to enforce the naming conventions and standards so when writing queries it's easy to understand the relationships that may exist between tables.
What I mean by the first statement is, instead of ID you can use something else like 'recno'. So then this table would have a PK of invoice_recno and so on.
Cheers, Ben
-
Theres been a nerd fight about this very thing in my company of late. The advent of LINQ has made the redundant tablename+ID pattern even more obviously silly in my eyes. I think most reasonable people will say that if you're hand writing your SQL in such a manner as that you have to specify table names to differentiate FKs then it's not only a savings on typing but it adds clarity to your SQL to use just the ID in that you can clearly see which is the PK and which is the FK.
ie. LEFT JOIN Customers ON Employee.ID = Customer.EmployeeID
tells me not only that the two are linked but which is the PK and which is the FK
whereas in the old style you're forced to either look or hope that they were named well.
From Echostorm -
My vote is for InvoiceID for the table ID. I also use the same naming convention when it's used as a foreign key and use intelligent alias names in the queries.
Select Invoice.InvoiceID, Lines.InvoiceLine, Customer.OrgName From Invoices Invoice Join InvoiceLines Lines on Lines.InvoiceID = Invoice.InvoiceID Join Customers Customer on Customer.CustomerID = Invoice.CustomerID
Sure, it's longer than some other examples. But smile. This is for posterity and someday, some poor junior coder is going to have to alter your masterpiece. In this example there is no ambiguity and as additional tables get added to the query, you'll be grateful for the verbosity.
From Rob Allen -
Coming at this from the perspective of a formal data dictionary, I would name the data element
invoice_ID
. Generally, a data element name will be unique in the data dictionary and ideally will have the same name throughout, though sometimes additional qualifying terms may be required based on context e.g. the data element namedemployee_ID
could be used twice in the org chart and therefore qualified assupervisor_employee_ID
andsubordinate_employee_ID
respectively.Obviously, naming conventions are subjective and a matter of style. I've find ISO/IEC 11179 guidelines to be a useful starting point.
For the DBMS, I see tables as collections of entites (except those that only ever contain one row e.g. cofig table, table of constants, etc) e.g. the table where my
employee_ID
is the key would be namedPersonnel
. So straight away theTableNameID
convention doesn't work for me.I've seen the
TableName.ID=PK TableNameID=FK
style used on large data models and have to say I find it slightly confusing: I much prefer an identifier's name be the same throughout i.e. does not change name based on which table it happens to appear in. Something to note is the aforementioned style seems to be used in the shops which add anIDENTITY
(auto-increment) column to every table while shunning natural and compound keys in foreign keys. Those shops tend not to have formal data dictionaries nor build from data models. Again, this is merely a question of style and one to which I don't personally subscribe. So ultimately, it's not for me.All that said, I can see a case for sometimes dropping the qualifier from the column name when the table's name provides a context for doing so e.g. the element named
employee_last_name
may become simplylast_name
in thePersonnel
table. The rationale here is that the domain is 'people's last names' and is more likely to beUNION
ed withlast_name
columns from other tables rather than be used as a foreign key in another table, but then again... I might just change my mind, sometimes you can never tell. That's the thing: data modelling is part art, part science.From onedaywhen -
I prefer DomainName || 'ID'. (i.e. DomainName + ID)
DomainName is often, but not always, the same as TableName.
The problem with ID all by itself is that it doesn't scale upwards. Once you have about 200 tables, each with a first column named ID, the data begins to look all alike. If you always qualify ID with the table name, that helps a little, but not that much.
DomainName & ID can be used to name foreign keys as well as primary keys. When foriegn keys are named after the column that they reference, that can be of mnemonic assistance. Formally, tying the name of a foreign key to the key it references is not necessary, since the referential integrity constrain will establish the reference. But it's awfully handy when it comes to reading queries and updates.
Occasionally, DomainName || 'ID' can't be used, because there would be two columns in the same table with the same name. Example: Employees.EmployeeID and Employees.SupervisorID. In those cases, I use RoleName || 'ID', as in the example.
Last but not least, I use natural keys rather than synthetic keys when possible. There are situations where natural keys are unavailable or untrustworthy, but there are plenty of situations where the natural key is the right choice. In those cases, I let the natural key take on the name it would naturally have. This name often doesn't even have the letters, 'ID' in it. Example: OrderNo where No is an abbreviation for "Number".
From Walter Mitty -
For each table I choose a tree letter shorthand(e.g. Employees => Emp)
That way a numeric autonumber primary key becomes nkEmp.
It is short, unique in the entire database and I know exactly its properties at a glance.
I keep the same names in SQL and all languages I use (mostly C#, Javascript, VB6).
From pkario -
See the Interakt site's naming conventions for a well thought out system of naming tables and columns. The method makes use of a suffix for each table (
_prd
for a product table, or_ctg
for a category table) and appends that to each column in a given table. So the identity column for the products table would beid_prd
and is therefore unique in the database.They go one step further to help with understanding the foreign keys: The foreign key in the product table that refers to the category table would be
idctg_prd
so that it is obvious to which table it belong (_prd
suffix) and to which table it refers (category).Advantages are that there is no ambiguity with the identity columns in different tables, and that you can tell at a glance which columns a query is referring to by the column names.
From flamingLogos -
What I do to keep things consistent for myself (where a table has a single column primary key used as the ID) is to name the primary key of the table
Table_pk
. Anywhere I have a foreign key pointing to that tables primary key, I call the columnPrimaryKeyTable_fk
. That way I know that if I have aCustomer_pk
in my Customer table and aCustomer_fk
in my Order table, I know that the Order table is referring to an entry in the Customer table.To me, this makes sense especially for joins where I think it reads easier.
SELECT * FROM Customer AS c INNER JOIN Order AS c ON c.Customer_pk = o.Customer_fk
From JettGeek -
FWIW, our new standard (which changes, uh, I mean "evolves", with every new project) is:
- Lower case database field names
- Uppercase table names
- Use underscores to separate words in the field name - convert these to Pascal case in code.
pk_
prefix means primary key_id
suffix means an integer, auto-increment IDfk_
prefix means foreign key (no suffix necessary)_VW
suffix for viewsis_
prefix for booleans
So, a table named NAMES might have the fields
pk_name_id, first_name, last_name, is_alive,
andfk_company
and a view calledLIVING_CUSTOMERS_VW
, defined like:SELECT first_name, last_name FROM CONTACT.NAMES WHERE (is_alive = 'True')
As others have said, though, just about any scheme will work as long as it is consistent and doesn't unnecessarily obfuscate your meanings.
From CMPalmer
0 comments:
Post a Comment