DaveWentzel.com            All Things Data

SQL (Anti-)Standards

Lots of data professionals like to write SQL development standards.  That's fine if the point is to note some best practices, but strict adherence to these supposed "standards" within an organization often doesn't make any sense.  Do a Google search for "SQL Development Standards" and you'll find a myriad of blog posts and sample documents that have been instituted at many organizations.  Here are some "standards" that are commonly seen that I think are just plain ridiculous:


Never use SELECT * 


Anytime I see "never" in a standards document I cringe.  "Never" is a strong word.  If SELECT * was so bad why did the ANSI SQL guys give it to us?  When people say not to use SELECT * it is because they saw specific cases where it backfired, namely when the client was looking at the column list ordinally (instead of by name) and a new column was added between two columns (instead of at the end of the table), meaning the ordinal column list changed.  Really then, the standard should be all new columns go at the end of the table so the ordinal positions never change.  Also, clients should really be referencing cols in the result set by name and not ordinal.  


There are other reasons for not using SELECT * that may be legitimate.  But there are even more reasons for utilizing SELECT *.  I've worked on applications where we allow the customer to add "user-defined fields" to screens.  On the fly we add a column to an "outrigger" table that has a 1:0 or 1 relationship to the parent.  In the outrigger we dynamically add the cols the customer desires.  Some metadata helps the application handle data type and custom constraint requirements, as well as how to handle CRUD.  Obviously if the customer wanted to track the data he probably wants to add it to reports as well.  Our reporting views/procedures always LEFT JOIN'd to the outrigger tables and added "outrigger.*" to every SELECT list after all of our "system-defined" reporting columns.  This was very handy and allowed totally flexibility for our customers.  As long as our report developers never referenced columns ordinally it worked perfectly.  


Table names should indicate or not indicate plurality


Some will advocate an Employees table, some would call it Employee.  Advocates of plurality insist a table is a collection of rows so it should be plural, opponents point to logical reasons why tables should be singular.  This becomes a religious debate.  Who cares?  Pick a method and go with it.  


Object names should/should not have underscores


FirstName or first_name.  Another religious war.  Proponents of the underscore point to ISO 11179 (an established metadata standard) as a naming convention.  Following the standard aids in EAI activities (or that is the claim, in my experience just following the standard is not a panacea).  Opponents say the underscore is difficult to read depending on font and how your IDE highlights text.  Again, who cares?  Pick a standard and go with it.  If you need ISO certification then of course follow the standard.  But following ISO standards for the sake of following a standard seems to me to be a waste of time, especially in the case where your application likely will never participate in EAI activities.  My main complaint is the shear amount of time we waste with the slavish devotion to these standards.  While the rest of us are busily working to meet deadlines some pencil-pusher is coming up with unworkable standards that will not help us generate revenue.  It just isn't logical to me.  


TSQL code should use fixed-width fonts


.sql files, regardless of whether they are SQL Server or Oracle .sql files, are still text files.  The font is not specified in the ANSI/Unicode/UTF text file standards, that's your choice.  Who cares?  


Align SQL keywords, indent with 5 spaces, capitalize SQL keywords, etc etc


Again, who cares?  In smaller development shops it's actually easier to determine quickly who authored code based on individual coding styles.  Why try to change people who are comfortable with a style they have been using for years?  Some people think block indenting is easier to read, others don't.  I would rather have my people be productive writing working code than worrying about polishing the apple.  


I'll admit that I often format other people's code to make it more readable for me if I need to alter it, but I've seen other people do the same thing to my code.  Some people also use code formatters to do this.  My one complaint is a situation where a developer checks out a lengthy stored procedure, reformats it, alters maybe a single line, then checks it back in.  Later, when we are debugging issues with the procedure, it becomes *impossible* to determine what that developer actually altered since potentially every line of code has changed when versions are diff'd.  I always tell developers who love to reformat code to simply check out the originally routine first, then format it and check it back in with a comment indicating it was a formatting change only, then check it out again to perform the actual required change.  


Never Use Cursors, Temp Tables, etc


People swear they have seen cases where cursors are faster than set based approaches.  I've never seen this with SQL Server (but have with Oracle) but I can imagine it can happen.  When the standard says "never use cursors" people come up with pseudo-cursor constructs anyway (WHILE loops being the biggest culprit) because they don't know any better.  These constructs usually perform far worse than a cursor anyway.  There are many times where I've used a cursor to quickly get code out the door, knowing refactoring would need to occur later.  Instead of a "never use cursors" policy, maybe soften it to "Consider seeking help before writing a cursor".  Encourage learning instead of ruling with an iron fist.  


Never use SELECT...INTO...FROM syntax


Really?  People think this is a good rule because of the old 'select into/bulk copy' option that invalidated transaction log backups in SQL 7 and earlier.  The fact is, this is not applicable to temp tables.  Why declare a temp table and INSERT INTO it in 2 steps?  In SQL 2000 and earlier doing this actually caused your procedure to recompile causing performance degradation under concurrency load.  SELECT INTO doesn't do that.  Also, if your underlying data types change you need to find all of your CREATE #whatever statements and fix them.  SQL Server doesn't have the concept of Oracle's %ROWTYPE and %TYPE (which copies the underlying object's data type to the new object).  This is often known as a "STRUCT" data type in other programming languages.  You don't need to worry about data type changes with SELECT INTO syntax since the new table inherits the data type of the underlying.  


Always code 2 part/3 part object names


Using 3 part names is just stupid, it's likely your db name will change at some point.  Dynamic linked servers would be the only case where a 3 part name might make sense, but I've seen SQL Standards that advocate 3 part names.  


Using 2 part names might be OK given old, obsolete SQL Server development habits that have carried over since the "Pre-Schema" days of SQL Server.  In the Oracle world it's not uncommon to have 2 schemas with the same database objects, one called Prod and one called Test.  If I use 2 part naming I can't simply revert to test by changing my default schema or logging in with a different user.  In SQL Server we usually use separate dbs to do this, and the case can be made that performance is a *teensy* bit faster if you schema qualify your objects.  Still, if you only use the dbo schema I'm not sure I would worry about enforcing this standard solely for performance reasons.  It would seem to me your time would be better spent looking at *real* performance problems instead of enforcing this policy.  


All code must be commented in the form...


Some people prefer flower boxes, some like double dashes.  Who cares?  Some standards say that every variable's purpose must be documented, every RETURN code, etc.  Sometimes things are obvious, do they need to be documented?  Some standards insist on certain formats to get documentation-generation plugins to work properly.  In those cases a standard is important.  


And on and on and on


There are many more examples of documented SQL Standards that are pointless, but I think you get my point.  In my opinion the best standards are the briefest that are easiest for everyone to actually want to follow.  Standards that use "NEVER" are probably not good standards.  Allowing developers to have latitude and creativity probably leads to more productivity than strict adherence to "standards" ever could.  



Add new comment