Data Contracts for Stored Procedures

For years I've been tasked with fixing performance problems with SQL code, usually stored procedures.  If you've ever done this you know there are a few things you can and can't do when refactoring SQL if you can't change the underlying client.  For instance, say you have a report that is running slowly.  It's easy to quickly cowboy a change to your stored procedure because TSQL is not compiled code, but changing the report definition usually can't be done on the fly, at least not easily.  

I follow a process whenever I refactor TSQL.  I make sure that I never change the inputs or outputs, only the "guts in the middle."  If you adhere to this rule your TSQL refactoring will never break your application.  I call this a data contract.  Any part of my TSQL that is exposed to a client cannot be changed.  This includes:

  • result set column list
  • result set column order
  • result set column names/data types
  • input parameter order (for those developers who do not name their param lists), names, and data types (stored procs only)
  • output parameter order, names, and data types (stored procs only)
  • RETURN values (stored procs only)

SQL Server natively has no concept of a data contract, although plenty of people have proposed the idea with sample syntax changes for TSQL and there is even a Connect request for this.  

 If I can refactor TSQL code without breaking the data contract I don't need a .NET programmer to change and compile *any* of his code.  This actually isn't a "data contract" in the WCF/.NET sense of the term, but I think it adequately describes what is important.  

There will be times when you need to ALTER the data contract to add new functionality.  I always handle this by requirement:

  • If a new result set column is required it goes at the end of the list.  If it is at the end then developers who coded against a named or ordinal result set column will not break.  At that point forward I can no longer change the name, data type, or ordinal number of that new result set column.
  • If a new input parameter is required it always goes at the end of the list and is NULLable.  
  • Output params are handled similarly.  

All of this is probably common sense.  But there really are not any good tools that will tell you if your refactoring efforts have broken your data contract.  In the past I've relied on:  

  • My manual testing tools
  • Using SET FMTONLY will give you data types on underlying result set cols which may be necessary if you are changing arithmetic formulas in a calculation.  
  • On Oracle a PACKAGE defines the data contract nicely for the contained procedures.  SQL Server does not have this
  • Most ORM tools can validate your schema for you.  If it reports changes, then you broke the daa contract.  

My new favorite tool for handling data contract testing (and TSQL unit testing in general) is tsqlt.  It is open-source and works like a champ.  Soon I'll cover some additional details regarding how to test data contracts with tsqlt.