DaveWentzel.com            All Things Data

SQL Server

Are some new SQL Server features a response to the NoSQL movement?



The last few releases of SQL Server, in my mind, were focused heavily on business intelligence.  And I believe the best SQL Server jobs in the last few years have focused on BI and data warehousing.  OLTP does not seem to be where the new features and money are these days.  Here is a quick list of relatively new SQL Server features targeted at "bigger data":

  • Partitioning.  Yes, you can use it for OLTP, but specific features like SWITCH are meant specifically for data warehouses.   
  • Sparse Columns.  Great for denormalized data structures.  
  • Compression.  Again, great for OLTP, but seems to be more tailored for OLAP applications.  
  • Parallel Data Warehouse
  • And there are just way too many obvious BI and SSIS enhancements to list them all.  PowerPivot, for instance.  

But there is a shift.  I see many more new features centered around "BigData" and as a response to the NoSQL movement (if it is a movement, is it of type religious...or bowel?).  Here are some examples:

  • StreamInsight:  allows you to analyze streams of incoming data in real-time.  
  • HDInsight:  The HD is for Hadoop, and is available only on Azure.  
  • Columnstore indexes:  This is where your data tuples are organized by column rather than by row.  This aids in aggregations over bigger data sets that tend to be read-mostly.  Microsoft first gave us this feature as read-only; how long will it be before these indexes are updatable (it looks like this may be SQL Server 2014)?  Or, like SAP HANA, will data magically migrate from row to column orientation, and back, as the engine optimizes your data for you?  
  • Hekaton.  In-Memory tables.  Microsoft removed the DBCC PINTABLE feature a few releases ago, but it was a good idea at the time, and it still is.
  • SQL Server 2014 will introduce "delayed transaction durability" to support "extreme write workloads."  http://msdn.microsoft.com/en-us/library/dn449490(v=sql.120).aspx

    The latest CTP for SQL Server 2014 has a "delayed transaction durability" feature.  According to the article this is meant to assist with "extreme write workloads".  It looks like this is only for Hekaton, for now.  This feature is clearly meant to help SQL Server compete with NoSQL products that offer "eventual consistency" features.    

  • It looks like SQL Server 2014 will support the in-memory buffer pool being extended to SSDs.  This is a great alternative to a RAMdrive.  I hope this feature makes it to RTM.  
  • Hybrid cloud features.  One of the touted benefits of NoSQL is your ability to deploy it in-house and then migrate it into the cloud.  This is definitely something Microsoft has latched on to.  
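To make the delayed durability item above concrete, here is a minimal sketch of what the syntax looks like in the SQL Server 2014 CTP (the database and table names are made up for illustration):

```sql
-- Allow delayed durability at the database level (hypothetical names).
ALTER DATABASE MyAppDb SET DELAYED_DURABILITY = ALLOWED;

-- Opt in per transaction: the COMMIT returns before the log records
-- are hardened to disk, trading a little durability for write speed.
BEGIN TRANSACTION;
INSERT INTO dbo.EventLog (EventData) VALUES ('extreme write workload');
COMMIT TRANSACTION WITH (DELAYED_DURABILITY = ON);
```

The database-level setting can also be FORCED, which applies it to every transaction whether the commit asks for it or not.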

Will MS Licensing Drive Customers to NoSQL?


Yep.  But that's just my opinion.  I have been asked to assist my client with evaluating various alternatives to SQL Server for an enterprise-class, multi-TB transactional database.  Both the client and I have very realistic expectations...we don't think we will find a good alternative, but we can always begin rolling out new features under a second DBMS.  Clearly if we stayed relational then something like PostgreSQL or MySQL would be a great choice.  But perhaps now is the time to think about alternative data organization paradigms that are not relational.  Why not a network model or graph model?  The next bunch of blog posts I'm going to do are my experiences working hands-on with various NoSQL solutions.  

For this post I want to talk about Microsoft practices that I believe are pushing more customers towards SQL Server alternatives.  SQL Server 2012 now has an option for core licensing for Standard and Enterprise.  You must buy 4 cores at a minimum and additional packs are a minimum of 2 cores.  Core licensing gets a little hairy for VMs.   Each VM CPU is considered a core, and the 4 core minimum still applies.  However, if you have Ent with Software Assurance and license all cores on the host then you can run unlimited VMs with SQL Server configured however you like.  

Prior to SQL Server 2012, the processor license cost was kinda high, so the accepted strategy was to get as much performance and capacity as you could afford for each expensive processor socket license that you purchased. In other words, max out each socket.  But socket licensing is gone.  With Enterprise Edition, which is what I see in-use almost exclusively at my customers, you only have core-based licensing.  Standard and BI Edition have "Server + CAL"-based licensing as well.  But BI Edition does not have core-based licensing.  Got all that?  Yeah yeah, Microsoft eased some confusion by ditching Datacenter Edition, so I guess it's a wash.  And there's still Web Edition for hosting providers only.  Whew.  

But again, my guess is core-based licensing is what most enterprise customers are choosing.  

Frankly I don't see what the point of Standard Edition is.  In both 2008 R2 and 2012 the RAM limit is 64GB and 16 physical cores.  The RAM limit is the killer.  Right now my laptop has 32GB of RAM.  RAM is so dirt cheap that anyone running Standard with less than 64GB is, frankly, nuts.  The licensing costs for the software vs the price for the RAM is just that skewed.  Here I'm a fan of using Microsoft's rules to my advantage.  If you have, say, 128GB RAM in your server, why not run two instances of SQL Server with the same core-based licensing?  Granted, you'll need some kind of "sharding" strategy to do this, but isn't it time you had this anyway?  

So that means Enterprise Edition is going to be what, I believe, most enterprise customers will deploy.  And I gotta say, online index operations and data compression, by themselves, easily justify the cost difference.  And then if you really need AlwaysOn availability groups, Enterprise is your only choice.  

Let's get back to the mathematics of core-based licensing.  Again, in the past the best practice was to pay the price premium for the absolute best processor available for each socket in your database server.  That may kill your budget with SQL 2012 EE.  In core-based licensing each physical socket in your server must use a minimum of four core licenses.  Let's say you have some old dual-core processor hardware lying around...I really wouldn't run EE on that since you'll need to pay MS for four core licenses for each socket.  So if your intention is to recycle old hardware and maybe just upgrade it from SQL 2008 to 2012, you might want to rethink that.  Granted, this kind of hardware is probably 5 years old and should be retired anyway.  

So, what are some LEGAL ways to cut down on licensing fees?

  • Use Standard Edition and forego the EE features.  
  • You can still use lots of RAM, but you'll need to run multiple SE instances 
  • Use SE and, for any RAM over 64GB, consider using a RAMdrive.  For instance, put tempdb on a RAMdrive in any physical RAM you may have over and above 64GB.  Microsoft even graciously gave us a KB article about it.  
  • Pay nothing and use Express Edition.  Granted, you are limited to about 1.5GB of RAM and 10GB databases, but if you are CAREFUL and shard properly this can actually be viable.  I worked for a company that sold software run entirely on Express and we had "databases" that were well over 100GB because we used sharding and FILESTREAM effectively.  I wrote about this in my blog post Express vs Standard and How to save a *lot* of money with SQL Server Express.  

I have a client that is actively looking to replace SQL Server with less costly alternatives.  I doubt they'll find a replacement that has a better TCO after factoring in the massive SQL code rewriting that will need to happen. But my next few posts will be my experiences with various NoSQL solutions.  I've been working on this project for months and I've learned (and re-learned) a lot of great things that I'm excited to share.  


Aaron Bertrand has a really good blog post called "Tip on using caution with MERGE" where he lists all of the bugs and weirdnesses with the MERGE statement.  It's a great post and I'm going to bookmark it because every time I use the MERGE statement I try to remember all of the little "issues" with it.  This blog post will be a checklist summary of issues that I can use next time I use the MERGE statement.  Aaron concludes that it is a productivity booster, but you need to be careful.  Good advice.  

Here are the issues:

  • Always use HOLDLOCK [MERGE blah WITH (HOLDLOCK) AS targ].  Internally the UPSERT is susceptible to race conditions without it.  I thought I wrote about this in the past, but I guess I didn't.  I always add HOLDLOCK to my MERGE statements.  
  • There are tons of connect bugs around MERGE.  Most of them, I think, are spurious...I've never seen the problems, but YMMV.  Best to be familiar with them.  
  • Be careful with TRIGGERs.  @@ROWCOUNT is not reliable.  Aaron has a good repro script.  It's probably a minor issue, but the lesson learned is to test your triggers.  
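The HOLDLOCK pattern from the first bullet looks like this (the table, columns, and variables are made up for illustration):

```sql
-- Without HOLDLOCK, two concurrent sessions can both fail to match a row
-- and both attempt the INSERT, causing a duplicate-key error.  The
-- serializable range locks taken by HOLDLOCK close that race window.
DECLARE @Name  VARCHAR(100) = 'theme',
        @Value VARCHAR(100) = 'dark';

MERGE dbo.Settings WITH (HOLDLOCK) AS targ
USING (SELECT @Name AS Name, @Value AS Value) AS src
    ON targ.Name = src.Name
WHEN MATCHED THEN
    UPDATE SET targ.Value = src.Value
WHEN NOT MATCHED THEN
    INSERT (Name, Value) VALUES (src.Name, src.Value);
```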



Metrics are a funny thing.  You hear people spout out metrics all the time.  "My Page Life Expectancies are 200."  "My disk latencies are 20 ms."  Just regurgitating the given metric says nothing about context.  There is no good guidance for PLEs or disk latencies, and if you ask ten knowledgeable data professionals you'll likely get ten different answers.  But put all of the opinions together and you'll get a good "feel" for what might be good and what might be bad.  My general rule of thumb, I call it the "smell test", is if I don't have a benchmarked baseline for a metric then I look at "reasonable guidance".  If my metric is close then I look at some other aspect of the system to tune.  If my metric is wildly out of the range of reasonable guidance, then that is worth investigating further.  

Disk metrics are really tough.  There is so much abstraction going on with a system hooked up to a SAN that any given metric could be normal even if it falls wildly outside of the norm.  Latencies are one such metric.  About a year ago I pinned down an MS SQL Server field engineer who stated what the then-current latency guidance was (10 ms).  I blogged about that in my post Latency and IOPs.  I had a harder time extracting that guidance than that Microsoft employee's dentist had extracting his abscessed tooth.  

At the time I was told that IOPS of 100 Page Reads/sec was the guidance for "high".  We were at the time averaging 400 with sustained peaks over 4000.  This guidance made sense to me at the time because I knew our database code was horrendous (too much tempdb bucketing in procedures, lousy index decisions, procedures that returned 64MB result sets, lots of PAGEIOLATCH_SH waits, etc) and I was working on fixing the most egregious things.  So, using "Dave's Smell Test" I knew that our IOPS were probably outside of the norm and warranted further investigation.  

Now, I've always thought IOPS was a kinda nebulous term.  What is the definition of an IO?  Is it a 4KB block or a 64KB block (which is, of course, what SQL Server cares about)?  Depending on the definition, your IOPS number could be off by a factor of 16.  Clearly it is in the interests of the vendor to spout off the highest IOPS number possible in their marketing material...so do you think the published number is for 4KB or 64KB blocks?  Read the fine print.  

It just seems a bit too confusing, which is why I prefer my disk metrics to always be in terms of "latency", not IOPS.  Your IO subsystem's goal should be, IMHO, the least amount of latency as possible, with the highest IOPS.  This is why you'll often see DBAs ask for their HBAs to have higher Q depth settings than the factory defaults.  Lower (default) Q depths will sacrifice some latency for better IOPS.  That seems backwards to me on an OLTP system.  

And then of course you have to take into consideration your "effective IOPS".  This is a term I made up (I think anyway) and it takes into consideration whether you are using RAID and what RAID level.  Single disk drives have a raw IOPS number.  Using RAID 10 effectively halves your raw IOPS number.  Why?  An extra write is required for the mirror.  It's twice as bad with RAID 5 (RAID 5 Write Penalty).  For instance, SATA drives average about 150 IOPS.  If I had 6 of those I would have about 900 IOPS, but I need to halve that number to get the true 450 IOPS at RAID 10.  This is all theoretical and confusing which is why I tend to ignore IOPS.  

IOPS numbers tend to NEVER hit the published numbers, especially on rotational media.  This is because database access tends to be somewhat random in nature.  Sequential transaction log access is the exception.  The more that disk head has to move, the worse the effective IOPS.  That's another reason why I firmly believe that ALTER INDEX...REORGANIZE is one of the best ways to speed up your effective IOPS.  

But even the latest SSDs will rarely give you IOPS numbers anywhere close to what they publish.  You'll likely saturate the PCI-E bus first.  

So I find myself not really relying on published IOPS or even "reasonable guidance".  There is no substitute for measuring your disk subsystem yourself.  Even so, once you have a good measurement, is there anything you can really do about it if it isn't up to the smell test?  Other than changing HBA QDs or using local SSDs for tempdb, I haven't had much luck in my career getting IO subsystems upgraded, even when they were woefully under-spec'd/misconfigured given my measured latency and IOPS.  The only thing I can ever really do is to lessen my reliance on IO.  And that of course means good indexing, good data hygiene, good caching, adding RAM, and tweaking IO intensive queries.  
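One way to measure your disk subsystem yourself is sys.dm_io_virtual_file_stats, which tracks cumulative IO stalls per database file since the instance started.  A quick latency query along these lines:

```sql
-- Average read/write latency (ms) per database file since instance start.
-- NULLIF avoids divide-by-zero on files with no reads or writes yet.
SELECT  DB_NAME(vfs.database_id)                             AS database_name,
        mf.physical_name,
        vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_ms,
        vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_ms
FROM    sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN    sys.master_files AS mf
        ON  mf.database_id = vfs.database_id
        AND mf.file_id     = vfs.file_id
ORDER BY avg_read_ms DESC;
```

Remember these are averages since startup, so a burst of bad latency can hide inside an otherwise healthy average; snapshot the DMV twice and diff it if you want a point-in-time view.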

CONNECT BUG: When SET IMPLICIT_TRANSACTIONS ON not every SELECT will start a transaction

I submitted the following BUG to connect due to some anomalies with SQL Server implicit transactions.  When SET IMPLICIT_TRANSACTIONS ON not every SELECT will start a transaction.  

We are having some serious issues debugging implicit transactions from a java/jdbc application. In the process of doing that we think we encountered a bug (possibly with just the BOL documentation...regardless). Under the topic of SET IMPLICIT_TRANSACTIONS BOL states: 

And here is how to reproduce the issue:  

Here is the actual repro script so you can try it yourself.  
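The repro script itself didn't survive in this copy of the post, but the behavior in question can be sketched like this, assuming the anomaly is the documented rule that only a SELECT referencing a table starts an implicit transaction:

```sql
SET IMPLICIT_TRANSACTIONS ON;

SELECT @@TRANCOUNT;                  -- 0: no table reference, no transaction
SELECT 1 WHERE 1 = 1;                -- still no table reference
SELECT @@TRANCOUNT;                  -- still 0

SELECT TOP (1) name FROM sys.objects;  -- references a table...
SELECT @@TRANCOUNT;                    -- ...now 1: implicit transaction started

ROLLBACK;
SET IMPLICIT_TRANSACTIONS OFF;
```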


You're Only As Good As Your Last Success

"Could you take a look at ticket 56789?"  I nodded my head in the affirmative.  I was already behind in my existing tasks so I was a little annoyed.
I opened up the ticket in the system and read the following:  "Customer says that procedure FooBar consistently executes in 5 seconds.  Customer wants to know if there is any way we could make that a little bit faster.  Customer feels this is not a priority but since they use the function multiple times a day they would save some time."
Our system has almost 4000 stored procedures and I didn't immediately recognize the FooBar procedure as something I've worked on.  I went into our source control system and found FooBar.  The last check-in was seven years ago and, surprisingly, it was by me.  Even more surprisingly, the last check-in note stated: "Per Schlemiel, if you ever have to improve performance remove line 262."
What?  I was baffled.  Why would I write a check-in comment like that?  After pondering the issue for a few seconds, I remembered!  I shook my head, chuckled, removed line 262, checked in my code, and went to find Schlemiel.  
Flashback seven years ago...
I was still a new employee and was trying to make a name for myself.  Our benevolent dictator at the time was Schlemiel.  He has since been promoted to a bigger team with more responsibility.  He was a well respected, godlike architect and I admired him very much.  Schlemiel always saw something in me and I was his go-to guy when he had data issues.
Schlemiel asked me if I could look at the FooBar procedure because it was running a little slow.  I asked him for the ticket number.
"12345.  Just do this as a favor for me.  A couple years back Schlomo wrote this FooBar procedure as a skunkworks project.  We never thought customers would really use it so we never bothered to ensure it performed over larger data sets.  Now it takes about 14 minutes to execute for most customers.  Customers are starting to complain and Schlomo can't figure out what to fix.  Take a look, make some recommendations, and we'll figure out how to prioritize."
I pulled up the ticketing system and opened up 12345.  Sure enough:  "Customer states the FooBar procedure is totally unusable.  It takes about 14 minutes to execute.  User states she clicks submit and then goes for coffee and a bagel.  Sometimes it's done by the time she gets back.  Please fix it."
I pulled up the code for the FooBar procedure and began looking at it.  Within 20 minutes I figured it out.  One of the queries joined 12 different tables and one of the JOIN/ON conditions was missing.  Effectively there was a cartesian product causing billions and billions of rows to qualify for the final result set.  Thankfully the WHERE clause was filtering out those rows before they were sent to the client.  Through dumb luck the query was working, but it was not efficient.
I added the JOIN/ON condition and began writing some quick unit tests.  My new version proved to be functionally equivalent but took orders of magnitude fewer reads and executed in 500 milliseconds vs about 9 minutes on my test server.  I immediately scurried off to tell Schlemiel.
Schlemiel was less-than-enthusiastic.  "Now you've gone and gotten yourself a bit of a problem."
Puzzled, I replied, "Why?  The tests are perfect, clearly this is a slam dunk. Simple oversight by Schlomo, if you will."  
"I've no doubt you solved the problem, but the fact is, you made the performance TOO good.  If you release this version of your procedure then management and the customer will never believe that such a dramatic increase in performance doesn't introduce too much risk.  The change will never be approved.  No one will ever believe that simply adding an ON clause could fix an issue like this.  Instead, management and the customer will demand to see endless proofs and reams of unit tests before they have a comfort level.  Meanwhile, you are creating agita for yourself and I frankly need you to work on other, more important issues."  
As every word passed his lips I felt more and more deflated.  "What do you suggest?"
"I would suggest adding a WAITFOR DELAY '00:00:05' somewhere in your code.  Let's see that code."  He peered over my shoulder, "Right here on Line 262 looks good to me.  Once you've done that go ahead and check it in and note that I code reviewed and approved your work.  And next time, try not to be so efficient."  
The look of dejection on my face must have been overwhelming.  Perhaps he noticed a stray tear running down my face.  He tried to console me, "Dave, don't worry so much.  You did a great job.  You took a 14 minute procedure and got it down to 5 seconds.  You're a hero.  But just remember, today's hero can be tomorrow's goat.  Someday you'll do something stupid and you won't be the hero.  At that moment just remember that you can always go back and remove that artificial WAITFOR DELAY and you'll be the hero again.  Remember, you're only as good as your last success."  
Flashforward to the present...
"Hey Schlemiel," I found him in the hallway in another building, "remember seven years ago when I tweaked FooBar and it performed so well that you wanted me to put a WAITFOR DELAY in the code?  Well, we just got another ticket by a customer asking us to tweak FooBar again.  I took out the WAITFOR DELAY and checked the code back in."  
I could feel my chest was puffed out to the fullest, I was proud of my accomplishment.  Schlemiel paused a second and then broke out in a terrific belly laugh.  "You idiot!  I can't believe you actually checked in that WAITFOR DELAY.  I was kidding of course.  No sane person would do something like that.  Didn't you realize I was kidding?"  
I walked away...again...dejected.  I had two major successes, both times the customer was thrilled, and both times I was the goat.  

Yet another reason not to use TSQL scalar user-defined functions

...as if you really needed another one.  

I was poking around in SQL Server 2012 and I noticed a new feature of the XML SHOWPLAN.  You can now determine why a given query plan will be serial instead of parallel.  

I, of course, knew this already, but this will be another arrow in my quiver of proof for why, in many cases, TSQL scalar UDFs are evil.  

I'm actually more interested in finding out what the allowable values are for NonParallelPlanReason.  I haven't seen this documented yet, and I'm not really sure I care why a plan would/would not go parallel, but this is interesting.  
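A minimal way to see this for yourself (the function here is a throwaway example; the exact NonParallelPlanReason string is one of the undocumented values I've observed, so treat it as illustrative):

```sql
-- A throwaway scalar UDF; merely referencing it forces a serial plan.
CREATE FUNCTION dbo.AddOne (@i INT) RETURNS INT
AS
BEGIN
    RETURN @i + 1;
END;
GO

SET STATISTICS XML ON;
SELECT dbo.AddOne(o.object_id) FROM sys.objects AS o;
SET STATISTICS XML OFF;
-- In the XML showplan, the QueryPlan element carries an attribute
-- such as NonParallelPlanReason="CouldNotGenerateValidParallelPlan".
```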

BTW, I've written NUMEROUS times about the performance problems with TSQL scalar UDFs:

Bind Variables, Prepared Execution, ad-hoc SQL statements, and Performance...an allegory

And The Developer went to the Oracle and asked, "If programming in Delphi, should I use bind variables? What if I'm programming in Cleveland?  Or Visual Basic?" 

Tom Kyte, aka the Oracle, pondered the question for 3 days while sitting over a crack on a rock, inhaling noxious volcanic fumes.  He then, suddenly, responded.  "You will bind never performance will suffer."

Immediately the Oracle fainted from the fumes. 

The Developer tried to rustle him to consciousness.  He did not understand the Oracle's prophesy.  Should the missing comma come BEFORE or AFTER the word "NEVER" ?

The Developer could not wait for the Oracle to awaken, the Kanban card was already in the "blocked" state for too long and The ScrumMaster was angry.  What should he do? 

At that moment Bill Gates appeared in a burning bush. 

The Developer:  "Father, are bind variables good?"

"Son, we haven't the concept.  The closest thing is prepared execution.  As with all things my son, imbibe in moderation, for thy performance cup shall never runneth over." 

The burning bush disappeared.  Still The Developer was confused. 

The Developer came across The Great Google and structured his query thusly, "When sending a repeated SQL statement to my RDBMS, should I use a cacheable, parameterized statement?" 

He clicked "I'm Feeling Lucky"...because, well, he was. 

Stack Overflow responded immediately with The Commandments:

  1. "Thou shalt not commit non-parameterized querying."  Regardless of platform…prepared execution/bind variables in ad hoc SQL statements are thy friend.  On modern RDBMSs every single call should be parameterized.  Always.  No exception.  Ever.  
  2. "Remember to bind, to keep Oracle holy."  It is a PITA (pain in thine ass) to return result sets from Oracle stored procedures/packages which is why bind variables are the cornerstone of good performance on that platform.
  3. "Honor thy SQL Server Stored Procedure."  They are generally much better than adhoc parameterized statements for a litany of reasons. 
  • No need to pollute thine query cache with thy garbage singleton statements.
  • Stored procs are easier to change (SQL can be recompiled on the fly with any kind of fix…I don't need to monkey with externalized SQL embedded in compiled Java).  
  • I can use lots of different kinds of "hinting" that I can't use in ad hoc sql…changing isolation levels, locking paradigms, plan guides, etc. 
  • It's just good design.  Best to keep data logic at the data tier. 
  • Decrease the "network effect"…a stored proc call will always be more compact than an equivalent parameterized ad hoc statement.  Think a couple bytes don’t matter?  Try developing using "the cloud" with SQL Server on EC2 or Azure without using stored procs.  The network effect literally kills performance. 
  • Your friendly neighborhood DBA can't "profile" when you need help.  "All I see is that you ran sp_execute 15, 1417.  Sorry I can't help you. I can't find a stored procedure named "15,1417" in my database.  " 
  • If you use stored procs you can always "bear false witness against thy DBA."  (Seriously.  Just say, "I don't know why that stored proc isn't performing, isn't performance the DBA's job?"  It works flawlessly). 

  4. "Thou Shalt not be Chatty, but should remember to be Chunky, and keep it holy."  Regardless of platform, if you are going to send the same SQL statement a gazillion times a minute (say because you are too lazy to "batch up" your INSERT statements) then prepared execution isn't going to help much.  The performance benefits of a prepared INSERT statement may save a millisecond, but the real killer is the network chattiness, which can be 16, or sometimes even 50, milliseconds per call.  Best to be chunky. 
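On SQL Server, the first commandment boils down to sp_executesql (the table name here is hypothetical):

```sql
-- Ad hoc with an inlined literal: a new plan is compiled and cached
-- for every distinct CustomerId value, polluting the plan cache.
-- EXEC('SELECT OrderId FROM dbo.Orders WHERE CustomerId = 42');

-- Parameterized via sp_executesql: one cached plan, reused for every value.
EXEC sp_executesql
    N'SELECT OrderId FROM dbo.Orders WHERE CustomerId = @CustomerId',
    N'@CustomerId INT',
    @CustomerId = 42;
```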

<TODO insert picture of chunky 'ol Dave Wentzel here.  Until then, here's another "Chunky">

Sanitize those stored procedure parameters and don't split your pants!

The Level 6 Architect needed to pass some XML into a stored procedure that would be the payload for a Service Broker message.  

"Piece of cake," I said, "Here you go!" and passed him the following procedure...edited for brevity and to protect the innocent.  
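The procedure itself wasn't preserved in this copy of the post, but a sketch of the shape it would have had looks like this; every object, service, contract, and message-type name below is hypothetical:

```sql
-- Hypothetical sketch: strongly typed XML parameter whose payload is
-- sent on a Service Broker conversation.
CREATE PROCEDURE dbo.SendFooBarMessage
    @Payload XML
AS
BEGIN
    DECLARE @Handle UNIQUEIDENTIFIER;

    BEGIN DIALOG CONVERSATION @Handle
        FROM SERVICE [//FooBar/InitiatorService]
        TO SERVICE '//FooBar/TargetService'
        ON CONTRACT [//FooBar/Contract]
        WITH ENCRYPTION = OFF;

    SEND ON CONVERSATION @Handle
        MESSAGE TYPE [//FooBar/Message] (@Payload);
END;
```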

"Oh no no no no no.  That will never do.  We have a strict requirement that we cannot pass XML parameters into a procedure, we must pass in a varchar because the XML schema will change dynamically with the...".  My eyes and ears started to glaze over as I heard a litany of ridiculous reasons why an XML datatype could not be used.  

"Let me get this straight...does our JDBC driver support SQL Server's XML datatype?"

"Yes."
"Then we should be passing in the XML data to a strongly typed parameter, a la one declared as XML."  

"No!  We cannot do that.  Use a varchar(500) instead.  Just do as you are told."  

"No problem-o Mr. Level 6.  Here you go.  But I suggest you sanitize your data in the Java tier to be sure you will never overflow a varchar(500).  Are you sure that is big enough?  XML tends to be obese." I was trying not to lose my cool.  

"Just do as you are told."  

"OK, just remember that when this blows up in your face...I told you so."  

Fast forward 12 months when this code is in production for mission critical, life-or-death data situations...

Level6:  "Your gd stored proc is throwing errors everywhere."
Me:  "Can you show me the error please?"

Msg 9400, Level 16, State 1, Procedure FooBar, Line 8
XML parsing: line 1, character 500, unexpected end of input

Me: "That's interesting that it fails exactly at character 500 every time, don't you think?  Do you remember when I told you this would blow up in your face if you didn't sanitize your inputs?"
Level6:  "Yeah, so what?"
Me:  "So, you ignored me and you are passing in more than 500 characters and the XML parser in SQL Server cannot parse the string as well-formed XML.  I told you to sanitize your parameters before calling the procedure.  You chose to ignore me.  This is your problem."
Level6:  "Prove it. SQL Server has to report an error when we overflow the datatype."
Me:  "Here you go."
Level6:  "There is no way we are possibly passing in more than 500 characters of data.  Impossible.  500 characters should be at least four times more than we could possibly ever send."
Me:  "Well, apparently you are wrong.  I bought my pants two sizes too big last year and now my butt doesn't fit in 'em.  I figured two sizes bigger should be more than enough.  It wasn't.  The difference is, my pants don't silently truncate the extra girth like SQL Server does.  My pant seams scream loudly, "hold together men!"  Maybe what you need to do is make sure your Java code doesn't overflow the datatype just like I need to make sure my butt doesn't overflow the pants."
Level6:  "Well, SQL Server should not silently truncate input data.  That's just crazy."
Me:  "It's only been doing that since at least 1990.  Go fix your code.  And remember, a good architect uses the most specific datatype possible and always sanitizes the input."
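The silent truncation being argued about is easy to demonstrate; a minimal sketch (the variable name is arbitrary):

```sql
DECLARE @doc VARCHAR(500);
SET @doc = '<root>' + REPLICATE('x', 600) + '</root>';

SELECT LEN(@doc);          -- 500: the assignment silently truncated the string
SELECT CAST(@doc AS XML);  -- fails: the closing tag was cut off, so the
                           -- document is no longer well-formed XML
```

Variable and parameter assignment in SQL Server truncates to the declared length without raising an error, which is exactly why the parse failure always lands at character 500.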

