Buffer Pool Extensions in SQL 2014

I've been very busy lately but one of my tasks is to evaluate migrating to SQL 2014.  After poking around some of the new features I stumbled across Buffer Pool Extensions.  If your db server is memory-constrained or you have an IO-intensive workload, and you can afford a cheap SSD drive, you really should research this feature.  Essentially, you can extend your buffer pool onto nonvolatile storage without buying pricey RAM.  Technically you could use rotational media and not even use an SSD, but I can't imagine you would get much of a performance boost.  

How Does It Work?

When your working set is small the buffer pool is constantly performing random reads and writes whenever a CHECKPOINT is issued.  Each of these small IO operations is competing for your rotational media's attention.  Latency goes up, throughput goes down.  It's not uncommon on large systems to see CHECKPOINTs take a very long time and the Page Life Expectencies (PLEs) to drop radically.  

One of the beauties of SSDs is that random and sequential IO have nearly identical performance characteristics, just like volatile RAM.  Buffer Pool Extensions (we'll call them BPE for the rest of this post) essentially becomes a middle man between the RAM and the rotational media.  It would be nice if the small, random seeks of the dirtied buffer pool pages would go to the SSDs first, and from there write to our slower media without impacting the buffer pool.  Unfortunately that wouldn't be a very safe thing to do.  Instead, pages that are getting ready to be aged out of the buffer cache...we'll call them warm pages, will be moved to the BPE area instead...assuming the pages are "clean".  

Mechanics of Setting It Up

  • This can be enabled, disabled, and altered on-the-fly without service interruption.  Although you will see performance implications of doing these things obviously.  
  • You can query the current state of buffer pool extensions by using the DMV sys.dm_os_buffer_pool_extension_configuration.  
  • Ensure your SSD is formatted with a drive letter.  
  • MS's recommendation is for a 16:1 ratio of BPE size to MAX SERVER MEMORY.  

Performance Implications

I work on any application where everyone complains about the PLEs dropping to under 30 seconds when we do certain operations.  I've written about this in the past...IMHO the PLE is a symptom to be monitored, but doesn't help you determine root cause of any performance problems I've ever seen.  At best I correlate block and wait statistics information during periods of "bad PLEs" to determine exactly what activity was occurring and then determine how to fix it to be less IO intensive.  

I don't have a production load available on my SQL 2014 test box, and I'm not exactly sure how to test BPEs accurately.  Having said that in my test env it does appear as though PLEs are less "volatile" and more stable.  Elapsed checkpoint time is also better.  It will be interesting to see real perf numbers by some of the SQL experts on the blogosphere.  

This is an awesome feature.