DaveWentzel.com            All Things Data


Database Server Consolidation - History and Future

DB servers, especially SQL Servers, seem to sprout up like weeds.  Somebody has some requirement for a little departmental application and right away that requires a new db server (for it's huge database of perhaps 100 MB) and another RDBMS license.  Oracle folks tend not to have this problem...the planning and allocation phase seem to be given more attention for some reason with Oracle.  If this trend continues you find after many years that your data center is full of SQL Servers with low utilization throwing off tremendous heat and using a lot of power.  

History of Server Consolidation

Server consolidation has been hot in IT for quite a few years now.  It's only natural to want to consolidate your db servers.  When consolidation first became a hot topic the idea was to replace those commodity Wintel departmental SQL Servers with blade servers.  They use less space and generate less heat.  

Still, this required purchasing blades, which were not cheap, and potentially tossing out the old Wintels.  Then we discovered the joys of VMWare and virtualization.  Those departmental SQL Servers could be virtualized quickly and easily using a P2V (physical to virtual) tool.  In fact, this worked so well that we began virtualizing every SQL Server we could find, even the clustered solutions (of course ensuring each cluster node resides on a different physical host).  This is pretty much the current state of virtualizing database servers.  

The Future

But there's much more we can be doing.  Virtualizing a server saves on power, rackspace, and climate control, but it saves very little on licensing.  SQL Server does have some interesting new licensing options when you virtualize, but there are still license costs for the ancillary software we all put on our servers (backup software, AV, monitoring, etc).  And of course each of those VMs still has to be patched every Patch Tuesday.  

A much better method is to virtualize up and down the stack where it makes sense.  Again, currently in the industry most people are only virtualizing at the machine-level.  

Consolidate at the Instance level

Some applications/databases can be virtualized by having them co-exist on a given SQL instance/machine.  For very simple departmental databases that are really just a database and a few logins to access the data and require no other features of SQL Server (agent, access to the OS, etc) then we can co-locate those databases on the same instance/machine without any issue.  Usually this involves copying the database and logins to another machine/instance and then updating a connection string on the web server.  

The benefit to this is we've eliminated a SQL license, we've eliminated another machine that will need to be patched, and we've eliminated all of the CPU cycles required to operate the OS and supporting software...we are only using the true cycles that the app/db requires.  Kinda sounds like the old TSO (time sharing option) systems from the 1960's.  The circle of life continues, what is old is new again.  

There are some drawbacks.  Since all of the databases will run under the same SQL service account, with the same global settings, shared memory, and tempdb, we need to be careful.  Departmental apps often simply log in as sa or as an account with sysadmin privileges.  You may need to re-evaluate this if security is a concern among consolidated databases.  Collation is another concern.  The databases must share the same collation.  Ah, but you say that SQL Server collations can be different on each database now.  That's true, but most applications use tempdb to create #temptables.  Those may not function correctly if tempdb's collation is not the same as that of your database.  You only get one collation for tempdb.  

If one of your databases is a resource hog (constantly reading massive amounts of data into and out of the buffer cache) it may contend with the other databases.  On a 32bit OS you have a fixed amount of RAM available for the buffer cache and you really don't want that contention.  

Consolidate Instances

Once you've considered consolidating databases to one instance, it's time to consider consolidating instances to a single machine.  VM hosts tend to be big, beefy boxes.  They need to be in order to handle multiple guests.  However, if you've ever looked at a quiet SQL Server you'll note there is still a lot of CPU activity.  This is due to AV and other data center monitoring tools, SQL Agent heartbeat activity, etc.  This is all wasteful if we just virtualize the SQL box and make it a guest using P2V.  

Instead, let's consider taking all of our instances and consolidating them onto a single server.  Your databases probably reside in the default, unnamed instance of SQL Server (the default instance for SQL Express is called \SQLEXPRESS).  You can create multiple named instances on a single machine.  What are the pros and cons...

Pros Cons
A single 32bit SQL instance can only address so much RAM (depending on Server OS).  If your SQL Server Host (running multiple instances) has lots of memory then each instance can run the maximum RAM without sharing buffer cache or experiencing memory contention.   Ensure your application can connect to a named instance/alternative port
Each instance can have a different service account, hence, different security needs can be addressed.   The CPU/disk can still be a bottleneck since they are still shared across instances.  
Each instance can have a different collation, global settings, etc If the db/app requires OS-level access (for file copying for instance) than inter-database security may still be a concern
$$$ (see next section)  

License Consolidation

Whenever you eliminate SQL Server hosts using instance consolidation you no longer need a license for that host.  A SQL Server license entitles you to multiple instances to a machine at no extra charge.  This is very compelling.  But, you should also consider the free SQL Express edition to save even more money.  SQL Express can run multiple instances on a single machine, the only caveat is the buffer cache is limited to 1GB and db size is limited to 10GB.  Before you scoff at using Express due to the limitations remember that we are only considering consolidating small databases anyway.  Also, remember that there are legal ways to bypass these limitations.  I've written an article about some interesting ways to use SQL Express in non-traditional ways.  License consolidation is just another method.  

Some other notes

Virtualization using Windows 2008 RS Hyper-V and Live Migration allows vms to move between hosts without any perceived service interruption.  

A P2V tool is included with System Center Virtual Machine Manager.  


When embarking on a database consolidatoin effort, don't just run a P2V on your SQL box and call it a day.  Really think about your architecture.  Can you consolidate databases on a single instance?  Can you consolidate many instances on a single machine?  Can you consider SQL Express to save your company money?  All of these ideas will make you a hero to management and a friend to the environmentalists.  

Advanced tips for a better performing virtual SQL Server

Here are some tips for a better performing SQL Server vm.  Note that almost every tip involves disk, which should be no surprise since rotational media is the slowest component of any system since it is the only component with moving parts:

  1. raw LUNs can help if you really have I/O pressure.  (Converting a virtual disk into a Raw Device Mapping)
  2. For Hyper-V solutions, never use dynamic VHD's since the file is not actually allocated on the host and therefore must be allocated on the fly.  This is never a good idea and is similar to autogrowth of a SQL Server data file.  It's performance-draining and always happens at the worst possible time.  The better solution is pass-through disks in Hyper-V.  A pass-through disk is storage (using a SAN LUN) that is mapped to the root partition (or host) but is in an offline state (it's actually brought online, then initialized, then taken offline again).  It is then available to the guest as a pass-through disk, giving that single guest exclusive access.  This also means there is no theoretical size limitation to a pass-through disk.  This makes monitoring a little more difficult at the root partition level since the logical disk counters won't be available (but the physical disk counters will be).  
  3. Whenever monitoring IO performance of a virtual SQL Server prefer monitoring on the host to monitoring on the guest.  Microsoft's own analysis is here.  
  4. For the best I/O characteristics prefer raw LUNs first, then pass-through disks, then fixed VHDs, then, lastly dynamic VHDs.  
  5. Avoid using emulated devices and instead use synthetic devices.  An emulated device "emulates" a device through software in the child partition, meaning that there is good compatibility, but at the expense of more CPU cycles to do the emulation.  Synthetic devices are proxies for real resources on the host so their job is merely to forward I/O around, meaning less host cycles are spent on I/O.  
  6. Set the minimum memory reservation on the VM and don't use memory ballooning.  Memory ballooning (overcommitting memory) is thought to be terrible and should never be used by many.  That's just wrong.  In many cases VMs will never use the amount of RAM granted to them on the host.  In those cases that RAM is being wasted, so why not overcommit?  I am simply saying here that if performance is your goal for your virtualized VM, then consider not using ballooning or restrict it appropriately.  (But of course RAM is cheap, why not just buy more?)
  7. CPU overcommitment should also be avoided if performance is paramount to cost/vm.  CPU overcommitment is conceptually similar to memory overcommitment, and obviously it will require additional CPU cycles on the host to implement.  
  8. When possible, do your system monitoring from the host, not the VM, to assure you aren't seeing bogus metrics.  Also, certain things like CPU q'ing will really only show up on the host, not the VM.  

More available on the Performance Page

Data Virtualization

For some reason "data virtualization" is one of the new hot buzzwords in my industry.  Data virtualization is really just combining a bunch of data sources (databases, unstructured data, DWs, weblogs, whatever) into a single virtual data tier layer that provides access to applications.  Haven't we been doing this for years.  I remember when linked servers came out for SQL Server.  I thought they were the coolest thing at the time.  I could now connect to Oracle from my SQL Server and manipulate the data in the more familiar TSQL, realtime, without the need of an ETL process.

Subscribe to RSS - virtualization