DaveWentzel.com            All Things Data

SQL Server

Obnoxious SSIS Designer Layout Issue

I don't want to bloviate too much, but there are some annoyances with SSIS, most of them well documented on the blogosphere.  My biggest annoyance is that the mouse is required for EVERYTHING.  Want to do a task conditionally?  That would be an IF branch in any programming language, but because SSIS is so visual it requires lots of mouse clicks.  

Parallel Data Warehouse as a NoSQL Alternative

Another post in my NoSQL series...this one on Microsoft's Parallel Data Warehouse...PDW for short.  This is an installed appliance delivered right to your door with everything completely setup for you.  

Buffer Pool Extensions in SQL 2014

I've been very busy lately but one of my tasks is to evaluate migrating to SQL 2014.  After poking around some of the new features I stumbled across Buffer Pool Extensions.  If your db server is memory-constrained or you have an IO-intensive workload, and you can afford a cheap SSD drive, you really should research this feature.  Essentially, you can extend your buffer pool onto nonvolatile storage without buying pricey RAM.  Technically you could use rotational media and not even use an SSD, but I can't imagine you would get much of a performance boost.  

How Does It Work?

When your working set is small the buffer pool is constantly performing random reads and writes whenever a CHECKPOINT is issued.  Each of these small IO operations is competing for your rotational media's attention.  Latency goes up, throughput goes down.  It's not uncommon on large systems to see CHECKPOINTs take a very long time and the Page Life Expectencies (PLEs) to drop radically.  

One of the beauties of SSDs is that random and sequential IO have nearly identical performance characteristics, just like volatile RAM.  Buffer Pool Extensions (we'll call them BPE for the rest of this post) essentially becomes a middle man between the RAM and the rotational media.  It would be nice if the small, random seeks of the dirtied buffer pool pages would go to the SSDs first, and from there write to our slower media without impacting the buffer pool.  Unfortunately that wouldn't be a very safe thing to do.  Instead, pages that are getting ready to be aged out of the buffer cache...we'll call them warm pages, will be moved to the BPE area instead...assuming the pages are "clean".  

Mechanics of Setting It Up

  • This can be enabled, disabled, and altered on-the-fly without service interruption.  Although you will see performance implications of doing these things obviously.  
  • You can query the current state of buffer pool extensions by using the DMV sys.dm_os_buffer_pool_extension_configuration.  
  • Ensure your SSD is formatted with a drive letter.  
  • MS's recommendation is for a 16:1 ratio of BPE size to MAX SERVER MEMORY.  

Performance Implications

I work on any application where everyone complains about the PLEs dropping to under 30 seconds when we do certain operations.  I've written about this in the past...IMHO the PLE is a symptom to be monitored, but doesn't help you determine root cause of any performance problems I've ever seen.  At best I correlate block and wait statistics information during periods of "bad PLEs" to determine exactly what activity was occurring and then determine how to fix it to be less IO intensive.  

I don't have a production load available on my SQL 2014 test box, and I'm not exactly sure how to test BPEs accurately.  Having said that in my test env it does appear as though PLEs are less "volatile" and more stable.  Elapsed checkpoint time is also better.  It will be interesting to see real perf numbers by some of the SQL experts on the blogosphere.  

This is an awesome feature.  


This is the next post in my NoSQL series.   As I was starting my first NoSQL POC (using SAP HANA) SQL Server announced Hekaton in one of their CTPs for SQL Server 2014.  I was intrigued because it appeared as though Hekaton, an in-memory optimization feature, was, basically, HANA.  It turns out HANA is much more.  This post has been at least 9 months in the making and since then everyone in the blogosphere has evaluated and posted about Hekaton so I'll keep this post really short.  

Hekaton is Greek for a hundred, and that was the targeted performance improvement that Microsoft set out to achieve when building this new technology.  Oracle has an in-memory product called TimesTen and I wonder if the Hekaton name was a bit of one-upmanship.  

Initially I was skeptical that Hekaton was little more than DBCC PINTABLE.  It's more...a lot more.  

In case you've never heard of Hekaton, there are two main features:

  • a table can be declared as in-memory (similar to SAP HANA)
  • stored procedures can be compiled into native DLLs if they only touch in-memory tables.  

Most database managers in existence today (except the "newer" varieties like HANA) were built on the assumption that data lives on rotational media and only small chunks of data will be loaded into memory at any given time. Therefore there is a lot of emphasis on IO operations and latching within the database engine. For instance, when looking up data in SQL Server we traverse a B-Tree structure, which is a rotational media-optimized structure.  Hekaton does not use B-Trees, instead it uses memory pointers to get to the data.  This is orders of magnitude faster.  

Hekaton transactions are run in the equivalent of snapshot isolation level.  New versions of changed data are stored in RAM with a new pointer.  Transactions still get logged to the tran log and data is still persisted to disk, but the disk-based table data for Hekaton tables is only read from disk when the database starts up.  And it is not stored in B-Trees.  Instead, the versioned memory pointers are persisted to disk and on database startup those pointers/versions are re-read into memory.  

You will see errors just like with other in memory-based database managers if the data grows too large to fit into RAM.  The system does not fall back to traditional disk-based B-Trees.
Other Interesting Features
  • use SCHEMA_ONLY when creating table and it is non-durable and non-logged.  The data will be gone when the instance restarts (or fails over), but the schema remains.  This is good for ETL and session state information.  
  • If indexes on these tables are not B-trees then what are they?  Hash indexes...therefore all memory-optimized tables must have an index. Indexes are rebuilt on instance restart as the data is streamed to memory.  Indexes are not persisted to disk and are not part of your backup.  
  • No locks are acquired and there are no blocking waits.  In-memory tables use completely optimistic multi-version concurrency control.  
Implementation Features
  • the database must have a filegroup that CONTAINS MEMORY_OPTIMIZED_DATA that is used to recover the data.  This makes sense since legacy filegroups are B-Tree-organized.  
  • the tables (or database) must use a Windows BIN2 collation.  
  • tables can have no blobs or XML datatypes, no DML triggers, no FK/check constraints, no Identity cols, no unique indexes other than PK.  
  • maximum 8 indexes. 
  • There are no schema changes to the table once it is created.  That includes indexes.  

There is lots of good information on Hekaton on the internet.  Far more than I can put into a blog post.  This is an interesting development.  

Presentation on Metadata Driven Database Deployments is tonight

As a reminder, I am giving a presentation on metadata driven database deployments using my tool at 5:30 tonight at Microsoft's Malvern office.  The presentation and source code can be downloaded from CodePlex.  

See you there.


Handling Conflicts with Eventual Consistency and Distributed Systems

In distributed data architectures like some NoSQL solutions, a mechanism is needed to detect and resolve conflicting updates on different nodes.  There are many different ways to do this.  You don't need to understand the nitty gritty details of how this is done, but understanding the basic concepts is fundamental to understanding the strengths and limitations of your distributed data system.  In this post I'll cover some of the methods.
Distributed Hash Tables
BitTorrent's distributed tracker uses this technology.  In fact, some key-value systems like Cassandra and memcached (I'll cover both in a later post) are just giant DHTs.  Essentially a data element is passed to a common hashing function to generate a key, which is then passed around to other nodes.  
Quorom Protocol
Very simply, technologies that use QP must "commit" on n number of nodes for the transaction to be considered successful.  The "commit" can be handled in one of two ways:
  1. Like a traditional two-phase commit (2PC).  In this case the write commit is delayed until a quorum of n acknowledge the change.  Obviously this can introduce a high amount of transaction latency.  
  2. The quorum is obtained on the read side.  When the data is read the quorum is obtained then.  Again, latency is introduced in this model.  
Gossip Protocol
This method allows nodes to become aware of other node crashes or new nodes joining the distributed system.  Changes to data are propogated to a set of known neighbors, who in turn propogate to a different set of neighbors.  After a certain period of time the data view becomes consistent.  The problem is that the more nodes the system contains, the longer it will take for updates to propogate, which in turn means the "eventual consistency" takes longer and therefore the possibility of conflicts occurring increases.  
Vector Clocks
A vector clock is probably the simplest way to handle conflict resolution in a distributed system.  A vector clock is a token that distributed systems pass around to keep the order of conflicting updates intact.  You could just timestamp the updates and let the last update win...if your requirements are that simple.  But if the servers are geographically disparate it may be impossible to keep the clocks synchronous.  Even using something like NTP (Network Time Protocol) on a LAN may not keep the clocks synchronized enough.  Vector clocks tag the data event so that conflicts can be handled logically by the application developer.  This is basically how Git works under-the-covers.  
There are many ways to implement vector clocks but the simplest way is for the client to stamp its data event with a "tag" that contains what the client knows about all of the other clients in the distributed system at that point in time.  A typical tag may look like this:
Assuming this was client1, the above tag indicates client1 knows that client2 is on its "Revision 100", client3 is on its "Revision 78", client4 is on its "Revision 90" and it is on "Revision 50".  "Revision" in this sense is a monotonically increasing identifier specific to that node.  You could use a GUID, but that is rarely done.  
Problems with Vector Clocks
A vector clock tag grows very large as more clients participate in the system, and as the system generates more messages.  Some vector clock-based systems like Riak have a pruning algorithm to keep the vector clock tags more compact. Vector clocks are also used by Amazon's Dynamo architecture.  

Presenting at the December meeting of PSSUG

I am presenting at the December meeting of PSSUG.   December The topic is "Metadata Driven Database Deployments."  I have a mechanism I have been using for about 13 years where we can deploy a new database or upgrade an existing database in a fast, reliable, repeatable manner.  And the best part is, you don't need to know esoteric DDL commands.  Certainly you can use things like "scripting wizards" and "SQL compare" tools to do this but I'll show you some benefits that having a custom DDL deployer can handle that other tools cannot.   

Feel free to come out and learn about my deployment techniques.  Or just come out to heckle me.

SQL Server Central Management Server Query Patterns

If you deal with a lot of SQL Servers you know that Central Management Server (CMS) is your friend.  If you use CMS correctly you can run queries against all of your instances with very little additional work.  But there is *some* additional work.  For instance, what if you need to run queries against databases with different names or naming conventions on each instance?  Or only run the query on a given database if it contains a given application table?  Or only run a query for user databases?  Here are the query patterns I use to overcome some of these issues.  I hope you find them useful.  

Show me all register servers in my CMS

You should connect directly to your CMS and switch your db context to msdb to run this query.  

declare @name varchar(256), @server_name varchar(256)
declare dave cursor for 
	select name, server_name 
	from dbo.sysmanagement_shared_registered_servers_internal
	--from dbo.sysmanagement_shared_server_groups_internal
	where server_group_id = 6
open dave
fetch next from dave into @name, @server_name
while (@@FETCH_STATUS = 0)
	print 'echo Running: ' + @name 
	print 'PAUSE'
	print 'powershell .\something -ServerName "' + @server_name + '"'
fetch next from dave into @name, @server_name
close dave
deallocate dave

The remaining queries can be run by connecting to your CMS server in "Registered Servers" in Management Studio, and then selecting "Connect to All"

Run a command on a subset of databases based on a property and name

declare @something varchar(4000)
declare dave cursor for
	select name
	from sys.databases WHERE name like '%tnd%' and is_broker_enabled = 1
open dave
fetch next from dave into @something
while @@FETCH_STATUS = 0 
	select @something = 'EXEC ' + @something + '.dbo.Blah'
	exec (@something)
	fetch next from dave into @something
close dave
deallocate dave

Run a query against all databases named something

declare @something varchar(4000)
declare dave cursor for
	select 'select COUNT(*) from ' + name + '.dbo.Something WHERE Blah IS NULL AND StpDTime IS NULL '
	from sys.databases WHERE name like '%blah%'
open dave
fetch next from dave into @something
while @@FETCH_STATUS = 0 
	exec (@something)
	fetch next from dave into @something
close dave
deallocate dave


Structuring Your ETL like a Queue

This is a follow-on to Performant ETL and SSIS Patterns.  I really need to do a longer, explanatory post on this.  Two of the largest performance problems I see with SSIS packages is their lack of parallelism and the fact that they are written to run large batches during a defined time window.  The former is totally unnecessary, the latter is unecessary if you structure your processing smartly.  

Performant ETL and SSIS Patterns

It's a helluva job market out there right now if you have ETL, SSIS, DataStage, or equivalent experience.  I guess you can make some generalized deductions about this:

  • more companies are trying to integrate their data stores.
  • more companies need to copy data from OLTP to OLAP systems.  
  • It's hard to find good ETL people.  

Unfortunately, too many job postings ask candidates to have specific ETL tooling experience such as SSIS or DataStage.  This is unfortunate.  Too many candidates have great tooling experience but have very little grounding in ETL best practices, regardless of chosen tool.  I've been called in a lot lately to help fix various ETL processes.  Each one is using a different ETL tool and each one is exhibiting the same terrible ETL anti-patterns.  When I fix those anti-patterns everything just magically runs better.  I have yet to touch actual ETL code.  

To quickly summarize the most egregious issue...developers are doing too much work in the ETL tool and not enough work in their RDBMS.  The RDBMS will almost always do things faster than the ETL tool can.  There are few exceptions to this rule (string manipulation and regexp is better in most ETL tools than in SQL for instance).   

I've written tons of blog posts (here's a link to an entire series of ETL Best Practices) about how to do performant ETL with good patterns.  However, I find myself constantly searching my blog to find a succinct list of things to check whenever I'm brought into another ETL engagement.  Here's the biggies:

  • ETL Best Practices.  I'm not going to list all of them.  That's a long blog post.  You should go reference that.  
  • Do more in SQL and less in the ETL tool.  Examples:
    • SSIS is not available in SQL Express 
    • Sorts are better handled in the RDBMS.  Use an ORDER BY clause on your SQL statement instead of relying on your ETL tool to sort.  If using SSIS, mark your OLEDB source metadata on the data source as sorted.  
    • Any set-based, relational operation will be faster in SQL than in your ETL tool.  Your RDBMS will likely automatically determine the best parallelism and memory management to use.  On the contrary, you'll never get this right in your ETL tool.  
    • Big packages are not good.  If your SSIS package is swapping to disk then you are not running efficiently.  There is a SSIS performance counter called "Buffers Spooled".  It should always stay at 0.  Always.  Otherwise you are using your swap file.  
    • If you run Integration Services on your SQL Server then use the "SQL Server Destination" vs the "OLEDB destination".  Performance is markedly better.  
  • Understand and optimize your data structures.  Examples:
    • Understand locking, Lock Escalation, and transaction isolation semantics for your given RDBMS.  If using SSIS then understand the difference between BATCHSIZE vs ROWS_PER_BATCH and ensure they are optimized for your destination RDBMS.  
    • Partitioned Tables.  Understand them and try to use them if your destination is SQL Server.  The SWITCH statement is your friend, maybe your best friend.  
    • Make sure you are always doing Minimally Logged Operations whenever possible.  And always verify with testing and performance monitoring.  Every RDBMS has different rules for what equates to a minimally logged operation.
    • Determine whether it is faster to disable/drop your indexes before your data loads.  There is lots of conflicting guidance and there is no substitute for testing.  I've found that every engagement a different setting is needed.   Here is some additional guidance you may not find elsewhere:
      • A commit size of 0 is fastest on heaps
      • if you can't use 0 because you have concurrency concerns then use the highest value you can to reduce the overhead of multiple batches and transaction control.  
      • A commit size of 0 on a clustered index is a bad idea because all incoming rows must be sorted.  This will likely cause spooling to tempdb which should be avoided.  
  • There are network-related settings that are invaluable for performance:
    • jumbo frames will increase your packet payload from 1500 bytes/frame to 9000 bytes/frame.  
    • Change sqlConnection.PacketSize.  The default is 4096 bytes.  That's small for moving around large amounts of data.  32767 would be better.  
    • use network affinity at the OS level.  
  • When using the OLEDB source, always set the Data Access Mode to "SQL Command".  Never "table or view".  Performance is almost always better.  The reason is that sp_prepare is called under the covers using the latter and therefore has a better shot at getting a good execution plan.  The former does the equivalent of a SET ROWCOUNT 1 to get its column metadata.  That can lead to bad execution plans.  
  • You should seriously consider restructuring your ETL processes so they work like a queue.  I'll cover this in the next post, Structuring Your ETL like a Queue.  


Subscribe to RSS - SQL Server