DaveWentzel.com            All Things Data

Virtualized SANs


There are a lot of conflicting definitions of virtualized SANs or "storage virtualization" in the press, given that this is a new "hot" technology.  You may also hear this called "VSAN" or "zoning".
Simply, anything virtualized is a "logical" representation of a normalized resource that may not be entirely "there".  This may make sense on a SAN if you work at a place that insists on no DAS and only SAN disk.  Most web servers need only a small amount of disk, and it is usually read only.  In a traditional SAN we would need to allocate a LUN, size it for future growth (just in case), and in all likelihood use only a very small fraction of the disk allocated.  A virtualized SAN doesn't need to reserve all of that space up front.  Of course, virtualization has an overhead cost, which is critical on a transactional database.
Storage virtualization can occur at different points in the stack.  There is block virtualization (the actual block on the disk is virtualized), disk virtualization (what would be a "virtual disk" in an LVM tool like Veritas Volume Manager or even a .vhd file in MS Virtual PC), network virtualization, even virtualization on the host itself (using LVM tools).  Even tape can be virtualized. 
Basically, SCSI communications with host systems are managed by LUs (logical units) in the virtualization system and not by the LUs in the storage system.  That's the textbook definition I use.  This is "in-band" virtualization, which is the de facto standard today.  It can add significant latency because IOs need to be regenerated and forwarded.  Transmission errors *can* also increase. 
Out-of-band virtualization is more like distributed volume management software.  It's not native to your fabric, so to speak.  It will likely require a new piece of hardware or software. 
Multipathing could even be considered a component of a virtualized SAN. 
Oversubscribed LUNs are also an aspect of SAN virtualization. 
The biggest disadvantage of a virtualized SAN (I can't say this enough) is latency.  Test, test, test to make sure latency won't kill your application.  Latency can be improved by better caching. 
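So why virtualize?  The key benefit is scalable storage.  This is usually achieved with concatenation, which appends the blocks of one storage address space to another.  It is almost like RAID striping in that one logical volume spans several devices, although the blocks are appended end to end rather than interleaved. 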
Another benefit is subdivision - the opposite of concatenation.  I can take a large address space and divide it into smaller units.  You can purchase bulk storage and parcel it out later. 
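To make concatenation and subdivision concrete, here is a minimal Python sketch of the address arithmetic.  It is an illustration only, not how any particular SAN product does it.  The function names (`concatenate`, `subdivide`) and the device names are hypothetical, and real virtualization layers also track extents, metadata, and failures, none of which is modeled here.

```python
from bisect import bisect_right
from itertools import accumulate


def concatenate(extents):
    """Return a resolver for a logical volume built by concatenation.

    extents: ordered list of (device_name, block_count) pairs.
    The resolver maps a logical block number to
    (device_name, block_number_on_that_device).
    """
    if not extents:
        raise ValueError("at least one extent is required")
    sizes = [count for _, count in extents]
    ends = list(accumulate(sizes))  # cumulative end (exclusive) of each extent

    def resolve(logical_block):
        if not 0 <= logical_block < ends[-1]:
            raise ValueError("logical block is outside the volume")
        index = bisect_right(ends, logical_block)  # extent holding the block
        start = ends[index] - sizes[index]         # first logical block of it
        return extents[index][0], logical_block - start

    return resolve


def subdivide(total_blocks, sizes):
    """Carve one large address space into smaller units.

    Returns a list of (start, end) block ranges, end exclusive.
    """
    if sum(sizes) > total_blocks:
        raise ValueError("requested units exceed the available space")
    ranges = []
    start = 0
    for size in sizes:
        ranges.append((start, start + size))
        start += size
    return ranges


# Hypothetical devices: a 100-block array followed by a 50-block array.
resolve = concatenate([("array_a", 100), ("array_b", 50)])
first_block_on_b = resolve(100)

# Parcel 500 of 1000 purchased blocks out as two units; the rest stays free.
units = subdivide(1000, [200, 300])
```

The host only ever sees logical block numbers.  Growing the volume means appending another extent to the list, which is why concatenation scales so easily.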
Fan-In and Fan-Out
Fan-in is the number of simultaneous initiator/target connections available through a single port.  Fan-in helps to add resilience to the fabric, which is good, for instance, for clustering. 
Fan-out is the number of downstream LUNs used to form a single upstream LUN.  In other words, it is the ratio of downstream to upstream entities in a network. 
By using fan-in/fan-out, device ports can be shared across more hosts.  This is really a network topology concept.  For device fan-in it is best to connect devices that do not need full bandwidth, such as tape devices, older storage, and older hardware, to an edge switch. 
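As a rough sketch, both numbers can be counted from a simple inventory using the definitions above.  The host, port, and LUN names are made up for illustration, and `fan_in_per_port` and `fan_out` are hypothetical helpers, not part of any vendor tool.

```python
from collections import Counter


def fan_in_per_port(connections):
    """Count distinct initiators connected through each target port.

    connections: iterable of (initiator, target_port) pairs.
    """
    distinct = set(connections)  # ignore duplicate entries for the same pair
    return Counter(port for _, port in distinct)


def fan_out(virtual_lun_map):
    """Count the downstream LUNs that form each upstream LUN.

    virtual_lun_map: dict of upstream LUN -> list of downstream LUNs.
    """
    return {up: len(set(down)) for up, down in virtual_lun_map.items()}


# Hypothetical fabric: two hosts share port_a, one host uses port_b.
connections = [
    ("host1", "port_a"),
    ("host2", "port_a"),
    ("host3", "port_b"),
    ("host1", "port_a"),  # duplicate entry, counted once
]
port_fan_in = fan_in_per_port(connections)

# Hypothetical mapping: one upstream LUN built from three downstream LUNs.
lun_fan_out = fan_out({"vlun0": ["lun3", "lun7", "lun9"], "vlun1": ["lun4"]})
```

A port with a high fan-in count is a candidate for the low-bandwidth devices mentioned above, since every initiator on it shares the same link.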
SAN Virtualization Best Practices
  • Virtualization sounds great, but it can kill a database.  Know your Avg. Disk sec/Read and what is acceptable for your application (see Recommended PerfMon Counters).  On a SAN there aren't many valuable disk counters, so use the ones that can prove you have issues. 
  • Look for abnormal service behavior.  In other words, know your baseline. 
  • Look for shared resource contention.  Virtualization is about sharing, yet your data files don't want to share physical disks with your log files.  Make sure they don't. 
  • Convince management not to virtualize to saturation, or maximum utilization.  Virtualize only to the highest utilization at which performance still holds.  In many cases this means a large amount of storage needs to go unused if database performance is critical.  That point is your optimal throughput. 
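Knowing your baseline can be as simple as comparing sampled read latency against a number you measured when things were healthy.  Below is a minimal sketch, assuming you have already exported Avg. Disk sec/Read samples (in seconds) from PerfMon into a list.  The function name and the 1.5x tolerance are assumptions for illustration, not recommended thresholds.

```python
from statistics import mean


def check_read_latency(samples_sec, baseline_sec, tolerance=1.5):
    """Compare sampled Avg. Disk sec/Read values with a known baseline.

    samples_sec:  latency samples in seconds.
    baseline_sec: latency measured while the system was healthy.
    tolerance:    multiple of the baseline still treated as normal
                  (an assumed value; pick one that suits your application).
    """
    if not samples_sec:
        raise ValueError("no samples supplied")
    average = mean(samples_sec)
    return {
        "average_sec": average,
        "worst_sec": max(samples_sec),
        "over_baseline": average > baseline_sec * tolerance,
    }


# Hypothetical samples: two normal reads and one slow one, 5 ms baseline.
report = check_read_latency([0.004, 0.006, 0.020], baseline_sec=0.005)
healthy = check_read_latency([0.004, 0.005], baseline_sec=0.005)
```

Run the same check before and after a storage change.  If `over_baseline` flips to true once the LUN moves behind the virtualization layer, you have the proof you need.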
Final Thoughts
SAN virtualization systems can become a single point of failure if not designed properly.  The solution is to cluster your SAN.