DaveWentzel.com            All Things Data

August 2010

-ilities for Data Architects

There are a bunch of non-functional requirements that every system should have, and I don't think data architects think about them enough.  These are sometimes referred to as the "ilities" because most tend to share the same suffix...-ility.  Rarely does anyone bother to document these on a new project, and rarely are solutions architected around them.  Most people say, "Yeah, we know about the ilities.  Whenever we take on new work it's just assumed we will take care of things like quality, extensibility, maintainability, scalability, reliability, and performance.  It will be inherent in the design!  No need to waste time on the obvious!"

Bad idea.  If any of that were true I know a lot of support and performance management people that would be unemployed.  

Granted, some ilities are the manifestation of a proper implementation of the given requirements, and frankly, yes, just a product of good architecture and design.  Take reliability for example.  If a system is architected properly then it should be reliable from the user's perspective.  

Some examples where we don't consider the ility requirements properly:

  • Security:  How many systems have you seen or worked on where the database connection is made from the web server using sa or SYS/SYSTEM?  Or there is only one application-level db account, and it has been disseminated to everyone to use to connect to the db with Excel for ad-hoc reporting.  And then they decide to bulk update the data using Access because it's easier than using the approved front-end.  
  • Performance:  "Let's just get the requirement out the door; we'll fix performance later when a few customers have larger data sets."
  • Supportability:  "We don't need to fix this shortcoming of our product; we'll let support handle it when the customer calls in."  But how can support handle it without a tool or documentation?
  • Manageability:  How many times do you ship a product using 3rd-party components without documentation or a plan in place for how to upgrade those components?  
  • Deployability:  Do you require the database upgrade component of your application to be handled in a repeatable fashion?  Or do you just throw some scripts together and tell the DBA to run them?  Change control should be thought out from the start.  
  • Learnability:  You certainly provide documentation for end users.  How about sysadmins and DBAs?  Do the DBAs just make index changes willy-nilly, or is a process followed where they are taught how to handle the special needs of your database?  
  • Testability:  Do your LoadRunner guys know how to test your app?  Do you?  Do you use viewstate?  Consider building test harnesses directly into your business tier where appropriate.  

You might be able to sell your software on the whiz bang UI, but your customers will keep coming back if you pay attention to the ilities.  In golf you drive for show but putt for dough.  Ilities are the putter in your architecture golf bag.

Database Server Consolidation - History and Future

DB servers, especially SQL Servers, seem to sprout up like weeds.  Somebody has some requirement for a little departmental application and right away that requires a new db server (for its huge database of perhaps 100 MB) and another RDBMS license.  Oracle folks tend not to have this problem...the planning and allocation phase seems to be given more attention for some reason with Oracle.  If this trend continues you find after many years that your data center is full of SQL Servers with low utilization, throwing off tremendous heat and using a lot of power.  

History of Server Consolidation

Server consolidation has been hot in IT for quite a few years now.  It's only natural to want to consolidate your db servers.  When consolidation first became a hot topic the idea was to replace those commodity Wintel departmental SQL Servers with blade servers.  They use less space and generate less heat.  

Still, this required purchasing blades, which were not cheap, and potentially tossing out the old Wintels.  Then we discovered the joys of VMWare and virtualization.  Those departmental SQL Servers could be virtualized quickly and easily using a P2V (physical to virtual) tool.  In fact, this worked so well that we began virtualizing every SQL Server we could find, even the clustered solutions (of course ensuring each cluster node resides on a different physical host).  This is pretty much the current state of virtualizing database servers.  

The Future

But there's much more we can be doing.  Virtualizing a server saves on power, rackspace, and climate control, but it saves very little on licensing.  SQL Server does have some interesting new licensing options when you virtualize, but there are still license costs for the ancillary software we all put on our servers (backup software, AV, monitoring, etc).  And of course each of those VMs still has to be patched every Patch Tuesday.  

A much better method is to virtualize up and down the stack where it makes sense.  Again, currently in the industry most people are only virtualizing at the machine-level.  

Consolidate at the Instance level

Some applications/databases can be virtualized by having them co-exist on a given SQL instance/machine.  If a very simple departmental database is really just a database and a few logins to access the data, and it requires no other features of SQL Server (Agent, access to the OS, etc), then we can co-locate it with other databases on the same instance/machine without any issue.  Usually this involves copying the database and logins to another machine/instance and then updating a connection string on the web server.  
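The mechanics of the move aren't shown here, but a minimal sketch, assuming a hypothetical database DeptDB with an application login DeptAppUser, might look like this:

--on the old departmental server
BACKUP DATABASE DeptDB TO DISK = 'C:\sqlbackups\DeptDB_move.bak';

--copy the .bak file to the consolidated server, then restore it there
RESTORE DATABASE DeptDB FROM DISK = 'C:\sqlbackups\DeptDB_move.bak'
     WITH MOVE 'DeptDB' TO 'D:\data\DeptDB.mdf',
          MOVE 'DeptDB_log' TO 'D:\data\DeptDB_log.ldf';

--recreate the application login (the sp_help_revlogin script from Microsoft's KB can preserve the original SID/password)
CREATE LOGIN DeptAppUser WITH PASSWORD = 'ChangeMe!12345';   --hypothetical login and password
USE DeptDB;
EXEC sp_change_users_login 'Auto_Fix', 'DeptAppUser';        --remap the orphaned database user to the new login

After that, the only change on the application side is the connection string.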

The benefit to this is we've eliminated a SQL license, we've eliminated another machine that will need to be patched, and we've eliminated all of the CPU cycles required to operate the OS and supporting software...we are only using the true cycles that the app/db requires.  Kinda sounds like the old TSO (time sharing option) systems from the 1960's.  The circle of life continues, what is old is new again.  

There are some drawbacks.  Since all of the databases will run under the same SQL service account, with the same global settings, shared memory, and tempdb, we need to be careful.  Departmental apps often simply log in as sa or as an account with sysadmin privileges.  You may need to re-evaluate this if security is a concern among consolidated databases.  Collation is another concern.  The databases must share the same collation.  Ah, but you say that SQL Server collations can be different on each database now.  That's true, but most applications use tempdb to create #temptables.  Those may not function correctly if tempdb's collation is not the same as that of your database.  You only get one collation for tempdb.  
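Here's a quick illustration of the tempdb collation pitfall (table and column names are hypothetical):

--a #temptable column silently takes tempdb's collation, not the user database's
CREATE TABLE #Names (LastName varchar(50));

SELECT c.*
FROM dbo.Customers c
JOIN #Names n
     ON c.LastName = n.LastName COLLATE DATABASE_DEFAULT;   --without the COLLATE clause this join can fail
                                                            --with a collation-conflict error when tempdb and
                                                            --the user db have different collations
DROP TABLE #Names;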

If one of your databases is a resource hog (constantly reading massive amounts of data into and out of the buffer cache) it may contend with the other databases.  On a 32bit OS you have a fixed amount of RAM available for the buffer cache and you really don't want that contention.  

Consolidate Instances

Once you've considered consolidating databases to one instance, it's time to consider consolidating instances to a single machine.  VM hosts tend to be big, beefy boxes.  They need to be in order to handle multiple guests.  However, if you've ever looked at a quiet SQL Server you'll note there is still a lot of CPU activity.  This is due to AV and other data center monitoring tools, SQL Agent heartbeat activity, etc.  This is all wasteful if we just virtualize the SQL box and make it a guest using P2V.  

Instead, let's consider taking all of our instances and consolidating them onto a single server.  Your databases probably reside in the default, unnamed instance of SQL Server (the default instance for SQL Express is called \SQLEXPRESS).  You can create multiple named instances on a single machine.  What are the pros and cons...

Pros:
  • A single 32-bit SQL instance can only address so much RAM (depending on the Server OS).  If your SQL Server host (running multiple instances) has lots of memory, then each instance can run with its maximum RAM without sharing buffer cache or experiencing memory contention.
  • Each instance can have a different service account, hence different security needs can be addressed.
  • Each instance can have a different collation, global settings, etc.
  • $$$ (see next section)

Cons:
  • Ensure your application can connect to a named instance/alternative port (see the sketch after this list).
  • The CPU/disk can still be a bottleneck since they are still shared across instances.
  • If the db/app requires OS-level access (for file copying, for instance) then inter-database security may still be a concern.
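Connecting to a named instance (the first con above) is usually just a connection string change.  Some hypothetical examples, assuming a host named CONSOLIDATED01 and an instance named Payroll:

--ADO.NET connection string:  Server=CONSOLIDATED01\Payroll;Database=DeptDB;Integrated Security=SSPI;
--sqlcmd by instance name:    sqlcmd -S CONSOLIDATED01\Payroll -d DeptDB
--sqlcmd by fixed TCP port:   sqlcmd -S CONSOLIDATED01,1435 -d DeptDB

--from T-SQL you can confirm which instance you actually landed on:
SELECT SERVERPROPERTY('MachineName') AS MachineName,
       SERVERPROPERTY('InstanceName') AS InstanceName;   --NULL means the default instance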

License Consolidation

Whenever you eliminate SQL Server hosts using instance consolidation you no longer need a license for that host.  A SQL Server license entitles you to run multiple instances on a machine at no extra charge.  This is very compelling.  But you should also consider the free SQL Express edition to save even more money.  SQL Express can run multiple instances on a single machine; the only caveats are that the buffer cache is limited to 1 GB and db size is limited to 10 GB.  Before you scoff at using Express due to the limitations, remember that we are only considering consolidating small databases anyway.  Also, remember that there are legal ways to bypass these limitations.  I've written an article about some interesting ways to use SQL Express in non-traditional ways.  License consolidation is just another method.  

Some other notes

Virtualization using Windows Server 2008 R2 Hyper-V and Live Migration allows VMs to move between hosts without any perceived service interruption.  

A P2V tool is included with System Center Virtual Machine Manager.  

Summary

When embarking on a database consolidation effort, don't just run a P2V on your SQL box and call it a day.  Really think about your architecture.  Can you consolidate databases on a single instance?  Can you consolidate many instances on a single machine?  Can you consider SQL Express to save your company money?  All of these ideas will make you a hero to management and a friend to the environmentalists.  

Data Tier Applications (DTA) vs My DB Porting Process

Data Tier Applications (abbreviated DAC) are the latest method in SQL 2008 R2 to make database deployment and versioning a little bit easier and native to the RDBMS stack.  A few years ago MS created database projects in Visual Studio, which were a slightly different take on deployment and versioning.  Database projects are actually a superset and have some useful features, but I never thought the packaging/deployment stuff was all that great.  It's affectionately called Data Dude.  

Back to DAC...my take is that this technology won't take off and implementations will be few, leading to a quick deprecation similar to Notification Services (it died after 3 years).  The reasons are numerous but mainly database deployments are not the same, nor as simple, as your .NET code.  Database upgrades require you to understand the current state of the db and what must happen to the data to get it up to the current release.  DAC actually does a good job of this by running schema compares.  I'm impressed here.  But it still issues warnings and gives an admin choices as to what to deploy and skip.  That's the problem...I would never allow someone else to deploy my database upgrade piecemeal.  Like everything else in a database, the upgrade needs to be ACID.  

Obviously I like My Build Process better than anything else on the market.  It handles these situations cleanly, yet is still flexible.  Let's compare and contrast.  

  • Single unit of deployment that contains all elements used by the application (schema, db objects, instance-level objects).  DAC: Yes.  My DB Porting Process: Yes.
  • Centrally manage, monitor, deploy, and upgrade applications using SQL Server Management Studio.  DAC: Yes.  My DB Porting Process: Kinda...it can be centrally managed, just not with SSMS.
  • Requires additional licensing fees.  DAC: No.  My DB Porting Process: No.
  • Can migrate an existing db to the DAC/DB Porting Process format.  DAC: Yes.  My DB Porting Process: Yes.
  • SQL Server release compatibility.  DAC: 2008 R2 only for deployment; the package can be created on older versions.  My DB Porting Process: 7, 2000, 2005, 2008, 2008 R2.
  • Can utilize other RDBMS backends.  DAC: No.  My DB Porting Process: Yes (Oracle only...but can be ported easily).
  • Compares database schemas between 2 dbs (dev and prod, for instance).  DAC: Yes.  My DB Porting Process: Yes.
  • Compares master/system/lookup data between 2 dbs (dev and prod, for instance).  DAC: No.  My DB Porting Process: Yes.
  • Integrates with any version control system easily.  DAC: Yes.  My DB Porting Process: Yes.
  • Robust error handling (fail/continue on error, warnings, etc.).  DAC: Yes.  My DB Porting Process: Yes.

The differences are in the flexibility when upgrading a database from version x to version y.  This is especially cumbersome when version x is not known and the process has to "figure it out" dynamically and know just what to do.  DAC does a good job at this, whereas all other tools require a source/target compare, which is difficult for deployment to thousands of customers.  

  • Method to preserve data prior to upgrade.  DAC: copies the db to a new db.  My DB Porting Process: standard SQL backup.
  • Method to upgrade data.  DAC: deploys the new db/schema, copies data into it from the copy, then renames the new db to the old db name, maintaining the copy.  My DB Porting Process: in place.  This will be *much* faster than DAC since most data in most upgrades won't need to be touched, yet DAC is always copying data.
  • Upgrade is an online operation.  DAC: No.  My DB Porting Process: Partially.  New objects (without dependencies) are deployed while the db is online, saving downtime.
  • Can upgrade FROM any version TO any version.  DAC: Yes; it has a "change detection" phase that looks for differences and auto-generates the SQL to make the change.  My DB Porting Process: Yes; scripts are written assuming a change is always needed, hence they run *much* faster since the evaluation is not done at upgrade time.
  • Has a "confirmation page" before applying changes.  DAC: Yes.  My DB Porting Process: No.  Not needed, nor desired.  This should never be left to anyone but the developer; an admin should not make these decisions.  We either succeed, fail, or log a warning, but the outcome of the scripts is always known in advance; we never leave decisions to someone outside of the approved change control process.


Keynesian Quote of the Day

Too large a proportion of recent 'mathematical' economics are mere concoctions, as imprecise as the initial assumptions they rest on, which allow the author to lose sight of the complexities and interdependence of the real world in a maze of pretentious and unhelpful symbols.  

--JM Keynes
 
I read this last night and had totally forgotten this quotation.  This is single-handedly the most important contribution Keynes made to economics, in my opinion.  Even so, it's in the typically wordy Keynes style that makes him difficult to read and comprehend.  
 
I find it interesting that not a single current Keynesian/post Keynesian/Post-Keynesian/neo-Keynesian (yes, these are distinctions with actual differences) actually believes this today.  I read constantly how economists have updated their models for x, y, and z reasons...personal consumption is up 0.03%, etc etc.  Do any of these people think that either a) they found "the" magic formula that is the divining rod for perfect economic equilibrium and prosperity or b) their math is even remotely accurate?  Go read your Mises, my friends!
 

A GUI-less Profiler

If you do a lot of troubleshooting like me you really love Profiler/SQLTrace.  I've written tons of scripts that manipulate SQLTrace (SQLTrace is the actual tracing mechanism, Profiler is the GUI most DBAs are familiar with) to handle, for instance, capturing data during a LoadRunner test and then later analyzing the data by grouping the calls to spot trends and bottlenecks.  

For ad hoc and day-to-day profiling, Profiler.exe is great.  However, the tool does not ship with SQL Express.  There are free, open source alternatives that are excellent (my favorite), but when I'm remoted to a customer's machine I really don't want to have to load up more software.  That just looks bad.  Our shipping software should be manageable without another tool.  

My solution is to create a little stored proc, dbo.Profiler, that accepts a START or STOP parameter.  The START parameter starts SQLTrace using a known set of events and filters that I care about (customizable) and writes the data to a file.  The STOP parameter stops the trace and closes the file.  The file is then read into a table called dbo.ProfilerTable that I can then examine.  Finally the .trc file is removed from the system.  
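Usage ends up looking something like this (a sketch using the proc and table names described above):

EXEC dbo.Profiler 'START';

--reproduce the problem in the application...

EXEC dbo.Profiler 'STOP';

--the trace output is now queryable
SELECT TextData, Duration, Reads, CPU, StartTime
FROM dbo.ProfilerTable
ORDER BY Duration DESC;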

This works very well when I need to see the exact call that our application is making for a customer so I can do troubleshooting.  This process is also much easier to document for support personnel to run than something like Profiler.exe.  

You can download the script here...let's just look at some interesting parts.  

 

When you run SQLTrace the trace data is output to a .trc file.  We need to specify that location, and we also specify to roll the file over after 200 MB.  
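A sketch of this part (the full script is in the download); note that sp_trace_create wants the path without the .trc extension (it appends it) and the maximum file size in MB:

DECLARE @TraceFile nvarchar(245), @MaxFileSizeMB bigint;
SELECT @TraceFile = N'C:\temp\dbo_Profiler_trace';   --hypothetical location; .trc is appended automatically
SELECT @MaxFileSizeMB = 200;                         --with the TRACE_FILE_ROLLOVER option a new file is started every 200 MB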

We need to save this file location somewhere so we can stop the trace later and load the file.  I have an EAV (name/value pair) table to store this kind of data.  
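Assuming the EAV table is a simple name/value table called dbo.ProfilerOptions (the actual name will differ in the download), saving the location might look like:

--dbo.ProfilerOptions: OptionName sysname PRIMARY KEY, OptionValue nvarchar(4000)
DELETE dbo.ProfilerOptions WHERE OptionName = N'TraceFile';
INSERT dbo.ProfilerOptions (OptionName, OptionValue)
VALUES (N'TraceFile', N'C:\temp\dbo_Profiler_trace');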

Here we create the trace and get the TraceID back from the call.  Again, we save that to our options table so we can stop it later.  
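A sketch of the create call; option value 2 is TRACE_FILE_ROLLOVER:

DECLARE @TraceID int, @rc int;
DECLARE @TraceFile nvarchar(245), @MaxFileSizeMB bigint;
SELECT @TraceFile = N'C:\temp\dbo_Profiler_trace', @MaxFileSizeMB = 200;

EXEC @rc = sp_trace_create @TraceID OUTPUT, 2, @TraceFile, @MaxFileSizeMB;
IF @rc <> 0
     RAISERROR('sp_trace_create failed with return code %d', 16, 1, @rc);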

The syntax to specify the events and cols we trace is a little goofy.  We call sp_trace_setevent for the trace/event/data col that we want.  In this case 10 = RPC:Completed.  The data cols are documented in BOL.  This means that there are potentially *a lot* of these calls (something like 300 in this script.)
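A few of those calls, to give the flavor (event 10 = RPC:Completed, event 12 = SQL:BatchCompleted; columns 1 = TextData, 13 = Duration, 16 = Reads, 18 = CPU):

DECLARE @TraceID int, @on bit;
SELECT @TraceID = 1, @on = 1;                   --@TraceID really comes from sp_trace_create above

EXEC sp_trace_setevent @TraceID, 10, 1,  @on;   --RPC:Completed, TextData
EXEC sp_trace_setevent @TraceID, 10, 13, @on;   --RPC:Completed, Duration
EXEC sp_trace_setevent @TraceID, 10, 16, @on;   --RPC:Completed, Reads
EXEC sp_trace_setevent @TraceID, 10, 18, @on;   --RPC:Completed, CPU
EXEC sp_trace_setevent @TraceID, 12, 1,  @on;   --SQL:BatchCompleted, TextData
EXEC sp_trace_setevent @TraceID, 12, 13, @on;   --SQL:BatchCompleted, Duration
--...and so on for every event/column combination you care about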

Here we apply the trace filters.  I filter out Profiler's own traffic; this isn't strictly needed in a SQLTrace script since you'll only see that traffic from the Profiler GUI.  SSRS is extremely chatty so I filter those calls as well. 
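The filter calls might look like this (column 10 = ApplicationName, logical operator 0 = AND, comparison operator 7 = NOT LIKE; the SSRS application name pattern is an assumption, so adjust it for your environment):

DECLARE @TraceID int;
SELECT @TraceID = 1;   --again, from sp_trace_create

EXEC sp_trace_setfilter @TraceID, 10, 0, 7, N'SQL Server Profiler%';   --Profiler's own traffic
EXEC sp_trace_setfilter @TraceID, 10, 0, 7, N'Report Server%';         --SSRS chatter (assumed pattern)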

After the trace is set up we start it...
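Starting it is a single call (status 1 = start):

DECLARE @TraceID int;
SELECT @TraceID = 1;                   --from sp_trace_create
EXEC sp_trace_setstatus @TraceID, 1;   --1 = start the trace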

...and save the @TraceID to the EAV so we can programmatically stop the trace later.  
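Continuing the sketch, that's just another row in the hypothetical dbo.ProfilerOptions table:

DECLARE @TraceID int;
SELECT @TraceID = 1;   --the value returned by sp_trace_create

DELETE dbo.ProfilerOptions WHERE OptionName = N'TraceID';
INSERT dbo.ProfilerOptions (OptionName, OptionValue)
VALUES (N'TraceID', CONVERT(nvarchar(10), @TraceID));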

Here is the section that handles stopping the running trace.  



We first retrieve the @TraceID
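A sketch of the STOP branch, reading the saved value back out of the hypothetical dbo.ProfilerOptions table:

DECLARE @TraceID int;
SELECT @TraceID = CONVERT(int, OptionValue)
FROM dbo.ProfilerOptions
WHERE OptionName = N'TraceID';

EXEC sp_trace_setstatus @TraceID, 0;   --0 = stop the trace
EXEC sp_trace_setstatus @TraceID, 2;   --2 = close it and delete its definition from the server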

Since we don't have the Profiler GUI to see the trace output, let's just load it into a table.  We do this using ::fn_trace_gettable(), which simply takes a .trc file and displays the results relationally.  Instead of just displaying them I load them into a table so I can requery the data more easily later, ORDERing the data differently, applying WHERE predicates, etc.  
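A sketch of the load, remembering that sp_trace_create appended .trc to the file name we stored:

DECLARE @TraceFile nvarchar(260);
SELECT @TraceFile = OptionValue + N'.trc'
FROM dbo.ProfilerOptions
WHERE OptionName = N'TraceFile';

IF OBJECT_ID('dbo.ProfilerTable') IS NOT NULL
     DROP TABLE dbo.ProfilerTable;

SELECT *
INTO dbo.ProfilerTable
FROM sys.fn_trace_gettable(@TraceFile, DEFAULT);   --DEFAULT also picks up any rollover files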

Finally, I delete the .trc file.  We need to do this via xp_cmdshell, which is a security risk, so if the DBA has it disabled I want to respect that by enabling it only temporarily and re-disabling it when I'm through.   
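A sketch of that cleanup, only enabling xp_cmdshell when it isn't already on and putting the settings back afterward (the file path is the hypothetical one from earlier):

DECLARE @WasEnabled int, @cmd varchar(500);
SELECT @WasEnabled = CONVERT(int, value_in_use)
FROM sys.configurations
WHERE name = 'xp_cmdshell';

IF @WasEnabled = 0
BEGIN
     EXEC sp_configure 'show advanced options', 1;  RECONFIGURE;
     EXEC sp_configure 'xp_cmdshell', 1;  RECONFIGURE;
END;

SELECT @cmd = 'del "C:\temp\dbo_Profiler_trace*.trc"';   --remove the trace file(s), including rollovers
EXEC master..xp_cmdshell @cmd, no_output;

IF @WasEnabled = 0
BEGIN
     EXEC sp_configure 'xp_cmdshell', 0;  RECONFIGURE;   --leave the server the way we found it
     EXEC sp_configure 'show advanced options', 0;  RECONFIGURE;
END;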

 

Automatically Generating Restore Scripts

This is the third part of a quick refresher on database log backup options.  It seems like remembering all of the various RESTORE options, especially at 3AM when your system just crashed, is difficult.  It's also something we don't do every day.  I like to make processes as automated and checklisted as possible.  Granted, you can't fully automate a restore, but you can automatically generate a series of RESTORE commands that you can then double-check and execute in a pinch.  These scripts do just that (GenRestoreScripts.sql).  They are meant to work with my DBA-in-a-Box scripts that automate your backups to a consistent file naming convention, but you can certainly change them to fit your environment.  Basically, once you have the .bak file(s) the script will generate the RESTORE commands by reading the headers for the relevant information.  

Let's take a look at the more interesting aspects of the scripts. 

This is the only variable that needs to be set for the script (@BackupFile, the path to the full backup).

The script assumes it is restoring over an existing db; that may not be the case, so it's important to double-check the output SQL commands before running them. 

Here we back up the "tail of the log" before any RESTORE operation.



Here we begin building the string to RESTORE the database...

However, we first need to know what mdf, ldf, filestream, etc. files exist; RESTORE FILELISTONLY gives us that list.  

After restoring the database and log sequence we restore the log tail and recover the database, bringing it online and synching logins to sysusers.

/*
given a full database .bak file, generates a restore script to restore the db properly
is coded to work specifically with DBA-in-a-box scripts


exec sp_backupAllFull 'c:\sqlbackups\'
exec sp_backupAllLog 'c:\sqlbackups\'
change variables below
then run on any sql server pointing to the local folder with the .bak files
*/

USE master;
GO 
SET NOCOUNT ON
DECLARE @dbName sysname
DECLARE @BackupFile VARCHAR(5000) , @PathOnly varchar(5000)
SELECT @BackupFile = 'C:\DATA\backups\MaxAppITB_201007201343.bak'
DECLARE @cmd NVARCHAR(500)
DECLARE @fileList TABLE (backupFile NVARCHAR(255))
DECLARE @lastFullBackup NVARCHAR(500)
DECLARE @lastDiffBackup NVARCHAR(500)
DECLARE @Position int

DECLARE @dbExists bit
SELECT @PathOnly = substring(@BackupFile,1,len(@BackupFile) - CHARINDEX('\',REVERSE(@BackupFile),1)+1)
SELECT @dbName = SUBSTRING (@BackupFile,(LEN(@PathOnly)+10),2000)
SELECT @dbName = SUBSTRING (@dbName,1,len(@dbName)-9)
PRINT '--restore scripts for ' + @BackupFile
PRINT '--for database: ' + quotename(@dbName)
PRINT '--these statements assume you are overwriting the existing db, are not changing the dbname,'
PRINT '--and are restoring to the most recent available log records'
PRINT '--you will likely need to change this script to fit your needs'


IF OBJECT_ID('tempdb..#DUMMYfiles') IS NOT NULL
BEGIN
      drop table #DUMMYfiles
END;

CREATE TABLE #DUMMYfiles...
SELECT @cmd = 'RESTORE FILELISTONLY FROM DISK = ''' + @BackupFile + ''''
INSERT INTO #DUMMYFiles EXEC (@cmd)
IF EXISTS (select * from sys.databases WHERE name = @dbName)
BEGIN
      SELECT @dbExists = 1
END
ELSE
BEGIN
      SELECT @dbExists = 0
END;
IF @dbExists = 1
BEGIN
      PRINT '--Step 1: backing up the log tail if possible to folder/setting database offline'
      PRINT 'BACKUP LOG ' + quotename(@dbName) + ' TO DISK = ''' + @PathOnly + @dbname + '_logtail.bak'''
      PRINT 'ALTER DATABASE ' + quotename(@dbName) + ' SET OFFLINE WITH ROLLBACK IMMEDIATE;'
END;
ELSE
BEGIN
      PRINT '--Step 1: restoring to non-existent db, no work to do'
END;
PRINT '--Step 2:  restoring db full backup, but not recovering'
PRINT 'RESTORE DATABASE ' + quotename(@dbName) + ' FROM DISK = ''' + @BackupFile + ''''
PRINT '     WITH '
DECLARE dave CURSOR FOR
     SELECT '     MOVE ''' + LogicalName + ''' TO ''' + PhysicalName + ''','
      FROM #DUMMYFiles
     ORDER BY Type
OPEN dave
FETCH NEXT FROM dave INTO @cmd
WHILE (@@FETCH_STATUS = 0)
BEGIN
      PRINT @cmd

     FETCH NEXT FROM dave INTO @cmd
END
CLOSE dave
DEALLOCATE dave
PRINT '     NORECOVERY, REPLACE;'
PRINT '--Step 3:  restoring log chain if possible'
IF OBJECT_ID('tempdb..#DUMMYHeader') IS NOT NULL
BEGIN
      drop table #DUMMYHeader
END;
CREATE TABLE #DUMMYHeader ...
SELECT @cmd = 'RESTORE HEADERONLY FROM DISK = ''' + replace(@BackupFile,'_full.','_log.') + ''''
INSERT INTO #DUMMYHeader EXEC (@cmd)
DECLARE dave CURSOR FOR
      SELECT Position FROM #DUMMYHeader ORDER BY Position
OPEN dave
FETCH NEXT FROM dave INTO @Position
WHILE (@@FETCH_STATUS = 0)
BEGIN
      PRINT 'RESTORE LOG ' + quotename(@dbName) + ' FROM DISK = ''' + replace(@BackupFile,'_full.','_log.')  + ''''
      PRINT '     WITH NORECOVERY, FILE = ' + convert(varchar(5),@Position) + ','
      PRINT '     STOPAT = ''' + convert(varchar(200),getdate()) + ''';'
      FETCH NEXT FROM dave INTO @Position
END
CLOSE dave
DEALLOCATE dave 
PRINT '--Step 4:  tail of the log restore'
PRINT 'RESTORE LOG ' + quotename(@dbName) + ' FROM DISK = ''' + @PathOnly + @dbname + '_logtail.bak'''
      PRINT '     WITH NORECOVERY,'
      PRINT '     STOPAT = ''' + convert(varchar(200),getdate()) + ''';'
PRINT '--Step 5:  recover db and bring online'

PRINT 'RESTORE DATABASE ' + quotename(@dbName) + ' WITH RECOVERY;'

--finally, some cleanup steps after any restore. 
PRINT 'ALTER DATABASE ' + quotename(@dbName) + ' SET ONLINE;'
PRINT 'ALTER DATABASE ' + quotename(@dbName) + ' SET MULTI_USER;'
PRINT '--Step 6:  sync logins/users...this only needs to be run if the backup is being restored to a different server'
PRINT '--this will sync logins if possible, else create a new sql login with a password = user account name'
declare dave cursor for
      select 'exec sp_change_users_login auto_fix, ' + CHAR(39) + name + CHAR(39) + ', NULL, '
      + CHAR(39) + name  +  CHAR(39)
      + CHAR(13) + CHAR(10)+ 'Go'
      from sys.sysusers
      where islogin = 1 and issqluser = 1 and hasdbaccess = 1 and sid > 0x01
OPEN dave...

The output of the script should be something like this...which is a very good jumping-off point when you need to restore a database quickly from a .bak file.  

 

--restore scripts for C:\DATA\DebbySCO_201006041207.bak
--for database: [DebbySCO]
--these statements assume you are overwriting the existing db, are not changing the dbname,
--and are restoring to the most recent available log records
--you will likely need to change this script to fit your needs
--Step 1: restoring to non-existent db, no work to do
--Step 2:  restoring db full backup, but not recovering
RESTORE DATABASE [DebbySCO] FROM DISK = 'C:\DATA\DebbySCO_201006041207.bak'
     WITH 
     MOVE 'MaxAppDB' TO 'C:\data\Debby_3.mdf',
     MOVE 'MaxAppDB_log' TO 'C:\data\Debby_3_1.LDF',
     MOVE 'BLOB' TO 'C:\data\Debby_3_2.MaxAppDB_BLOB',
NORECOVERY, REPLACE;
--Step 3:  restoring log chain if possible
RESTORE LOG [DebbySCO] FROM DISK = 'C:\DATA\DebbySCO_201006041207.bak'
     WITH NORECOVERY, FILE = 1,
     STOPAT = 'Aug 17 2010  4:04PM';
--Step 4:  tail of the log restore
RESTORE LOG [DebbySCO] FROM DISK = 'C:\DATA\2010060_logtail.bak'
     WITH NORECOVERY,
     STOPAT = 'Aug 17 2010  4:04PM';
--Step 5:  recover db and bring online
RESTORE DATABASE [DebbySCO] WITH RECOVERY;
ALTER DATABASE [DebbySCO] SET ONLINE;
ALTER DATABASE [DebbySCO] SET MULTI_USER;
--Step 6:  sync logins/users...this only needs to be run if the backup is being restored to a different server
--this will sync logins if possible, else create a new sql login with a password = user account name
exec sp_change_users_login auto_fix, '##MS_PolicyEventProcessingLogin##', NULL, '##MS_PolicyEventProcessingLogin##'
Go
exec sp_change_users_login auto_fix, 'maxappuser', NULL, 'maxappuser'
Go

More on Log Backups

After covering tail of the log backups, which most people forget about, I thought I'd cover the other log backup options, what they do, and their differences.  These have changed in recent releases so what you thought you knew may no longer be applicable.  

 

What is the difference between COPY_ONLY and NO_TRUNCATE?

NO_TRUNCATE is COPY_ONLY plus CONTINUE_AFTER_ERROR.  COPY_ONLY does not mark the appropriate log records as inactive, just like NO_TRUNCATE.  COPY_ONLY is used so an ad hoc log backup does not break the log chain that your DBA has established.  Say for instance your DBA backs up your logs every 10 mins using a 3rd party tool, or maybe has a complicated process that copies the data to a log shipped warm spare, or copies it directly to a NAS device.  Let's also say you need to run a risky UPDATE command and want a log backup immediately beforehand, just in case; take that ad hoc backup WITH COPY_ONLY and the DBA's log chain is unaffected.  
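The ad hoc backup itself is just a normal BACKUP LOG with the extra option (database name and path are hypothetical):

--take an out-of-band log backup without breaking the DBA's established log chain
BACKUP LOG SomeDB
     TO DISK = 'C:\sqlbackups\SomeDB_adhoc.trn'
     WITH COPY_ONLY;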

OK, then what's TRUNCATE_ONLY/NO_LOG?

These options are no longer allowed as of SQL Server 2008; they were deprecated in SQL Server 2005.  

But the theory was...log records that are inactive (i.e., the data was committed to the data files) are usually backed up via BACKUP LOG, then truncated.  TRUNCATE_ONLY (or NO_LOG) skips the BACKUP portion and simply truncates the records.  This should only be done when:

  1. your disk with your ldf file is entirely and completely full.  When this happens you can't even add a new ldf file to another drive because the process of adding an ldf file actually causes log records to be written to the original ldf.  
  2. your disk with your PRIMARY datafile is entirely and completely full.  In this case the log records to indicate that a new mdf/ndf was added to the db can be written, but the metadata cannot be written properly to the PRIMARY data file.  
  3. you can't find any disk space anywhere to do a proper log backup and you are willing to sacrifice your log recoverability because your ldf is growing and you need to truncate the old log records.

Obviously these scenarios shouldn't happen often.  I've actually seen this happen at a government agency once.  Oy vey.  Whenever you issue a TRUNCATE_ONLY your log chain is broken and you should try to take a full db backup ASAP.  

Since these options are discontinued, the appropriate way to handle the situations above is to simply put your db into the SIMPLE recovery model.  I'm not sure if that process requires metadata writes to either the mdf or ldf, so I'm not sure if my scenarios above will be handled correctly anymore.  I have no time to test since those scenarios are frankly ridiculous.  
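For reference, the modern equivalent is a recovery model switch, followed by a full backup to re-establish the log chain (database name and path are hypothetical):

USE SomeDB;
ALTER DATABASE SomeDB SET RECOVERY SIMPLE;   --inactive log records are now truncated at each checkpoint
CHECKPOINT;
ALTER DATABASE SomeDB SET RECOVERY FULL;     --switch back once the emergency is over
BACKUP DATABASE SomeDB
     TO DISK = 'C:\sqlbackups\SomeDB_full.bak';   --start a new log chain ASAP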

Backup And Recovery Page

Window Function Enhancements

As I mentioned earlier, Microsoft didn't implement the full ANSI standard for window functions and specifically the OVER clause.  The more you understand these things the more you really really wish MS would drop everything and get the standard implemented soon.  There is a summarized list of cool, but missing, features of window functions that might whet your appetite and get you to think about set-based processing a little more.  Since the article is sooo long, here is a summary...

  • TOP OVER syntax...for instance, give me the TOP 5 items in each group.
  • DISTINCT...for instance...COUNT(DISTINCT colA) OVER (PARTITION BY colB)...is the number of distinct items different from the number of items...or "do I have duplicates?"  
  • LEAD/LAG...these functions have the ability to perform calculations using values from the next x rows (LEAD) or previous x rows (LAG)
  • using UDFs in the OVER clause...for instance, Pareto distribution calculations (you know this as the 80/20 rule)...currently we would need to calculate a running total and divide that by the Total.  We could write our own function and use that in the OVER clause to create our own windows.
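As an example of the kind of workaround we're stuck with today, "TOP 5 per group" has to be emulated with ROW_NUMBER() in a derived table (table and column names are hypothetical):

SELECT CustomerID, OrderID, OrderTotal
FROM (
     SELECT CustomerID, OrderID, OrderTotal,
            ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY OrderTotal DESC) AS rn
     FROM dbo.Orders
) x
WHERE rn <= 5;   --the 5 largest orders for each customer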

Long Iron Condors...my new favorite options play

An iron condor utilizes 2 vertical spreads, a bull put and a bear call spread.  The profit/loss graph kinda looks like a bird.  It is considered "iron" because it has components above and below current market price of the underlying.  

I generally just sell vertical spreads as my option strategy, simply to generate a little income.  I almost never buy puts or calls for leverage purposes on directional bets, although I did buy a bunch of calls in March of 2009 when it seemed like every stock was way too depressed.  I also don't just sell vertical spreads willy nilly.  It's said that Victor Niederhoffer lost his shirt selling premium like this when the markets suddenly went against him dramatically and he couldn't close out his positions fast enough.  I guess that's the six sigma event for you.  Yeah, it's gotten me too in the past but I like to only sell vertical spreads on stocks that I've studied carefully, I understand how they react to market events, their market correlations, and where their IV always seems higher than the historical volatility.  I also only like to sell vertical put spreads in cases where I wouldn't mind owning the underlying anyway (and have the funds).  I also only focus on selling when the VIX is over 20, otherwise volatility can really only go up, which always works against the seller.  And yes, I'm always ready to bail out.  

To be long an iron condor is to sell the put and call that are closer to the underlying's price, and buy the put and call that are farther out from the underlying's price.  Basically two net credit spreads.  The benefit is margin is only needed for either the put or call side since only one side can ever be ITM at a time.  I especially like long iron condors now since the markets appear range-bound, yet volatility is very high (in my opinion).  Further, ameritrade offers this at a discounted trade price via the thinkorswim interface.  


RAID 60

I always recommend RAID 10 (a stripe of mirrored sets) when cost isn't an obstacle and high performance is needed.  Of course the standard caveats about testing and YMMV apply here.  RAID 50 (striped RAID 5 sets) seems to be what our SAN guys keep pushing us towards.  I don't like anything that has RAID 5 in it because of the RAID 5 write penalty, which hinders performance on a write-heavy database.  I doubt we'll ever see an end to new RAID levels, and just recently I started hearing RAID 60 being touted much more by the SAN guys.  RAID 60 is RAID 6 + 0: striped RAID 6 sets.
