DaveWentzel.com            All Things Data

Performance Page

Performance Problem Diagnosis Methodology

I'm often asked after solving some long-standing, difficult performance problem that couldn't be solved what methodology I use to diagnose and solve problems.  My answer usually varies depending on the problem just solved, but that's not really the correct answer to the problem.  The cop-out answer is simply "experience and a lot of study." 
 
Read about PerformanceCollector...my utility that catches lots of interesting performance metrics and problems on a running SQL Server.  Then, download the code.  
 
The methodology I use is a lot like how a doctor diagnoses an illness...one of the reasons I love watching the show House on Fox.  If you watch House then you know every show follows a script that involves, among other things, House (or his staff) believing root cause was found and then realizing in the next 5 minutes that what he thought was root cause is really just a symptom of some other problem yet discovered. 
 
That's my methodology...understanding the difference between 'symptom' and 'root cause'.  The next question is...how does one do that?  First, you need a working knowledge of more than just your problem domain.  House is a diagnostician, his original staff was composed of a surgeon, a neurologist, and an immunologist.  His new staff involves a different set of specialists (including a plastic surgeon).  The point is...it's a well-rounded staff.  DBAs need to be well-rounded too in the other disciplines which touch their databases...networking, SAN, systems administration, development, etc.  You don't need to be experts, simply have a good working knowledge of the topics.  The same can be said for performance engineers, data architects, developers, whatever.  A developer that doesn't understand basic data modeling or basic index structure concepts isn't the best developer he can be. 
 
I have attempted to automate the gathering of important performance data (deadlocks, locking/blocking/waiting data, index utilization, cache stats, etc) with my Performance Engineer-in-a-Box scripts.  These deploy quickly and within a few minutes will provide you with detailed performance information on your SQL Server.  
 
Don't just rely on tool-based knowledge to diagnose performance problems (Profiler, PerfMon, third party tools)...*understand* what the data those tools provide you with actually means.  Also, an elementary knowledge of Performance Management concepts is helpful.  I have a section on my website dealing with Performance Management topics such as queueing theory, knee of the curve analysis, load testing, and probabilities. 
 
This page is a collection of data that I find useful in diagnosing performance problems.  This is by no means everything.  I have SAN specific information, for instance, in other wikis elsewhere on the site.  I'm working on consolidating everything into a more logical view. 
 
The first table is a top-down performance diagnosis progression that I use when dealing with a completely new system to me, such as on a new contract assignment.  Start with items outside of SQL Server...make sure the OS is set up correctly.  After that look at SQL Server configuration items, then drill down further. 
 
The second table is basically the same information, but organized by topic vs by methodology. 
 
Diagnosis Progression Table
First, make sure your OS is working optimally.  Yes, often the underlying data design sucks, which causes resource utilization problems, but those issues take time to fix.  The focus here is determining that the given resources are functioning optimally. 

Hardware is cheap, so a quick fix might be to add more of it.  But often adding more hardware does very little for performance. 
 
PerfMon is your friend at this level of monitoring detail.  Contrary to urban legend PerfMon adds very little overhead (if done correctly) to a system because the performance buffers are not locked and are updated whether a monitoring tool is running or not. 
 
For memory, don't just look at PerfMon, find out exactly how much physical RAM the sysadmin installed.  Ensure you are seeing that much in Windows and SQL.  If I had a nickel for every 32 bit server that had 64GB physical RAM but didn't have AWE/PAE set correctly. 

SQL Server Setup

  1. Memory Configuration 
  2. tempdb
  3. Parallelism

Know how your SQL Server is configured.  Are you using AWE?  Is it configured correctly?  How about data file placement on the LUNs?  Parallelism? tempdb? 

There are tons of items that could be mentioned here.  Do your due diligence and ensure every item of the SQL setup is configured for optimum performance. 

Virtualized SQL Server Peformance Best Practices

SQL Server Utilization
Now that you have a feel for overall system performance it is time to dive into SQL Server's subsystem performance.  I have a whole section of my website dedicated to this.  What we need to know here is the information various DMVs show us.  You need to know the waits, queues, blocking, locking, and latching.  Most DBAs don't know how to diagnose at the subsystem level, rather, they jump right to Profiler.  This is because prior to SQL 2005 the subsystem was not easy to monitor unless you understood the nuances of sysprocesses or some poorly documented system views.  Now we have DMVs for this.  Know your DMVs. 
 
Surprisingly, Oracle has had the equivalent of DMVs for over 20 years (the v$ dynamic performance views).  What Oracle still lacks is the equivalent of SQL Profiler (or more appropriately, SQL Trace).  This is why SQL Server performance engineers tend to attack things so much differently from Oracle performance engineers. 
 
Next, look at Profiler.  There are very useful canned Profiler templates, or my website has various scripts and templates I find even more helpful.  Know Profiler. 

General DBA Items

It's best now to start looking at some basic DBA tasks that might be overlooked.  Is any index maintenance occuring at all, if so when?  This includes index defrags, reorgs, and rebuilds.  Run SHOWCONTIG or its equivalent and look for any issues. 

Also, look now to ensure backups are running correctly as well as other DBA tasks like SQL Agent jobs.  You think these don't cause performance problems?  Have you ever seen a busy transactional system taking log backups every 5 minutes to the same LUN that has the log files, data files, and tempdb? 

By now you should have a good handle on the performance issues of your system.  You may even have a list of queries that need tuning based on your analysis of Profiler traces and subsystem info.  Plug those queries into SSMS and look at the execution plans.  This link will help you read those plans and make beneficial changes
 
 
 
 
Monitoring Tools
Other Topics
I/O Issues
PAL Tool: Performance Analysis of Logs...helps you determine what perfmon counters to collect and how to analyze them.  
DBCC FLUSHPROCINDB
DBCC PROCCACHE
     PSSDiag
ClearTrace (simplified Profiler analysis)
TraceAnalyzer (similar to ClearTrace)
 
 
SQL Nexus (used to analyze sqldiag traces)
DMVStats (performance data warehouse)
 
Information on using sp_ for procedure names
Stored Procedure Logging ... this is a routine that captures runtime metrics (IO, cpu time, duration) within any stored procedure. 
DBCC SQLMGRSTATS
  • Memory Used (8k Pages): If the amount of memory pages is very large, this may be an indication that some user connection is preparing many Transact-SQL statements, but it not un-preparing them.
  • Number CSql Objects: Measures the total number of cached Transact-SQL statements.
  • Number False Hits: Sometimes, false hits occur when SQL Server goes to match pre-existing cached Transact-SQL statements. Ideally, this figure should be as low as possible.
 
 
 
Data Access Performance
     LINQ
     ORM Notes
 
 
 
 
    Forwarded Records  
 
 
 

1 comments

Hi Dave. I was searching for some SQL performance issues and came to your blog. Here in this page you mention a "Performance Engineer-in-a-Box" and link it, but when we click on it the site says I can't access it.

Is it really locked or is this a bug?

The message I get is:
"Access denied
You are not authorized to access this page."

the link is:
http://www.davewentzel.com/freelinking/Performance%20Engineer-in-a-Box

thanks,

Add new comment