Myths: SQL Server Index Fragmentation and REORG vs REBUILD

I think there are a lot of misconceptions around index fragmentation in SQL Server.  I've written about this before [[Index Fragmentation Issues|here]], [[Rebuild Indexes|here]], and [[Forwarded Records|here]].  But I guess I was not clear enough.  Some people swear by REORGs, some swear by REBUILDs.  This should not be a religious debate, they do different things and solve different problems.  

In many/most cases we want to eliminate fragmentation to make index scans better performing.  When data is logically fragmented (the left to right linked list is unordered) we incur additional physical IO which manifests itself with a waittype of PAGEIO_LATCH and far more logical/physical IO then should be reasonably necessary to satisfy the query.  I frankly do not care if an index is 98.99999% fragmented if I'm only doing an index seek (WHERE ObjId = @Param).  In that case fragmentation is MEANINGLESS and does not need to be fixed and will not cause perf problems.  An "index scan" is something like (WHERE ObjId BETWEEN x AND y).  The other ancillary problem is that data is pulled into memory a page-at-a-time.  If my pages are compact, then I can fit more rows into memory, etc.

Myth #1:  If your index is >x% fragmented, use REBUILD, not REORG.  

If your fragmentation is above x use REBUILD, else use REORG.  That's just not true.  BOL quotes numbers that really are just made up.  


Myth #2:  Physically ordered indexes are better, therefore I must use REBUILD
Let's assume an index is fragmented such that its pages on disk, in left-to-right order are (Page 18, 27, 2, 19, 78, 47, 118, 419, 12, 900, 40).  On an index scan operation the disk will thrash (SAN, SSDs, and disk virtualization technologies can sometimes help or hurt this situation, but let's just assume good 'ol rotational media) on this.  REORG fixes the logical order (Page 1, 15, 18, 27, 100, 107, 450) by shuffling pages, but REBUILD makes them physically ordered as well (Page 1,2,3,4,5,6).  Logical ordering is all we really require to avoid the disk thrashing problem and PAGEIO_LATCH waits.  We actually get almost no additional benefit from having things phyiscally ordered.  Sounds weird, but it's true.  I have even seen people quote that if your data is physically ordered then you will get larger I/O's, hence greater throughput.  I haven't seen any tests to prove or disprove this, and I don't have the time to do it, but I think the premise is wrong.  I have written previously [[Sequential or Random Reads?|here]] that Sequential Reads will get you larger I/O's than random reads, which is a general axiom of I/O theory.  When REORG puts things in "logical" order you should therefore have a sequential read.  Sequential read equates to logical ordering in my mind, not physical ordering.  That's an important distinction.  
Myth #3:  REBUILD is better at compaction
Both REBUILD and REORG have a "compaction" phase.  REORG's is very simple, it says that if a page is not referenced, remove it.  REBUILD says that, when possible, we will remove "air" from data pages and get them to resemble more closely their FILLFACTOR settings.  REBUILD is "better" at compaction because it can allocate *new* data pages to do the compaction, it isn't limited to just shuffling "existing" pages.  The last difference is that REBUILD tries to eliminate mixed extents (8 data pages belong to more than 1 object in a given extent).  In SHOWCONTIG you'll see mixed extent problems as being EXTENTS SCANNED being much less than EXTENT SWITCHES.  This all boils down to REORG doing LOGICAL only ordering and REBUILD doing PHYSICAL ordering.  But honestly, the performance difference between the two is neglible on an INDEX SCAN, which is the usual problem you are trying to solve.  Logical ordering is all you need, which is why MS gave us REORG/INDEXDEFRAG back in SQL 2000 to compliment the existing tools.  They either needed to give us that, or make all REBUILDs ONLINE, which they didn't until 2005, and even then there are lots of caveats.  
Myth #4:  REBUILD gets me back to my original FILLFACTOR, which is always good
REBUILD has a huge side-effect no one talks about and can kill performance.  Note in my last paragraph where I said REBUILD is really just getting your pages back to their original FILLFACTORs.  Let's think about that for a sec.  If my current data pages don't reflect my declared FILLFACTOR, then doesn't that mean I didn't design my FILLFACTOR correctly?  It's not SQL Server's fault!!!  I didn't factor in page splits, row modifications, and row deletions properly into my design.  When a data page does not match it's FILLFACTOR then you have a design issue.  This means that over time a whole bunch of page splits occurred which are performance-degrading.  If you run REBUILD you are removing the "air" that has naturally occurred in your data page over time.  Post-REBUILD you have a nice compact index, but SQL Server has to start page splitting again which causes even worse performance.  And what happens over time? have a fragmented index that doesn't match its FILLFACTOR.  What do you do?  You run REBUILD.  And the vicious cycle continues.  You really never solve anything doing things this way.  So, you really need to determine if your declared FILLFACTORs are correct.  
There are caveats where a REBUILD/resetting of FILLFACTORs might be a good idea...after a conversion event, mass data loading operation, etc.  But rarely on a steady-state system. 
This is a standard industry approach to fragmentation in most RDBMSs.  Example...Oracle has a REORG equivalent command, but no REBUILD.  Their solution is to fix your FILLFACTOR (they call it PCTFREE, etc) and reload your table.  Again, it's a design problem.  
Myth #5: REORG holds shorter locks, so it is better
REBUILD WITH (ONLINE=ON) generally doesn't hold locks that would affect concurrency either.  
Myth #6: One takes longer than the Other, Which is Why I Prefer the Latter
I've seen this myth as "REBUILD takes longer" and as "REORG takes longer".  If an index is gigantic and heavily fragmented then common sense dictates that REORG will take longer because it needs to visit and shuffle each page.  REBUILD just creates a nice, new compact index.  But in my mind I really don't care which takes longer, I'm more interested in ensuring I don't affect concurrency and the final product is what I need.  
Myth #7:  One takes more transaction log space, so it should be avoided
This correlates roughly to Myth #6.  And I've seen it touted that both are tran log hogs.  Again, I don't see why it matters, I want the best index at the end of the day.  It is true though that REBUILD will hold longer transactions and you risk log_reuse waits and replication delays.  
So, which should you use?  It depends.  You really have to understand the problems each is designed to solve.  In general I do have some guidance:
  • on steady-state systems I prefer REORG.  
  • If my FILLFACTORs are wrong I prefer REBUILD WITH (ONLINE=ON) so I can fix those fillfactors.
  • If I just did a bulk loading of data I like to do a REBUILD to suck out the air and get the pages nice and tight.