DaveWentzel.com            All Things Data

JOIN Operators



IO and CPU Concerns

Certain types of operators are more CPU intensive than others. By their nature, the Hash and Sort operators scan through their respective input data. With read-ahead (prefetch) being used during such a scan, the pages are almost always available in the buffer cache before the operator needs them. Thus, waits for physical I/O are minimized or eliminated. When these types of operations are no longer constrained by physical I/O, they tend to manifest themselves as high CPU consumption. By contrast, nested loops joins perform many index lookups and can quickly become I/O bound if those lookups touch so many different parts of the table that the pages can't all fit in the buffer cache.


Semi Joins

A semi join returns a row from the first input if we can find at least one match in the second input.  Commonly seen in queries like select * from tblA where exists (select * from tblB where tblA.id = tblB.id).  So: EXISTS queries, IN queries, and even INNER JOINs where no column data is referenced from the second table (basically, an existence check). 
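For example, all of these forms typically compile to a semi join against tblB (a sketch, assuming hypothetical tables tblA and tblB joined on an id column):

```sql
-- EXISTS form
SELECT * FROM tblA a WHERE EXISTS (SELECT 1 FROM tblB b WHERE b.id = a.id);

-- IN form
SELECT * FROM tblA WHERE id IN (SELECT id FROM tblB);

-- INNER JOIN used purely as an existence check (no tblB columns referenced);
-- the DISTINCT is needed so duplicates in tblB don't multiply the tblA rows
SELECT DISTINCT a.* FROM tblA a JOIN tblB b ON a.id = b.id;
```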


You may also see things like "Left Anti-Semi Join".  You'll see these in subqueries that evaluate NOT EXISTS clauses. 
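For example, a NOT EXISTS query like this (same hypothetical tblA/tblB, joined on id) typically shows up as a Left Anti-Semi Join in the plan:

```sql
-- returns only the tblA rows with NO match in tblB;
-- the plan typically shows a Left Anti-Semi Join operator
SELECT *
FROM tblA a
WHERE NOT EXISTS (SELECT 1 FROM tblB b WHERE b.id = a.id);
```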


Are there any performance problems with Semi Joins?  Maybe.  Assume tblB has 100 rows and tblA has 1 million.  The query plan will probably be a hash match Right Semi Join...indicating a hash table is built on the smaller tblB, then probed with rows from the larger tblA.  Assume we are table scanning or clustered index scanning to do this.  We know Nested Loops are more efficient (see below), so we may be tempted to put an index on tblB.id.  But that won't change anything, since nested loops can't perform Right Semi Joins.  Even with the index we likely won't get a nested loops join, because all of the index lookups would be more expensive than the hash operation. 


If we put an index on tblA.id we will likely see the semi join replaced with a NESTED LOOPS/INNER JOIN.  Now, instead of touching all rows in tblA, we are performing only index lookups. 


You should think about using this trick (transforming Semi Joins to Inner Joins) when the table with the "preserved data" has many rows and an index seek is not being used in the query plan.  It is also valuable if "tblB" (the lookup table) contains many duplicates.  The semi join doesn't care about duplicates.  We will still see hashing, but it will likely be a hash match aggregate operation...so again, not every row will need to be touched. 
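A minimal sketch of the transformation, using the tblA/tblB example from earlier (the index name is hypothetical):

```sql
-- the index that lets the inner join do seeks instead of scanning tblA
CREATE INDEX IX_tblA_id ON tblA (id);   -- hypothetical name

-- semi join form
SELECT a.*
FROM tblA a
WHERE EXISTS (SELECT 1 FROM tblB b WHERE b.id = a.id);

-- inner join form; the derived table collapses tblB duplicates first
-- (the hash match aggregate mentioned above), so tblA rows aren't multiplied
SELECT a.*
FROM tblA a
JOIN (SELECT DISTINCT id FROM tblB) b ON a.id = b.id;
```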


Hash Joins


Hash joins are used for larger inputs and require more memory than Nested Loops joins. 


Always evaluate hash joins.  Often they are used because there are no indexes the optimizer can use to perform a more efficient nested loops or merge join.  If you see this on a frequently executed query, you really should try to add the necessary index.  Hash joins are also CPU intensive.  Queries with parallel execution plans do, however, need hash joins to merge the parallel streams, and they are probably OK in this situation. 


Why would a hash join be used when there is no index?  If there were an index, the input would already be ordered and a merge join would usually be far more efficient.  


But my table does have the necessary index, yet a hash join is still used?  In this case I'd bet the index is so fragmented that the optimizer knows the physical reads will be so expensive that the hash join is the better choice (just a guess, since I've only seen this once in the past).  So, basically, a sequential scan would be less expensive than seeks all around the disk.  


In many cases you may experiment and see that logical/physical IO decreased after your changes moved the plan to hash joins.  Don't get a false sense of security.  IO might have decreased, but you are going to take a big hit in CPU from building the hash table, not to mention the probing, not to mention the fact that no rows can be passed to the next step of the query plan until the entire build input has been hashed and probing has started.  This is different from a merge or nested loops join, which doesn't require as much upfront setup.  For this reason a Hash Join is considered a "blocking" operator. 


The "hash concept":  Two inputs...the "build input" and the "probe input".  The build input is used to create the hash buckets (essentially groupings of data elements).  A hash value is calculated for each row of the build input.  Each row of the probe input is then hashed and "probed" against those buckets for matches.  Hashing is very fast on modern processors. 


HASH JOIN, HASH AGGREGATION, and HASH UNION (may be others too) require memory to manage the hash buckets.  In the XML query plan this is shown in the MemoryGrant attribute.  The number is shown in KB (so 2874 is 2874KB).  Other operators in the query may also need a memory grant, but if your query is utilizing a lot of memory you should probably start looking here. 
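One way to see the grant (a sketch; the query below is hypothetical and the exact attribute layout varies by SQL Server version) is to capture the showplan XML for the statement:

```sql
SET STATISTICS XML ON;

SELECT a.col1, b.col2              -- hypothetical hashing query
FROM tblA a JOIN tblB b ON a.id = b.id;

SET STATISTICS XML OFF;
-- in the returned showplan XML, look for the MemoryGrant attribute
-- on the <QueryPlan> element, e.g. <QueryPlan ... MemoryGrant="2874" ...>
```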


In the query execution plan, the upper arrow into the hash join should carry fewer rows than the lower arrow; otherwise you are probably hashing the wrong input and may need a join hint.  See How to Read a Graphical Plan.
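If you do need to override the optimizer's choice of build input, a join hint is one option. Be aware that a join hint in the FROM clause also enforces the join order as written, so (sketch, hypothetical tables) listing the smaller table first makes it the build input:

```sql
-- the HASH hint forces a hash join AND fixes the join order as written,
-- so small_tbl (listed first) becomes the build input
SELECT s.*, l.*
FROM small_tbl s
INNER HASH JOIN large_tbl l ON s.id = l.id;
```

Treat hints like this as a diagnostic tool; if the hinted plan is consistently better, the real fix is usually statistics or indexing.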


Hash Match



--There are at least 3 different logical operators for Hash Match (Hash Match/Inner Join)
--Hash behavior is different based on the logical operator being performed
--Performance Ramifications:
     --In general, hashing is efficient
     --For DISTINCT operators try to write a query or design your tables such that the optimizer will not need to handle duplicate rows
     --Try to limit Union operations by specifying UNION ALL when possible

     --look for WHERE clauses performing calculations that make the predicate non-sargable

Make sure that you investigate hash joins in a query execution plan. A hash join may be the best option, but frequently a hash join is selected because there are no indexes that the optimizer can use to perform an efficient nested loops or merge join. In the absence of indexes, a hash join is the best available option; with better indexing, however, a nested loops or merge join might be cheaper. Note that hash joins are also fairly CPU intensive. If you have high CPU usage, and you do not feel that enough work is being performed against the server to justify it, evaluate the execution plans by using SQL Profiler to find out if you have a lot of hash joins.


Queries that use a parallel execution plan often have to perform hash joins to recombine the finished parallel streams. Hash joins in this scenario are usually optimal and should not be a concern.

Problems with Hash Operations
As mentioned above, hash operations require memory.  Occasionally you won't have the memory necessary to perform hashing effectively.  In this case SQL Server starts spilling rows to tempdb.  This can be detected using SQL Profiler and the "Hash Warning" events.  Obviously this creates IO pressure and poor performance. 
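On newer versions, the same information is exposed through Extended Events as well. A minimal sketch (the session name is hypothetical):

```sql
-- fires whenever a hash operation spills buckets to tempdb
CREATE EVENT SESSION hash_spills ON SERVER
ADD EVENT sqlserver.hash_warning
    (ACTION (sqlserver.sql_text))
ADD TARGET package0.ring_buffer;

ALTER EVENT SESSION hash_spills ON SERVER STATE = START;
-- query the ring buffer via sys.dm_xe_session_targets to see
-- which statements are spilling to tempdb
```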



Key Lookups (formerly known as Bookmark Lookups)...follow the link...I moved this to its own page.



RID Lookup

These are basically Bookmark Lookups that occur against heap tables.  Since there is no clustered index, a RID (row identifier) is used for the lookup. 



Nested Loops



--Performs inner joins, left joins, left semi-joins (no right semi-joins [see above]), and left anti-semi joins
--works best on smaller tables. 

--Searches the inner table for each row in the outer table
--Performance Ramifications
     --Use an index if possible
     --If a Sort occurs on the outer input then a covering index or clustered index will help
          --The sort is done to improve locality of the searches on the index over the inner input. 

     --ensure the lower arrow into the JOIN operator carries fewer rows than the upper arrow, else you are looping over the wrong table.  See How to Read a Graphical Plan for more. 
--The nested loop operator will return any rows that satisfy the optional predicate in the Argument column. 
Performance Ramification:
     --If your query has a WHERE predicate, try moving the condition to the JOIN/ON clause.  For an INNER JOIN the two forms return identical results; the rewrite just gives the optimizer another chance to apply the filter earlier.  (For OUTER joins, moving a WHERE condition into ON changes the results, so be careful.)

--before:
SELECT a.*, b.*
FROM a JOIN b ON a.id = b.id
WHERE <condition>

--after:
SELECT a.*, b.*
FROM a JOIN b ON a.id = b.id AND <condition>
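To encourage the seek-based nested loops plan described above, make sure the inner (lower) input has a usable index on the join column; a LOOP join hint can be used purely to compare plan costs (sketch, hypothetical index name):

```sql
-- the index on the inner input's join column is what turns each
-- iteration of the loop into a cheap seek instead of a scan
CREATE INDEX IX_b_id ON b (id);    -- hypothetical name

-- force nested loops just to compare plans; remove the hint afterwards
SELECT a.*, b.*
FROM a
INNER LOOP JOIN b ON a.id = b.id;
```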



Merge Join

  • Works with all joins and UNION operations

  • Requires two inputs, both sorted on the required columns.  Data is then “merged” together

  • the operation will *usually* scan each input only once.  The exception is a many-to-many join, in which case a worktable is created (very costly, obviously) to make the algorithm more efficient.  You can see this in the query plan.  Although this is still efficient, you may be surprised to find that the worktable was not even needed because you had an incorrect join condition. Don't immediately assume that the presence of the many-to-many join is problematic; it's still probably more efficient than the alternative hash join. 

  • If required, an explicit Sort operator will precede the Merge operation

  • Performance Ramifications

    • Avoid the explicit Sort if possible by using an index.  

    • For UNION operations consider using UNION ALL.

    • Potentially an extra scan is eliminated since duplicates are not removed in UNION ALL operations
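For example (a sketch, assuming hypothetical tables a and b with clustered indexes on id, so each input arrives pre-sorted and no explicit Sort is needed):

```sql
-- merge join: both inputs are read in id order and merged in one pass
-- (the hint is only to illustrate; normally let the optimizer choose)
SELECT a.*, b.*
FROM a
INNER MERGE JOIN b ON a.id = b.id;

-- UNION ALL skips the duplicate-removal pass that plain UNION requires
SELECT id FROM a
UNION ALL
SELECT id FROM b;
```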