DaveWentzel.com            All Things Data

All About Fact Tables

 

I tend to spend more time thinking about dimension tables than fact tables, so I quickly put together some notes/thoughts that I use when working with and designing fact tables in a data warehouse. 
 
Grain
The grain of the fact table is declared at the very start of the design.  The grain tells you what each table row business event represents.  It is a measurement.  It could be an order event, a sales event, an invoicing event, etc. 
 
When in doubt, keep your fact table as fine grained as possible.  You can always aggregate later, but drilling down is next to impossible without a complete redesign later. 
 
Foreign Keys
Once the grain of the fact table is known you can now focus on the foreign keys of the nodetitle.  Again, make sure those FKs have the same grain as the fact table. 
 
Not every row in the fact table will have all FKs populated. 
 
Types of Fact Tables
Type
Fine-ness of Grain
Sparse or Dense
FKs Represented 
Uses
Transaction Grain
finest grain
depends on the application. If facts are arriving on a given schedule the density will be well-known.  An order entry DW will be unpredictable...sparser during a recession, denser during the holiday shopping season
Not all will be represented, necessarily. 
order entry and anywhere that requires the finest grain possible
Periodic Snapshot Grain
less fine grain
density will be known.  If I am snapshotting at the month level per account then I know I will have total rows/yr = # accts x 12. 
All.  The point of this type of fact table is to ensure we can aggregate and "slice and dice" reliably. 
accounting
Accumulating Snapshot
usually less fine
it depends.  We use this type to track "workflow".  For instance, one row to track an order, when did it ship, when did it arrive, when was it paid, when was it returned.  These tables are updated as additional steps in the workflow are completed.  Generally fact tables are not updated, this is the exception. 
Not all will be represented and more will likely be added over the life of the row. 
claims processing, help desk resolution
 
 
Good Fact Table Design Principles
  • measures in a cube should be additive.  It's really not useful to store an average in a cube. 
 
 
Other Notes
Facts should always be additive across dimensions and consistent with the grain.  There will be exceptions when some numerics are not additive (temperatures, acct balances in some cases) but if something is not additive across dimensions you might want to think about the design again. 
 
 

Add new comment