DaveWentzel.com            All Things Data

Data Models and Data Organization Methods

This is the next post in my series on NoSQL alternatives to RDBMSs.  The ways a datastore can persist its data can be roughly divided into the following categories.  There may be others, depending on your perspective.  It can also be argued that some of the types below are subtypes of others.
Data Model: Relational
  Description: We probably all know this by now.
  Has Schema/No Schema: Has Schema
  Relationships are Modeled As: Predefined
  Data Is: Typed
  Examples: SQL Server, MySQL, Oracle
  Possible Disadvantages: Everybody has different opinions on this.

Data Model: Hierarchical
  Description: Probably the oldest electronic data storage mechanism.  Data is modeled in a pyramid fashion (parent and child records).  With the advent of the relational model most hierarchical data stores died out.  However, XML started a bit of a renaissance in hierarchical data stores.  Many NoSQL solutions could be categorized as hierarchical.
  Has Schema/No Schema: Has Schema
  Relationships are Modeled As: Predefined
  Data Is: Typed
  Examples: IBM IMS, the Windows registry, XML
  Possible Disadvantages:
  • Relationships among children are not permitted.
  • Extreme schema rigidity.  Adding a new data property usually requires rebuilding the entire data set.

Data Model: Network
  Description: Very similar to hierarchical datastores.  It was the next evolution of the hierarchical model, allowing multiple parents to have multiple children, much like a graph.  Many healthcare data stores are based on MUMPS, which falls into this category.  "Data" is stored as data and methods (instructions on what to do with the data).
  Has Schema/No Schema: Has Schema
  Relationships are Modeled As: Predefined
  Data Is: Typed
  Examples: CODASYL, InterSystems Caché and other object-oriented databases
  Possible Disadvantages: Never gained much traction because IBM wouldn't embrace it and the relational model came along and displaced both.

Data Model: Graph
  Description: Could be considered the next generation of network data stores.  These are prevalent in the social-networking space where applications like Twitter, LinkedIn, and Facebook want to visualize your "network" for you.  Nodes represent users who have relationships (the edges) to each other.  Modeling this relationally is challenging at best.  A graph system can query these structures easily.  Basically, if you can model your data on a whiteboard, then a graph database can model it too.
  Has Schema/No Schema: It depends (but generally no schema)
  Relationships are Modeled As: Edges
  Data Is: Ad hoc
  Examples: Neo4j
  Possible Disadvantages: Data is usually untyped.  Very complex.

Data Model: Document
  Description: Designed for storing document-oriented data, such as XML, or semi-structured data.
  Has Schema/No Schema: No Schema
  Relationships are Modeled As: None (generally)
  Data Is: Typed
  Examples: CouchDB, MongoDB
  Possible Disadvantages: Does not support ad hoc reporting tools, e.g. Crystal Reports.

Data Model: Columnar
  Description: Stores its data as columns or groups of columns, rather than as rows.  Good for OLAP applications and anywhere aggregation queries are important.
  Has Schema/No Schema: Has Schema
  Relationships are Modeled As: Predefined (similar to relational)
  Data Is: Typed
  Examples: HBase
  Possible Disadvantages: Generally not good for OLTP applications.

Data Model: Key-Value
  Description: Stores its data like an EAV (entity-attribute-value) model.  Entirely schema-less.  Data can be stored as byte streams, so you can persist the programming language's objects directly in the key-value store.
  Has Schema/No Schema: No Schema
  Relationships are Modeled As: Links via keys
  Data Is: Semi-typed
  Examples: Redis, Riak
  Possible Disadvantages: (none listed)
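
To make a few of these categories a little more concrete, here is a rough Python sketch that holds the same tiny data set in row-oriented, columnar, key-value, document, and graph shapes.  Everything in it is made up for illustration: the users, the field names, the "user:1" key format, and the follows() helper are my own assumptions, and plain Python dicts and lists stand in for the real products listed above.  It only shows how the data is organized, not how any particular datastore implements it.

```python
import json
import pickle
from collections import defaultdict

# Hypothetical sample data: two users.
rows = [
    {"id": 1, "name": "Ann", "age": 34},
    {"id": 2, "name": "Bob", "age": 41},
]

# Relational / row-oriented: each record is kept together as one row.
row_store = [(r["id"], r["name"], r["age"]) for r in rows]

# Columnar: each column is kept together, so an aggregate only has to
# scan the one list it cares about.
column_store = {col: [r[col] for r in rows] for col in ("id", "name", "age")}
average_age = sum(column_store["age"]) / len(column_store["age"])

# Key-value: an opaque byte stream per key. The store knows nothing
# about the structure of the value; here it is a pickled Python object.
kv_store = {f"user:{r['id']}": pickle.dumps(r) for r in rows}
ann = pickle.loads(kv_store["user:1"])

# Document: self-describing and nested, and each document is free to
# carry different properties (no shared schema).
doc_store = {
    1: json.dumps({"name": "Ann", "age": 34, "tags": ["dba"]}),
    2: json.dumps({"name": "Bob", "address": {"city": "Springfield"}}),
}

# Graph: nodes plus edges, where the relationships themselves are data.
edges = defaultdict(set)
edges[1].add(2)  # user 1 "follows" user 2


def follows(graph, start):
    """Return every node reachable from start by following edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen
```

Notice that the columnar version answers "what is the average age?" without touching names or ids, while the key-value version cannot answer it at all without deserializing every value first.  That trade-off is roughly why columnar stores lean toward OLAP and key-value stores lean toward simple lookups by key.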
