As I mentioned in the last couple of blog posts, I've been tasked with looking at BigData and NoSQL solutions for one of the projects I'm working on. We are seriously considering SAP HANA. HANA is an in-memory data platform that is deployed on an appliance or in the cloud. It is mostly focused on real-time analytics. SAP HANA is an optimized platform that combines column-oriented data storage, massively parallel processing, in-memory computing, and partitioning across multiple hosts.
HANA optimizes your data on the fly. It can compress data and convert between column-store and row-store layouts depending on how you use the data. For instance, if your calculations work on a single column at a time, column-store storage is chosen.
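To make the row-store versus column-store distinction concrete, here is a toy sketch in Python. It is only an illustration of the general idea, not HANA's actual internals: the table, the values, and the function names are all made up, and the dictionary encoding shown is just one common column-store compression technique.

```python
# Toy illustration only: the same three ledger rows stored two ways.
# All names and values here are made up for the example.

# Row store: each record is kept together, which suits reading or
# updating one whole record at a time.
row_store = [
    {"account": "4000", "period": "2012-01", "amount": 120.0},
    {"account": "4000", "period": "2012-02", "amount": 80.0},
    {"account": "5000", "period": "2012-01", "amount": 45.5},
]

# Column store: each column is kept together, which suits scanning
# one column across many records.
column_store = {
    "account": ["4000", "4000", "5000"],
    "period": ["2012-01", "2012-02", "2012-01"],
    "amount": [120.0, 80.0, 45.5],
}


def total_from_rows(rows):
    """Sum 'amount' by visiting every record and picking out one field."""
    return sum(row["amount"] for row in rows)


def total_from_columns(columns):
    """Sum 'amount' by scanning a single list of values."""
    return sum(columns["amount"])


def dictionary_encode(values):
    """Replace repeated values with small integer codes.

    This is a generic column compression idea, shown here as an
    assumption about how such stores save space.
    """
    dictionary = sorted(set(values))
    code_of = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [code_of[value] for value in values]


dictionary, codes = dictionary_encode(column_store["account"])
```

Both functions return the same total. The difference is that the column version only touches the one list it needs, and a column with many repeated values (like the account number) shrinks nicely once it is encoded.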
It speaks standard SQL, plus a procedural extension called SQLScript that plays much the same role T-SQL does in SQL Server. It also supports MDX (just like SQL Server...so it works well with Excel) and ABAP, and has standard JDBC and ODBC drivers. It is ACID compliant. All of this makes the move from a standard SQL RDBMS a little less painful. SAP also provides prepackaged algorithms optimized for HANA.
My project is an accounting system with lots of aggregate tables that hold summarized data at various grains of detail. We aggregate, of course, because aggregated data is faster to read than performing the requisite calculations on the fly. With HANA the aggregate tables are not needed. SAP's claim is that you can retrieve the necessary aggregated data by querying the column stores directly, in memory. This of course would simplify our data model tremendously, since we wouldn't need all kinds of logic to populate the aggregate tables. ETL goes away. We simply compute on the fly. This would eliminate a lot of our data and a lot of our storage costs.
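Here is a minimal Python sketch of what "compute on the fly" means for a data model like ours. It assumes a hypothetical detail-level ledger held as columns and a made-up aggregate() helper. In HANA this would be a plain SQL GROUP BY against the column store, so treat this as an illustration of the idea rather than anything HANA-specific.

```python
from collections import defaultdict

# Hypothetical detail-level ledger entries, held as columns.
ledger = {
    "account": ["4000", "4000", "5000", "5000"],
    "period": ["2012-01", "2012-02", "2012-01", "2012-01"],
    "amount": [120.0, 80.0, 45.5, 4.5],
}


def aggregate(columns, group_by, measure="amount"):
    """Sum `measure` grouped by the given columns, computed at query time."""
    totals = defaultdict(float)
    key_columns = [columns[name] for name in group_by]
    # Walk the grouping columns and the measure column in lockstep.
    for *key, value in zip(*key_columns, columns[measure]):
        totals[tuple(key)] += value
    return dict(totals)


# Each "grain" is just a different group_by list over the same detail
# data, so there is no separate aggregate table to populate or keep in sync.
by_account = aggregate(ledger, ["account"])
by_account_period = aggregate(ledger, ["account", "period"])
grand_total = aggregate(ledger, [])
```

Every grain we currently keep in its own aggregate table becomes one more call against the same detail data. Whether that is fast enough at our volumes is exactly what the proof of concept has to show.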
Like many financial applications, ours is batch-oriented. There are certain long-running, complex calculations we just can't do in real time with an RDBMS. With SAP HANA, “batch is dead”. Even if HANA can't support your most demanding batch jobs, at a minimum they become "on demand" jobs instead. Operational reports can run in real time on the OLTP system...no more need for an operational data store (ODS).
It will be interesting to see where this proof of concept leads us.
Where Can I Download It?
You can't. But SAP gives you FREE development instances with all of the tools you need on their website (see below). Here's why...SAP HANA is sold as pre-configured hardware appliances through select vendors. It runs on SUSE Linux Enterprise Server (SLES) 11. It uses Intel E7 chips, Samsung RAM, Fusion-io SSD cards, and standard 15K rotational media for overflow and logging.
Where Can I Get More Information?
The SAP HANA website provides lots of free information, including plenty of example use cases and code.
The SAP HANA Essentials ebook is being written in "real time". Google around for the free promo code and then you don't have to pay for it. It is being continuously updated with new chapters as the content becomes available.