SAP HANA Evaluation

As I mentioned in the last couple of blog posts, I've been tasked with looking at Big Data and NoSQL solutions for one of the projects I'm working on.  We are seriously considering SAP HANA.  HANA is an in-memory data platform that is deployed on an appliance or in the cloud.  It is mostly focused on real-time analytics.  SAP HANA is an optimized platform that combines column-oriented data storage, massively parallel processing, in-memory computing, and partitioning across multiple hosts.
 
HANA optimizes your data on the fly.  It can compress data and convert between columnstore and rowstore formats depending on how you are using your data.  For instance, if calculations are single-column-based, then columnstore storage is chosen.
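To make the rowstore-versus-columnstore distinction concrete, here is a minimal sketch in plain Python (not HANA code; the table and column names are made up) of why column-oriented storage helps single-column calculations: the values for one column sit together, and low-cardinality columns compress well with something like run-length encoding.

```python
# Rowstore: each record kept together -- good for fetching whole rows.
rows = [
    {"account": "A100", "region": "EMEA", "amount": 250},
    {"account": "A200", "region": "EMEA", "amount": 120},
    {"account": "A300", "region": "APJ",  "amount": 430},
]

# Columnstore: each column kept together -- good for scanning or
# aggregating a single column, and highly compressible.
cols = {
    "account": ["A100", "A200", "A300"],
    "region":  ["EMEA", "EMEA", "APJ"],
    "amount":  [250, 120, 430],
}

# A single-column calculation touches only one contiguous list:
total = sum(cols["amount"])  # 800

def rle(values):
    """Run-length encode a column to show why repetitive data shrinks."""
    out = []
    for v in values:
        if out and out[-1][0] == v:
            out[-1][1] += 1
        else:
            out.append([v, 1])
    return [(v, n) for v, n in out]

compressed_region = rle(cols["region"])  # [("EMEA", 2), ("APJ", 1)]
```

The same idea underlies columnstore engines generally: an aggregate over one column never has to read the bytes of the other columns.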
 
It supports standard SQL plus SQLScript, a procedural extension that is very similar to T-SQL.  It also supports MDX (just like SQL Server...so it works well with Excel) and ABAP, and has standard JDBC and ODBC drivers.  It is ACID compliant and supports high availability.  This makes the move from a standard SQL RDBMS a little less painful.  To fully exploit HANA, though, you must write application-specific logic that takes advantage of the MPP architecture; SAP provides prepackaged algorithms optimized for HANA to help with this.
 
My project is an Accounting System with lots of aggregate tables that hold summarized data at various grains of detail.  We aggregate, of course, because aggregated data is faster to read than performing the requisite calculations on the fly.  With HANA the aggregate tables are not needed.  The SAP folks believe that we can retrieve the necessary aggregated data by querying the column stores directly, in-memory.  This would simplify our data model tremendously, since we wouldn't need all kinds of logic to populate the aggregated tables.  The ETL goes away; we simply compute on the fly.  This would also eliminate a lot of our data and a lot of our storage costs.
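As a toy illustration of computing aggregates on the fly instead of maintaining summary tables (sketched with Python's built-in sqlite3 rather than HANA, and with made-up table and column names), each grain of detail becomes just a different GROUP BY over the one base fact table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ledger (
        account TEXT, region TEXT, period TEXT, amount REAL
    )
""")
conn.executemany(
    "INSERT INTO ledger VALUES (?, ?, ?, ?)",
    [
        ("A100", "EMEA", "2012-Q1", 250.0),
        ("A100", "EMEA", "2012-Q2", 120.0),
        ("A200", "APJ",  "2012-Q1", 430.0),
    ],
)

# No pre-populated aggregate tables and no ETL to keep them in sync:
# each grain of detail is just a different GROUP BY over the base table.
by_account = conn.execute(
    "SELECT account, SUM(amount) FROM ledger "
    "GROUP BY account ORDER BY account"
).fetchall()

by_region_period = conn.execute(
    "SELECT region, period, SUM(amount) FROM ledger "
    "GROUP BY region, period ORDER BY region, period"
).fetchall()
```

The claim with HANA is that, over in-memory column stores, this style of query stays fast enough at scale that the summary tables and their maintenance logic can be dropped entirely.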
 
Like many financial applications, ours is batch-oriented.  There are certain long-running, complex calculations we just can't do in real time with an RDBMS.  With SAP HANA, "batch is dead."  Even if HANA can't support your most demanding batch jobs, at a minimum they become "on demand" jobs instead.  Operational reports can run in real time against the OLTP system...no more need for an ODS.
 
It will be interesting to see where this proof of concept leads us.
 
Where Can I Download It?
 
You can't.  But SAP gives you FREE development instances with all of the tools you need on their website (see below).  Here's why...SAP HANA is sold as pre-configured hardware appliances through select vendors.  It runs on SUSE Linux SLES 11.  It uses Intel E7 chips, Samsung RAM, Fusion IO SSD cards, and standard 15K rotational media for overflow and logging.  
 
Where Can I Get More Information?
 
The SAP HANA website provides lots of free information.  Lots of example use cases and code.  
 
The SAP HANA Essentials ebook is being written in "real time" and is continuously updated with new chapters as the content becomes available.  Google around for the free promo code so you don't have to pay for it.
 
 
 
