In my blog entry yesterday I concluded that Big Data as a term is on the rise and that ISVs need to pay attention to it. The next question to ask is how Big Data differs from traditional enterprise data warehousing. I still vividly remember the arguments 15 years ago over whether the top-down approach of Bill Inmon (considered the father of data warehousing) should be replaced by Ralph Kimball’s bottom-up approach, where the enterprise data warehouse is built as a collection of data marts that together form the enterprise data warehouse. There are also concepts such as the operational data store, master data and so on. The following link shows a couple of pictures that explain the difference between these approaches, along with a blog entry that explains the differences pretty well.
During my career, I have personally been involved with all of the above. The latest implementation was based on SQL Server 2008 R2, with not only ETL logic against the ERP applications but also a staging area, a relational data warehouse and multi-dimensional OLAP cubes with SharePoint 2010. Needless to say, you need to understand multi-layer architecture and how all of this works together.
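To make the layering concrete, here is a minimal sketch of the staging, warehouse and cube layers. It is not the implementation described above: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the table names, sample rows and functions (build_layers, sales_cube) are all hypothetical.

```python
import sqlite3

# Hypothetical ERP extract: (order_id, customer, product, amount)
ERP_ROWS = [
    (1, "Acme", "Widget", 100.0),
    (2, "Acme", "Gadget", 50.0),
    (3, "Globex", "Widget", 75.0),
]


def build_layers(rows):
    """Load rows into a staging table, then into a small star schema."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Layer 1: staging area, a raw copy of the source rows.
    cur.execute(
        "CREATE TABLE stg_orders "
        "(order_id INTEGER, customer TEXT, product TEXT, amount REAL)"
    )
    cur.executemany("INSERT INTO stg_orders VALUES (?, ?, ?, ?)", rows)

    # Layer 2: relational warehouse with two dimensions and one fact table.
    cur.execute(
        "CREATE TABLE dim_customer "
        "(customer_key INTEGER PRIMARY KEY, name TEXT UNIQUE)"
    )
    cur.execute(
        "CREATE TABLE dim_product "
        "(product_key INTEGER PRIMARY KEY, name TEXT UNIQUE)"
    )
    cur.execute(
        "CREATE TABLE fact_sales "
        "(customer_key INTEGER, product_key INTEGER, amount REAL)"
    )
    cur.execute(
        "INSERT INTO dim_customer (name) "
        "SELECT DISTINCT customer FROM stg_orders ORDER BY customer"
    )
    cur.execute(
        "INSERT INTO dim_product (name) "
        "SELECT DISTINCT product FROM stg_orders ORDER BY product"
    )
    # Replace the natural names with surrogate keys while loading facts.
    cur.execute(
        """
        INSERT INTO fact_sales
        SELECT c.customer_key, p.product_key, s.amount
        FROM stg_orders s
        JOIN dim_customer c ON c.name = s.customer
        JOIN dim_product p ON p.name = s.product
        """
    )
    conn.commit()
    return conn


def sales_cube(conn):
    """Layer 3: a cube-like aggregate keyed by (customer, product)."""
    rows = conn.execute(
        """
        SELECT c.name, p.name, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_customer c ON c.customer_key = f.customer_key
        JOIN dim_product p ON p.product_key = f.product_key
        GROUP BY c.name, p.name
        """
    ).fetchall()
    return {(customer, product): total for customer, product, total in rows}


if __name__ == "__main__":
    for cell, total in sorted(sales_cube(build_layers(ERP_ROWS)).items()):
        print(cell, total)
```

A real OLAP cube pre-computes many such aggregates across several dimensions; the single GROUP BY here only illustrates how each layer feeds the next.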
The question is how Big Data relates to all of this. One view is that different market segments see it in different ways. Start-ups will see it more as a web-based approach, with cloud solutions supporting Big Data. The SMB market has invested in Business Intelligence solutions, and to get scale these companies are going to look at cloud solutions that can take their analytics to the next stage. And then the larger enterprises that have invested huge amounts in enterprise data warehousing, data marts, ETL processes and so on will probably keep these solutions but might supplement them with cloud-based solutions where appropriate.
Competition in the Big Data space will increase during 2013, and we have already seen this in new solutions being introduced to the market, such as Amazon Redshift and Windows Azure Big Data. What distinguishes many Big Data solutions is that they are typically based on NoSQL technology, the data is loaded into computer memory (in-memory), and they are especially good for unstructured data. It is important to understand that there isn’t one “turn-key” solution, as these types of Big Data implementations are complex and require very distinctive skills, like “programming, statistics and how to visualize and communicate data”.
What we also need to remember is that the need to integrate data from different sources still exists. The data will typically be very different from what we are used to (for example, data from digital sensors and cameras), and when you add social media to all of this, you will have a mixture of data that never existed before.
And finally, as anyone who has been involved in Business Intelligence or Data Warehousing projects knows, the data still has to be presented in a format that makes sense for your audience, whether that is your management or other information junkies. What I do know is that analyzing the data won’t be easier than before, given how much statistics is involved, but the results could take you and your company to the next level if the information is used in the proper manner.
To answer the question I posed in my heading: no, I do not think one thing replaces another. What I would say is that you can expect to see many different variations of implementations, and you can call them what you like, but cloud will definitely be part of them.