Business Intelligence, Data Analytics and Data Warehousing in the Information Age
When I first started working with SAP in the ’90s, I never understood why companies branded products and then rebranded them. I always assumed it was due to some major breakthrough that made the product much better. Rebranding worked well in some cases but terribly in others. This became apparent in the 2000s with technology. Everyone had to have the latest iPhone because it combined a camera, music player, and communication device in one handheld unit. For others, it did not work out so well: New Coke versus Old Coke. Customers lined up the night before to get their hands on the new technology, whereas the change in the taste of Coke drew protests.
So, what does this have to do with Business Intelligence? In the software world, there is no cool package to open and no great device to show off to all your friends. Nobody pounds their chest and says, “I just upgraded to Windows 10!” So, what do marketers do to get folks talking about their latest software product? They change the name and throw a big marketing campaign to show new and old customers that their product has “transformed” and is now something new that you cannot live without. It is so much better than the old version that you need to throw the old one away immediately and upgrade, regardless of whether you have finally perfected your use of it. I can still recall the “Start me Up!” campaign Microsoft threw for the brand new Start button on Windows. In Information Technology, nowhere has this marketing scheme been more evident than in the world of Data Warehousing, Business Intelligence, and Big Data.
Let’s look at SAP for a moment. Their Business Intelligence and Data Warehousing platform, initially called Business Information Warehouse (BIW), lasted a brief moment in history. Complications with the acronym BIW (i.e., copyrights) caused them to rethink the name. So, shortly afterwards, BIW became Business Warehouse, or BW. Their tagline was, “No coding necessary, just drag and drop.” Years later, SAP purchased Business Objects (BOBJ) to shore up the reporting end of the BW tool. BW and BOBJ were rebranded as SAP Business Intelligence, or BI. BW never changed; BOBJ, however, would be transformed over several years into a pure reporting platform focused primarily on BW. Before BOBJ could complete this transformation, SAP decided that BW should still be called BW and that BOBJ would become a suite of tools called Business Intelligence. This suite would make more efficient use of BOBJ against BW, and it would include Java re-coded versions of the Excel Analyzer tool and Query Builder, which were now called “BI Analytics Tools.” Confused? So were all of us BW, err, SAP BI experts out there. We had to deal with migration issues, terminology issues, training issues, and whatnot. These days, if you live in the SAP world, you get to work with the likes of BW on HANA and BW/4HANA, plus the remnants of SAP BW NetWeaver systems out there. It sounds a bit like Old Coke versus New Coke. With Coke, however, you can just taste it to know the difference. For HANA, you have to ask someone who knows how to explain what these systems are.
In our data-oriented world, we have been going through a similar crisis. Gone are the days of Data Warehousing. Now we have either Business Intelligence or Big Data with its Data Analytics. What is the difference? First off, Data Warehousing is not only a tool, it is also a concept. The original concept was to store data in a way that lets you rapidly retrieve results from the database. This is in sharp contrast to a transactional application. Almost any business application or ERP system is designed for Online Transaction Processing, or OLTP for short. These systems are designed to get data into the system quickly. Creating a new Purchase Order is easy if the person entering it can find all of the relevant data quickly, so the user interface, which helps the user find and enter data, matters most. A Data Warehouse is built for Online Analytical Processing, or OLAP for short. It can crunch through millions of records and show the resulting report in seconds. The OLTP system takes data in quickly, but its table structures look like spaghetti when connected together. Reports against it need many, many joins, and that tends to slow down reporting dramatically. The OLAP system almost always uses the same structure, the Star Schema. It has very few joins: a handful of small descriptive tables (the dimensions) each join directly to one central table (the fact table) and restrict which of its rows are processed. This results in a dramatic improvement over the OLTP approach. As an example, folks used to run OLTP financial reports over the weekend because they took hours (many took 18 hours or more) and required a lot of server time. The same report on an OLAP system would typically run in under one minute and would not consume much CPU. In fact, if any of my OLAP reports took more than thirty seconds to run, I had not designed my report and table structures correctly.
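To make the Star Schema idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (fact_sales, dim_customer, dim_product), the columns, and the sample rows are all made up for illustration and are not taken from any particular product. The point is only the shape: one central fact table, a couple of small dimension tables, and a report that needs just one join per dimension.

```python
import sqlite3


def build_star_schema(conn):
    """Create a tiny, hypothetical star schema and load made-up rows."""
    conn.executescript("""
        CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE dim_product  (product_id  INTEGER PRIMARY KEY, category TEXT);
        -- The fact table holds the measures plus one key per dimension.
        CREATE TABLE fact_sales (
            customer_id INTEGER REFERENCES dim_customer(customer_id),
            product_id  INTEGER REFERENCES dim_product(product_id),
            amount      REAL
        );
    """)
    conn.executemany("INSERT INTO dim_customer VALUES (?, ?)",
                     [(1, "East"), (2, "West")])
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                     [(10, "Hardware"), (20, "Software")])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                     [(1, 10, 100.0), (1, 20, 250.0),
                      (2, 10, 75.0), (2, 20, 300.0), (2, 20, 50.0)])


def sales_by_region_and_category(conn):
    """Report query: one join per dimension, then aggregate the fact rows."""
    return conn.execute("""
        SELECT c.region, p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_customer AS c ON c.customer_id = f.customer_id
        JOIN dim_product  AS p ON p.product_id  = f.product_id
        GROUP BY c.region, p.category
        ORDER BY c.region, p.category
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
report = sales_by_region_and_category(conn)
for region, category, total in report:
    print(region, category, total)
```

A real warehouse would add indexes, a date dimension, and far more rows, but the query keeps the same shape no matter how many dimensions you slice by.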
A Data Warehouse as a concept is about storing data and making it available for fast reporting. A physical Data Warehouse is the server that stores the data. There is no limit to what you can do with the data as long as you can imagine it. The data is there, waiting to be used in any format you like. You could be looking at Sales History, Current Sales, or Forecasted Sales. You can create reports to help you predict future sales, or you can inspect the data and look for behavioral patterns in it. You can also write code and use functions that tell you if your data is inconsistent, or if you need to take action in a Prescriptive manner. You can do anything you want with that data.
Now, here come the marketers. A Data Warehouse is difficult to set up and it’s not sexy. You need to know what you want in order to get the most from it. Many folks took the path of creating super-complex reports as their first reports out of the Data Warehouse. Two years, 20 design revisions, and a staff change later, they ended up with reports that nobody used or wanted anymore. To get out of this rut, the term Business Intelligence was coined.
Yes, Business Intelligence is about decision support. It gives you insight into your business. It’s so much better than DATA WAREHOUSING. It was almost to the point where Data Warehouse folks should be wearing a smock and driving forklifts because they were only “warehouse” workers. Now, a Business Intelligence Consultant, that’s a subject-matter expert giving you the tools to make better decisions. These consultants will not only help you understand your business, but they can help model or architect a solution that will work for you. Oh, I get it. This is the sales piece the Data Warehousing person does not have because he works in a warehouse. I cannot put down Business Intelligence too much because it actually does further the Data Warehouse concept. The problem is that it does not go far enough. Its concept arrived too early. The technology was not there to support it. Business Intelligence adds a reporting environment, or visual component, to the Data Warehouse. You see, Data Warehouses focused on storing data for quick slice-and-dice reporting. They typically fell short on displaying data effectively or visually. Data Warehouses typically provided a link to Microsoft Excel and let a reporting super-user create reports in Excel directly. Business Intelligence either coupled a reporting tool with the Data Warehouse or was a standalone reporting tool that could hook into any database. Its focus was on dashboarding and data display in a pleasant, eye-catching format. This moved the industry forward and gave marketers a much-needed new concept to promote. The concept of Data Warehousing did not change. However, the physical data warehouse was complemented with a suite of tools to help consumers understand the data more clearly. Thus, Business Intelligence moved Data Warehousing forward by making it more visually appealing and by making it easier to grab data from different sources.
Business Intelligence was premature, though. The concept was great; the timing was bad. At the time, there were only a few major reporting tools out there. Most notably, Business Objects was king until SAP bought them and took them off the market. Even with BOBJ, Business Intelligence was a chore. Companies started using multiple applications from different vendors, data needed to be combined and consolidated (Essbase, anyone?), and after consolidation, consumers still needed their insights. A number of BI tools popped up to help with this. After Oracle purchased Hyperion, Essbase came off the market. Then other companies started showing up with new Business Intelligence solutions. This latest round of applications can reach into any database or application out there and report on the data directly, or it can pull the data out, place it into its own database, and report on the data there. The concept of Data Warehousing is not what these companies were focused on. They focused on getting your data to you quickly and efficiently so that you could read it, understand it, and make more informed decisions. Typically, the tools were bulky and inefficient to set up. We needed tools that were super easy to set up and that let end users intuitively click on their charts and review their data visually.
For SAP, this era included a few dashboarding tools that never really got off the ground and a horrendous copy of Crystal Reports that did not last more than a few years. Then SAP purchased a tool called Lumira, and now they had their dashboarding and reporting solution. For non-SAP folks, Tableau was the up-and-coming reporting star. There are a few others out there, such as QlikView, but many of them are being swapped out for Tableau. So, Business Intelligence tools are finally coming into the limelight to help all sorts of companies with their varying data sets and architectures.
Where does this put Big Data? Big Data is a “bigger” concept than both Data Warehousing and Business Intelligence. Big Data means large, complex, constantly changing data sets. These data sets are so voluminous that traditional data processing software cannot process them efficiently. However, these massive volumes of data can be used to address business problems you could not handle with traditional toolsets because of the volume, variety, and velocity (the three Vs) of your data.
Big Data not only looks at the entire view of your data sets, but also brings in non-traditional data sets such as web page data, Twitter feeds, raw media (video and audio), or even streaming data from IoT devices. It also expands on the concepts of Analytics, breaking them into four categories: Descriptive (your reports on current events and what happened, i.e. your rear-view mirror), Predictive (using regression analysis or other models to predict the future), Discovery (looking for behavioral patterns in the data to gain further insights), and Prescriptive (models that predict something negative is going to happen and offer up corrective action). Not only does Big Data include more data and define the concepts more clearly, but it also comes with a set of tools that make gathering the data much easier. The chart below shows how much effort is involved for each level of analytics.
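As a rough illustration of how these categories differ, here is a minimal Python sketch that walks through three of them (Descriptive, Predictive, and Prescriptive) on a made-up series of monthly sales figures. Discovery is left out because pattern mining does not fit in a few lines. The data, the function names, and the fixed target are all assumptions for the example, and the straight-line trend is the simplest possible form of regression, not a recommendation for real forecasting.

```python
from statistics import mean

# Made-up monthly sales figures, oldest first (illustrative only).
monthly_sales = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]


def describe(values):
    """Descriptive: summarize what already happened (the rear-view mirror)."""
    return {"total": sum(values), "average": mean(values), "best": max(values)}


def fit_line(values):
    """Ordinary least squares fit of a straight line through (month index, value)."""
    xs = list(range(len(values)))
    x_bar, y_bar = mean(xs), mean(values)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return slope, intercept


def forecast(values, periods_ahead):
    """Predictive: extend the fitted trend line past the last observation."""
    slope, intercept = fit_line(values)
    return intercept + slope * (len(values) - 1 + periods_ahead)


def recommend(predicted, target):
    """Prescriptive: turn a prediction into a suggested action (toy rule)."""
    if predicted < target:
        return "below target: review pricing or promotions"
    return "on track: no action needed"


summary = describe(monthly_sales)
next_month = forecast(monthly_sales, periods_ahead=1)
print(summary)
print(next_month)
print(recommend(next_month, target=200.0))
```

Each step builds on the one before it: the prediction needs the history, and the recommendation needs the prediction.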
Big Data has also broken up the roles of the folks handling each of the tasks. For instance, you now hear terms like Data Engineer, Data Scientist, and Data Analyst. These relate specifically to a person’s role in the Big Data ecosystem. Do you go out and find the data you want and describe how to put it together in a database? That’s a Data Scientist. Do you get the data, combine it, implement it, and monitor the data loads? That’s a Data Engineer. Do you create reports and analyze the data? That’s the Data Analyst role. The Data Analyst presents the data visually for their consumers, and the role is more identifiable than the Scientist or Engineer. The Scientist and Engineer roles can overlap, but they are distinctly different if you look closer. I see the Data Scientist as someone trying to answer some specific questions. They break new ground and, as Larry the Cable Guy would say, “Git-R-Done!” The Data Engineer will put a repeatable system in place so that data can be stored, updated, and viewed in a convenient manner by whoever has access. I like to think about it as one-time analysis versus a repeatable, ongoing system with monitoring and error correction built into the process. In the old days, folks undertook all three roles. Since Big Data is so big, the jobs are broken out among three people. It makes us Data Warehousing folks feel good because now there are three folks doing the job that one of us used to do.
I mentioned that folks have a role in the Big Data ecosystem because, this time, the concept arrived when the technology was ready for it. Everyone will tell you that Hadoop is not Big Data, yet you really cannot talk about Big Data without mentioning Hadoop. What is Hadoop? First and foremost, it is an environment and suite of tools that allow developers to easily gather, load, monitor, correct, analyze, and store the data you are after. Secondly, it is a clustered storage and processing platform that can scale out and take on more servers as your data and analytics needs grow. This helps customers who really don’t know the volume (raw record count), variety (number of data sources), or velocity (rate of data change) of their data. The Hadoop environment was built to help you analyze and store these disparate, third-party data sets. For example, you can analyze data from all Twitter feeds in the past two weeks to see which ones mention your business. With that information, you can analyze those feeds to determine whether they are positive or negative stories. If you compare the number of positive versus negative stories about your business over the past few weeks, you can try to gauge whether your company’s stock will go up or down based on whether the positive and negative counts are increasing or decreasing. This can be easily set up with MapReduce, Spark, or Google’s newer technology called Cloud Dataflow. Since this is not a Hadoop review, I will stop here and move on.
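To show the shape of that Twitter example without a cluster, here is a small sketch in plain Python that imitates the map and reduce steps on a handful of made-up tweets. It does not use Hadoop, Spark, or any Twitter API. The company name “Acme”, the tiny keyword lists, and the tweet records are all hypothetical, and real sentiment analysis would use a proper model rather than keyword matching.

```python
import re
from collections import Counter
from functools import reduce

# Hypothetical keyword lists; a real system would use a trained sentiment model.
POSITIVE = {"great", "love", "fast"}
NEGATIVE = {"broken", "slow", "refund"}


def map_tweet(tweet, company):
    """Map step: emit {(week, sentiment): 1} for a tweet that mentions the company."""
    words = set(re.findall(r"[a-z0-9']+", tweet["text"].lower()))
    if company.lower() not in words:
        return Counter()  # tweet does not mention the company, emit nothing
    has_pos, has_neg = bool(words & POSITIVE), bool(words & NEGATIVE)
    if has_pos and not has_neg:
        label = "positive"
    elif has_neg and not has_pos:
        label = "negative"
    else:
        label = "neutral"
    return Counter({(tweet["week"], label): 1})


def reduce_counts(left, right):
    """Reduce step: merge two partial tallies by adding counts per key."""
    return left + right


def sentiment_by_week(tweets, company):
    """Run the map step over every tweet, then fold the partial results together."""
    return reduce(reduce_counts, (map_tweet(t, company) for t in tweets), Counter())


# Made-up sample data for illustration.
tweets = [
    {"week": 1, "text": "Love the new Acme dashboard!"},
    {"week": 1, "text": "Acme support is slow."},
    {"week": 2, "text": "Acme reports are fast and great"},
    {"week": 2, "text": "I love Acme"},
    {"week": 2, "text": "Nothing to do with them"},
]

counts = sentiment_by_week(tweets, "Acme")
for (week, label), n in sorted(counts.items()):
    print(week, label, n)
```

Because each tweet is mapped independently and the partial tallies are merged by simple addition, the same logic can be spread across many machines, which is the property frameworks like MapReduce and Spark exploit.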
Now that we know what Big Data, Business Intelligence, and Data Warehousing are, can we really tell the difference? From the outside, you can tell by the tools being used. On the inside, however, those data gatherers/modellers/architects/analysts/scientists/engineers are all doing the same basic jobs. The only real difference is the tool sets being used. Of course, Big Data has the latest tool sets and they help the most. It also has a huge amount of traction from marketers. However, venturing away from Hadoop and Big Data, you can still see the concepts in place in Business Intelligence. Just the other day, I was reviewing Looker, and their Business Intelligence platform is all about actions on the data. Another Business Intelligence offering included an analytics tab in its demo. What a surprise! Business Intelligence tools are now including Big Data concepts and making data not only visually appealing, but actionable as well. Who would have thought of that?
There is no longer a debate on New Coke vs. Old Coke. In that case, for most people, Old Coke was better. Why change? Lots of folks in Data Warehousing and Business Intelligence are asking the same thing. Why change to Big Data? Well, I can tell you this: Big Data is more than just a marketer’s dream. Whether you choose to create new roles and call your developers Data Scientists, or you start taking action on your data, Big Data brings many new concepts and tools to the table. Currently, those tools are still a bit complex for the average user. So, this Data Warehouse person has to say that applying Big Data concepts to your current or new environment will greatly help you get to actionable data to improve your business. Whether you use Hadoop or not, implement Big Data’s Analytics to get those business insights you have been looking for. They will take your Data Warehouse or Business Intelligence solutions to the next level.
For an idea of some of the tools that you can use in Business Intelligence and Big Data, check out the links on our website at: https://www.titaniumconsultant.com/links/