When you listen to discussions of the core concepts of the big data world, it can often feel like being caught in a hurricane of technobabble and buzzwords. Three of the most relevant concepts to understand, though, are data warehousing, data analysis, and business intelligence (BI).
Individually, each of these concepts represents one-third of an overall process. When that process comes together, a company can more efficiently collect data, analyze it and turn it into actionable information for decision-makers at all levels of an operation.
Data warehousing is the most straightforward of the three concepts to understand. As the term suggests, it’s the process of taking collected data in a company and storing it in places where it can be kept secure and accessible. This means having access to either on-site database servers or off-site cloud storage platforms.
Data analysis is the process of working through the data an organization has available in order to produce insights. Many people use the term interchangeably with BI, but the two are not the same. The distinction is that data analysis tools help professionals handle the tasks of:
- Acquiring data from sources
- Prepping data for analysis
- Confirming data integrity
- Identifying statistically grounded methods for gaining insights
- Using computing resources to rapidly sift through massive amounts of data
- Iterating through permutations of statistical models to generate insights
- Verifying that any generated insights are statistically valid
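To make those steps a little more concrete, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption made for illustration: the sales figures, the two store layouts, and the function names (`acquire`, `prepare`, `check_integrity`, `permutation_p_value`) are all hypothetical, and a real analysis team would likely use far larger data sets and more specialized tools. The statistical check shown is a simple permutation test, which is only one of many valid methods.

```python
import csv
import io
import random
import statistics

# Hypothetical sample data: daily sales for two store layouts.
RAW_CSV = """layout,sales
A,120
A,135
A,
A,128
A,131
B,150
B,149
B,not_a_number
B,158
B,161
"""

def acquire(text):
    """Acquire: read rows from a source (here, an in-memory CSV string)."""
    return list(csv.DictReader(io.StringIO(text)))

def prepare(rows):
    """Prep: group sales by layout, skipping values that are not numbers."""
    groups = {}
    for row in rows:
        try:
            value = float(row["sales"])
        except (TypeError, ValueError):
            continue
        groups.setdefault(row["layout"], []).append(value)
    return groups

def check_integrity(groups):
    """Integrity: every group needs data, and sales cannot be negative."""
    for name, values in groups.items():
        if not values or any(v < 0 for v in values):
            raise ValueError(f"bad data in group {name}")

def permutation_p_value(a, b, trials=5_000, seed=0):
    """Verify: how often does random relabeling produce a gap this large?"""
    rng = random.Random(seed)
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = a + b
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        left, right = pooled[:len(a)], pooled[len(a):]
        if abs(statistics.mean(left) - statistics.mean(right)) >= observed:
            hits += 1
    return hits / trials

groups = prepare(acquire(RAW_CSV))
check_integrity(groups)
p_value = permutation_p_value(groups["A"], groups["B"])
print(f"mean A = {statistics.mean(groups['A']):.1f}")
print(f"mean B = {statistics.mean(groups['B']):.1f}")
print(f"estimated p-value = {p_value:.3f}")
```

A small p-value suggests the gap between the two layouts is unlikely to be a fluke of how the days happened to fall, which is the kind of statistical validation the last item in the list refers to.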
Business intelligence is about taking the raw insights gained using those data analysis tools and turning them into actionable information. BI platforms are designed to provide visualizations and data to stakeholders. For example, a U.S. retailer might offer its buyers in China real-time data streams of insights derived from scanning millions of influencers’ feeds on Twitter, Instagram, Facebook and other social media platforms. This allows the buyers to look at the insights and quickly make decisions about what’s likely to sell well in the upcoming fashion season.
All of this work calls for the support of folks who have experience in working with computing resources at large scales. There’s a lot more going on here than simply putting entries into a spreadsheet. The industry employs plenty of data scientists, computer programmers and IT professionals. Likewise, individuals with business backgrounds in consulting are often in high demand.
From end to end, a company has to build its training and hiring practices around fostering a culture that values big data and insights. Building such a culture often presents its own set of challenges, as many people prefer to make choices based on tastes, gut reactions and “eye tests.”
If you want an insight into how this process unfolds, look no further than the world of professional baseball. Few sports are now as driven by analytics as baseball. Starting at the turn of the century, small clubs that were strapped for cash began hunting for market inefficiencies. Two decades later, everyone in the business is using data analytics tools to make decisions. In 2019, the Houston Astros announced they were cutting their scouting department significantly while adding more people in analytics.
One of the classic examples of how statistically driven insights can defy expectations is the so-called Monty Hall problem, named for the original host of the game show “Let’s Make a Deal.” In the puzzle, a contestant chooses one of three doors in hopes of winning a prize like a new car. One door hides the car. Each of the other two hides something no one wants, such as a goat.
After the contestant picks a door, the host, who knows where the car is, opens one of the other two doors. For the sake of dramatic tension, the host never reveals the car, so the opened door always shows a goat. The host then asks, “Do you want to change your pick?”
According to volumes of computer simulations and PhD-level stats papers, the answer should always be “yes.” By switching, the contestant improves their chance of winning from 1/3 to 2/3.
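You don’t have to take that on faith. The short Python simulation below is a sketch of the standard version of the puzzle, and it assumes one car, two goats, and a host who knows where the car is and always opens a goat door the contestant did not pick. The function names `play` and `win_rate`, the trial count and the random seed are arbitrary choices made for this illustration.

```python
import random

def play(switch, rng):
    """Play one round; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Assumption: the host knows where the car is and opens a goat door
    # that the contestant did not pick.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one door that is neither the first pick nor the opened door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch, trials=100_000, seed=42):
    """Estimate the chance of winning over many simulated rounds."""
    rng = random.Random(seed)
    wins = sum(play(switch, rng) for _ in range(trials))
    return wins / trials

stay_rate = win_rate(switch=False)
switch_rate = win_rate(switch=True)
print(f"stay:   {stay_rate:.3f}")
print(f"switch: {switch_rate:.3f}")
```

Over enough rounds, the stay rate should settle near 1/3 and the switch rate near 2/3.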
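If that feels wrong to you, don’t feel bad. The answer is not intuitive. Most people assume that with two doors left, the odds are 1/2 whether the contestant switches or not. In fact, the contestant’s first pick is right only 1/3 of the time, and since the host never reveals the car, switching wins every time the first pick was wrong. When the solution was popularized in a 1990 Parade magazine column, thousands of readers, including many with PhDs, wrote in to dispute it.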
Lots of business decisions are basically the Monty Hall problem scaled into the thousands, millions or even billions. There are plenty of doors to pick from, and the goats far outnumber the cars. Also, you’re competing against numerous other contestants simultaneously.
Unless you need to pay a dowry, you probably don’t want that many goats. How do you improve your chances of finding the winning prize? You embrace the value of data warehousing, data analysis and business intelligence.