Most business executives understand by now that strategic decisions are always better when supported by data, and that the more data there is, the better the outcome will be. Unfortunately, in the 1990s and 2000s, after the decimation of BI/MIS capabilities within business departments, the ability of business stakeholders to get the data they needed when they needed it was severely restricted.
Lately, with the advent of Big Data and its promise of cheap and fast access to vast amounts of data, a kind of euphoria has taken hold among business executives: “we are finally going to get what we need when we need it; we just have to wait a bit for IT to get all of the data into our data lake and then: Nirvana!”
There are, however, several problems with this expectation, which we observe practically everywhere in the industry:
- The technology departments tasked with getting the data into the lakes and making it available don’t fully understand what kind of data the business actually needs to make decisions. The promise of Hadoop (cheaper and faster) makes it very tempting to dump all of the data into the lake, and then make the business accountable for the rest. The business, however, has neither the capability nor the capacity to provide Technology with clear answers with respect to data needs, metadata, data quality rules, and such, as all of these require specialized skills and expertise.
- Business departments everywhere are currently building up their advanced analytics capabilities by hiring data scientists and unleashing them on whatever data is already available in their data lakes. However, as mentioned above, that data is currently not well understood in terms of metadata and data quality. Building up repeatable data science processes without taking metadata and data quality into account will only get the business into painfully familiar territory – lots of answers that nobody can trust.
- The impact of the new technology on existing business processes has yet to be fully assessed and appreciated. Hadoop and advanced analytics provide the business with amazing capabilities, but these capabilities may not be fully adopted, even if implemented properly, unless the business processes change to accommodate them.
In essence, everything we observe happening now with Hadoop and Data Science we have already seen before, in the 1980s and 1990s with Data Warehousing and BI.
The hope is that we’ve learned the lessons of history.