Big Value or Big Chaos?
Information Management and Governance in Times of Big Data Analytics
Brent Dykes, Director of Data Strategy at Domo Inc., once wrote in Forbes: “Data is the new purple.” He was referring to the fact that the color purple used to be “royal” and available only to the wealthy, until someone discovered a way of producing it synthetically and made it accessible to the masses.
We are in a similar situation with data and analytics, which used to require sophisticated and expensive IT projects that only large organizations with dedicated IT departments could afford. Meanwhile, the “data democratization” trend, with self-service business intelligence tools like Tableau and Qlik, enables ordinary users to perform analytics themselves.
As an enabler of this trend, “big data” is increasingly stored in its raw form in repositories called data lakes. Data models emerge with usage (“schema on read”) rather than being imposed up front. This all sounds like a brave new world, doesn’t it?
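To make the schema-on-read idea concrete, here is a minimal sketch in Python: raw records land in the lake with no schema enforced on write, and a field-to-types mapping is derived only when the data is read. The records and field names are invented for illustration.

```python
import json

# Hypothetical raw records as they might land in a data lake:
# note that no schema is enforced on write, so fields vary per record.
raw_records = [
    '{"customer_id": 1, "name": "Alice", "country": "DE"}',
    '{"customer_id": 2, "name": "Bob", "last_order": "2024-01-15"}',
]

def infer_schema(lines):
    """Derive a field -> set-of-type-names mapping on read ("schema on read")."""
    schema = {}
    for line in lines:
        record = json.loads(line)
        for field, value in record.items():
            schema.setdefault(field, set()).add(type(value).__name__)
    return schema

print(infer_schema(raw_records))
```

The flexibility is real, but so is the cost: the second record silently introduces a `last_order` field the first one lacks, and nobody notices until read time.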
Key success factor #1: Data quality management
One problem with the data lake approach is that the data understanding, profiling and cleansing activities you usually go through when building a strongly integrated and governed data warehouse are omitted. In fact, they are pushed onto the user performing the analytics. According to Forbes, data scientists spend 60% of their time cleansing and organizing their data rather than analyzing it. Be aware of this fact and consider data quality management in your planning!
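The profiling and cleansing work pushed onto the analyst can be sketched in a few lines of Python. The rows below are invented and show typical data lake problems: missing values, inconsistent casing and duplicates.

```python
from collections import Counter

# Hypothetical customer export with common quality problems:
# a missing email, inconsistent country casing, and a duplicate row.
rows = [
    {"id": "1", "email": "a@example.com", "country": "de"},
    {"id": "2", "email": "",              "country": "DE"},
    {"id": "1", "email": "a@example.com", "country": "de"},
]

def profile(rows, column):
    """Count missing entries and distinct values for one column."""
    values = [r.get(column, "") for r in rows]
    return {
        "missing": sum(1 for v in values if not v),
        "distinct": Counter(v for v in values if v),
    }

def cleanse(rows):
    """Normalize country casing and drop exact duplicate rows."""
    seen, clean = set(), []
    for r in rows:
        normalized = {k: v.strip().upper() if k == "country" else v.strip()
                      for k, v in r.items()}
        key = tuple(sorted(normalized.items()))
        if key not in seen:
            seen.add(key)
            clean.append(normalized)
    return clean

print(profile(rows, "email"))
print(len(cleanse(rows)))  # 2 rows remain after deduplication
```

Even this toy version hints at why the 60% figure is plausible: every column needs its own profiling pass and its own normalization rule before any analysis starts.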
Key success factor #2: Business metadata
Jochen Demuth, Senior Director of Partner Engineering at MicroStrategy, once said: “Self-service BI is the new spreadsheet.” The problem he was referring to is the fragmented understanding of data meaning, transformations and calculations that results when you decentralize these aspects to the user. In effect, you are creating silos just as in the days when data floated around in spreadsheets on file servers. According to Gartner, data governance addresses this by identifying and describing data sources, making lineage visible and providing context via business metadata.
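What a business-metadata record with visible lineage might look like can be sketched as follows. The term names, definitions and source systems are assumptions for illustration, not the structure of any particular tool.

```python
from dataclasses import dataclass, field

# A minimal, hypothetical business-metadata record: the name, definition,
# source system and upstream terms are invented for illustration.
@dataclass
class BusinessTerm:
    name: str
    definition: str
    source: str                                        # originating system
    derived_from: list = field(default_factory=list)   # upstream terms (lineage)

glossary = {}

def register(term):
    glossary[term.name] = term

def lineage(name, depth=0):
    """Walk upstream terms recursively to make data lineage visible."""
    term = glossary[name]
    lines = ["  " * depth + f"{term.name} ({term.source})"]
    for parent in term.derived_from:
        lines.extend(lineage(parent, depth + 1))
    return lines

register(BusinessTerm("order_total", "Gross order value incl. VAT", "ERP"))
register(BusinessTerm("active_customer",
                      "Customer with at least one order in the last 12 months",
                      "CRM", derived_from=["order_total"]))

print("\n".join(lineage("active_customer")))
```

The point is not the code but the contract: every term has exactly one agreed definition and a traceable origin, which is precisely what scattered spreadsheets never provide.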
Over various projects, Simplity has evolved a methodology we call “business information modeling”, providing structured business requirements definition, data harmonization and traceability.
Key success factor #3: Collaboration
One lesson learned from our projects is that data governance initiatives require a high degree of collaboration. One of the main goals is to harmonize business definitions across business areas and between business and IT. Proper tools with versioning and workflow support, such as the Accurity software suite by Simplity, are indispensable.
Key success factor #4: Stringent project management
Experience shows that when harmonizing business definitions within an organization, discussions (e.g. defining what an “active customer” is) can be endless. However, after a certain amount of time and effort the added value becomes marginal; sometimes it is advisable to “agree to disagree”. For that reason, Simplity’s business information modeling methodology is based on a strict plan with a fixed “time-boxed” workshop schedule.
Key success factor #5: Automation
In many situations, you cannot afford an “army of data modelers” to create your business metadata manually. While this may be feasible for “classic”, strongly governed data warehouse environments, it certainly is not in a “data democratization” world with rather loosely governed data lakes. In our Accurity software suite, we address this issue with upcoming automation support.
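One way such automation can work (a sketch only; the column names and naming-convention rules below are assumptions, not a description of Accurity’s actual mechanism) is to scan physical column names and propose business metadata candidates from naming patterns, leaving a human to confirm rather than model from scratch.

```python
import re

# Hypothetical physical column names as found by scanning a data lake.
columns = ["cust_id", "cust_email", "order_dt", "order_amt", "cust_name"]

# Naming-convention rules (invented for illustration) that map
# name patterns to candidate business concepts.
rules = [
    (re.compile(r"^cust_"), "Customer"),
    (re.compile(r"^order_"), "Order"),
    (re.compile(r"_dt$"), "Date attribute"),
    (re.compile(r"_amt$"), "Amount attribute"),
]

def suggest_metadata(column):
    """Return every candidate business concept a column name matches."""
    return [label for pattern, label in rules if pattern.search(column)]

for col in columns:
    print(col, "->", suggest_metadata(col))
```

Turning metadata creation into a review task rather than an authoring task is what makes it affordable at data lake scale.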
Interested in learning more and understanding how this fits with other trends like the logical data warehouse or regulations like GDPR? Please visit www.simplity.ro or contact us at email@example.com, and we will see how we can help you better understand, leverage and improve your data.