White Papers

13 Essential Data Validation Checks for Trustworthy Data in the Cloud and Lake

When data moves in and out of a data lake or the cloud, IT and business users face the same question: is the data trustworthy? Automating these 13 essential data validation checks can quickly build trust in the data held in the cloud and the lake.


Turing Award winner and MIT professor Dr. Michael Stonebraker wrote a white paper outlining his transformative view on data. He argues that real digital transformation must start with clean, accurate, consolidated data sets. These ideas are already driving major change at GE, HPE, Thomson Reuters, and Toyota. This is a summary of his paper.

Data quality (DQ) issues are prevalent in all organizations, yet they often stay hidden. The process for identifying them is generally static, obsolete, time-consuming, and low on controls. This paper outlines the failures of the traditional DQ process and describes how using cognitive algorithms to identify poor data reduces effort and cost and improves DQ scores dramatically. The only scalable path to good, reliable data is to leverage the power of AI to validate data autonomously.