White Papers

How to Architect Data Quality on Snowflake – Serverless, Autonomous, In-Situ Data Validation

A Snowflake data warehouse runs the risk of becoming a data swamp. Existing rule-based data quality solutions can validate Snowflake data, but they do not scale to hundreds of data assets and are prone to gaps in rule coverage. More importantly, these solutions do not provide an easy way to access the audit trail of results.
Solution: Organizations must consider a scalable solution that can autonomously monitor thousands of tables and detect data errors as soon as the data lands.

Turbo Charging Data Governance Platform with Data Trust Score

Trust and data quality are key to making the most efficient use of data and data governance platforms. It is vital to measure and communicate the quality of data so that stakeholders make decisions based on good information. DataBuck enables Alation users to evaluate data quality with a trust score for data assets as part of the Alation Data Catalog.

13 Essential Data Validation Checks for Trustworthy Data in the Cloud and Lake

When data moves in and out of a data lake or the cloud, IT and business users face the same question: is the data trustworthy? Automating these 13 essential data validation checks will immediately engender trust in data in the cloud and the lake.

First Eigen

Turing Award winner and MIT professor Dr. Michael Stonebraker wrote a white paper outlining his transformative view on data. He believes real digital transformation must start with clean, accurate, consolidated data sets. These ideas are already driving major change at GE, HPE, Thomson Reuters, and Toyota. This is a summary of his paper.

Data quality issues are prevalent in all organizations, yet often hidden. The process of identifying them is generally static, obsolete, time-consuming, and low on controls. This paper outlines the failures of the traditional DQ process and shows how using cognitive algorithms to identify poor data reduces effort and cost and improves DQ scores dramatically. The only scalable path to good, reliable data is to leverage the power of AI to validate data autonomously.

First Eigen