Recent News

Sept 09, 2022

FirstEigen launches new module to flag inaccurate records in Snowflake and Azure Data Lake

FirstEigen, the Autonomous Data Validation Company, announced the general availability of DataBuck for Snowflake and Azure Data Lake.

July 21, 2022

FirstEigen announces that the DataBuck Observability Module now has production customers in all three clouds

“DataBuck will enable data engineers to flag inaccurate records as soon as data lands in the data lake and moves through the data pipeline.”

Feb 16, 2022

FirstEigen and Alation partnership to boost seamless integration of autonomously generated Data Trust Scores into Data Catalog

Data quality validation leader FirstEigen has announced its partnership with Alation, an expert in enterprise data catalogs and data governance.

FirstEigen launches a new module to flag inaccurate records in Snowflake and Azure Data Lake

Sept 09, 2022

DataBuck is autonomous data quality validation software that uses AI/ML to automatically detect 100% of Systems Risks with minimal human intervention. It is more than 10x faster than any traditional approach.

FirstEigen, the Autonomous Data Validation Company, announced the general availability of DataBuck for Snowflake and Azure Data Lake.

“Current data monitoring and validation tools and processes fare very poorly under conditions like Cloud/Lake use, high volume of data, and new sources or changing structures of data, among others. That’s where DataBuck comes into the picture,” says Angsuman Dutta, CTO and co-founder of FirstEigen.

Dutta says DataBuck will enable data owners to flag inaccurate records as soon as data lands in Snowflake tables. It scans each data asset in the targeted Snowflake and Azure Data Lake platforms. Each asset is rescanned every time it is refreshed or whenever a scheduler invokes DataBuck. FirstEigen assures that no data is moved to DataBuck.

With DataBuck, data owners do not need to write data validation rules or engage data engineers to perform any tasks. DataBuck uses machine learning algorithms to generate an 11-vector data fingerprint that identifies records with issues.

The Data Fingerprint approach also reduces false positives. For a Fortune 500 industrial company, DataBuck reduced false alerts related to data quality issues in Internet of Things (IoT) sensor data by 85%. This helped the company save over $1.2 million.

DataBuck also autonomously creates data health metrics specific to each data asset. These well-accepted, standardized metrics are customized for every data set individually, leveraging AI/ML algorithms, and published in the Alation data catalog using Alation’s Open Data Catalog Framework.

Health metrics are computed based on quality dimensions for each column in the data asset and monitored over time to detect unacceptable data risks. Health metrics are translated to a data trust score.
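As a rough illustration of how per-column health metrics could roll up into a single trust score, consider the minimal Python sketch below. The two quality dimensions (completeness and uniqueness), the function names, and the equal-weight average are assumptions made purely for illustration; this announcement does not describe DataBuck’s actual metrics or scoring formula.

```python
from statistics import mean


def column_health(values):
    """Return illustrative health metrics for one column.

    The dimensions here (completeness, uniqueness) are hypothetical examples.
    """
    total = len(values)
    if total == 0:
        return {"completeness": 0.0, "uniqueness": 0.0}
    # Treat None and empty strings as missing values.
    present = [v for v in values if v is not None and v != ""]
    completeness = len(present) / total
    uniqueness = len(set(present)) / len(present) if present else 0.0
    return {"completeness": completeness, "uniqueness": uniqueness}


def trust_score(table):
    """Average every column metric into a 0-100 score (assumed equal weights)."""
    metrics = [m for col in table.values() for m in column_health(col).values()]
    return round(100 * mean(metrics), 1) if metrics else 0.0


# Hypothetical data asset: column name -> list of values.
asset = {
    "sensor_id": ["a1", "a2", "a3", "a4"],
    "reading": [20.5, None, 21.0, 21.0],
}
score = trust_score(asset)
```

Recomputing such a score on every refresh and comparing it with a chosen threshold is one simple way to monitor it over time.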

In addition, DataBuck continuously monitors the health metrics and trust score and alerts users when the trust score becomes unacceptable. It features a one-click integration with Snowflake, Azure Data Lake, and Alation. FirstEigen stressed that no data moves out of Snowflake or Azure Data Lake, and that DataBuck focuses on errors that matter.

FirstEigen is the creator of award-winning software DataBuck, which detects data quality errors without coding by leveraging AI/ML. Machine Learning powers data quality testing and data matching to automatically set thousands of automated data validation checks and their thresholds.

To learn more about FirstEigen and the benefits of DataBuck, visit the blog page and the company’s social channels.

FirstEigen announces that the DataBuck Observability Module now has production customers in all three clouds

July 21, 2022

“DataBuck will enable data engineers to flag inaccurate records as soon as data lands in the data lake and moves through the data pipeline.” – Angsuman Dutta, CTO and co-founder of FirstEigen.

FirstEigen is proud to announce that the DataBuck Observability Module has production customers in all three clouds. The move is critical for helping data owners flag inaccuracies. The award-winning software, DataBuck, detects data quality errors without coding by leveraging AI and Machine Learning.

FirstEigen shares that data errors resulting from system risks are the leading contributors to untrustworthy data. With DataBuck, each data asset is scanned, and any inaccuracies are flagged early, ensuring that data owners have accurate and reliable data in their pipelines.

Undetected errors in data assets steadily multiply across an enterprise, infecting the whole asset. Remedying them later takes 10x the effort and cost. With DataBuck, data observability reaches a new level, since the tool is built to cope with conditions like Cloud/Lake use, new sources, high data volumes, and changing data structures, among others.

FirstEigen CTO and co-founder Angsuman Dutta says DataBuck will enable data owners to catch errors fast. The DataBuck Observability Module is an autonomous solution that prevents critical data issues. “It scans each data asset and looks for critical errors that may break the data pipeline or disrupt downstream processes. With DataBuck, data engineers do not need to write data validation rules. DataBuck automates the tedious, labor-intensive, and time-consuming process of coding rules and orchestration mechanisms to detect data issues, cutting it from 4 hours to 2 minutes.”

With the ability to prevent critical issues in the data lake (AWS, Azure Data Lake, GCP Storage) and the data pipeline, the DataBuck Observability Module can be programmatically integrated into any data pipeline. The module works with AWS Glue, Airflow, Synapse, and Databricks.


Upon flagging critical issues in a data asset, the DataBuck Observability Module alerts the data engineering and development teams to:

  • Missing file or table
  • Additional file or data
  • Changes in the data schema
  • Duplicate files and records
  • Record count mismatch between two steps of the data pipeline
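To make these alert types concrete, the Python sketch below compares two steps of a pipeline and reports most of the conditions listed above. It is a generic illustration, not DataBuck’s implementation or API: the function name `compare_steps`, the in-memory data layout, and the alert wording are all hypothetical, and duplicate-file detection is left out for brevity.

```python
from collections import Counter


def compare_steps(expected_files, source, target):
    """Compare two pipeline steps and return a list of alert strings.

    source/target: dict mapping file name -> list of records (dicts).
    All names and the alert wording are hypothetical.
    """
    alerts = []
    # Missing or additional files relative to what the step should contain.
    missing = sorted(set(expected_files) - set(target))
    extra = sorted(set(target) - set(expected_files))
    alerts += [f"missing file: {name}" for name in missing]
    alerts += [f"additional file: {name}" for name in extra]

    for name in sorted(set(source) & set(target)):
        src_rows, tgt_rows = source[name], target[name]
        # Schema change: the set of field names differs between steps.
        src_schema = {key for row in src_rows for key in row}
        tgt_schema = {key for row in tgt_rows for key in row}
        if src_schema != tgt_schema:
            alerts.append(f"schema change in {name}")
        # Duplicate records: identical rows appearing more than once.
        counts = Counter(tuple(sorted(row.items())) for row in tgt_rows)
        dupes = sum(c - 1 for c in counts.values() if c > 1)
        if dupes:
            alerts.append(f"{dupes} duplicate record(s) in {name}")
        # Record count mismatch between the two steps.
        if len(src_rows) != len(tgt_rows):
            alerts.append(
                f"record count mismatch in {name}: "
                f"{len(src_rows)} vs {len(tgt_rows)}"
            )
    return alerts


# Hypothetical example: one duplicated row, one missing and one extra file.
source = {"orders.csv": [{"id": 1, "amt": 10}, {"id": 2, "amt": 20}]}
target = {
    "orders.csv": [
        {"id": 1, "amt": 10},
        {"id": 1, "amt": 10},
        {"id": 2, "amt": 20},
    ],
    "tmp.csv": [],
}
alerts = compare_steps(["orders.csv", "customers.csv"], source, target)
```

In a real pipeline, a check like this would typically run as a task between two stages and stop the run when the alert list is not empty.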

DataBuck also sets thousands of validation checks for continuous testing and data matching to monitor the health metrics and data trust scores. The software’s machine learning algorithms generate an 11-vector data fingerprint that quickly identifies records with issues. Data consumers can use DataBuck’s self-service feature to turn data quality checks on or off according to their business context.

The DataBuck Observability Module gives customers higher trust in their reports, analytics, and models. It also lowers data maintenance costs, improving efficiency when scaling data quality operations.

Follow the Blog page for more.

FirstEigen and Alation partnership to boost seamless integration of autonomously generated Data Trust Scores into Data Catalog

Feb 16, 2022 - Data quality validation leader FirstEigen has announced its partnership with Alation, an expert in enterprise data catalogs and data governance. FirstEigen’s software DataBuck provides a Data Trust Score for every data asset in the Alation catalog autonomously, without human intervention or complex integration efforts.

Alation’s active data governance helps organizations drive data culture by democratizing data discovery and access for all users, regardless of technical skillset. FirstEigen complements an organization’s investment in the data catalog and governance platform by providing an objective view of the health and usability of each data asset.

Seth Rao, CEO of FirstEigen, emphasized that “it’s not sufficient for Data Catalogs to operate in a silo. Companies can get greater value if they get an objective data trust score for every data asset. A system where individuals give subjective opinions on trustworthiness of data is a beauty contest that is not scalable when 1,000s of data sets have to be validated constantly.”

The FirstEigen-Alation partnership will benefit their growing customer base with:

  • Greater trust in the data and the catalog with the availability of standardized, objective Data Trust Scores for every data asset
  • Lower data maintenance costs by automating a labor-intensive task, cutting it from 4-6 weeks per data set to a few minutes
  • Easier democratization of data within the corporation
  • Easier integration and implementation of the catalog and data quality tools

Want to know more about how the FirstEigen and Alation partnership will help organizations? Contact us for more information.

About FirstEigen

FirstEigen is the creator of award-winning software DataBuck, which detects data quality errors without coding by leveraging AI/ML. Machine Learning powers Data Quality testing and Data Matching to automatically set thousands of automated data validation checks and their thresholds. Executed on a Spark platform with specialized algorithms, DataBuck is >10x faster than any traditional approach.

Contact us for more details.

About Alation

Alation is the leader in enterprise data intelligence solutions including data search & discovery, data governance, data stewardship, analytics, and digital transformation. Alation’s initial offering dominates the data catalog market. Thanks to its powerful Behavioral Analysis Engine, inbuilt collaboration capabilities, and open interfaces, Alation combines machine learning with human insight to successfully tackle even the most demanding challenges in data and metadata management.

Visit Alation for more details.