DataBuck Validates Big Data in Real Time Even as It Flows Between Many Different IT Platforms

Naperville, IL.

April 04, 2016.

Incorrect data is not just worthless; it’s a liability, especially in the age of Big Data. Poor Data Quality results in compliance failures, inaccurate insights, failed initiatives, PR nightmares, and costly manual effort to find errors.

FirstEigen, a Data Validation software provider specializing in Big Data and Cloud, today announced the immediate availability of DataBuck 2.0, software offered as SaaS or On-Premise that validates Big Data Quality and detects Anomalies in Big Data. Leveraging the massively parallel processing power of Spark and proprietary algorithms, DataBuck performs Cross-Platform Validation (Hadoop/SQL/NoSQL/Cloud) and is 10x faster than any traditional approach. Its web-based, code-free, intuitive interface enables even beginner users to set up Data Quality validation rules in under 15 minutes. DataBuck natively connects to major Big Data platforms like Apache Hadoop, Cloudera, MapR, Hortonworks, MongoDB, and Datastax (Cassandra), and to Cloud platforms like Amazon AWS and Microsoft Azure.

“Data errors are hard to catch for high-volume, high-velocity data moving across diverse platforms and originating from different sources. Conventional Data Validation tools and approaches are architecturally and functionally limited. They can neither handle the massive scale of Big Data nor meet processing-speed requirements. We are very happy to be first to market with a purpose-built tool specifically for Big Data,” said Seth Rao, CEO, FirstEigen. “We made a big bet on Big Data technology, because we believe every piece of data is valuable and 100% validation coverage at extremely high speeds is a must-have for accuracy.”

Whether companies are looking to leverage Big Data for analytics, operational decisions, or compliance reporting, or to get an accurate view of the customer, DataBuck provides the validation functionality and speed required for real-time validation. DataBuck can easily plug into any traditional ETL/ELT process to accelerate the Data Validation step. According to FirstEigen, the platform improves developer productivity 10x compared to hand-coding.

DataBuck 2.0 is packed with a broad array of enterprise-class features including:

Big Data Profiling:

Profiles data and compares it with historical profiles and across different applications, for both batch and streaming data processes, to detect unexpected changes over time and between different instances of the same data.
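As a rough illustration of profile comparison, the sketch below summarizes a column and flags metrics that moved noticeably from a historical profile. The metric names, the `profile` and `drifted_metrics` helpers, and the 20% tolerance are assumptions made for this example; they are not DataBuck's actual interface or algorithms.

```python
from statistics import mean


def profile(values):
    """Summarize a column: row count, null rate, and mean of non-null values."""
    non_null = [v for v in values if v is not None]
    return {
        "row_count": len(values),
        "null_rate": 1 - len(non_null) / len(values) if values else 0.0,
        "mean": mean(non_null) if non_null else None,
    }


def drifted_metrics(current, historical, tolerance=0.2):
    """Return metric names whose relative change exceeds `tolerance` (hypothetical rule)."""
    drifted = []
    for name, old in historical.items():
        new = current.get(name)
        if old is None or new is None:
            # A metric that appears or disappears counts as a change.
            if old != new:
                drifted.append(name)
            continue
        if old == 0:
            changed = new != 0
        else:
            changed = abs(new - old) / abs(old) > tolerance
        if changed:
            drifted.append(name)
    return drifted


# Illustrative data: today's load has many more nulls than yesterday's.
yesterday = profile([10, 12, 11, 9, 10, None, 11, 10])
today = profile([10, 11, None, None, None, 12, 10, 9])
drift = drifted_metrics(today, yesterday)
```

Here only the null rate changes enough to be flagged, while the row count and mean stay within tolerance.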

Big Data Quality:

Performs standardized and custom rules-driven Big Data Quality tests. Standard tests include null checks of key data elements, format validation of key data fields, and comparison against a master list to validate field values.
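The three standard tests named above can be sketched in a few lines. The field names (`customer_id`, `zip`, `state`), the five-digit format, and the master list are hypothetical values chosen for illustration, not rules shipped with DataBuck.

```python
import re

# Hypothetical rules for illustration only.
ZIP_FORMAT = re.compile(r"\d{5}")
MASTER_STATES = {"IL", "NY", "CA"}


def validate_record(record):
    """Return a list of rule violations for one record (a dict)."""
    errors = []
    # Null check on a key data element.
    if record.get("customer_id") in (None, ""):
        errors.append("customer_id is null")
    # Format validation of a key data field.
    zip_code = record.get("zip")
    if zip_code is None or not ZIP_FORMAT.fullmatch(str(zip_code)):
        errors.append("zip has invalid format")
    # Comparison against a master list of allowed values.
    if record.get("state") not in MASTER_STATES:
        errors.append("state not in master list")
    return errors


records = [
    {"customer_id": "C1", "zip": "60540", "state": "IL"},
    {"customer_id": None, "zip": "6054", "state": "ZZ"},
]
results = [validate_record(r) for r in records]
```

The first record passes all three checks; the second fails each of them.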

Anomaly Detection:

Identifies outlier data using standard and custom algorithms. Standard algorithms include standard deviation and moving average-based routines.
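A minimal sketch of the two standard approaches mentioned, standard deviation and moving average, is shown below. The thresholds, window size, and sample data are assumptions for the example; DataBuck's proprietary routines are not described in this release and may differ.

```python
from statistics import mean, pstdev


def stddev_outliers(values, threshold=2.0):
    """Indices of values more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) > threshold * sigma]


def moving_average_outliers(values, window=3, tolerance=0.5):
    """Indices of values deviating from the trailing moving average
    by more than `tolerance`, expressed as a fraction of that average."""
    flagged = []
    for i in range(window, len(values)):
        baseline = mean(values[i - window:i])
        if baseline != 0 and abs(values[i] - baseline) / abs(baseline) > tolerance:
            flagged.append(i)
    return flagged


# Illustrative daily record counts with one obvious spike.
daily_counts = [100, 102, 98, 101, 99, 250, 100, 97]
by_stddev = stddev_outliers(daily_counts)
by_moving_avg = moving_average_outliers(daily_counts)
```

Both routines flag the spike at index 5. Note that in this simple version the spike also inflates the moving-average baseline for the next few points, which a production routine would likely need to account for.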

Big Data Matching:

Performs Data Completeness tests across multiple Big Data sources.
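One simple form of a completeness test is a key-level reconciliation between two sources. The sketch below assumes each source can be reduced to a list of record keys; the `completeness_report` helper and the sample IDs are hypothetical and only illustrate the idea.

```python
def completeness_report(source_keys, target_keys):
    """Compare record keys from two sources and report what is missing or unexpected."""
    source, target = set(source_keys), set(target_keys)
    return {
        "source_count": len(source),
        "target_count": len(target),
        "missing_in_target": sorted(source - target),
        "unexpected_in_target": sorted(target - source),
    }


# Illustrative keys, e.g. from a Hadoop table and a downstream warehouse table.
hadoop_ids = ["A1", "A2", "A3", "A4"]
warehouse_ids = ["A1", "A2", "A4", "A9"]
report = completeness_report(hadoop_ids, warehouse_ids)
```

In this example the counts match, yet the report still shows that `A3` never arrived and `A9` has no source record, which is why a count comparison alone is not a sufficient completeness check.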


Jen Holmes, or visit: