Making the Data Quality Rules Discovery Process Easier

EigenRules

#1 software for auto-discovering data quality rules directly from your data. Point EigenRules at the data, wait a few minutes, and out come the auto-discovered DQ rules in plain English!

What it does:

  • Uses AI/ML to understand the expected behavior of your data and gives you the essential data quality rules you MUST have in place to validate it.
  • Complement the rules you already have with these auto-discovered rules.
  • Or use it to jump-start onboarding of new data sets in less than 5 minutes.

Benefits

Give these auto-discovered rules to your Subject Matter Experts (SMEs) and they’ll be amazed!

  • DataBuck will accelerate your SMEs’ work.
  • Reduce time to market for onboarding new data sources or apps.

Ask What ETL Can Do for You and What EigenRules Can Do for You

  • Streamlines the data quality rule discovery process
  • SMEs can piggyback on the auto-discovered DQ rules to accelerate their own rule discovery
  • If you already use an ETL tool and you are writing rules, find gaps in your rule set
  • Augment what you already have with a thorough set of rules
  • Very quick “time to market”: for every data source, you can cut 3-4 weeks of work down to just 15 minutes, with only one resource needed to discover DQ rules, including multicolumn relationships

Examples of Types of Data Quality Rules Auto Discovered

Every data set has a few hundred essential data quality rules that must be checked to validate the data thoroughly. EigenRules discovers rules in all 6 data quality dimensions. Below are examples of actual rules discovered for a loan data set and printed out by the software in plain English. The user gave ZERO inputs as to the meaning or relevance of the columns; EigenRules auto-discovers the relationships and rules that govern every microsegment of the data.

  • Uniqueness, Loan Number, Cannot be duplicate
  • Completeness, Loan Closing Date, Cannot be Null
  • Conformity, Loan Closing Date, Valid Format, yyyyMMdd
  • Validity, Inter Column Relationships, IF `Property State`=GA AND `Loan Source`=4 AND `Product Type`=1 THEN `Investor Type`=3
  • Drift, `Income Documentation`, Acceptable Values 1, 3, 4, 6
  • Timeliness, First_Payment_Date must be within 90 days of the Loan_Closing_Date
  • Consistency, Difference Original_Credit_Score – Current_Credit_Score must have Lower_Limit: -221.2, Upper_Limit: 207.2
  • Accuracy, IF `Property State`=GA AND `Loan Source`=4 AND `Product Type`=1 AND `Investor Type`=3 THEN `Unpaid Principal Balance` will have the following range: Lower_Limit:0 Upper_Limit:260,853
  • Accuracy, IF `Property State`=CT AND `Loan Source`=2 AND `Product Type`=6 AND `Investor Type`=7 THEN `Unpaid Principal Balance` will have the following range: Lower_Limit:0 Upper_Limit:1,929,964
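
To make the plain-English rules above more concrete, here is a minimal Python sketch of how a few of them might be turned into executable checks. It is not EigenRules code or output. The function names (parse_date, check_loan, duplicate_loan_numbers), the sample records, and the reading of "within 90 days" as an absolute difference are all assumptions made for illustration.

```python
from datetime import datetime

# Acceptable values taken from the Drift rule listed above.
ACCEPTED_INCOME_DOCS = {1, 3, 4, 6}


def parse_date(value):
    """Return a datetime for a yyyyMMdd string, or None if it does not conform."""
    if not isinstance(value, str) or len(value) != 8 or not value.isdigit():
        return None
    try:
        return datetime.strptime(value, "%Y%m%d")
    except ValueError:
        return None


def check_loan(row):
    """Return the names of the rules a single record violates (hypothetical helper)."""
    violations = []

    # Completeness and Conformity on Loan Closing Date.
    closing = None
    closing_raw = row.get("Loan Closing Date")
    if closing_raw in (None, ""):
        violations.append("Completeness: Loan Closing Date")
    else:
        closing = parse_date(closing_raw)
        if closing is None:
            violations.append("Conformity: Loan Closing Date")

    # Validity: IF Property State=GA AND Loan Source=4 AND Product Type=1
    # THEN Investor Type=3.
    in_segment = (
        row.get("Property State") == "GA"
        and row.get("Loan Source") == 4
        and row.get("Product Type") == 1
    )
    if in_segment and row.get("Investor Type") != 3:
        violations.append("Validity: Investor Type")

    # Drift: Income Documentation must be one of the accepted values.
    if row.get("Income Documentation") not in ACCEPTED_INCOME_DOCS:
        violations.append("Drift: Income Documentation")

    # Timeliness: assumed here to mean an absolute gap of at most 90 days.
    first_payment = parse_date(row.get("First_Payment_Date"))
    if closing and first_payment and abs((first_payment - closing).days) > 90:
        violations.append("Timeliness: First_Payment_Date")

    # Accuracy: balance range for the GA / 4 / 1 / 3 microsegment.
    if in_segment and row.get("Investor Type") == 3:
        balance = row.get("Unpaid Principal Balance")
        if balance is None or not (0 <= balance <= 260853):
            violations.append("Accuracy: Unpaid Principal Balance")

    return violations


def duplicate_loan_numbers(rows):
    """Uniqueness: return the set of Loan Number values that appear more than once."""
    seen, dupes = set(), set()
    for row in rows:
        number = row.get("Loan Number")
        if number in seen:
            dupes.add(number)
        seen.add(number)
    return dupes


# Hypothetical sample records, invented for this sketch.
good = {
    "Loan Number": "A1", "Loan Closing Date": "20230115",
    "First_Payment_Date": "20230301", "Property State": "GA",
    "Loan Source": 4, "Product Type": 1, "Investor Type": 3,
    "Income Documentation": 3, "Unpaid Principal Balance": 150000,
}
bad = dict(good, **{
    "Loan Closing Date": "2023-01-15",  # not yyyyMMdd
    "Investor Type": 5,                 # breaks the inter-column rule
    "Income Documentation": 2,          # not an accepted value
})

for record in (good, bad):
    print(record["Loan Number"], check_loan(record))
print("Duplicate loan numbers:", duplicate_loan_numbers([good, bad]))
```

Each check mirrors one rule from the list, so an SME reviewing the discovered rules could read the code and the plain-English rule side by side. The Consistency rule and the CT microsegment are left out only to keep the sketch short.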