The Egyptian Plover: Rhymes with "over". An African waterbird that maintains a (dubious) symbiotic relationship with crocodiles, feeding on decaying meat lodged between their teeth.

Automatically find, explain, and fix errors without rules.

Redpoll Plover is the only AI data platform that finds and explains erroneous values in datasets to cultivate information and improve your data's most important function: to help you learn.

What Plover can do:

Automatically identify erroneous values

Protect your dataset from future erroneous values

Identify the best data to collect for learning

How it works

1. Plover builds a model of your data

from plover.source import ImmutableSqlSource
from plover.store import AwsS3Store
from plover.backend import AwsBackend
from plover.engine import DatabaseEngine

# connect_to_sql_database() is a placeholder for your own connection setup
conn = connect_to_sql_database()

db_engine = (
  DatabaseEngine(
    source=ImmutableSqlSource(conn, "SATELLTIES_WH"),
    store=AwsS3Store("myorg-plover", "SATELLTIES_WH"),
    backend=AwsBackend("myorg-plover"),
  )
  .fit()
  .persist()
  .metalearn()
  .persist()
)

Holistic modeling and metalearning

Plover builds a holistic model of the world as defined by your data, then learns a metamodel to further refine its understanding of the information in your data.

2. Identify Likely Errors

top_5_errors = (
  db_engine
    .metrics
    .errorness
    .sort(by=['Confusion'], descending=True)
    .head(5)
)
Row              Column           Confusion  Observed          Predicted
Intelsat 903     Eccentricity     10.514456  0.793069999999…   0.000335826120…
Intelsat 902     Inclination_ra…  10.036763  25.06467339       0.002274957131…
Intelsat 903     Apogee_km        7.853968   358802            35792.92656399…
DSP 20 (USA 14…  Period_minutes…  6.695287   142.08            1436.095724064…
SDS III-6 (Sat…  Source_Used_fo…  4.93515    JM/5_11           ZARYA

Inconsistency is key

Plover finds errors by identifying data that are inconsistent with its model of your data or that cause confusion.

Plover also shows you the observed value and its predicted value, which you may use to overwrite erroneous values.
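The idea of scoring a value by how far it sits from what the rest of the data predict can be illustrated with a minimal sketch. This is not Plover's model: it swaps in a much simpler stand-in that predicts a numeric value as the median of its group and scores the deviation in units of the group's median absolute deviation (MAD). The orbit column, the satellite names, and every value except the 142.08-minute period are made up for illustration.

```python
from collections import defaultdict
from statistics import median

# Toy records (hypothetical): name, orbit class, orbital period in minutes.
records = [
    ("Sat A", "GEO", 1436.10),
    ("Sat B", "GEO", 1436.08),
    ("Sat C", "GEO", 1436.12),
    ("Sat D", "GEO", 142.08),  # likely a misplaced decimal point
    ("Sat E", "GEO", 1436.09),
    ("Sat F", "LEO", 95.0),
    ("Sat G", "LEO", 96.5),
    ("Sat H", "LEO", 94.8),
    ("Sat I", "LEO", 97.1),
]


def score_periods(rows, eps=1e-9):
    """Return (name, score, observed, predicted) tuples, most inconsistent first."""
    # Group the periods by orbit class so each value is judged in context.
    by_orbit = defaultdict(list)
    for _, orbit, period in rows:
        by_orbit[orbit].append(period)

    # Per group: the predicted value (median) and a robust scale (MAD).
    stats = {}
    for orbit, periods in by_orbit.items():
        center = median(periods)
        mad = median([abs(p - center) for p in periods])
        stats[orbit] = (center, mad)

    scored = []
    for name, orbit, period in rows:
        predicted, mad = stats[orbit]
        # eps avoids division by zero when a group has no spread.
        score = abs(period - predicted) / (mad + eps)
        scored.append((name, score, period, predicted))

    scored.sort(key=lambda item: item[1], reverse=True)
    return scored


top = score_periods(records)[0]
print(top)
```

On this toy data the 142.08 entry ranks first, and its predicted value (the group median, 1436.09) is the candidate replacement. A real model would condition on many columns at once rather than on a single grouping column.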

3. Find similar errors

errs_like = db_engine.errors_like(
  "DSP 20 (USA 149) (Defense Support Program)",
  "Period_minutes"
)
row              rowsim     Observed  Predicted
SDS III-7 (Sat…  0.988281   23.94     1436.088006
SDS III-6 (Sat…  0.964844   14.36     1436.113453
Advanced Orion…  0.9453125  23.94     1436.105354

Metareasoning

After identifying an error, Plover can identify similar errors by finding data that are inconsistent or confusing in similar ways.

4. Identify errors in incoming data

err = db_engine.detect(new_satellite_record)
err[0]
Row       Column  Incon Quantile  Observed   Predicted
Satmex 8  Users   0.99998         Commecial  Commercial

Protect Prod

Compute how confusing or inconsistent data are before they make it into the database, so that bad records never reach production systems.
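A gate in front of the database might look like the sketch below. Everything here is an assumption made for illustration: the `Finding` type, the `gate` function, the 0.99 threshold, and the `toy_detect` stand-in (a vocabulary check with fuzzy matching) are hypothetical and say nothing about how Plover's own `detect` works or what it returns beyond the columns shown above.

```python
from dataclasses import dataclass
from difflib import get_close_matches


@dataclass
class Finding:
    """Hypothetical shape of one detection result."""
    column: str
    incon_quantile: float
    observed: str
    predicted: str


# Assumed vocabulary for the "Users" column (illustrative only).
KNOWN_USERS = ["Civil", "Commercial", "Government", "Military"]


def toy_detect(record):
    """Stand-in detector: flags a 'Users' value outside the known vocabulary."""
    value = record["Users"]
    if value in KNOWN_USERS:
        return []
    # Suggest the closest known value, if any is similar enough.
    match = get_close_matches(value, KNOWN_USERS, n=1)
    predicted = match[0] if match else ""
    return [Finding("Users", 1.0, value, predicted)]


def gate(record, detect, threshold=0.99):
    """Return ('accept' or 'quarantine', findings at or above the threshold)."""
    findings = [f for f in detect(record) if f.incon_quantile >= threshold]
    status = "quarantine" if findings else "accept"
    return status, findings


status, findings = gate({"Row": "Satmex 8", "Users": "Commecial"}, toy_detect)
print(status, findings)
```

Passing the detector in as an argument keeps the gate independent of any particular model, so the same check can wrap whichever detection call is actually in use.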

5. Fill knowledge gaps

to_fill = db_engine.find_missing_to_fill(
  to_help_predict="Purpose"
)

to_fill[0].show()

Not all (missing) data are created equal

Plover can identify missing fields that are most likely to reduce uncertainty in specific predictions if filled in.
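One common way to quantify "reduces uncertainty" is information gain: how much the entropy of the target drops once another field is known. The sketch below ranks whole columns by that measure on made-up rows. It is a simplified stand-in, not Plover's method (which, per the text above, ranks individual missing fields), and the column names, values, and function names are assumptions chosen for illustration.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    counts = Counter(labels).values()
    return -sum((n / total) * log2(n / total) for n in counts)


def information_gain(rows, column, target):
    """Entropy of `target` minus its entropy once `column` is known."""
    # Use only rows where both the candidate column and the target are present.
    complete = [
        r for r in rows
        if r.get(column) is not None and r.get(target) is not None
    ]
    if not complete:
        return 0.0
    groups = defaultdict(list)
    for r in complete:
        groups[r[column]].append(r[target])
    conditional = sum(
        len(g) / len(complete) * entropy(g) for g in groups.values()
    )
    return entropy([r[target] for r in complete]) - conditional


def rank_columns(rows, candidates, target):
    """Candidate columns sorted by information gain, most useful first."""
    gains = [(c, information_gain(rows, c, target)) for c in candidates]
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


# Toy rows (hypothetical values).
rows = [
    {"Users": "Commercial", "Orbit": "GEO", "Purpose": "Communications"},
    {"Users": "Commercial", "Orbit": "LEO", "Purpose": "Communications"},
    {"Users": "Military", "Orbit": "GEO", "Purpose": "Surveillance"},
    {"Users": "Military", "Orbit": "LEO", "Purpose": "Surveillance"},
    {"Users": "Government", "Orbit": "LEO", "Purpose": "Earth Observation"},
    {"Users": "Government", "Orbit": "GEO", "Purpose": "Earth Observation"},
]

ranking = rank_columns(rows, ["Orbit", "Users"], "Purpose")
print(ranking)
```

In these toy rows "Users" fully determines "Purpose" while "Orbit" tells you nothing about it, so filling in a missing "Users" value would be the better use of collection effort.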

Machine learning + human learning + engineering

Baxter Eaves

CEO

Baxter is a US Navy veteran and holds a PhD in Experimental Psychology from the University of Louisville, where he developed computational models of human trust and social learning. He has led a number of DARPA projects and brings 13 years of experience deploying human-inspired AI tech in high-risk industries.

Patrick Shafto

Scientist at large

Patrick is a program manager at DARPA in the Information Innovation Office (I2O) and a professor of Data Sciences at Rutgers University - Newark. He has led a number of projects for agencies including DARPA, DOD, and NSF, and his publications have appeared in top journals of machine and human learning.

Michael Schmidt

Principal ML Engineer

Michael has 14 years of research and engineering experience. He has built production models for healthcare, agronomy, finance, and law, and has conducted research in high-energy physics, differential geometry, plasma physics, and high-performance computing.

For information cultivation

What it is not

There are many data quality and observability platforms out there, and they all do one or more of the following:

  • focus on mechanical failures of your data architecture that can be completely avoided with solid database architecture, i.e., you can do it yourself;

  • do anomaly/outlier detection by looking for distributional changes in single columns, which requires that a certain amount of bad data make it into prod in order to make the comparison and ignores the context of the data;

  • use error between predictions made by your machine learning models and the observed data as a measure of anomaly, which requires you to go through the effort of building an ML model that is more accurate than your data;

  • flatten your database, throwing out your carefully designed relational structure and biasing learning toward the entities that interact most with the database.

What it is

Plover focuses solely on improving the veracity of the values in your databases, which improves the information they hold and thus your ability to learn. By creating a model of your entire database (without flattening it), we are able to evaluate the erroneousness of each datum individually. This allows us to pluck out (or impute) individual bad data and protect prod without lag.

Become a partner

Partner with us to bring Plover to your infrastructure and to enhance the information within your data systems.