Data cleaning & validation pipeline

Attach a raw dataset, and the agent profiles, cleans, and validates it against explicit rules, then drafts the cleaned data and a quality report for you to approve.

What it installs

Agents 1

  • Data Engineer Agent

    Cleans and validates data, QAs its own output, and drafts a quality report.

Workflows 1

  • Clean and validate data

    Profile, clean, validate, QA the output, then draft the cleaned data and quality report for approval.

Documents 1

  • Validation rules

    Your editable config of the checks, tolerances, and exclusions that define when a dataset counts as clean. Set these to steer the pipeline.
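
As a rough illustration of how checks, tolerances, and exclusions could fit together, here is a minimal Python sketch. The `ValidationRules` fields and the `check_rows` helper are hypothetical names chosen for this example; they are not the template's actual document format.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationRules:
    """Hypothetical shape of a validation rules document (an assumption)."""
    required_columns: list[str]                                        # checks
    max_null_fraction: dict[str, float] = field(default_factory=dict)  # tolerances
    excluded_columns: set[str] = field(default_factory=set)            # exclusions


def check_rows(rows: list[dict], rules: ValidationRules) -> list[str]:
    """Return human-readable rule violations; an empty list means clean."""
    problems = []
    present = set().union(*(row.keys() for row in rows))
    for col in rules.required_columns:
        if col in rules.excluded_columns:
            continue  # excluded columns are never checked
        if col not in present:
            problems.append(f"missing required column: {col}")
            continue
        # Treat None and empty strings as nulls for this sketch.
        nulls = sum(1 for row in rows if row.get(col) in (None, ""))
        fraction = nulls / len(rows)
        limit = rules.max_null_fraction.get(col, 0.0)
        if fraction > limit:
            problems.append(
                f"{col}: null fraction {fraction:.2f} exceeds tolerance {limit:.2f}"
            )
    return problems
```

In this sketch, loosening a tolerance or adding a column to the exclusions changes which datasets count as clean without touching the checking code.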

Goals 1

  • Only clean data flows downstream

    Only validated, reproducibly cleaned data enters downstream analysis.

Skills 3

  • validate-data

    QA a dataset and the analysis built on it before it is trusted: methodology, calculation, and bias checks that produce a confidence assessment. Adapted from anthropics/knowledge-work-plugins/validate-data.

  • data-context-extractor

    Profile messy data and extract its structure, entities, metrics, and hidden hygiene rules before transforming it. Adapted from anthropics/knowledge-work-plugins/data-context-extractor.

  • etl-pipeline

    Design a repeatable extract-transform-load flow with cleaning, validation, idempotency, and quality reporting. Adapted from claude-office-skills/skills/etl-pipeline.
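
To make the idempotency and quality reporting ideas concrete, here is a minimal Python sketch of a cleaning step that gives the same result when run again on its own output. The `clean` function and the report keys are hypothetical and are not taken from the etl-pipeline skill itself; the sketch also assumes every cell value is hashable.

```python
def clean(rows: list[dict]) -> tuple[list[dict], dict]:
    """Normalize keys and string values, drop exact duplicates, and report counts."""
    seen = set()
    cleaned = []
    for row in rows:
        # Normalize: trim and lowercase column names, trim string values.
        normalized = {
            key.strip().lower(): value.strip() if isinstance(value, str) else value
            for key, value in row.items()
        }
        # Fingerprint the row so duplicates are detected regardless of key order.
        fingerprint = tuple(sorted(normalized.items()))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(normalized)
    report = {
        "input_rows": len(rows),
        "output_rows": len(cleaned),
        "duplicates_dropped": len(rows) - len(cleaned),
    }
    return cleaned, report
```

Because the output is already normalized and deduplicated, feeding it back through `clean` returns the same rows and reports zero duplicates dropped, which is the property that makes re-runs safe.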

Folders 1

  • Data Cleaning

Requirements

What this template needs to do its job. Task Machine does not verify these; you decide whether your setup is ready.

  • Product analytics access: connect your product analytics so the agent can profile and cross-check source tables directly. Until you do, the agent works from attached exports and the validation rules document.

Get started

Install Data cleaning & validation pipeline and run it with approvals.

Join the waitlist and we will send early access when the first private beta spots open.

Private beta. We invite teams in batches and never share your email.