Data cleaning & validation pipeline

Attach a raw dataset and the agent profiles, cleans, and validates it against explicit rules, then drafts the cleaned data and a quality report for you to approve.

What it installs

Agents 1

Data Engineer Agent

Cleans and validates data, QAs its own output, and drafts a quality report.

Workflows 1

Clean and validate data

Profile, clean, validate, QA the output, then draft the cleaned data and quality report for approval.

Documents 1

Validation rules

Your editable config of the checks, tolerances, and exclusions that define when a dataset counts as clean. Set these to steer the pipeline.
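As a rough illustration of how checks, tolerances, and exclusions could combine into a single "is this dataset clean?" decision, here is a minimal Python sketch. The `ValidationRules` fields and the `is_clean` helper are hypothetical names chosen for this example; they are not the template's actual config schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ValidationRules:
    # Hypothetical fields for illustration; not the template's real schema.
    required_columns: list[str]                                # check: columns that must be populated
    max_null_fraction: float = 0.0                             # tolerance: allowed share of missing values
    excluded_columns: set[str] = field(default_factory=set)   # exclusion: columns the checks skip


def is_clean(rows: list[dict], rules: ValidationRules) -> tuple[bool, list[str]]:
    """Return (clean?, list of problems) for a list of row dicts."""
    problems = []
    for col in rules.required_columns:
        if col in rules.excluded_columns:
            continue
        # Treat None and empty strings as missing values.
        missing = sum(1 for row in rows if row.get(col) in (None, ""))
        # An empty dataset counts as fully missing for every required column.
        fraction = missing / len(rows) if rows else 1.0
        if fraction > rules.max_null_fraction:
            problems.append(
                f"{col}: {fraction:.0%} missing exceeds tolerance "
                f"{rules.max_null_fraction:.0%}"
            )
    return (not problems, problems)
```

Under these assumptions, loosening `max_null_fraction` or adding a column to `excluded_columns` changes which datasets pass, which is the sense in which the rules steer the pipeline.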

Goals 1

Only clean data flows downstream

Only validated, reproducibly cleaned data enters downstream analysis.

Skills 3

validate-data

QA a dataset and the analysis built on it before it is trusted: methodology, calculation, and bias checks that produce a confidence assessment. Adapted from anthropics/knowledge-work-plugins/validate-data.

data-context-extractor

Profile messy data and extract its structure, entities, metrics, and hidden hygiene rules before transforming it. Adapted from anthropics/knowledge-work-plugins/data-context-extractor.

etl-pipeline

Design a repeatable extract-transform-load flow with cleaning, validation, idempotency, and quality reporting. Adapted from claude-office-skills/skills/etl-pipeline.
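To make "idempotency" and "quality reporting" concrete, the sketch below shows one way a cleaning step can be written so that re-running it on its own output changes nothing. It is an assumption-laden example, not the skill's implementation: the `clean` function, its `key` parameter, and the report fields are all hypothetical.

```python
def clean(rows, key):
    """Trim string values, drop duplicate keys, and return (cleaned_rows, report).

    Hypothetical cleaning step: deterministic, so clean(clean(x)) == clean(x).
    """
    seen = {}
    for row in rows:
        # Normalize whitespace on every string value, including the key itself.
        normalized = {
            k: v.strip() if isinstance(v, str) else v for k, v in row.items()
        }
        # Keep only the first row seen for each key value.
        seen.setdefault(normalized[key], normalized)
    # Sort by key so the output order does not depend on the input order.
    cleaned = sorted(seen.values(), key=lambda r: r[key])
    report = {
        "rows_in": len(rows),
        "rows_out": len(cleaned),
        "duplicates_dropped": len(rows) - len(cleaned),
    }
    return cleaned, report


raw = [
    {"id": "b ", "name": " Bo"},
    {"id": "a", "name": "Al"},
    {"id": "b", "name": "Bo2"},
]
first_pass, first_report = clean(raw, key="id")
second_pass, second_report = clean(first_pass, key="id")
```

In this sketch the second pass returns the same rows as the first and reports zero duplicates dropped, which is the property a repeatable flow relies on when a run is retried.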

Folders 1

Data Cleaning

Requirements

What this template needs to do its job. Task Machine does not verify these; you decide whether your setup is ready.

Product analytics access: Connect your product analytics so the agent can profile and cross-check source tables directly. Until then, the agent works from attached exports and the validation rules document.

Get started

Install Data cleaning & validation pipeline and run it with approvals.

Join the waitlist and we will send early access when the first private beta spots open.

Private beta. We invite teams in batches and never share your email.