Supported formats
We read PDF, DOCX, HTML, and plain text. The best results come from a machine-readable PDF (one you exported from Word or Google Docs, not a scan). If your prior-year report is a scan, the agent still handles it because OCR runs automatically, but expect to confirm a few extractions manually.
What the agent extracts
- Section structure — the order and headings of your prior CCR.
- Voice and tone — we preserve how your utility describes itself, its sources, and its consumer relationships.
- Contaminant table schema — the columns you used, the ordering, the footnote style.
- Distribution preferences — which delivery channels you mentioned in last year’s methods.
What the agent does not reuse
- Actual contaminant values — those come from this year’s lab data.
- Violation disclosures — those come from this year’s SDWIS record.
- Deadline references — those come from the current reporting calendar.
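The split between what is reused and what is not can be pictured as two separate inputs that meet only when the draft is assembled. The sketch below is illustrative only: the class names, field names, and `build_draft` function are hypothetical and are not the agent's actual data model, and the sample values in the usage lines are made up.

```python
from dataclasses import dataclass


@dataclass
class PriorReportTemplate:
    """Hypothetical: structure and style carried over from the prior CCR."""
    section_headings: list[str]
    voice_notes: str
    table_columns: list[str]
    delivery_channels: list[str]


@dataclass
class CurrentYearData:
    """Hypothetical: facts that always come from this year's sources."""
    contaminant_values: dict[str, float]  # from this year's lab data
    violations: list[str]                 # from this year's SDWIS record
    deadlines: dict[str, str]             # from the current reporting calendar


def build_draft(template: PriorReportTemplate, current: CurrentYearData) -> dict:
    """Combine prior-year layout with current-year facts.

    The template holds no values, violations, or deadlines, so nothing
    factual from the prior report can reach the draft.
    """
    rows = [
        {"contaminant": name, "value": value}
        for name, value in current.contaminant_values.items()
    ]
    return {
        "sections": list(template.section_headings),
        "voice_notes": template.voice_notes,
        "table": {"columns": list(template.table_columns), "rows": rows},
        "delivery_channels": list(template.delivery_channels),
        "violations": list(current.violations),
        "deadlines": dict(current.deadlines),
    }


# Usage with made-up sample values.
template = PriorReportTemplate(
    section_headings=["About your water", "Test results"],
    voice_notes="Plain, first-person plural",
    table_columns=["Contaminant", "Level found"],
    delivery_channels=["Mail", "Website"],
)
current = CurrentYearData(
    contaminant_values={"Example contaminant": 1.5},
    violations=[],
    deadlines={"delivery": "example date"},
)
draft = build_draft(template, current)
```

Keeping the two inputs in separate types makes the rule easy to check: anything factual in the draft can be traced to a current-year source.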
If extraction misses a section
Open the Evidence panel, click the affected section, and drag in the missing region from the source PDF. The agent then re-drafts using the updated structure.