Data extraction
Higher-level pipelines for getting structured data out of PDFs: pdfplumber-style table detection, interactive form field extraction, font-filtered extraction from pre-printed forms, and Tagged PDF logical structure.
Each guide includes runnable examples with real output, and most come with a downloadable sample PDF:
| Your PDF | Guide | Sample |
|---|---|---|
| Tables drawn with ruling lines or aligned text | Tables | table.pdf |
| Interactive AcroForm/XFA fields | Interactive forms | form.pdf |
| Data printed on a static form template (no fields) | Form-aware extraction | form.pdf |
| Tagged PDF (accessibility export) | Struct tree | — |