mirror of
https://github.com/docling-project/docling.git
Docling agent skill (Cursor & compatible assistants)
This folder is an Agent Skill-style bundle for AI coding assistants: structured instructions (SKILL.md), a pipeline reference (pipelines.md), and a quality evaluator (scripts/docling-evaluate.py).
Conversion is done via the docling CLI (included with pip install docling).
The evaluator provides a convert → evaluate → refine feedback loop that the
existing CLI does not cover.
It complements the official Docling documentation and the docling CLI reference.
The same layout is published in the Docling repo at docs/examples/agent_skill/docling-document-intelligence/ (for docs and PRs).
Contents
| Path | Purpose |
|---|---|
| SKILL.md | Full skill instructions (pipelines, chunking, evaluation loop) |
| pipelines.md | Standard vs VLM pipelines, OCR engines, API notes |
| EXAMPLE.md | Installing into ~/.cursor/skills/; running the CLI and evaluator |
| improvement-log.md | Optional template for local "what worked" notes |
| scripts/docling-evaluate.py | Heuristic quality report on JSON (+ optional Markdown) |
| scripts/requirements.txt | Minimal pip deps for the evaluator |
Quick start
pip install docling docling-core
# Convert to Markdown
docling https://arxiv.org/pdf/2408.09869 --output /tmp/
# Convert to JSON
docling https://arxiv.org/pdf/2408.09869 --to json --output /tmp/
# Evaluate quality
python3 scripts/docling-evaluate.py /tmp/2408.09869.json --markdown /tmp/2408.09869.md
Use --pipeline vlm for vision-model pipelines; see SKILL.md and pipelines.md.
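The quick-start commands can also be driven from a script, which is handy when an assistant needs to rerun the convert and evaluate steps with a different pipeline. The following is a minimal sketch, not part of the bundle: the helper names (`build_convert_command`, `build_evaluate_command`, `convert_and_evaluate`) are hypothetical, and it uses only the flags shown above (`--to`, `--output`, `--pipeline`, `--markdown`). The output paths are passed in by the caller rather than inferred, and the evaluator's report is returned as raw text, since its format is described in SKILL.md rather than here.

```python
import subprocess
import sys


def build_convert_command(source, output_dir, to_format=None, pipeline=None):
    """Return the docling CLI argument list for one conversion (hypothetical helper)."""
    cmd = ["docling", source]
    if to_format is not None:
        cmd += ["--to", to_format]      # e.g. "json"; the quick start omits it for Markdown
    if pipeline is not None:
        cmd += ["--pipeline", pipeline]  # e.g. "vlm"
    cmd += ["--output", output_dir]
    return cmd


def build_evaluate_command(json_path, markdown_path=None,
                           script="scripts/docling-evaluate.py"):
    """Return the evaluator invocation, mirroring the quick-start command."""
    cmd = [sys.executable, script, json_path]
    if markdown_path is not None:
        cmd += ["--markdown", markdown_path]
    return cmd


def convert_and_evaluate(source, output_dir, json_path, markdown_path, pipeline=None):
    """Convert to Markdown and JSON, then run the evaluator and return its result."""
    subprocess.run(build_convert_command(source, output_dir, pipeline=pipeline),
                   check=True)
    subprocess.run(build_convert_command(source, output_dir, "json", pipeline),
                   check=True)
    # Assumption: the evaluator writes its report to stdout; the caller inspects it.
    return subprocess.run(build_evaluate_command(json_path, markdown_path),
                          capture_output=True, text=True)


if __name__ == "__main__":
    result = convert_and_evaluate(
        "https://arxiv.org/pdf/2408.09869",
        "/tmp/",
        "/tmp/2408.09869.json",
        "/tmp/2408.09869.md",
    )
    print(result.stdout)
```

To retry with a vision-model pipeline, call `convert_and_evaluate(..., pipeline="vlm")` and compare the two reports.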
License
MIT (aligned with Docling).