mirror of https://github.com/docling-project/docling.git synced 2026-05-17 13:10:38 +00:00

Jehlum Pandit c23622f6f5 docs: add agent skill bundle for coding assistants (SKILL.md, pipelines, convert/evaluate) (#3174)

* docs: add agent skill bundle with convert/evaluate helpers

- Add docs/examples/agent_skill/docling-document-intelligence/ with
  SKILL.md, pipelines.md, EXAMPLE.md, improvement-log template, and
  scripts/docling-convert.py + docling-evaluate.py (standard/vlm-local/vlm-api).
- Document InputFormat.PDF + PdfFormatOption for explicit PdfPipelineOptions.
- Link from examples index and mkdocs nav.

Made-with: Cursor

* docs: align agent skill README and EXAMPLE with Cursor bundle

- Document both ~/.cursor/skills and docs/examples paths.
- README notes repo parity for PRs and local installs.

Made-with: Cursor

* DCO Remediation Commit for jehlum11 <jehlum11@gmail.com>

I, jehlum11 <jehlum11@gmail.com>, hereby add my Signed-off-by to this commit: 2d268ffb6f
I, jehlum11 <jehlum11@gmail.com>, hereby add my Signed-off-by to this commit: 041e709c66

Signed-off-by: jehlum11 <jehlum11@gmail.com>
Made-with: Cursor

* docs: refactor agent skill to use docling CLI for conversion

Address maintainer feedback: the custom docling-convert.py script was
largely redundant with the existing docling CLI. This commit:

- Removes scripts/docling-convert.py (redundant with `docling` CLI)
- Refactors SKILL.md (v1.4 → v2.0) to use `docling` CLI for all
  conversion tasks, reserving the Python API only for features the
  CLI does not expose (chunking, VLM API endpoint config,
  force_backend_text hybrid mode)
- Updates docling-evaluate.py recommended_actions to reference
  `docling` CLI flags instead of the removed script
- Updates README.md, EXAMPLE.md, pipelines.md to use `docling` CLI
  examples throughout
- Simplifies requirements.txt (removes packaging dependency)

The only custom script retained is docling-evaluate.py, which provides
heuristic quality evaluation — functionality the CLI does not cover.

Signed-off-by: jehlum11 <jehlum11@gmail.com>
Made-with: Cursor

* docs: fix ruff format on docling-evaluate.py

Signed-off-by: jehlum11 <jehlum11@gmail.com>
Made-with: Cursor

---------

Signed-off-by: jehlum11 <jehlum11@gmail.com>

2026-04-13 15:02:51 +02:00

Using the Docling agent skill

Agent Skills are folders of instructions that AI coding agents (Cursor, Claude Code, GitHub Copilot, etc.) can load when relevant.

Where this bundle lives

Cursor (local): ~/.cursor/skills/docling-document-intelligence/ (or copy this folder there).
Docling repository (docs + PRs): docs/examples/agent_skill/docling-document-intelligence/ in github.com/docling-project/docling.

The two trees are kept in sync; use either source.

Install (copy into your agent's skills directory)

# From a checkout of the Docling repo
cp -r docs/examples/agent_skill/docling-document-intelligence ~/.cursor/skills/

# Or copy from another machine / archive into e.g. ~/.claude/skills/

No configuration is needed beyond installing the Python dependencies (see "Running the docling CLI directly" below).
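For agents whose skills directory is not ~/.cursor/skills/, the copy step can be scripted. The following is a minimal sketch, not part of Docling: `install_skill` is a hypothetical helper, and it assumes the bundle sits at the repository path given in this README and that the target agent simply reads skills from the directory you pass in.

```python
import shutil
from pathlib import Path

# Location of the bundle inside a Docling checkout (path taken from this README).
BUNDLE = Path("docs/examples/agent_skill/docling-document-intelligence")


def install_skill(repo_root: Path, skills_dir: Path) -> Path:
    """Copy the skill bundle into an agent's skills directory.

    Hypothetical helper for illustration; it mirrors the `cp -r` command above.
    """
    source = repo_root / BUNDLE
    if not (source / "SKILL.md").is_file():
        raise FileNotFoundError(f"No SKILL.md found under {source}")
    target = skills_dir / BUNDLE.name
    # dirs_exist_ok=True lets a second run refresh an existing copy (Python 3.8+).
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target


# Example call, assuming the current directory is a Docling checkout:
#   install_skill(Path.cwd(), Path.home() / ".cursor" / "skills")
```

Passing the skills directory explicitly keeps the helper agnostic about which agent is in use; ~/.claude/skills/ works the same way as ~/.cursor/skills/.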

Usage

Open your agent-enabled IDE and ask, for example:

Parse report.pdf and give me a structural outline

Convert https://arxiv.org/pdf/2408.09869 to markdown

Chunk invoice.pdf for RAG ingestion with 512 token chunks

Process scanned.pdf using the VLM pipeline

The agent should read SKILL.md, match the request to a task, and run the appropriate docling CLI command. It should fall back to the Python API only for features the CLI does not expose, such as chunking.

Running the docling CLI directly

pip install docling docling-core

# Basic conversion to Markdown
docling report.pdf --output /tmp/

# JSON output
docling report.pdf --to json --output /tmp/

# Custom OCR engine
docling report.pdf --ocr-engine rapidocr --output /tmp/

# VLM pipeline
docling scanned.pdf --pipeline vlm --output /tmp/

# VLM with specific model
docling scanned.pdf --pipeline vlm --vlm-model granite_docling --output /tmp/

# Remote VLM services
docling doc.pdf --pipeline vlm --enable-remote-services --output /tmp/
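When the CLI is driven from a script rather than typed by hand, it can help to assemble the argument list in one place. The sketch below is an illustration only: `docling_command` is a hypothetical helper, and it emits only the flags that appear in the commands above (`--to`, `--ocr-engine`, `--pipeline`, `--vlm-model`, `--enable-remote-services`, `--output`). It does not validate the values against what the installed docling version accepts.

```python
import shlex
from typing import List, Optional


def docling_command(
    source: str,
    output_dir: str = "/tmp/",
    to: Optional[str] = None,
    pipeline: Optional[str] = None,
    ocr_engine: Optional[str] = None,
    vlm_model: Optional[str] = None,
    enable_remote_services: bool = False,
) -> List[str]:
    """Build an argv list for the docling CLI (hypothetical helper)."""
    argv = ["docling", source]
    if to:
        argv += ["--to", to]
    if ocr_engine:
        argv += ["--ocr-engine", ocr_engine]
    if pipeline:
        argv += ["--pipeline", pipeline]
    if vlm_model:
        argv += ["--vlm-model", vlm_model]
    if enable_remote_services:
        argv.append("--enable-remote-services")
    argv += ["--output", output_dir]
    return argv


cmd = docling_command("scanned.pdf", pipeline="vlm", vlm_model="granite_docling")
print(shlex.join(cmd))
# prints: docling scanned.pdf --pipeline vlm --vlm-model granite_docling --output /tmp/
# To execute it: subprocess.run(cmd, check=True)
```

Building a list instead of a shell string avoids quoting problems when file names contain spaces.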

Evaluate and refine

docling report.pdf --to json --output /tmp/
docling report.pdf --to md --output /tmp/
python3 scripts/docling-evaluate.py /tmp/report.json --markdown /tmp/report.md

If the report shows warn or fail, follow recommended_actions, re-convert with docling using the suggested flags, and optionally append a note to improvement-log.md (see SKILL.md section 7).
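The three evaluation commands can be chained from Python. This is a rough sketch under stated assumptions: `evaluation_commands` and `run_evaluation` are hypothetical names, the script is assumed to be run from the skill folder so that scripts/docling-evaluate.py resolves, and docling is assumed to name its output after the input file's stem, as the commands above imply (report.pdf becomes /tmp/report.json and /tmp/report.md).

```python
import subprocess
from pathlib import Path
from typing import List


def evaluation_commands(pdf: str, out_dir: str = "/tmp/") -> List[List[str]]:
    """Return the two conversions and the evaluation call as argv lists."""
    stem = Path(pdf).stem
    # Assumption: docling writes <stem>.json and <stem>.md into out_dir.
    json_path = (Path(out_dir) / f"{stem}.json").as_posix()
    md_path = (Path(out_dir) / f"{stem}.md").as_posix()
    return [
        ["docling", pdf, "--to", "json", "--output", out_dir],
        ["docling", pdf, "--to", "md", "--output", out_dir],
        ["python3", "scripts/docling-evaluate.py", json_path, "--markdown", md_path],
    ]


def run_evaluation(pdf: str, out_dir: str = "/tmp/") -> int:
    """Run both conversions, then the evaluator; return the evaluator's exit code."""
    *conversions, evaluate = evaluation_commands(pdf, out_dir)
    for argv in conversions:
        subprocess.run(argv, check=True)  # abort if a conversion fails
    # How docling-evaluate.py signals warn or fail through its exit code is not
    # stated in this README, so the code is returned rather than interpreted.
    return subprocess.run(evaluate).returncode
```

After a warn or fail report, the same function can be called again once the conversion flags have been adjusted according to recommended_actions.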

What the skill covers

Task	How to ask
Parse PDF / DOCX / PPTX / HTML / image	"parse this file"
Convert to Markdown	"convert to markdown"
Export as structured JSON	"export as JSON"
Chunk for RAG	"chunk for RAG", "prepare for ingestion"
Analyze structure	"show me the headings and tables"
Use VLM pipeline	"use the VLM pipeline", "process scanned PDF"
Use remote inference	"use vLLM", "call the API pipeline"
