docling

mirror of https://github.com/docling-project/docling.git synced 2026-05-17 13:10:38 +00:00

Author	SHA1	Message	Date
Jehlum Pandit	c23622f6f5	docs: add agent skill bundle for coding assistants (SKILL.md, pipelines, convert/evaluate) (#3174 ) * docs: add agent skill bundle with convert/evaluate helpers - Add docs/examples/agent_skill/docling-document-intelligence/ with SKILL.md, pipelines.md, EXAMPLE.md, improvement-log template, and scripts/docling-convert.py + docling-evaluate.py (standard/vlm-local/vlm-api). - Document InputFormat.PDF + PdfFormatOption for explicit PdfPipelineOptions. - Link from examples index and mkdocs nav. Made-with: Cursor * docs: align agent skill README and EXAMPLE with Cursor bundle - Document both ~/.cursor/skills and docs/examples paths. - README notes repo parity for PRs and local installs. Made-with: Cursor * DCO Remediation Commit for jehlum11 <jehlum11@gmail.com> I, jehlum11 <jehlum11@gmail.com>, hereby add my Signed-off-by to this commit: `2d268ffb6f` I, jehlum11 <jehlum11@gmail.com>, hereby add my Signed-off-by to this commit: `041e709c66` Signed-off-by: jehlum11 <jehlum11@gmail.com> Made-with: Cursor * docs: refactor agent skill to use docling CLI for conversion Address maintainer feedback: the custom docling-convert.py script was largely redundant with the existing docling CLI. This commit: - Removes scripts/docling-convert.py (redundant with `docling` CLI) - Refactors SKILL.md (v1.4 → v2.0) to use `docling` CLI for all conversion tasks, reserving the Python API only for features the CLI does not expose (chunking, VLM API endpoint config, force_backend_text hybrid mode) - Updates docling-evaluate.py recommended_actions to reference `docling` CLI flags instead of the removed script - Updates README.md, EXAMPLE.md, pipelines.md to use `docling` CLI examples throughout - Simplifies requirements.txt (removes packaging dependency) The only custom script retained is docling-evaluate.py, which provides heuristic quality evaluation — functionality the CLI does not cover. Signed-off-by: jehlum11 <jehlum11@gmail.com> Made-with: Cursor * docs: fix ruff format on docling-evaluate.py Signed-off-by: jehlum11 <jehlum11@gmail.com> Made-with: Cursor --------- Signed-off-by: jehlum11 <jehlum11@gmail.com>	2026-04-13 15:02:51 +02:00
Anish Raghavendra	3a64f41af8	docs: add line-based chunker documentation and examples (#3210 ) Signed-off-by: anish.raghavendra <anish.raghavendra@ibm.com> Co-authored-by: anish.raghavendra <anish.raghavendra@ibm.com>	2026-03-30 10:55:31 +02:00
Tejas Kumar	1321b39cd8	docs: add audio & video processing guide (#3038 ) * Update docs for media * DCO Remediation Commit for Tejas Kumar <tejas.kumar@datastax.com> I, Tejas Kumar <tejas.kumar@datastax.com>, hereby add my Signed-off-by to this commit: `33089ccd73` Signed-off-by: Tejas Kumar <tejas.kumar@datastax.com> --------- Signed-off-by: Tejas Kumar <tejas.kumar@datastax.com>	2026-03-01 09:00:48 +01:00
Cesar Berrospi Ramis	1eb5c21dab	docs: add XBRL conversion example notebook and update feature listings (#3039 ) docs(xbrl): add notebook for XBRL parsing Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>	2026-02-27 16:09:19 +01:00
Michele Dolfi	d4c87133f3	feat: Introduce pluggable VLM runtime system with preset-based configuration (#2919 ) * model runtime refactoring Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add test Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix code formula preset Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * batch prediction Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use presets and new vlm options in CLI Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use new model settings by default Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * running Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update examples Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fixes for running examples Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * keep old stage Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update model Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use granite 3.3 and set options Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * revisit init logic and propagate the proper options to the runtimes Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update all stages with original setup Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * per stage registry Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use chat template Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * remove duplicated predict() and factor out some utils Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * working picture description examples Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add granite docling as code formula model Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * rename code formula presets Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix running minimal_vlm example Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add all models to presets and run compare_vlm Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * remove unused repo_id Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update vlm api model example Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix legacy examples Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add another legacy example Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix test Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * avoid automatic fallback to mlx and fix end_of_utterance in codeformula Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * move vlm_convert_model Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use new vlm runtime class Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * flasg for CI Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * rename runtimes to explicit vlm_runtimes Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * renaming from runtime to inference engine and model families Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fixes Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix test Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add docs with stages Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update docs catalog page Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * rename runtime to inference engine Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2026-02-04 17:29:17 +01:00
Mohd Kaif	bf80e329d1	docs: add Semantica integration (#2860 ) * docs: add Semantica integration * DCO Remediation Commit for KaifAhmad1 <kaifahmad087@gmail.com> I, KaifAhmad1 <kaifahmad087@gmail.com>, hereby add my Signed-off-by to this commit: `bbff355863` Signed-off-by: KaifAhmad1 <kaifahmad087@gmail.com> * docs: add Semantica to mkdocs navigation Signed-off-by: KaifAhmad1 <kaifahmad087@gmail.com> * docs: add title with emoji to Semantica integration page Signed-off-by: KaifAhmad1 <kaifahmad087@gmail.com> * docs: refine Semantica integration with pipeline example and cookbook link Signed-off-by: KaifAhmad1 <kaifahmad087@gmail.com> --------- Signed-off-by: KaifAhmad1 <kaifahmad087@gmail.com>	2026-01-09 09:41:44 +01:00
Michele Dolfi	be085c0e39	docs(RTX): Guidelines for best performance on RTX GPUs (#2765 ) * add RTX docs Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add artwork and fix title Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix series definition Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add nvidia logo and update todo Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-12-19 13:16:59 +01:00
Julia Pap	cc5e3cee74	docs: add docstrings to DocumentConverter #2748 (#2782 ) * docs: add docstrings for DocumentConverter Signed-off-by: Julia Pap <papjuli@gmail.com> * Apply suggestions from code review Improve docstrings in DocumentConverter Co-authored-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com> Signed-off-by: Julia Pap <papjuli@gmail.com> * docs: improve docstring formatting and wording in DocumentConverter * docs: show init method in document converter reference * docs: change back indents to 4x in DocumentConverter docstrings griffe was issuing warnings of confusing indentation * docs: clarify `max_num_pages` and `page_range` args in `DocumentConverter` methods * docs: fix some Yields and Returns in DocumentConverter docstrings * DCO Remediation Commit for Julia Pap <papjuli@gmail.com> I, Julia Pap <papjuli@gmail.com>, hereby add my Signed-off-by to this commit: `cf2ea4e0f0` I, Julia Pap <papjuli@gmail.com>, hereby add my Signed-off-by to this commit: `57446af168` I, Julia Pap <papjuli@gmail.com>, hereby add my Signed-off-by to this commit: `5d613edb8c` I, Julia Pap <papjuli@gmail.com>, hereby add my Signed-off-by to this commit: `b195281f56` I, Julia Pap <papjuli@gmail.com>, hereby add my Signed-off-by to this commit: `5d4a3af5d5` Signed-off-by: Julia Pap <papjuli@gmail.com> * docs: ignore init description, rephrased docstrings Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> --------- Signed-off-by: Julia Pap <papjuli@gmail.com> Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> Co-authored-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com> Co-authored-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>	2025-12-19 11:20:33 +01:00
Edoardo Abati	807303e33e	chore: `mkdocstring` python handler to render pydantic field (#2770 ) * fix mkdocstring handlers Signed-off-by: Edoardo Abati <29585319+EdAbati@users.noreply.github.com> * fix indentation Signed-off-by: Edoardo Abati <29585319+EdAbati@users.noreply.github.com> --------- Signed-off-by: Edoardo Abati <29585319+EdAbati@users.noreply.github.com>	2025-12-11 12:54:13 +01:00
kadirpekel	ce5a099dfd	docs: Add Hector as compatible AI agent platform integration (#2662 ) docs: add Hector as compatible AI agent platform integration Signed-off-by: Kadir Pekel <kadirpekel@gmail.com>	2025-11-20 13:02:47 +01:00
Harry Ho	b216ad848d	docs: Added documentation to use SuryaOCR via plugin docling-surya (#2533 ) * docs: Added documentation to use SuryaOCR via plugin `docling-surya` Signed-off-by: Harry Ho <kho7@student.umgc.edu> * Add PyPI link for docling-surya package Added a link to the PyPI page for docling-surya. Signed-off-by: Harry Ho <4719770+harrykhh@users.noreply.github.com> * Add licensing note for SuryaOCR integration Added important licensing note regarding SuryaOCR integration. Signed-off-by: Harry Ho <4719770+harrykhh@users.noreply.github.com> * Ran linter to reformat Signed-off-by: Harry Ho <4719770+harrykhh@users.noreply.github.com> --------- Signed-off-by: Harry Ho <kho7@student.umgc.edu> Signed-off-by: Harry Ho <4719770+harrykhh@users.noreply.github.com> Co-authored-by: Harry Ho <kho7@student.umgc.edu>	2025-11-19 15:27:24 +01:00
Michele Dolfi	8af228f1e2	docs(examples): processing parquet file of images (#2641 ) * add example processing parquet file of images Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * vlm using vllm api Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use openvino and add more docs Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add default input file Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * change default to standard for running in CI Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use simple rapidocr without openvino in the CI example Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-11-19 06:39:25 +01:00
Ryan Soliveres	d549445e78	docs: Move Installation and Quickstart (Usage) under Getting started (#2644 ) * docs: Move Installation and Quickstart (Usage) under Getting started Moved Installation and Usage (Quickstart) under Getting started section Rename installation folder to documentation folder Rename installation/index.md to documentation/installation.md Duplicate usage/index.md to documentation directory and rename it to documentation/quickstart.md Add redirection from installation and usage Signed-off-by: Ryan S <ryansoliveres@users.noreply.github.com> * docs: Move Installation and Quickstart under Getting started Signed-off-by: ryansoliveres <ryan.soliveres@yahoo.com> * docs: Move Installation and Quickstart under Getting started Signed-off-by: ryansoliveres <ryan.soliveres@yahoo.com> * git commit -m "DCO Remediation Commit for rysoliveres <ryan.soliveres@yahoo.com> I, rysoliveres <ryan.soliveres@yahoo.com>, hereby add my Signed-off-by to this commit: `b7ae13e3d8` Signed-off-by: rysoliveres <ryan.soliveres@yahoo.com>" Signed-off-by: ryansoliveres <ryan.soliveres@yahoo.com> * git commit --allow-empty -m "DCO Remediation Commit for rysoliveres <ryan.soliveres@yahoo.com> I, rysoliveres <ryan.soliveres@yahoo.com>, hereby add my Signed-off-by to this commit: `b7ae13e3d8` Signed-off-by: rysoliveres <ryan.soliveres@yahoo.com>" Signed-off-by: ryansoliveres <ryan.soliveres@yahoo.com> * DCO Remediation Commit for rysoliveres <ryan.soliveres@yahoo.com> I, rysoliveres <ryan.soliveres@yahoo.com>, hereby add my Signed-off-by to this commit: `b7ae13e3d8` Signed-off-by: rysoliveres <ryan.soliveres@yahoo.com> Signed-off-by: ryansoliveres <ryan.soliveres@yahoo.com> * DCO Remediation Commit for rysoliveres <ryan.soliveres@yahoo.com> I, rysoliveres <ryan.soliveres@yahoo.com>, hereby add my Signed-off-by to this commit: `b7ae13e3d8` Signed-off-by: rysoliveres <ryan.soliveres@yahoo.com> Signed-off-by: ryansoliveres <ryan.soliveres@yahoo.com> --------- Signed-off-by: Ryan S <ryansoliveres@users.noreply.github.com> Signed-off-by: ryansoliveres <ryan.soliveres@yahoo.com>	2025-11-18 17:09:41 +01:00
Panos Vagenas	ac9fc585bb	docs: add redirection from getting started page (#2640 ) Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>	2025-11-17 14:13:51 +01:00
Robyn Johnson	ae30373ee7	docs: combine Home and Getting Started pages (#2600 ) * Update mkdocs.yml Remove navigations.sections feature so that navigation menus will collapse & expand. They are collapsed by default. * docs: add sign-off DCO Remediation Commit for Robyn J <bobbinrobyn@users.noreply.github.com> I, Robyn J <bobbinrobyn@users.noreply.github.com>, hereby add my Signed-off-by to this commit: `b7d7441827` Signed-off-by: Robyn J <bobbinrobyn@users.noreply.github.com> * docs: Combine Home and Getting Started page Combine home and getting stated pages, and rename the page "Documentation" Signed-off-by: Robyn J <bobbinrobyn@users.noreply.github.com> --------- Signed-off-by: Robyn J <bobbinrobyn@users.noreply.github.com>	2025-11-14 13:29:25 +01:00
Robyn Johnson	8da3d287ed	docs: make navigation menus collapse and expand (#2573 ) * Update mkdocs.yml Remove navigations.sections feature so that navigation menus will collapse & expand. They are collapsed by default. * docs: add sign-off DCO Remediation Commit for Robyn J <bobbinrobyn@users.noreply.github.com> I, Robyn J <bobbinrobyn@users.noreply.github.com>, hereby add my Signed-off-by to this commit: `b7d7441827` Signed-off-by: Robyn J <bobbinrobyn@users.noreply.github.com> --------- Signed-off-by: Robyn J <bobbinrobyn@users.noreply.github.com>	2025-11-06 05:25:19 +01:00
Michele Dolfi	97aa06bfbc	docs: Add details and examples on optimal GPU setup (#2531 ) * docs for GPU optimizations Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * improve time reporting and improve execution Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix standard pipeline Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * tune examples with batch size 64 Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add benchmark results Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * improve docs Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * typo in excluded tests Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * explicit pipeline in table Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-10-30 13:22:05 +01:00
Michele Dolfi	dd03b53117	docs: discord badge with join link (#2473 ) * add discord link Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * Add Discord link to social section in mkdocs.yml Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com> * Add Discord link to getting started documentation Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com> Co-authored-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>	2025-10-16 10:13:50 +02:00
Peter W. J. Staar	3e6da2c62d	docs: Example on PII obfuscation (#2459 ) * added example on PII obfuscation Signed-off-by: Peter Staar <taa@zurich.ibm.com> * reformatting code Signed-off-by: Peter Staar <taa@zurich.ibm.com> * add in index and fix heading formatting Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add GLINER to PII Signed-off-by: Peter Staar <taa@zurich.ibm.com> * final commit Signed-off-by: Peter Staar <taa@zurich.ibm.com> --------- Signed-off-by: Peter Staar <taa@zurich.ibm.com> Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>	2025-10-14 15:39:16 +02:00
Utsav Talwar	8a4b946a1a	docs: add RAG example with MongoDB Atlas Vector Search and VoyageAI embeddings (#2341 ) * Add MongoDB RAG example * Update MongoDB RAG Example * Update MongoDB RAG Example * Update MongoDB RAG Example * DCO Remediation Commit for utsavMongoDB <utsav.talwar@mongodb.com> I, utsavMongoDB <utsav.talwar@mongodb.com>, hereby add my Signed-off-by to this commit: `fbdbf53aa8` I, utsavMongoDB <utsav.talwar@mongodb.com>, hereby add my Signed-off-by to this commit: `9b3065ba2b` I, utsavMongoDB <utsav.talwar@mongodb.com>, hereby add my Signed-off-by to this commit: `1983f9db35` I, utsavMongoDB <utsav.talwar@mongodb.com>, hereby add my Signed-off-by to this commit: `0522aa105d` I, utsavMongoDB <utsav.talwar@mongodb.com>, hereby add my Signed-off-by to this commit: `f5a67e8012` Signed-off-by: utsavMongoDB <utsav.talwar@mongodb.com> * DCO Remediation Commit for utsavMongoDB <utsav.talwar@mongodb.com> I, utsavMongoDB <utsav.talwar@mongodb.com>, hereby add my Signed-off-by to this commit: `fbdbf53aa8` I, utsavMongoDB <utsav.talwar@mongodb.com>, hereby add my Signed-off-by to this commit: `9b3065ba2b` I, utsavMongoDB <utsav.talwar@mongodb.com>, hereby add my Signed-off-by to this commit: `1983f9db35` I, utsavMongoDB <utsav.talwar@mongodb.com>, hereby add my Signed-off-by to this commit: `0522aa105d` I, utsavMongoDB <utsav.talwar@mongodb.com>, hereby add my Signed-off-by to this commit: `f5a67e8012` Signed-off-by: utsavMongoDB <utsav.talwar@mongodb.com> * docs: Add example with MongoDB * DCO Remediation Commit for utsavMongoDB <utsav.talwar@mongodb.com> I, utsavMongoDB <utsav.talwar@mongodb.com>, hereby add my Signed-off-by to this commit: `bb245a31ed` I, utsavMongoDB <utsav.talwar@mongodb.com>, hereby add my Signed-off-by to this commit: `25436e543c` Signed-off-by: utsavMongoDB <utsav.talwar@mongodb.com> * DCO Remediation Commit for utsavMongoDB <utsav.talwar@mongodb.com> I, utsavMongoDB <utsav.talwar@mongodb.com>, hereby add my Signed-off-by to this commit: `bb245a31ed` I, utsavMongoDB <utsav.talwar@mongodb.com>, hereby add my Signed-off-by to this commit: `25436e543c` Signed-off-by: utsavMongoDB <utsav.talwar@mongodb.com> * DCO Remediation Commit for utsavMongoDB <utsav.talwar@mongodb.com> I, utsavMongoDB <utsav.talwar@mongodb.com>, hereby add my Signed-off-by to this commit: `bb245a31ed` I, utsavMongoDB <utsav.talwar@mongodb.com>, hereby add my Signed-off-by to this commit: `25436e543c` Signed-off-by: utsavMongoDB <utsav.talwar@mongodb.com> --------- Signed-off-by: utsavMongoDB <utsav.talwar@mongodb.com> Signed-off-by: Utsav Talwar <114057324+utsavMongoDB@users.noreply.github.com>	2025-10-03 13:29:43 +02:00
Hakeem Abbas	246de77d8c	fix(docs): fixed the color scheme (#2371 ) * fix(docs): fixed the color scheme Signed-off-by: Hakeem Abbas <hakeemsyd@gmail.com> * fix(docs): colors background Signed-off-by: Hakeem Abbas <hakeemsyd@gmail.com> --------- Signed-off-by: Hakeem Abbas <hakeemsyd@gmail.com>	2025-10-03 10:20:44 +02:00
Michele Dolfi	a975a790c9	docs: example using Hashicorp Vault PII transform (#2373 ) docs: add example using Hashicorp Vault PII transform Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-10-03 09:53:29 +02:00
Lucas Morin	e6c3b05e63	docs: Jobkit and connectors (#2357 ) * feat: create documentation for docling-jobkit Signed-off-by: Lucas Morin <lucas.morin222@gmail.com> * small text fixes Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Lucas Morin <lucas.morin222@gmail.com> Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>	2025-10-02 13:46:56 +02:00
Christoph Auer	17afb664d0	feat: Add granite-docling model (#2272 ) * adding granite-docling preview Signed-off-by: Peter Staar <taa@zurich.ibm.com> * updated the model specs Signed-off-by: Peter Staar <taa@zurich.ibm.com> * typo Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use granite-docling and add to the model downloader Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update docs and README Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * Update final repo_ids for GraniteDocling Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Update final repo_ids for GraniteDocling Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Fix model name in CLI usage example Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com> * Fix VLM model name in README.md Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com> --------- Signed-off-by: Peter Staar <taa@zurich.ibm.com> Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Signed-off-by: Christoph Auer <cau@zurich.ibm.com> Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com> Co-authored-by: Peter Staar <taa@zurich.ibm.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>	2025-09-17 15:15:49 +02:00
Cesar Berrospi Ramis	f8cc545bab	docs: add an example of RAG with OpenSearch (#2238 ) * docs: add an example of RAG with OpeanSearch Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * chore: pin latest docling-core and update uv.lock Pin latest version release of docling-core in pyproject.toml Update the dependencies in uv.lock file Run the notebook rag_opensearch.ipynb to pick up changes from docling-core Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> --------- Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>	2025-09-10 14:37:22 +02:00
Roy Derks	e5cd7020bd	docs: Add instructions for using Docling with MCP to README (#2219 ) * docs: Add instructions for using Docling with MCP to README * DCO Remediation Commit for Roy Derks <10717410+royderks@users.noreply.github.com> Signed-off-by: Roy Derks <roy.derks@ibm.com> * DCO Remediation Commit for Roy Derks <10717410+royderks@users.noreply.github.com> I, Roy Derks <10717410+royderks@users.noreply.github.com>, hereby add my Signed-off-by to this commit: `4b9ba1d0ef` Signed-off-by: Roy Derks <roy.derks@ibm.com> * docs: reorganize documentation on MCP server Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * docs: align README with documentation index page Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> --------- Signed-off-by: Roy Derks <roy.derks@ibm.com> Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> Co-authored-by: Roy Derks <roy.derks@ibm.com> Co-authored-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>	2025-09-10 10:02:28 +02:00
Panos Vagenas	a9f41b088e	docs: add information extraction example (#2199 ) * docs: add information exctraction example Signed-off-by: Panos Vagenas <pva@zurich.ibm.com> * update README Signed-off-by: Panos Vagenas <pva@zurich.ibm.com> * minor typo Signed-off-by: Panos Vagenas <pva@zurich.ibm.com> * update README Signed-off-by: Panos Vagenas <pva@zurich.ibm.com> --------- Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>	2025-09-05 11:27:09 +02:00
Michele Dolfi	c0268416cf	chore: add analytics (#2133 ) add analytics Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-08-25 18:25:38 +02:00
Maroun Touma	e76298c40d	docs: DPK pipeline example using docling library (#2112 ) * Notebook showing example on how to use docling transforms in DPK Signed-off-by: Maroun Touma <touma@us.ibm.com> * fix HF Token name Signed-off-by: Maroun Touma <touma@us.ibm.com> * use %pip instead of pip install jupyter lab Signed-off-by: Maroun Touma <touma@us.ibm.com> * run formatter Signed-off-by: Maroun Touma <touma@us.ibm.com> * add example to mkdocs and fix typo Signed-off-by: Maroun Touma <touma@us.ibm.com> --------- Signed-off-by: Maroun Touma <touma@us.ibm.com>	2025-08-21 10:14:36 +02:00
Panos Vagenas	8996d612aa	docs: add Getting Started page (#2113 ) * docs: add Getting Started page Signed-off-by: Panos Vagenas <pva@zurich.ibm.com> * refactor usage Signed-off-by: Panos Vagenas <pva@zurich.ibm.com> * minor renaming Signed-off-by: Panos Vagenas <pva@zurich.ibm.com> --------- Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>	2025-08-21 08:44:53 +02:00
Eric Deandrea	76c1fbd6e8	docs: Add docling Quarkus integration (#2083 ) * Add docling Quarkus integration * DCO Remediation Commit for Eric Deandrea <eric.deandrea@ibm.com> I, Eric Deandrea <eric.deandrea@ibm.com>, hereby add my Signed-off-by to this commit: `86aa0b80f4` Signed-off-by: Eric Deandrea <eric.deandrea@ibm.com> --------- Signed-off-by: Eric Deandrea <eric.deandrea@ibm.com>	2025-08-18 06:55:51 +02:00
Panos Vagenas	e2cca931be	docs: add Langflow integration (#2068 ) * docs: add langflow integration Signed-off-by: Panos Vagenas <pva@zurich.ibm.com> * fix link Signed-off-by: Panos Vagenas <pva@zurich.ibm.com> --------- Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>	2025-08-11 16:03:29 +02:00
Thomas Vitale	bfda6d34d8	docs: Add Arconia integration (#2061 ) Signed-off-by: Thomas Vitale <ThomasVitale@users.noreply.github.com>	2025-08-08 09:35:47 +02:00
Michele Dolfi	90a7cc4bdd	docs: enrich existing DoclingDocument (#1969 ) add example for enriching an existing doclingdocument Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-07-22 16:20:15 +02:00
Fabiano Franz	5d98bcea1b	docs: add documentation for confidence scores (#1912 ) * docs: add documentation for confidence scores Signed-off-by: Fabiano Franz <contact@fabianofranz.com> * Increase focus on confidence grades, scores are informational only Signed-off-by: Fabiano Franz <contact@fabianofranz.com> * Update confidence_scores.md Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com> --------- Signed-off-by: Fabiano Franz <contact@fabianofranz.com> Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com> Co-authored-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>	2025-07-21 10:16:17 +02:00
Peter W. J. Staar	f3ae3029b8	docs: update readme and add ASR example (#1836 ) * updated the README Signed-off-by: Peter Staar <taa@zurich.ibm.com> * added minimal_asr_pipeline Signed-off-by: Peter Staar <taa@zurich.ibm.com> * Updated README and added ASR example Signed-off-by: Peter Staar <taa@zurich.ibm.com> * Updated docs.index.md Signed-off-by: Peter Staar <taa@zurich.ibm.com> * updated CI and mkdocs Signed-off-by: Peter Staar <taa@zurich.ibm.com> * added link tp existing audio file Signed-off-by: Peter Staar <taa@zurich.ibm.com> * added link tp existing audio file Signed-off-by: Peter Staar <taa@zurich.ibm.com> * reformatting Signed-off-by: Peter Staar <taa@zurich.ibm.com> --------- Signed-off-by: Peter Staar <taa@zurich.ibm.com>	2025-06-23 18:55:16 +02:00
Michele Dolfi	49b10e7419	docs: add open webui (#1734 ) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-06-10 09:35:20 +02:00
Peter W. J. Staar	cfdf4cea25	feat: new vlm-models support (#1570 ) * feat: adding new vlm-models support Signed-off-by: Peter Staar <taa@zurich.ibm.com> * fixed the transformers Signed-off-by: Peter Staar <taa@zurich.ibm.com> * got microsoft/Phi-4-multimodal-instruct to work Signed-off-by: Peter Staar <taa@zurich.ibm.com> * working on vlm's Signed-off-by: Peter Staar <taa@zurich.ibm.com> * refactoring the VLM part Signed-off-by: Peter Staar <taa@zurich.ibm.com> * all working, now serious refacgtoring necessary Signed-off-by: Peter Staar <taa@zurich.ibm.com> * refactoring the download_model Signed-off-by: Peter Staar <taa@zurich.ibm.com> * added the formulate_prompt Signed-off-by: Peter Staar <taa@zurich.ibm.com> * pixtral 12b runs via MLX and native transformers Signed-off-by: Peter Staar <taa@zurich.ibm.com> * added the VlmPredictionToken Signed-off-by: Peter Staar <taa@zurich.ibm.com> * refactoring minimal_vlm_pipeline Signed-off-by: Peter Staar <taa@zurich.ibm.com> * fixed the MyPy Signed-off-by: Peter Staar <taa@zurich.ibm.com> * added pipeline_model_specializations file Signed-off-by: Peter Staar <taa@zurich.ibm.com> * need to get Phi4 working again ... Signed-off-by: Peter Staar <taa@zurich.ibm.com> * finalising last points for vlms support Signed-off-by: Peter Staar <taa@zurich.ibm.com> * fixed the pipeline for Phi4 Signed-off-by: Peter Staar <taa@zurich.ibm.com> * streamlining all code Signed-off-by: Peter Staar <taa@zurich.ibm.com> * reformatted the code Signed-off-by: Peter Staar <taa@zurich.ibm.com> * fixing the tests Signed-off-by: Peter Staar <taa@zurich.ibm.com> * added the html backend to the VLM pipeline Signed-off-by: Peter Staar <taa@zurich.ibm.com> * fixed the static load_from_doctags Signed-off-by: Peter Staar <taa@zurich.ibm.com> * restore stable imports Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use AutoModelForVision2Seq for Pixtral and review example (including rename) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * remove unused value Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * refactor instances of VLM models Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * skip compare example in CI Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use lowercase and uppercase only Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add new minimal_vlm example and refactor pipeline_options_vlm_model for cleaner import Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * rename pipeline_vlm_model_spec Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * move more argument to options and simplify model init Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add supported_devices Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * remove not-needed function Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * exclude minimal_vlm Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * missing file Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add message for transformers version Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * rename to specs Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use module import and remove MLX from non-darwin Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * remove hf_vlm_model and add extra_generation_args Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use single HF VLM model class Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * remove torch type Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add docs for vision models Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Peter Staar <taa@zurich.ibm.com> Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>	2025-06-02 17:01:06 +02:00
Panos Vagenas	9f28abf061	docs: add advanced chunking & serialization example (#1589 ) Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>	2025-05-14 14:35:07 +02:00
Panos Vagenas	3220a592e7	docs: add serialization docs, update chunking docs (#1556 ) * docs: add serializers docs, update chunking docs Signed-off-by: Panos Vagenas <pva@zurich.ibm.com> * update notebook to improve MD table rendering Signed-off-by: Panos Vagenas <pva@zurich.ibm.com> --------- Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>	2025-05-08 21:43:01 +02:00
Ryan Lin	a2fbbba9f7	feat: add tutorial using Milvus and Docling for RAG pipeline (#1449 ) * feat: add milvus rag with docling tutorial Signed-off-by: Ryan Lin <linjinhong@yandex.com> * chore: run pre-commit Signed-off-by: Ryan Lin <linjinhong@yandex.com> * feat: add RAG with Milvus example to mkdocs Signed-off-by: Ryan Lin <linjinhong@yandex.com> --------- Signed-off-by: Ryan Lin <linjinhong@yandex.com>	2025-04-25 09:12:35 +02:00
Gabe Goodhart	c605edd8e9	feat: OllamaVlmModel for Granite Vision 3.2 (#1337 ) * build: Add ollama sdk dependency Branch: OllamaVlmModel Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> * feat: Add option plumbing for OllamaVlmOptions in pipeline_options Branch: OllamaVlmModel Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> * feat: Full implementation of OllamaVlmModel Branch: OllamaVlmModel Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> * feat: Connect "granite_vision_ollama" pipeline option to CLI Branch: OllamaVlmModel Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> * Revert "build: Add ollama sdk dependency" After consideration, we're going to use the generic OpenAI API instead of the Ollama-specific API to avoid duplicate work. This reverts commit bc6b366468cdd66b52540aac9c7d8b584ab48ad0. Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> * refactor: Move OpenAI API call logic into utils.utils This will allow reuse of this logic in a generic VLM model NOTE: There is a subtle change here in the ordering of the text prompt and the image in the call to the OpenAI API. When run against Ollama, this ordering makes a big difference. If the prompt comes before the image, the result is terse and not usable whereas the prompt coming after the image works as expected and matches the non-OpenAI chat API. Branch: OllamaVlmModel Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> * refactor: Refactor from Ollama SDK to generic OpenAI API Branch: OllamaVlmModel Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> * fix: Linting, formatting, and bug fixes The one bug fix was in the timeout arg to openai_image_request. Otherwise, this is all style changes to get MyPy and black passing cleanly. 
Branch: OllamaVlmModel Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> * remove model from download enum Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * generalize input args for other API providers Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * rename and refactor Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add example Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * require flag for remote services Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * disable example from CI Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add examples to docs Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Gabe Goodhart <ghart@us.ibm.com> Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>	2025-04-10 18:03:04 +02:00
Michele Dolfi	2e99e5a54f	docs: add plugins docs (#1319 ) add plugin docs Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-04-08 09:44:37 +02:00
Panos Vagenas	71148eb381	docs: add visual grounding example (#1270 ) Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>	2025-04-02 14:03:19 +02:00
Michele Dolfi	54a78c307d	docs: move apify to docs (#1182 ) move apify to docs Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-03-18 16:43:55 +01:00
Michele Dolfi	fa16b12316	chore: move to docling-project org (#1160 ) * chore: rename org Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * Update docs/faq/index.md Co-authored-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com> Signed-off-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com> * update github pages Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * revert test content Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Signed-off-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com> Co-authored-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>	2025-03-14 12:35:29 +01:00
Michele Dolfi	357d41cc47	docs: Enrichment models (#1097 ) * warning for develop examples Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add docs for enrichment models Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * minor reorg of top-level docs (#1098) * minor reorg of top-level docs Signed-off-by: Panos Vagenas <pva@zurich.ibm.com> * fix typo [no ci] Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com> --------- Signed-off-by: Panos Vagenas <pva@zurich.ibm.com> Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com> * trigger ci Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Signed-off-by: Panos Vagenas <pva@zurich.ibm.com> Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com> Co-authored-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>	2025-03-04 14:24:38 +01:00
Panos Vagenas	27c04007bc	docs: revamp picture description example (#1015 ) * docs: revamp picture description example Signed-off-by: Panos Vagenas <pva@zurich.ibm.com> * Improvements for visualization example (#1017) * fix colab install, use granite and improve viz of description Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * switch docs to notbook Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * show results with all models Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * show other vlm Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Panos Vagenas <pva@zurich.ibm.com> Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>	2025-02-19 11:28:54 +01:00
Tobias Strebitzer	00d9405b0a	feat: Add support for CSV input with new backend to transform CSV files to DoclingDocument (#945 ) * feat: Implement csv backend and format detection Signed-off-by: Tobias Strebitzer <tobias.strebitzer@magloft.com> * test: Implement csv parsing and format tests Signed-off-by: Tobias Strebitzer <tobias.strebitzer@magloft.com> * docs: Add example and CSV format documentation Signed-off-by: Tobias Strebitzer <tobias.strebitzer@magloft.com> * feat: Add support for various CSV dialects and update documentation Signed-off-by: Tobias Strebitzer <tobias.strebitzer@magloft.com> * feat: Add validation for delimiters and tests for inconsistent csv files Signed-off-by: Tobias Strebitzer <tobias.strebitzer@magloft.com> --------- Signed-off-by: Tobias Strebitzer <tobias.strebitzer@magloft.com>	2025-02-14 08:55:09 +01:00
Michele Dolfi	2d66e99b69	docs: Examples for picture descriptions (#951 ) * add more examples for picture descriptions Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix merge typo Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-02-13 08:33:12 +01:00

82 Commits