docling

mirror of https://github.com/docling-project/docling.git synced 2026-05-17 13:10:38 +00:00

Author	SHA1	Message	Date
Maksym Lysak	38354b7d13	Added support of "row_section" semantics of HTML_backend. Improvements on complex rendering example. Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>	2026-05-12 17:08:27 +02:00
Maksym Lysak	85a2a9c5fd	Complex example of html page renderer, with custom logic of table extraction and dynamic resolution based on seed images Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>	2026-05-12 14:16:06 +02:00
Peter El Hachem	336f942854	feat: add 2 stage model dowload from hf and call it for threaded layout model. (#3267 ) * feat: add 2 stage model dowload from hf and call it from threaded_layout_vlm_pipeline Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> * feat: cleanup extra space and import Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> * fix: move demo to docs/examples/ Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> --------- Signed-off-by: ElHachem02 <peterelhachem02@gmail.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>	2026-05-06 13:29:42 +02:00
EliSchwartz	24f2d148d9	feat(vlm): upgrade Granite Vision model to 4.1 for table + chart extraction (#3382 ) * feat(table-structure): swap VLM model to granite-vision-4.1-4b Updates GraniteVisionTableStructureModel to use the 4.1 model. The 4.1 weights are pre-merged, so merge_lora_adapters() is now hasattr-guarded. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com> * feat(chart-extraction): swap V4 VLM model to granite-vision-4.1-4b Updates ChartExtractionModelGraniteVisionV4 to use the 4.1 model. hasattr-guards the merge_lora_adapters() call since 4.1 weights are pre-merged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com> * docs(example): mention granite-vision-4.1-4b in table-structure example Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com> * docs(catalog): update Granite Vision entry to 4.1-4b Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com> * feat(chart-extraction): honor cuda_use_flash_attention2 in V4 loader Mirrors the table-structure loader so ChartExtractionModelGraniteVisionV4 also passes _attn_implementation based on AcceleratorOptions. Without this the chart model falls back to the transformers SDPA default, which can hit cuDNN backend failures on some torch/cuDNN stacks while the table model (which already passed the flag) runs cleanly. Stores accelerator_options on the base class so subclasses can read it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com> * fix(model-downloader): update Granite Vision log message to 4.1 The log message in download_models still mentioned "Granite Vision 4.0" after the model swap. Correct it to match the current model version. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com> * fix(chart-extraction): fall back to bare CSV when V4 model omits ```csv``` fence granite-vision-4.1-4b sometimes emits raw CSV without a ```csv``` code fence for the <chart2csv> prompt, which caused _extract_csv_to_dataframe to raise ValueError and drop the chart's tabular_chart metadata. Mirror the tolerant parsing already used by the v3 class: prefer a fenced block, otherwise strip any stray backtick prefix/suffix and parse the text as-is. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com> --------- Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com> Co-authored-by: Eli Schwartz <eliyahu.schwartz@ibm.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 08:36:08 +02:00
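Note on commit 24f2d148d9: the tolerant CSV parsing described in its last bullet (prefer a fenced block, otherwise strip stray backticks and parse the text as-is) could look roughly like the sketch below. This is an illustration only, not docling's `_extract_csv_to_dataframe`; the helper name `extract_csv_rows`, the regular expression, and the use of the standard `csv` module instead of a dataframe are all assumptions made for this example.

```python
import csv
import io
import re

# Hypothetical pattern: an optional "csv" language tag after the opening fence.
_FENCE = re.compile(r"```(?:csv)?\s*\n(.*?)```", re.DOTALL)


def extract_csv_rows(model_output: str) -> list[list[str]]:
    """Return CSV rows from model output, with or without a code fence.

    Hypothetical helper for illustration; not part of docling.
    """
    match = _FENCE.search(model_output)
    if match:
        # Preferred path: take the body of the first fenced block.
        text = match.group(1)
    else:
        # Fallback: drop stray backticks at either end and parse as-is.
        text = model_output.strip().strip("`")
    text = text.strip()
    if not text:
        raise ValueError("no CSV content found")
    # Skip empty rows produced by blank lines.
    return [row for row in csv.reader(io.StringIO(text)) if row]
```

Under these assumptions, fenced and unfenced output yield the same rows, so a missing fence no longer discards the chart data.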
geoHeil	eb4724ee4c	ci: prototype tach-based modular skipping (#3333 ) * ci: prototype tach-based modular skipping Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: modularize ubuntu setup and refine gating Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: adopt metaxy-inspired governance helpers - replace custom aggregate check with re-actors/alls-green - set FORCE_JAVASCRIPT_ACTIONS_TO_NODE24 on every workflow - keep PR concurrency alive when the graphite:merge label is present Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: tune checks and pin action versions Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: split CI suites and heavy examples Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * DCO Remediation Commit for Georg Heiler <georg.kf.heiler@gmail.com> I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: ecaa4777886157d5c2a7b3893c3a820983089dbf I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: d15416f3ca94ac97af2a8317cd6404208db9d896 Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: sharpen tach graph and per-suite path filters - Split docling.pipeline into per-pipeline tach modules (asr, vlm, standard_pdf, threaded_standard_pdf, legacy_standard_pdf, extraction_vlm, base, base_extraction, simple) so pytest --tach-base impact analysis can attribute changes to a specific pipeline rather than the whole package. - Split the asr- and vlm-specific docling.datamodel option files (asr_model_specs, pipeline_options_asr_model, vlm_engine_options, vlm_model_specs, pipeline_options_vlm_model, layout_model_specs, stage_model_specs, backend_options) into their own tach modules so a narrow spec/options change no longer marks the full datamodel as impacted. 
- Narrow the per-suite pipeline path filters in checks.yml to the concrete pipeline files relevant to each suite, so editing vlm_pipeline.py only triggers the vlm matrix cell and editing asr_pipeline.py only the asr one. - Rekey the model cache in setup-ubuntu-ci to include runner.os and hashFiles(uv.lock, pyproject.toml), with ordered restore-keys fallbacks so a lockfile bump no longer silently stales the cache. Metaxy parity note: layered tach enforcement (layer = "...") is blocked by existing backend<->datamodel and utils<->stages cycles; depot runners, nox dynamic matrices, devenv/nix, dprint and ty are not applicable to docling's stack. All pinned action SHAs are on their latest release as of this commit. Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: introduce pipeline and orchestration tach layers Earlier notes claimed layers were blocked. That was only true for the cyclic core (backend<->datamodel, utils<->stages). The boundary above core is clean: - No module under docling/backend, docling/datamodel, docling/models, docling/utils, docling/exceptions, or docling/chunking imports anything from docling.pipeline (verified by grep). - No module anywhere in docling/ imports from docling.cli, docling.document_converter, docling.document_extractor, or docling.service_client (also verified). So we can introduce two real layers on top of the cyclic core: - "pipeline" — docling.pipeline and all nine concrete pipelines (base, simple, base_extraction, asr, vlm, extraction_vlm, standard_pdf, threaded_standard_pdf, legacy_standard_pdf). - "orchestration" — docling.cli, docling.document_converter, docling.document_extractor, and docling.experimental.pipeline. Unlayered modules stay "below" both layers (tach allows them to be depended on freely) and continue to carry the declared-but-cyclic backend<->datamodel and utils<->stages edges. 
A VLM-only layer was explored but rejected: only docling.pipeline.vlm_pipeline and docling.pipeline.extraction_vlm_pipeline could be cleanly layered as "vlm", because the matching datamodel options (pipeline_options_vlm_model, vlm_engine_options, vlm_model_specs) and model stages (vlm_convert, vlm_pipeline_models) sit inside the datamodel/models cycle and cannot be promoted to a higher layer without first breaking that cycle. Layering only the two pipeline files is not worth the extra config. Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: expand tach layers to entrypoints/pipeline/models/core Follow-up to the two-layer attempt. After verifying via grep that nothing in datamodel/utils/backend imports from docling.models.{extraction,factories,plugins,vlm_pipeline_models} or from the "upper" stages (page_assemble, page_preprocessing, reading_order, picture_description, vlm_convert), those nine modules can be promoted out of the cyclic core into a dedicated "models" layer. The resulting order (highest first): - entrypoints — cli, document_converter, document_extractor, experimental.pipeline - pipeline — docling.pipeline + the nine concrete pipelines - models — model factories, extraction, plugins, vlm_pipeline_models, and the five "upper" stages - core — datamodel, backend, utils, exceptions, chunking, models (base), models.utils, inference_engines., the six "core stages" that utils cycles with (chart_extraction, code_formula, layout, ocr, picture_classifier, table_structure), and the experimental. and service_client modules Rename the previous "orchestration" layer to "entrypoints" to match the common docling vocabulary. Every module now carries an explicit layer tag instead of relying on implicit unlayered behaviour, so future additions must pick a layer deliberately. 
A VLM layer, a stand-alone inference-engines layer, and separating datamodel from backend all remain blocked by the bidirectional backend<->datamodel and utils<->core-stages edges; those need a code-level refactor first. Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: refine tach client and foundation layers Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: add optional windows and macos smoke lanes Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: normalize reusable workflow boolean inputs Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: replace external all-green action Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: use org-allowed setup-uv action Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: install compiler toolchain for ML tests Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * DCO Remediation Commit for Georg Heiler <georg.kf.heiler@gmail.com> I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: bb714afb42cd1b29ab073a7f59cc72874ff2fdcd I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: a1f2761da8f72bfed636bd571ebf77b42c8771b6 Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * DCO Remediation Commit for Georg Heiler <georg.kf.heiler@gmail.com> I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: cc6551b54c5bf4815ae9cd57cf43a98928a74be0 I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: b21b0e7ca12b552dbdd54fac1bda113719c286f1 Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: simplify ML pytest suite patterns Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: gate heavy examples on label, add job timeouts - ci-heavy-examples: run only on main push, schedule, workflow_dispatch, or when a PR is labeled tests:full / tests:heavy-examples. 
Drops the path-based auto-trigger so that common edits to pyproject.toml, uv.lock, or .github/actions do not kick off the 45-60min matrix on every PR push. Collapses the changes job into a job-level if gate and adds timeout-minutes: 90. - checks.yml: add timeout-minutes to every job so stuck runners cannot burn the full 6h default. Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: tolerate cancelled allowed-skip jobs in check aggregator Intentional cancellations (manual cancel, concurrency replacement) on jobs that are already in ALLOWED_SKIPS should not mark the overall workflow red. Treat `cancelled` the same as `skipped` when the job is listed as an allowed skip; any unexpected cancellation of a required job still fails. Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * docs: make minimal vlm example portable Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * DCO Remediation Commit for Georg Heiler <georg.kf.heiler@gmail.com> I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 2135051da3ed73d4b8a9130f584f40b56155af1a I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 4f6d1d7960f7418d0cde6425ae61538da84fda40 Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: install workspace packages in CI syncs Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * DCO Remediation Commit for Georg Heiler <georg.kf.heiler@gmail.com> I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 492fa9883d4de6d98ebcb40fa863eafe2facff3c I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 3eefae71643f9ca3df0264690c0c6eb1f67f06f1 Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * DCO Remediation Commit for Georg Heiler <georg.kf.heiler@gmail.com> I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: fe8c9689a0ee94f36eb826da8e2177ef87404f5e I, Georg Heiler 
<georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: eabdd24a6734ec873cdaac857718aef2473677e7 Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: remove unused graphite concurrency exception Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: document test labels and gate cross-platform lanes Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: select ml tests with pytest markers Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: fix marker selector typing Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: simplify ml suite scheduling Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: mark cross-platform smoke tests Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: reuse test trigger for ml matrix Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: tighten full ci aggregation Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * ci: share required job result check Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> --------- Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 14:15:35 +02:00
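Note on commit eb4724ee4c: the aggregation rule from the "tolerate cancelled allowed-skip jobs" bullet (treat `cancelled` like `skipped` for jobs listed as allowed skips, fail on anything else that is not a success) can be sketched as follows. The function name `required_jobs_ok` and the dictionary shape are assumptions for illustration; the actual check lives in the repository's CI workflow and may be written differently.

```python
def required_jobs_ok(results: dict[str, str], allowed_skips: set[str]) -> bool:
    """Decide whether a workflow run should count as green.

    `results` maps job name to a GitHub Actions job result
    ("success", "failure", "cancelled" or "skipped").
    Hypothetical helper for illustration; not the workflow's actual code.
    """
    for job, result in results.items():
        if result == "success":
            continue
        # Skipped or cancelled is tolerated only for explicitly allowed jobs.
        if result in {"skipped", "cancelled"} and job in allowed_skips:
            continue
        # Any failure, or a skip/cancel of a required job, fails the run.
        return False
    return True
```

With this rule, a concurrency replacement that cancels an optional job leaves the run green, while cancelling a required job still fails it.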
EliSchwartz	1569e42f84	feat: implement GraniteVisionTableStructureModel for VLM-based table extraction (#3323 ) Add a new table structure model using IBM Granite Vision to extract table structure from document images via OTSL token generation. Changes: - Add `GraniteVisionTableStructureOptions` with configurable model repo, device, batch size, and crop padding options - Implement `GraniteVisionTableStructureModel` that uses a VLM pipeline to generate OTSL tokens from cropped table images, then parses them into `TableData` with cells, rows, and columns - Register the model in `table_structure_engines` alongside existing engines - Add example script `docs/examples/granite_vision_table_structure.py` - Add tests covering options, model enable/disable, OTSL parsing (including self-closing tags xcel/srow/ecel), and invalid-backend error handling - Update model catalog docs and CI workflow accordingly Signed-off-by: Eli Schwartz <eli.shw@gmail.com>	2026-04-17 11:02:20 +02:00
geoHeil	8ec14f2c6f	docs: fix nanonets_ocr2 runtime support matrix (#3317) Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>	2026-04-17 06:24:53 +02:00
geoHeil	251c8b217a	fix(ocr): align RapidOCR english assets with 3.8 mobile models (#3291 ) * fix(ocr): support language selection for RapidOCR engine Allows specifying 'english' or 'chinese' via the --ocr-lang flag and automatically downloads the correct models. Signed-off-by: DevAbdullah90 <abdullahkashif12b3@gmail.com> * fix(ocr): fix linting and add unit tests for RapidOCR language selection Signed-off-by: DevAbdullah90 <abdullahkashif12b3@gmail.com> * fix(ocr): align RapidOCR english assets with 3.8 mobile models Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * fix(ocr): restore RapidOCR default model compatibility Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * fix(examples): disable OCR in code formula comparison Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> --------- Signed-off-by: DevAbdullah90 <abdullahkashif12b3@gmail.com> Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> Co-authored-by: DevAbdullah90 <abdullahkashif12b3@gmail.com>	2026-04-15 12:16:41 +02:00
nuri	cd2e5b633d	docs: add indexed picture placeholder example to serialization notebook (#3293 ) * docs: add indexed picture placeholder example to serialization notebook Show how to subclass MarkdownPictureSerializer to resolve {index} tokens in image placeholders using self_ref, as an alternative to modifying the library default. Ref: docling-project/docling-core#555 * DCO Remediation Commit for nuri-yoo <nuri-yoo@users.noreply.github.com> I, nuri-yoo <nuri-yoo@users.noreply.github.com>, hereby add my Signed-off-by to this commit: `82cb733c0b` Signed-off-by: nuri-yoo <nuri-yoo@users.noreply.github.com> --------- Signed-off-by: nuri-yoo <nuri-yoo@users.noreply.github.com> Co-authored-by: nuri-yoo <nuri-yoo@users.noreply.github.com>	2026-04-15 05:45:39 +02:00
Jehlum Pandit	c23622f6f5	docs: add agent skill bundle for coding assistants (SKILL.md, pipelines, convert/evaluate) (#3174 ) * docs: add agent skill bundle with convert/evaluate helpers - Add docs/examples/agent_skill/docling-document-intelligence/ with SKILL.md, pipelines.md, EXAMPLE.md, improvement-log template, and scripts/docling-convert.py + docling-evaluate.py (standard/vlm-local/vlm-api). - Document InputFormat.PDF + PdfFormatOption for explicit PdfPipelineOptions. - Link from examples index and mkdocs nav. Made-with: Cursor * docs: align agent skill README and EXAMPLE with Cursor bundle - Document both ~/.cursor/skills and docs/examples paths. - README notes repo parity for PRs and local installs. Made-with: Cursor * DCO Remediation Commit for jehlum11 <jehlum11@gmail.com> I, jehlum11 <jehlum11@gmail.com>, hereby add my Signed-off-by to this commit: `2d268ffb6f` I, jehlum11 <jehlum11@gmail.com>, hereby add my Signed-off-by to this commit: `041e709c66` Signed-off-by: jehlum11 <jehlum11@gmail.com> Made-with: Cursor * docs: refactor agent skill to use docling CLI for conversion Address maintainer feedback: the custom docling-convert.py script was largely redundant with the existing docling CLI. This commit: - Removes scripts/docling-convert.py (redundant with `docling` CLI) - Refactors SKILL.md (v1.4 → v2.0) to use `docling` CLI for all conversion tasks, reserving the Python API only for features the CLI does not expose (chunking, VLM API endpoint config, force_backend_text hybrid mode) - Updates docling-evaluate.py recommended_actions to reference `docling` CLI flags instead of the removed script - Updates README.md, EXAMPLE.md, pipelines.md to use `docling` CLI examples throughout - Simplifies requirements.txt (removes packaging dependency) The only custom script retained is docling-evaluate.py, which provides heuristic quality evaluation — functionality the CLI does not cover. 
Signed-off-by: jehlum11 <jehlum11@gmail.com> Made-with: Cursor * docs: fix ruff format on docling-evaluate.py Signed-off-by: jehlum11 <jehlum11@gmail.com> Made-with: Cursor --------- Signed-off-by: jehlum11 <jehlum11@gmail.com>	2026-04-13 15:02:51 +02:00
Christoph Auer	42157a3e10	feat(service): Establish client SDK for docling serve (#3264) * Move client SDK to docling Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Add client SDK examples Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Mark client SDK as experimental Signed-off-by: Christoph Auer <cau@zurich.ibm.com> --------- Signed-off-by: Christoph Auer <cau@zurich.ibm.com>	2026-04-13 14:54:06 +02:00
geoHeil	9970d1ef94	feat(vlm): add Nanonets OCR2 onboarding (#3274 ) * feat(vlm): add nanonets ocr2 onboarding Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * feat(vlm): add vLLM and API runtimes for Nanonets-OCR2 Extend the Nanonets-OCR2 preset with vLLM + remote API paths so all standard docling runtimes (Transformers, MLX, vLLM, API, LM Studio, OpenAI-compatible) work out of the box. Drop the restricted supported_engines set to match the GLM-OCR / LightOnOCR / Falcon-OCR pattern, add top-level torch_dtype on the Transformers override, and register NANONETS_OCR2_VLLM / NANONETS_OCR2_VLLM_API / NANONETS_OCR2_LMSTUDIO_API legacy specs plus VlmModelType enum entries. Folds in the remote-API scope that was on the superseded PR #3275. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> --------- Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-13 06:51:35 +02:00
Faridun Mirzoev	1fed840506	docs: add AG2 multi-agent document analysis example (#3261 ) * docs: add AG2 multi-agent document analysis example Add a Jupyter notebook demonstrating how to combine Docling document conversion with AG2 multi-agent orchestration. A Document Processor agent uses Docling tools to convert PDFs to markdown and extract tables, while an Analyst agent synthesizes findings into a structured summary. * DCO Remediation Commit for Faridun Mirzoev <faridun@ag2.ai> I, Faridun Mirzoev <faridun@ag2.ai>, hereby add my Signed-off-by to this commit: `e80e0f3375` Signed-off-by: Faridun Mirzoev <faridun@ag2.ai> * docs: fix ruff PD901 lint — rename df to table_df Signed-off-by: Faridun Mirzoev <faridun@ag2.ai> --------- Signed-off-by: Faridun Mirzoev <faridun@ag2.ai>	2026-04-12 07:30:39 +02:00
Peter W. J. Staar	fd834204fa	feat: Support for GraniteVision v4 (#3217 ) * feat: add GV4 Signed-off-by: Peter Staar <taa@zurich.ibm.com> * got everything to work with granite-vision-4 Signed-off-by: Peter Staar <taa@zurich.ibm.com> * robustifying the output of granite-vision-v4 Signed-off-by: Peter Staar <taa@zurich.ibm.com> * refactored the code to reduce duplications: Signed-off-by: Peter Staar <taa@zurich.ibm.com> * refactored the code to align with pipeline options Signed-off-by: Peter Staar <taa@zurich.ibm.com> * ran pre-commit Signed-off-by: Peter Staar <taa@zurich.ibm.com> * fixed the circular import Signed-off-by: Peter Staar <taa@zurich.ibm.com> * added the chart_extraction_options file Signed-off-by: Peter Staar <taa@zurich.ibm.com> --------- Signed-off-by: Peter Staar <taa@zurich.ibm.com>	2026-04-10 08:55:40 +02:00
Viktor Kuropiatnyk	d046390bf4	feat: Switch to the latest version of DocumentFigureClassifier model v2.5 (#3171) * Switch to the latest version of DocumentFigureClassifier model v2.5 Signed-off-by: Viktor Kuropiatnyk <vku@zurich.ibm.com> * CI trigger Signed-off-by: Viktor Kuropiatnyk <vku@zurich.ibm.com> --------- Signed-off-by: Viktor Kuropiatnyk <vku@zurich.ibm.com>	2026-04-01 11:49:55 +02:00
Peter W. J. Staar	e9a39e8720	feat: remove the deprecation of extraction (#3220) * updated the example Signed-off-by: Peter Staar <taa@zurich.ibm.com> * chore: removed the experimental warnings in the extractor Signed-off-by: Peter Staar <taa@zurich.ibm.com> --------- Signed-off-by: Peter Staar <taa@zurich.ibm.com>	2026-04-01 09:39:34 +02:00
Anish Raghavendra	3a64f41af8	docs: add line-based chunker documentation and examples (#3210) Signed-off-by: anish.raghavendra <anish.raghavendra@ibm.com> Co-authored-by: anish.raghavendra <anish.raghavendra@ibm.com>	2026-03-30 10:55:31 +02:00
Maxim Lysak	1c74a9b9c7	feat: Implementation of HTML backend with headless browser (#2969 ) - Implementation of HTML backend that (optionally) uses headless browser (via Playwright) to materialize HTML pages into images, and add provenances with bboxes to all elements in the converted docling document. - Conversion preserves reading order given by HTML DOM tree - Added support for HTML "input" fields: checkboxes, radiobuttons, text inputs, etc. - Added support to Key-Value convention in HTML (i.e. elements with id "key1" and "key1_value1" will be paired as key-values, see test cases as examples) - Heuristic that glues independent inline HTML elements with single-character text in them into larger text blocks - Support for inline styling (bold, italic, etc.) Signed-off-by: Maksym Lysak <mly@zurich.ibm.com> Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>	2026-03-24 14:28:57 +01:00
Max Swain	fffd445789	docs: Fix Erroneous vLLM VLM pipeline engine option params causing empty/bad responses (#3167) docs: Update vLLM VLM pipeline engine option params to match hugging face Signed-off-by: Max Swain <89113255+maxdswain@users.noreply.github.com>	2026-03-23 10:44:37 +01:00
Peter W. J. Staar	96d7c7ec79	feat: route plain-text and Quarto/R Markdown files to the Markdown backend (#3161 ) * feat: route plain-text and Quarto/R Markdown files to the Markdown backend Signed-off-by: Peter Staar <taa@zurich.ibm.com> * updated the README and index.md Signed-off-by: Peter Staar <taa@zurich.ibm.com> * _mime_from_extension: Added a check for extensions in the intersection of XML_USPTO and MD extension lists (currently just txt). These ambiguous extensions get pass — leaving mime=None — so the full content-probing chain (_detect_html_xhtml → _detect_csv → text/plain fallback) runs instead of prematurely assigning text/markdown. _guess_from_content: Removed the elif InputFormat.MD in formats MD fallback for text/plain content. Unrecognised .txt content now correctly returns None. MD is only returned from explicit mime types (text/markdown, text/x-markdown) which come from unambiguous extensions like .md, .text, .qmd, .rmd. Signed-off-by: Peter Staar <taa@zurich.ibm.com> * ran pre-commit Signed-off-by: Peter Staar <taa@zurich.ibm.com> --------- Signed-off-by: Peter Staar <taa@zurich.ibm.com>	2026-03-20 16:38:16 +01:00
Maree Carroll	d113e611c4	docs: fix code in rag langchain chunker tokenizer (#2993 ) * docs: fix code in rag langchain chunker tokenizer Signed-off-by: Maree Carroll <maree.carroll@gmail.com> * DCO Remediation Commit for Maree Carroll <maree.carroll@gmail.com> I, Maree Carroll <maree.carroll@gmail.com>, hereby add my Signed-off-by to this commit: c5abe24ab0c8f29188373dfeaca2c0d6aff8cca2 Signed-off-by: Maree Carroll <maree.carroll@gmail.com> * fix:Loosen pillow version constraints to allow CVE-2026-25990 fix (#2992) * Loosen pillow version constraints to allow CVE-2026-25990 fix Signed-off-by: divekarsc <divekar.samved@gmail.com> * Added numba>=0.63.0 constraint directly to the asr optional dependency in pyproject.toml Signed-off-by: divekarsc <divekar.samved@gmail.com> * fix:Added numba>=0.63.0 constraint directly to the asr optional dependency in pyproject.toml Signed-off-by: divekarsc <divekar.samved@gmail.com> --------- Signed-off-by: divekarsc <divekar.samved@gmail.com> Signed-off-by: Maree Carroll <maree.carroll@gmail.com> --------- Signed-off-by: Maree Carroll <maree.carroll@gmail.com> Signed-off-by: divekarsc <divekar.samved@gmail.com> Co-authored-by: Samved Divekar <divekar.samved@gmail.com>	2026-03-10 06:50:24 +01:00
Robert Sokolewicz	95b759e519	docs: update code snippet to use modern pipeline options syntax (#3087) Signed-off-by: Robert Sokolewicz <rsokolewicz@gmail.com>	2026-03-09 08:49:53 +01:00
Kaiiiiiiiii	5d3ac38a65	docs: set HuggingFaceEndpoint task for Mixtral examples (#2945 ) * docs: set HuggingFaceEndpoint task for Mixtral examples * DCO Remediation Commit for Kaiiiiiiiii <2761362118@qq.com> I, Kaiiiiiiiii <2761362118@qq.com>, hereby add my Signed-off-by to this commit: `1288c1fadd` Signed-off-by: Kaiiiiiiiii <2761362118@qq.com> * DCO Remediation Commit for Kaiiiiiiiii <2761362118@qq.com> I, Kaiiiiiiiii <2761362118@qq.com>, hereby add my Signed-off-by to this commit: `1288c1fadd` Signed-off-by: Kaiiiiiiiii <2761362118@qq.com> --------- Signed-off-by: Kaiiiiiiiii <2761362118@qq.com>	2026-03-08 10:38:06 +01:00
geoHeil	7aacc6c18d	docs: add metaxy integration (#3058 ) * feat: add metaxy integration * Georg Heiler <georg.kf.heiler@gmail.com> DCO Remediation Commit for Georg Heiler <georg.kf.heiler@gmail.com> I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: `d4e08697ac` Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * DCO Remediation Commit for Georg Heiler <georg.kf.heiler@gmail.com> I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: `d4e08697ac` Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> --------- Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>	2026-03-02 09:48:35 +01:00
Phil Nash	672125cd1b	docs: removes merge conflict artifacts (#3055) fix(docs): removes merge conflict artifacts Signed-off-by: Phil Nash <philnash@gmail.com>	2026-03-02 08:17:02 +01:00
Tejas Kumar	1321b39cd8	docs: add audio & video processing guide (#3038 ) * Update docs for media * DCO Remediation Commit for Tejas Kumar <tejas.kumar@datastax.com> I, Tejas Kumar <tejas.kumar@datastax.com>, hereby add my Signed-off-by to this commit: `33089ccd73` Signed-off-by: Tejas Kumar <tejas.kumar@datastax.com> --------- Signed-off-by: Tejas Kumar <tejas.kumar@datastax.com>	2026-03-01 09:00:48 +01:00
Cesar Berrospi Ramis	1eb5c21dab	docs: add XBRL conversion example notebook and update feature listings (#3039) docs(xbrl): add notebook for XBRL parsing Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>	2026-02-27 16:09:19 +01:00
Cesar Berrospi Ramis	d276e60561	feat: export to WebVTT format (#3036) * style(cli): apply python 3.10+ syntax, remove unnecessary imports Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * feat(vtt): export of DoclingDocument to WebVTT format Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * build: pin docling-core version 2.66.0 Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> --------- Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>	2026-02-27 14:22:52 +01:00
Cesar Berrospi Ramis	334ba6e51f	feat: create a backend parser for XBRL instance reports (#3017 ) * build(xbrl): add Arelle as open-source library for XBRL Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * feat(xbrl): design and implement a backend parser for XBRL documents Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * test: remove print statements to reduce verbosity Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * style(XBRL): apply PEP8 naming convention for acronyms Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * refactor(XBRL): set XBRL dependencies as optional Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> --------- Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>	2026-02-24 16:52:02 +01:00
Christoph Auer	03532938b5	feat: Unified model-family inference engines (including image-classification) and KServe v2 API support (#2979 ) * feat: Inference engines abstraction for image classification model family with HF Transformers and ONNX runtime Implements runtime abstraction for image classification models with support for both ONNX Runtime and HuggingFace Transformers engines. Users can switch between engines without model retraining, similar to the object detection abstraction (#2959). Key components: - BaseImageClassificationEngine with factory pattern - OnnxRuntimeImageClassificationEngine and TransformersImageClassificationEngine implementations - Shared HfVisionModelMixin for common HF model utilities - Engine-specific configuration options - Test suite and example demonstrating runtime engine switching Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Add missing files and re-export for backward compat Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Don't run with OCR in the example. Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Remove excess onnxruntime related options for inuts and outputs Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * feat: centralize torch compile defaults with DOCLING_INFERENCE_COMPILE_TORCH_MODELS Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * feat: Add Kserve2 API engine for image classifier and object detection models (#2999) * fix: add failed pages to DoclingDocument for page break consistency (#2939) * fix: add failed pages to DoclingDocument for page break consistency When some PDF pages fail to parse, they were not added to DoclingDocument.pages, causing page break markers to be incorrect during export. This adds failed/skipped pages with their size info (if available) to maintain correct page numbering and structure. 
- Add _add_failed_pages_to_document() method in StandardPdfPipeline - Add test cases for failed page handling - Add test cases for normal page handling (regression test) - Add test PDF files Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com> * fix: ensure resource cleanup and simplify type hints - Wrap page_backend usage in try-finally to guarantee unload (prevents resource leaks). - Simplify redundant 'float \| None \| None' type hint. Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com> * fix: add groundtruth for normal_4pages.pdf and exclude failing PDFs from e2e test Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com> * fix: ensure correct status assertion for failed pages in tests Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com> --------- Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com> * fix: Use timezone-aware datetime (#2947) * Use timezone-aware datetime for profiling timestamps Updated timestamp recording to use timezone-aware datetime. Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com> * run formatter Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com> Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com> * fix(asciidoc): handle commas in image alt text (#2983) * Fix: Handle commas in AsciiDoc image alt text - Modified _parse_picture() to gracefully handle alt text containing commas - Commas in alt text are now preserved instead of causing ValueError - Added test case with realistic auto-generated alt text - split('=', 1) prevents issues when values contain '=' characters * DCO Remediation Commit for n0rdp0l <n90.w135@gmail.com> I, n0rdp0l <n90.w135@gmail.com>, hereby add my Signed-off-by to this commit: `ee752491fc` Signed-off-by: n0rdp0l <n90.w135@gmail.com> * style: fix ruff formatting in test_backend_asciidoc.py Signed-off-by: n0rdp0l <n90.w135@gmail.com> --------- Signed-off-by: n0rdp0l 
<n90.w135@gmail.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com> * chore: bump version to 2.73.1 [skip ci] * First attempt at establishing API Kserve2 facet Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * refactor: improve KServe v2 engine implementation after code review - Add comprehensive error handling to KserveV2HttpClient - Catch and wrap Timeout, ConnectionError, HTTPError with context - Validate response formats with clear error messages - Refactor URL building to eliminate duplication - Extract _build_model_url() helper method - Single source of truth for infer_url and model_metadata_url - Make URL required parameter (remove default localhost:8000) - Update ApiKserveV2EngineOptions to require explicit URL - Add preset validation with helpful error messages - Rename constants for clarity: TRITON_ → KSERVE_V2_* - Add comment explaining KServe v2 uses Triton type system - Improve error messages with actual values - Show counts, shapes, and supported types in validation errors - Document official KServe Python SDK alternative - Note async-only requirement and alpha status - Update tests for required URL parameter Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Cleanup in kserve http helper and options Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Further cleanup Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Fix for remote-services on tablemodel Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * fix: improved deserialization of engine_options (#3008) * add registry of discriminated subclasses Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix detection of engine_type value Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * Add options serialization improvements Signed-off-by: Christoph Auer <cau@zurich.ibm.com> --------- Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com> Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com> Signed-off-by: 
Michele Dolfi <dol@zurich.ibm.com> Signed-off-by: n0rdp0l <n90.w135@gmail.com> Signed-off-by: Christoph Auer <cau@zurich.ibm.com> Co-authored-by: jhchoi1182 <jhchoi1182@gmail.com> Co-authored-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com> Co-authored-by: Felix Wente <63914035+n0rdp0l@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com> * Fixes from review Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * DCO Remediation Commit for Christoph Auer <cau@zurich.ibm.com> I, Christoph Auer <cau@zurich.ibm.com>, hereby add my Signed-off-by to this commit: `4cdb01e6d3` Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * DCO Remediation Commit for Christoph Auer <60343111+cau-git@users.noreply.github.com> I, Christoph Auer <60343111+cau-git@users.noreply.github.com>, hereby add my Signed-off-by to this commit: `e293ba3270` Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com> Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Add fallback for API variants Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Recreate uv.lock Signed-off-by: Christoph Auer <cau@zurich.ibm.com> --------- Signed-off-by: Christoph Auer <cau@zurich.ibm.com> Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com> Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com> Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Signed-off-by: n0rdp0l <n90.w135@gmail.com> Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com> Co-authored-by: jhchoi1182 <jhchoi1182@gmail.com> Co-authored-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com> Co-authored-by: Felix Wente <63914035+n0rdp0l@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> 
Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>	2026-02-18 10:49:19 +01:00
Peter W. J. Staar	bf417e6d26	feat: Introduce docling-parse v5 and deprecate old docling-parse backends (#2872 ) * feat: simplifying towards docling-parse v5 Signed-off-by: Peter Staar <taa@zurich.ibm.com> * working on integrating docling-parse v5 Signed-off-by: Peter Staar <taa@zurich.ibm.com> * added the test_backend_docling_parse Signed-off-by: Peter Staar <taa@zurich.ibm.com> * Updated the docling-parse to 5.3.0 Signed-off-by: Peter Staar <taa@zurich.ibm.com> * ran the pre-commit Signed-off-by: Peter Staar <taa@zurich.ibm.com> * fixed the backend_docling_parse Signed-off-by: Peter Staar <taa@zurich.ibm.com> * ran pre-commit Signed-off-by: Peter Staar <taa@zurich.ibm.com> * updated the groundtruth to deal with rounding errors Signed-off-by: Peter Staar <taa@zurich.ibm.com> * updated comments for later docling-parse integrations Signed-off-by: Peter Staar <taa@zurich.ibm.com> * ran pre-commit Signed-off-by: Peter Staar <taa@zurich.ibm.com> * Make DoclingParseV2 and DoclingParseV4 backend stubs that route to new backend, emit warning. Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * lock docling-parse Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * updated to 3.5.2 Signed-off-by: Peter Staar <taa@zurich.ibm.com> --------- Signed-off-by: Peter Staar <taa@zurich.ibm.com> Signed-off-by: Christoph Auer <cau@zurich.ibm.com> Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Co-authored-by: Christoph Auer <cau@zurich.ibm.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>	2026-02-17 20:27:56 +01:00
Peter W. J. Staar	704ef0afba	docs: Add LaTeX and WebVTT as supported types (#2974 ) * Update README.md Signed-off-by: Peter W. J. Staar <91719829+PeterStaar-IBM@users.noreply.github.com> * Fix spelling of 'WebVTT' in README Corrected the spelling of 'WebVTT' in the features list. Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com> * updated the index Signed-off-by: Peter Staar <taa@zurich.ibm.com> --------- Signed-off-by: Peter W. J. Staar <91719829+PeterStaar-IBM@users.noreply.github.com> Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com> Signed-off-by: Peter Staar <taa@zurich.ibm.com> Co-authored-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>	2026-02-10 19:59:23 +01:00
Christoph Auer	14e474c955	feat: Inference engines abstraction for object detection model family with HF Transformers and ONNX runtime (#2959 ) * Add object_detection family and inference engine Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Fixes for alignment with vlm family Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Proposals for layout and table models Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Update the object-detection family, runtime, plugin, and more Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Add comments Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Clean up artifacts path handling. Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * feat: Load label mappings from HuggingFace config in object detection engines - Add abstract get_label_mapping() method to BaseObjectDetectionEngine - Implement label loading from config.json in OnnxRuntimeObjectDetectionEngine - Refactor LayoutObjectDetectionModel to use engine-provided labels instead of hardcoded mapping - Centralizes label mapping logic in the inference engine layer This eliminates hardcoded label dictionaries and makes label mappings configurable through model configs. Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * feat: Add Transformers engine for object detection Implement TransformersObjectDetectionEngine as a PyTorch-based alternative to ONNX Runtime. Works as drop-in replacement for both layout and table detection models with support for CPU, CUDA, and MPS devices. - Add TransformersObjectDetectionEngine with AutoModelForObjectDetection - Update TransformersObjectDetectionEngineOptions (score_threshold, torch_dtype) - Update factory to instantiate Transformers engine - Switch OBJECT_DETECTION_LAYOUT_HERON preset to use Transformers by default - Add logging configuration to layout_object_detection_example.py Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Improve OD example with different runtimes to demonstrate abstraction. 
Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Added onnxruntime as extra Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Update example header. Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Add missing transformers_engine for OD Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Remove unused cleanup hook. Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Guard against onnxruntime missing on Python 3.14 Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * refactor: extract shared HF object-detection engine base Signed-off-by: Christoph Auer <cau@zurich.ibm.com> --------- Signed-off-by: Christoph Auer <cau@zurich.ibm.com>	2026-02-10 16:15:30 +01:00
Aditya Sasidhar	e6ccb8b2c1	feat: added support for parsing LaTeX (.tex) documents (#2890 ) * feat: added support for parsing LaTeX (.tex) documents Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * feat: implement PR #2890 feedback for LaTeX backend - Add text formatting options (bold, italic, underline) for LaTeX macros - Enhance image embedding with PIL and ImageRef.from_pil() - Refactor list processing to use GroupItem structure - Refactor bibliography to use GroupItem structure - Add nested list test coverage - All tests passing (39/39), all linters passing Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * DCO Remediation Commit for Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> I, Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com>, hereby add my Signed-off-by to this commit: f19f135b431d489cd8bf3982524505a0bbd8696d Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * DCO Remediation Commit for Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> I, Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com>, hereby add my Signed-off-by to this commit: f19f135b431d489cd8bf3982524505a0bbd8696d Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * feat: enhance latex backend with robustness fixes and ground truth - Add custom macro expansion for improved text quality - Fix preamble filtering to remove metadata garbage - Support recursive \input{} and \include{} file loading - Organize test data into subdirectories for complex papers - Add full end-to-end ground truth for 4 major arXiv papers (Attention, Mistral, DeepSeek, OTSL) - Pass all 41 unit tests and pre-commit checks Addresses @cau-git feedback for ground-truth data. 
Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * fix: minor formatting in test file Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * feat: enhance LaTeX backend with robust math and figure support - Fixed re.error: bad escape in macro expansion by using lambda in re.sub - Fixed sentences breaking at inline math ($) by preserving it within paragraphs - Improved figure environment with proper grouping and structured representation - Fixed crashes on documents starting with % comments - Added comprehensive unit tests and updated all ground truth data Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * WIP: saving work for laptop migration Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * got rid of the line breaking issues, still some do exist Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * fix: generalized LaTeX macro parsing and robustness improvements This commit addresses several issues with LaTeX parsing: - Correctly handle unknown macros (like \ion{N}{2}) inline to avoid line breaks. - Fix extraction of structural macros (section, caption, etc.) vs text-only groups. - Address PR feedback regarding inline math spacing and splitting. - Regenerate ground truth files reflecting these improvements. 
Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * style: apply automatic formatting fixes Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * style: fix ruff linter and formatter errors Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * fix: typing issues identified by mypy Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * style: apply formatting fixes to tests Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * fix: update groundtruth files for latex backend Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * fixed the ackward line breaking issue, turns out im stupid at considering text buffer * i forgot to add the groundtruth so here it is * DCO Remediation Commit for Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> I, Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com>, hereby add my Signed-off-by to this commit: `7e032635ef` I, Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com>, hereby add my Signed-off-by to this commit: `aeba688384` Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * Ran the precommit as requested Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> --------- Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com>	2026-02-10 15:13:09 +01:00
Michele Dolfi	d4c87133f3	feat: Introduce pluggable VLM runtime system with preset-based configuration (#2919 ) * model runtime refactoring Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add test Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix code formula preset Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * batch prediction Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use presets and new vlm options in CLI Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use new model settings by default Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * running Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update examples Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fixes for running examples Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * keep old stage Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update model Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use granite 3.3 and set options Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * revisit init logic and propagate the proper options to the runtimes Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update all stages with original setup Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * per stage registry Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use chat template Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * remove duplicated predict() and factor out some utils Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * working picture description examples Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add granite docling as code formula model Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * rename code formula presets Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix running minimal_vlm example Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add all models to presets and run compare_vlm Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * remove unused repo_id Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update vlm api model 
example Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix legacy examples Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add another legacy example Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix test Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * avoid automatic fallback to mlx and fix end_of_utterance in codeformula Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * move vlm_convert_model Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use new vlm runtime class Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * flasg for CI Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * rename runtimes to explicit vlm_runtimes Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * renaming from runtime to inference engine and model families Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fixes Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix test Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add docs with stages Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update docs catalog page Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * rename runtime to inference engine Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2026-02-04 17:29:17 +01:00
Peter W. J. Staar	a5ad8f24ff	docs: add granite vision for charts (#2946 ) * feat: add chart extraction models Signed-off-by: Peter Staar <taa@zurich.ibm.com> * refactored code Signed-off-by: Peter Staar <taa@zurich.ibm.com> * fixed the meta Signed-off-by: Peter Staar <taa@zurich.ibm.com> * cleaned up the ChartExtractionModelGraniteVision Signed-off-by: Peter Staar <taa@zurich.ibm.com> * cleaned up the chart2csv model Signed-off-by: Peter Staar <taa@zurich.ibm.com> * fixed the merge issues Signed-off-by: Peter Staar <taa@zurich.ibm.com> * fixed the imports Signed-off-by: Peter Staar <taa@zurich.ibm.com> * added the missing files Signed-off-by: Peter Staar <taa@zurich.ibm.com> * fixed the output post-processing Signed-off-by: Peter Staar <taa@zurich.ibm.com> * example: add chart extraction example Signed-off-by: Peter Staar <taa@zurich.ibm.com> * working chart conversion Signed-off-by: Peter Staar <taa@zurich.ibm.com> --------- Signed-off-by: Peter Staar <taa@zurich.ibm.com>	2026-02-03 12:31:27 +01:00
Michele Dolfi	7f386587ed	feat: Drop support for Python 3.9 (#2905 ) * chore: drop support for Python 3.9 Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * disable CI for python 3.9 Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix: test bump version Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add chore to the changelog but without bumping the version Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * force newer langchain-core Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix linter for 3.10 Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * Add python 3.9 removal notice Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * avoid upgrading docling-core Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * restore semantic release settings Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Signed-off-by: Christoph Auer <cau@zurich.ibm.com> Co-authored-by: Christoph Auer <cau@zurich.ibm.com>	2026-01-23 10:15:58 +01:00
ParvaP	16e88d50fa	docs: correct broken link to supported formats (#2878 ) Fix link to supported formats in quickstart guide Updated link to supported formats documentation. Signed-off-by: ParvaP <55171512+ParvaP@users.noreply.github.com>	2026-01-14 19:50:16 +01:00
Michele Dolfi	b14ee1557b	refactor: organize models in submodules (#2845 ) * refactor ocr models Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * refactor vlm api models Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * refactor vlm_models Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * refactor picture description Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * refactor table structure Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * refactor all into stages Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2026-01-13 13:45:48 +01:00
Nikhil Singh	72851cce66	docs: fix Colab badge links and Weaviate typo in docs examples (#2871 ) * Fix typo in vector store name from 'Weavieate' to 'Weaviate' Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com> * Fix link to hybrid_rag_qdrant notebook in markdown Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com> * Fix markdown link formatting in retrieval_qdrant.ipynb Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com> * Update Colab link in retrieval_qdrant notebook Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com> * Fix HTML escaping in Colab link in notebook Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com> * Create output directory if it doesn't exist Ensure output directory exists before saving files. Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com> --------- Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com>	2026-01-12 08:50:41 +01:00
Nikhil Singh	211c759fd0	docs(example): fix update sample image path to be relative (#2864 ) * Update sample image path to be relative The example script currently uses a relative path (tests\data\...) to load the sample image. This causes a FileNotFoundError if the script is not executed strictly from the repository root (e.g., when running directly from an IDE or the script's own directory). Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com> * Update docs/examples/experimental/process_table_crops.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Peter W. J. Staar <91719829+PeterStaar-IBM@users.noreply.github.com> * formatter and linter Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com> Signed-off-by: Peter W. J. Staar <91719829+PeterStaar-IBM@users.noreply.github.com> Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Co-authored-by: Peter W. J. Staar <91719829+PeterStaar-IBM@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>	2026-01-09 12:46:48 +01:00
Mohd Kaif	bf80e329d1	docs: add Semantica integration (#2860 ) * docs: add Semantica integration * DCO Remediation Commit for KaifAhmad1 <kaifahmad087@gmail.com> I, KaifAhmad1 <kaifahmad087@gmail.com>, hereby add my Signed-off-by to this commit: `bbff355863` Signed-off-by: KaifAhmad1 <kaifahmad087@gmail.com> * docs: add Semantica to mkdocs navigation Signed-off-by: KaifAhmad1 <kaifahmad087@gmail.com> * docs: add title with emoji to Semantica integration page Signed-off-by: KaifAhmad1 <kaifahmad087@gmail.com> * docs: refine Semantica integration with pipeline example and cookbook link Signed-off-by: KaifAhmad1 <kaifahmad087@gmail.com> --------- Signed-off-by: KaifAhmad1 <kaifahmad087@gmail.com>	2026-01-09 09:41:44 +01:00
Michele Dolfi	aab3ff5d82	feat: Enrichment annotations in the new meta format (#2859 ) * produce meta annotations in the new format Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update examples Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update test and check for deprecation Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update last example Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2026-01-08 18:28:52 +01:00
Rehan Khan	a0530a271e	fix: correct type hint for table_structure_options usage (#2823 ) * fix(#2785): correct type hint for table_structure_options in PdfPipelineOptions Signed-off-by: ryyhan <dayel.rehan@gmail.com> * fix: use Union for table_structure_options and update examples Signed-off-by: ryyhan <dayel.rehan@gmail.com> * revert(pipeline_options): use BaseTableStructureOptions as requested Signed-off-by: ryyhan <dayel.rehan@gmail.com> --------- Signed-off-by: ryyhan <dayel.rehan@gmail.com>	2026-01-06 08:59:43 +01:00
jspast	2b83fdd0de	feat: Add XPU device support for Intel GPUs (#2809 ) * feat: Add XPU device support * feat: Add XPU as supported device on some models * docs: Add XPU usage to examples * DCO Remediation Commit for jspast <140563347+jspast@users.noreply.github.com> I, jspast <140563347+jspast@users.noreply.github.com>, hereby add my Signed-off-by to this commit: `f26e8b8c3a` I, jspast <140563347+jspast@users.noreply.github.com>, hereby add my Signed-off-by to this commit: `a4a2bf90fa` I, jspast <140563347+jspast@users.noreply.github.com>, hereby add my Signed-off-by to this commit: `a2d5dac2e1` Signed-off-by: jspast <140563347+jspast@users.noreply.github.com> --------- Signed-off-by: jspast <140563347+jspast@users.noreply.github.com>	2026-01-05 17:35:26 +01:00
Michele Dolfi	be085c0e39	docs(RTX): Guidelines for best performance on RTX GPUs (#2765 ) * add RTX docs Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add artwork and fix title Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix series definition Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add nvidia logo and update todo Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-12-19 13:16:59 +01:00
Julia Pap	cc5e3cee74	docs: add docstrings to DocumentConverter #2748 (#2782 ) * docs: add docstrings for DocumentConverter Signed-off-by: Julia Pap <papjuli@gmail.com> * Apply suggestions from code review Improve docstrings in DocumentConverter Co-authored-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com> Signed-off-by: Julia Pap <papjuli@gmail.com> * docs: improve docstring formatting and wording in DocumentConverter * docs: show init method in document converter reference * docs: change back indents to 4x in DocumentConverter docstrings griffe was issuing warnings of confusing indentation * docs: clarify `max_num_pages` and `page_range` args in `DocumentConverter` methods * docs: fix some Yields and Returns in DocumentConverter docstrings * DCO Remediation Commit for Julia Pap <papjuli@gmail.com> I, Julia Pap <papjuli@gmail.com>, hereby add my Signed-off-by to this commit: `cf2ea4e0f0` I, Julia Pap <papjuli@gmail.com>, hereby add my Signed-off-by to this commit: `57446af168` I, Julia Pap <papjuli@gmail.com>, hereby add my Signed-off-by to this commit: `5d613edb8c` I, Julia Pap <papjuli@gmail.com>, hereby add my Signed-off-by to this commit: `b195281f56` I, Julia Pap <papjuli@gmail.com>, hereby add my Signed-off-by to this commit: `5d4a3af5d5` Signed-off-by: Julia Pap <papjuli@gmail.com> * docs: ignore init description, rephrased docstrings Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> --------- Signed-off-by: Julia Pap <papjuli@gmail.com> Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> Co-authored-by: Cesar Berrospi Ramis <75900930+ceberam@users.noreply.github.com> Co-authored-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>	2025-12-19 11:20:33 +01:00
Shivaditya Meduri	150fe90728	docs(style): fix link visibility in dark mode (#2804 ) fix: link visibility in dark mode Hello, I noticed that links were using the same color as regular text, making them indistinguishable in dark mode (default). This PR changes the color of links to a different color for better accessibility in dark mode. This has been bugging me for a while, so I am creating this PR to fix this 😅. Signed-off-by: Shivaditya Meduri <77324692+shivaditya-meduri@users.noreply.github.com>	2025-12-18 16:08:11 +01:00
Michele Dolfi	241d19ed6f	feat: add preset for using granite-docling via vllm and other apis (#2792 ) add preset for using granite-docling via vllm and other apis Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-12-16 17:22:35 +01:00
Michele Dolfi	d03439ccc5	docs(gpu): Add benchmarks of standard pipeline with OCR (#2764 ) * add results for standard + OCR and more Windows timings Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix runtime selection for py 3.14 in CI Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-12-10 20:43:20 +01:00
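
Two of the fixes described in the commit messages above rely on standard Python behavior that is easy to get wrong: e6ccb8b2c1 mentions avoiding "re.error: bad escape" by passing a lambda to re.sub, and the AsciiDoc fix (#2983) mentions split('=', 1) for values that contain '='. The sketch below illustrates both techniques in isolation. It is not the code from those commits; the helper names expand_macro and parse_image_attributes, and the attribute layout they assume, are hypothetical.

```python
import re


def expand_macro(text: str, name: str, replacement: str) -> str:
    """Replace each \\name in text with replacement (hypothetical helper)."""
    # Match the macro name only when it is not followed by another letter.
    pattern = re.compile(r"\\" + re.escape(name) + r"(?![A-Za-z])")
    # A function replacement is inserted literally. A string replacement
    # would be parsed for escapes, so LaTeX text such as "\mathit{x}"
    # raises re.error ("bad escape \m").
    return pattern.sub(lambda _match: replacement, text)


def parse_image_attributes(attr_text: str) -> dict[str, str]:
    """Split comma-separated attributes, keeping commas in the alt text.

    Assumed layout (hypothetical): parts without '=' belong to the alt
    text, parts with '=' are key=value pairs.
    """
    alt_parts: list[str] = []
    attrs: dict[str, str] = {}
    for part in attr_text.split(","):
        if "=" in part:
            # maxsplit=1 keeps any further '=' inside the value.
            key, value = part.split("=", 1)
            attrs[key.strip()] = value.strip()
        else:
            alt_parts.append(part.strip())
    attrs["alt"] = ", ".join(alt_parts)
    return attrs


if __name__ == "__main__":
    print(expand_macro(r"Let \vx be a vector.", "vx", r"\mathit{x}"))
    print(parse_image_attributes(
        "A chart, auto-generated,width=300,link=https://example.org/?a=b"
    ))
```

Passing a callable to re.sub is the simplest way to insert arbitrary user-provided text; escaping backslashes in the replacement string by hand would work too but is easier to get wrong.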

239 Commits