docling

mirror of https://github.com/docling-project/docling.git synced 2026-05-17 13:10:38 +00:00

Author	SHA1	Message	Date
EliSchwartz	24f2d148d9	feat(vlm): upgrade Granite Vision model to 4.1 for table + chart extraction (#3382 ) * feat(table-structure): swap VLM model to granite-vision-4.1-4b Updates GraniteVisionTableStructureModel to use the 4.1 model. The 4.1 weights are pre-merged, so merge_lora_adapters() is now hasattr-guarded. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com> * feat(chart-extraction): swap V4 VLM model to granite-vision-4.1-4b Updates ChartExtractionModelGraniteVisionV4 to use the 4.1 model. hasattr-guards the merge_lora_adapters() call since 4.1 weights are pre-merged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com> * docs(example): mention granite-vision-4.1-4b in table-structure example Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com> * docs(catalog): update Granite Vision entry to 4.1-4b Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com> * feat(chart-extraction): honor cuda_use_flash_attention2 in V4 loader Mirrors the table-structure loader so ChartExtractionModelGraniteVisionV4 also passes _attn_implementation based on AcceleratorOptions. Without this the chart model falls back to the transformers SDPA default, which can hit cuDNN backend failures on some torch/cuDNN stacks while the table model (which already passed the flag) runs cleanly. Stores accelerator_options on the base class so subclasses can read it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com> * fix(model-downloader): update Granite Vision log message to 4.1 The log message in download_models still mentioned "Granite Vision 4.0" after the model swap. Correct it to match the current model version. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com> * fix(chart-extraction): fall back to bare CSV when V4 model omits ```csv``` fence granite-vision-4.1-4b sometimes emits raw CSV without a ```csv``` code fence for the <chart2csv> prompt, which caused _extract_csv_to_dataframe to raise ValueError and drop the chart's tabular_chart metadata. Mirror the tolerant parsing already used by the v3 class: prefer a fenced block, otherwise strip any stray backtick prefix/suffix and parse the text as-is. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com> --------- Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com> Co-authored-by: Eli Schwartz <eliyahu.schwartz@ibm.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 08:36:08 +02:00
EliSchwartz	1569e42f84	feat: implement GraniteVisionTableStructureModel for VLM-based table extraction (#3323 ) Add a new table structure model using IBM Granite Vision to extract table structure from document images via OTSL token generation. Changes: - Add `GraniteVisionTableStructureOptions` with configurable model repo, device, batch size, and crop padding options - Implement `GraniteVisionTableStructureModel` that uses a VLM pipeline to generate OTSL tokens from cropped table images, then parses them into `TableData` with cells, rows, and columns - Register the model in `table_structure_engines` alongside existing engines - Add example script `docs/examples/granite_vision_table_structure.py` - Add tests covering options, model enable/disable, OTSL parsing (including self-closing tags xcel/srow/ecel), and invalid-backend error handling - Update model catalog docs and CI workflow accordingly Signed-off-by: Eli Schwartz <eli.shw@gmail.com>	2026-04-17 11:02:20 +02:00
geoHeil	8ec14f2c6f	docs: fix nanonets_ocr2 runtime support matrix (#3317) Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>	2026-04-17 06:24:53 +02:00
geoHeil	9970d1ef94	feat(vlm): add Nanonets OCR2 onboarding (#3274 ) * feat(vlm): add nanonets ocr2 onboarding Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> * feat(vlm): add vLLM and API runtimes for Nanonets-OCR2 Extend the Nanonets-OCR2 preset with vLLM + remote API paths so all standard docling runtimes (Transformers, MLX, vLLM, API, LM Studio, OpenAI-compatible) work out of the box. Drop the restricted supported_engines set to match the GLM-OCR / LightOnOCR / Falcon-OCR pattern, add top-level torch_dtype on the Transformers override, and register NANONETS_OCR2_VLLM / NANONETS_OCR2_VLLM_API / NANONETS_OCR2_LMSTUDIO_API legacy specs plus VlmModelType enum entries. Folds in the remote-API scope that was on the superseded PR #3275. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> --------- Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-13 06:51:35 +02:00
Viktor Kuropiatnyk	d046390bf4	feat: Switch to the latest version of DocumentFigureClassifier model v2.5 (#3171) * Switch to the latest version of DocumentFigureClassifier model v2.5 Signed-off-by: Viktor Kuropiatnyk <vku@zurich.ibm.com> * CI trigger Signed-off-by: Viktor Kuropiatnyk <vku@zurich.ibm.com> --------- Signed-off-by: Viktor Kuropiatnyk <vku@zurich.ibm.com>	2026-04-01 11:49:55 +02:00
Tejas Kumar	1321b39cd8	docs: add audio & video processing guide (#3038 ) * Update docs for media * DCO Remediation Commit for Tejas Kumar <tejas.kumar@datastax.com> I, Tejas Kumar <tejas.kumar@datastax.com>, hereby add my Signed-off-by to this commit: `33089ccd73` Signed-off-by: Tejas Kumar <tejas.kumar@datastax.com> --------- Signed-off-by: Tejas Kumar <tejas.kumar@datastax.com>	2026-03-01 09:00:48 +01:00
Cesar Berrospi Ramis	d276e60561	feat: export to WebVTT format (#3036 ) * style(cli): apply python 3.10+ syntax, remove unnecessary imports Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * feat(vtt): export of DoclingDocument to WebVTT format Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * build: pin docling-core version 2.66.0 Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> --------- Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>	2026-02-27 14:22:52 +01:00
Cesar Berrospi Ramis	334ba6e51f	feat: create a backend parser for XBRL instance reports (#3017 ) * build(xbrl): add Arelle as open-source library for XBRL Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * feat(xbrl): design and implement a backend parser for XBRL documents Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * test: remove print statements to reduce verbosity Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * style(XBRL): apply PEP8 naming convention for acronyms Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * refactor(XBRL): set XBRL dependencies as optional Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> --------- Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>	2026-02-24 16:52:02 +01:00
Christoph Auer	03532938b5	feat: Unified model-family inference engines (including image-classification) and KServe v2 API support (#2979 ) * feat: Inference engines abstraction for image classification model family with HF Transformers and ONNX runtime Implements runtime abstraction for image classification models with support for both ONNX Runtime and HuggingFace Transformers engines. Users can switch between engines without model retraining, similar to the object detection abstraction (#2959). Key components: - BaseImageClassificationEngine with factory pattern - OnnxRuntimeImageClassificationEngine and TransformersImageClassificationEngine implementations - Shared HfVisionModelMixin for common HF model utilities - Engine-specific configuration options - Test suite and example demonstrating runtime engine switching Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Add missing files and re-export for backward compat Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Don't run with OCR in the example. Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Remove excess onnxruntime related options for inuts and outputs Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * feat: centralize torch compile defaults with DOCLING_INFERENCE_COMPILE_TORCH_MODELS Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * feat: Add Kserve2 API engine for image classifier and object detection models (#2999) * fix: add failed pages to DoclingDocument for page break consistency (#2939) * fix: add failed pages to DoclingDocument for page break consistency When some PDF pages fail to parse, they were not added to DoclingDocument.pages, causing page break markers to be incorrect during export. This adds failed/skipped pages with their size info (if available) to maintain correct page numbering and structure. 
- Add _add_failed_pages_to_document() method in StandardPdfPipeline - Add test cases for failed page handling - Add test cases for normal page handling (regression test) - Add test PDF files Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com> * fix: ensure resource cleanup and simplify type hints - Wrap page_backend usage in try-finally to guarantee unload (prevents resource leaks). - Simplify redundant 'float \| None \| None' type hint. Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com> * fix: add groundtruth for normal_4pages.pdf and exclude failing PDFs from e2e test Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com> * fix: ensure correct status assertion for failed pages in tests Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com> --------- Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com> * fix: Use timezone-aware datetime (#2947) * Use timezone-aware datetime for profiling timestamps Updated timestamp recording to use timezone-aware datetime. Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com> * run formatter Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com> Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com> * fix(asciidoc): handle commas in image alt text (#2983) * Fix: Handle commas in AsciiDoc image alt text - Modified _parse_picture() to gracefully handle alt text containing commas - Commas in alt text are now preserved instead of causing ValueError - Added test case with realistic auto-generated alt text - split('=', 1) prevents issues when values contain '=' characters * DCO Remediation Commit for n0rdp0l <n90.w135@gmail.com> I, n0rdp0l <n90.w135@gmail.com>, hereby add my Signed-off-by to this commit: `ee752491fc` Signed-off-by: n0rdp0l <n90.w135@gmail.com> * style: fix ruff formatting in test_backend_asciidoc.py Signed-off-by: n0rdp0l <n90.w135@gmail.com> --------- Signed-off-by: n0rdp0l 
<n90.w135@gmail.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com> * chore: bump version to 2.73.1 [skip ci] * First attempt at establishing API Kserve2 facet Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * refactor: improve KServe v2 engine implementation after code review - Add comprehensive error handling to KserveV2HttpClient - Catch and wrap Timeout, ConnectionError, HTTPError with context - Validate response formats with clear error messages - Refactor URL building to eliminate duplication - Extract _build_model_url() helper method - Single source of truth for infer_url and model_metadata_url - Make URL required parameter (remove default localhost:8000) - Update ApiKserveV2EngineOptions to require explicit URL - Add preset validation with helpful error messages - Rename constants for clarity: TRITON_ → KSERVE_V2_* - Add comment explaining KServe v2 uses Triton type system - Improve error messages with actual values - Show counts, shapes, and supported types in validation errors - Document official KServe Python SDK alternative - Note async-only requirement and alpha status - Update tests for required URL parameter Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Cleanup in kserve http helper and options Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Further cleanup Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Fix for remote-services on tablemodel Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * fix: improved deserialization of engine_options (#3008) * add registry of discriminated subclasses Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix detection of engine_type value Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * Add options serialization improvements Signed-off-by: Christoph Auer <cau@zurich.ibm.com> --------- Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com> Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com> Signed-off-by: 
Michele Dolfi <dol@zurich.ibm.com> Signed-off-by: n0rdp0l <n90.w135@gmail.com> Signed-off-by: Christoph Auer <cau@zurich.ibm.com> Co-authored-by: jhchoi1182 <jhchoi1182@gmail.com> Co-authored-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com> Co-authored-by: Felix Wente <63914035+n0rdp0l@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com> * Fixes from review Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * DCO Remediation Commit for Christoph Auer <cau@zurich.ibm.com> I, Christoph Auer <cau@zurich.ibm.com>, hereby add my Signed-off-by to this commit: `4cdb01e6d3` Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * DCO Remediation Commit for Christoph Auer <60343111+cau-git@users.noreply.github.com> I, Christoph Auer <60343111+cau-git@users.noreply.github.com>, hereby add my Signed-off-by to this commit: `e293ba3270` Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com> Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Add fallback for API variants Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Recreate uv.lock Signed-off-by: Christoph Auer <cau@zurich.ibm.com> --------- Signed-off-by: Christoph Auer <cau@zurich.ibm.com> Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com> Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com> Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Signed-off-by: n0rdp0l <n90.w135@gmail.com> Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com> Co-authored-by: jhchoi1182 <jhchoi1182@gmail.com> Co-authored-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com> Co-authored-by: Felix Wente <63914035+n0rdp0l@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> 
Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>	2026-02-18 10:49:19 +01:00
Aditya Sasidhar	e6ccb8b2c1	feat: added support for parsing LaTeX (.tex) documents (#2890 ) * feat: added support for parsing LaTeX (.tex) documents Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * feat: implement PR #2890 feedback for LaTeX backend - Add text formatting options (bold, italic, underline) for LaTeX macros - Enhance image embedding with PIL and ImageRef.from_pil() - Refactor list processing to use GroupItem structure - Refactor bibliography to use GroupItem structure - Add nested list test coverage - All tests passing (39/39), all linters passing Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * DCO Remediation Commit for Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> I, Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com>, hereby add my Signed-off-by to this commit: f19f135b431d489cd8bf3982524505a0bbd8696d Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * DCO Remediation Commit for Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> I, Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com>, hereby add my Signed-off-by to this commit: f19f135b431d489cd8bf3982524505a0bbd8696d Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * feat: enhance latex backend with robustness fixes and ground truth - Add custom macro expansion for improved text quality - Fix preamble filtering to remove metadata garbage - Support recursive \input{} and \include{} file loading - Organize test data into subdirectories for complex papers - Add full end-to-end ground truth for 4 major arXiv papers (Attention, Mistral, DeepSeek, OTSL) - Pass all 41 unit tests and pre-commit checks Addresses @cau-git feedback for ground-truth data. 
Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * fix: minor formatting in test file Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * feat: enhance LaTeX backend with robust math and figure support - Fixed re.error: bad escape in macro expansion by using lambda in re.sub - Fixed sentences breaking at inline math ($) by preserving it within paragraphs - Improved figure environment with proper grouping and structured representation - Fixed crashes on documents starting with % comments - Added comprehensive unit tests and updated all ground truth data Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * WIP: saving work for laptop migration Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * got rid of the line breaking issues, still some do exist Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * fix: generalized LaTeX macro parsing and robustness improvements This commit addresses several issues with LaTeX parsing: - Correctly handle unknown macros (like \ion{N}{2}) inline to avoid line breaks. - Fix extraction of structural macros (section, caption, etc.) vs text-only groups. - Address PR feedback regarding inline math spacing and splitting. - Regenerate ground truth files reflecting these improvements. 
Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * style: apply automatic formatting fixes Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * style: fix ruff linter and formatter errors Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * fix: typing issues identified by mypy Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * style: apply formatting fixes to tests Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * fix: update groundtruth files for latex backend Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * fixed the ackward line breaking issue, turns out im stupid at considering text buffer * i forgot to add the groundtruth so here it is * DCO Remediation Commit for Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> I, Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com>, hereby add my Signed-off-by to this commit: `7e032635ef` I, Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com>, hereby add my Signed-off-by to this commit: `aeba688384` Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> * Ran the precommit as requested Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com> --------- Signed-off-by: Aditya Sasidhar <telikicherlaadityasasidhar@gmail.com>	2026-02-10 15:13:09 +01:00
Michele Dolfi	d4c87133f3	feat: Introduce pluggable VLM runtime system with preset-based configuration (#2919 ) * model runtime refactoring Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add test Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix code formula preset Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * batch prediction Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use presets and new vlm options in CLI Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use new model settings by default Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * running Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update examples Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fixes for running examples Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * keep old stage Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update model Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use granite 3.3 and set options Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * revisit init logic and propagate the proper options to the runtimes Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update all stages with original setup Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * per stage registry Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use chat template Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * remove duplicated predict() and factor out some utils Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * working picture description examples Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add granite docling as code formula model Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * rename code formula presets Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix running minimal_vlm example Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add all models to presets and run compare_vlm Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * remove unused repo_id Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update vlm api model 
example Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix legacy examples Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add another legacy example Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix test Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * avoid automatic fallback to mlx and fix end_of_utterance in codeformula Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * move vlm_convert_model Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use new vlm runtime class Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * flasg for CI Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * rename runtimes to explicit vlm_runtimes Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * renaming from runtime to inference engine and model families Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fixes Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix test Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add docs with stages Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update docs catalog page Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * rename runtime to inference engine Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2026-02-04 17:29:17 +01:00
jspast	2b83fdd0de	feat: Add XPU device support for Intel GPUs (#2809 ) * feat: Add XPU device support * feat: Add XPU as supported device on some models * docs: Add XPU usage to examples * DCO Remediation Commit for jspast <140563347+jspast@users.noreply.github.com> I, jspast <140563347+jspast@users.noreply.github.com>, hereby add my Signed-off-by to this commit: `f26e8b8c3a` I, jspast <140563347+jspast@users.noreply.github.com>, hereby add my Signed-off-by to this commit: `a4a2bf90fa` I, jspast <140563347+jspast@users.noreply.github.com>, hereby add my Signed-off-by to this commit: `a2d5dac2e1` Signed-off-by: jspast <140563347+jspast@users.noreply.github.com> --------- Signed-off-by: jspast <140563347+jspast@users.noreply.github.com>	2026-01-05 17:35:26 +01:00
Michele Dolfi	d03439ccc5	docs(gpu): Add benchmarks of standard pipeline with OCR (#2764) * add results for standard + OCR and more Windows timings Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix runtime selection for py 3.14 in CI Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-12-10 20:43:20 +01:00
Michele Dolfi	b75c6461f4	docs: More GPU results and improvements in the example docs (#2674) * add more results and improve the example docs Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * 5070 windows timing Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add reference for cpu-only Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-11-24 15:26:08 +01:00
Muhammad Ali Hasan	146b4f0535	docs: fix typo on jobkit page (#2671) * kubeflow DCO Remediation Commit for Muhammad Ali Hasan <alihxn23@gmail.com> I, Muhammad Ali Hasan <alihxn23@gmail.com>, hereby add my Signed-off-by to this commit: `a007cf07af` Signed-off-by: Muhammad Ali Hasan <alihxn23@gmail.com> --------- Signed-off-by: Muhammad Ali Hasan <alihxn23@gmail.com>	2025-11-24 09:35:45 +01:00
Michele Dolfi	463a3fd474	fix: Enable GPU for RapidOCR when available (#2659) * add setting for using gpu Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add docs Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-11-19 17:12:00 +01:00
Yasir Ali	741c44fa45	docs: fix typos (#2546) docs: fix typos in enrichments.md ('analize' -> 'analyze', 'consise' -> 'concise') Signed-off-by: Yasir Ali <engr23002@gmail.com>	2025-10-31 10:29:34 +01:00
Michele Dolfi	97aa06bfbc	docs: Add details and examples on optimal GPU setup (#2531 ) * docs for GPU optimizations Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * improve time reporting and improve execution Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * fix standard pipeline Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * tune examples with batch size 64 Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add benchmark results Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * improve docs Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * typo in excluded tests Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * explicit pipeline in table Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-10-30 13:22:05 +01:00
Cesar Berrospi Ramis	9a6fdf936b	docs: update opensearch notebook and backend documentation (#2519 ) * docs(opensearch): update the example notebook RAG with OpenSearch Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * docs(uspto): remove direct usage of the backend class for conversion Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * docs: remove direct usage of backends from documentation Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> --------- Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>	2025-10-27 10:02:50 +01:00
McGuireMark	86556d8367	docs: fix typo in mcp.md (#2502) Update mcp.md Typo fix Signed-off-by: McGuireMark <mark.mcguire@nimblegravity.com>	2025-10-21 17:31:28 +02:00
Imad Saddik	2a0f56390a	docs: fixed a few typos (#2441) Signed-off-by: Imad Saddik <79410781+ImadSaddik@users.noreply.github.com>	2025-10-13 09:04:50 +02:00
Lucas Morin	e6c3b05e63	docs: Jobkit and connectors (#2357) * feat: create documentation for docling-jobkit Signed-off-by: Lucas Morin <lucas.morin222@gmail.com> * small text fixes Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Lucas Morin <lucas.morin222@gmail.com> Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>	2025-10-02 13:46:56 +02:00
Cesar Berrospi Ramis	46efaaefee	feat: add a backend parser for WebVTT files (#2288 ) * feat: add a backend parser for WebVTT files Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * docs: update README with VTT support Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * docs: add description to supported formats Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * chore: upgrade docling-core to unescape WebVTT in markdown Pin the new release of docling-core 2.48.2. Do not escape HTML reserved characters when exporting WebVTT documents to markdown. Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * test: add missing copyright notice Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> --------- Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>	2025-09-22 15:24:34 +02:00
Christoph Auer	17afb664d0	feat: Add granite-docling model (#2272 ) * adding granite-docling preview Signed-off-by: Peter Staar <taa@zurich.ibm.com> * updated the model specs Signed-off-by: Peter Staar <taa@zurich.ibm.com> * typo Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use granite-docling and add to the model downloader Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * update docs and README Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * Update final repo_ids for GraniteDocling Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Update final repo_ids for GraniteDocling Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Fix model name in CLI usage example Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com> * Fix VLM model name in README.md Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com> --------- Signed-off-by: Peter Staar <taa@zurich.ibm.com> Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Signed-off-by: Christoph Auer <cau@zurich.ibm.com> Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com> Co-authored-by: Peter Staar <taa@zurich.ibm.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>	2025-09-17 15:15:49 +02:00
Roy Derks	e5cd7020bd	docs: Add instructions for using Docling with MCP to README (#2219 ) * docs: Add instructions for using Docling with MCP to README * DCO Remediation Commit for Roy Derks <10717410+royderks@users.noreply.github.com> Signed-off-by: Roy Derks <roy.derks@ibm.com> * DCO Remediation Commit for Roy Derks <10717410+royderks@users.noreply.github.com> I, Roy Derks <10717410+royderks@users.noreply.github.com>, hereby add my Signed-off-by to this commit: `4b9ba1d0ef` Signed-off-by: Roy Derks <roy.derks@ibm.com> * docs: reorganize documentation on MCP server Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> * docs: align README with documentation index page Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> --------- Signed-off-by: Roy Derks <roy.derks@ibm.com> Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com> Co-authored-by: Roy Derks <roy.derks@ibm.com> Co-authored-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>	2025-09-10 10:02:28 +02:00
Viktor Kuropiatnyk	cdf079dd06	feat(CLI): Option to download arbitrary HuggingFace model (#2123) * Added option to docling-tools to download arbitrary HuggingFace model Signed-off-by: Viktor Kuropiatnyk <vku@zurich.ibm.com> * Added note in documentation Signed-off-by: Viktor Kuropiatnyk <vku@zurich.ibm.com> * Removed note on custom artifact path usage from HF download option Signed-off-by: Viktor Kuropiatnyk <vku@zurich.ibm.com> * Fixed typo Signed-off-by: Viktor Kuropiatnyk <vku@zurich.ibm.com> --------- Signed-off-by: Viktor Kuropiatnyk <vku@zurich.ibm.com>	2025-08-22 15:23:29 +02:00
Panos Vagenas	8996d612aa	docs: add Getting Started page (#2113 ) * docs: add Getting Started page Signed-off-by: Panos Vagenas <pva@zurich.ibm.com> * refactor usage Signed-off-by: Panos Vagenas <pva@zurich.ibm.com> * minor renaming Signed-off-by: Panos Vagenas <pva@zurich.ibm.com> --------- Signed-off-by: Panos Vagenas <pva@zurich.ibm.com>	2025-08-21 08:44:53 +02:00
stephencox-ict	d6d2dbe2f9	docs: Fix typos (#1943) Fix typos Signed-off-by: stephencox-ict <scox@ict.co>	2025-07-15 09:51:56 +02:00
Peter W. J. Staar	cfdf4cea25	feat: new vlm-models support (#1570 ) * feat: adding new vlm-models support Signed-off-by: Peter Staar <taa@zurich.ibm.com> * fixed the transformers Signed-off-by: Peter Staar <taa@zurich.ibm.com> * got microsoft/Phi-4-multimodal-instruct to work Signed-off-by: Peter Staar <taa@zurich.ibm.com> * working on vlm's Signed-off-by: Peter Staar <taa@zurich.ibm.com> * refactoring the VLM part Signed-off-by: Peter Staar <taa@zurich.ibm.com> * all working, now serious refacgtoring necessary Signed-off-by: Peter Staar <taa@zurich.ibm.com> * refactoring the download_model Signed-off-by: Peter Staar <taa@zurich.ibm.com> * added the formulate_prompt Signed-off-by: Peter Staar <taa@zurich.ibm.com> * pixtral 12b runs via MLX and native transformers Signed-off-by: Peter Staar <taa@zurich.ibm.com> * added the VlmPredictionToken Signed-off-by: Peter Staar <taa@zurich.ibm.com> * refactoring minimal_vlm_pipeline Signed-off-by: Peter Staar <taa@zurich.ibm.com> * fixed the MyPy Signed-off-by: Peter Staar <taa@zurich.ibm.com> * added pipeline_model_specializations file Signed-off-by: Peter Staar <taa@zurich.ibm.com> * need to get Phi4 working again ... 
Signed-off-by: Peter Staar <taa@zurich.ibm.com> * finalising last points for vlms support Signed-off-by: Peter Staar <taa@zurich.ibm.com> * fixed the pipeline for Phi4 Signed-off-by: Peter Staar <taa@zurich.ibm.com> * streamlining all code Signed-off-by: Peter Staar <taa@zurich.ibm.com> * reformatted the code Signed-off-by: Peter Staar <taa@zurich.ibm.com> * fixing the tests Signed-off-by: Peter Staar <taa@zurich.ibm.com> * added the html backend to the VLM pipeline Signed-off-by: Peter Staar <taa@zurich.ibm.com> * fixed the static load_from_doctags Signed-off-by: Peter Staar <taa@zurich.ibm.com> * restore stable imports Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use AutoModelForVision2Seq for Pixtral and review example (including rename) Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * remove unused value Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * refactor instances of VLM models Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * skip compare example in CI Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use lowercase and uppercase only Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add new minimal_vlm example and refactor pipeline_options_vlm_model for cleaner import Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * rename pipeline_vlm_model_spec Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * move more argument to options and simplify model init Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add supported_devices Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * remove not-needed function Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * exclude minimal_vlm Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * missing file Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add message for transformers version Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * rename to specs Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use module import and remove MLX from non-darwin Signed-off-by: Michele Dolfi 
<dol@zurich.ibm.com> * remove hf_vlm_model and add extra_generation_args Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * use single HF VLM model class Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * remove torch type Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add docs for vision models Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Peter Staar <taa@zurich.ibm.com> Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>	2025-06-02 17:01:06 +02:00
Elwin	12dab0a1e8	feat: support image/webp file type (#1415 ) * support image/webp file type Signed-off-by: Elwin <61868295+hzhaoy@users.noreply.github.com> Signed-off-by: Elwin <hzywong@gmail.com> * docs: add webp image format in supported_formats.md Signed-off-by: Elwin <61868295+hzhaoy@users.noreply.github.com> Signed-off-by: Elwin <hzywong@gmail.com> * test: add a test case for `image/webp` file Signed-off-by: Elwin <hzywong@gmail.com> * style: apply styling Signed-off-by: Elwin <hzywong@gmail.com> * test: update test case of converting `image/webp` file with more ocr engines Signed-off-by: Elwin <hzywong@gmail.com> * style: apply styling Signed-off-by: Elwin <hzywong@gmail.com> * rename test file Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Elwin <61868295+hzhaoy@users.noreply.github.com> Signed-off-by: Elwin <hzywong@gmail.com> Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>	2025-05-14 09:47:28 +02:00
Emmanuel Ferdman	3afbe6c969	docs: update supported formats guide (#1463) Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>	2025-04-28 08:51:54 +02:00
Maxim Lysak	1c26769785	feat(SmolDocling): Support MLX acceleration in VLM pipeline (#1199 ) * Initial implementation to support MLX for VLM pipeline and SmolDocling Signed-off-by: Maksym Lysak <mly@zurich.ibm.com> * mlx_model unit Signed-off-by: Maksym Lysak <mly@zurich.ibm.com> * Add CLI choices for VLM pipeline and model Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Initial implementation to support MLX for VLM pipeline and SmolDocling Signed-off-by: Maksym Lysak <mly@zurich.ibm.com> * mlx_model unit Signed-off-by: Maksym Lysak <mly@zurich.ibm.com> * Add CLI choices for VLM pipeline and model Signed-off-by: Christoph Auer <cau@zurich.ibm.com> * Updated minimal vlm pipeline example Signed-off-by: Maksym Lysak <mly@zurich.ibm.com> * make vlm_pipeline python3.9 compatible Signed-off-by: Maksym Lysak <mly@zurich.ibm.com> * Fixed extract_text_from_backend definition Signed-off-by: Maksym Lysak <mly@zurich.ibm.com> * Updated README Signed-off-by: Maksym Lysak <mly@zurich.ibm.com> * Updated example Signed-off-by: Maksym Lysak <mly@zurich.ibm.com> * Updated documentation Signed-off-by: Maksym Lysak <mly@zurich.ibm.com> * corrections in the documentation Signed-off-by: Maksym Lysak <mly@zurich.ibm.com> * Consmetic changes Signed-off-by: Christoph Auer <cau@zurich.ibm.com> --------- Signed-off-by: Maksym Lysak <mly@zurich.ibm.com> Signed-off-by: Christoph Auer <cau@zurich.ibm.com> Co-authored-by: Maksym Lysak <mly@zurich.ibm.com> Co-authored-by: Christoph Auer <cau@zurich.ibm.com>	2025-03-19 15:38:54 +01:00
serced	7e01798417	docs: fix spelling of picture in usage (#1165) Signed-off-by: serced <52759935+serced@users.noreply.github.com>	2025-03-17 09:33:51 +01:00
Christoph Auer	eb97357b05	feat: Use new TableFormer model weights and default to accurate model version (#1100 ) * feat: New tableformer model weights [WIP] Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com> * Updated TF version Signed-off-by: Maksym Lysak <mly@zurich.ibm.com> * Updated tests, after merging with Main, Switched to Accurate TF model by default Signed-off-by: Maksym Lysak <mly@zurich.ibm.com> --------- Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com> Signed-off-by: Maksym Lysak <mly@zurich.ibm.com> Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>	2025-03-11 10:53:49 +01:00
Michele Dolfi	e1c49ad727	docs: add description of DOCLING_ARTIFACTS_PATH env var (#1124) add env var in docs Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>	2025-03-06 07:30:07 +01:00
Michele Dolfi	357d41cc47	docs: Enrichment models (#1097 ) * warning for develop examples Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * add docs for enrichment models Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> * minor reorg of top-level docs (#1098) * minor reorg of top-level docs Signed-off-by: Panos Vagenas <pva@zurich.ibm.com> * fix typo [no ci] Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com> --------- Signed-off-by: Panos Vagenas <pva@zurich.ibm.com> Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com> * trigger ci Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> --------- Signed-off-by: Michele Dolfi <dol@zurich.ibm.com> Signed-off-by: Panos Vagenas <pva@zurich.ibm.com> Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com> Co-authored-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>	2025-03-04 14:24:38 +01:00

36 Commits