Commit Graph

142 Commits

Author SHA1 Message Date
Maksym Lysak 38354b7d13 Added support for "row_section" semantics in HTML_backend.
Improvements to the complex rendering example.

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
2026-05-12 17:08:27 +02:00
Maksym Lysak 85a2a9c5fd Complex example of an HTML page renderer, with custom table-extraction logic and dynamic resolution based on seed images
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
2026-05-12 14:16:06 +02:00
Peter El Hachem 336f942854 feat: add 2 stage model download from hf and call it for threaded layout model. (#3267)
* feat: add 2 stage model download from hf and call it from threaded_layout_vlm_pipeline

Signed-off-by: ElHachem02 <peterelhachem02@gmail.com>

* feat: cleanup extra space and import

Signed-off-by: ElHachem02 <peterelhachem02@gmail.com>

* fix: move demo to docs/examples/

Signed-off-by: ElHachem02 <peterelhachem02@gmail.com>

---------

Signed-off-by: ElHachem02 <peterelhachem02@gmail.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
2026-05-06 13:29:42 +02:00
EliSchwartz 24f2d148d9 feat(vlm): upgrade Granite Vision model to 4.1 for table + chart extraction (#3382)
* feat(table-structure): swap VLM model to granite-vision-4.1-4b

Updates GraniteVisionTableStructureModel to use the 4.1 model. The 4.1
weights are pre-merged, so merge_lora_adapters() is now hasattr-guarded.
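
A minimal sketch of the hasattr guard described above. The two stand-in classes and the helper name `maybe_merge_lora` are hypothetical; only the method name `merge_lora_adapters` comes from this commit message.

```python
class PreMergedModel:
    """Hypothetical stand-in for a checkpoint whose LoRA weights are already merged."""


class AdapterModel:
    """Hypothetical stand-in for a checkpoint that still exposes a merge step."""

    def __init__(self) -> None:
        self.merged = False

    def merge_lora_adapters(self) -> None:
        self.merged = True


def maybe_merge_lora(model: object) -> bool:
    """Call merge_lora_adapters() only when the model exposes it.

    Returns True if a merge was performed, False if it was skipped.
    """
    if hasattr(model, "merge_lora_adapters"):
        model.merge_lora_adapters()
        return True
    return False
```

With a guard of this shape the same loader code can serve both kinds of checkpoint without branching on a version string.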

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com>

* feat(chart-extraction): swap V4 VLM model to granite-vision-4.1-4b

Updates ChartExtractionModelGraniteVisionV4 to use the 4.1 model.
hasattr-guards the merge_lora_adapters() call since 4.1 weights are
pre-merged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com>

* docs(example): mention granite-vision-4.1-4b in table-structure example

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com>

* docs(catalog): update Granite Vision entry to 4.1-4b

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com>

* feat(chart-extraction): honor cuda_use_flash_attention2 in V4 loader

Mirrors the table-structure loader so ChartExtractionModelGraniteVisionV4
also passes _attn_implementation based on AcceleratorOptions. Without this
the chart model falls back to the transformers SDPA default, which can
hit cuDNN backend failures on some torch/cuDNN stacks while the table
model (which already passed the flag) runs cleanly.

Stores accelerator_options on the base class so subclasses can read it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com>

* fix(model-downloader): update Granite Vision log message to 4.1

The log message in download_models still mentioned "Granite Vision 4.0"
after the model swap. Correct it to match the current model version.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com>

* fix(chart-extraction): fall back to bare CSV when V4 model omits ```csv``` fence

granite-vision-4.1-4b sometimes emits raw CSV without a ```csv``` code fence
for the <chart2csv> prompt, which caused _extract_csv_to_dataframe to raise
ValueError and drop the chart's tabular_chart metadata. Mirror the tolerant
parsing already used by the v3 class: prefer a fenced block, otherwise strip
any stray backtick prefix/suffix and parse the text as-is.
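
The tolerant parsing described above can be sketched roughly as follows. This is an illustration, not the code from the PR: the function name `extract_csv_rows` is hypothetical, and it uses the standard-library `csv` module where the original produces a dataframe.

```python
import csv
import io
import re

# Built programmatically so this example contains no literal code fence.
FENCE = "`" * 3

# Matches a fenced block tagged "csv" and captures its body.
_FENCED_CSV = re.compile(FENCE + r"csv\s*\n(.*?)" + FENCE, re.DOTALL)


def extract_csv_rows(text: str) -> list[list[str]]:
    """Prefer a fenced csv block; otherwise strip stray backticks and parse as-is."""
    match = _FENCED_CSV.search(text)
    if match:
        payload = match.group(1)
    else:
        # No fence found: drop any leading/trailing backticks and use the raw text.
        payload = text.strip().strip("`")
    payload = payload.strip()
    if not payload:
        raise ValueError("no CSV content found")
    return [row for row in csv.reader(io.StringIO(payload)) if row]
```

Both a fenced and a bare model response then yield the same rows, so a missing fence no longer discards the chart data.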

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com>

---------

Signed-off-by: Eli Schwartz <eliyahu.schwartz@ibm.com>
Co-authored-by: Eli Schwartz <eliyahu.schwartz@ibm.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 08:36:08 +02:00
geoHeil eb4724ee4c ci: prototype tach-based modular skipping (#3333)
* ci: prototype tach-based modular skipping

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: modularize ubuntu setup and refine gating

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: adopt metaxy-inspired governance helpers

- replace custom aggregate check with re-actors/alls-green

- set FORCE_JAVASCRIPT_ACTIONS_TO_NODE24 on every workflow

- keep PR concurrency alive when the graphite:merge label is present

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: tune checks and pin action versions

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: split CI suites and heavy examples

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* DCO Remediation Commit for Georg Heiler <georg.kf.heiler@gmail.com>

I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: ecaa4777886157d5c2a7b3893c3a820983089dbf
I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: d15416f3ca94ac97af2a8317cd6404208db9d896

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: sharpen tach graph and per-suite path filters

- Split docling.pipeline into per-pipeline tach modules
  (asr, vlm, standard_pdf, threaded_standard_pdf, legacy_standard_pdf,
  extraction_vlm, base, base_extraction, simple) so pytest --tach-base
  impact analysis can attribute changes to a specific pipeline rather
  than the whole package.
- Split the asr- and vlm-specific docling.datamodel option files
  (asr_model_specs, pipeline_options_asr_model, vlm_engine_options,
  vlm_model_specs, pipeline_options_vlm_model, layout_model_specs,
  stage_model_specs, backend_options) into their own tach modules so
  a narrow spec/options change no longer marks the full datamodel as
  impacted.
- Narrow the per-suite pipeline path filters in checks.yml to the
  concrete pipeline files relevant to each suite, so editing
  vlm_pipeline.py only triggers the vlm matrix cell and editing
  asr_pipeline.py only the asr one.
- Rekey the model cache in setup-ubuntu-ci to include runner.os and
  hashFiles(uv.lock, pyproject.toml), with ordered restore-keys
  fallbacks so a lockfile bump no longer silently stales the cache.

Metaxy parity note: layered tach enforcement (layer = "...") is
blocked by existing backend<->datamodel and utils<->stages cycles;
depot runners, nox dynamic matrices, devenv/nix, dprint and ty are
not applicable to docling's stack. All pinned action SHAs are on
their latest release as of this commit.

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: introduce pipeline and orchestration tach layers

Earlier notes claimed layers were blocked. That was only true for the
cyclic core (backend<->datamodel, utils<->stages). The boundary
*above* core is clean:

- No module under docling/backend, docling/datamodel, docling/models,
  docling/utils, docling/exceptions, or docling/chunking imports
  anything from docling.pipeline (verified by grep).
- No module anywhere in docling/ imports from docling.cli,
  docling.document_converter, docling.document_extractor, or
  docling.service_client (also verified).

So we can introduce two real layers on top of the cyclic core:

- "pipeline"      — docling.pipeline and all nine concrete pipelines
                     (base, simple, base_extraction, asr, vlm,
                     extraction_vlm, standard_pdf,
                     threaded_standard_pdf, legacy_standard_pdf).
- "orchestration" — docling.cli, docling.document_converter,
                     docling.document_extractor, and
                     docling.experimental.pipeline.

Unlayered modules stay "below" both layers (tach allows them to be
depended on freely) and continue to carry the declared-but-cyclic
backend<->datamodel and utils<->stages edges.

A VLM-only layer was explored but rejected: only
docling.pipeline.vlm_pipeline and docling.pipeline.extraction_vlm_pipeline
could be cleanly layered as "vlm", because the matching datamodel
options (pipeline_options_vlm_model, vlm_engine_options,
vlm_model_specs) and model stages (vlm_convert, vlm_pipeline_models)
sit inside the datamodel/models cycle and cannot be promoted to a
higher layer without first breaking that cycle. Layering only the
two pipeline files is not worth the extra config.

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: expand tach layers to entrypoints/pipeline/models/core

Follow-up to the two-layer attempt. After verifying via grep that
nothing in datamodel/utils/backend imports from
docling.models.{extraction,factories,plugins,vlm_pipeline_models}
or from the "upper" stages (page_assemble, page_preprocessing,
reading_order, picture_description, vlm_convert), those nine
modules can be promoted out of the cyclic core into a dedicated
"models" layer.

The resulting order (highest first):

- entrypoints — cli, document_converter, document_extractor,
                experimental.pipeline
- pipeline    — docling.pipeline + the nine concrete pipelines
- models      — model factories, extraction, plugins,
                vlm_pipeline_models, and the five "upper" stages
- core        — datamodel*, backend*, utils, exceptions, chunking,
                models (base), models.utils, inference_engines.*,
                the six "core stages" that utils cycles with
                (chart_extraction, code_formula, layout, ocr,
                picture_classifier, table_structure), and the
                experimental.* and service_client modules

Rename the previous "orchestration" layer to "entrypoints" to
match the common docling vocabulary. Every module now carries an
explicit layer tag instead of relying on implicit unlayered
behaviour, so future additions must pick a layer deliberately.

A VLM layer, a stand-alone inference-engines layer, and separating
datamodel from backend all remain blocked by the bidirectional
backend<->datamodel and utils<->core-stages edges; those need a
code-level refactor first.

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: refine tach client and foundation layers

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: add optional windows and macos smoke lanes

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: normalize reusable workflow boolean inputs

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: replace external all-green action

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: use org-allowed setup-uv action

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: install compiler toolchain for ML tests

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* DCO Remediation Commit for Georg Heiler <georg.kf.heiler@gmail.com>

I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: bb714afb42cd1b29ab073a7f59cc72874ff2fdcd

I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: a1f2761da8f72bfed636bd571ebf77b42c8771b6

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* DCO Remediation Commit for Georg Heiler <georg.kf.heiler@gmail.com>

I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: cc6551b54c5bf4815ae9cd57cf43a98928a74be0

I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: b21b0e7ca12b552dbdd54fac1bda113719c286f1

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: simplify ML pytest suite patterns

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: gate heavy examples on label, add job timeouts

- ci-heavy-examples: run only on main push, schedule, workflow_dispatch,
  or when a PR is labeled tests:full / tests:heavy-examples. Drops the
  path-based auto-trigger so that common edits to pyproject.toml,
  uv.lock, or .github/actions do not kick off the 45-60min matrix on
  every PR push. Collapses the changes job into a job-level if gate and
  adds timeout-minutes: 90.
- checks.yml: add timeout-minutes to every job so stuck runners cannot
  burn the full 6h default.

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: tolerate cancelled allowed-skip jobs in check aggregator

Intentional cancellations (manual cancel, concurrency replacement) on
jobs that are already in ALLOWED_SKIPS should not mark the overall
workflow red. Treat `cancelled` the same as `skipped` when the job is
listed as an allowed skip; any unexpected cancellation of a required
job still fails.
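
A sketch of the aggregation rule, assuming the aggregator sees one result string per job ("success", "failure", "skipped" or "cancelled", as GitHub Actions reports them). The function name and the Python form are illustrative; the actual check lives in the workflow.

```python
def aggregate_results(results: dict[str, str], allowed_skips: set[str]) -> bool:
    """Return True when the overall check should be green.

    `results` maps a job name to its result string.
    `allowed_skips` lists the jobs that may be skipped or cancelled.
    """
    for job, result in results.items():
        if result == "success":
            continue
        # Skipped and cancelled are tolerated only for explicitly allowed jobs.
        if result in {"skipped", "cancelled"} and job in allowed_skips:
            continue
        return False
    return True
```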

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* docs: make minimal vlm example portable

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* DCO Remediation Commit for Georg Heiler <georg.kf.heiler@gmail.com>

I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 2135051da3ed73d4b8a9130f584f40b56155af1a

I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 4f6d1d7960f7418d0cde6425ae61538da84fda40

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: install workspace packages in CI syncs

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* DCO Remediation Commit for Georg Heiler <georg.kf.heiler@gmail.com>

I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 492fa9883d4de6d98ebcb40fa863eafe2facff3c

I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: 3eefae71643f9ca3df0264690c0c6eb1f67f06f1

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* DCO Remediation Commit for Georg Heiler <georg.kf.heiler@gmail.com>

I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: fe8c9689a0ee94f36eb826da8e2177ef87404f5e

I, Georg Heiler <georg.kf.heiler@gmail.com>, hereby add my Signed-off-by to this commit: eabdd24a6734ec873cdaac857718aef2473677e7

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: remove unused graphite concurrency exception

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: document test labels and gate cross-platform lanes

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: select ml tests with pytest markers

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: fix marker selector typing

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: simplify ml suite scheduling

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: mark cross-platform smoke tests

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: reuse test trigger for ml matrix

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: tighten full ci aggregation

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* ci: share required job result check

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

---------

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 14:15:35 +02:00
EliSchwartz 1569e42f84 feat: implement GraniteVisionTableStructureModel for VLM-based table extraction (#3323)
Add a new table structure model using IBM Granite Vision to extract table
structure from document images via OTSL token generation.

Changes:
- Add `GraniteVisionTableStructureOptions` with configurable model repo,
  device, batch size, and crop padding options
- Implement `GraniteVisionTableStructureModel` that uses a VLM pipeline to
  generate OTSL tokens from cropped table images, then parses them into
  `TableData` with cells, rows, and columns
- Register the model in `table_structure_engines` alongside existing engines
- Add example script `docs/examples/granite_vision_table_structure.py`
- Add tests covering options, model enable/disable, OTSL parsing (including
  self-closing tags xcel/srow/ecel), and invalid-backend error handling
- Update model catalog docs and CI workflow accordingly

Signed-off-by: Eli Schwartz <eli.shw@gmail.com>
2026-04-17 11:02:20 +02:00
geoHeil 251c8b217a fix(ocr): align RapidOCR english assets with 3.8 mobile models (#3291)
* fix(ocr): support language selection for RapidOCR engine

Allows specifying 'english' or 'chinese' via the --ocr-lang flag and automatically downloads the correct models.

Signed-off-by: DevAbdullah90 <abdullahkashif12b3@gmail.com>

* fix(ocr): fix linting and add unit tests for RapidOCR language selection

Signed-off-by: DevAbdullah90 <abdullahkashif12b3@gmail.com>

* fix(ocr): align RapidOCR english assets with 3.8 mobile models

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* fix(ocr): restore RapidOCR default model compatibility

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* fix(examples): disable OCR in code formula comparison

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

---------

Signed-off-by: DevAbdullah90 <abdullahkashif12b3@gmail.com>
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
Co-authored-by: DevAbdullah90 <abdullahkashif12b3@gmail.com>
2026-04-15 12:16:41 +02:00
nuri cd2e5b633d docs: add indexed picture placeholder example to serialization notebook (#3293)
* docs: add indexed picture placeholder example to serialization notebook

Show how to subclass MarkdownPictureSerializer to resolve {index}
tokens in image placeholders using self_ref, as an alternative to
modifying the library default.

Ref: docling-project/docling-core#555

* DCO Remediation Commit for nuri-yoo <nuri-yoo@users.noreply.github.com>

I, nuri-yoo <nuri-yoo@users.noreply.github.com>, hereby add my Signed-off-by to this commit: 82cb733c0b

Signed-off-by: nuri-yoo <nuri-yoo@users.noreply.github.com>

---------

Signed-off-by: nuri-yoo <nuri-yoo@users.noreply.github.com>
Co-authored-by: nuri-yoo <nuri-yoo@users.noreply.github.com>
2026-04-15 05:45:39 +02:00
Jehlum Pandit c23622f6f5 docs: add agent skill bundle for coding assistants (SKILL.md, pipelines, convert/evaluate) (#3174)
* docs: add agent skill bundle with convert/evaluate helpers

- Add docs/examples/agent_skill/docling-document-intelligence/ with
  SKILL.md, pipelines.md, EXAMPLE.md, improvement-log template, and
  scripts/docling-convert.py + docling-evaluate.py (standard/vlm-local/vlm-api).
- Document InputFormat.PDF + PdfFormatOption for explicit PdfPipelineOptions.
- Link from examples index and mkdocs nav.

Made-with: Cursor

* docs: align agent skill README and EXAMPLE with Cursor bundle

- Document both ~/.cursor/skills and docs/examples paths.
- README notes repo parity for PRs and local installs.

Made-with: Cursor

* DCO Remediation Commit for jehlum11 <jehlum11@gmail.com>

I, jehlum11 <jehlum11@gmail.com>, hereby add my Signed-off-by to this commit: 2d268ffb6f
I, jehlum11 <jehlum11@gmail.com>, hereby add my Signed-off-by to this commit: 041e709c66

Signed-off-by: jehlum11 <jehlum11@gmail.com>
Made-with: Cursor

* docs: refactor agent skill to use docling CLI for conversion

Address maintainer feedback: the custom docling-convert.py script was
largely redundant with the existing docling CLI. This commit:

- Removes scripts/docling-convert.py (redundant with `docling` CLI)
- Refactors SKILL.md (v1.4 → v2.0) to use `docling` CLI for all
  conversion tasks, reserving the Python API only for features the
  CLI does not expose (chunking, VLM API endpoint config,
  force_backend_text hybrid mode)
- Updates docling-evaluate.py recommended_actions to reference
  `docling` CLI flags instead of the removed script
- Updates README.md, EXAMPLE.md, pipelines.md to use `docling` CLI
  examples throughout
- Simplifies requirements.txt (removes packaging dependency)

The only custom script retained is docling-evaluate.py, which provides
heuristic quality evaluation — functionality the CLI does not cover.

Signed-off-by: jehlum11 <jehlum11@gmail.com>
Made-with: Cursor

* docs: fix ruff format on docling-evaluate.py

Signed-off-by: jehlum11 <jehlum11@gmail.com>
Made-with: Cursor

---------

Signed-off-by: jehlum11 <jehlum11@gmail.com>
2026-04-13 15:02:51 +02:00
Christoph Auer 42157a3e10 feat(service): Establish client SDK for docling serve (#3264)
* Move client SDK to docling

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Add client SDK examples

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Mark client SDK as experimental

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2026-04-13 14:54:06 +02:00
geoHeil 9970d1ef94 feat(vlm): add Nanonets OCR2 onboarding (#3274)
* feat(vlm): add nanonets ocr2 onboarding

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

* feat(vlm): add vLLM and API runtimes for Nanonets-OCR2

Extend the Nanonets-OCR2 preset with vLLM + remote API paths so all
standard docling runtimes (Transformers, MLX, vLLM, API, LM Studio,
OpenAI-compatible) work out of the box. Drop the restricted
supported_engines set to match the GLM-OCR / LightOnOCR / Falcon-OCR
pattern, add top-level torch_dtype on the Transformers override, and
register NANONETS_OCR2_VLLM / NANONETS_OCR2_VLLM_API /
NANONETS_OCR2_LMSTUDIO_API legacy specs plus VlmModelType enum entries.

Folds in the remote-API scope that was on the superseded PR #3275.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>

---------

Signed-off-by: Georg Heiler <georg.kf.heiler@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-13 06:51:35 +02:00
Faridun Mirzoev 1fed840506 docs: add AG2 multi-agent document analysis example (#3261)
* docs: add AG2 multi-agent document analysis example

Add a Jupyter notebook demonstrating how to combine Docling document
conversion with AG2 multi-agent orchestration. A Document Processor
agent uses Docling tools to convert PDFs to markdown and extract tables,
while an Analyst agent synthesizes findings into a structured summary.

* DCO Remediation Commit for Faridun Mirzoev <faridun@ag2.ai>

I, Faridun Mirzoev <faridun@ag2.ai>, hereby add my Signed-off-by to this commit: e80e0f3375

Signed-off-by: Faridun Mirzoev <faridun@ag2.ai>

* docs: fix ruff PD901 lint — rename df to table_df

Signed-off-by: Faridun Mirzoev <faridun@ag2.ai>

---------

Signed-off-by: Faridun Mirzoev <faridun@ag2.ai>
2026-04-12 07:30:39 +02:00
Peter W. J. Staar fd834204fa feat: Support for GraniteVision v4 (#3217)
* feat: add GV4

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* got everything to work with granite-vision-4

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* robustifying the output of granite-vision-v4

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* refactored the code to reduce duplication

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* refactored the code to align with pipeline options

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* ran pre-commit

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* fixed the circular import

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* added the chart_extraction_options file

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

---------

Signed-off-by: Peter Staar <taa@zurich.ibm.com>
2026-04-10 08:55:40 +02:00
Peter W. J. Staar e9a39e8720 feat: remove the deprecation of extraction (#3220)
* updated the example

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* chore: removed the experimental warnings in the extractor

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

---------

Signed-off-by: Peter Staar <taa@zurich.ibm.com>
2026-04-01 09:39:34 +02:00
Anish Raghavendra 3a64f41af8 docs: add line-based chunker documentation and examples (#3210)
Signed-off-by: anish.raghavendra <anish.raghavendra@ibm.com>
Co-authored-by: anish.raghavendra <anish.raghavendra@ibm.com>
2026-03-30 10:55:31 +02:00
Maxim Lysak 1c74a9b9c7 feat: Implementation of HTML backend with headless browser (#2969)
- Implementation of an HTML backend that optionally uses a headless browser (via Playwright) to render HTML pages into images and adds provenance with bounding boxes to all elements in the converted Docling document.
- Conversion preserves the reading order given by the HTML DOM tree
- Added support for HTML "input" fields: checkboxes, radio buttons, text inputs, etc.
- Added support for the key-value convention in HTML (i.e. elements with id "key1" and "key1_value1" are paired as key-values; see the test cases for examples)
- Heuristic that merges independent inline HTML elements containing single-character text into larger text blocks
- Support for inline styling (bold, italic, etc.)
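
The key-value id convention mentioned in the list above could be sketched as below. The matching rule (a value id is the key id followed by `_value` and a number) is an assumption inferred from the "key1" / "key1_value1" example, and `pair_key_values` is a hypothetical name, not the backend's API.

```python
import re

# Assumed pattern: "<key id>_value<number>".
_VALUE_ID = re.compile(r"^(?P<key>.+)_value\d+$")


def pair_key_values(elements: dict[str, str]) -> dict[str, list[str]]:
    """Group the text of value elements under the text of their key element.

    `elements` maps an HTML id attribute to the element's text content.
    """
    pairs: dict[str, list[str]] = {}
    for element_id, text in elements.items():
        match = _VALUE_ID.match(element_id)
        # Only pair when the referenced key element actually exists.
        if match and match.group("key") in elements:
            key_text = elements[match.group("key")]
            pairs.setdefault(key_text, []).append(text)
    return pairs
```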

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>
2026-03-24 14:28:57 +01:00
Max Swain fffd445789 docs: Fix Erroneous vLLM VLM pipeline engine option params causing empty/bad responses (#3167)
docs: Update vLLM VLM pipeline engine option params to match hugging face

Signed-off-by: Max Swain <89113255+maxdswain@users.noreply.github.com>
2026-03-23 10:44:37 +01:00
Maree Carroll d113e611c4 docs: fix code in rag langchain chunker tokenizer (#2993)
* docs: fix code in rag langchain chunker tokenizer

Signed-off-by: Maree Carroll <maree.carroll@gmail.com>

* DCO Remediation Commit for Maree Carroll <maree.carroll@gmail.com>

I, Maree Carroll <maree.carroll@gmail.com>, hereby add my Signed-off-by to this commit: c5abe24ab0c8f29188373dfeaca2c0d6aff8cca2

Signed-off-by: Maree Carroll <maree.carroll@gmail.com>

* fix: Loosen pillow version constraints to allow CVE-2026-25990 fix (#2992)

* Loosen pillow version constraints to allow CVE-2026-25990 fix

Signed-off-by: divekarsc <divekar.samved@gmail.com>

* Added numba>=0.63.0 constraint directly to the asr optional dependency in pyproject.toml

Signed-off-by: divekarsc <divekar.samved@gmail.com>

* fix: Added numba>=0.63.0 constraint directly to the asr optional dependency in pyproject.toml

Signed-off-by: divekarsc <divekar.samved@gmail.com>

---------

Signed-off-by: divekarsc <divekar.samved@gmail.com>
Signed-off-by: Maree Carroll <maree.carroll@gmail.com>

---------

Signed-off-by: Maree Carroll <maree.carroll@gmail.com>
Signed-off-by: divekarsc <divekar.samved@gmail.com>
Co-authored-by: Samved Divekar <divekar.samved@gmail.com>
2026-03-10 06:50:24 +01:00
Kaiiiiiiiii 5d3ac38a65 docs: set HuggingFaceEndpoint task for Mixtral examples (#2945)
* docs: set HuggingFaceEndpoint task for Mixtral examples

* DCO Remediation Commit for Kaiiiiiiiii <2761362118@qq.com>

I, Kaiiiiiiiii <2761362118@qq.com>, hereby add my Signed-off-by to this commit: 1288c1fadd

Signed-off-by: Kaiiiiiiiii <2761362118@qq.com>

* DCO Remediation Commit for Kaiiiiiiiii <2761362118@qq.com>

I, Kaiiiiiiiii <2761362118@qq.com>, hereby add my Signed-off-by to this commit: 1288c1fadd

Signed-off-by: Kaiiiiiiiii <2761362118@qq.com>

---------

Signed-off-by: Kaiiiiiiiii <2761362118@qq.com>
2026-03-08 10:38:06 +01:00
Cesar Berrospi Ramis 1eb5c21dab docs: add XBRL conversion example notebook and update feature listings (#3039)
docs(xbrl): add notebook for XBRL parsing

Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
2026-02-27 16:09:19 +01:00
Christoph Auer 03532938b5 feat: Unified model-family inference engines (including image-classification) and KServe v2 API support (#2979)
* feat: Inference engines abstraction for image classification model family with HF Transformers and ONNX runtime

Implements runtime abstraction for image classification models with support for both ONNX Runtime and HuggingFace Transformers engines. Users can switch between engines without model retraining, similar to the object detection abstraction (#2959).

Key components:
- BaseImageClassificationEngine with factory pattern
- OnnxRuntimeImageClassificationEngine and TransformersImageClassificationEngine implementations
- Shared HfVisionModelMixin for common HF model utilities
- Engine-specific configuration options
- Test suite and example demonstrating runtime engine switching

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Add missing files and re-export for backward compat

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Don't run with OCR in the example.

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Remove excess onnxruntime related options for inputs and outputs

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* feat: centralize torch compile defaults with DOCLING_INFERENCE_COMPILE_TORCH_MODELS

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* feat: Add Kserve2 API engine for image classifier and object detection models (#2999)

* fix: add failed pages to DoclingDocument for page break consistency (#2939)

* fix: add failed pages to DoclingDocument for page break consistency

When some PDF pages fail to parse, they were not added to
DoclingDocument.pages, causing page break markers to be incorrect
during export. This adds failed/skipped pages with their size info
(if available) to maintain correct page numbering and structure.

- Add _add_failed_pages_to_document() method in StandardPdfPipeline
- Add test cases for failed page handling
- Add test cases for normal page handling (regression test)
- Add test PDF files

Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com>

* fix: ensure resource cleanup and simplify type hints

- Wrap page_backend usage in try-finally to guarantee unload (prevents resource leaks).
- Simplify redundant 'float | None | None' type hint.

Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com>

* fix: add groundtruth for normal_4pages.pdf and exclude failing PDFs from e2e test

Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com>

* fix: ensure correct status assertion for failed pages in tests

Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com>

---------

Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com>

* fix: Use timezone-aware datetime (#2947)

* Use timezone-aware datetime for profiling timestamps

Updated timestamp recording to use timezone-aware datetime.
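
For illustration, a timezone-aware timestamp can be taken as shown below. Using UTC is an assumption here, and `profiling_timestamp` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def profiling_timestamp() -> datetime:
    """Return the current time as an aware datetime (UTC assumed)."""
    return datetime.now(timezone.utc)
```

An aware value carries its offset, so it can be compared and serialized unambiguously across machines in different time zones.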

Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com>

* run formatter

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

---------

Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>

* fix(asciidoc): handle commas in image alt text (#2983)

* Fix: Handle commas in AsciiDoc image alt text

  - Modified _parse_picture() to gracefully handle alt text containing commas
  - Commas in alt text are now preserved instead of causing ValueError
  - Added test case with realistic auto-generated alt text
  - split('=', 1) prevents issues when values contain '=' characters
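
A rough sketch of the parsing idea in the bullets above, not the actual `_parse_picture()` code. It assumes the attribute text inside the brackets of an AsciiDoc image macro, treats comma-separated parts without `=` as alt text, and splits named attributes with `split('=', 1)`. The function name is hypothetical.

```python
def parse_image_attributes(attr_text: str) -> tuple[str, dict[str, str]]:
    """Split image attributes into (alt_text, named_attributes)."""
    alt_parts: list[str] = []
    named: dict[str, str] = {}
    for part in attr_text.split(","):
        if "=" in part:
            # maxsplit=1 keeps any further '=' characters inside the value.
            key, value = part.split("=", 1)
            named[key.strip()] = value.strip()
        else:
            # Parts without '=' belong to the alt text; commas are restored on join.
            alt_parts.append(part)
    return ",".join(alt_parts).strip(), named
```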

* DCO Remediation Commit for n0rdp0l <n90.w135@gmail.com>

I, n0rdp0l <n90.w135@gmail.com>, hereby add my Signed-off-by to this commit: ee752491fc

Signed-off-by: n0rdp0l <n90.w135@gmail.com>

* style: fix ruff formatting in test_backend_asciidoc.py

Signed-off-by: n0rdp0l <n90.w135@gmail.com>

---------

Signed-off-by: n0rdp0l <n90.w135@gmail.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>

* chore: bump version to 2.73.1 [skip ci]

* First attempt at establishing API Kserve2 facet

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* refactor: improve KServe v2 engine implementation after code review

- Add comprehensive error handling to KserveV2HttpClient
  - Catch and wrap Timeout, ConnectionError, HTTPError with context
  - Validate response formats with clear error messages

- Refactor URL building to eliminate duplication
  - Extract _build_model_url() helper method
  - Single source of truth for infer_url and model_metadata_url

- Make URL required parameter (remove default localhost:8000)
  - Update ApiKserveV2*EngineOptions to require explicit URL
  - Add preset validation with helpful error messages

- Rename constants for clarity: TRITON_* → KSERVE_V2_*
  - Add comment explaining KServe v2 uses Triton type system

- Improve error messages with actual values
  - Show counts, shapes, and supported types in validation errors

- Document official KServe Python SDK alternative
  - Note async-only requirement and alpha status

- Update tests for required URL parameter
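
A minimal sketch of the URL-building idea from the list above, assuming the standard KServe v2 REST paths (`/v2/models/{name}[/versions/{version}]` for metadata, with `/infer` appended for inference). The function `build_model_url` and its parameters are hypothetical; the real `_build_model_url()` helper may differ.

```python
from typing import Optional
from urllib.parse import quote


def build_model_url(
    base_url: str,
    model_name: str,
    version: Optional[str] = None,
    suffix: str = "",
) -> str:
    """Build a KServe v2 model URL from one place (hypothetical helper)."""
    if not base_url:
        # The URL is required; there is no localhost default to fall back on.
        raise ValueError("base_url is required, e.g. 'http://host:8000'")
    url = f"{base_url.rstrip('/')}/v2/models/{quote(model_name, safe='')}"
    if version:
        url += f"/versions/{quote(version, safe='')}"
    return f"{url}/{suffix}" if suffix else url


# Both endpoints derive from the same helper, so they cannot drift apart.
metadata_url = build_model_url("http://host:8000", "layout")
infer_url = build_model_url("http://host:8000", "layout", suffix="infer")
```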

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Cleanup in kserve http helper and options

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Further cleanup

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Fix for remote-services on tablemodel

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* fix: improved deserialization of engine_options (#3008)

* add registry of discriminated subclasses

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* fix detection of engine_type value

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

---------

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* Add options serialization improvements

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com>
Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: n0rdp0l <n90.w135@gmail.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: jhchoi1182 <jhchoi1182@gmail.com>
Co-authored-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Felix Wente <63914035+n0rdp0l@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>

* Fixes from review

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* DCO Remediation Commit for Christoph Auer <cau@zurich.ibm.com>

I, Christoph Auer <cau@zurich.ibm.com>, hereby add my Signed-off-by to this commit: 4cdb01e6d3

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* DCO Remediation Commit for Christoph Auer <60343111+cau-git@users.noreply.github.com>

I, Christoph Auer <60343111+cau-git@users.noreply.github.com>, hereby add my Signed-off-by to this commit: e293ba3270

Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Add fallback for API variants

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Recreate uv.lock

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: jhchoi1182 <jhchoi1182@gmail.com>
Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: n0rdp0l <n90.w135@gmail.com>
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
Co-authored-by: jhchoi1182 <jhchoi1182@gmail.com>
Co-authored-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Felix Wente <63914035+n0rdp0l@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Michele Dolfi <97102151+dolfim-ibm@users.noreply.github.com>
2026-02-18 10:49:19 +01:00
Peter W. J. Staar bf417e6d26 feat: Introduce docling-parse v5 and deprecate old docling-parse backends (#2872)
* feat: simplifying towards docling-parse v5

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* working on integrating docling-parse v5

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* added the test_backend_docling_parse

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* Updated the docling-parse to 5.3.0

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* ran the pre-commit

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* fixed the backend_docling_parse

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* ran pre-commit

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* updated the groundtruth to deal with rounding errors

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* updated comments for later docling-parse integrations

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* ran pre-commit

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* Make DoclingParseV2 and DoclingParseV4 backend stubs that route to new backend, emit warning.
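
One common way to build such a routing stub is a thin subclass that warns on construction. The sketch below uses hypothetical class names (`CurrentParseBackend`, `LegacyParseBackendV2`) and is not the actual docling backend code.

```python
import warnings


class CurrentParseBackend:
    """Stand-in for the new backend (hypothetical name)."""

    def __init__(self, source: str) -> None:
        self.source = source


class LegacyParseBackendV2(CurrentParseBackend):
    """Deprecated alias: behaves like the new backend but emits a warning."""

    def __init__(self, source: str) -> None:
        warnings.warn(
            "LegacyParseBackendV2 is deprecated; use CurrentParseBackend instead.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not at this stub
        )
        super().__init__(source)
```

Existing code that still imports the old class keeps working, while the warning tells users which class to migrate to.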

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* lock docling-parse

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* updated to 3.5.2

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

---------

Signed-off-by: Peter Staar <taa@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
2026-02-17 20:27:56 +01:00
Christoph Auer 14e474c955 feat: Inference engines abstraction for object detection model family with HF Transformers and ONNX runtime (#2959)
* Add object_detection family and inference engine

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Fixes for alignment with vlm family

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Proposals for layout and table models

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Update the object-detection family, runtime, plugin, and more

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Add comments

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Clean up artifacts path handling.

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* feat: Load label mappings from HuggingFace config in object detection engines

- Add abstract get_label_mapping() method to BaseObjectDetectionEngine
- Implement label loading from config.json in OnnxRuntimeObjectDetectionEngine
- Refactor LayoutObjectDetectionModel to use engine-provided labels instead of hardcoded mapping
- Centralizes label mapping logic in the inference engine layer

This eliminates hardcoded label dictionaries and makes label mappings configurable through model configs.
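
As a rough illustration of reading labels from a model config rather than hardcoding them: Hugging Face `config.json` files usually carry an `id2label` mapping with string keys. The function name `load_label_mapping` is an assumption for this sketch and is not the engine's actual `get_label_mapping()` implementation.

```python
import json
from pathlib import Path


def load_label_mapping(model_dir: str) -> dict[int, str]:
    """Read the class-id to label-name mapping from <model_dir>/config.json."""
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    id2label = config.get("id2label")
    if not id2label:
        raise ValueError(f"No 'id2label' mapping found in {config_path}")
    # JSON object keys are always strings, so convert them back to ints.
    return {int(class_id): label for class_id, label in id2label.items()}
```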

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* feat: Add Transformers engine for object detection

Implement TransformersObjectDetectionEngine as a PyTorch-based alternative
to ONNX Runtime. Works as drop-in replacement for both layout and table
detection models with support for CPU, CUDA, and MPS devices.

- Add TransformersObjectDetectionEngine with AutoModelForObjectDetection
- Update TransformersObjectDetectionEngineOptions (score_threshold, torch_dtype)
- Update factory to instantiate Transformers engine
- Switch OBJECT_DETECTION_LAYOUT_HERON preset to use Transformers by default
- Add logging configuration to layout_object_detection_example.py

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Improve OD example with different runtimes to demonstrate abstraction.

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Added onnxruntime as extra

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Update example header.

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Add missing transformers_engine for OD

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Remove unused cleanup hook.

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Guard against onnxruntime missing on python 3.14

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* refactor: extract shared HF object-detection engine base

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2026-02-10 16:15:30 +01:00
Michele Dolfi d4c87133f3 feat: Introduce pluggable VLM runtime system with preset-based configuration (#2919)
* model runtime refactoring

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* add test

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* fix code formula preset

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* batch prediction

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* use presets and new vlm options in CLI

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* use new model settings by default

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* running

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* update examples

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* fixes for running examples

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* keep old stage

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* update model

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* use granite 3.3 and set options

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* revisit init logic and propagate the proper options to the runtimes

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* update all stages with original setup

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* per stage registry

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* use chat template

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* remove duplicated predict() and factor out some utils

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* working picture description examples

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* add granite docling as code formula model

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* rename code formula presets

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* fix running minimal_vlm example

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* add all models to presets and run compare_vlm

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* remove unused repo_id

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* update vlm api model example

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* fix legacy examples

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* add another legacy example

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* fix test

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* avoid automatic fallback to mlx and fix end_of_utterance in codeformula

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* move vlm_convert_model

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* use new vlm runtime class

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* flags for CI

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* rename runtimes to explicit vlm_runtimes

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* renaming from runtime to inference engine and model families

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* fixes

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* fix test

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* add docs with stages

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* update docs catalog page

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* rename runtime to inference engine

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

---------

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2026-02-04 17:29:17 +01:00
Peter W. J. Staar a5ad8f24ff docs: add granite vision for charts (#2946)
* feat: add chart extraction models

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* refactored code

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* fixed the meta

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* cleaned up the ChartExtractionModelGraniteVision

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* cleaned up the chart2csv model

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* fixed the merge issues

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* fixed the imports

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* added the missing files

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* fixed the output post-processing

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* example: add chart extraction example

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* working chart conversion

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

---------

Signed-off-by: Peter Staar <taa@zurich.ibm.com>
2026-02-03 12:31:27 +01:00
Michele Dolfi 7f386587ed feat: Drop support for Python 3.9 (#2905)
* chore: drop support for Python 3.9

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* disable CI for python 3.9

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* fix: test bump version

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* add chore to the changelog but without bumping the version

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* force newer langchain-core

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* fix linter for 3.10

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* Add python 3.9 removal notice

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* avoid upgrading docling-core

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* restore semantic release settings

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

---------

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
2026-01-23 10:15:58 +01:00
Michele Dolfi b14ee1557b refactor: organize models in submodules (#2845)
* refactor ocr models

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* refactor vlm api models

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* refactor vlm_models

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* refactor picture description

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* refactor table structure

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* refactor all into stages

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

---------

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2026-01-13 13:45:48 +01:00
Nikhil Singh 72851cce66 docs: fix Colab badge links and Weaviate typo in docs examples (#2871)
* Fix typo in vector store name from 'Weavieate' to 'Weaviate'

Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com>

* Fix link to hybrid_rag_qdrant notebook in markdown

Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com>

* Fix markdown link formatting in retrieval_qdrant.ipynb

Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com>

* Update Colab link in retrieval_qdrant notebook

Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com>

* Fix HTML escaping in Colab link in notebook

Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com>

* Create output directory if it doesn't exist

Ensure output directory exists before saving files.

Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com>

---------

Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com>
2026-01-12 08:50:41 +01:00
Nikhil Singh 211c759fd0 docs(example): fix update sample image path to be relative (#2864)
* Update sample image path to be relative

The example script currently uses a relative path (tests\data\...) to load the sample image. This causes a FileNotFoundError if the script is not executed strictly from the repository root (e.g., when running directly from an IDE or the script's own directory).
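
A minimal sketch of one way to avoid this class of error: resolve the data path from the script's own location instead of the current working directory. The helper name `resolve_from_repo_root` and the `levels_up` parameter are hypothetical, and the file name in the usage comment is a placeholder.

```python
from pathlib import Path


def resolve_from_repo_root(script_path: str, levels_up: int, *parts: str) -> Path:
    """Return a path under the repository root, independent of the working directory.

    `levels_up` is the number of directories between the script's folder and the
    repository root (3 for a script in docs/examples/experimental/).
    """
    repo_root = Path(script_path).resolve().parents[levels_up]
    return repo_root.joinpath(*parts)


# Inside a script one would typically pass __file__, for example:
# image_path = resolve_from_repo_root(__file__, 3, "tests", "data", "sample.png")
```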



Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com>

* Update docs/examples/experimental/process_table_crops.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Peter W. J. Staar <91719829+PeterStaar-IBM@users.noreply.github.com>

* formatter and linter

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

---------

Signed-off-by: Nikhil Singh <124866156+Ritinikhil@users.noreply.github.com>
Signed-off-by: Peter W. J. Staar <91719829+PeterStaar-IBM@users.noreply.github.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Peter W. J. Staar <91719829+PeterStaar-IBM@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
2026-01-09 12:46:48 +01:00
Michele Dolfi aab3ff5d82 feat: Enrichment annotations in the new meta format (#2859)
* produce meta annotations in the new format

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* update examples

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* update test and check for deprecation

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* update last example

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

---------

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2026-01-08 18:28:52 +01:00
Rehan Khan a0530a271e fix: correct type hint for table_structure_options usage (#2823)
* fix(#2785): correct type hint for table_structure_options in PdfPipelineOptions

Signed-off-by: ryyhan <dayel.rehan@gmail.com>

* fix: use Union for table_structure_options and update examples

Signed-off-by: ryyhan <dayel.rehan@gmail.com>

* revert(pipeline_options): use BaseTableStructureOptions as requested

Signed-off-by: ryyhan <dayel.rehan@gmail.com>

---------

Signed-off-by: ryyhan <dayel.rehan@gmail.com>
2026-01-06 08:59:43 +01:00
jspast 2b83fdd0de feat: Add XPU device support for Intel GPUs (#2809)
* feat: Add XPU device support

* feat: Add XPU as supported device on some models

* docs: Add XPU usage to examples

* DCO Remediation Commit for jspast <140563347+jspast@users.noreply.github.com>

I, jspast <140563347+jspast@users.noreply.github.com>, hereby add my Signed-off-by to this commit: f26e8b8c3a
I, jspast <140563347+jspast@users.noreply.github.com>, hereby add my Signed-off-by to this commit: a4a2bf90fa
I, jspast <140563347+jspast@users.noreply.github.com>, hereby add my Signed-off-by to this commit: a2d5dac2e1

Signed-off-by: jspast <140563347+jspast@users.noreply.github.com>

---------

Signed-off-by: jspast <140563347+jspast@users.noreply.github.com>
2026-01-05 17:35:26 +01:00
Michele Dolfi 241d19ed6f feat: add preset for using granite-docling via vllm and other apis (#2792)
add preset for using granite-docling via vllm and other apis

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-12-16 17:22:35 +01:00
Michele Dolfi d03439ccc5 docs(gpu): Add benchmarks of standard pipeline with OCR (#2764)
* add results for standard + OCR and more Windows timings

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* fix runtime selection for py 3.14 in CI

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

---------

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-12-10 20:43:20 +01:00
Christoph Auer 609069d12c fix: Ensure proper image_scale for generated page images in VLM pipelines (#2728)
* fix: Ensure proper image_scale is used for generated page images in layout+vlm pipeline

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* fix: Ensure proper image_scale output in default VLM pipeline

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2025-12-05 13:16:11 +01:00
Maxim Lysak c0b57ae389 chore: Cleaning the example of post_process_ocr_with_vlm (#2693)
Cleaning the example

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>
2025-11-27 12:38:45 +01:00
Maxim Lysak fa21128138 docs: Example on how to apply external OCR as post processing (#2517)
* Example on how to apply OCR to a Docling Document as a post-processing step with "nanonets-ocr2-3b" via LM Studio

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Added support of elements with multiple provenances

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* cleaning up

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* improved prompt for nanonets-ocr2-3b

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* cleaning up

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* excluded example from CI

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* updated class name

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Improved usability of the example, added simple cli, and some helper functions

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Fix api_image_request usage

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Fix pydantic errors

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Improvements and corrections

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Added string sanitization, removing line breaks from remote OCR output while preserving the original text from JSON

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Added quick and reliable detection of empty image crops (elements, table cells, form items); these are not sent to OCR

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* Example respects ocr_documents.txt, tuned empty crop detection

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

* cleaning api_image_request

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>

---------

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
2025-11-27 11:04:40 +01:00
Christoph Auer 134436245a feat(experimental): Add experimental TableCropsLayoutModel (#2669)
* feat: Scaffolding for layout and table model plugin factory

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Add missing files

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Add base options classes for layout and table

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* feat(experimental): Add experimental TableCropsLayoutModel

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Add example

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
2025-11-25 05:14:51 +01:00
Michele Dolfi b75c6461f4 docs: More GPU results and improvements in the example docs (#2674)
* add more results and improve the example docs

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* 5070 windows timing

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* add reference for cpu-only

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

---------

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-11-24 15:26:08 +01:00
Cesar Berrospi Ramis 54e65d9511 chore: update Milvus on examples and references to deprecated method (#2664)
* docs(examples): update the set up of Milvus Lite

Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>

* chore: remove references to deprecated save_as_document_tokens

Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>

---------

Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
2025-11-20 13:22:45 +01:00
Harry Ho b216ad848d docs: Added documentation to use SuryaOCR via plugin docling-surya (#2533)
* docs: Added documentation to use SuryaOCR via plugin `docling-surya`

Signed-off-by: Harry Ho <kho7@student.umgc.edu>

* Add PyPI link for docling-surya package

Added a link to the PyPI page for docling-surya.

Signed-off-by: Harry Ho <4719770+harrykhh@users.noreply.github.com>

* Add licensing note for SuryaOCR integration

Added important licensing note regarding SuryaOCR integration.

Signed-off-by: Harry Ho <4719770+harrykhh@users.noreply.github.com>

* Ran linter to reformat

Signed-off-by: Harry Ho <4719770+harrykhh@users.noreply.github.com>

---------

Signed-off-by: Harry Ho <kho7@student.umgc.edu>
Signed-off-by: Harry Ho <4719770+harrykhh@users.noreply.github.com>
Co-authored-by: Harry Ho <kho7@student.umgc.edu>
2025-11-19 15:27:24 +01:00
Michele Dolfi 8af228f1e2 docs(examples): processing parquet file of images (#2641)
* add example processing parquet file of images

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* vlm using vllm api

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* use openvino and add more docs

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* add default input file

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* change default to standard for running in CI

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* use simple rapidocr without openvino in the CI example

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

---------

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-11-19 06:39:25 +01:00
Cesar Berrospi Ramis f5528623a7 docs(examples): remove deprecation warnings with export_to_dataframe (#2638)
fix: remove deprecation warnings with export_to_dataframe

Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
2025-11-17 12:48:41 +01:00
Peter W. J. Staar 14b436d590 fix: correct the model-repo name (#2624)
* fix: correct the model-repo name

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* updated model-id

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* reformatted code

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

---------

Signed-off-by: Peter Staar <taa@zurich.ibm.com>
2025-11-14 13:21:08 +01:00
Christoph Auer 4852d8b4f2 feat(experimental): Layout + VLM model with layout prompt (#2244)
* adding granite-docling preview

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* updated the model specs

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* Add Layout+VLM pipeline with prompt injection, ApiVlmModel updates

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Update layout injection, move to experimental

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Adjust defaults

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Map Layout+VLM pipeline to GraniteDocling

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Remove base_prompt from layout injection prompt

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Reinstate custom prompt

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* add demo_layout file that produces output with and without layout injection

Signed-off-by: Peter El Hachem <peter.el.hachem@ibm.com>
Signed-off-by: ElHachem02 <peterelhachem02@gmail.com>

* feat: wrap vlm_inference around process_images

Signed-off-by: ElHachem02 <peterelhachem02@gmail.com>

* feat: carry input prompt + number of input tokens

Signed-off-by: ElHachem02 <peterelhachem02@gmail.com>

* fix: adapt example to run on local test file

Signed-off-by: ElHachem02 <peterelhachem02@gmail.com>

* fix: example now expects single document

Signed-off-by: ElHachem02 <peterelhachem02@gmail.com>

* feat: add layout example to EXAMPLES_TO_SKIP

Signed-off-by: ElHachem02 <peterelhachem02@gmail.com>

* feat: address comments on git

Signed-off-by: ElHachem02 <peterelhachem02@gmail.com>

* feat: add inference wrapper for hf_transformers + carry input prompt

Signed-off-by: ElHachem02 <peterelhachem02@gmail.com>

* Feat: add track_input_prompt to ApiVlmOptions, and track input prompt as part of api vlm

Signed-off-by: ElHachem02 <peterelhachem02@gmail.com>

* fix: Ensure backward-compatible build_prompt by adding _internal_page ag

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* fix: Ensure backward-compatible build_prompt by adding _internal_page ag

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Fixes for demo

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Typing fixes

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Restoring lost changes in vllm_model

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

* Restoring vlm_pipeline_api_model example

Signed-off-by: Christoph Auer <cau@zurich.ibm.com>

---------

Signed-off-by: Peter Staar <taa@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Peter El Hachem <peter.el.hachem@ibm.com>
Signed-off-by: ElHachem02 <peterelhachem02@gmail.com>
Co-authored-by: Peter Staar <taa@zurich.ibm.com>
Co-authored-by: ElHachem02 <peterelhachem02@gmail.com>
2025-11-12 13:42:09 +01:00
Michele Dolfi 97aa06bfbc docs: Add details and examples on optimal GPU setup (#2531)
* docs for GPU optimizations

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* improve time reporting and improve execution

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* fix standard pipeline

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* tune examples with batch size 64

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* add benchmark results

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* improve docs

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* typo in excluded tests

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* explicit pipeline in table

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

---------

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
2025-10-30 13:22:05 +01:00
Cesar Berrospi Ramis 9a6fdf936b docs: update opensearch notebook and backend documentation (#2519)
* docs(opensearch): update the example notebook RAG with OpenSearch

Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>

* docs(uspto): remove direct usage of the backend class for conversion

Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>

* docs: remove direct usage of backends from documentation

Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>

---------

Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
2025-10-27 10:02:50 +01:00
Ken Steele 657ce8b01c feat(ASR): MLX Whisper Support for Apple Silicon (#2366)
* add mlx-whisper support

* added mlx-whisper example and test. update docling cli to use MLX automatically if present.

* fix pre-commit checks and added proper type safety

* fixed linter issue

* DCO Remediation Commit for Ken Steele <ksteele@gmail.com>

I, Ken Steele <ksteele@gmail.com>, hereby add my Signed-off-by to this commit: a979a680e1dc2fee8461401335cfb5dda8cfdd98
I, Ken Steele <ksteele@gmail.com>, hereby add my Signed-off-by to this commit: 9827068382ca946fe1387ed83f747ae509fcf229
I, Ken Steele <ksteele@gmail.com>, hereby add my Signed-off-by to this commit: ebbeb45c7dc266260e1fad6bdb54a7041f8aeed4
I, Ken Steele <ksteele@gmail.com>, hereby add my Signed-off-by to this commit: 2f6fd3cf46c8ca0bb98810191578278f1df87aa3

Signed-off-by: Ken Steele <ksteele@gmail.com>

* fix unit tests and code coverage for CI

* DCO Remediation Commit for Ken Steele <ksteele@gmail.com>

I, Ken Steele <ksteele@gmail.com>, hereby add my Signed-off-by to this commit: 5e61bf11139a2133978db2c8d306be6289aed732

Signed-off-by: Ken Steele <ksteele@gmail.com>

* fix CI example test - mlx_whisper_example.py defaults to tests/data/audio/sample_10s.mp3 if no args specified.

Signed-off-by: Ken Steele <ksteele@gmail.com>

* refactor: centralize audio file extensions and MIME types in base_models.py

- Move audio file extensions from CLI hardcoded set to FormatToExtensions[InputFormat.AUDIO]
- Add support for additional audio formats: m4a, aac, ogg, flac, mp4, avi, mov
- Update FormatToMimeType mapping to include MIME types for all audio formats
- Update CLI auto-detection to use centralized FormatToExtensions mapping
- Add comprehensive tests for audio file auto-detection and pipeline selection
- Ensure explicit pipeline choices are not overridden by auto-detection

Fixes issue where only .mp3 and .wav files were processed as audio despite
CLI auto-detection working for all formats. The document converter now
properly recognizes all audio formats through MIME type detection.

Addresses review comments:
- Centralizes audio extensions in base_models.py as suggested
- Maintains existing auto-detection behavior while using centralized data
- Adds proper test coverage for the audio detection functionality

All examples and tests pass with the new centralized approach.
All audio formats (mp3, wav, m4a, aac, ogg, flac, mp4, avi, mov) now work correctly.

Signed-off-by: Ken Steele <ksteele@gmail.com>

* feat: address reviewer feedback - improve CLI auto-detection and add explicit model options

Review feedback addressed:
1. Fix CLI auto-detection to only switch to ASR pipeline when ALL files are audio
   - Previously switched if ANY file was audio, now requires ALL files to be audio
   - Added warning for mixed file types with guidance to use --pipeline asr

2. Add explicit WHISPER_X_MLX and WHISPER_X_NATIVE model options
   - Users can now force specific implementations if desired
   - Auto-selecting models (WHISPER_BASE, etc.) still choose best for hardware
   - Added 12 new explicit model options: _MLX and _NATIVE variants for each size

CLI now supports:
- Auto-selecting: whisper_tiny, whisper_base, etc. (choose best for hardware)
- Explicit MLX: whisper_tiny_mlx, whisper_base_mlx, etc. (force MLX)
- Explicit Native: whisper_tiny_native, whisper_base_native, etc. (force native)
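
The "ALL files are audio" rule can be sketched as below. This is an assumption-laden illustration, not the CLI's code: `choose_pipeline`, `AUDIO_EXTENSIONS`, and the `"standard"` default name are hypothetical, and the extension set simply mirrors the formats listed earlier in this commit.

```python
import logging
from pathlib import Path
from typing import Optional, Sequence

# Assumed extension set, copied from the formats named in the commit message.
AUDIO_EXTENSIONS = {"mp3", "wav", "m4a", "aac", "ogg", "flac", "mp4", "avi", "mov"}


def choose_pipeline(
    paths: Sequence[str],
    explicit: Optional[str] = None,
    default: str = "standard",
) -> str:
    """Pick a pipeline name; switch to ASR only if every input is audio."""
    if explicit is not None:
        # An explicit user choice is never overridden by auto-detection.
        return explicit
    is_audio = [Path(p).suffix.lower().lstrip(".") in AUDIO_EXTENSIONS for p in paths]
    if is_audio and all(is_audio):
        return "asr"
    if any(is_audio):
        logging.warning("Mixed file types detected; use --pipeline asr to force ASR.")
    return default
```

Checking `all(...)` rather than `any(...)` keeps a single stray audio file from redirecting a batch of documents to the ASR pipeline.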

Addresses reviewer comments from @dolfim-ibm

Signed-off-by: Ken Steele <ksteele@gmail.com>

* DCO Remediation Commit for Ken Steele <ksteele@gmail.com>

I, Ken Steele <ksteele@gmail.com>, hereby add my Signed-off-by to this commit: c60e72d2b5
I, Ken Steele <ksteele@gmail.com>, hereby add my Signed-off-by to this commit: 94803317a3
I, Ken Steele <ksteele@gmail.com>, hereby add my Signed-off-by to this commit: 21905e8ace
I, Ken Steele <ksteele@gmail.com>, hereby add my Signed-off-by to this commit: 96c669d155
I, Ken Steele <ksteele@gmail.com>, hereby add my Signed-off-by to this commit: 8371c060ea

Signed-off-by: Ken Steele <ksteele@gmail.com>

* test(asr): add coverage for MLX options, pipeline helpers, and VLM prompts

- tests/test_asr_mlx_whisper.py: verify explicit MLX options (framework, repo ids)
- tests/test_asr_pipeline.py: cover _has_text/_determine_status and backend support with proper InputDocument/NoOpBackend wiring
- tests/test_interfaces.py: add BaseVlmPageModel.formulate_prompt tests (RAW/NONE/CHAT, invalid style), with minimal InlineVlmOptions scaffold

Improves reliability of ASR and VLM components by validating configuration paths and helper logic.

Signed-off-by: Ken Steele <ksteele@gmail.com>

* test(asr): broaden coverage for model selection, pipeline flows, and VLM prompts

- tests/test_asr_mlx_whisper.py
  - Add MLX/native selector coverage across all Whisper sizes
  - Validate repo_id choices under MLX and Native paths
  - Cover fallback path when MPS unavailable and mlx_whisper missing

- tests/test_asr_pipeline.py
  - Relax silent-audio assertion to accept PARTIAL_SUCCESS or SUCCESS
  - Force CPU native path in helper tests to avoid torch in device selection
  - Add language handling tests for native/MLX transcribe
  - Cover native run success (BytesIO) and failure (exception) branches
  - Cover MLX run success/failure branches with mocked transcribe
  - Add init path coverage with artifacts_path

- tests/test_interfaces.py
  - Add focused VLM prompt tests (NONE/CHAT variants)

Result: all tests passing with significantly improved coverage for ASR model selectors, pipeline execution paths, and VLM prompt formulation.

Signed-off-by: Ken Steele <ksteele@gmail.com>

* simplify ASR model settings (no pipeline detection needed)

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* clean up disk space in runners

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

---------

Signed-off-by: Ken Steele <ksteele@gmail.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
2025-10-21 08:05:59 +02:00
Peter W. J. Staar 3e6da2c62d docs: Example on PII obfuscation (#2459)
* added example on PII obfuscation

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* reformatting code

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* add in index and fix heading formatting

Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>

* add GLINER to PII

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

* final commit

Signed-off-by: Peter Staar <taa@zurich.ibm.com>

---------

Signed-off-by: Peter Staar <taa@zurich.ibm.com>
Signed-off-by: Michele Dolfi <dol@zurich.ibm.com>
Co-authored-by: Michele Dolfi <dol@zurich.ibm.com>
2025-10-14 15:39:16 +02:00
Jeremy Chen 90200443bc docs: Remove deprecated call in custom_convert.py (#2447)
Update custom_convert.py

export_to_document_tokens is deprecated, so replace it with export_to_doctags.

Signed-off-by: Jeremy Chen <github@jeremychen.email>
2025-10-13 09:30:02 +02:00