Commit Graph

  • 05af91edc1 Deployed 46763a1 with MkDocs version: 1.6.1 gh-pages 2026-05-17 06:48:58 +00:00
  • 46763a195c docs(cli): clarify help text (#3408) main Jefsky Wong 2026-05-17 14:44:49 +08:00
  • ab6aa050be fix: fail on empty markdown export (#3429) Pragnya Khandelwal 2026-05-17 12:05:46 +05:30
  • 038b9916bc fix: fix OSTL ucel merged incorrectly (#3453) Sunny He 2026-05-16 23:33:30 -07:00
  • bcd550950a fix(service): Improve transport-level connection error handling in client SDK (#3439) Christoph Auer 2026-05-15 11:18:47 +02:00
  • 64686fcb5c update async tests, typing cau/client-sdk-async-facade Christoph Auer 2026-05-15 09:57:46 +02:00
  • eceedc2f40 feat(latex): add optional Tectonic TikZ rendering (#3369) Aditya Sasidhar 2026-05-15 12:33:17 +05:30
  • 61ac4b3d19 refactor: use ConversionStatus type for task_status field (#3438) Phil Nash 2026-05-15 14:41:16 +10:00
  • 2d1dcde869 chore: add shared Python agent skills (#3445) geoHeil 2026-05-15 06:39:16 +02:00
  • efd4c9d10a Add async facade to the client SDK Christoph Auer 2026-05-13 20:50:21 +02:00
  • 3187b7395b Re-establish websocket channel after transient connectivity failures Christoph Auer 2026-05-13 16:40:16 +02:00
  • 1b07ff746e Retry on transport failures Christoph Auer 2026-05-13 16:12:12 +02:00
  • b1ddc25b84 Enable retry on HTTP 502 Christoph Auer 2026-05-13 15:02:37 +02:00
  • 5fadc6d180 feat: add image_placeholder and use_markdown_images as fields in the BaseChunkerOptions (#3436) ibrahimGoumrane 2026-05-13 06:02:06 +01:00
  • 208fe565e2 ci: unify Python version to 3.10 across single-version CI lanes (#3421) geoHeil 2026-05-12 21:37:13 +02:00
  • 38354b7d13 Added support of "row_section" semantics of HTML_backend. Improvements on complex rendering example. dev/complex_html_renderer_example Maksym Lysak 2026-05-12 17:08:27 +02:00
  • 85a2a9c5fd Complex example of html page renderer, with custom logic of table extraction and dynamic resolution based on seed images Maksym Lysak 2026-04-23 17:13:40 +02:00
  • 694cf0c791 fix: Handle valid JATS contributor name variants (#3432) David Wallace 2026-05-12 01:44:54 -07:00
  • b5f2e530e2 feat(extraction): add Granite Vision 4.1 as alternative KVP extraction model (#3398) benvizel 2026-05-12 07:53:22 +03:00
  • 0c317060cf fix(docx): preserve custom numbering text prefix in list markers (#3425) Brighton 2026-05-12 12:49:34 +08:00
  • ce87c1f091 chore: Add Agents.md instructions (#3426) Christoph Auer 2026-05-11 15:14:12 +02:00
  • 3803f96fa2 tighten gRPC URL validation cau/kserve-channel-args Christoph Auer 2026-05-11 14:53:18 +02:00
  • ab3deedaf7 use channel args to make gRPC use round_robin scheduling Christoph Auer 2026-05-11 14:39:37 +02:00
  • 82128f4df6 Improve threaded docling-parse backend integration cau/docling-parse-threaded Christoph Auer 2026-05-11 14:12:46 +02:00
  • d3ac439d1d Merge branch 'main' of github.com:DS4SD/docling into cau/docling-parse-threaded Christoph Auer 2026-05-08 16:14:30 +02:00
  • 64ddeb64b8 fix: Update service client URL parsing with v1 suffix (#3415) Christoph Auer 2026-05-08 15:34:21 +02:00
  • 5b1df788ef ci: tighten pre-commit guardrails (#3346) geoHeil 2026-05-08 15:07:11 +02:00
  • 24af7f6249 docs(security): Add GitHub Private Vulnerability Reporting (#3416) Michele Dolfi 2026-05-08 10:00:29 +02:00
  • 81f6358c85 Update test GT cau/test-gt-update Christoph Auer 2026-05-07 18:24:00 +02:00
  • ed128e2a7b Update lock Christoph Auer 2026-05-07 18:17:30 +02:00
  • aba7f155ae fix(client): Make submit_and_retrieve_many accept lazy iterable and yield (#3405) Christoph Auer 2026-05-07 18:15:26 +02:00
  • eb6e1e6609 fix(html): add redirect validation to image fetching (#3407) Panos Vagenas 2026-05-07 08:12:06 +02:00
  • 2bb0fa67bd fix(html): improve local file path handling (#3400) Panos Vagenas 2026-05-06 16:34:17 +02:00
  • 45c3d2b895 ci: share typecheck deps with PR fast checks (#3406) geoHeil 2026-05-06 15:05:21 +02:00
  • df5fbc3858 docs(readme): improve structure and clarity (#3366) yasqdb 2026-05-06 13:06:32 +01:00
  • 336f942854 feat: add 2 stage model download from hf and call it for threaded layout model. (#3267) Peter El Hachem 2026-05-06 13:29:42 +02:00
  • 6b3322ef85 fix(markdown): flush pending list/heading creation on CodeSpan to prevent RecursionError (#3361) Qiefan Jiang 2026-05-06 19:26:44 +08:00
  • a4d6683d98 ci: run heavy examples only manually (#3392) geoHeil 2026-05-06 10:53:22 +02:00
  • 885873ea36 ci: avoid mutable PR merge refs in fast checks (#3397) geoHeil 2026-05-06 10:33:16 +02:00
  • fdca54caf7 ci: clarify Codecov coverage reporting (#3389) geoHeil 2026-05-06 10:00:00 +02:00
  • 61c37a23a9 chore: bump version to 2.93.0 [skip ci] v2.93.0 github-actions[bot] 2026-05-05 19:53:32 +00:00
  • e00735dd59 fix(docx): fix OMML equation handling and improve type safety (#3381) Cesar Berrospi Ramis 2026-05-04 10:58:25 +02:00
  • 24f2d148d9 feat(vlm): upgrade Granite Vision model to 4.1 for table + chart extraction (#3382) EliSchwartz 2026-05-04 09:36:08 +03:00
  • eb4724ee4c ci: prototype tach-based modular skipping (#3333) geoHeil 2026-04-30 14:15:35 +02:00
  • 05e0a4daa4 ci: add stable required status checks (#3387) geoHeil 2026-04-30 12:13:41 +02:00
  • 4cb6334b04 ci: prepare editable docling-parse for docs fix/docs-ci-docling-parse-source Georg Heiler 2026-04-29 18:37:24 +02:00
  • 847c7256a0 Merge branch 'main' of github.com:DS4SD/docling into cau/docling-parse-threaded Christoph Auer 2026-04-29 17:31:56 +02:00
  • e51fc23a79 Add threaded parse to CLI, make RGB images Christoph Auer 2026-04-29 16:27:03 +02:00
  • 0c85938e12 ci: diff PR fast checks against merge ref (#3383) geoHeil 2026-04-29 15:17:51 +02:00
  • 41e9fa7886 ci: implement phase 1 path-based workflow skipping (#3332) geoHeil 2026-04-29 10:55:27 +02:00
  • 80f81b2799 chore: bump version to 2.92.0 [skip ci] v2.92.0 github-actions[bot] 2026-04-29 07:38:26 +00:00
  • 72942486ff fix(pptx): skip malformed picture shapes instead of aborting conversion (#3372) pateltejas 2026-04-29 01:29:08 -05:00
  • 2be2c38be9 chore: refactor release script to use Python regex for dependency updates (#3379) Michele Dolfi 2026-04-29 08:28:28 +02:00
  • cb2fe9de6d Adjust tests and thread count source Christoph Auer 2026-04-28 22:11:00 +02:00
  • c96078d215 feat: add threaded PDF backend consuming DoclingThreadedPdfParser Christoph Auer 2026-04-28 21:49:21 +02:00
  • 8c26f6021f chore: fix docling extras (#3376) Michele Dolfi 2026-04-28 19:42:12 +02:00
  • 2b27739e57 refactor: allow import of docling datamodel without transformers (#3375) Michele Dolfi 2026-04-28 17:21:57 +02:00
  • 8b67fae687 feat: Extend the kserve-triton OCR model to have multi-lingual support (#3368) Nikos Livathinos 2026-04-28 16:00:57 +02:00
  • 3df80e7f46 fix(docx): OMML conversion failures for unsupported limit functions (#3359) Cesar Berrospi Ramis 2026-04-28 14:43:24 +02:00
  • c455a65e36 feat(docx): add checkbox parsing support (#3349) Cesar Berrospi Ramis 2026-04-28 14:38:43 +02:00
  • 987bb0e585 refactor: fix mutable default arguments in backend __init__ methods (#3354) Cesar Berrospi Ramis 2026-04-28 10:58:54 +02:00
  • f2c03edb30 fix(html): preserve fragment-only anchor links during path resolution (#3262) Aatrey Sahay 2026-04-28 04:28:23 -04:00
  • ed32c5e993 feat: Introduce modular docling-slim package (#3285) Michele Dolfi 2026-04-24 15:14:57 +02:00
  • 3744331bd1 adding script to monitor memory consumption dev/investigate-memory-footprint-of-docling-parse-backend Peter Staar 2026-04-24 10:26:41 +02:00
  • a6a37ca895 fix: Make VLLM model_impl configurable (#3358) Christoph Auer 2026-04-24 09:53:23 +02:00
  • 0f6f8d0bcd feat: Add ResponseFormat.DOCLANG and parsing branch in VLM pipeline (#3350) Christoph Auer 2026-04-24 08:36:30 +02:00
  • 188b6a192c chore: bump version to 2.91.0 [skip ci] v2.91.0 github-actions[bot] 2026-04-23 09:29:36 +00:00
  • c1dbac22c7 fix: strengthen input validation for METS‑GBS processing (#3336) Cesar Berrospi Ramis 2026-04-23 10:17:39 +02:00
  • 5e161ac185 fix: EasyOCR model downloading (#3339) Nikos Livathinos 2026-04-23 09:27:54 +02:00
  • c190ba2636 fix(vlm): Remove bogus preamble from VLM chat template (#3351) Christoph Auer 2026-04-22 21:42:38 +02:00
  • cd0cb69530 fix(html): refine image URL and size handling (#3348) Panos Vagenas 2026-04-22 17:34:01 +02:00
  • 9813190ab4 fix: Fixes to html_backend (#3342) Maxim Lysak 2026-04-22 12:38:07 +02:00
  • d6e0f881bf chore: breaking release guards (#3347) Michele Dolfi 2026-04-22 12:28:35 +02:00
  • 2ddaa3be97 feat(docx): extract VML images with v:imagedata elements (#3343) Cesar Berrospi Ramis 2026-04-22 08:46:36 +02:00
  • 3a3c8f68dd fix(pptx)!: assign pptx notes to ContentLayer.NOTES (#3341) v2.90.1 Matvei Smirnov 2026-04-21 19:35:43 +03:00
  • 65ef18075f fix: prevent path traversal in LaTeX macro handlers (#3330) Cesar Berrospi Ramis 2026-04-21 17:28:21 +02:00
  • 09de7f99df docs(uspto): improve documentation of USPTO XML parser security config (#3338) Cesar Berrospi Ramis 2026-04-21 16:38:09 +02:00
  • 075fa69491 fix(service): Add explicit usage exceeded exception handling (#3325) Christoph Auer 2026-04-17 16:57:50 +02:00
  • d5bff7155c chore: bump version to 2.90.0 [skip ci] v2.90.0 github-actions[bot] 2026-04-17 11:56:33 +00:00
  • 1569e42f84 feat: implement GraniteVisionTableStructureModel for VLM-based table extraction (#3323) EliSchwartz 2026-04-17 12:02:20 +03:00
  • 101233ebe2 fix(latex): fully unwrap deeply nested formatting macros (#3249) Smeet Agrawal 2026-04-17 12:51:44 +05:30
  • c7615123e6 fix(docx): handle inline formulas in list items (#3304) Cesar Berrospi Ramis 2026-04-17 07:33:20 +02:00
  • 3bab6b4d38 fix(format): add MD fallback for .txt files in _guess_from_content (#3311) Yarizakura 2026-04-17 13:29:20 +08:00
  • 827489275e fix: strip soft hyphen when joining merged text elements (#3232) Frank-Schruefer 2026-04-17 07:15:41 +02:00
  • 043ed2dd3d fix(pptx): handle NotImplementedError from shape.shape_type (#3309) pateltejas 2026-04-16 23:59:48 -05:00
  • 8ec14f2c6f docs: fix nanonets_ocr2 runtime support matrix (#3317) geoHeil 2026-04-17 06:24:53 +02:00
  • fa334aeb46 chore: bump version to 2.89.0 [skip ci] v2.89.0 github-actions[bot] 2026-04-16 08:08:36 +00:00
  • 251c8b217a fix(ocr): align RapidOCR english assets with 3.8 mobile models (#3291) geoHeil 2026-04-15 12:16:41 +02:00
  • 740c386730 fix(docx): isolate list state in table cells (#3294) Cesar Berrospi Ramis 2026-04-15 09:51:37 +02:00
  • 5b84911a4c fix(pipeline): prevent cache miss due to pipeline options mutation during chart extraction (#3300) Hemanth Battu 2026-04-15 00:39:56 -07:00
  • a64c3784d0 perf(markdown): avoid eager string formatting in Markdown backend debug logs (#3301) Qiefan Jiang 2026-04-15 15:24:23 +08:00
  • a15c16e19f feat: explicit TikZ environment handling in LaTeX backend (#3187) Rayabharapu Trinai 2026-04-15 09:17:20 +05:30
  • cd2e5b633d docs: add indexed picture placeholder example to serialization notebook (#3293) nuri 2026-04-15 12:45:39 +09:00
  • f5fa294e17 chore(readme): fix broken Apify badge (typo) (#3296) Said Gürbüz 2026-04-14 16:28:41 +02:00
  • e04e602fc8 chore: bump version to 2.88.0 [skip ci] v2.88.0 github-actions[bot] 2026-04-13 14:05:27 +00:00
  • c23622f6f5 docs: add agent skill bundle for coding assistants (SKILL.md, pipelines, convert/evaluate) (#3174) Jehlum Pandit 2026-04-13 09:02:51 -04:00
  • 42157a3e10 feat(service): Establish client SDK for docling serve (#3264) Christoph Auer 2026-04-13 14:54:06 +02:00
  • 6b257ece33 fix(ocr): support rapidocr 3.8 mobile model naming (#3277) geoHeil 2026-04-13 11:44:33 +02:00
  • 60fc517af0 chore: Condensing the latex test backend into multiple files (#3281) Aditya Sasidhar 2026-04-13 13:34:22 +05:30
  • 2446f5c41b chore: bump version to 2.87.0 [skip ci] v2.87.0 github-actions[bot] 2026-04-13 07:37:13 +00:00