Commit Graph

  • 2ab5d1250f chore: bump version to 1.4.0 [skip ci] main v1.4.0 github-actions[bot] 2026-05-15 15:46:17 +00:00
  • 6efbb3f214 feat: Improved region extraction and filtering for doclingsdg_builder.py (#217) Maxim Lysak 2026-05-15 17:40:59 +02:00
  • 4ee7c51d88 fix(pixel-layout): bound in-flight futures to prevent OOM on large datasets sami/fix-pixel-layout-oom samiuc 2026-05-12 12:47:32 -07:00
  • 184ca920b6 fix(dpbench): normalize coordinates for upstream breaking change sami/fix-dpbench-breaking-change samiuc 2026-05-12 10:37:19 -07:00
  • 102bb119ea chore: bump version to 1.3.0 [skip ci] v1.3.0 github-actions[bot] 2026-05-11 16:07:41 +00:00
  • c1745a06f9 feat: dummy release trigger (#214) Christoph Auer 2026-05-11 18:03:42 +02:00
  • 3d78029bbe Merge branch 'main' into sami/update-omnidoc-builder sami/update-omnidoc-builder samiuc 2026-04-27 14:05:33 -07:00
  • e581570abf - Added modality selection to CLI when generating dataset (#213) Maxim Lysak 2026-04-27 09:36:16 +02:00
  • 97d356100f chore: bump version to 1.2.0 [skip ci] v1.2.0 github-actions[bot] 2026-04-24 12:52:25 +00:00
  • 356c8df4d6 feat: CVAT submission delivery improvements (#211) Christoph Auer 2026-04-24 12:34:49 +02:00
  • 1abca34468 Make vlm extra, don't depend on docling[vlm] by default (#212) Christoph Auer 2026-04-24 10:03:16 +02:00
  • 9a792033e9 Integrating doclingsdg_builder.py with DatasetRecordWithBBox instead of DatasetRecord (#210) Maxim Lysak 2026-04-21 09:56:15 +02:00
  • 07e57b57a7 chore: bump version to 1.1.1 [skip ci] v1.1.1 github-actions[bot] 2026-04-14 15:25:51 +00:00
  • cb0072967a fix: bump pillow version (#209) Peter W. J. Staar 2026-04-14 17:23:21 +02:00
  • d50fbb9d62 chore: bump version to 1.1.0 [skip ci] v1.1.0 github-actions[bot] 2026-04-13 16:21:07 +00:00
  • 412c43aa96 feat: Dev/add datasetrecord (#207) Peter W. J. Staar 2026-04-13 17:54:58 +02:00
  • 9ff1dff7c3 ci(mergify): upgrade configuration to current format mergify/configuration-deprecated-update mergify[bot] 2026-04-13 15:51:32 +00:00
  • 885a64479b Merge branch 'main' into sami/update-omnidoc-builder samiuc 2026-04-01 22:47:53 -07:00
  • 5c9f3fadf2 feat: flat-layout CVAT campaign tools and resilient shard writing (#206) Christoph Auer 2026-03-31 13:27:23 +02:00
  • 55fd3eb604 fix: PIL Image Memory Leaks in Dataset Builders (#194) samiuc 2026-03-31 03:44:45 -07:00
  • e761bcc9dd feat: New dataset builder - DoclingSDGDatasetBuilder (#205) Maxim Lysak 2026-03-30 17:31:40 +02:00
  • c2657b3f1b Merge branch 'main' into sami/update-omnidoc-builder samiuc 2026-03-19 12:21:25 -07:00
  • 2d9994c8a3 fix import samiuc 2026-03-19 12:20:02 -07:00
  • 8f7d3c882f Update pyproject and lock cau/cleanup-and-fixes Christoph Auer 2026-03-16 11:11:28 +01:00
  • 685d0090ae Update pyproject and lock Christoph Auer 2026-03-16 10:48:10 +01:00
  • 3a71eb4a84 chore: bump version to 1.0.1 [skip ci] v1.0.1 github-actions[bot] 2026-03-11 16:38:43 +00:00
  • 901814d35a fix: remove hard pinning of docling-parse (#203) Peter W. J. Staar 2026-03-11 17:18:34 +01:00
  • 7a8fedf120 chore: bump version to 1.0.0 [skip ci] v1.0.0 github-actions[bot] 2026-03-11 15:42:34 +00:00
  • 945c38ac64 chore: add datasets v5.0.0 (#202) Peter W. J. Staar 2026-03-11 16:35:47 +01:00
  • 6328c7a957 Introduce support in CvatPreannotationBuilder for even/odd starting page in multi-page setups cau/multi-page-cvat Christoph Auer 2026-02-11 10:33:24 +01:00
  • 8b80d7173c Restrict docling-parse to <5 cau/bugfixes Christoph Auer 2026-02-10 11:29:56 +01:00
  • efde25f631 Merge branch 'main' of github.com:DS4SD/docling-eval into cau/bugfixes Christoph Auer 2026-02-10 11:28:34 +01:00
  • 23729cbd88 Small bugfixes and deps updates Christoph Auer 2026-02-10 11:26:17 +01:00
  • a7e74a3b36 fix: correct import path for TableStructureModel (#199) samiuc 2026-02-10 02:06:35 -08:00
  • c9ad5621a6 fix: build error and import tableformer provider optionally sami/add-fix-np1 samiuc 2026-01-21 15:57:39 -08:00
  • 457e24871f fix: build error and import tableformer provider conditionally samiuc 2026-01-21 15:45:52 -08:00
  • 10251dc017 feat: update OmniDocBench with Parquet support and add test samiuc 2026-01-20 22:36:46 -08:00
  • a764b6a4b5 fix: replace np.bitwise_count with custom _popcount function for compatibility samiuc 2026-01-20 20:07:16 -08:00
  • 3ce7591872 fix: Fix the reporting of doc_id, true_md, pred_md in markdown_text_evaluator.py (#196) Nikos Livathinos 2026-01-12 15:33:54 +01:00
  • 9c1a2be221 refactor: factor out cvat_tools cleanly and update with optional dependency to new package (#195) Christoph Auer 2026-01-08 12:12:53 +01:00
  • 9d04a56b93 feat: Parallelize the evaluation of tables and cache the loading of external predictions (#190) Nikos Livathinos 2025-12-19 12:51:40 +01:00
  • 8a10188177 feat: Regression tests for CVAT to Docling conversion (#193) Christoph Auer 2025-12-18 16:49:07 +01:00
  • db068e9d88 feat: CVAT box rotation support, structural cleanup (#191) Christoph Auer 2025-12-18 14:46:17 +01:00
  • a850784b4f feat: Improvements in user experience: Performance, error handling, logging (#189) Nikos Livathinos 2025-12-16 11:25:55 +01:00
  • 8a2ba74be8 fix: Make CVAT pipeline resilient to single document crashes, report failures at the end cau/cvat-pipeline-resilience Christoph Auer 2025-12-15 14:11:10 +01:00
  • bcc5200f74 chore: Fix pyproject.toml. Introduce import guards for optional dependencies (#187) Nikos Livathinos 2025-12-12 11:03:13 +01:00
  • cb28df734e docs: Improve multi_label_pixel_layout_evaluations.md. More TODOs nli/doc_pixel_eval Nikos Livathinos 2025-12-10 17:32:17 +01:00
  • 7bf0bb031a docs: First version of the multi_label_pixel_layout_evaluations.md Nikos Livathinos 2025-12-10 16:49:34 +01:00
  • 2581eca8b2 docs: Documentation for the Multi-label pixel evaluation: Computation of the confusion matrix and derivatives Nikos Livathinos 2025-12-10 12:47:30 +01:00
  • ea40f8fe80 docs: Objectives, confusion matrix for the multi-label pixel evaluations Nikos Livathinos 2025-12-09 17:27:23 +01:00
  • 373f959633 feat: Visualizer tool and command for datasets (#186) Christoph Auer 2025-12-09 14:47:43 +01:00
  • 53dbd955ae feat: Extend the evaluators to support external predictions stored in files (#185) Nikos Livathinos 2025-12-08 16:51:45 +01:00
  • 15888fd25c feat: convert Docling JSON inputs to image streams in FileDatasetBuilder (#184) Christoph Auer 2025-12-05 13:36:07 +01:00
  • ebb8800641 feat: Allow subset to split routing in CVAT to HF exporter (#182) Christoph Auer 2025-12-05 11:37:25 +01:00
  • 4314091abf fix: PixelLayoutEvaluator: Set all-pixels background in case of a missing prediction and evaluate (#183) Nikos Livathinos 2025-12-04 17:20:42 +01:00
  • 8254a6d9a8 chore: update docling pin (#181) Panos Vagenas 2025-12-03 16:31:22 +01:00
  • b55b2ea40d feat: ingest CVAT assets and filter submissions (#180) Christoph Auer 2025-12-03 10:11:32 +01:00
  • 7df0f6b867 chore: Remove docling-core from uv sources (#179) Christoph Auer 2025-12-02 21:17:09 +01:00
  • 9b6df83aea fix: fix empty prediction handling in markdown evaluator (#177) Panos Vagenas 2025-12-02 18:49:08 +01:00
  • 5084a4d675 feat: Runtime optimizations for MultiLabelConfusionMatrix (#175) Nikos Livathinos 2025-12-02 10:15:52 +01:00
  • 8f33420d6a feat: Add more fine-grained control in the DoclingEvalCOCOExporter (#149) Nikos Livathinos 2025-12-02 10:10:48 +01:00
  • 693c22445f feat: Remove legacy CvatDatasetBuilder code, use modernized code (#174) Christoph Auer 2025-11-28 11:16:57 +01:00
  • a79bac5d02 feat: Introduce the PixelLayoutEvaluator to produce confusion matrices for the multi-label layout analysis (#173) Nikos Livathinos 2025-11-20 10:43:53 +01:00
  • 21341ce1be feat: Review-bundle builder, fixes for GraphCell with merged elements and more (#172) Christoph Auer 2025-11-13 14:29:14 +01:00
  • 8fb3a169f6 fix: consistency and perf improvements (#171) Christoph Auer 2025-11-10 11:48:32 +01:00
  • 2fdd1c7803 Merge branch 'main' into fix-rate-limiting-issues fix-rate-limiting-issues samiuc 2025-11-07 14:39:54 -08:00
  • e92a17ae05 chore: bump version to 0.10.0 [skip ci] v0.10.0 github-actions[bot] 2025-11-05 18:26:09 +00:00
  • d4a0ef619e perf: consistency and perf improvements (#170) Christoph Auer 2025-10-24 14:42:59 +02:00
  • 74e7b3e7de fix: Validation fixes for list item impurity check (#169) Christoph Auer 2025-10-24 09:44:18 +02:00
  • 8be2e8399b feat: Extend the CLI for create-eval to receive the vlm-options and max_new_tokens parameters when the provider is GraniteDocling (#164) Nikos Livathinos 2025-10-22 14:47:35 +02:00
  • 740157dba3 feat: Harmonizing pic classes for cvat to docling conversion (#167) Maxim Lysak 2025-10-17 16:10:34 +02:00
  • cb71009009 fix: Don't report content-layer group violation multiple times Christoph Auer 2025-10-15 16:15:12 +02:00
  • c10fdfd8c4 fix: Handle merged elements regarding inclusion, don't flag single element pages Christoph Auer 2025-10-15 14:47:10 +02:00
  • 5e5f2dbb36 feat: Add more specific validation for reading-order, enhance validation report Christoph Auer 2025-10-15 11:42:08 +02:00
  • 1eb6b4ea76 fix: Missing transform to storage_scale for some items and table cells Christoph Auer 2025-10-15 10:48:58 +02:00
  • 6f59c7a8af fix: More CVAT validation and docling conversion fixes (#163) Christoph Auer 2025-10-15 09:59:46 +02:00
  • ef17b5a30b fix: Better control over scaling in CVAT transform, fixes for OCR (#162) Christoph Auer 2025-10-14 15:54:16 +02:00
  • 06b71b76f8 add the option for orphan cells (#158) Peter W. J. Staar 2025-10-13 16:47:39 +02:00
  • 80e449de7f fix: Fixes for CVAT validation, OCR in CVAT pipeline, logging, and more (#161) Christoph Auer 2025-10-13 15:11:07 +02:00
  • 076958d9cd Merge branch 'main' of github.com:DS4SD/docling-eval into cau/cvat-folder-conversion cau/cvat-folder-conversion Christoph Auer 2025-10-10 10:47:13 +02:00
  • a65daafe9b Merge branch 'main' into fix-rate-limiting-issues samiuc 2025-10-08 09:00:11 -07:00
  • 91b9cad257 Fixes for CVAT deliveries pipeline Christoph Auer 2025-10-08 14:40:19 +02:00
  • d6f83b931f Add CVAT deliveries pipeline Christoph Auer 2025-10-08 11:24:17 +02:00
  • 023cae0050 Upgrade cvat_evaluation_pipeline to work with folder-mode and full docs Christoph Auer 2025-10-07 13:11:10 +02:00
  • 6144796b1a Outfit cvat_evaluation_pipeline for fulldocs/folder mode (WIP) Christoph Auer 2025-10-06 17:19:51 +02:00
  • c493f3a980 Cleanup and redundancy removal on cvat_tools and campaign codes Christoph Auer 2025-10-06 16:11:47 +02:00
  • 8ce6f0fd88 Fix for multipage docs. Christoph Auer 2025-10-06 12:21:51 +02:00
  • fb87ef7076 Add capability to convert full CVAT folder to DoclingDocuments Christoph Auer 2025-10-06 11:24:54 +02:00
  • 9eafe9560c Fixes for CVAT deliveries pipeline Christoph Auer 2025-10-08 14:40:19 +02:00
  • 1c5c6e47e3 Add CVAT deliveries pipeline Christoph Auer 2025-10-08 11:24:17 +02:00
  • 3d65896a14 fix type errors samiuc 2025-10-07 18:38:03 -07:00
  • 8f9e7e3047 Merge branch 'main' into fix-rate-limiting-issues samiuc 2025-10-07 18:25:31 -07:00
  • 3a9543c865 feat: integrate textline_cells based OCR evaluation (#156) samiuc 2025-10-07 13:04:38 -07:00
  • b0c773a6ea Upgrade cvat_evaluation_pipeline to work with folder-mode and full docs Christoph Auer 2025-10-07 13:11:10 +02:00
  • 65803d1811 feat: implement retry mechanism for dataset downloads to handle rate limits samiuc 2025-10-06 12:00:37 -07:00
  • 58370d8dfa Outfit cvat_evaluation_pipeline for fulldocs/folder mode (WIP) Christoph Auer 2025-10-06 17:19:51 +02:00
  • 7ded46fda4 Cleanup and redundancy removal on cvat_tools and campaign codes Christoph Auer 2025-10-06 16:11:47 +02:00
  • 08b7dbe9d9 Fix for multipage docs. Christoph Auer 2025-10-06 12:21:51 +02:00
  • 53eaf91268 Add capability to convert full CVAT folder to DoclingDocuments Christoph Auer 2025-10-06 11:24:54 +02:00
  • 90cc730c0b chore: bump version to 0.9.0 [skip ci] v0.9.0 github-actions[bot] 2025-10-01 03:42:58 +00:00