* chore: Move the teds.py inside the subdir evaluators/table
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Introduce the external_predictions_path in BaseEvaluator and dummy entries in all evaluators.
Extend the CLI to support the --external-predictions-path
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Extend test_dataset_builder.py to save document predictions in various formats
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Extend MarkdownTextEvaluator to support external_predictions_path. Add unit test
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Extend LayoutEvaluator to support external_predictions_path. Add unit test.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Add missing pytest dependencies in tests
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Fix loading the external predictions in LayoutEvaluator
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Introduce external predictions in DocStructureEvaluator. Add unit test.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Extend the TableEvaluator to support external predictions. Add unit test
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Extend the KeyValueEvaluator to support external predictions. Add unit test.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Extend the PixelLayoutEvaluator to support external predictions. Add unit test
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Extend the BboxTextEvaluator to support external predictions. Add unit test
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Disable the OCREvaluator when using the external predictions
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Fixing guard for external predictions in TimingsEvaluator, ReadingOrderEvaluator. Fix main
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Export the doctag files with the correct file extension
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Refactor the ExternalDoclingDocumentLoader to properly load a DoclingDocument from doctags and
the GT image.
- Introduce the staticmethod load_doctags() which covers all cases on page image loading.
- Refactor the FilePredictionProvider to use the load_doctags() from ExternalDoclingDocumentLoader.
- Refactor all evaluators to use the new ExternalDoclingDocumentLoader.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Rename code file as external_docling_document_loader.py
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Fix typo
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Introduce examples of how to evaluate with external predictions using the API and the CLI.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Prediction visualizer
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update docling_eval/utils/external_predictions_visualizer.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
* feat: Update examples bash script to demonstrate visualisations on external predictions
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
---------
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Christoph Auer <60343111+cau-git@users.noreply.github.com>
Co-authored-by: Nikos Livathinos <nli@zurich.ibm.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* feat: Extend the consolidator to also produce LaTeX files. Fix the MultiEvaluator to accept loading
json evaluations with the old format.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Style the generated LaTeX code to have & symbols vertically aligned
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
---------
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* Misc fixes
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Make DatasetRecord tolerant to old parquet files
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Make DatasetRecord tolerant to old parquet files (2)
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fix docvqa test, more cleanup
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Important fixes for layout mAP computation
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Adding modes for missing_prediction_strategy and label_filtering_strategy
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fixes for mismatched docs
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add F1 no_picture metrics to layout evaluator
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fixed commands on all READMEs
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Remove extract_images ambiguity, use utility and fix errors on visualizer
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Upgrade to latest docling_core
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fix ocrmac dep, upgrade uv.lock
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fix for tableformer provider
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Remove code redundancy
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* feat: Extend the FileProvider and the CLI to accept parameters that control the source of the
prediction images. This is used in case of DocTags:
- By default the images from GT will be used.
- The user can provide an external image path.
- Add a documentation example of how to evaluate doctag files.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: FileProvider. Fix loading of the image file.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
---------
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: MultiEvaluator fix minor logging issue
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Improve code comments
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Refactor the MultiEvaluator to allow arbitrary experiment names for the benchmark subdirs.
- If there is no eval dataset, the experiment name must match a provider's name, and that provider
is used to run the predictions.
- If there is an eval dataset, the experiment name is just a tag and the information about the
prediction provider is extracted from the corresponding column of the parquet.
- If there is no eval dataset and the experiment name does not match any prediction provider,
an exception is raised.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
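The three experiment-name rules above can be sketched roughly as follows. This is only an illustration, not the MultiEvaluator code: `resolve_provider`, `known_providers` and `dataset_provider` are hypothetical names, and `dataset_provider` stands for the provider value read from the parquet of an existing eval dataset (None when no eval dataset exists).

```python
from typing import Optional, Set


def resolve_provider(
    experiment: str,
    known_providers: Set[str],
    dataset_provider: Optional[str] = None,
) -> str:
    """Return the provider name to use for one experiment subdir.

    Hypothetical helper: all names are assumptions made for illustration.
    """
    if dataset_provider is not None:
        # An eval dataset exists: the experiment name is only a tag and
        # the provider comes from the dataset itself.
        return dataset_provider
    if experiment in known_providers:
        # No eval dataset: the experiment name selects the provider that
        # will run the predictions.
        return experiment
    # No eval dataset and no matching provider: nothing can be run.
    raise ValueError(
        f"No eval dataset and '{experiment}' matches no prediction provider"
    )
```

With an eval dataset present, any tag such as "my_run" resolves to the recorded provider. Without one, only names found in `known_providers` are accepted.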
* fix: MultiEvaluator: rename the GT_LEAF_DIR and introduce the EVALUATIONS_DIR so that the dir structure
created/used by MultiEvaluator is the same as the one created by the CLI
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Change the pipeline settings of Docling to use 16 CPU threads.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: MultiEvaluator improve logging
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Fix the MultiEvaluator.load_multi_evaluation() to properly scan the multi evaluation dir structure
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Ensure to use all CPU cores for the DoclingPredictionProvider
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
---------
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* added the area-level precision, recall and f1
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* WIP: adding timing modality
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* updated the code with timings
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the timings modality
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* reformatted the code
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fixed the test
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* ran the test_run_dpbench_tables with success
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* commented out test_run_dpbench_tables
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* reformatted code
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* found potential bug in base_prediction_provider
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* found potential bug in base_prediction_provider (2)
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fixed the timings in base-predictor
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* removed prints and added logging-level for matplotlib
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* found bug in stats
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* updated the logging
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
---------
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* propagated the CVAT parameters in the cli and updated the documentation
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fixed the formatting
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fix the export in PDF_Docling
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fixed the PDF_Docling to parquet
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fixed the visualisations and leveraged the new docling-core visualization capability
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* cleaned up the visualisation code
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* cleaned up the code and reformatted
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
---------
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* feat: Introduce the pred_modalities parameter in the BasePredictionProvider and its implementations
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Refactor main:get_prediction_provider() to add parameter that controls the visualizations.
Refactor the evaluate() to return the DatasetEvaluation as object.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Introduce the MultiEvaluator that can generate ground truth and prediction datasets and also
compute the evaluation across multiple providers and modalities. Add unit test.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Update toml dependencies to include pandas, openpyxl
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Introduce staticmethod MultiEvaluator.load_multi_evaluation() to load multi-evaluations from
the disk. Update unit tests.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Allow PENDING in the _accepted_status of BaseEvaluator
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Modifications in the unit test of MultiEvaluator. Code clean up.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Introduce the Consolidator class that collects evaluation results and generates one excel report
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Improve the header names of excel export
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Extend the DatasetLayoutEvaluation with DatasetStatistics for all metrics
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Extend the Consolidator to include the standard deviation for each metric
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Update the pyproject.toml to pin to the docling branch that supports the RT-DETRv2 model
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Extend the DatasetEvaluation to contain the evaluated and rejected samples. The rejected ones
are itemized per rejection type.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Extend the Consolidator to include the samples (evaluated, rejected) in the generated excel
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Refactor the directory structure for MultiEvaluator and Consolidator classes.
- Refactor the generated excel matrix to include the experiment and provider columns.
- Refactor the BasePredictionProvider and all providers to have class attributes for the
prediction_provider_type and prediction_modalities.
- Introduce CLI in the examples for the generation of the consolidation matrix.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Remove ConversionStatus.PENDING accepted status from BaseEvaluator
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: MultiEvaluator fix the load_multi_evaluation()
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Add the class attributes for supported modalities in all prediction providers
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Fix code typos
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* Address predictor_info TODOs
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* fix: Add the test_multi_evaluator as a pytest dependency for test_consolidator
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* Repin to docling release
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Regenerate lock file
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* fix: Use defaultdict for rejected_samples
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
* Add README for Docling-DPBench
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add FileDatasetBuilder
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add test and fixes
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add FileDatasetBuilder to CLI
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update PubTabNet_benchmarks.md
With the default `--split test`, the create-gt method throws an exception `Error creating dataset builder: Unknown split "test". Should be one of ['train', 'val'].`. Indeed, https://huggingface.co/datasets/ds4sd/PubTabNet_OTSL only has train and val splits. Given this, I believe using the `val` split is more suitable for this dataset.
Signed-off-by: laurachiticariu <chiti@us.ibm.com>
* Small typo
Signed-off-by: laurachiticariu <chiti@us.ibm.com>
* Update PubTabNet_benchmarks.md
Signed-off-by: laurachiticariu <chiti@us.ibm.com>
---------
Signed-off-by: laurachiticariu <chiti@us.ibm.com>
* Add README for Docling-DPBench
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Restructured CVAT builder (WIP)
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* CVAT preannotation and dataset builders, with test cases
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add CLI, merge from main
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update README for CVAT
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add artifacts path option to CLI, several fixes
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Remove raise
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* correct mpy
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* reformatting
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* adding the script to make an initial dataset from PDFs
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* before switching to specific docling-core branch
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* rebased on kv-items and updated the create script in CVAT
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fixed the cvat
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the annotation description on CVAT
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the annotation description on CVAT (2)
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the annotation description on CVAT (3)
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* [WIP] Crafting new dataset builder and prediction provider API
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Restructure to docling_eval_next
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fix mypy
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fix f-strings
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Changes for prediction_provider interface, to support all cases.
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add omnidocbench DatasetBuilder
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add doclaynet v1, funsd
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fixes
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add XFUND, more fixes
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* update the kv cell creation to prevent false positives
Signed-off-by: Saidgurbuz <said.gurbuz@epfl.ch>
* chore: Fixing imports
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Update docling-core version
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Introduce new design for Evaluators based on BaseEvaluator that accept external predictions.
And utility adapters.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* Factor PredictionProvider out of dataset builder, many fixes on DatasetRecord
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Sketch example for file-directory prediction provider
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* chore: Fix typing hints
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Update poetry to docling-core 2.24.0
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: WIP: Introduce the FilePredictionProvider that reads files with predictions from the disk
- It currently supports doctags, markdown, json, yaml formats.
- We still need to improve the returned type so that it allows for no DoclingDocument but only for
the source data (e.g. in case of markdown).
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* Add DocLayNetV2DatasetBuilder
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Added TableDatasetBuilder and test, update TableFormerPredictionProvider
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* chore: Update MyPy configuration in toml
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Refactor the BasePredictionProvider.predict() to return DatasetRecordWithPrediction
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* Fixes
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* fix: Fix the FilePredictionProvider. Return None in the predicted document in case of Markdown.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Remove the kwargs from all PredictionProvider classes and introduce provider-specific
initialization arguments
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Introduce the parameter "ignore_missing_files" in FilePredictionProvider
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* Add do_visualization to PredictionProvider
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Move next-gen API to main source tree, re-organize module paths
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fixes
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Cleanup, change path handling
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Cleanup, change path handling
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* More module removal and renaming
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Small test fixes
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* fix: Add the "prediction_format" in the serialization of DatasetRecordWithPrediction
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Refactor the MarkdownTextEvaluator to support the new classes design. Add unit test.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Improve the new design of MarkdownEvaluator to move common functionalities into the base class
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Refactor the LayoutEvaluator to use the new class design. Add unit test.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Clean up LayoutEvaluator code
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Implementation cleanup and fixes for new class design (#52)
* More module removal and renaming
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Small test fixes
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Small test fixes
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Cleanup of tests and more fixes
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add visualization for tables
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add visualization for all tests
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fixes for test files, FilePredictionProvider changes
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Put new CLI
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Cleanup
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Rename CLI
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update all README with new commands.
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Remove old examples
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Several Fixes
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* README updates
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add gt_dir arg to create-eval, README fixes
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fixes, pass tests
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* feat: Refactor the TableEvaluator to use the new class design.
Move common evaluator code to BaseEvaluator.
Add more unit tests. Introduce pytest dependencies.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* Update lockfile
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Update lockfile
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Make pytest CI output more verbose
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* feat: Refactor the ReadingOrderEvaluator to use the new class design.
Remove the BaseReadingOrderEvaluator. Add unit test.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* Optimize GT downloading behaviour
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add file sources
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Allow pytest output on CI
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Disable tests in CI
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Reenable tests in CI
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add correct @pytest.mark.dependency()
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* feat: Introduce TypeVars for the UnitEvaluation and DatasetEvaluation used by the BaseEvaluator.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* Minimize tests in CI
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* feat: Refactor BboxTextEvaluator to use the new design. Introduce unit test.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* Remove streaming in DocLaynet v1
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Add back test dependency
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Saidgurbuz <said.gurbuz@epfl.ch>
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
Co-authored-by: Peter Staar <taa@zurich.ibm.com>
Co-authored-by: Saidgurbuz <said.gurbuz@epfl.ch>
Co-authored-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Refactor main to remove --max-items and replace with --begin-index, --end-index. Add --debug.
Provide implementation for create_dlnv1_e2e_dataset()
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* docs: Update Readme
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: DLNv1 create: Reduce shard size to 100. Fix debug error when not in SmolDocling. Improve logging
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Delete unused file
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Working on logging
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Code clean up
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Extend the LayoutEvaluator to accept external dictionary with DoclingDocuments
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: DLNv1 create: Introduce hardcoded list of blacklisted doc_ids
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Extend the TableEvaluator to accept optional dict with predicted DoclingDocument objects keyed
by the doc_id
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Fix DatasetStatistics:to_table() in case there is no data, to avoid division by zero.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Fix bug in the visualizations:save_comparison_html_with_clusters()
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* Add label_mapping feature, penalize empty predictions instead of skipping.
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* chore: Pin docling to dev/new-tf-weights
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Update the FTN-OTSL and P1M-OTSL to v1.1 from our HF datasets
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Ensure that the table bboxes are not negative and within the page size
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
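A minimal sketch of the kind of clamping described in the fix above, assuming a top-left origin and a (left, top, right, bottom) tuple. `clamp_bbox` and the `BBox` alias are hypothetical names used for illustration only, not the project's code.

```python
from typing import Tuple

# Assumed layout: (left, top, right, bottom) in page coordinates.
BBox = Tuple[float, float, float, float]


def clamp_bbox(bbox: BBox, page_width: float, page_height: float) -> BBox:
    """Clip every coordinate into [0, page_width] x [0, page_height]."""
    left, top, right, bottom = bbox
    left = min(max(left, 0.0), page_width)
    right = min(max(right, 0.0), page_width)
    top = min(max(top, 0.0), page_height)
    bottom = min(max(bottom, 0.0), page_height)
    return (left, top, right, bottom)
```

A box that already lies inside the page is returned unchanged. Only coordinates that are negative or exceed the page size are moved to the nearest page edge.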
* fix: In case pred_dict is given to TableEvaluator remove the ".png", ".jpg" suffixes from doc_id
before trying to match
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
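The suffix handling above could look roughly like this. It is a sketch under assumptions: `normalize_doc_id` and `lookup_prediction` are hypothetical helpers, and the match is case-sensitive here, which the commit message does not specify.

```python
from typing import Dict, Optional, TypeVar

T = TypeVar("T")

# Image suffixes named in the commit message.
IMAGE_SUFFIXES = (".png", ".jpg")


def normalize_doc_id(doc_id: str) -> str:
    """Drop a trailing ".png" or ".jpg" so GT and prediction ids line up."""
    for suffix in IMAGE_SUFFIXES:
        if doc_id.endswith(suffix):
            return doc_id[: -len(suffix)]
    return doc_id


def lookup_prediction(pred_dict: Dict[str, T], doc_id: str) -> Optional[T]:
    """Look up a prediction by normalized doc_id; None if there is no match."""
    return pred_dict.get(normalize_doc_id(doc_id))
```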
* fix: Protect the prediction in TableEvaluator within try..except
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Introduce the parameter structure_only in TableEvaluator
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Add optional parameter intermediate_results_dir in TableEvaluator
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: TableEvaluator: Fixes in save_table_evaluations()
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Update to the latest docling 2.26.0
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Ensure that all create methods support the `begin_index`, `end_index` parameters.
- DPBench.
- OmniDocBench.
- DocLayNetv1.
- Table datasets (FinTabNet, PubTabNet, Pub1M).
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
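One way to picture the `begin_index` / `end_index` selection is the sketch below. The semantics are assumptions, not confirmed by the commit message: `begin_index` is inclusive, `end_index` is exclusive, and a negative `end_index` means "until the end". `select_range` is a hypothetical helper.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def select_range(
    items: Iterable[T], begin_index: int = 0, end_index: int = -1
) -> Iterator[T]:
    """Yield items[begin_index:end_index] lazily.

    Assumption: a negative end_index selects everything up to the end.
    """
    stop = None if end_index < 0 else end_index
    return islice(items, begin_index, stop)
```

Using `islice` keeps the selection lazy, so a streamed dataset does not have to be materialized before the slice is applied.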
---------
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
* doclaynet v2
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* True doc bounding boxes to bottom left origin
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* add reading order to benchmark script
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* make true and pred doc names the same
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* bound mem usage through limiting single shard size
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* add key-value items to the DoclingDocument
Signed-off-by: Saidgurbuz <said.gurbuz@epfl.ch>
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* adopt v1 format to v2
Signed-off-by: Yusik Kim <yusik.kim@ibm.com>
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* update implementation of kv items based on structure change, and add prov item
Signed-off-by: Saidgurbuz <said.gurbuz@epfl.ch>
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* fix version and format issues
Signed-off-by: Saidgurbuz <said.gurbuz@epfl.ch>
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* handle empty prov, update benchmark script
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* adapt create.py to KeyValue structure change
Signed-off-by: Saidgurbuz <said.gurbuz@epfl.ch>
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* add rule-based method, classify_cells, to assign labels to GraphCells
Signed-off-by: Saidgurbuz <said.gurbuz@epfl.ch>
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* add visualize_docling_document method to generate highlighted image for all item types with options
Signed-off-by: Saidgurbuz <said.gurbuz@epfl.ch>
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* Revert "add visualize_docling_document method to generate highlighted image for all item types with options"
This reverts commit 267918b59f892513ca633bbd1e7a31a0a45acc60.
Signed-off-by: Saidgurbuz <said.gurbuz@epfl.ch>
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* update method name
Signed-off-by: Saidgurbuz <said.gurbuz@epfl.ch>
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* fix: max_items loop break
Signed-off-by: Yusik Kim <yusik.kim@ibm.com>
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* fix tqdm total
Signed-off-by: Yusik Kim <yusik.kim@ibm.com>
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* rebase main utils.py
Signed-off-by: Saidgurbuz <said.gurbuz@epfl.ch>
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* chore: rebased and refactored
Signed-off-by: Yusik Kim <yusik.kim@ibm.com>
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* fix: modalities for DLN
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* remove unrelated code
Signed-off-by: Saidgurbuz <said.gurbuz@epfl.ch>
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* fix: ocr options, use image converter
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* change max_items back to 1000
Signed-off-by: Yusik Kim <yusik.kim@ibm.com>
* precommit black
Signed-off-by: Yusik Kim <yusik.kim@ibm.com>
---------
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
Signed-off-by: Saidgurbuz <said.gurbuz@epfl.ch>
Signed-off-by: Yusik Kim <yusik.kim@ibm.com>
Co-authored-by: Saidgurbuz <said.gurbuz@epfl.ch>
Co-authored-by: Yusik Kim <yusik.kim@ibm.com>
* chore: Change the pinning of docling
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Fix the modalities supported for DPBench, OmniDocBench, DLNv1. Clean up code.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* docs: Update documentation to have all benchmarks in separate md files and place links in Readme.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Change the initialization of the create_smol_docling_converter() to allow flash-attn
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* docs: List benchmarks in the main readme with short description. Fix broken links in the documentation.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* docs: Fix broken link in Readme.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Update lock file
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Add debug code to dump the predicted text in create_dlnv1_e2e_dataset()
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Update toml to pin docling with branch and extras
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Disable the generation of VLM text debugging files for DLNv1 benchmark
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Update toml to docling v2.25.0 with vlm extra
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
---------
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Introduce create_vln_converter() that uses SmolDocling.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Introduce the converter_type parameter in the CLI and initialize the converter appropriately
to use Docling or SmolDocling.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Add the column CONVERTER_TYPE in the produced dataset.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Fixing the benchmarks/utils/save_comparison_html_with_clusters() to skip docitems without provenances
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Rename conversion/create_converter() as create_docling_converter()
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Replace prints with logger in doclaynet_v1/create.py
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Update pinning of docling to a certain commit for smoldocling
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Fixing missing imports. Code formatting.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
---------
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* correct mypy
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* reformatting
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* adding the script to make an initial dataset from PDFs
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* before switching to specific docling-core branch
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* rebased on kv-items and updated the create script in CVAT
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fixed the cvat
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the annotation description on CVAT
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the annotation description on CVAT (2)
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the annotation description on CVAT (3)
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* feat: Add new reading-order model evaluator, re-factor to BaseReadingOrderEvaluator
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Rename New
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* fix: Fix calling DoclingDocument methods. Pin latest versions of docling and docling-core.
Remove commented out code, remove unused imports, code formatting.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* docs: Re-run the evaluations for DPBench and update the docs
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* docs: Re-run the evaluations for OmniDocBench and update the docs
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* docs: Re-run the evaluations for DPBench and update the docs
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* docs: Re-run the evaluations for OmniDocBench and update the docs
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* docs: Minor improvements in DPBench OmniDocBench documentation
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
---------
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
Co-authored-by: Peter Staar <taa@zurich.ibm.com>
Co-authored-by: Nikos Livathinos <nli@zurich.ibm.com>
* correct mypy
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* reformatting
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* adding the script to make an initial dataset from PDFs
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* before switching to specific docling-core branch
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* rebased on kv-items and updated the create script in CVAT
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fixed the cvat
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the annotation description on CVAT
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the annotation description on CVAT (2)
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the annotation description on CVAT (3)
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* Update lock
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fixes for CVAT
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Disable Examples in CI
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
* doclaynet v1 create.py
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* added benchmark example for doclaynet v1
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* parameterize input data split
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* fix typo
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* use test set for benchmark
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* fix benchmark script path
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* fix ground truth bbox
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* add MARKDOWN_TEXT eval to doclaynet benchmark
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* follow benchmark output dir convention
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* doclaynet v1 eval results
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* fix last shard id
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* add tables and debug viz
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* use pdf as conversion source
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* small fix
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
* updated the create script for doclaynet-v1
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* chore: Update the version of docling-core in pyproject.toml
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Make the visualization of the reading order during the creation of the datasets optional. WIP
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Introduce CLI parameter to pass the max-items. Fix the visualization on DLNv1 to colorise the
clusters and remove the reading-order.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Correctly visualize the true and pred documents.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Introduce parameter artifacts_path in CLI to allow passing external files of models.
The parameter is propagated to docling PdfPipelineOptions
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Fix issues when passing custom artifact paths
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* feat: Bring back the drawing of the reading order as an optional feature during the creation of the dataset
Add DLNv1 create modality in the CLI main.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Use the HF repo to download DLNv1. Remove the main from the `create()` methods.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Fix a split issue in the DLNv1 create
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Fix bug in DLNv1 create.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Rename docling-DocLayNet-v1.1 to DocLayNet-v1.2
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
---------
Signed-off-by: Yusik Kim <kmyusk@gmail.com>
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
Co-authored-by: Peter Staar <taa@zurich.ibm.com>
Co-authored-by: Nikos Livathinos <nli@zurich.ibm.com>
* fix: Many code refactorings to support TableFormerMode (by default ACCURATE). Additionally:
- Clean up the overall implementation of TableFormerUpdater and set the AcceleratorOptions.
- Extend CLI to allow the creation of table datasets (PTN, FTN, P1M).
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* docs: WIP: Updating the Readme with benchmarks for the FTN, PTN, P1M.
Move the DP-Bench and OmniDocBench into separate files inside docs/ and provide links in Readme
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* docs: Add evaluation files inside docs/ with json/png files for FTN, PTN, P1M
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* docs: Fix the images links for DP-Bench, OmniDocBench docs
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* docs: Refactor the Readme to have evaluation data and results for the table datasets FTN, PTN, Pub1M
Add code snippets showing how to run the evaluations and visualizations for all datasets.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* docs: Fix broken links in Readme
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
---------
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
- Introduce the 'split' optional argument in CLI.
- Refactor main.evaluate, main.visualize to receive the split.
- Refactor the LayoutEvaluator, TableEvaluator to load the given split.
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* allow-for-buckets-in-cvat
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* refactored the CVAT create script
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* refactoring the CVAT pre-annotate and create
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added eval for cvat-annotations
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* reformatted the code
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
---------
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
- fix: Refactor/improve the code to save log files with evaluation tables and png files with the plots
and ensure to produce all the evaluations/visualizations in the docs/examples/benchmark_xxx.py files
- Introduce optional parameter in create methods for DP-Bench and OmniDocBench to generate
visualizations.
- Update the evaluation files (json/txt/png) in docs/evaluations per dataset. Update Readme.
- Update Readme with the OmniDocBench evaluation/visualization files
- Poetry: Move to docling 2.15.1
---------
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* chore: Add tqdm in the dependencies
* chore: Move the DatasetStats outside of table_evaluator into the utils/stats.py
* feat: ReadingOrderEvaluator: Full implementation with Average Relative Distance metric
* fix: Add reading_order in the visualise() method of main
* fix: utils/stats.py: Add the metric name as a parameter. Clean up code
* chore: Add reading_order evaluation and visualization in the examples for dp-bench and omnidocbench
Add the doc_id in the evaluation report
* feat: MarkdownTextEvaluator: Introduce text evaluation based on markdown export of DoclingDocument.
Use BLEU metric
* feat: Add ReadingOrderVisualizer and use it in the main
* chore: Add pillow lib to the poetry
* fix: ReadingOrderEvaluator: Convert the bboxes to bottom-left origin before calling the reading-order
* chore: Update poetry lock
* fix: Refactor to move the evaluator statistics in a separate file evaluators/stats.py.
Decouple the code to draw arrows in a separate function inside utils.py
Delete unused code.
Fix mypy issues.
* chore: Update Readme to include the evaluations and visualizations for the "reading-order" and the
"markdown-text" modalities.
* fix: Refactor the stats.py:save_historgram() to receive generic name for the plot
Generate histogram plots for the reading_order and markdown_text visualizations
Update Readme statistics for the reading_order and markdown_text modalities.
* feat: ReadingOrder: Implement weighted ARD where the weight is based on the bbox size
* chore: Update Readme with ARD and weighted ARD and histograms
---------
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* adding script to prepopulate CVAT and create GT annotations from CVAT annotation files
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* it works
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* Fixed the DPBench with the refactoring
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* refactored the cvat annotations in preannotate and create script
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* updated the code to export layout after re-annotating the DP-Bench dataset
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fixed some nasty bugs
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* adding documentation files for CVAT annotation
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* updated code
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* major updates
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* work-in-progress
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* working on reformatting and getting mypy alignment
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fixed the mypy errors
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* moved the code to cvat_annotation
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* updated the png packaging
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
---------
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* adding the omnidocbench benchmark
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the table-parsing in omnidocbench
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* finished the OmniDocBench implementation
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* reformatted the code
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* updated the README
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* updated the README and the cli
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* clean up the DP-Bench example
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* made the DPBench and OmniDocBench follow the same example code
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* reformatted the code
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* cleaned up the dp-bench create script
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the ability to see the clusters and reading order for layout
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* reformatted the code
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* working on making datasets from pdf collections
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the package_pdfs example
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the FinTabNet-OTSL benchmark
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* reformatted the code
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the fintabnet example evaluation
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* updated the README for FinTabNet
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* updated the README
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* refactored the table evaluations
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* added the text inclusion in the table prediction
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fixed the header of the HTML
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* reformatted the code
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
* fix: Formatting and unused code cleanup
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* feat: Extend the CLI to create the OMNIDOCBENCH datasets for the layout and tableformer modalities
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
* Added exit to benchmark end-to-end scripts in case git-lfs is not installed (#5)
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>
* fix: Use TableStructureModel from docling, use backends, fix boundingbox coordinates
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Reinstate layout test on dpbench
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Comments
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Comments
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Remove unused code
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Remove more unused code
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fixes for Omnidoc
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Fixes for layout eval bounding boxes
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* More fixes for OmniDoc, README updates
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* More fixes for OmniDoc, README updates
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
* Replace git-lfs with HF snapshot_download
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
---------
Signed-off-by: Peter Staar <taa@zurich.ibm.com>
Signed-off-by: Christoph Auer <cau@zurich.ibm.com>
Signed-off-by: Nikos Livathinos <nli@zurich.ibm.com>
Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Christoph Auer <cau@zurich.ibm.com>
Co-authored-by: Nikos Livathinos <nli@zurich.ibm.com>
Co-authored-by: Maxim Lysak <101627549+maxmnemonic@users.noreply.github.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>