mirror of
https://github.com/docling-project/docling-core.git
synced 2026-05-17 13:10:44 +00:00
c73904e68e
* Added ruff to dev dependencies
  Signed-off-by: Florian Schwarb <florian.schwarb@gmail.com>
* Added ruff settings to pyproject.toml as in docling
  Signed-off-by: Florian Schwarb <florian.schwarb@gmail.com>
* Cleanup uf pyproject.toml
  Signed-off-by: Florian Schwarb <florian.schwarb@gmail.com>
* Copied settings for ruff pre-commit hooks from docling
  Signed-off-by: Florian Schwarb <florian.schwarb@gmail.com>
* Excluded test/data/** from ruff formatting / linting
  Signed-off-by: Florian Schwarb <florian.schwarb@gmail.com>
* ruff format
  Signed-off-by: Florian Schwarb <florian.schwarb@gmail.com>
* Added some ignore statements to pyproject.toml such that ruff check raises fewer issues
  Signed-off-by: Florian Schwarb <florian.schwarb@gmail.com>
* ruff check --fix
  Signed-off-by: Florian Schwarb <florian.schwarb@gmail.com>
* Ignored some more rules
  Signed-off-by: Florian Schwarb <florian.schwarb@gmail.com>
* Fixed the rest of the errors that would only concern 1 - 3 files
  Signed-off-by: Florian Schwarb <florian.schwarb@gmail.com>
* Added another ignore related to df for DataFrame names
  Signed-off-by: Florian Schwarb <florian.schwarb@gmail.com>
* Modified CONTRIBUTING.md such that black / isort are replaced by ruff
  Signed-off-by: Florian Schwarb <florian.schwarb@gmail.com>
* Added UP045 to ignore list such that Optional[...] does not raise
  Signed-off-by: Florian Schwarb <florian.schwarb@gmail.com>
* Moved .flake8 configs to pyproject.toml
  Signed-off-by: Florian Schwarb <florian.schwarb@gmail.com>
* Moved autoflake to be used with ruff
  Signed-off-by: Florian Schwarb <florian.schwarb@gmail.com>
* Moved all .flake8 settings to pyproject.toml to be compatible with ruff (i.e. no separate [tool.flake8] section
  Signed-off-by: Florian Schwarb <florian.schwarb@gmail.com>
* Removed flake8 from .pre-commit hooks
  Signed-off-by: Florian Schwarb <florian.schwarb@gmail.com>
* Applied ruff format (again); formatted some files as the line-length = 120 equals now what was set for the .flake8 settings
  Signed-off-by: Florian Schwarb <florian.schwarb@gmail.com>
* Set max-complexity to 30 (as was originally) in the pyproject.toml as one linting check would fail
  Signed-off-by: Florian Schwarb <florian.schwarb@gmail.com>
* Adding PD901 to ignore list such that pre-commit hooks run fully again
  Signed-off-by: Florian Schwarb <florian.schwarb@gmail.com>
* Replaced dtype | None syntax by Optional[dtype] in remaining places
  Signed-off-by: Florian Schwarb <florian.schwarb@gmail.com>
* chore: fix 'test' ref in pyproject
  Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
* style: remove typing List, Set, Tuple, Dict
  Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
* style: remove UP015 check from ignore list
  Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
* style: remove UP034 check from ignore list
  Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
* style: normalize dashes in comments and docstrings
  Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
* style: remove PD901 check from ignore list
  Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
* style: remove C403 check from ignore list
  Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
* style: remove C403, C413, C416 check from ignore list
  Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
* style: remove E203, F811 check from ignore list
  Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
---------
Signed-off-by: Florian Schwarb <florian.schwarb@gmail.com>
Signed-off-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
Co-authored-by: Florian Schwarb <florian.schwarb@gmail.com>
Co-authored-by: Cesar Berrospi Ramis <ceb@zurich.ibm.com>
28 lines
832 B
Python
"""Test the pydantic models in module data_types.ccs."""

import json

from pydantic import ValidationError

from docling_core.types.legacy_doc.document import CCSDocument


def test_ccs_document_update():
    """Validate data with CCSDocument extract."""
    filename = "test/data/legacy_doc/ext-1.json"
    try:
        with open(filename, encoding="utf-8") as f:
            raw_doc = json.load(f)
            for item in raw_doc["main-text"]:
                if "$ref" in item:
                    assert False, f"$ref should not be in file {filename}"

            doc = CCSDocument.model_validate(raw_doc)

            if doc.description.abstract:
                assert False, "Abstract should not be present"

    except ValidationError as e:
        print(f"Validation error in file {filename}:\n{e.json()}")
        raise
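The `$ref` check in the test above can be tried without the fixture file. The following is a minimal sketch, not part of docling-core: the helper name `find_ref_items` and the inline sample document are hypothetical and exist only for illustration.

```python
import json


def find_ref_items(raw_doc: dict) -> list[int]:
    """Return the indices of "main-text" entries that contain a "$ref" key.

    Hypothetical helper: it mirrors the loop in the test, but collects every
    offending index instead of failing on the first one.
    """
    return [
        index
        for index, item in enumerate(raw_doc.get("main-text", []))
        if "$ref" in item
    ]


# Made-up sample data standing in for test/data/legacy_doc/ext-1.json.
sample = json.loads(
    '{"main-text": [{"text": "a"}, {"$ref": "#/tables/0"}, {"text": "b"}]}'
)
unresolved = find_ref_items(sample)
print(unresolved)
```

Collecting all indices, rather than stopping at the first hit, may make a failure message more useful when several entries are affected.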