mirror of
https://github.com/docling-project/docling-core.git
synced 2026-05-17 13:10:44 +00:00
v2.75.0 - 2026-05-12
Feature
Fix
Documentation
v2.74.1 - 2026-04-22
Fix
- Refine ImageRef URI handling (#595) (2087d0f)
- doclang: Default DoclangDeserializer to page 1 (#590) (048f172)
- Refine remote filename handling (#591) (473fbac)
v2.74.0 - 2026-04-17
Feature
- serializer: Add MsExcelMarkdownDocSerializer for sheet-name headings (#587) (9dc882d)
- DocChunk expansion (#549) (f2a6186)
Fix
- DocLang: Fix chemistry serialization (#584) (b72af12)
- Prevent numeric precision loss in Markdown table serialization (#588) (6cbdee9)
v2.73.0 - 2026-04-09
Feature
- outline: Extend OutlineDocSerializer with filtering capabilities (#580) (18f5738)
- Add latex and Tikz as codelabels (#579) (46a9b5a)
Documentation
v2.72.0 - 2026-04-07
Feature
v2.71.0 - 2026-03-30
Feature
- Add code representation meta field (#573) (0bd5d8e)
- Doclang: Add content layer support (#568) (fe9bbfb)
- Add handwriting support (#561) (fb3b603)
Fix
- Doclang: Improve checkbox serialization & deserialization (#570) (c9b5152)
- Doclang: Fix serialization order in text items (#571) (a1535bc)
- Extend validation to address duplicate refs (#565) (0cfb663)
- Doclang: Fix group serialization (#566) (159eb8f)
- Repair table children when rich table cells break hierarchy (#563) (b65dd24)
v2.70.2 - 2026-03-20
Fix
- Doclang: Suppress empty elements in Doclang serialization (#554) (91ee7e2)
- Expose traverse_pictures in export_to_markdown and export_to_text (#557) (3e030ed)
- Sync picture classification enums with DocumentFigureClassifier-v2.0 model (#529) (f97ec83)
v2.70.1 - 2026-03-17
Fix
- markdown: Remove assert statements to support Python optimization mode (#548) (0a3b278)
- Improve rich table cell validation (#550) (c57e50a)
v2.70.0 - 2026-03-13
Feature
- Introduce field data model incl. Doclang serialization (#519) (b93d5a3)
- Make an experimental outline serializer (#415) (8d7859e)
- Profile a document or collection (#511) (af50f1c)
- Split html table to headers and body (#532) (b435090)
- Handle wide table outliers with LineBasedTokenChunker (#536) (e00125c)
v2.69.0 - 2026-03-09
Feature
v2.68.0 - 2026-03-07
Feature
Fix
v2.67.1 - 2026-03-05
Fix
v2.67.0 - 2026-03-04
Feature
v2.66.0 - 2026-02-26
Feature
Fix
- Rich table triplet serialization (#425) (c566268)
- Support single-column table default serialization (#526) (73b0757)
v2.65.2 - 2026-02-23
Fix
- Accept relative URIs in PdfHyperlink without validation failure (#520) (6032c7c)
- Shift KV/Form graph cell page numbers during DoclingDocument.concatenate (#521) (6a04db7)
- chunker: Propagate 'traverse_pictures' parameter to chunker (#518) (a3b6e3f)
v2.65.1 - 2026-02-13
Fix
v2.65.0 - 2026-02-13
Feature
Fix
- Doclang: Fix table cell `content` deserialization (#512) (9ba605d)
- Doclang: Align image mode, defaulting to placeholder (#506) (aec74d4)
- Fix document re-indexing (#510) (1d969d4)
- Switch XML parsing (#509) (2793dda)
v2.64.0 - 2026-02-09
Feature
Fix
- Doclang: Fix image URI serialization (#504) (193c25f)
- DocTags: Fix deserialization to populate picture meta fields (#505) (8005892)
v2.63.0 - 2026-02-03
Feature
Fix
- serialization: Add 'traverse_pictures' parameter to serializers (#501) (04cf44b)
- DocTags: Fix picture classification deserialization (#500) (de2b729)
- Doclang: Fix checkbox serialization (#503) (1d8b78c)
v2.62.0 - 2026-01-30
Feature
- IDocTags: Add rich table support (#491) (62f8d4d)
- Model and serializer for audio tracks (#426) (c8f3c01)
Fix
- html: Visualize picture meta as html collapsible (#497) (fd27df1)
- markdown: Add an option to compact table serialization (#495) (3b0b909)
- IDocTags: Fix default location resolution handling (#492) (549a2f1)
v2.61.0 - 2026-01-26
Feature
- Added parameter to get_row_bounding_boxes and get_column_bounding_boxes (#490) (577a1a7)
- IDocTags: Add content wrapping for handling whitespace (#489) (fdcdfd1)
Fix
v2.60.2 - 2026-01-23
Fix
v2.60.1 - 2026-01-22
Fix
v2.60.0 - 2026-01-20
Feature
Fix
- Fix transparency rendering in all visualizers (#481) (a6d2be6)
- IDocTags: Fix `InlineGroup` serialization and deserialization (#477) (d9e8d37)
v2.59.0 - 2026-01-12
Feature
Fix
v2.58.1 - 2026-01-09
Fix
v2.58.0 - 2026-01-08
Feature
Fix
v2.57.0 - 2025-12-18
Feature
v2.56.0 - 2025-12-17
Feature
v2.55.0 - 2025-12-10
Feature
v2.54.1 - 2025-12-08
Fix
Documentation
v2.54.0 - 2025-11-29
Feature
v2.53.0 - 2025-11-27
Feature
- experimental: Extend IDocTags tokens (#439) (aa5c668)
- Added the Azure Document Intelligence (#395) (92d60b0)
Fix
- Chart title (#428) (3b253c1)
- Robustify page filtering (#437) (8bdeaa7)
- Markdown serialization of hyperlink with code (#434) (8feb09f)
v2.52.0 - 2025-11-20
Feature
- experimental: Add new DocTags serializer (#412) (c9e5fb4)
- Convert regions into TableData (#430) (c80b583)
v2.51.1 - 2025-11-14
Fix
- Improve meta migration (#422) (bc0e96b)
- DoclingDocument model validator should deal with any raw input (#419) (56b3c42)
v2.51.0 - 2025-11-12
Feature
Fix
- Improve meta migration and warning handling (#417) (3d13b02)
- Fix import handling of extra dependencies for chunking (#418) (567d3ad)
v2.50.1 - 2025-11-04
Fix
v2.50.0 - 2025-10-30
Feature
- Add metadata model hierarchy (#408) (2ee3cac)
- Add split view & YAML support to CLI viewer (#407) (a3feae0)
- New picture classes for doctags (#404) (ada4068)
v2.49.0 - 2025-10-16
Feature
v2.48.4 - 2025-10-01
Fix
v2.48.3 - 2025-09-29
Fix
v2.48.2 - 2025-09-22
Fix
v2.48.1 - 2025-09-11
Fix
v2.48.0 - 2025-09-09
Feature
- Introduction of fillable TableCell (#384) (b13267f)
- Add support for heading with inline in HTML & DocTags (#379) (b60ac19)
Fix
- Add `doc` param to all `export_to_dataframe()` calls (#380) (0512f44)
- Fix handling of generic groups in rich table cells (#383) (2dc57c1)
v2.47.0 - 2025-09-02
Feature
v2.46.0 - 2025-09-01
Feature
Fix
Performance
v2.45.0 - 2025-08-20
Feature
Fix
v2.44.2 - 2025-08-14
Fix
v2.44.1 - 2025-07-30
Fix
v2.44.0 - 2025-07-28
Feature
v2.43.1 - 2025-07-23
Fix
- LayoutVisualizer should traverse pictures (#358) (f9b3b49)
- HTML serialization of nested lists (#359) (5a7883c)
v2.43.0 - 2025-07-16
Feature
Fix
v2.42.0 - 2025-07-09
Feature
- Extend and expose float serialization control (#353) (c339171)
- Additional DoclingDocument methods for use in MCP document manipulation (#344) (cb59fd3)
v2.41.0 - 2025-07-09
Feature
v2.40.0 - 2025-07-02
Feature
Fix
v2.39.0 - 2025-06-27
Feature
- Remodel lists, add MD & HTML ser. params, enable unset marker (#339) (14a4fde)
- Download Google docs and drive files via export url (#335) (3eeb259)
v2.38.2 - 2025-06-25
Fix
- Add missing mimetypes for asr inputs (#341) (c2fd20f)
- Add text direction to export_to_textlines (#338) (425b191)
v2.38.1 - 2025-06-20
Fix
v2.38.0 - 2025-06-18
Feature
- viz: Add reading order branch numbering, fix cross-page lists (#334) (78b7962)
- Add parameter to choose which pages to export to doctags (#290) (0fd3c1c)
Fix
- Expose base types consistently (#332) (2e14a74)
- HybridChunker: Improve long heading handling (#333) (5c99722)
v2.37.0 - 2025-06-13
Feature
v2.36.0 - 2025-06-11
Feature
v2.35.0 - 2025-06-11
Feature
v2.34.2 - 2025-06-10
Fix
v2.34.1 - 2025-06-08
Fix
v2.34.0 - 2025-06-06
Feature
- doctags: Add enclosing bbox to inline (#302) (dcc198f)
- Add subscript & superscript formatting (#319) (ae96129)
- Add table annotations (#304) (d8a5256)
Fix
v2.33.1 - 2025-06-04
Fix
- New typer version with new click (#315) (e17eabf)
- Support section_header levels in doctags deserialization (#313) (defd49e)
v2.33.0 - 2025-06-02
Feature
v2.32.0 - 2025-05-27
Feature
Fix
v2.31.2 - 2025-05-22
Fix
v2.31.1 - 2025-05-20
Fix
v2.31.0 - 2025-05-18
Feature
v2.30.1 - 2025-05-14
Fix
v2.30.0 - 2025-05-06
Feature
- Add image group serialization in html (#284) (7f83f1c)
- Adding the label picture_group (#283) (2f0f121)
Fix
- Add unit flags to SegmentedPage (#286) (ad88ecf)
- Update deserialization for better recovery (#282) (511fb98)
- Include captions regardless of `traverse_pictures` flag (#278) (7eb9fa9)
- Hashlib usage for FIPS (#280) (4b967ab)
v2.29.0 - 2025-05-01
Feature
Fix
- Fix multi-provenance item visualization (#277) (8677d6e)
- Added return value for crop_text method in segmentedPdfPage Class (#275) (591fe59)
- Make load_from_doctags method static (#273) (8f85d05)
v2.28.1 - 2025-04-25
Fix
- Visualization of document pages without items (#271) (a947440)
- UnboundLocal variable (#269) (d9709d0)
v2.28.0 - 2025-04-23
Feature
v2.27.0 - 2025-04-16
Feature
Fix
- HTML serialization for single image documents (#261) (d0a49da)
- codecov: Fix codecov argument and yaml file (#260) (1af0721)
- Safer label color API (#259) (159f61d)
v2.26.4 - 2025-04-14
Fix
v2.26.3 - 2025-04-14
Fix
v2.26.2 - 2025-04-14
Fix
v2.26.1 - 2025-04-11
Performance
v2.26.0 - 2025-04-11
Feature
- Add HTML serializer (#232) (5d40600)
- Add serializer provider to chunkers (#239) (23036e1)
- Integrate serialization API into chunkers (#221) (5e4c0fd)
- Expose page number in Serialization API (#238) (73b9941)
- Markdown chart serializer (picture+table) (#235) (0482bac)
- Support of DocTags charts (serialization and deserialization) (#229) (e9259a5)
- Added initial delete and insert methods in DoclingDocument (#220) (f2fe1c1)
Fix
- Fix page filtering issue (#247) (ab78e0b)
- Propagate HTMLOutputStyle properly through (#246) (587e67f)
- Better `BoundingRectangle.angle` and `BoundingRectangle.angle_360` computation (#237) (055742c)
- DocTags import location fix for tables, pictures, captions (#227) (a055e1a)
Performance
v2.25.0 - 2025-03-31
Feature
Fix
- DivisionByZero in intersection_over_self (#224) (2f380ab)
- Fix hyperlink deserialization (#223) (57d26ee)
v2.24.1 - 2025-03-28
Fix
- Automatic transformation of output cells bbox coord origin defined by input in get_cells_in_bbox (#219) (8e0e9b7)
v2.24.0 - 2025-03-25
Feature
- Expose MD page break & DocTags minification (#213) (ff13a93)
- Add document tokens from key value items (#170) (db119f4)
- Add DocTags serializers (#192) (1f4d57e)
- Add kv_item support for doctag to docling_document (#188) (2371c11)
Fix
- Enable caption serialization for all floating items (#216) (e1d0597)
- Allow captions without holding item (#215) (2efb71a)
- Add 'text/csv' mimetype to _extra_mimetypes type list (#210) (bc3f5d5)
- Add handling for str filenames in save/load methods (#205) (75d94ab)
- Markdown picture item export (#207) (510649e)
- DocTags support of furniture (#209) (337ff74)
Performance
v2.23.3 - 2025-03-19
Fix
v2.23.2 - 2025-03-18
Fix
v2.23.1 - 2025-03-17
Fix
v2.23.0 - 2025-03-13
Feature
- Add serializers, text formatting, update Markdown export (#182) (a7cdc87)
- Add data model types from docling-parse (#186) (a86a4a3)
v2.22.0 - 2025-03-12
Feature
- Add DoclingDocument.load_from_doctags method and DocTags data models (#187) (c065c4c)
- Add document tokens for SMILES (#176) (32398b8)
v2.21.2 - 2025-03-06
Fix
- Suppress warning for missing fallback case (#184) (ccde54a)
- doctags: Fix code export (#181) (53f6d09)
- markdown: Fix escaping in case of nesting (#180) (834db4b)
- HybridChunker: Remove `max_length` from tokenization (#178) (419252c)
v2.21.1 - 2025-02-28
Fix
v2.21.0 - 2025-02-27
Feature
Fix
- markdown: Fix case of leading list (#174) (c77c59b)
- Properly handle missing page image case for export_to_html (#166) (4708f93)
v2.20.0 - 2025-02-19
Feature
v2.19.1 - 2025-02-17
Fix
v2.19.0 - 2025-02-17
Feature
- Redefine CodeItem as floating object with captions (#160) (916323f)
- Implementation of doc tags (#138) (f751b45)
Fix
- Document Tokens (doc tags) clean up, fix iterate_items for content_layer (#161) (58ed6c8)
- Fix inheritance of CodeItem for backward compatibility (#162) (7267c3f)
v2.18.1 - 2025-02-13
Fix
v2.18.0 - 2025-02-10
Feature
v2.17.2 - 2025-02-06
Fix
v2.17.1 - 2025-02-03
Fix
v2.17.0 - 2025-02-03
Feature
- HTML: Fallback showing formulas as images (#146) (23477f7)
- HTML: Export formulas with mathml (#144) (ed36437)
Fix
v2.16.1 - 2025-01-30
Fix
v2.16.0 - 2025-01-29
Feature
- Escape underscores that are within latex equations (#137) (0d5cd11)
- Add escaping_underscores option to markdown export (#135) (c9739b2)
- Added the geometric operations to BoundingBox (#136) (f02bbae)
v2.15.1 - 2025-01-21
Fix
v2.15.0 - 2025-01-21
Feature
Fix
- Fix hybrid chunker token constraint (#131) (b741eea)
- Always return a new bbox when changing origin (#128) (841668f)
v2.14.0 - 2025-01-10
Feature
v2.13.1 - 2025-01-08
Fix
v2.13.0 - 2025-01-08
Feature
Fix
v2.12.1 - 2024-12-17
Fix
v2.12.0 - 2024-12-17
Feature
Fix
- Skip labels not included in the allow-list (#113) (d147c25)
- Always write with utf8 encoding (#111) (268c294)
v2.11.0 - 2024-12-16
Feature
v2.10.0 - 2024-12-13
Feature
- Add legacy to DoclingDocument utility (#108) (b31e0a3)
- Add DoclingDocument viewer to CLI (#99) (9628d19)
- Add default tokenizer to HybridChunker (#107) (2591c70)
Fix
- Improve doc item typing (#105) (047a196)
- Set origin when merging chunks (#109) (b546c0a)
- Add REFERENCE to exported labels and remove CAPTION (#106) (a66b0bb)
v2.9.0 - 2024-12-09
Feature
Fix
v2.8.0 - 2024-12-06
Feature
v2.7.1 - 2024-12-06
Fix
v2.7.0 - 2024-12-04
Feature
v2.6.1 - 2024-12-02
Fix
v2.6.0 - 2024-12-02
Feature
- Extend source resolution with streams and workdir (#79) (9a74d13)
- Simple method to load DoclingDocument from .json files (#71) (fc1cfb0)
Fix
- Allow all url types in referenced exports (#82) (3bd83bc)
- Even better style for HTML export (#78) (8422ad4)
v2.5.1 - 2024-11-27
Fix
v2.5.0 - 2024-11-27
Feature
- Adding HTML export to DoclingDocument, adding export of images in png with links to Markdown & HTML (#69) (ef49fd3)
v2.4.1 - 2024-11-21
Fix
v2.4.0 - 2024-11-18
Feature
- Add get_image for all DocItem (#67) (9d7e831)
- Allow exporting a specific page to md. (#63) (1a201bc)
v2.3.2 - 2024-11-11
Fix
v2.3.1 - 2024-11-01
Fix
v2.3.0 - 2024-10-29
Feature
v2.2.3 - 2024-10-29
Fix
- Str representation of enum across python versions (#60) (8528918)
- Title for export to markdown and add text_width parameter (#59) (4993c34)
v2.2.2 - 2024-10-26
Fix
v2.2.1 - 2024-10-25
Fix
v2.2.0 - 2024-10-24
Feature
Fix
v2.1.0 - 2024-10-22
Feature
- Improve markdown export of DoclingDocument (#50) (328778e)
- Extend chunk meta with schema, version, origin (#49) (d09fe7e)
v2.0.1 - 2024-10-18
Fix
v2.0.0 - 2024-10-16
Feature
Breaking
v1.7.2 - 2024-10-09
Fix
v1.7.1 - 2024-10-07
Fix
- Make doc metadata keys pure strings (#38) (246627f)
- Align chunk ref format with one used in Document (#37) (b5592ad)
v1.7.0 - 2024-10-01
Feature
- (experimental) introduce new document format (#21) (688789e)
- Add doc metadata extractor and ID generator classes (#34) (b76780c)
- Support heading as chunk metadata (#36) (4bde515)
v1.6.3 - 2024-09-26
Fix
v1.6.2 - 2024-09-24
Fix
v1.6.1 - 2024-09-24
Fix
v1.6.0 - 2024-09-23
Feature
v1.5.0 - 2024-09-20
Feature
- Add export to doctags for document components (#25) (891530f)
- Add file source resolution utility (#22) (752cbc3)