Files
pateltejas 72942486ff fix(pptx): skip malformed picture shapes instead of aborting conversion (#3372)
* fix(pptx): skip malformed picture shapes instead of aborting conversion

MsPowerpointDocumentBackend._handle_pictures reads embedded image bytes via python-pptx's shape.image accessor. On PPTX files with slightly malformed <p:pic> shapes, shape.image raises three exceptions that the existing (UnidentifiedImageError, OSError, ValueError) clause does not catch, so one bad picture aborts conversion of the entire presentation:

- InvalidXmlError when <p:blipFill> is missing
- KeyError when <a:blip r:embed> points to an unknown relationship
- AttributeError when the embedded part's content-type isn't an image

These files open normally in Keynote and Google Drive, so the backend should handle them as gracefully as it already handles truncated or unreadable image payloads.

This follows the same pattern as #2914, which extended the same except tuple with ValueError to handle linked (external) image references. The three cases above are the remaining shape.image failure modes that still escape.

Extend the except tuple to cover the three cases and log the same warning used for other unreadable images, leaving the rest of the presentation to convert normally. Add a regression fixture with one malformed picture per failure mode plus a focused test.

Fixes #3371

Signed-off-by: pateltejas <tejas226@hotmail.com>

* refactor(pptx): use warnings.warn for malformed picture skips

Address PR review feedback: use Python's warnings module with UserWarning to signal the skip to callers instead of logging.Logger.warning, matching the pattern used in msword_backend for "Skipping external image reference". This makes the skip visible via standard warning filters and catchable in tests.

Update the regression test to assert the warning is emitted via pytest.warns, which also suppresses the message during the test run so it doesn't clutter suite output.

Signed-off-by: pateltejas <tejas226@hotmail.com>

---------

Signed-off-by: pateltejas <tejas226@hotmail.com>
2026-04-29 08:29:08 +02:00
..