mirror of
https://github.com/mozilla/readability.git
synced 2026-04-07 19:17:37 +00:00
d7949dc47d
This change improves the logic for wrapping phrasing content in paragraphs, fixing two bugs:

1. Trailing whitespace was not always trimmed correctly from phrasing content at the end of a container <div>.
2. Leading whitespace nodes could be left behind as direct children of the <div> instead of being discarded.

The logic in Readability.js has been refactored to use a more robust "collect and transform" pattern. It now uses a DocumentFragment to gather all consecutive phrasing content, correctly trims leading / trailing whitespace, and then wraps the non-empty result in a <p> tag. This produces cleaner HTML, as reflected in the updated test pages.

To support this, several enhancements were made to the JSDOMParser.js DOM implementation:

* Added support for DocumentFragment, including doc.createDocumentFragment().
* Centralized DOM insertion logic (appendChild(), insertBefore(), and replaceChild()) into a single, efficient _insertNodesAtIndex() private helper, ensuring consistency.
* Made replaceChild() more robust by simplifying its implementation and fixing an edge case where replacing a node with itself failed.
* Fixed a bug in remove() where it incorrectly modified element-specific properties on non-element nodes.

The test suite in test-jsdomparser.js was expanded to validate these improvements, with new tests for DocumentFragment handling, node moving, and self-insertion/replacement edge cases.
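
As a rough illustration of the "collect and transform" idea described above, here is a minimal sketch. It is not the code from Readability.js. It assumes a simplified node model in which plain objects stand in for DOM nodes and an array stands in for the DocumentFragment. The names wrapPhrasingRuns, isPhrasing, isWhitespace and the PHRASING_TAGS subset are hypothetical and exist only for this example. It also drops only whitespace-only text nodes at the ends of a run, which may differ from how the real implementation trims.

```javascript
// Hypothetical, simplified node model: plain objects stand in for DOM nodes,
// and an array stands in for the DocumentFragment named in the commit message.
const PHRASING_TAGS = new Set(["a", "b", "em", "i", "span", "strong"]); // illustrative subset only

function isPhrasing(node) {
  return node.type === "text" ||
    (node.type === "element" && PHRASING_TAGS.has(node.tag));
}

function isWhitespace(node) {
  return node.type === "text" && node.text.trim() === "";
}

// Returns a new child list in which each run of consecutive phrasing
// nodes is trimmed and, if anything remains, wrapped in a <p> node.
function wrapPhrasingRuns(children) {
  const result = [];
  let run = []; // collects the current run (the "fragment")

  const flush = () => {
    // Drop whitespace-only text nodes from both ends of the run.
    while (run.length && isWhitespace(run[0])) run.shift();
    while (run.length && isWhitespace(run[run.length - 1])) run.pop();
    // Wrap only a non-empty result; a whitespace-only run is discarded.
    if (run.length) {
      result.push({ type: "element", tag: "p", children: run });
    }
    run = [];
  };

  for (const child of children) {
    if (isPhrasing(child)) {
      run.push(child);
    } else {
      flush();            // a block-level node ends the current run
      result.push(child);
    }
  }
  flush();                // handle phrasing content at the end of the container
  return result;
}

// Usage: leading and trailing whitespace nodes are discarded, not left behind.
const divChildren = [
  { type: "text", text: "\n  " },
  { type: "text", text: "Hello " },
  { type: "element", tag: "em", children: [] },
  { type: "text", text: "  " },
  { type: "element", tag: "ul", children: [] },
  { type: "text", text: "\n" },
];
const wrapped = wrapPhrasingRuns(divChildren);
console.log(wrapped.map((n) => n.tag).join(","));
```

Collecting the whole run before deciding whether to emit a <p> means the trimming and the "is it empty?" check happen in one place, which is the property the two bug fixes rely on.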