Commit Graph

249 Commits

Author SHA1 Message Date
iska db0bb51bc5 Fix the "isindex" attributes handling in the "in body" insertion mode 2015-04-11 00:36:37 +02:00
iska 27817fb7ec Fix start-tag handling for "object" tags in the "in body" insertion mode 2015-04-11 00:36:02 +02:00
iska 7c1b04e9a4 Fix start-tag handling for the "li", "dd" & "dt" case in the "in body" insertion mode 2015-04-11 00:35:13 +02:00
iska 082d573cf3 Fix "frameset" case in the start-tag handling of the "in body" insertion mode 2015-04-11 00:34:27 +02:00
iska 99ea01a908 Fix character token handling in the "in body" insertion mode 2015-04-11 00:33:43 +02:00
iska abad8f0a62 Fix "frameset" case in the "after head" insertion mode 2015-04-11 00:33:03 +02:00
iska 8007b4817c Fix resetting-the-insertion-mode method 2015-04-11 00:32:20 +02:00
iska 280339e2f0 Fix adoption agency algorithm 2015-04-11 00:31:37 +02:00
iska aa008900c9 Fix the input stream reader method for the CDATA Section state
- All carriage returns must be converted to line feeds
- All line feeds following a carriage return must be ignored

https://html.spec.whatwg.org/multipage/syntax.html#preprocessing-the-input-stream
2015-04-11 00:18:39 +02:00
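The two rules in the commit above can be illustrated with a short sketch. It is written in Python for brevity and is not the project's code; the function name `normalize_newlines` is hypothetical.

```python
def normalize_newlines(text: str) -> str:
    """Apply the two newline rules from the commit message.

    - Every carriage return becomes a line feed.
    - A line feed that directly follows a carriage return is dropped.
    """
    out = []
    previous_was_cr = False
    for ch in text:
        if ch == "\r":
            out.append("\n")
            previous_was_cr = True
        elif ch == "\n" and previous_was_cr:
            # Second half of a CR LF pair: already emitted as one LF.
            previous_was_cr = False
        else:
            out.append(ch)
            previous_was_cr = False
    return "".join(out)
```

With this sketch, a CR LF pair and a lone CR both end up as a single line feed, while a lone LF passes through unchanged.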
iska 27375e3a47 Fix the default case in the "after after frameset" phase 2015-04-09 21:24:14 +02:00
iska 34e88898f2 Fix inserting comment calls in the "initial" & the "after head" insertion modes 2015-04-09 21:23:25 +02:00
iska e362bf717f Add length-check for the character token when ignoring a line-feed character after <textarea> 2015-04-09 21:22:16 +02:00
iska b6d30da180 Fix initializing tokenizer state in the parser
Add check for HTML namespace for the context element
2015-04-09 21:21:34 +02:00
iska bf821fb70d Fix MathML attribute adjustment
Case was wrong and tagName was checked as key instead of "definitionurl"
2015-04-09 21:20:50 +02:00
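As a rough illustration of the fix above: the HTML specification's "adjust MathML attributes" step renames an attribute called "definitionurl" to "definitionURL". The Python sketch below assumes the token's attributes are held in a plain dictionary; the function name is hypothetical and this is not the project's Objective-C code.

```python
def adjust_mathml_attributes(attributes: dict) -> dict:
    """Rename the lowercased 'definitionurl' attribute to 'definitionURL'.

    The lookup key is the attribute name, not the element's tag name,
    which is the mistake the commit message describes.
    """
    if "definitionurl" in attributes:
        attributes["definitionURL"] = attributes.pop("definitionurl")
    return attributes
```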
iska c60f308ca5 Fix HTML Element copy method
Was missing the namespace copy
2015-04-09 21:20:04 +02:00
iska 2c387c13ef Add parameterless initializer for the DOCTYPE Token
Otherwise the token has an incorrect type when initialized via a "new" call
2015-04-08 21:48:12 +02:00
iska 7ca60ea53a Add nil-check for the input string when initializing the parser 2015-04-08 20:41:40 +02:00
iska 2f58d1364a Add nil-check in the tokenizer when emitting a nil-string as character token 2015-04-08 20:41:04 +02:00
iska 1b05db8e36 Fix tokenizer's CDATA-Section state
The closing "]]>" was not consumed
2015-04-08 20:40:43 +02:00
iska 61f4f01288 Add nil-check when clearing the active formatting elements up to last marker 2015-04-08 20:40:03 +02:00
iska 4231cc9608 Fix scanning input stream up to a given string
The consumed string variable should not be initialized so that it stays nil if nothing
was scanned
2015-04-08 20:39:33 +02:00
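The behaviour described above (the result stays nil when nothing was scanned) might look like the following Python sketch, where `None` plays the role of nil. The function name and the `(consumed, new_position)` return shape are assumptions made for illustration only.

```python
from typing import Optional, Tuple


def scan_up_to(text: str, pos: int, stop: str) -> Tuple[Optional[str], int]:
    """Consume characters from `pos` until `stop` is found or input ends.

    Returns (consumed, new_position). `consumed` is None, not an empty
    string, when no characters were scanned.
    """
    end = text.find(stop, pos)
    if end == -1:
        end = len(text)
    if end == pos:
        # Nothing was consumed, so the result stays None.
        return None, pos
    return text[pos:end], end
```

Returning `None` instead of an empty string lets the caller tell "nothing scanned" apart from a successful scan.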
iska 79402946ec Fix the "after frameset" insertion mode handling for "noframes" start tag and "html" end tag 2015-04-08 20:37:53 +02:00
iska e34456f269 Add parser method to adjust token's foreign attributes
https://html.spec.whatwg.org/multipage/syntax.html#adjust-foreign-attributes
2015-04-08 00:53:41 +02:00
iska d6f5434ab8 Add implementation for text-content in the HTML Element 2015-04-08 00:29:00 +02:00
iska 72efa18a22 Rename HTML Element's id attribute 2015-04-08 00:10:04 +02:00
iska f3a8c1a0ec Rename HTML Element's namespace attribute
Not exactly a "conflict free" attribute name, Element id is next
2015-04-08 00:08:00 +02:00
iska 988e175533 Fix check in tokenizer's Markup Declaration Open state 2015-04-07 23:40:47 +02:00
iska 4a041a1f11 Fix c-string in the "in-caption" end-tag handling
A c-string was passed to the equals-method of NSString which caused a bad-access
2015-04-07 01:01:15 +02:00
iska 8cbd6220c8 Fix "li" end-tag case in the "in-body" insertion mode 2015-04-07 00:59:59 +02:00
iska 99579ff5eb Add several missing "reprocess token" calls in the parser
Some tokens were just swallowed because of this.
2015-04-07 00:59:21 +02:00
iska 83ece1af93 Add method to insert comment without specifying parent node in the parser
Looks better and is less error-prone
2015-04-07 00:58:37 +02:00
iska dabf24fa1c Change block-based implementation of the "in-body" start-tag handling for "li", "dd" & "dt" to for-loop
Easier to read and comprehend
2015-04-05 18:00:20 +02:00
iska e5738cc48f Fix "in-table-row" phase's common case
Clear back to table-row context instead of table
2015-04-05 17:58:39 +02:00
iska b65d602d12 Fix "in-table-body" phase's common case 2015-04-05 17:57:31 +02:00
iska 89ff209f2d Add missing case in the "in-table" start-tag handling
Entry for "colgroup" was missing
2015-04-05 17:55:58 +02:00
iska c020e3ba6a Fix character-token case in the "in-table-text" phase 2015-04-05 13:53:55 +02:00
iska 919e7f8790 Remove wrong-entry from the "in-body" phase's start-tag-handling
This entry is a part of the end-tag-handling
2015-04-05 13:46:16 +02:00
iska c25210b812 Fix inserting-characters method when the appropriate place for insertion is not as a last-node 2015-04-05 13:44:58 +02:00
iska d38f6df2d7 Fix premature check-and-return in the tree-construction-dispatcher 2015-04-05 13:43:10 +02:00
iska 439f6aba19 Refactor blocks-based implementation to simple while-loops in the reconstruct-active-formatting-elements
Much easier to read and comprehend.
2015-04-05 13:35:31 +02:00
iska 3df247f91f Add HTML Node method to pass own child nodes to another node
This is used in the adoption-agency-algorithm to copy all text-child-nodes from the furthest-block
to the new-element
2015-04-05 13:33:46 +02:00
iska 664e555347 Add index-bounds-check to the list of active formatting elements when inserting at index
In some cases the bookmark-index in the adoption-agency-algorithm lies outside the list's bounds; in this
case the new element is to be inserted at the end of the list.
2015-04-05 13:31:59 +02:00
iska c369abbfee Add "Noah's Ark Clause", three per family, to the list of active formatting elements
When adding new elements to the list, if there are already three identical elements present, then remove
the oldest one before inserting the new one.
2015-04-05 13:30:34 +02:00
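A minimal Python sketch of the "Noah's Ark" rule described above. It assumes each element is a hashable tuple of (tag name, namespace, attributes) so that equality means "same element", and that a `MARKER` sentinel separates scopes in the list as in the HTML specification. All names are hypothetical; the project's implementation is in Objective-C.

```python
MARKER = object()  # sentinel standing in for a marker entry


def push_formatting_element(active: list, element: tuple) -> None:
    """Append `element`, keeping at most three equal entries after the last marker."""
    matches = []
    # Walk backwards from the end of the list until the last marker.
    for i in range(len(active) - 1, -1, -1):
        if active[i] is MARKER:
            break
        if active[i] == element:
            matches.append(i)
    if len(matches) >= 3:
        # `matches` was collected back to front, so the last index
        # recorded is the earliest (oldest) matching entry.
        del active[matches[-1]]
    active.append(element)
```

Only entries after the last marker are counted, so an equal element sitting before a marker is left in place.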
iska a62ccfdc51 Fix loop in the "foreign content" phase's end-tag handling 2015-04-05 13:28:31 +02:00
iska caba3efd95 Fix "in-row" insertion mode and add the corresponding logic to clear the stack back to row context 2015-04-05 13:27:45 +02:00
iska b73be50e2b Add nil-checks for popping-methods in stack of open elements 2015-04-05 13:26:39 +02:00
iska 31178b32cd Add debug-description methods for HTML Node and Parser's structures
Stack of open elements & list of active formatting elements dump the underlying array's
description. The HTML Node prints out the tree description as object's info and the
outer-html for Xcode debugger's quick-look.
Tremendous help when debugging.
2015-04-04 22:58:41 +02:00
iska d1a4aeb5e1 Add HTML Parser methods for parsing a document and document fragments
The parser is initialized with an HTML string which can be parsed as a document or a document
fragment. When parsing a fragment a context element should be provided.
It is also possible to parse the same string as a fragment for different context elements. In this case
the parser resets its internal state and runs the parsing algorithm again. Parsing a fragment for the
same context element runs the algorithm only once, since the parser caches the context element
and the parsing results.
2015-04-04 02:30:08 +02:00
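The caching behaviour described above could be sketched roughly as follows. This is a Python illustration, not the project's API: the class name `FragmentParser`, the `parse_fn` callback standing in for the real parsing algorithm, and the `_UNSET` sentinel are all hypothetical.

```python
_UNSET = object()  # distinguishes "never parsed" from a None context


class FragmentParser:
    """Caches the last context element together with its parse result."""

    def __init__(self, html, parse_fn):
        self.html = html
        self._parse_fn = parse_fn  # stand-in for the parsing algorithm
        self._context = _UNSET
        self._result = None

    def parse_fragment(self, context):
        # Re-run the algorithm only when the context element changes.
        if self._context is _UNSET or self._context != context:
            self._context = context
            self._result = self._parse_fn(self.html, context)
        return self._result
```

Calling `parse_fragment` twice with the same context runs `parse_fn` once; a different context discards the cached result and parses again.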
iska 5a140220b1 Fix HTML Node parent-related property references to "weak"
A node shouldn't have a strong reference to its parent or owner document
2015-04-04 02:17:42 +02:00
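The ownership rule above (children are held strongly, the parent only weakly) can be illustrated with Python's `weakref` module. The `Node` class here is a hypothetical stand-in, not the project's HTML Node class.

```python
import weakref


class Node:
    """Tree node that owns its children but only weakly refers to its parent."""

    def __init__(self, name):
        self.name = name
        self.children = []    # strong references: a parent owns its children
        self._parent = None   # weak reference, set in append_child

    @property
    def parent(self):
        # Dereference the weak reference; yields None once the parent is gone.
        return self._parent() if self._parent is not None else None

    def append_child(self, child):
        child._parent = weakref.ref(self)
        self.children.append(child)
```

Because the back-reference is weak, a parent and child never keep each other alive in a reference cycle.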
iska 418913991d Fix HTML Node owner-document getter method 2015-04-04 02:16:15 +02:00