Commit Graph

  • 477af1f4ab Change parse error message to 0x format instead of U+ iska 2014-11-01 22:06:01 +01:00
  • 97df7c5dd8 Fix Attribute Name state to prevent finalising a pending attribute name before it's finished iska 2014-11-01 22:05:22 +01:00
  • e8c74da8e2 Fix Bogus Comment state for nil-characters iska 2014-11-01 22:04:23 +01:00
  • 244021072c Remove erroneous parse-error emit in Comment End state iska 2014-11-01 22:03:34 +01:00
  • 0835e3d120 Add a "reconsume" method for current character to avoid scanning the Stream repeatedly iska 2014-11-01 22:01:44 +01:00
  • 622e22737b Fix emitted Token in Before Attribute Name state iska 2014-11-01 15:49:42 +01:00
  • 8fd6e89854 Fix scan location after reading Named Entity iska 2014-11-01 15:19:46 +01:00
  • 58220a464c Add missing break statement to prevent fall-through in End Tag Open state iska 2014-11-01 15:19:20 +01:00
  • b3ee713e9b Fix DOCTYPE Token's public and system identifier initialisation issue iska 2014-11-01 15:12:58 +01:00
  • a23e1a13c9 Fix surrogate pair handling and remove Input Stream's separate error-reporting class iska 2014-10-31 22:32:34 +01:00
  • bc8abb116f Fix UTF32Char to String conversion for invalid unicode characters iska 2014-10-31 21:46:54 +01:00
  • a7892a2edf Cleanup some parse error messages in Tokenizer for overall consistency iska 2014-10-31 18:26:34 +01:00
  • 2fa0e793e0 Replace dictionary with two arrays for Named Entity replacement iska 2014-10-31 18:25:35 +01:00
  • c04d4a35f3 Set performance baseline for tokenizing step iska 2014-10-31 18:07:18 +01:00
  • aaec5971b8 Add performance test for tokenizing step iska 2014-10-31 18:06:59 +01:00
  • bc3a7165ec Fix logic for Named Entity replacement iska 2014-10-31 18:06:15 +01:00
  • 8a2422426e Fix several state switches for Attribute states iska 2014-10-31 00:11:31 +01:00
  • d5ff28b1de Fix parsing numeric entities iska 2014-10-31 00:10:41 +01:00
  • d45ebb611e Fix state switch in Attribute Name state iska 2014-10-30 21:28:08 +01:00
  • a28580b258 Fix initialisation of HTML5 Lib tests iska 2014-10-30 21:18:56 +01:00
  • e2e4940100 Add implementation for finalising the current attribute of the current Tag Token iska 2014-10-30 21:18:15 +01:00
  • 1e4fe4ae45 Fix state switch in Tag Name state iska 2014-10-30 21:17:34 +01:00
  • 44fb192e22 Add helper methods for appending to attribute name/value iska 2014-10-30 21:17:00 +01:00
  • e896c1fe0b Fix initial values for HTML Tokens and handle nil-parameter iska 2014-10-30 21:15:12 +01:00
  • dd95a639c3 Add nil checks for DOCTYPE identifiers when initing HTML5 Lib tests iska 2014-10-26 23:43:52 +01:00
  • 9d2664fce0 Fix regex match range index in Process Double Escaped method iska 2014-10-26 23:43:29 +01:00
  • f2993181a8 Change method names in HTML5 Lib test class iska 2014-10-26 23:42:17 +01:00
  • a00c3a27d7 Fix several bugs in Named Entity Character Reference method iska 2014-10-26 23:21:07 +01:00
  • 8509cb1955 Remove faulty statement in Char Ref Attribute Value state iska 2014-10-26 22:49:15 +01:00
  • f04abf6236 Fix numeric entity replacement bug where wrong variable and valid-range were used iska 2014-10-26 22:29:23 +01:00
  • d5a00713ad Fix equality methods in HTML token classes for the nil-cases iska 2014-10-26 22:27:58 +01:00
  • 6cfb6cd65f Fix tokenizer for Bogus Comment state iska 2014-10-26 19:03:23 +01:00
  • e7126a720f Add category to overwrite isEqual method in Parse Error tokens for testing iska 2014-10-26 18:12:25 +01:00
  • e43d7c21d1 Change tokenizing to concatenate all adjacent character tokens into one iska 2014-10-26 18:11:21 +01:00
  • a3358dd8b0 Fix HTML5 Lib test to correctly handle ParseError token iska 2014-10-26 18:10:32 +01:00
  • e73a8dc512 Revert "Adapt HTML5 Lib test class to break output into multiple character tokens" iska 2014-10-26 18:02:47 +01:00
  • f45c9d33a6 Fix bug in named entity method iska 2014-10-26 16:37:48 +01:00
  • 971f7b3cf0 Add methods to access tokens in the Tokenizer class iska 2014-10-26 02:03:37 +02:00
  • 79da32ed72 Adapt HTML5 Lib test class to break output into multiple character tokens iska 2014-10-26 02:02:58 +02:00
  • 7d9b66ff8d Add "HTML Standarad" html file to tests resource for benchmarking iska 2014-10-26 02:01:48 +02:00
  • 8088d239df Add common test case class iska 2014-10-26 01:58:25 +02:00
  • f96c69c6f7 Remove fast enumeration protocol and its method implementation from Tokenizer class iska 2014-10-26 01:55:12 +02:00
  • 141310f910 Add equality and hash method for HTML Token classes iska 2014-10-26 01:54:30 +02:00
  • 65bd4c15f1 Remove EOF token and replace it with a boolean iska 2014-10-26 01:53:46 +02:00
  • b7039aba1d Add html5lib tests folder to tests target supporting files iska 2014-10-26 01:52:05 +02:00
  • 5c28638be6 Add html5lib-tests as git submodule iska 2014-10-25 17:37:47 +02:00
  • 0910290ef2 Add generic HTML5LibTest class for performing HTML5lib tests iska 2014-10-25 17:35:58 +02:00
  • 779fcf0a73 Improve parse error reason messages in Tokenizer class iska 2014-10-25 01:07:51 +02:00
  • 0a92fb8eb4 Add description methods for HTML tokens iska 2014-10-25 01:07:23 +02:00
  • d4401c0f17 Remove UTF32Char option for character tokens and use strings only iska 2014-10-25 01:06:55 +02:00
  • d119a78273 Add stream location to Parse Error tokens for more comprehensive reporting iska 2014-10-24 01:08:06 +02:00
  • 7c4e39af07 Add current location property to Input Stream reader iska 2014-10-24 01:07:07 +02:00
  • 9e198c9766 Add init method to Tokenizer header and extend to accept string parameter iska 2014-10-24 01:06:21 +02:00
  • 5d3ea61a04 Refactor HTML Tokens into separate files iska 2014-10-23 23:52:43 +02:00
  • 98357f9d4b Add new virtual project group for Tokenizing implementation iska 2014-10-23 20:14:20 +02:00
  • 6daac059bd Add project file with entries for Element class iska 2014-10-23 20:13:01 +02:00
  • d2507dcd5b Use little endian UTF-32 encoding for char-string conversion iska 2014-10-23 20:08:02 +02:00
  • 968a6aff79 Implement named character entity replacement iska 2014-10-23 20:07:30 +02:00
  • e88f0ae3e3 Add class with Named Character Reference dictionary iska 2014-10-15 22:41:54 +02:00
  • 5597323fbf Add a Numeric Entity Replacement table to replace the switch-statement for windows-1252 trick iska 2014-10-12 18:55:24 +02:00
  • 7489adc285 Fix class cast when emitting character token in HTML Tokenizer iska 2014-10-05 19:17:01 +02:00
  • 29f20923bc Add handling of tag token emit errors in HTML Tokenizer iska 2014-10-05 19:16:37 +02:00
  • 680a84f389 Fix typo in HTML Input Stream error domain iska 2014-10-05 19:15:44 +02:00
  • b0391dc7ba Add fast enumeration for HTML Tokens in the Tokenizer class iska 2014-10-05 15:48:01 +02:00
  • 22131d5ef1 Add class stub for HTML Element iska 2014-10-05 15:35:41 +02:00
  • 1043b5dabd Add header with HTML Insertion Modes enum iska 2014-10-05 15:30:39 +02:00
  • 17f2f32350 Add implementation for HTML Tokenization iska 2014-10-04 23:30:32 +02:00
  • c5b60e2977 Add method to consume characters up-to a given string in Input Stream Reader iska 2014-10-04 23:28:00 +02:00
  • 721839f54a Add class stub for HTML Parser iska 2014-10-04 21:58:25 +02:00
  • 0519ff597d Rename method to "emit" prefix instead of "report" in Input Stream errors for naming consistency iska 2014-10-04 21:57:23 +02:00
  • 19ae5b0516 Use explicit values for current-consumed characters in Input Stream Reader class iska 2014-10-04 21:56:22 +02:00
  • 26e7ba35d0 Fix whitespace and indentation in several classes iska 2014-10-04 21:55:27 +02:00
  • adabe9c3f3 Add methods to consume given strings at the current location in Input Stream Reader iska 2014-10-04 21:54:15 +02:00
  • bfd9c9df97 Add implementation to consume a number reference iska 2014-09-22 00:31:44 +02:00
  • ebcbbbf693 Add initial implementation for "consuming a character reference" iska 2014-09-22 00:20:06 +02:00
  • b74c67498b Add methods to emit HTML tokens in the Tokenizer class iska 2014-09-22 00:19:18 +02:00
  • 5724912986 Add central definitions for characters used throughout HTML parsing iska 2014-09-22 00:18:27 +02:00
  • 12bcac50bc Add methods to consume hex and decimal numbers in Stream Reader iska 2014-09-22 00:17:11 +02:00
  • f7d7d987f3 Add method stubs for tokenizer states iska 2014-09-20 23:08:34 +02:00
  • 0400fb74aa Add HTML token definitions iska 2014-09-20 23:06:48 +02:00
  • 5cc13d0b2a Add methods to mark location and rewind-to-mark in HTML stream reader iska 2014-09-20 23:05:55 +02:00
  • 28623d951c Define HTML tokenization states iska 2014-09-20 23:04:43 +02:00
  • fb01febfca Add class stubs for HTML tokenizer implementation iska 2014-09-20 23:04:23 +02:00
  • fb9dcaa608 Add several checks and error reporting while processing the input stream iska 2014-09-15 23:24:16 +02:00
  • 1fc1f1c71c Add initial implementation for a HTML Input Stream processor iska 2014-09-15 23:03:33 +02:00
  • b220b7d561 Add static library target for iOS build iska 2014-09-15 22:32:32 +02:00
  • d3be396ffc Add Xcode framework project for HTMLKit iska 2014-09-15 22:26:03 +02:00
  • ecf9085b31 Initial Commit iska 2014-09-15 22:25:25 +02:00