Anne van Kesteren

Project: HTML Parser fully functional

Thanks to the work of both James Graham and myself, the parser from html5lib now works and passes all test cases provided by Google, including some extra tests we contributed ourselves. Feel free to contribute new tests that help us stabilize the parser and tokenizer. (The parser, tokenizer, tests, and everything else that is part of the project are available under the MIT license.)


  1. I guess I should get things moving on the Ruby version soon. :D

    Posted by ryan king at

  2. Starting with revision 321, I'm seeing 13 errors.

    Question: is it expected that people run tests before checkin? Or should people assume that the code is unstable?

    Posted by Sam Ruby at

  3. It is expected that people run tests, and I’m surprised you see failures. I don’t see any. Care to join #whatwg and explain?

    Posted by Anne van Kesteren at

  4. I left this in #whatwg, but will repeat it here:

    python tests/
    Traceback (most recent call last):
      File "/home/rubys/svn/html5lib/tests/", line 105, in <lambda>
        testFunc = lambda self, method=func, input=input, output=output: \
      File "/home/rubys/svn/html5lib/tests/", line 86, in runTokenizerTest
        tokens = parser.parse(input)
      File "/home/rubys/svn/html5lib/tests/", line 23, in parse
        getattr(self, 'process%s' % token.__class__.__name__)(token)
    AttributeError: 'TokenizerTestParser' object has no attribute 'processCharacters'

    Posted by Sam Ruby at

  5. For what it’s worth, that bug was fixed yesterday by Sam Ruby himself. Thanks Sam!

    Posted by Anne van Kesteren at
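
The traceback in comment 4 comes from a test parser that picks its handler by name: it builds `process<TokenClassName>` and looks it up with `getattr`, so any token class without a matching method fails with `AttributeError`. The sketch below reproduces that failure mode in isolation. The classes `Characters`, `Comment`, and `SketchTestParser` and the helper `try_parse` are hypothetical stand-ins made up for illustration, not html5lib's actual code.

```python
# Hypothetical stand-ins for illustration; not html5lib's actual classes.
class Characters:
    def __init__(self, data):
        self.data = data


class Comment:
    def __init__(self, data):
        self.data = data


class SketchTestParser:
    """Collects tokens by dispatching on each token's class name."""

    def __init__(self):
        self.output = []

    def parse(self, tokens):
        for token in tokens:
            # Same lookup as in the traceback: derive the handler name
            # from the token's class name, fetch it, and call it.
            handler = getattr(self, "process%s" % token.__class__.__name__)
            handler(token)
        return self.output

    def processComment(self, token):
        self.output.append(["Comment", token.data])

    # No processCharacters method is defined here, so a Characters
    # token makes the getattr call above raise AttributeError.


def try_parse(tokens):
    """Return the parsed output, or the AttributeError message on failure."""
    try:
        return SketchTestParser().parse(tokens)
    except AttributeError as exc:
        return str(exc)
```

Calling `try_parse([Comment("x")])` succeeds, while `try_parse([Characters("a")])` returns an error message naming `processCharacters`. Under this pattern, adding a new token class means adding a matching `process...` method to every consumer, which is presumably how a change elsewhere left the test parser behind.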