Anne van Kesteren

Project: HTML Parser fully functional

22 December 2006

Thanks to work by both James Graham and me, parser.py from html5lib now works and passes all the test cases provided by Google, along with some extra tests we contributed ourselves. Feel free to contribute new tests that help us stabilize the parser and tokenizer. (The parser, tokenizer, tests, and everything else that is part of the project are available under the MIT license.)

Comments

I guess I should get things moving on the Ruby version soon. :D
Posted by ryan king at 11:21AM
Starting with revision 321, I'm seeing 13 errors.
Question: is it expected that people run tests before checkin? Or should people assume that the code is unstable?
Posted by Sam Ruby at 12:20AM
It is expected that people run tests, and I’m surprised you see failures. I don’t see any. Care to join #whatwg and explain?
Posted by Anne van Kesteren at 12:36AM

I left this in #whatwg, but will repeat it here:

python tests/runtests.py
Traceback (most recent call last):
  File "/home/rubys/svn/html5lib/tests/test_tokenizer.py", line 105, in <lambda>
    testFunc = lambda self, method=func, input=input, output=output: \
  File "/home/rubys/svn/html5lib/tests/test_tokenizer.py", line 86, in runTokenizerTest
    tokens = parser.parse(input)
  File "/home/rubys/svn/html5lib/tests/test_tokenizer.py", line 23, in parse
    getattr(self, 'process%s' % token.__class__.__name__)(token)
AttributeError: 'TokenizerTestParser' object has no attribute 'processCharacters'

Posted by Sam Ruby at 3:44AM

For what it’s worth, that bug was fixed yesterday by Sam Ruby himself. Thanks, Sam!
Posted by Anne van Kesteren at 8:54PM