python-html5lib | HTML parser/tokenizer based on the WHATWG HTML5 specification
html5lib is a pure-Python library for parsing HTML. It is designed to conform to the HTML5 specification, which formalizes the error-handling algorithms used by popular web browsers.
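As a rough illustration of that error handling, the sketch below feeds html5lib a fragment with unclosed `<p>` tags and lists the elements it recovers. It assumes the `html5lib` package is installed; the helper name `parse_malformed_html` is hypothetical and not part of the library.

```python
import html5lib


def parse_malformed_html(markup):
    """Parse possibly malformed HTML; return the root element and its tag names."""
    # namespaceHTMLElements=False keeps tag names plain ("p" instead of
    # "{http://www.w3.org/1999/xhtml}p"), which is easier to read in a demo.
    root = html5lib.parse(markup, treebuilder="etree", namespaceHTMLElements=False)
    # Comment nodes have a non-string tag in ElementTree, so they are skipped.
    tags = [el.tag for el in root.iter() if isinstance(el.tag, str)]
    return root, tags


# Neither <p> is closed, and there is no <html>, <head>, or <body>.
root, tags = parse_malformed_html("<p>first<p>second")
print(tags)
print([p.text for p in root.iter("p")])
```

The parser supplies the missing `html`, `head`, and `body` elements and treats the second `<p>` as implicitly closing the first, so the two paragraphs end up as siblings, as they would in a browser.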