pytidylib 0.3.2

Creator: railscoderz

Description:

`PyTidyLib`_ is a Python package that wraps the `HTML Tidy`_ library. This
allows you, from Python code, to "fix" invalid (X)HTML markup. Some of the
library's many capabilities include:

* Clean up unclosed tags and unescaped characters such as ampersands
* Output HTML 4 or XHTML, strict or transitional, and add missing doctypes
* Convert named entities to numeric entities, which can then be used in XML
  documents without an HTML doctype.
* Clean up HTML from programs such as Word (to an extent)
* Indent the output, including proper (i.e. no) indenting for ``pre``
  elements, which some (X)HTML indenting code overlooks.

Changes
=======

* 0.3.2: Initialization bug fix
* 0.3.1: find_library support while still allowing a list of library names
* 0.3.0: Refactored to use Tidy and PersistentTidy classes while keeping the
  functional interface (which will lazily create a global Tidy() object) for
  backward compatibility. You can now pass a list of library names and base
  options when instantiating Tidy. The keep_doc argument is now deprecated
  and does nothing; use PersistentTidy instead.
* 0.2.4: Bugfix for a strange memory allocation corner case in Tidy.
* 0.2.3: Python 3 support (2 + 3 cross compatible) with passing Tox tests.

Small example of use
====================

The following code cleans up an invalid HTML document and sets an option::

    from tidylib import tidy_document
    document, errors = tidy_document('''<p>f&otilde;o <img src="bar.jpg">''',
        options={'numeric-entities':1})
    print(document)
    print(errors)

Docs
====

Documentation is shipped with the source distribution and is available at
the `PyTidyLib`_ web page.

.. _`HTML Tidy`: http://tidy.sourceforge.net/
.. _`PyTidyLib`: http://countergram.com/open-source/pytidylib/
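To make the named-to-numeric entity conversion mentioned among the capabilities more concrete, here is a minimal sketch of the idea using only the Python standard library. It does not call PyTidyLib or HTML Tidy and is not how either is implemented; the helper name ``named_to_numeric`` and the choice to leave the XML-predefined entities untouched are assumptions made for illustration.

```python
import re
from html.entities import name2codepoint

# Assumption for this sketch: entities that XML itself predefines are left
# in named form, since they are valid without an HTML doctype.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

_ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(markup):
    """Replace HTML named entities with decimal numeric references.

    Hypothetical helper for illustration; not part of PyTidyLib.
    """
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            # Leave XML-safe or unrecognised names exactly as written.
            return match.group(0)
        return "&#{};".format(name2codepoint[name])

    return _ENTITY_RE.sub(replace, markup)


converted = named_to_numeric("<p>f&otilde;o &amp; bar&nbsp;baz</p>")
print(converted)
```

A numeric reference such as ``&#245;`` can be read by any XML parser, whereas ``&otilde;`` is only defined by the HTML DTDs. In real use, passing the ``numeric-entities`` option to ``tidy_document`` lets HTML Tidy do this conversion as part of its cleanup.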

License

For personal and professional use. You cannot resell or redistribute these repositories in their original state.
