{"body":"breadability - another readability Python port\r\n===============================================\r\nI've tried to work with the various forks of some ancient codebase that ported\r\n`readability`_ to Python. The lack of tests, unused regexes, and commented-out\r\nsections of code in other Python ports just drove me nuts.\r\n\r\nI made an effort to bring several of the better forks into one\r\ncodebase, but they've diverged so much that I just can't work with it.\r\n\r\nSo what's any sane person to do? Re-port it with my own repo, add some tests,\r\ninfrastructure, and try to make this port better. OSS FTW (and yeah, NIH FML,\r\nbut oh well, I did try).\r\n\r\nThis is a pretty straight port of the JS here:\r\n\r\n- http://code.google.com/p/arc90labs-readability/source/browse/trunk/js/readability.js#82\r\n\r\n\r\nInstallation\r\n-------------\r\nThis depends on lxml, so you'll need some C headers installed so that pip can\r\ncompile it.\r\n\r\n::\r\n\r\n sudo apt-get install libxml2-dev libxslt-dev\r\n pip install breadability\r\n\r\n\r\nUsage\r\n------\r\n\r\ncmd line\r\n~~~~~~~~~\r\n\r\n::\r\n\r\n $ breadability http://wiki.python.org/moin/BeginnersGuide\r\n\r\nOptions\r\n``````````\r\n\r\n - b will write out the parsed content to a temp file and open it in a\r\n browser for viewing.\r\n - d will write out debug scoring statements to help track why a node was\r\n chosen as the document and why some nodes were removed from the final\r\n product.\r\n - f will override the default behaviour of getting an html fragment (