46 Commits (master)

Author SHA1 Message Date
Mišo Belica 2285531110 Add support for "lxml[html_clean]" v5.2 module 1 week ago
Mišo Belica aa83825334 Tests migrated into pytest style 6 years ago
Mišo Belica 48acf389b1 Prefer pytest over nosetest runner 6 years ago
Mišo Belica 2c123008fa Added new Python versions into TravisCI 6 years ago
Jelmer Vernooij be2da44269 Fix installation of scripts. 10 years ago
Richard Harding c1e2e529a9 Update tests to be py.test 10 years ago
Richard Harding 6d747a312a Update to 0.1.20, remove tests from build 10 years ago
Richard Harding 5e8d9b46be Update for version 0.1.19 10 years ago
Jelmer Vernooij 6f912830c0 Use chardet rather than charade.
The changes from charade have been merged into upstream chardet,
and chardet is available in Debian/Ubuntu whereas charade is not.
10 years ago
Mišo Belica e2f3391dc3 Better decoding page into unicode
- Fixes #22
- Fixes #23

Prepare for release
10 years ago
Mišo Belica 66022e2503 Updated dependencies and tests 10 years ago
Mišo Belica d40a89a683 Use nose collector for tests 10 years ago
Richard Harding 6906f3b2fa Update logging to drop WARN to INFO 10 years ago
Richard Harding ca8bee0a7b Update to 0.1.15 11 years ago
Richard Harding 1fc153d850 Rename it back. Respect others 11 years ago
Richard Harding f4fa0c1040 Working on merging/updating changelog, news, and makefile 11 years ago
Richard Harding e9485b6fdf Tests working, makefile back into play 11 years ago
Mišo Belica 81ba7aec3c Create console scripts with python version suffix 11 years ago
Mišo Belica eb8a8c5248 Replaced deprecated method 'getiterator' by 'iter' 11 years ago
Mišo Belica 5e41280f77 Updated helper for creating an article test 11 years ago
Mišo Belica 3b5b2b1522 Renamed to readability 11 years ago
Mišo Belica 272fe480a3 Updated setup.py 11 years ago
Mišo Belica c92f61fa53 Fixed docopt version 11 years ago
Mišo Belica ec88a4efe6 Use docopt as an argument parser 11 years ago
Mišo Belica b3b987440d Added test runner via nosetests 11 years ago
Mišo Belica 9f83ea973a Fixed setup.py 11 years ago
Mišo Belica 3322681166 Use 'charade' for detecting encoding 11 years ago
Mišo Belica c4dbe24a65 New repository structure 11 years ago
Richard Harding 75b3151de9 Update the unittest import to grab unittest2 for 2.6 12 years ago
Richard Harding b18589ced8 Use the right package doh 12 years ago
Richard Harding fee5c37b39 Add argparse as an install req for py <2.7 12 years ago
Nathan Nifong 920094c81a Add a penalty for double quote chars in paragraphs.
- They are far more common in random commented code and proprietary metadata
  that keeps slipping by the filter as actual content.
- Downgraded the score value of commas for the same reason.
- Prep for 0.1.10 release with these changes.

Add credits and tweak the " and , scoring

Update version and update the scoring code
12 years ago
Richard Harding 60da675da5 Reprocess without candidate in case of errors using one
- Fixes #10
12 years ago
Richard Harding 3984e04668 Add better handling around xml parsing issues
- Fixes #9 with empty/non parsable docs
- Fixes #8 and removes kwargs for the decode statements.
- Fixes #7 by checking if the node has a parent before dropping.
12 years ago
Richard Harding fe9364295f prep for 0.1.7 release 12 years ago
Richard Harding e592f5322e Prep for 0.1.6 12 years ago
Richard Harding 9cf19d9970 Prep for 0.1.5 12 years ago
Richard Harding 5157b4570d Prep for the 0.1.4 release 12 years ago
Richard Harding 5704eb4c15 Start process of adding a newtest script for generating test cases
- Adds new breadability_newtest tool for generating test cases.
- Add fixes for the scripting.com test failure.
12 years ago
Richard Harding 3b00d33ad3 Prep for 0.1.3 release 12 years ago
Richard Harding 46ede7ccfb Prep for 0.1.2 release 12 years ago
Richard Harding 90a02569ca Prep for 0.1.1 release 12 years ago
Richard Harding 5c1765a6ef Update cmd line client/interface, update doc builders
- For now we're always getting a div back from the parser
- Update the client code, not all flags are enabled, but basic passing a url
  works
12 years ago
Richard Harding 31c4439155 Start to add makefile for running life 12 years ago
Richard Harding b70dec4332 adding bits...ignore these commits for a while 12 years ago
Richard Harding 1b95af78c5 Initial bootstrap of modern package template 12 years ago