Yuri Baburov
274b60cdb1
Merge pull request #19 from EvaSDK/master
Package that provides source code
13 years ago
Gilles Dartiguelongue
ea6afd3d49
Make sure code is actually distributed
13 years ago
Richard Harding
a19e766900
Update version so we can upload new tar.gz to pypi
13 years ago
Richard Harding
b9f6f6777f
Merge branch 'master' of github.com:buriy/python-readability
13 years ago
Richard Harding
873562cfba
Update setup.py for finding the package correctly
13 years ago
Richard Harding
e9a5cbfe7f
Remove pdb dummy
13 years ago
Richard Harding
f1a79fb8f8
Update to make sure we don't drop the html tag when ditching elements
13 years ago
Richard Harding
46f0302ebc
rename the document_only flag to html_partial
13 years ago
Rick Harding
6e8a1f5ce2
Merge pull request #18 from mitechie/add_makefile
Add makefile, update .gitignore for venv potential testfile output.
13 years ago
Richard Harding
b8fc399fac
Fix rebase issue in the Makefile
13 years ago
Richard Harding
82804b664d
Update .gitignore file for venv and nosetests.
13 years ago
Richard Harding
4376eedc13
Add makefile testing, building, uploading.
- Adds a makefile with helpers
- make all will setup a virtualenv and get deps
- make test will install test deps and run nosetests
- make version_update will open the setup.py for updating version string
- make upload will build and upload sdist to pypi
13 years ago
Yuri Baburov
7338e9ef63
Added test suite to setup.py
Bump to version 0.2.4
13 years ago
Yuri Baburov
a1ae4eaf72
Merge pull request #15 from mitechie/master
New option only_document of Document.summary(), fixed issue GH-13 with "<body/>", added some docs, tests, and code quality improvements. Thanks, Rick!
13 years ago
Richard Harding
8d3e39f04e
Update readme
13 years ago
Richard Harding
a46dc14251
Try to pep8 all the things but give up when I got close.
13 years ago
Richard Harding
5a98e2c1b8
Correct appending and allow for document only
- Fix the appending of siblings to the correct nested element
- Add a document only flag so that you can get a dom tree you can nest yourself without html/body tags.
13 years ago
Richard Harding
edccec5d3b
Work on why we have an empty <body/> tag
- Seems to come because the sanitizer ends up with two nodes, not one. The first is an empty body, the second is the article div.
- Fix up the tabs so we can work with the file. Needs lots of pep8 love.
- Implement an initial hack that at least gets it working atm.
- Start to add test cases, sample html files we can test against, etc.
13 years ago
Yuri Baburov
ab783b25b7
Merge pull request #11 from JanX2/master
Fixing gap in node_length coverage (length=80 was missed)
Continue early in remove_unlikely_candidates() in case there is neither a class nor an id attribute.
Adding comment about oversight in transform_misused_divs_into_paragraphs
13 years ago
Jan Weiß
3cdc3d67af
Adding comment about oversight in transform_misused_divs_into_paragraphs().
13 years ago
Jan Weiß
960f885edf
Continue early in remove_unlikely_candidates() in case there is neither a class nor an id attribute.
13 years ago
Jan Weiß
6b3961cd30
Fixing gap in node_length coverage.
13 years ago
Yuri Baburov
f9b604c9a8
Merge pull request #10 from facundo/master
Fix: Document.score_paragraphs should use ._html() not .html in case it's called outside the .summary() method.
Thanks to facundo.
13 years ago
facundo
bb93ae1e5f
fixed a small issue on the Document score_paragraphs method
13 years ago
Yuri Baburov
fc6a500298
Merge pull request #9 from Psycojoker/master
Add lxml to the dependencies list in the setup.py
Please note that lxml sometimes can't be built from sources, lots of people use binary distributions, which setup.py/pip can't handle properly!
13 years ago
Laurent Peuch
1583d8a794
add missing lxml dependency
13 years ago
Yuri Baburov
11c4d95411
Fixed indentation, encoding issue and README bug. Thanks to Greg Jastrab. Bump version to 0.2.3
13 years ago
Yuri Baburov
6bf4948e69
More README fixes for pypi and github. Bump to version 0.2.2
13 years ago
Yuri Baburov
f189ab905d
Fixed README for pypi.
13 years ago
Yuri Baburov
61715dca0a
Bump to version 0.2
13 years ago
Yuri Baburov
21906f1c44
Better setup.py, now we're "readability-lxml" in pypi. Thanks to Jerry Charumilind.
13 years ago
Yuri Baburov
c2ec1d1c38
Sorted out unicode issues, thanks to Lee Semel.
13 years ago
Yuri Baburov
45781a600f
Added command-line usage
13 years ago
Yuri Baburov
97ba2a0369
Debug utilities.
13 years ago
Lee Semel
f3d0a8d842
Allow passing unicode objects
13 years ago
Jerry Charumilind
ad38fac40a
Add chardet to installation requirements
13 years ago
Jerry Charumilind
8c1adc5141
Expose Document in readability package
13 years ago
Jerry Charumilind
bae87079e9
Change to automatically find packages
13 years ago
Jerry Charumilind
5bf5192d03
Add version number to track changes more easily
13 years ago
Yuri Baburov
7a1e063c22
Updated setup.py to my fork, changed package name to lxml-readability
13 years ago
Yuri Baburov
43c34bacc1
Renamed encodings to encoding to avoid conflicts with system module.
14 years ago
Yuri Baburov
096d4db6ce
Added usage
14 years ago
Yuri Baburov
f55f16baa1
Updated scoring algorithm to match readability.js v1.7.1
14 years ago
Yuri Baburov
96f476181c
Improved title shortener method, and added it to the Document class.
14 years ago
Yuri Baburov
f925e3ef05
Corrected README
14 years ago
Yuri Baburov
dada82099b
Moved to lxml (based on decruft version); better encoding recognition.
14 years ago
gfxmonk
b5639a0822
well that was quick; first fork added
14 years ago
gfxmonk
324e280e16
added note to readme to make it clear that I'm not actively working on this library
14 years ago
Tim Cuthbertson
7ebbcc03d2
made setup.py executable
14 years ago
Sean Brant
a5d47a1129
added setup.py
14 years ago