Commit Graph

99 Commits

Author SHA1 Message Date
Martin Thurau
046d2c10c3 Fixes regex declaration in get_encoding.
Since get_encoding() is only called when the input is *not* already unicode, we need to declare the regexes as byte type so they continue to work in Python 3.
2015-04-29 23:36:50 +02:00
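A minimal sketch of the bytes-pattern point made in the commit above. The pattern and the function name `declared_encodings` are hypothetical illustrations, not the project's actual code; only the rule that Python 3 refuses to apply a str pattern to bytes input is taken as given.

```python
import re

# Hypothetical pattern for illustration. The b prefix makes it a bytes
# pattern, so it can be applied to raw, undecoded markup on Python 3.
CHARSET_RE = re.compile(br'<meta[^>]+charset=["\']?([\w-]+)', re.I)


def declared_encodings(page):
    """Return the charset names declared in raw (bytes) markup."""
    # Matches on bytes input are bytes as well, so decode them before
    # using them as codec names.
    return [name.decode('ascii').lower() for name in CHARSET_RE.findall(page)]
```

On Python 3, applying a str pattern to the same bytes input raises a TypeError, which is the failure the commit describes avoiding.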
Martin Thurau
ce7ca26835 Adds compatibility raise_with_traceback method to support different raise syntax
Unfortunately the Python 2 `raise` syntax is not supported in Python 3.3 and not all 3.4.x versions so we deal with that by using conditional imports and a compatibility layer.
2015-04-29 23:35:18 +02:00
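A sketch of what the Python 3 half of such a compatibility layer could look like. Only the name `raise_with_traceback` comes from the commit message; the signature, the `ParseError` class and the `parse` helper are assumptions made for illustration. The Python 2 counterpart would use the three-argument `raise` statement, which Python 3 cannot even parse, so it would have to live in a separate module that is imported conditionally.

```python
import sys


def raise_with_traceback(exc_type, traceback, *args, **kwargs):
    """Raise a new exc_type(*args, **kwargs) carrying an existing traceback.

    Python 3 form only; the signature is an assumption for this sketch.
    """
    raise exc_type(*args, **kwargs).with_traceback(traceback)


class ParseError(Exception):
    """Hypothetical error type used only in this example."""


def parse(text):
    try:
        return int(text)
    except ValueError:
        # Re-raise as ParseError while keeping the original traceback.
        raise_with_traceback(ParseError, sys.exc_info()[2],
                             "cannot parse %r" % text)
```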
Martin Thurau
3ac56329e2 Corrects some things where 2to3 did too much. 2015-04-29 19:33:43 +02:00
Martin Thurau
aa4132f57a Adds Python 3.4 support.
Code now supports Python 2.6, 2.7 and 3.4. Python 3.3 isn't supported
because of some issues with the parser and the difference between old and
new `raise` syntax.
2015-04-29 16:18:21 +02:00
Martin Thurau
13cca1dd19 Adds tox configuration.
Adds tox.ini to support running the tests on multiple versions. Adds
requirements.txt to support dependency installation via pip.
2015-04-29 16:16:46 +02:00
Yuri Baburov
1d4ee9d421 Releasing as version 0.5 2015-04-27 16:00:08 +06:00
Yuri Baburov
987570bef0 Updated package links for Python 2.7 and Python 3 support 2015-04-27 15:59:31 +06:00
Yuri Baburov
dc648e7d0b Added a test for issue #48 but can't reproduce it -- seems to work fine. 2015-04-27 15:59:18 +06:00
Yuri Baburov
c715426584 Releasing as version 0.4 2015-04-27 14:54:13 +06:00
Yuri Baburov
1fac7e685a Added a feature to allow more images per article (with a test) 2015-04-27 14:35:00 +06:00
Yuri Baburov
c6796195a7 Fixed makefile testing. 2015-04-27 14:32:40 +06:00
Miguel Galves
d04d41b749 Insert text inside iframe for correct output 2015-04-27 14:05:31 +06:00
Miguel Galves
be2a1c4646 Let width and height attributes 2015-04-27 13:52:25 +06:00
Miguel Galves
f1759c1404 Allows iframes containing youtube or vimeo videos. People like them 2015-04-27 13:52:01 +06:00
Yuri Baburov
332ad810de Bumped to 0.3.0.6 2015-03-16 21:38:17 +05:00
Yuri Baburov
e4bcbe57d7 Fixes #53 2015-03-16 22:19:36 +06:00
Yuri Baburov
aeb4f4c782 Merge pull request #59 from seomoz/mac_10_10
Fix mac version comparison in setup.py for 10.10
2015-01-13 17:41:30 +05:00
Matthew Peters
c8c2f8809c Fix mac version comparison in setup.py for 10.10 2015-01-12 22:19:09 -08:00
Yuri Baburov
2d4cfdb2c8 Merge pull request #56 from nathanathan/patch-1
Defaulting to utf-8 when chardet returns None
2014-12-20 02:11:53 +05:00
Nathan Breit
75e2e0cb3a Defaulting to utf-8 when chardet returns None
On articles like this one chardet returns None:
http://news.zing.vn/nhip-song-tre/thay-giao-gay-sot-tung-bo-luat-tinh-yeu/a291427.html
This causes exceptions later on when encoding.lower() is called
2014-12-18 18:48:22 -08:00
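The fallback described in the commit above can be sketched as follows. `chardet.detect()` returns a dict whose `'encoding'` entry may be `None`; to keep the example free of dependencies, the function below takes such a dict as its argument. The function name and the `default` parameter are hypothetical.

```python
def normalize_detected_encoding(detection, default='utf-8'):
    """Pick an encoding name from a chardet-style result dict.

    Falls back to `default` when detection found nothing, so the
    later .lower() call cannot fail on None.
    """
    encoding = detection.get('encoding') or default
    return encoding.lower()
```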
Yuri Baburov
0c2f29ed0d Version bump. 2014-09-22 15:32:46 +07:00
Yuri Baburov
638f73f6a2 Fix for #52: <input type="hidden"> are not counted any more for "form removal" heuristic. 2014-09-22 15:31:31 +07:00
Yuri Baburov
2fab5ffa6b Merge pull request #48 from mperdomo1/master
Added code to check declared encodings first
2014-05-18 15:27:06 +07:00
Mark Perdomo
3a43a3fe7e Added code to check declared encodings first and check them
from kennethreitz/requests/utils.py.  Also I added some superset
encodings I have found in Chinese pages that are mishandled by
chardet/character declarations.
2014-05-13 15:09:47 +08:00
Yuri Baburov
1a4d3697bc Allow latest lxml on Mac OS X 10.9, see issue #39 for comments and setup instructions 2014-04-02 15:16:19 +07:00
Yuri Baburov
d8595b7103 Quickfix for #41 2013-10-10 13:47:58 +07:00
Yuri Baburov
318f25c577 Minor fix in encoding guessing. Claiming it v0.3.0.1 2013-10-10 02:57:53 +07:00
Yuri Baburov
08658d1d31 Released v 0.3, and uploaded to the pypi. 2013-10-10 02:39:37 +07:00
Yuri Baburov
4e3192f5ab Merge pull request #29 from hush-hush/master
Make lxml clean tree available for user modifications
2012-10-17 00:28:02 -07:00
hush-hush
e2e78e4d55 Make lxml clean tree available for user modifications. 2012-09-17 13:54:08 +02:00
Yuri Baburov
c923995606 Merge pull request #27 from sunlightlabs/master
Simple guard for empty title elements. Thanks, dvogel!
2012-08-29 20:11:04 -07:00
Drew Vogel
fdba8d9e11 Added check on title.text to avoid a TypeError on None. 2012-08-28 13:57:52 -04:00
Yuri Baburov
9cd5fb6226 Bump to 0.2.6.1 2012-07-17 19:36:49 +07:00
Yuri Baburov
44915518d3 Merge pull request #24 from zacharydenton/master
Fix issue 22: all titles were blank.
2012-07-17 05:32:23 -07:00
Zach Denton
0843d9cdf2 Explicitly check if title is None. fixes #22
This fixes #22 which caused all titles to be blank.
2012-07-09 05:22:23 -04:00
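A small sketch of the explicit `None` checks that this commit and the other title-related commits in this list describe. It uses the standard library's ElementTree rather than lxml so that it runs without extra dependencies; the function name `get_title` and the fallback string are assumptions. The comparison with `None` matters because an element's truth value has historically reflected its number of children, and a title element normally has none.

```python
import xml.etree.ElementTree as ET


def get_title(doc, fallback='[no-title]'):
    """Return the stripped <title> text, or `fallback` if it is unusable."""
    title = doc.find('.//title')
    # Compare with None explicitly: a childless element has len() == 0,
    # so a plain truth test is not a reliable existence check.
    if title is None or title.text is None or not title.text.strip():
        return fallback
    return title.text.strip()
```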
Yuri Baburov
8aefc6175f Updated README with 0.2.6 changes. 2012-06-21 22:00:47 +07:00
Yuri Baburov
20d5f3a73a Bump to 0.2.6 2012-06-21 21:42:57 +07:00
Yuri Baburov
2e49e34e11 Merge pull request #20 from andreypopp/master
readability.htmls: some docs do not have title elem
2012-06-07 03:10:16 -07:00
Andrey Popp
95852d5c18 readability.htmls: some docs do not have title elem 2012-06-07 14:08:09 +04:00
Yuri Baburov
274b60cdb1 Merge pull request #19 from EvaSDK/master
Package that provides source code
2012-06-04 01:09:29 -07:00
Gilles Dartiguelongue
ea6afd3d49 Make sure code is actually distributed 2012-06-02 23:11:40 +02:00
Richard Harding
a19e766900 Update version so we can upload new tar.gz to pypi 2012-04-17 13:40:25 -04:00
Richard Harding
b9f6f6777f Merge branch 'master' of github.com:buriy/python-readability 2012-04-17 13:36:00 -04:00
Richard Harding
873562cfba Update setup.py for finding the package correctly 2012-04-17 13:35:54 -04:00
Richard Harding
e9a5cbfe7f Remove pdb dummy 2012-04-17 11:33:09 -04:00
Richard Harding
f1a79fb8f8 Update to make sure we don't drop the html tag when ditching elements 2012-04-17 11:04:36 -04:00
Richard Harding
46f0302ebc rename the document_only flag to html_partial 2012-04-17 10:17:14 -04:00
Rick Harding
6e8a1f5ce2 Merge pull request #18 from mitechie/add_makefile
Add makefile, update .gitignore for venv potential testfile output.
2012-04-17 06:22:52 -07:00
Richard Harding
b8fc399fac Fix rebase issue in the Makefile 2012-04-17 09:20:23 -04:00
Richard Harding
82804b664d Update .gitignore file for venv and nosetests. 2012-04-17 08:47:04 -04:00