python-readability

Author	SHA1	Message	Date
Chris Curvey	9a31587192	fix encoding detection to use the encoding being tested	2017-02-08 18:00:11 -05:00
Yuri Baburov	24bb20c761	Added dev branch features. Bumped to version 0.6	2015-07-27 00:22:45 +06:00
Martin Thurau	386e48d29b	Fixes checking of declared encodings in get_encoding. In Python 3, .decode() on bytes requires the name of the encoding to be a str, which means we have to convert the extracted encoding before we can use it.	2015-04-30 11:47:32 +02:00
Martin Thurau	046d2c10c3	Fixes regex declaration in get_encoding. Since get_encoding() is only called when the input is not already unicode, we need to declare the regexes as bytes so they continue to work in Python 3.	2015-04-29 23:36:50 +02:00
Nathan Breit	75e2e0cb3a	Defaulting to utf-8 when chardet returns None. For articles like http://news.zing.vn/nhip-song-tre/thay-giao-gay-sot-tung-bo-luat-tinh-yeu/a291427.html chardet returns None, which causes exceptions later on when encoding.lower() is called.	2014-12-18 18:48:22 -08:00
Mark Perdomo	3a43a3fe7e	Added code to check declared encodings first, using the checks from kennethreitz/requests/utils.py. Also added some superset encodings I have found in Chinese pages that are mishandled by chardet/character declarations.	2014-05-13 15:09:47 +08:00
Yuri Baburov	11c4d95411	Fixed indentation, encoding issue and README bug. Thanks to Greg Jastrab. Bump version to 0.2.3	2011-07-27 02:05:16 +07:00
Yuri Baburov	c2ec1d1c38	Sorted out unicode issues, thanks to Lee Semel.	2011-06-30 11:51:16 +07:00
Yuri Baburov	43c34bacc1	Renamed encodings to encoding to avoid conflicts with system module.	2011-06-16 17:53:02 +07:00

9 Commits
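
The commits above describe a detection order: try encodings declared in the page first, test each candidate by decoding with it, and default to utf-8 when the detector returns None. The following is a minimal sketch of that flow, not the project's actual get_encoding(). The function name guess_encoding, the detect callback (a stand-in for a chardet-style detector), and the simplified patterns are all assumptions made for illustration.

```python
import re

# Simplified patterns, assumed for illustration. They are bytes literals
# because the page has not been decoded yet when they are applied.
DECLARED_ENCODING_PATTERNS = [
    re.compile(br'<meta.*?charset=["\']*(.+?)["\'>]', re.I),
    re.compile(br'^<\?xml.*?encoding=["\']*(.+?)["\'>]', re.I),
]


def guess_encoding(page, detect=None):
    """Return an encoding name for the undecoded bytes in `page`.

    `detect` is a hypothetical callback that takes the bytes and returns
    an encoding name or None.
    """
    for pattern in DECLARED_ENCODING_PATTERNS:
        for match in pattern.findall(page):
            # bytes.decode() needs a str encoding name in Python 3,
            # so convert the extracted bytes before using them.
            name = match.decode("ascii", "ignore").strip().lower()
            try:
                page.decode(name)  # test the candidate that was just extracted
                return name
            except (LookupError, ValueError):
                continue  # unknown encoding, or the page does not decode with it
    detected = detect(page) if detect is not None else None
    # The detector may return None; fall back to utf-8 so that a later
    # .lower() call does not fail.
    return (detected or "utf-8").lower()
```

Passing the detector in as a callback keeps the sketch free of third-party imports; a real caller would wrap whatever detection library it uses.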