Commit Graph

9 Commits

Author SHA1 Message Date
Chris Curvey
9a31587192 fix encoding detection to use the encoding being tested 2017-02-08 18:00:11 -05:00
Yuri Baburov
24bb20c761 Added dev branch features.
Bumped to version 0.6
2015-07-27 00:22:45 +06:00
Martin Thurau
386e48d29b Fixes checking of declared encodings in get_encoding.
In Python 3, .decode() on bytes requires the encoding name to be a str, so the extracted encoding has to be converted before it can be used.
2015-04-30 11:47:32 +02:00
Martin Thurau
046d2c10c3 Fixes regex declaration in get_encoding.
Since get_encoding() is only called when the input is *not* already unicode, the regexes need to be declared as bytes patterns so they continue to work in Python 3.
2015-04-29 23:36:50 +02:00
Nathan Breit
75e2e0cb3a Defaulting to utf-8 when chardet returns None
On articles like this one, chardet returns None:
http://news.zing.vn/nhip-song-tre/thay-giao-gay-sot-tung-bo-luat-tinh-yeu/a291427.html
This causes exceptions later on, when encoding.lower() is called.
2014-12-18 18:48:22 -08:00
Mark Perdomo
3a43a3fe7e Added code to check declared encodings first and check them
from kennethreitz/requests/utils.py.  Also I added some superset
encodings I have found in Chinese pages that are mishandled by
chardet/character declarations.
2014-05-13 15:09:47 +08:00
Yuri Baburov
11c4d95411 Fixed indentation, encoding issue and README bug. Thanks to Greg Jastrab. Bump version to 0.2.3 2011-07-27 02:05:16 +07:00
Yuri Baburov
c2ec1d1c38 Sorted out unicode issues, thanks to Lee Semel. 2011-06-30 11:51:16 +07:00
Yuri Baburov
43c34bacc1 Renamed encodings to encoding to avoid conflicts with system module. 2011-06-16 17:53:02 +07:00
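
Taken together, the commit messages above describe a few encoding-detection fixes: bytes regex patterns for Python 3, converting the declared encoding name to str before decoding, decoding with the encoding actually being tested, and falling back to utf-8 when the detector returns None. The following is a minimal sketch of those ideas, not the repository's actual code. The names CHARSET_RE and guess_encoding and the detect parameter are hypothetical, and the detector is passed in as a plain callable so that the sketch does not depend on chardet.

```python
import re

# Bytes pattern: the input is raw bytes, so in Python 3 the regex must be bytes too.
# (Hypothetical, simplified pattern; real pages declare charsets in more ways.)
CHARSET_RE = re.compile(rb'<meta[^>]+charset=["\']?([\w-]+)', re.I)


def guess_encoding(page, detect=None):
    """Return a lower-cased encoding name for the bytes in `page`.

    `detect` is an assumed callable that takes bytes and returns an
    encoding name or None (standing in for a chardet-style detector).
    """
    # Try declared encodings first.
    for raw_name in CHARSET_RE.findall(page):
        # bytes.decode() needs the encoding name as str, not bytes.
        name = raw_name.decode("ascii")
        try:
            # Decode with the encoding being tested, not a fixed one.
            page.decode(name)
            return name.lower()
        except (LookupError, UnicodeDecodeError):
            continue  # unknown or wrong declaration: try the next one

    # Fall back to the detector, and to utf-8 if it returns None,
    # so that .lower() is never called on None.
    detected = detect(page) if detect else None
    return (detected or "utf-8").lower()
```

Calling guess_encoding(page, detect=lambda data: None) on a page with no usable declaration returns "utf-8" instead of raising on None.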