117 Commits

Author SHA1 Message Date
Mišo Belica
4e3227521e Fewer code - fewer bugs (I hope) 2013-03-15 01:40:41 +01:00
Mišo Belica
1a5970b238 Better names and positions for variables 2013-03-15 00:52:56 +01:00
Mišo Belica
930b6ced12 Fixed transformation of leaf <div> into <p> 2013-03-15 00:48:13 +01:00
Mišo Belica
314c999730 Drop useless tags by HTML cleaner 2013-03-15 00:23:41 +01:00
Mišo Belica
272fe480a3 Updated setup.py 2013-03-15 00:10:55 +01:00
Mišo Belica
9eacbd579c Updated LICENSE, AUTHORS, README 2013-03-15 00:10:41 +01:00
Mišo Belica
18b5c9b447 Refactored file 'scoring.py' 2013-03-11 23:06:21 +01:00
Mišo Belica
dcb7c18fd5 Refactored file 'document.py'
Removed non-intuitive parts and dead code
not covered by tests. Better names for objects.
Better coverage by tests.
2013-03-11 22:10:26 +01:00
Mišo Belica
03ff0be266 Moved client script into 'breadability.scripts' 2013-03-11 21:18:04 +01:00
Mišo Belica
c92f61fa53 Fixed docopt version 2013-03-11 12:43:17 +01:00
Mišo Belica
ec88a4efe6 Use docopt as an argument parser 2013-03-11 12:37:15 +01:00
Mišo Belica
8470ef2b45 Purification of file readable.py 2013-03-09 13:15:05 +01:00
Mišo Belica
b3b987440d Added test runner via nosetests 2013-03-09 13:05:16 +01:00
Mišo Belica
2e2e906da7 Purification of document.py 2013-03-09 00:05:49 +01:00
Mišo Belica
9f0fc2d433 Purification 2013-03-08 23:48:35 +01:00
Mišo Belica
baaefeda3c Refactored computing of link density 2013-03-08 23:23:30 +01:00
Mišo Belica
3f71e1b7d4 Refactored checking of node's attribute 2013-03-08 23:19:24 +01:00
Mišo Belica
636a38d705 Refactored generating of hash ID 2013-03-08 23:06:57 +01:00
Mišo Belica
9a613317c0 Make package from tests 2013-03-08 23:05:14 +01:00
Mišo Belica
cc00976533 Replace implementation of 'cached_property'
Parameter 'ttl' isn't needed.
2013-03-08 19:29:15 +01:00
Mišo Belica
e3b6ee2fd6 Suppress warning "ResourceWarning: unclosed file" 2013-03-08 17:46:18 +01:00
Mišo Belica
c69cd4b2ba Purification 2013-03-08 17:42:01 +01:00
Mišo Belica
101950478e Simplify logging 2013-03-08 17:41:39 +01:00
Mišo Belica
81be8ccbfb Updated readme 2013-03-07 17:48:17 +01:00
Mišo Belica
9f83ea973a Fixed setup.py 2013-03-07 17:12:14 +01:00
Mišo Belica
726fe59ecd Show build status from master branch [ci skip] 2013-03-07 17:05:47 +01:00
Mišo Belica
c7299b9852 Updated makefile [ci skip] 2013-03-07 17:01:38 +01:00
Mišo Belica
671d940ded Removed branches from Travis configuration
[ci skip]
2013-03-07 16:57:41 +01:00
Mišo Belica
ea90ee5a5e Updated changelog [ci skip] 2013-03-07 16:52:50 +01:00
Mišo Belica
c89010221e Changed/renamed/added AUTHORS, CHANGELOG, LICENSE
[ci skip]
2013-03-07 16:48:54 +01:00
Mišo Belica
d31d804167 Exclude coverage file from repo 2013-03-07 15:43:56 +01:00
Mišo Belica
231d251536 Added commands test into README 2013-03-07 15:43:02 +01:00
Mišo Belica
3322681166 Use 'charade' for detecting encoding 2013-03-07 15:42:18 +01:00
Mišo Belica
544220e9a3 Replaced u"" literal with function 'to_unnicode'
Literal u"" is not supported by Python v3.2.
2013-03-07 15:13:15 +01:00
Mišo Belica
915876b675 Added Travis status image to README 2013-03-07 14:57:14 +01:00
Mišo Belica
8c79d4c04b Set white-list branches for @travisbot 2013-03-07 14:40:11 +01:00
Mišo Belica
94f6b0a84e Tests passes for both Python v2.7, v3.3 2013-03-07 14:15:10 +01:00
Mišo Belica
912bb50b76 Skip failing test that I don't know how to fix 2013-03-07 13:22:51 +01:00
Mišo Belica
c4dbe24a65 New repository structure 2013-03-07 13:14:04 +01:00
Richard Harding
75b3151de9 Update the unittest import to grab unittest2 for 2.6 2012-12-12 20:37:24 -05:00
Richard Harding
84f6a079f9 Try to adjust the travis command to test py2.6 2012-12-12 20:16:08 -05:00
Richard Harding
b18589ced8 Use the right package doh 2012-12-12 20:08:09 -05:00
Richard Harding
316c550709 Add python 2.6 to the travis ci 2012-12-12 20:00:23 -05:00
Richard Harding
fee5c37b39 Add argparse as a install req for py <2.7 2012-12-12 19:58:27 -05:00
Richard Harding
3dea2f349b Update ignore file 2012-10-29 11:00:06 +01:00
Nathan Nifong
920094c81a Add a penalty for double quote chars in paragraphs.
- They are far more common in random commented code and proprietary metadata
  that keeps slipping by the filter as actual content.
- Downgraded the score value of commas for the same reason.
- Prep for 0.1.10 release with these changes.

Add credits and tweak the " and , scoring

Update version and update the scoring code
2012-09-13 19:52:48 -04:00
Richard Harding
60da675da5 Reprocess without candidate in case of errors using one
- Fixes #10
2012-08-27 17:31:14 -04:00
Richard Harding
3984e04668 Add better handling around xml parsing issues
- Fixes #9 with empty/non parsable docs
- Fixes #8 and removes kwargs for the decode statements.
- Fixes #7 by checking if the node has a parent before dropping.
2012-08-27 15:31:28 -04:00
Richard Harding
fe9364295f prep for 0.1.7 release 2012-07-21 21:37:12 -04:00
Richard Harding
ae355e9f2f Update kwarg for older python 2012-07-21 21:36:03 -04:00