Richard Harding
bf35e3410e
Do some link filtering to drop stupid permalinks from the content.
12 years ago
Richard Harding
9cf19d9970
Prep for 0.1.5
12 years ago
Richard Harding
ff37f3169f
Add checks to links to remove really bad links from the scripting site
12 years ago
Richard Harding
5157b4570d
Prep for the 0.1.4 release
12 years ago
Richard Harding
5704eb4c15
Start process of adding a newtest script for generating test cases
...
- Adds new breadability_newtest tool for generating test cases.
- Add fixes for the scripting.com test failure.
12 years ago
Richard Harding
3b00d33ad3
Prep for 0.1.3 release
12 years ago
Richard Harding
c2f935bf51
Remove code we didn't need
12 years ago
Richard Harding
326fbfe107
Fix the processing and clean up the antipope article
12 years ago
Richard Harding
3ae64f165e
Update and merge
12 years ago
Richard Harding
edca1c74ba
Add in test files for antipope blog post
13 years ago
Richard Harding
d3c83b7255
Update scoring and tests for the antipope article
13 years ago
Richard Harding
3f70a49a22
Update to fix client, add head to the css downgrade weights
13 years ago
Richard Harding
46ede7ccfb
Prep for 0.1.2 release
13 years ago
Richard Harding
811921775c
Started to do some testing, but really not happy with it
13 years ago
Richard Harding
7c220535df
Complete upstream merge
13 years ago
Greg Jastrab
c8c53b304b
Bonus per 100 chars logic was incorrect
...
Number of characters was being mod'd by 100 instead of divided,
so a paragraph with a character length of 103 would have
incorrectly gotten 3 bonus points added to the content score.
Add Greg to credits
13 years ago
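
The fix described in c8c53b304b can be illustrated with a minimal sketch. The function names below are hypothetical and not taken from the breadability source; the sketch only contrasts the modulo and division behaviours the commit message describes, and the real scoring code may apply further rules that are not shown here.

```python
def length_bonus_buggy(char_count):
    """Hypothetical reproduction of the bug: remainder instead of quotient."""
    return char_count % 100


def length_bonus(char_count):
    """Hypothetical corrected form: one bonus point per full 100 characters."""
    return char_count // 100


# A 103-character paragraph, as in the commit message.
buggy = length_bonus_buggy(103)   # remainder of 103 / 100
fixed = length_bonus(103)         # whole hundreds in 103
```

With the modulo version a 103-character paragraph earns 3 points, while a 100-character paragraph earns none; integer division gives both exactly 1.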
Richard Harding
be77f99be1
Add doc and candidates properties to the article
13 years ago
Richard Harding
2e3f416e3b
Garden
13 years ago
Richard Harding
e83a753b82
Garden and lint
13 years ago
Richard Harding
6d380712c5
Start process of testing full candidate scoring
13 years ago
Richard Harding
ae9208374b
Add some ScoredNode tests as well
13 years ago
Richard Harding
e57f8f02ce
Adding tests for the id/css weights and link density
13 years ago
Richard Harding
90a02569ca
Prep for 0.1.1 release
13 years ago
Richard Harding
e168484126
Garden readme
13 years ago
Richard Harding
645838c66c
Update readme with ci and other important links
13 years ago
Richard Harding
1553eda145
Fix typo in travis config
13 years ago
Richard Harding
ad3685d4f4
Start to add items to get travis ci builds working
13 years ago
Richard Harding
56f29a8585
Mark true so we can start sending tests to travisci
13 years ago
Richard Harding
32350fc3a1
Create LNODE and update bugs in parsing
...
- Add concept of a LNODE logger that outputs information about scoring, node,
and generates a hash_id for the node content so we can track it.
- Add `-d` flag to the cmd line client to output the LNODE logging
- Update reading in of http content in the client to be unicode
- Wrap stdout with a unicode happy stream so we can pipe unicode to less/grep,
etc
- Add html article to the scorable tags we work with
- Make sure we drop iframe along with noscript
- Fix scoring bugs around length points
- Add the hash_id as a scored node @property
13 years ago
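
One way a content-derived `hash_id` like the one mentioned in 32350fc3a1 could work is sketched below. This is an assumption for illustration only: the choice of MD5, the 8-character truncation, and the function signature are not confirmed by the commit message.

```python
import hashlib


def hash_id(content):
    """Return a short, stable identifier derived from a node's content.

    Assumption: MD5 over the UTF-8 bytes, truncated to 8 hex characters.
    The same content always yields the same id, so a node can be tracked
    across log lines.
    """
    digest = hashlib.md5(content.encode("utf-8")).hexdigest()
    return digest[:8]


node_a = hash_id("<p>First paragraph</p>")
node_b = hash_id("<p>Second paragraph</p>")
```

Because the id depends only on the content, it can be piped through grep to follow a single node through the scoring output.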
Richard Harding
f1623fc3e3
Redo the candidate logging to help us locate the best candidate
13 years ago
Richard Harding
278d695614
Update readme for the new cmd line flags
13 years ago
Richard Harding
6b92dd2f83
Add -f and -b flags to client
...
- added a -f flag that will override only getting a <div> fragment back and
return a fully constructed document
- added a -b flag to not just parse, but write to temp file and open in a
browser, great for testing
- Updated the Article to support the fragment=False so that you can get back a
fully wrapped <html> document with a header (especially with utf-8 content
type set yay)
13 years ago
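
The behaviour described for the -f and -b flags in 6b92dd2f83 can be sketched with the standard library alone. The helper names here are hypothetical and the actual client code may differ; the sketch assumes the fragment is a string and that the full document simply wraps it with a UTF-8 content-type header.

```python
import tempfile
import webbrowser


def wrap_fragment(fragment):
    """Wrap a <div> fragment in a full HTML document with a UTF-8 header."""
    return (
        "<html><head>"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>'
        "</head><body>" + fragment + "</body></html>"
    )


def write_preview(html):
    """Write the document to a temp file and return its path."""
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".html", delete=False, encoding="utf-8"
    ) as handle:
        handle.write(html)
        return handle.name


def open_in_browser(fragment):
    """Hypothetical equivalent of -b: write to a temp file, then open it."""
    path = write_preview(wrap_fragment(fragment))
    webbrowser.open("file://" + path)
    return path
```

Keeping the write step separate from the browser call makes the file output easy to check in a test without launching a browser.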
Richard Harding
8b77675ab2
Fix up some tests since we should have run them before tagging 0.1...need to get into build server
13 years ago
Richard Harding
745598dff9
Update news file with initial release
13 years ago
Richard Harding
279788c003
Update the readme for install info
13 years ago
Richard Harding
9e6835bd92
Work on tweaking our parser algorithm to help find the right candidate: fixes #2
13 years ago
Richard Harding
b78ea49c5a
Update readme so people don't misunderstand
13 years ago
Richard Harding
454e283850
Add link to readability
13 years ago
Richard Harding
d52d99f6b0
More readme tweaks
13 years ago
Richard Harding
773361efd9
Update readme with some real content
13 years ago
Richard Harding
7d2eec8f52
Add the conditional node checking during node cleaning
13 years ago
Richard Harding
14bbe701eb
Add some more debugging to support tracing wtf we did and why
13 years ago
Richard Harding
00ba7e5164
Start to add debugging process for the library/client
13 years ago
Richard Harding
e7873d3d92
Profile and adjust for performance, add bugfix to parse out mitechie blog post
13 years ago
Richard Harding
6b16b7b21f
Start to add scoring file specific tests
13 years ago
Richard Harding
ab79d9632b
Some refactoring starts to help us org tests/code
13 years ago
Richard Harding
ccac04e567
Add some cleaning/post processing of our target
...
- Starting to look decent
- Still need to port their cleanConditionally but going to have to think on
that
- Removes spare paragraphs, does some other cleaning tweaks
13 years ago
Richard Harding
19a38a2cea
Add support for sibling detection, need to figure out how to test it well still
13 years ago
Richard Harding
4455ec226d
Fix logic in the changing of body -> div
13 years ago
Richard Harding
5c1765a6ef
Update cmd line client/interface, update doc builders
...
- For now we're always getting a div back from the parser
- Update the client code, not all flags are enabled, but basic passing a url
works
13 years ago