227 Commits (master)
 

Author SHA1 Message Date
Richard Harding d52d99f6b0 More readme tweaks 12 years ago
Richard Harding 773361efd9 Update readme with some real content 12 years ago
Richard Harding 7d2eec8f52 Add the conditional node checking during node cleaning 12 years ago
Richard Harding 14bbe701eb Add some more debugging to support tracing wtf we did and why 12 years ago
Richard Harding 00ba7e5164 Start to add debugging process for the library/client 12 years ago
Richard Harding e7873d3d92 Profile and adjust for performance, add bugfix to parse out mitechie blog post 12 years ago
Richard Harding 6b16b7b21f Start to add scoring file specific tests 12 years ago
Richard Harding ab79d9632b Some refactoring starts to help us org tests/code 12 years ago
Richard Harding ccac04e567 Add some cleaning/post processing of our target
- Starting to look decent
- Still need to port their cleanConditionally but going to have to think on that
- Removes spare paragraphs, does some other cleaning tweaks
12 years ago
Richard Harding 19a38a2cea Add support for sibling detection, need to figure out how to test it well still 12 years ago
Richard Harding 4455ec226d Fix logic in the changing of body -> div 12 years ago
Richard Harding 5c1765a6ef Update cmd line client/interface, update doc builders
- For now we're always getting a div back from the parser
- Update the client code, not all flags are enabled, but basic passing a url works
12 years ago
Richard Harding 5b3ef916ef Update to add link density scoring adjustments, prep for sibling checks 12 years ago
Richard Harding e843940549 Garden 12 years ago
Richard Harding 8e96cb7844 Update tests for scoring, returning div/html doc depending on the found content 12 years ago
Richard Harding 60ab4a96b0 Fix tests to pass again 12 years ago
Richard Harding 8f28e7c947 Add processing of content per the algorithm with some base tests 12 years ago
Richard Harding 7960264c3b Make sure we return body with our css class on it 12 years ago
Richard Harding e93a52a748 Start to add some processing for the readable content
- Add removal of style, script, etc bits in the content
12 years ago
Richard Harding 2e7fb0aa89 Rework document into its own file 12 years ago
Richard Harding ac053979a9 Add support for links, absoluting links
- Add a test that we absolute correctly
- Add a links cached attribute to get all links in the doc
12 years ago
Richard Harding 590a94345f Start to add some basic tests and layout to use for breaking down documents. 12 years ago
Richard Harding 5e95f531bc start to add some initial target test articles 12 years ago
Richard Harding 31c4439155 Start to add makefile for running life 12 years ago
Richard Harding b70dec4332 adding bits...ignore these commits for a while 12 years ago
Richard Harding 1b95af78c5 Initial bootstrap of modern package template 12 years ago
Rick Harding 84de8f5078 initial commit 12 years ago