Adam Pash
|
81ed4f00ed
|
feat: improve nymag.com extractor to grab deks from features
|
2016-09-14 13:12:40 -04:00 |
|
Adam Pash
|
3694c2d12c
|
chore: improve linter/babelrc
|
2016-09-14 10:14:19 -04:00 |
|
Adam Pash
|
c48e3485c0
|
chore: code reorganization
Squashed commit of the following:
commit 636296841d5cf5e685237fe70db7a15305d8e966
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 9 13:37:21 2016 -0400
final cleanup
commit 51f712b3074d41a1f2da91519289d4dd09719ad0
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 9 13:25:28 2016 -0400
Another big pass
commit 3860e6d872a9adb9290093fd9c8708dfcc773c28
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 9 12:49:52 2016 -0400
chore: started reorganizing
|
2016-09-09 13:44:58 -04:00 |
|
Adam Pash
|
8da2425e59
|
feat: resource fetches content from a URL and prepares for parsing
Squashed commit of the following:
commit 7ba2d2b36d175f5ccbc02f918322ea0dd44bf2c1
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 6 17:55:10 2016 -0400
feat: resource fetches content from a URL and prepares for parsing
commit 0abdfa49eed5b363169070dac6d65d0a5818c918
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 6 17:54:07 2016 -0400
fix: this was messing up double Esses ('ss', as in class => cla)
commit 9dc65a99631e3a68267a68b2b4629c4be8f61546
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 6 14:58:57 2016 -0400
fix: test suite working w/new dirs
commit 993dc33a5229bfa22ea998e3c4fe105be9d91c21
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 6 14:49:39 2016 -0400
feat: convertLazyLoadedImages puts img urls in the src
commit e7fb105443dd16d036e460ad21fbcb47191f475b
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 6 14:30:43 2016 -0400
feat: makeLinksAbsolute to fully qualify urls
commit dbd665078af854efe84bbbfe9b55acd02e1a652f
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 6 13:38:33 2016 -0400
feat: fetchResource to fetch a url and validate the response
commit 42d3937c8f0f8df693996c2edee93625f13dced7
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 6 10:25:34 2016 -0400
feat: normalizing meta tags
|
2016-09-06 17:55:45 -04:00 |
|
Adam Pash
|
93e844cdfe
|
feat: implemented extractBestNode functionality
Squashed commit of the following:
commit 9af554dd975ff1778ed70c71fa9bde667fc5f880
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Aug 30 15:19:32 2016 -0400
feat: add cleanHeaders
commit 0dfea98eedc4f97fcbd78866322595c705e20521
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Aug 30 14:30:49 2016 -0400
fix: scoring parent nodes recursively
commit b6e5897a694adeb81e25a905aba72c0f45a8cc94
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Aug 30 12:47:24 2016 -0400
feat: extract clean node up and running
commit fb652c5db13db6bce7271efd68ba4b20515e9549
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Aug 30 09:57:21 2016 -0400
chore: added test for p tags with nested tags (e.g., img, iframe)
commit 731d0a2e4d89121dfafad195e9d0911805c4f8e4
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Aug 29 17:50:33 2016 -0400
feat: extract clean node integrates most functions
commit 322bc6534d30feb7c1c08d3813132badc6286b40
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Aug 29 16:46:04 2016 -0400
feat: removing empty nodes as defined in constants
commit f1d38932ea12a865814d2326970031fcb8515baa
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Aug 29 16:33:31 2016 -0400
feat: cleaning attributes from nodes
commit 0aa73ada6854af0ecd504bfe3d926a9524787ab5
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Aug 29 16:09:56 2016 -0400
feat: cleaning h1s from text
commit 12d4a309246285c278ce7765e4fbaa8271bb5889
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Aug 29 15:52:03 2016 -0400
feat: removing spacer images
commit 4e74ff830cc67586560f6fc72e2cfa432a3a2647
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Aug 29 15:38:49 2016 -0400
feat: stripping unwanted html from doc
commit c774166e90169fd0c1aa89898d3f7a975e82bf0a
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Aug 29 15:17:32 2016 -0400
feat: removing small images, height attribute from images
commit 3a8642f42cda451669c832482c5e1611b1ff2ea9
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Aug 29 12:57:45 2016 -0400
feat: rewrite top level
commit a1c03e779234b0aea02206d92ec3dcc15758507e
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Aug 26 17:34:36 2016 -0400
in a weird place rn
|
2016-08-30 15:25:25 -04:00 |
|
Adam Pash
|
f3aebb2a16
|
Basic testing in place
|
2016-08-23 11:03:31 -04:00 |
|