Adam Pash
173f885674
feat: custom parser + generator + detailed readme instructions
Squashed commit of the following:
commit 02563daa67712c3679258ebebac60dfa9568dffb
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 30 12:25:44 2016 -0400
updated readme, added newyorker parser for readme guide
commit 0ac613ef823efbffbf4cc9a89e5cb2489d1c4f6f
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 30 11:16:52 2016 -0400
feat: updated parser so the saved fixture absolutizes urls
commit 85c7a2660b21f95c2205ca4a4378a7570687fed0
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 30 10:15:26 2016 -0400
refactor: attribute selectors must be an array for custom extractors
commit f60f93d5d3d9b2f2d9ec6f28d27ae9dcf16ef01e
Author: Adam Pash <adam.pash@gmail.com>
Date: Thu Sep 29 10:13:14 2016 -0400
fix: whitelisting srcset and alt attributes
commit e31cb1f4e8a9fc9c3d9b20ef9f40ca6c8d6ad51a
Author: Adam Pash <adam.pash@gmail.com>
Date: Thu Sep 29 09:44:21 2016 -0400
some housekeeping for coverage tests
commit 39eafe420c776a1fe7f9fea634fb529a3ed75a71
Author: Adam Pash <adam.pash@gmail.com>
Date: Wed Sep 28 17:52:08 2016 -0400
fix: word count for multi-page articles
commit b04e0066b52f190481b1b604c64e3d0b1226ff02
Author: Adam Pash <adam.pash@gmail.com>
Date: Thu Sep 22 10:40:23 2016 -0400
major improvements to output
commit 3f3a880b63b47fe21953485da670b6e291ac60e5
Author: Adam Pash <adam.pash@gmail.com>
Date: Wed Sep 21 17:27:53 2016 -0400
updated test command
commit 14503426557a870755453572221d95c92cff4bd2
Author: Adam Pash <adam.pash@gmail.com>
Date: Wed Sep 21 16:00:30 2016 -0400
shortened generator command
commit 5ebd8343cd4b87b3f5787dab665bff0de96846e1
Author: Adam Pash <adam.pash@gmail.com>
Date: Wed Sep 21 15:59:14 2016 -0400
feat: can disable fallback to generic parser (this will be useful for testing custom parsers)
8 years ago
Adam Pash
39a3c0690d
chore: readme improvement
8 years ago
Adam Pash
ef047107ea
feat: content cleaner still runs, but can disable some cleaners
8 years ago
Adam Pash
75b1880f01
chore: cleaned up unused files, slight reorg
8 years ago
Adam Pash
ad42055f8f
feat: switched test framework to jest
8 years ago
Adam Pash
8f42e119e8
feat: generator for custom parsers and some documentation
Squashed commit of the following:
commit deaf9e60d031d9ee06e74b8c0895495b187032a5
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 20 10:31:09 2016 -0400
chore: README for custom parsers
commit a8e8ad633e0d1576a52dbc90ce31b98fb2ec21ee
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Sep 19 23:36:09 2016 -0400
draft of readme
commit 4f0f463f821465c282ce006378e5d55f8f41df5f
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Sep 19 17:56:34 2016 -0400
custom extractor used to build basic parser for theatlantic
commit c5562a3cede41f56c4e723dcfa1181b49dcaae4d
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Sep 19 17:20:13 2016 -0400
pre-commit to test custom parser generator
commit 7d50d5b7ab780b79fae38afcb87a7d1da5d139b2
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Sep 19 17:19:55 2016 -0400
feat: added nytimes parser
commit 58b8d83a56927177984ddfdf70830bc4f328f200
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Sep 19 17:17:28 2016 -0400
feat: can do fuzzy search or go straight to file
commit c99add753723a8e2ac64d51d7379ac8e23125526
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Sep 19 10:52:26 2016 -0400
refactored export for custom extractors for easier renames
commit 22563413669651bb497f1bb2a92085b71f2ae324
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 16 17:36:13 2016 -0400
feat: custom extractor generation in place
commit 2285a29908a7f82a5de3c81f6b2b902ddec9bdaa
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 16 16:42:20 2016 -0400
good progress
8 years ago
Adam Pash
7ade83692a
feat: improve wikipedia parser
8 years ago
Adam Pash
2ae2dba690
chore: renamed iris to mercury
8 years ago
Adam Pash
005ba47f6f
fix: wikipedia transform only grabs one image from .infobox
8 years ago
Adam Pash
8dc6042dc9
build for comparisons
8 years ago
Adam Pash
cbd0636dcf
chore: cleaned up python and other unneeded comments
8 years ago
Adam Pash
bf13b38a9b
feat: some basic error handling for bad urls
8 years ago
Adam Pash
ffaf7db0f1
fix: some improvements to date parsing. punting on localization issues
8 years ago
Adam Pash
396313aeae
feat: added twitter custom extractor
Squashed commit of the following:
commit 8116f14364869b72a8afabfcb44b2ac154caed96
Author: Adam Pash <adam.pash@gmail.com>
Date: Thu Sep 15 16:27:27 2016 -0400
feat: added twitter custom extractor
commit e478eb1b0bcdcb65fdd5fa64e37be92b6defd702
Author: Adam Pash <adam.pash@gmail.com>
Date: Thu Sep 15 16:22:54 2016 -0400
fix: made custom extractors and cleaners adhere to underscore keys
8 years ago
Adam Pash
d60d396c98
feat: added text direction to response
8 years ago
Adam Pash
f0f216c7b9
feat: add option to allow custom extractors to skip default cleaners
8 years ago
Adam Pash
97a0728ecf
test: added sanity test for get-extractor
8 years ago
Adam Pash
7c375aded7
chore: cleanup
8 years ago
Adam Pash
4cdc4165d6
fix: encodeURI before fetching
8 years ago
Adam Pash
1343469b6c
fix: explicit/better decoding of gzipped content
8 years ago
Adam Pash
c338098f21
refactor: renamed child to sibling for clarity
8 years ago
Adam Pash
6263e505d5
fix: handling case where node.get(0) returns null
8 years ago
Adam Pash
3b36a33e36
chore: change result keys to match python api
8 years ago
Adam Pash
cc060b794d
fix: wordcount calling excerpt
8 years ago
Adam Pash
7fc1f7f6bb
checking in dist
8 years ago
Adam Pash
daa9266182
feat: generic extractor for word count
Squashed commit of the following:
commit 0aba26ef9efba71a72c76fa351a9037e97fc1e9e
Author: Adam Pash <adam.pash@gmail.com>
Date: Wed Sep 14 14:56:45 2016 -0400
fix: normalizeSpaces regex fix broke a test
commit 07d60c1c8c6599d6c94d92e5a70649c28d03d6ea
Author: Adam Pash <adam.pash@gmail.com>
Date: Wed Sep 14 14:52:41 2016 -0400
feat: generic extractor for word count
8 years ago
Adam Pash
76df30e303
chore: cleanup
8 years ago
Adam Pash
b3481a2c45
feat: generic excerpt extraction
8 years ago
Adam Pash
457075889d
fix: selection should not be empty
8 years ago
Adam Pash
81ed4f00ed
feat: improve nymag.com extractor to grab deks from features
8 years ago
Adam Pash
21f444367f
feat: added page counts
8 years ago
Adam Pash
f3a5d0ecca
feat: added domain and url extractor (using same extractor)
commit 43ab423d575cd15cc55041fb3fe2f21ffdd7adff
Author: Adam Pash <adam.pash@gmail.com>
Date: Wed Sep 14 11:57:25 2016 -0400
8 years ago
Adam Pash
67296691c2
refactor: page collection
8 years ago
Adam Pash
b325a4acdd
chore: clean up junk tests
8 years ago
Adam Pash
547ee2b4ca
Merge pull request #1 from postlight/test-fix-fixture-locations
Fix Fixture Locations
8 years ago
Adam Pash
62ae330db2
fix: bug in scoring and converting to paragraphs
8 years ago
Jeremy Mack
7ca19d2e6f
test: fix fixture locations
8 years ago
Adam Pash
7e2a34945f
chore: refactored and linted
8 years ago
Adam Pash
9906bd36a4
chore: moved content scoring out of utils, removed no-longer-necessary utils
8 years ago
Adam Pash
7ec0ed0d31
feat: nextPageUrl handles multi-page articles
Squashed commit of the following:
commit b5070c0967a7f1a0c0c449ba7ea40aebe8fe4bb8
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 13 10:03:00 2016 -0400
root extractor includes next page url
commit 79be83127d5342d89eef33665586fabea227d6b3
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 13 09:58:20 2016 -0400
small score adjustment
commit 0f00507dbff43401145a892e849311518edec68a
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Sep 12 18:17:38 2016 -0400
feat: nextPageUrl generic parser up and running
commit be91c589fc0c6d6f9b573080a76c9b1ac7af710c
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Sep 12 11:53:58 2016 -0400
feat: pageNumFromUrl extracts the pagenum of the current url
commit ad879d7aabedadfd051c01b42d841703bf4763fa
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Sep 12 11:52:37 2016 -0400
feat: isWordpress checks if a page is generated by wordpress
8 years ago
Adam Pash
a89b9b785e
feat: small improvement to author selectors
8 years ago
Adam Pash
acaab70ee2
fix: scorePs parent scoring was overwriting child scoring
8 years ago
Adam Pash
8fe3bec6b6
fix: accepting cookies with request (required for sites like nytimes.com)
8 years ago
Adam Pash
74694ba8e2
debugging: cheerio isn't always consistent in setting scores
8 years ago
Adam Pash
47ac7e9803
refactor: limiting calls to $ function
Squashed commit of the following:
commit c72da261cb5319d1eef207bff63b3c9cd49018df
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 9 15:28:43 2016 -0400
refactor: limiting calls to $ function
commit eeae88247d844d5c6acbc529dbc3ce4d14e04191
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 9 15:14:33 2016 -0400
refactor: convertNodeTo; requires a cheerio object
8 years ago
Adam Pash
81e9e7a317
feat: whitelisting attrs to keep
8 years ago
Adam Pash
7b97559778
chore: remove logic for fetching meta tags with custom attrs (resource normalizes this now)
8 years ago
Adam Pash
c48e3485c0
chore: code reorganization
Squashed commit of the following:
commit 636296841d5cf5e685237fe70db7a15305d8e966
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 9 13:37:21 2016 -0400
final cleanup
commit 51f712b3074d41a1f2da91519289d4dd09719ad0
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 9 13:25:28 2016 -0400
Another big pass
commit 3860e6d872a9adb9290093fd9c8708dfcc773c28
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 9 12:49:52 2016 -0400
chore: started reorganizing
8 years ago
Adam Pash
f2729a5ee6
improved wiki extractor
8 years ago
Adam Pash
52e89a0229
fix: cleaning embed and object nodes
8 years ago
Adam Pash
edfb54c532
feat: links are rewritten to absolute in cleaner
Squashed commit of the following:
commit 9057d411a5458f80c316604559c469a239ef3a40
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 9 11:42:19 2016 -0400
feat: links are rewritten to absolute in cleaner
8 years ago
Adam Pash
bdc2c0c1da
feat: can now fetch attrs in RootExtractor's select method
8 years ago
Adam Pash
33c7e0d1c9
feat: Improved dateString parsing to handle more; first trying to parse without cleaning
8 years ago
Adam Pash
91881df523
refactor: cleaners now run on custom extractors
Squashed commit of the following:
commit e4c7d1d149d1846f0d589b3653655b81b477c682
Author: Adam Pash <adam.pash@gmail.com>
Date: Thu Sep 8 19:29:26 2016 -0400
refactor: cleaners now run on custom extractors
commit ca08d2482c54bf6a40f50758da9353f00987a4d7
Author: Adam Pash <adam.pash@gmail.com>
Date: Thu Sep 8 14:42:19 2016 -0400
moved cleaners, refactored as necessary
commit ec2c5d36410b255c6d8ee264deca990c46709c3c
Author: Adam Pash <adam.pash@gmail.com>
Date: Thu Sep 8 14:07:01 2016 -0400
moved datePublished cleaner
commit 5e55e397eecb3e88d64cd2aa2c6071c9cffed272
Author: Adam Pash <adam.pash@gmail.com>
Date: Thu Sep 8 13:34:21 2016 -0400
moved dek cleaner
commit 2dfb0c44d7882336992fdc864792df6eac094c21
Author: Adam Pash <adam.pash@gmail.com>
Date: Thu Sep 8 13:29:37 2016 -0400
moved lead-image-url
commit cef7a213b80ddd671249225622f1388f9e68896c
Author: Adam Pash <adam.pash@gmail.com>
Date: Thu Sep 8 13:26:20 2016 -0400
moved author
8 years ago
Adam Pash
603682239d
feat: basic wikipedia custom extractor
8 years ago
Adam Pash
9665fe7209
feat: blogspot.com custom extractor
8 years ago
Adam Pash
6c6451b34b
fix: duplicate key bug
8 years ago
Adam Pash
93ca688955
fix: dek and leadImg should not be html
8 years ago
Adam Pash
45ef18ba37
fix: brought .html fixtures into project dir
8 years ago
Adam Pash
7d88fee199
feat: RootExtractor performs extraction using custom and generic extraction methods
8 years ago
Adam Pash
937138c7bb
refactor: improve extractor args; passing as object
8 years ago
Adam Pash
ecacc6ce12
Some good basic restructuring
8 years ago
Adam Pash
b3f90c489e
basic merging of extracting sources
8 years ago
Adam Pash
0f45b39ca2
refactor: preparing for extraction merging
8 years ago
Adam Pash
a022252a14
feat: getExtractor returns generic extractor
8 years ago
Adam Pash
c40b702b93
clean formatting
8 years ago
Adam Pash
dfb5334f18
fix: encoding request response as null
This fixes an issue with gzipped content
8 years ago
Adam Pash
ddc684c7d3
updated constants
8 years ago
Adam Pash
189361dc20
cleanup
8 years ago
Adam Pash
ac62e0fba0
fix: pre-loading html in resource
8 years ago
Adam Pash
3128baeda1
cleanup
8 years ago
Adam Pash
86b2ee194c
feat: can pass in raw html if already fetched
8 years ago
Adam Pash
8da2425e59
feat: resource fetches content from a URL and prepares for parsing
Squashed commit of the following:
commit 7ba2d2b36d175f5ccbc02f918322ea0dd44bf2c1
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 6 17:55:10 2016 -0400
feat: resource fetches content from a URL and prepares for parsing
commit 0abdfa49eed5b363169070dac6d65d0a5818c918
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 6 17:54:07 2016 -0400
fix: this was messing up double Esses ('ss', as in class => cla)
commit 9dc65a99631e3a68267a68b2b4629c4be8f61546
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 6 14:58:57 2016 -0400
fix: test suite working w/new dirs
commit 993dc33a5229bfa22ea998e3c4fe105be9d91c21
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 6 14:49:39 2016 -0400
feat: convertLazyLoadedImages puts img urls in the src
commit e7fb105443dd16d036e460ad21fbcb47191f475b
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 6 14:30:43 2016 -0400
feat: makeLinksAbsolute to fully qualify urls
commit dbd665078af854efe84bbbfe9b55acd02e1a652f
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 6 13:38:33 2016 -0400
feat: fetchResource to fetch a url and validate the response
commit 42d3937c8f0f8df693996c2edee93625f13dced7
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 6 10:25:34 2016 -0400
feat: normalizing meta tags
8 years ago
Adam Pash
bc97156718
fix: better scoring for image extensions
8 years ago
Adam Pash
11a2286659
notes, cleanup
8 years ago
Adam Pash
752331eaae
feat: bundling with rollup
Squashed commit of the following:
commit 52bcf0f2dd79bcb2ee21bc134522edd259a3d35e
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 2 13:42:29 2016 -0400
fix: converting date to ISO string
commit 11e827e27129ac229a96f66ca03f0b18dc5d289d
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 2 13:42:12 2016 -0400
feat: bundling with rollup
commit 1ff752a3e44e5836b955f7f15c799abbbdfc9207
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 2 12:11:39 2016 -0400
clean
8 years ago
Adam Pash
0ff3082295
feat: GenericExtractLeadImageUrl
Squashed commit of the following:
commit 22d37ebf26dbbd0a3daebbfde3509a6ce04aaf72
Author: Adam Pash <adam.pash@gmail.com>
Date: Thu Sep 1 17:50:13 2016 -0400
feat: GenericExtractLeadImageUrl
commit 3327a0a7929dd0e9267dc9c26f4e2aa78c32586f
Author: Adam Pash <adam.pash@gmail.com>
Date: Thu Sep 1 15:33:42 2016 -0400
feat: can pass custom attributes to extractFromMeta
8 years ago
Adam Pash
467b600721
feat: extract dek stubbed (not currently functional)
8 years ago
Adam Pash
d3b791d516
fix: title wasn't cleaning html tags
8 years ago
Adam Pash
956fd678f7
feat: GenericDatePublishedExtractor
Squashed commit of the following:
commit 8eda4606e773147ae8dd67666d1a64d659f9fdad
Author: Adam Pash <adam.pash@gmail.com>
Date: Thu Sep 1 12:28:06 2016 -0400
feat: GenericDatePublishedExtractor
commit 935510fe9bc0a92f68fca7faf66019cb45330097
Author: Adam Pash <adam.pash@gmail.com>
Date: Thu Sep 1 09:28:42 2016 -0400
updated todo
8 years ago
Adam Pash
29db4a6ee0
feat: extract author
8 years ago
Adam Pash
7e28871a02
chore: plumbing
8 years ago
Adam Pash
746d07d4a2
feat: title extraction and scaffolding for more
Squashed commit of the following:
commit 31d8b63dcb3ec9bbd6c8e7a10852fbd060e91103
Author: Adam Pash <adam.pash@gmail.com>
Date: Wed Aug 31 15:52:27 2016 -0400
feat: title extraction
commit 7002c552a9f5bb54630455d983b699c041c629fc
Author: Adam Pash <adam.pash@gmail.com>
Date: Wed Aug 31 14:21:29 2016 -0400
feat: withinComment checks if a node is inside a comment
commit 57f06ef5b499c2f747edee0c9eb276e38984de9a
Author: Adam Pash <adam.pash@gmail.com>
Date: Wed Aug 31 13:40:36 2016 -0400
feat: extractFromMeta function
commit 0947f21aae94fa5ce462246ed5cb53144d563931
Author: Adam Pash <adam.pash@gmail.com>
Date: Wed Aug 31 13:32:30 2016 -0400
fix: returning original string if no tags in string
commit dd6b032e5f9877395b9600480dd96c6fdf60cecd
Author: Adam Pash <adam.pash@gmail.com>
Date: Wed Aug 31 12:03:58 2016 -0400
feat: clean title function removes junk from titles
commit f33b3eef29ad7692441bd0e5aa26b11dd4411dde
Author: Adam Pash <adam.pash@gmail.com>
Date: Wed Aug 31 12:03:35 2016 -0400
chore: renamed function to correct name
commit 076a986b12df68a939a8efa773e01d08780d79aa
Author: Adam Pash <adam.pash@gmail.com>
Date: Wed Aug 31 12:02:18 2016 -0400
feat: utility method to strip tags from text
commit f3e98cdf0a0d7601fab9e8824c0cde73ded51651
Author: Adam Pash <adam.pash@gmail.com>
Date: Wed Aug 31 11:31:33 2016 -0400
feat: resolveSplitTitle cleans raw title text
8 years ago
Adam Pash
07834c0e15
refactor: restructuring for metadata extraction
8 years ago
Adam Pash
95085d1a11
chore: cleanup
8 years ago
Adam Pash
e1ef25aab1
fix: added babel-polyfill for bug in Reflect
8 years ago
Adam Pash
93e844cdfe
feat: implemented extractBestNode functionality
Squashed commit of the following:
commit 9af554dd975ff1778ed70c71fa9bde667fc5f880
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Aug 30 15:19:32 2016 -0400
feat: add cleanHeaders
commit 0dfea98eedc4f97fcbd78866322595c705e20521
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Aug 30 14:30:49 2016 -0400
fix: scoring parent nodes recursively
commit b6e5897a694adeb81e25a905aba72c0f45a8cc94
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Aug 30 12:47:24 2016 -0400
feat: extract clean node up and running
commit fb652c5db13db6bce7271efd68ba4b20515e9549
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Aug 30 09:57:21 2016 -0400
chore: added test for p tags with nested tags (e.g., img, iframe)
commit 731d0a2e4d89121dfafad195e9d0911805c4f8e4
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Aug 29 17:50:33 2016 -0400
feat: extract clean node integrates most functions
commit 322bc6534d30feb7c1c08d3813132badc6286b40
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Aug 29 16:46:04 2016 -0400
feat: removing empty nodes as defined in constants
commit f1d38932ea12a865814d2326970031fcb8515baa
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Aug 29 16:33:31 2016 -0400
feat: cleaning attributes from nodes
commit 0aa73ada6854af0ecd504bfe3d926a9524787ab5
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Aug 29 16:09:56 2016 -0400
feat: cleaning h1s from text
commit 12d4a309246285c278ce7765e4fbaa8271bb5889
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Aug 29 15:52:03 2016 -0400
feat: removing spacer images
commit 4e74ff830cc67586560f6fc72e2cfa432a3a2647
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Aug 29 15:38:49 2016 -0400
feat: stripping unwanted html from doc
commit c774166e90169fd0c1aa89898d3f7a975e82bf0a
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Aug 29 15:17:32 2016 -0400
feat: removing small images, height attribute from images
commit 3a8642f42cda451669c832482c5e1611b1ff2ea9
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Aug 29 12:57:45 2016 -0400
feat: rewrite top level
commit a1c03e779234b0aea02206d92ec3dcc15758507e
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Aug 26 17:34:36 2016 -0400
in a weird place rn
8 years ago
Adam Pash
9da7a6f2a9
feat: find top candidate function
8 years ago
Adam Pash
e2600231ac
feat: added linkDensity function
8 years ago
Adam Pash
c470261d41
fix: changed parseInt to parseFloat
8 years ago
Adam Pash
44eae5e931
feat: added scoreContent function
8 years ago
Adam Pash
bd7ed77f23
Lots of progress on score-content
8 years ago
Adam Pash
cc734c7e7d
chore: cleaned up repetitive testing for dom
8 years ago
Adam Pash
f3b1fefba6
chore: refactored tests
8 years ago
Adam Pash
d4a19e6a27
feat: ported scoring methods with unit tests
8 years ago
Adam Pash
97087bd626
chore: refactored to slightly cleaner file structure (more to do here)
8 years ago
Adam Pash
67e212ffac
feat: convertToParagraphs function working
8 years ago
Adam Pash
c237245e89
Converting multiple line breaks to p
8 years ago
Adam Pash
95d02dadd1
simple logic in place for brsToPs
8 years ago
Adam Pash
777e11c25c
Stripping unlikely candidates from DOM
8 years ago
Adam Pash
89a2cfbb82
getWeight with tests
8 years ago
Adam Pash
f3aebb2a16
Basic testing in place
8 years ago