Commit Graph

120 Commits

Author SHA1 Message Date
Adam Pash
ad42055f8f feat: switched test framework to jest 2016-09-20 10:52:16 -04:00
Adam Pash
8f42e119e8 feat: generator for custom parsers and some documentation
Squashed commit of the following:

commit deaf9e60d031d9ee06e74b8c0895495b187032a5
Author: Adam Pash <adam.pash@gmail.com>
Date:   Tue Sep 20 10:31:09 2016 -0400

    chore: README for custom parsers

commit a8e8ad633e0d1576a52dbc90ce31b98fb2ec21ee
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Sep 19 23:36:09 2016 -0400

    draft of readme

commit 4f0f463f821465c282ce006378e5d55f8f41df5f
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Sep 19 17:56:34 2016 -0400

    custom extractor used to build basic parser for theatlantic

commit c5562a3cede41f56c4e723dcfa1181b49dcaae4d
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Sep 19 17:20:13 2016 -0400

    pre-commit to test custom parser generator

commit 7d50d5b7ab780b79fae38afcb87a7d1da5d139b2
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Sep 19 17:19:55 2016 -0400

    feat: added nytimes parser

commit 58b8d83a56927177984ddfdf70830bc4f328f200
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Sep 19 17:17:28 2016 -0400

    feat: can do fuzzy search or go straight to file

commit c99add753723a8e2ac64d51d7379ac8e23125526
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Sep 19 10:52:26 2016 -0400

    refactored export for custom extractors for easier renames

commit 22563413669651bb497f1bb2a92085b71f2ae324
Author: Adam Pash <adam.pash@gmail.com>
Date:   Fri Sep 16 17:36:13 2016 -0400

    feat: custom extractor generation in place

commit 2285a29908a7f82a5de3c81f6b2b902ddec9bdaa
Author: Adam Pash <adam.pash@gmail.com>
Date:   Fri Sep 16 16:42:20 2016 -0400

    good progress
2016-09-20 10:37:03 -04:00
Adam Pash
c4f06c7ebc fix: .babelrc was still referencing iris 2016-09-19 13:56:19 -04:00
Adam Pash
f58ccec7aa fix: including babel-runtime as a bandaid for polyfill error 2016-09-19 11:24:43 -04:00
Adam Pash
59fb4c4974 fix: using transform-runtime to avoid babel-polyfill conflicts when used
in external code
2016-09-19 11:04:35 -04:00
Adam Pash
b4fbc5b581 chore: barebones readme 2016-09-16 15:11:46 -04:00
Adam Pash
f439f9d2cf refactor: slightly better preview 2016-09-16 15:10:14 -04:00
Adam Pash
7ade83692a feat: improve wikipedia parser 2016-09-16 13:59:05 -04:00
Adam Pash
1597fc79c2 feat: added preview script to test urls on-the-fly 2016-09-16 13:58:41 -04:00
Adam Pash
2ae2dba690 chore: renamed iris to mercury 2016-09-16 13:26:37 -04:00
Adam Pash
005ba47f6f fix: wikipedia transform only grabs one image from .infobox 2016-09-16 13:17:21 -04:00
Adam Pash
44d3a547b2 fix: added dist back to git 2016-09-16 12:56:08 -04:00
Adam Pash
8dc6042dc9 build for comparisons 2016-09-16 11:35:59 -04:00
Adam Pash
9fa502c1f0 feat: test runner takes args for wildcard search on individual test for easier testing 2016-09-16 11:31:34 -04:00
Adam Pash
cbd0636dcf chore: cleaned up python and other unneeded comments 2016-09-16 11:21:23 -04:00
Adam Pash
bf13b38a9b feat: some basic error handling for bad urls 2016-09-15 17:41:29 -04:00
Adam Pash
9f0c075de4 Merge pull request #3 from postlight/fix-date-not-local
fix: some improvements to date parsing. punting on localization issues
2016-09-15 16:59:31 -04:00
Adam Pash
ffaf7db0f1 fix: some improvements to date parsing. punting on localization issues 2016-09-15 16:57:14 -04:00
Adam Pash
396313aeae feat: added twitter custom extractor
Squashed commit of the following:

commit 8116f14364869b72a8afabfcb44b2ac154caed96
Author: Adam Pash <adam.pash@gmail.com>
Date:   Thu Sep 15 16:27:27 2016 -0400

    feat: added twitter custom extractor

commit e478eb1b0bcdcb65fdd5fa64e37be92b6defd702
Author: Adam Pash <adam.pash@gmail.com>
Date:   Thu Sep 15 16:22:54 2016 -0400

    fix: made custom extractors and cleaners adhere to underscore keys
2016-09-15 16:27:46 -04:00
Adam Pash
d60d396c98 feat: added text direction to response 2016-09-15 15:08:04 -04:00
Adam Pash
f0f216c7b9 feat: add option to allow custom extractors to skip default cleaners 2016-09-15 14:50:51 -04:00
Adam Pash
97a0728ecf test: added sanity test for get-extractor 2016-09-15 14:33:06 -04:00
Adam Pash
7c375aded7 chore: cleanup 2016-09-15 14:29:14 -04:00
Adam Pash
4cdc4165d6 fix: encodeURI before fetching 2016-09-15 14:25:22 -04:00
Adam Pash
1343469b6c fix: explicit/better decoding of gzipped content 2016-09-15 12:39:54 -04:00
Adam Pash
7638c15077 push new build for testing 2016-09-15 12:22:38 -04:00
Adam Pash
c338098f21 refactor: renamed child to sibling for clarity 2016-09-15 12:19:33 -04:00
Adam Pash
6263e505d5 fix: handling case where node.get(0) returns null 2016-09-15 12:17:25 -04:00
Adam Pash
2bf274114f chore: disable camelcase for linting 2016-09-14 16:00:36 -04:00
Adam Pash
3b36a33e36 chore: change result keys to match python api 2016-09-14 15:53:02 -04:00
Adam Pash
cc060b794d fix: wordcount calling excerpt 2016-09-14 15:12:05 -04:00
Adam Pash
7fc1f7f6bb checking in dist 2016-09-14 15:10:03 -04:00
Adam Pash
c76435ce62 updated name in package.json 2016-09-14 15:06:54 -04:00
Adam Pash
f1cff0b435 chore: removed TODO.md 2016-09-14 15:00:56 -04:00
Adam Pash
daa9266182 feat: generic extractor for word count
Squashed commit of the following:

commit 0aba26ef9efba71a72c76fa351a9037e97fc1e9e
Author: Adam Pash <adam.pash@gmail.com>
Date:   Wed Sep 14 14:56:45 2016 -0400

    fix: normalizeSpaces regex fix broke a test

commit 07d60c1c8c6599d6c94d92e5a70649c28d03d6ea
Author: Adam Pash <adam.pash@gmail.com>
Date:   Wed Sep 14 14:52:41 2016 -0400

    feat: generic extractor for word count
2016-09-14 14:58:08 -04:00
Adam Pash
76df30e303 chore: cleanup 2016-09-14 14:28:45 -04:00
Adam Pash
b3481a2c45 feat: generic excerpt extraction 2016-09-14 14:13:59 -04:00
Adam Pash
457075889d fix: selection should not be empty 2016-09-14 13:13:26 -04:00
Adam Pash
81ed4f00ed feat: improve nymag.com extractor to grab deks from features 2016-09-14 13:12:40 -04:00
Adam Pash
21f444367f feat: added page counts 2016-09-14 12:21:32 -04:00
Adam Pash
f3a5d0ecca feat: added domain and url extractor (using same extractor)
commit 43ab423d575cd15cc55041fb3fe2f21ffdd7adff
Author: Adam Pash <adam.pash@gmail.com>
Date:   Wed Sep 14 11:57:25 2016 -0400
2016-09-14 11:58:09 -04:00
Adam Pash
67296691c2 refactor: page collection 2016-09-14 11:12:28 -04:00
Adam Pash
b325a4acdd chore: clean up junk tests 2016-09-14 10:36:34 -04:00
Adam Pash
547ee2b4ca Merge pull request #1 from postlight/test-fix-fixture-locations
Fix Fixture Locations
2016-09-14 10:34:50 -04:00
Adam Pash
62ae330db2 fix: bug in scoring and converting to paragraphs 2016-09-14 10:15:36 -04:00
Adam Pash
3694c2d12c chore: improve linter/babelrc 2016-09-14 10:14:19 -04:00
Jeremy Mack
7ca19d2e6f test: fix fixture locations 2016-09-14 08:09:14 -05:00
Adam Pash
7e2a34945f chore: refactored and linted 2016-09-13 15:22:27 -04:00
Adam Pash
9906bd36a4 chore: moved content scoring out of utils, removed no-longer-necessary utils 2016-09-13 10:25:47 -04:00
Adam Pash
7ec0ed0d31 feat: nextPageUrl handles multi-page articles
Squashed commit of the following:

commit b5070c0967a7f1a0c0c449ba7ea40aebe8fe4bb8
Author: Adam Pash <adam.pash@gmail.com>
Date:   Tue Sep 13 10:03:00 2016 -0400

    root extractor includes next page url

commit 79be83127d5342d89eef33665586fabea227d6b3
Author: Adam Pash <adam.pash@gmail.com>
Date:   Tue Sep 13 09:58:20 2016 -0400

    small score adjustment

commit 0f00507dbff43401145a892e849311518edec68a
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Sep 12 18:17:38 2016 -0400

    feat: nextPageUrl generic parser up and running

commit be91c589fc0c6d6f9b573080a76c9b1ac7af710c
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Sep 12 11:53:58 2016 -0400

    feat: pageNumFromUrl extracts the pagenum of the current url

commit ad879d7aabedadfd051c01b42d841703bf4763fa
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Sep 12 11:52:37 2016 -0400

    feat: isWordpress checks if a page is generated by wordpress
2016-09-13 10:08:49 -04:00