Adam Pash
ff144952b9
dx: test/finish bot preview
2019-01-14 11:18:32 -08:00
Adam Pash
d35f7bd5bf
dx: comment on PRs when fixtures have been added/changed ( #192 )
...
The goal here is to provide some sort of relatively easy preview for the
PR reviewer to see if the fixture looks good, if the parsing is working,
and to make suggestions easily.
2019-01-11 13:58:28 -08:00
Adam Pash
4478338046
docs: document release process ( #186 )
2018-12-20 09:30:47 -08:00
Adam Pash
fd6c9d4fa3
release: 1.0.13 ( #183 )
2018-10-12 15:01:42 -07:00
Adam Pash
7fcd9b62eb
release: 1.0.12 ( #173 )
2017-04-10 16:10:52 -07:00
Adam Pash
a51cc81c27
release: 1.0.11 ( #171 )
2017-04-10 14:57:32 -07:00
Adam Pash
86d6bd1dc1
release: 1.0.10 ( #169 )
2017-03-24 15:24:06 -07:00
Adam Pash
e56e8e24cd
release: 1.0.9 ( #167 )
2017-03-23 13:39:46 -07:00
Adam Pash
321c087be6
release: 1.0.8 ( #164 )
2017-03-22 14:08:22 -07:00
Adam Pash
e267d57d78
release: 1.0.7 ( #160 )
2017-03-15 09:16:04 -07:00
Adam Pash
9d4c883d51
release: 1.0.6 ( #142 )
2017-02-09 08:58:49 -08:00
Adam Pash
601b0fac16
release: 1.0.5 ( #136 )
2017-02-01 15:39:19 -08:00
Adam Pash
dbc706410b
release: 1.0.4 ( #122 )
2017-01-26 08:42:37 -08:00
Adam Pash
a710efd2d5
release: 1.0.3 ( #62 )
2016-12-09 12:15:40 -05:00
Adam Pash
8070e4790b
test: streamlined guardian tests w/new single-extraction ( #58 )
2016-12-07 13:17:25 -05:00
Adam Pash
332f85928f
release: 1.0.2 ( #54 )
2016-12-06 14:51:01 -05:00
Adam Pash
15656cb3e1
Refactor: running tests more efficiently ( #49 )
...
Only running one parser per page we're testing rather than a parser per field we're testing.
2016-12-05 15:39:45 -05:00
Adam Pash
edcb7295d1
release: 1.0.1 ( #48 )
2016-12-02 16:14:07 -08:00
Janet
c4d72fb735
feat: add money.cnn custom parser ( #26 )
...
* feat: add money.cnn custom parser
* added timezone to cnn custom parser
2016-11-29 15:13:29 -08:00
Adam Pash
6343946dd8
Feat: custom timezones ( #29 )
...
* using moment-timezone to allow custom timezones
* added tz to tmz, even though still so-so
2016-11-29 14:46:46 -08:00
Adam Pash
7411922c55
feat: encoding response body based on content-type charset ( #21 )
...
Also some small code organization
2016-11-22 10:44:27 -08:00
Adam Pash
88c125d022
chore: package upgrades
2016-11-22 08:45:57 -08:00
Adam Pash
60a6861e18
Feat: browser support ( #19 )
...
Big undertaking to support Mercury in the browser. Builds are working and all tests are passing both for web and node builds. Most code is closely shared.
2016-11-21 14:17:06 -08:00
Adam Pash
eaea57461a
fix: servers returning bad headers was breaking request. temporarily ( #20 )
...
using fork with a fix for this until request merges the necessary pull request
2016-11-15 13:17:01 -08:00
Adam Pash
629eada1f7
feat: recording/playing back network requests with nock ( #18 )
...
* feat: recording/playing back network requests with nock
* lint fix
2016-10-28 14:54:12 -07:00
Adam Pash
e325d860fd
Feat: improving ci ( #16 )
...
This commit also swaps in yarn for npm and tweaks circle ci a bit.
* appveyor.yml first go
* changing node
* ps
* narrow it down
* trying this
* fix airbnb module
* trying with yarn
* logging
* hybrid?
* trying yarn w/circle
* bump workers?
* build off?
* updating script
* tweaking script for appveyor
* bumping maxworkers
* cleaning up
* build step?
* yarn it
* added appveyor badge
2016-10-28 09:16:21 -07:00
Adam Pash
071218ab3c
chore: added repo
2016-10-27 16:53:25 -07:00
Adam Pash
048d654417
feat: parser auto-generates name; lint is more specific
2016-10-27 14:54:38 -07:00
Adam Pash
7fa90f59b7
making all.js export a generic function to decrease possibility of error
2016-10-27 10:19:21 -07:00
Adam Pash
a73246306d
feat: quicker lint by being more specific
2016-10-26 16:05:00 -07:00
Adam Pash
4b5c029093
feat: added all-contributors
2016-10-26 15:42:55 -07:00
Adam Pash
eb0aa0b1f6
feat: some small tweaks to toy's excellent parsers ☺️
...
Squashed commit of the following:
commit 9638220124a325322d6cda7d16c645185d5fe827
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Oct 10 11:02:29 2016 -0700
fix: removed eslint plugin that was adding unneeded async parens
commit ce2268c0f7c1b093c06f156730a0f1bc2aaba39c
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Oct 10 10:47:36 2016 -0700
style: fix async in parens
commit 9591856915eddaf93170da1ce9225b8a378bdf55
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Oct 10 10:37:11 2016 -0700
fix: remove parens around async
commit 6c56054717acc1f7e5499691780f8273f6d07bac
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Oct 10 10:35:50 2016 -0700
fix msn fixture; adjusted yahoo test
commit 4fc117ad5fdc5528f29b0873d60a6a1709642f15
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Oct 10 10:14:38 2016 -0700
removed dek and date_published tests; neither exist in littlethings
commit 401094b4abc52901255fd2461f5839624f11d8a3
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Oct 10 10:08:44 2016 -0700
feat: updated buzzfeed for content extraction
commit 19548a5485f70ff9b65e3e725d2364d07734ac9c
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Oct 10 09:54:30 2016 -0700
fix: generator should make transforms an object, not array
commit b92113f9f7c97aca9e6d3ce9243abac967d26b63
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Oct 10 08:54:38 2016 -0700
feat: updated politico
commit c026591040f7671cb2a6dd5177a995e21d015482
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Oct 10 08:48:52 2016 -0700
fix: typos
commit 14aa8fa4ce38ff1c2a212cd0225437ae3042c2c3
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Oct 10 08:36:12 2016 -0700
fix: incorrect command in readme
commit fe260e6122877e2cb0130a1ecde0e503017057a3
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Oct 10 08:31:11 2016 -0700
fix: removed dek test because there is no dek on wikia
2016-10-10 11:03:43 -07:00
Adam Pash
173f885674
feat: custom parser + generator + detailed readme instructions
...
Squashed commit of the following:
commit 02563daa67712c3679258ebebac60dfa9568dffb
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 30 12:25:44 2016 -0400
updated readme, added newyorker parser for readme guide
commit 0ac613ef823efbffbf4cc9a89e5cb2489d1c4f6f
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 30 11:16:52 2016 -0400
feat: updated parser so the saved fixture absolutizes urls
commit 85c7a2660b21f95c2205ca4a4378a7570687fed0
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 30 10:15:26 2016 -0400
refactor: attribute selectors must be an array for custom extractors
commit f60f93d5d3d9b2f2d9ec6f28d27ae9dcf16ef01e
Author: Adam Pash <adam.pash@gmail.com>
Date: Thu Sep 29 10:13:14 2016 -0400
fix: whitelisting srcset and alt attributes
commit e31cb1f4e8a9fc9c3d9b20ef9f40ca6c8d6ad51a
Author: Adam Pash <adam.pash@gmail.com>
Date: Thu Sep 29 09:44:21 2016 -0400
some housekeeping for coverage tests
commit 39eafe420c776a1fe7f9fea634fb529a3ed75a71
Author: Adam Pash <adam.pash@gmail.com>
Date: Wed Sep 28 17:52:08 2016 -0400
fix: word count for multi-page articles
commit b04e0066b52f190481b1b604c64e3d0b1226ff02
Author: Adam Pash <adam.pash@gmail.com>
Date: Thu Sep 22 10:40:23 2016 -0400
major improvements to output
commit 3f3a880b63b47fe21953485da670b6e291ac60e5
Author: Adam Pash <adam.pash@gmail.com>
Date: Wed Sep 21 17:27:53 2016 -0400
updated test command
commit 14503426557a870755453572221d95c92cff4bd2
Author: Adam Pash <adam.pash@gmail.com>
Date: Wed Sep 21 16:00:30 2016 -0400
shortened generator command
commit 5ebd8343cd4b87b3f5787dab665bff0de96846e1
Author: Adam Pash <adam.pash@gmail.com>
Date: Wed Sep 21 15:59:14 2016 -0400
feat: can disable fallback to generic parser (this will be useful for testing custom parsers)
2016-09-30 12:26:25 -04:00
Adam Pash
ad42055f8f
feat: switched test framework to jest
2016-09-20 10:52:16 -04:00
Adam Pash
8f42e119e8
feat: generator for custom parsers and some documentation
...
Squashed commit of the following:
commit deaf9e60d031d9ee06e74b8c0895495b187032a5
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 20 10:31:09 2016 -0400
chore: README for custom parsers
commit a8e8ad633e0d1576a52dbc90ce31b98fb2ec21ee
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Sep 19 23:36:09 2016 -0400
draft of readme
commit 4f0f463f821465c282ce006378e5d55f8f41df5f
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Sep 19 17:56:34 2016 -0400
custom extractor used to build basic parser for theatlantic
commit c5562a3cede41f56c4e723dcfa1181b49dcaae4d
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Sep 19 17:20:13 2016 -0400
pre-commit to test custom parser generator
commit 7d50d5b7ab780b79fae38afcb87a7d1da5d139b2
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Sep 19 17:19:55 2016 -0400
feat: added nytimes parser
commit 58b8d83a56927177984ddfdf70830bc4f328f200
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Sep 19 17:17:28 2016 -0400
feat: can do fuzzy search or go straight to file
commit c99add753723a8e2ac64d51d7379ac8e23125526
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Sep 19 10:52:26 2016 -0400
refactored export for custom extractors for easier renames
commit 22563413669651bb497f1bb2a92085b71f2ae324
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 16 17:36:13 2016 -0400
feat: custom extractor generation in place
commit 2285a29908a7f82a5de3c81f6b2b902ddec9bdaa
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 16 16:42:20 2016 -0400
good progress
2016-09-20 10:37:03 -04:00
Adam Pash
f58ccec7aa
fix: including babel-runtime as a bandaid for polyfill error
2016-09-19 11:24:43 -04:00
Adam Pash
59fb4c4974
fix: using transform-runtime to avoid babel-polyfill conflicts when used
...
in external code
2016-09-19 11:04:35 -04:00
Adam Pash
2ae2dba690
chore: renamed iris to mercury
2016-09-16 13:26:37 -04:00
Adam Pash
d60d396c98
feat: added text direction to response
2016-09-15 15:08:04 -04:00
Adam Pash
c76435ce62
updated name in package.json
2016-09-14 15:06:54 -04:00
Adam Pash
76df30e303
chore: cleanup
2016-09-14 14:28:45 -04:00
Adam Pash
67296691c2
refactor: page collection
2016-09-14 11:12:28 -04:00
Adam Pash
3694c2d12c
chore: improve linter/babelrc
2016-09-14 10:14:19 -04:00
Adam Pash
7e2a34945f
chore: refactored and linted
2016-09-13 15:22:27 -04:00
Adam Pash
7ec0ed0d31
feat: nextPageUrl handles multi-page articles
...
Squashed commit of the following:
commit b5070c0967a7f1a0c0c449ba7ea40aebe8fe4bb8
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 13 10:03:00 2016 -0400
root extractor includes next page url
commit 79be83127d5342d89eef33665586fabea227d6b3
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 13 09:58:20 2016 -0400
small score adjustment
commit 0f00507dbff43401145a892e849311518edec68a
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Sep 12 18:17:38 2016 -0400
feat: nextPageUrl generic parser up and running
commit be91c589fc0c6d6f9b573080a76c9b1ac7af710c
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Sep 12 11:53:58 2016 -0400
feat: pageNumFromUrl extracts the pagenum of the current url
commit ad879d7aabedadfd051c01b42d841703bf4763fa
Author: Adam Pash <adam.pash@gmail.com>
Date: Mon Sep 12 11:52:37 2016 -0400
feat: isWordpress checks if a page is generated by wordpress
2016-09-13 10:08:49 -04:00
Adam Pash
c48e3485c0
chore: code reorganization
...
Squashed commit of the following:
commit 636296841d5cf5e685237fe70db7a15305d8e966
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 9 13:37:21 2016 -0400
final cleanup
commit 51f712b3074d41a1f2da91519289d4dd09719ad0
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 9 13:25:28 2016 -0400
Another big pass
commit 3860e6d872a9adb9290093fd9c8708dfcc773c28
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 9 12:49:52 2016 -0400
chore: started reorganizing
2016-09-09 13:44:58 -04:00
Adam Pash
8da2425e59
feat: resource fetches content from a URL and prepares for parsing
...
Squashed commit of the following:
commit 7ba2d2b36d175f5ccbc02f918322ea0dd44bf2c1
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 6 17:55:10 2016 -0400
feat: resource fetches content from a URL and prepares for parsing
commit 0abdfa49eed5b363169070dac6d65d0a5818c918
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 6 17:54:07 2016 -0400
fix: this was messing up double Esses ('ss', as in class => cla)
commit 9dc65a99631e3a68267a68b2b4629c4be8f61546
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 6 14:58:57 2016 -0400
fix: test suite working w/new dirs
commit 993dc33a5229bfa22ea998e3c4fe105be9d91c21
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 6 14:49:39 2016 -0400
feat: convertLazyLoadedImages puts img urls in the src
commit e7fb105443dd16d036e460ad21fbcb47191f475b
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 6 14:30:43 2016 -0400
feat: makeLinksAbsolute to fully qualify urls
commit dbd665078af854efe84bbbfe9b55acd02e1a652f
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 6 13:38:33 2016 -0400
feat: fetchResource to fetch a url and validate the response
commit 42d3937c8f0f8df693996c2edee93625f13dced7
Author: Adam Pash <adam.pash@gmail.com>
Date: Tue Sep 6 10:25:34 2016 -0400
feat: normalizing meta tags
2016-09-06 17:55:45 -04:00
Adam Pash
752331eaae
feat: bundling with rollup
...
Squashed commit of the following:
commit 52bcf0f2dd79bcb2ee21bc134522edd259a3d35e
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 2 13:42:29 2016 -0400
fix: converting date to ISO string
commit 11e827e27129ac229a96f66ca03f0b18dc5d289d
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 2 13:42:12 2016 -0400
feat: bundling with rollup
commit 1ff752a3e44e5836b955f7f15c799abbbdfc9207
Author: Adam Pash <adam.pash@gmail.com>
Date: Fri Sep 2 12:11:39 2016 -0400
clean
2016-09-02 13:43:03 -04:00
Adam Pash
0ff3082295
feat: GenericExtractLeadImageUrl
...
Squashed commit of the following:
commit 22d37ebf26dbbd0a3daebbfde3509a6ce04aaf72
Author: Adam Pash <adam.pash@gmail.com>
Date: Thu Sep 1 17:50:13 2016 -0400
feat: GenericExtractLeadImageUrl
commit 3327a0a7929dd0e9267dc9c26f4e2aa78c32586f
Author: Adam Pash <adam.pash@gmail.com>
Date: Thu Sep 1 15:33:42 2016 -0400
feat: can pass custom attributes to extractFromMeta
2016-09-01 17:50:42 -04:00
Adam Pash
956fd678f7
feat: GenericDatePublishedExtractor
...
Squashed commit of the following:
commit 8eda4606e773147ae8dd67666d1a64d659f9fdad
Author: Adam Pash <adam.pash@gmail.com>
Date: Thu Sep 1 12:28:06 2016 -0400
feat: GenericDatePublishedExtractor
commit 935510fe9bc0a92f68fca7faf66019cb45330097
Author: Adam Pash <adam.pash@gmail.com>
Date: Thu Sep 1 09:28:42 2016 -0400
updated todo
2016-09-01 12:28:39 -04:00