Michael Ashley
56a19bf934
fix: updating generate-parser dist ( #499 )
2022-05-09 08:58:26 -07:00
Jad Termsani
0ccb91a487
release: 2.2.1 ( #631 )
2021-09-08 14:06:38 -05:00
Michael Ashley
c5c000586d
release: 2.2.0 ( #496 )
* release: 2.2.0
2019-09-10 09:51:14 -07:00
Adam Pash
713de25751
release: 2.1.1 ( #446 )
2019-06-26 13:36:55 -07:00
Adam Pash
ca47f9c7a7
release: 2.1.0 ( #373 )
2019-04-10 08:42:10 -07:00
Toufic Mouallem
144a797564
feat: Support passing custom headers in requests ( #337 )
2019-03-26 13:48:41 +02:00
Drew Bell
b3e2a0ffd1
feat: extract custom types with extend option ( #313 )
* feat: extract custom types with extend option
Adds an `extend` option that lets you add custom types to be extracted
and returned alongside the defaults, either in a call to `parse()` or in
a custom extractor.
```
Mercury.parse(url, {
  extend: {
    last_edited: { selectors: ['#last-edited'], defaultCleaner: false },
  },
})
```
* chore: use Reflect.ownKeys
* feat: add CLI options
* doc: add extend param to cli help
* refactor: extract selectExtendedTypes
* feat: only overwrite null extended results
* feat: add allowMultiple extraction option
* feat: accept extendList CLI args
* feat: allow attribute selectors in extends on CLI
* test: update extend tests
* fix: don't invoke cleaner for custom types
* feat: always return array if allowMultiple
* test: add test for array of single result
* refactor: extract extractHtml
* refactor: destructure allowMultiple
* fix: wrap multiple matches in $ for cheerio shim
* fix: find extended types before any other munging
* feat: absolutize all links
* fix: clean content more directly
* doc: Update CLI docs in README
* chore: update dist
* doc: Document extend in custom extractor README
2019-03-25 15:36:20 -07:00
Adam Pash
b044cfa958
release: 2.0.0 ( #275 )
2019-02-13 15:46:45 -08:00
Adam Pash
9bf88b0ba3
chore: refactor format output adjustments ( #272 )
I had previously done this in an overly complicated manner. This PR cleans
it up a bit.
2019-02-13 13:30:49 -08:00
Adam Pash
ab56ce0de3
fix: custom parser generator ( #271 )
- swap fs import
- fix rollup config
2019-02-12 16:14:47 -08:00
Adam Pash
9b0664bc91
feat: add content format output options ( #256 )
2019-02-07 16:48:13 -08:00
Adam Pash
d884c3470c
release: 1.1.0 ( #245 )
2019-02-05 14:53:22 -08:00
Adam Pash
76d333f0be
deps: upgrade ( #218 )
2019-01-23 09:54:42 -08:00
Adam Pash
c643666c88
dx: automate fixture updates ( #197 )
2019-01-15 15:41:18 -08:00
Adam Pash
fd6c9d4fa3
release: 1.0.13 ( #183 )
2018-10-12 15:01:42 -07:00
Adam Pash
7fcd9b62eb
release: 1.0.12 ( #173 )
2017-04-10 16:10:52 -07:00
Jeremy Mack
5fcea1c5c3
fix: PARSING_NODE undefined ( #172 )
* fix: PARSING_NODE undefined
* chore: remove unused cleanup function/call
2017-04-10 15:55:21 -07:00
Adam Pash
a51cc81c27
release: 1.0.11 ( #171 )
2017-04-10 14:57:32 -07:00
Jeremy Mack
e92e798880
fix: viewport tags leaking to parent page ( #170 )
* fix: scrub meta viewport tags
They leak to the parent page when using the web version of Mercury
Parser.
* chore: build
* fix: keep DOM in memory to avoid conflicts
2017-04-10 14:35:23 -07:00
Adam Pash
86d6bd1dc1
release: 1.0.10 ( #169 )
2017-03-24 15:24:06 -07:00
Adam Pash
e56e8e24cd
release: 1.0.9 ( #167 )
2017-03-23 13:39:46 -07:00
Adam Pash
321c087be6
release: 1.0.8 ( #164 )
2017-03-22 14:08:22 -07:00
Adam Pash
e267d57d78
release: 1.0.7 ( #160 )
2017-03-15 09:16:04 -07:00
Kevin Ngao
f2e3f055c2
Fixes an issue with encoding ( #154 )
* fix: fixes an issue with encoding on the fetch level
2017-03-10 17:40:31 -05:00
Kevin Ngao
afbef9bc39
Fix Encoding on Body ( #143 )
* fix: check encoding on body
2017-03-06 11:36:56 -05:00
Adam Pash
9d4c883d51
release: 1.0.6 ( #142 )
2017-02-09 08:58:49 -08:00
Adam Pash
601b0fac16
release: 1.0.5 ( #136 )
2017-02-01 15:39:19 -08:00
Adam Pash
31eb4f9222
Feat: LinkedIn parser ( #123 )
* feat: rebuild custom parser
* feat: linkedin custom parser
2017-01-26 10:11:10 -08:00
Adam Pash
dbc706410b
release: 1.0.4 ( #122 )
2017-01-26 08:42:37 -08:00
Adam Pash
a710efd2d5
release: 1.0.3 ( #62 )
2016-12-09 12:15:40 -05:00
Adam Pash
332f85928f
release: 1.0.2 ( #54 )
2016-12-06 14:51:01 -05:00
Adam Pash
15656cb3e1
Refactor: running tests more efficiently ( #49 )
Only running one parser per page we're testing rather than a parser per field we're testing.
2016-12-05 15:39:45 -05:00
Adam Pash
edcb7295d1
release: 1.0.1 ( #48 )
2016-12-02 16:14:07 -08:00
Adam Pash
e9a36d6ebd
release: 1.0.0 so we can start doing proper releases ( #39 )
2016-11-30 17:49:50 -08:00
Janet
c4d72fb735
feat: add money.cnn custom parser ( #26 )
* feat: add money.cnn custom parser
* added timezone to cnn custom parser
2016-11-29 15:13:29 -08:00
Adam Pash
6343946dd8
Feat: custom timezones ( #29 )
* using moment-timezone to allow custom timezones
* added tz to tmz, even though still so-so
2016-11-29 14:46:46 -08:00
Adam Pash
a8face796a
Fix extension bugs ( #23 )
* feat: cleaning supplemental elements in nytimes (visible in web only)
closes https://github.com/postlight/mercury-reader-chrome-extension/issues/102
* wip
* fix: more generous date published bits
* feat: added washington post extractor (including figure transforms)
closes https://github.com/postlight/mercury-reader-chrome-extension/issues/100
* feat: cleaning zoom lightbox from gizmodo/kinja
* lint fix
2016-11-28 16:58:21 -08:00
Adam Pash
3a2f32b0eb
feat: added tmz custom parser ( #22 )
2016-11-28 15:10:28 -08:00
Adam Pash
7411922c55
feat: encoding response body based on content-type charset ( #21 )
Also some small code organization
2016-11-22 10:44:27 -08:00
Adam Pash
60a6861e18
Feat: browser support ( #19 )
Big undertaking to support Mercury in the browser. Builds are working and all tests are passing both for web and node builds. Most code is closely shared.
2016-11-21 14:17:06 -08:00
Adam Pash
eaea57461a
fix: servers returning bad headers was breaking request. temporarily ( #20 )
using fork with a fix for this until request merges the necessary pull request
2016-11-15 13:17:01 -08:00
Adam Pash
6e29848e9c
feat: making yarn-friendly for package manager ( #17 )
...
* updated several commands; some fixes exposed by yarn upgrade
* removed unnec dep
2016-10-28 11:10:42 -07:00
Adam Pash
048d654417
feat: parser auto-generates name; lint is more specific
2016-10-27 14:54:38 -07:00
Adam Pash
4d1d950807
updated generator templates for new style of import/export. also some
adjustments for usability
2016-10-27 10:44:06 -07:00
Adam Pash
de5b120b79
feat: allowing extractors to support multiple domains
2016-10-27 09:20:53 -07:00
Adam Pash
d038a36544
feat: custom medium extractor
2016-10-27 08:47:25 -07:00
Adam Pash
b65b0c98b0
feat: supporting all GMG sites using DeadspinExtractor
2016-10-26 16:05:15 -07:00
Adam Pash
17317823de
fix: bug that stopped proper attr cleaning in certain cases
2016-10-26 14:17:52 -07:00
Adam Pash
40768fa188
feat: support lazy loading video on deadspin
2016-10-26 11:53:42 -07:00
Adam Pash
c63f500433
fix: narrowed selector to fix blogspot title selector
2016-10-26 11:16:31 -07:00