Commit Graph

67 Commits (ad8d4aa268fd5ebf29ac7964d62c99d3bb9c8f4a)

Author SHA1 Message Date
John Holdun ad8d4aa268 release: 2.2.3 (#703) 2 years ago
Sarah Doire 8ca8a5f7e5 feat: add postlight.com custom extractor (#695) 2 years ago
John Holdun 39b9ff55c4 release: 2.2.2 (#689) 2 years ago
John Holdun 0d2bad544c chore: Update builds 2 years ago
Jad Termsani 0ccb91a487 release: 2.2.1 (#631) 3 years ago
Michael Ashley c5c000586d release: 2.2.0 (#496)
* release: 2.2.0
5 years ago
Adam Pash 713de25751 release: 2.1.1 (#446) 5 years ago
Adam Pash ca47f9c7a7 release: 2.1.0 (#373) 5 years ago
Toufic Mouallem 144a797564 feat: Support passing custom headers in requests (#337) 5 years ago
Drew Bell b3e2a0ffd1 feat: extract custom types with extend option (#313)
* feat: extract custom types with extend option

Adds an `extend` option that lets you add custom types to be extracted
and returned alongside the defaults, either in a call to `parse()` or in
a custom extractor.

```
// Options, including `extend`, are passed as an object in the second argument.
Mercury.parse(url, {
  extend: {
    last_edited: { selectors: ['#last-edited'], defaultCleaner: false },
  },
})
```

* chore: use Reflect.ownKeys

* feat: add CLI options

* doc: add extend param to cli help

* refactor: extract selectExtendedTypes

* feat: only overwrite null extended results

* feat: add allowMultiple extraction option

* feat: accept extendList CLI args

* feat: allow attribute selectors in extends on CLI

* test: update extend tests

* fix: don't invoke cleaner for custom types

* feat: always return array if allowMultiple

* test: add test for array of single result

* refactor: extract extractHtml

* refactor: destructure allowMultiple

* fix: wrap multiple matches in $ for cheerio shim

* fix: find extended types before any other munging

* feat: absolutize all links

* fix: clean content more directly

* doc: Update CLI docs in README

* chore: update dist

* doc: Document extend in custom extractor README
5 years ago
Adam Pash b044cfa958 release: 2.0.0 (#275) 5 years ago
Adam Pash 9bf88b0ba3 chore: refactor format output adjustments (#272)
I had previously done this in an overly complicated manner. This PR cleans
it up a bit.
5 years ago
Adam Pash ab56ce0de3 fix: custom parser generator (#271)
- swap fs import
- fix rollup config
5 years ago
Adam Pash 9b0664bc91 feat: add content format output options (#256) 5 years ago
Adam Pash d884c3470c release: 1.1.0 (#245) 5 years ago
Adam Pash 76d333f0be deps: upgrade (#218) 5 years ago
Adam Pash fd6c9d4fa3 release: 1.0.13 (#183) 6 years ago
Adam Pash 7fcd9b62eb release: 1.0.12 (#173) 7 years ago
Adam Pash 86d6bd1dc1 release: 1.0.10 (#169) 7 years ago
Adam Pash e56e8e24cd release: 1.0.9 (#167) 7 years ago
Adam Pash 321c087be6 release: 1.0.8 (#164) 7 years ago
Adam Pash e267d57d78 release: 1.0.7 (#160) 7 years ago
Kevin Ngao f2e3f055c2 Fixes an issue with encoding (#154)
* fix: fixes an issue with encoding on the fetch level
7 years ago
Kevin Ngao afbef9bc39 Fix Encoding on Body (#143)
* fix: check encoding on body
7 years ago
Adam Pash 9d4c883d51 release: 1.0.6 (#142) 7 years ago
Adam Pash 601b0fac16 release: 1.0.5 (#136) 7 years ago
Adam Pash dbc706410b release: 1.0.4 (#122) 7 years ago
Adam Pash a710efd2d5 release: 1.0.3 (#62) 8 years ago
Adam Pash 332f85928f release: 1.0.2 (#54) 8 years ago
Adam Pash edcb7295d1 release: 1.0.1 (#48) 8 years ago
Adam Pash e9a36d6ebd release: 1.0.0 so we can start doing proper releases (#39) 8 years ago
Janet c4d72fb735 feat: add money.cnn custom parser (#26)
* feat: add money.cnn custom parser

* added timezone to cnn custom parser
8 years ago
Adam Pash 6343946dd8 Feat: custom timezones (#29)
* using moment-timezone to allow custom timezones

* added tz to tmz, even though still so-so
8 years ago
Adam Pash a8face796a Fix extension bugs (#23)
* feat: cleaning supplemental elements in nytimes (visible in web only)

closes https://github.com/postlight/mercury-reader-chrome-extension/issues/102

* wip

* fix: more generous date published bits

* feat: added washington post extractor (including figure transforms)

closes https://github.com/postlight/mercury-reader-chrome-extension/issues/100

* feat: cleaning zoom lightbox from gizmodo/kinja

* lint fix
8 years ago
Adam Pash 3a2f32b0eb feat: added tmz custom parser (#22) 8 years ago
Adam Pash 7411922c55 feat: encoding response body based on content-type charset (#21)
Also some small code organization
8 years ago
Adam Pash 60a6861e18 Feat: browser support (#19)
Big undertaking to support Mercury in the browser. Builds are working and all tests are passing both for web and node builds. Most code is closely shared.
8 years ago
Adam Pash eaea57461a fix: servers returning bad headers was breaking request. temporarily (#20)
using fork with a fix for this until request merges the necessary pull request
8 years ago
Adam Pash 6e29848e9c feat: making yarn-friendly for package manager (#17)
* updated several commands; some fixes exposed by yarn upgrade

* removed unnec dep
8 years ago
Adam Pash de5b120b79 feat: allowing extractors to support multiple domains 8 years ago
Adam Pash d038a36544 feat: custom medium extractor 8 years ago
Adam Pash b65b0c98b0 feat: supporting all GMG sites using DeadspinExtractor 8 years ago
Adam Pash 17317823de fix: bug that stopped proper attr cleaning in certain cases 8 years ago
Adam Pash 40768fa188 feat: support lazy loading video on deadspin 8 years ago
Adam Pash c63f500433 fix: narrowed selector to fix blogspot title selector 8 years ago
Adam Pash d3b11be473 feat: keeping youtube and vimeo iframe embeds (#14)
* feat: keeping youtube and vimeo iframe embeds

* fix: removing class from article correctly
8 years ago
Adam Pash 5c7f2cd28e fix: better selector for nytimes authors 8 years ago
Drew Bell 76db95e884 feat: Add custom extractor for Apartment Therapy 8 years ago
Drew Bell a708ad3b4f feat: Add custom parser for broadwayworld.com 8 years ago
Adam Pash 896021227d feat: added deadspin custom parser 8 years ago