88 Commits (fix-remove-moment-js)

Author SHA1 Message Date
Sarah Doire c0364ec52b
feat: update all fixtures and custom parsers to match (#713)
* feat: Refactor and update fixtures

This patch changes how fixtures are stored. Previously, a fixture's folder identified its domain and its filename identified when it was fetched. This has been changed so that the filename indicates the domain and the modified time of the file indicates how recently it was fetched. A fixture's filename can optionally include a modifier to distinguish between two different page types on the same domain, for example.

Also included here are changes to the update-fixture script, both to accommodate the new filename scheme and to actually update all fixtures. The functionality for running automatically and opening PRs has been removed but will likely be reintroduced.

Finally, all fixtures have been updated.

* Remove reference to deleted extractor

* feat: first batch of test and parser updates due to new fixtures

* feat: update more custom parsers and unit tests

* feat: update more custom parsers and unit tests and remove unnecessary parser

* feat: update more custom parsers and unit tests

* feat: update more parsers and add correct bloomberg html files

* fix: remove console statement

* feat: all parsers updated and tests passing

* fix: update date_published tests to account for test server time difference

* fix: cleanup remaining fixtures in folders

* feat: move fixtures for newest custom parsers

* feat: remove script changes

* fix: update dist files to account for reverting script changes

* adding .DS_Store to .gitignore

* adding .DS_Store to .gitignore -- 2

* adding .DS_Store to .gitignore -- 3 lol

* cleaning up some tests

* fix: ran build:generator command to update generate-custom-parser dist file

* fix: update rollup configs to generate source maps and update source maps

* fix: use underscore in place of unused error variable

* fix: remove unused fixture

Co-authored-by: Postlight Bot <adam.pash+postlight-bot@postlight.com>
Co-authored-by: flbn <overasc@gmail.com>
1 year ago
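
The fixture naming scheme described in #713 (filename carries the domain plus an optional modifier, modified time carries the fetch date) could be read back with something like the sketch below. The helper names, the `--` separator between domain and modifier, and the `.html` extension are all assumptions made for illustration; the commit message does not specify them.

```javascript
const path = require('path');

// Hypothetical helper: split a fixture filename into domain and optional modifier.
// ASSUMPTION: the modifier follows the domain after "--" and the file ends in ".html".
function parseFixtureName(filePath) {
  const base = path.basename(filePath, '.html');
  const [domain, modifier = null] = base.split('--');
  return { domain, modifier };
}

// Hypothetical helper: the fetch date is no longer in the name, so a fixture's
// age is derived from its modified time (for example, fs.statSync(file).mtimeMs).
function fixtureAgeInDays(mtimeMs, nowMs = Date.now()) {
  const msPerDay = 24 * 60 * 60 * 1000;
  return Math.floor((nowMs - mtimeMs) / msPerDay);
}
```

Under these assumptions, two page types on the same domain would live side by side as two files that differ only in the modifier.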
Sarah Doire 7b68bcd94c
feat: remove obsolete custom extractors (#712) 2 years ago
John Holdun ad8d4aa268
release: 2.2.3 (#703) 2 years ago
Sarah Doire 8ca8a5f7e5
feat: add postlight.com custom extractor (#695) 2 years ago
John Holdun 39b9ff55c4
release: 2.2.2 (#689) 2 years ago
John Holdun 0d2bad544c chore: Update builds 2 years ago
Michael Ashley 56a19bf934
fix: updating generate-parser dist (#499) 2 years ago
Jad Termsani 0ccb91a487
release: 2.2.1 (#631) 3 years ago
Michael Ashley c5c000586d
release: 2.2.0 (#496)
* release: 2.2.0
5 years ago
Adam Pash 713de25751
release: 2.1.1 (#446) 5 years ago
Adam Pash ca47f9c7a7
release: 2.1.0 (#373) 5 years ago
Toufic Mouallem 144a797564
feat: Support passing custom headers in requests (#337) 5 years ago
Drew Bell b3e2a0ffd1 feat: extract custom types with extend option (#313)
* feat: extract custom types with extend option

Adds an `extend` option that lets you add custom types to be extracted
and returned alongside the defaults, either in a call to `parse()` or in
a custom extractor.

```
Mercury.parse(url, {
  // `extend` is a key of the options object passed as the second argument
  extend: {
    last_edited: { selectors: ['#last-edited'], defaultCleaner: false }
  }
})
```

* chore: use Reflect.ownKeys

* feat: add CLI options

* doc: add extend param to cli help

* refactor: extract selectExtendedTypes

* feat: only overwrite null extended results

* feat: add allowMultiple extraction option

* feat: accept extendList CLI args

* feat: allow attribute selectors in extends on CLI

* test: update extend tests

* fix: don't invoke cleaner for custom types

* feat: always return array if allowMultiple

* test: add test for array of single result

* refactor: extract extractHtml

* refactor: destructure allowMultiple

* fix: wrap multiple matches in $ for cheerio shim

* fix: find extended types before any other munging

* feat: absolutize all links

* fix: clean content more directly

* doc: Update CLI docs in README

* chore: update dist

* doc: Document extend in custom extractor README
5 years ago
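
Two of the behaviours listed in #313 ("always return array if allowMultiple" and "only overwrite null extended results") can be illustrated with a small standalone sketch. This is one possible reading of those bullet points, not the library's implementation: both function names are hypothetical, and `matches` stands in for whatever the selectors found on the page.

```javascript
// Hypothetical sketch: shape the matches for one extended type.
// ASSUMPTION: with allowMultiple the result is always an array (even for zero
// or one match); without it, the first match or null is returned.
function shapeExtendedResult(matches, { allowMultiple = false } = {}) {
  if (allowMultiple) return matches;
  return matches.length > 0 ? matches[0] : null;
}

// Hypothetical sketch: copy an extended value into the result only when the
// existing value is null or undefined, so earlier results are never clobbered.
function mergeExtended(current, incoming) {
  const merged = { ...current };
  for (const key of Reflect.ownKeys(incoming)) {
    if (merged[key] == null) merged[key] = incoming[key];
  }
  return merged;
}
```

For example, merging `{ title: 'T', last_edited: null }` with `{ title: 'X', last_edited: '2020' }` keeps the existing title and fills in only `last_edited`.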
Adam Pash b044cfa958
release: 2.0.0 (#275) 5 years ago
Adam Pash 9bf88b0ba3
chore: refactor format output adjustments (#272)
I had previously done this in an overly complicated manner. This PR cleans
it up a bit.
5 years ago
Adam Pash ab56ce0de3
fix: custom parser generator (#271)
- swap fs import
- fix rollup config
5 years ago
Adam Pash 9b0664bc91
feat: add content format output options (#256) 5 years ago
Adam Pash d884c3470c
release: 1.1.0 (#245) 5 years ago
Adam Pash 76d333f0be
deps: upgrade (#218) 5 years ago
Adam Pash c643666c88
dx: automate fixture updates (#197) 5 years ago
Adam Pash fd6c9d4fa3
release: 1.0.13 (#183) 6 years ago
Adam Pash 7fcd9b62eb release: 1.0.12 (#173) 7 years ago
Jeremy Mack 5fcea1c5c3 fix: PARSING_NODE undefined (#172)
* fix: PARSING_NODE undefined

* chore: remove unused cleanup function/call
7 years ago
Adam Pash a51cc81c27 release: 1.0.11 (#171) 7 years ago
Jeremy Mack e92e798880 fix: viewport tags leaking to parent page (#170)
* fix: scrub meta viewport tags

They leak to the parent page when using the web version of Mercury
Parser.

* chore: build

* fix: keep DOM in memory to avoid conflicts
7 years ago
Adam Pash 86d6bd1dc1 release: 1.0.10 (#169) 7 years ago
Adam Pash e56e8e24cd release: 1.0.9 (#167) 7 years ago
Adam Pash 321c087be6 release: 1.0.8 (#164) 7 years ago
Adam Pash e267d57d78 release: 1.0.7 (#160) 7 years ago
Kevin Ngao f2e3f055c2 Fixes an issue with encoding (#154)
* fix: fixes an issue with encoding on the fetch level
7 years ago
Kevin Ngao afbef9bc39 Fix Encoding on Body (#143)
* fix: check encoding on body
7 years ago
Adam Pash 9d4c883d51 release: 1.0.6 (#142) 7 years ago
Adam Pash 601b0fac16 release: 1.0.5 (#136) 7 years ago
Adam Pash 31eb4f9222 Feat: LinkedIn parser (#123)
* feat: rebuild custom parser

* feat: linkedin custom parser
7 years ago
Adam Pash dbc706410b release: 1.0.4 (#122) 7 years ago
Adam Pash a710efd2d5 release: 1.0.3 (#62) 8 years ago
Adam Pash 332f85928f release: 1.0.2 (#54) 8 years ago
Adam Pash 15656cb3e1 Refactor: running tests more efficiently (#49)
Only running one parser per page we're testing rather than a parser per field we're testing.
8 years ago
Adam Pash edcb7295d1 release: 1.0.1 (#48) 8 years ago
Adam Pash e9a36d6ebd release: 1.0.0 so we can start doing proper releases (#39) 8 years ago
Janet c4d72fb735 feat: add money.cnn custom parser (#26)
* feat: add money.cnn custom parser

* added timezone to cnn custom parser
8 years ago
Adam Pash 6343946dd8 Feat: custom timezones (#29)
* using moment-timezone to allow custom timezones

* added tz to tmz, even though still so-so
8 years ago
Adam Pash a8face796a Fix extension bugs (#23)
* feat: cleaning supplemental elements in nytimes (visible in web only)

closes https://github.com/postlight/mercury-reader-chrome-extension/issues/102

* wip

* fix: more generous date published bits

* feat: added washington post extractor (including figure transforms)

closes https://github.com/postlight/mercury-reader-chrome-extension/issues/100

* feat: cleaning zoom lightbox from gizmodo/kinja

* lint fix
8 years ago
Adam Pash 3a2f32b0eb feat: added tmz custom parser (#22) 8 years ago
Adam Pash 7411922c55 feat: encoding response body based on content-type charset (#21)
Also some small code organization
8 years ago
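
The idea behind #21, decoding the response body according to the charset in the `Content-Type` header, might look roughly like this. It is a minimal sketch under stated assumptions: the function names are hypothetical, the UTF-8 fallback is a guess, and the standard `TextDecoder` is used as a stand-in because the commit does not say which decoder the project relies on.

```javascript
// Hypothetical helper: pull the charset label out of a Content-Type header,
// e.g. 'text/html; charset=ISO-8859-1'. ASSUMPTION: fall back to UTF-8.
function charsetFromContentType(contentType, fallback = 'utf-8') {
  const match = /charset\s*=\s*"?([^\s;"]+)/i.exec(contentType || '');
  return match ? match[1].toLowerCase() : fallback;
}

// Hypothetical helper: decode raw bytes with the charset named in the header.
// TextDecoder throws a RangeError for labels it does not recognise.
function decodeBody(bytes, contentType) {
  const charset = charsetFromContentType(contentType);
  return new TextDecoder(charset).decode(bytes);
}
```

A caller would pass the raw response bytes and the header value, for example `decodeBody(buffer, response.headers['content-type'])`, where `response` is whatever the HTTP client returns.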
Adam Pash 60a6861e18 Feat: browser support (#19)
Big undertaking to support Mercury in the browser. Builds are working and all tests are passing both for web and node builds. Most code is closely shared.
8 years ago
Adam Pash eaea57461a fix: servers returning bad headers was breaking request. temporarily (#20)
using fork with a fix for this until request merges the necessary pull request
8 years ago
Adam Pash 6e29848e9c feat: making yarn-friendly for package manager (#17)
* updated several commands; some fixes exposed by yarn upgrade

* removed unnecessary dependency
8 years ago
Adam Pash 048d654417 feat: parser auto-generates name; lint is more specific 8 years ago
Adam Pash 4d1d950807 updated generator templates for new style of import/export. also some
adjustments for usability
8 years ago