Commit Graph

575 Commits (fix-remove-moment-js)
 

Author SHA1 Message Date
kik0220 c389c966d7 feat: add jvndb.jvn.jp custom parser (#345) 5 years ago
kik0220 8493d05cb5 feat: add scan.netsecurity.ne.jp custom parser (#347) 5 years ago
kik0220 2a76c6c212 feat: add www.elecom.co.jp custom parser (#348) 5 years ago
kik0220 a9e010b718 feat: add www.sanwa.co.jp custom parser (#349) 5 years ago
kik0220 1639eae324 feat: add www.asahi.com custom parser (#350) 5 years ago
kik0220 21f7de70c1 feat: add buzzap.jp custom parser (#351) 5 years ago
kik0220 f3a7e393a3 feat: add www.ossnews.jp custom parser (#352) 5 years ago
kik0220 c309bdb373 feat: add otrs.com custom parser (#353) 5 years ago
Alexsander Akers 71c4d05037 Include "src/shims" for webpack builds for web (#302) 5 years ago
Frankie Simms a3fe02678c chore: small CoC typofix (#358) 5 years ago
John Holdun 437f50a5c8 fix: Initialize Content-Type as empty string if not present (#359) 5 years ago
Frankie Simms da9a836eab chore: remove unneeded import (#357) 5 years ago
Frankie Simms bafa764000 chore: set up ciftr for failed test reports (#343) 5 years ago
Toufic Mouallem 262dda94b3 fix: explicity reject non-200 status codes (#342) 5 years ago
Drew Bell b6c82f2b16 doc: fix extend typo in README (#340) 5 years ago
Toufic Mouallem 144a797564 feat: Support passing custom headers in requests (#337) 5 years ago
Toufic Mouallem 3ed778b53e fix: Adapt CNBC extractor to article redesign (#336) 5 years ago
Toufic Mouallem da9606a4cb docs: Add parsing custom HTML to README.md (#326) 5 years ago
Drew Bell b3e2a0ffd1 feat: extract custom types with extend option (#313)
* feat: extract custom types with extend option

Adds an `extend` option that lets you add custom types to be extracted
and returned alongside the defaults, either in a call to `parse()` or in
a custom extractor.

```
Mercury.parse(url, {
  extend: {
    last_edited: { selectors: ['#last-edited'], defaultCleaner: false },
  },
});
```

* chore: use Reflect.ownKeys

* feat: add CLI options

* doc: add extend param to cli help

* refactor: extract selectExtendedTypes

* feat: only overwrite null extended results

* feat: add allowMultiple extraction option

* feat: accept extendList CLI args

* feat: allow attribute selectors in extends on CLI

* test: update extend tests

* fix: don't invoke cleaner for custom types

* feat: always return array if allowMultiple

* test: add test for array of single result

* refactor: extract extractHtml

* refactor: destructure allowMultiple

* fix: wrap multiple matches in $ for cheerio shim

* fix: find extended types before any other munging

* feat: absolutize all links

* fix: clean content more directly

* doc: Update CLI docs in README

* chore: update dist

* doc: Document extend in custom extractor README
5 years ago
Toufic Mouallem 136d6df798 feat: Return specific errors on failed parse attempts 5 years ago
Toufic Mouallem a250f403f5 fix: Preserve whitespace in certain HTML elements (#333) 5 years ago
Adam Pash 2a3ade706d fix: run parser preview 5 years ago
Ben Ubois a7e4c67d1d Extract content from GitHub repos. (#306)
* Extract content from GitHub repos.

* Add published and dek.

* Timezone fix.
5 years ago
Matthew Watkins 6e66887048 docs: add content formats to README.md (#318) 5 years ago
Toufic Mouallem 0940971069 fix: better handling for responsive images (#312) 5 years ago
Drew Bell 785a22245f feat: switch from forked request to postman-request (#319) 5 years ago
Toufic Mouallem 7844129fda feat: Add custom parser for Reddit (#307) 5 years ago
Drew Bell 13581cd899 feat: upgrade watchify to remove vulnerable hoek dep (#320) 5 years ago
Drew Bell 91fb0dfb46 fix: update parse signature in tests (#315) 5 years ago
Adam Pash ffb25f34d7 docs: add usage gif (#308) 5 years ago
Toufic Mouallem 9714cb70c5 feat: Use Deadspin parser for all Kinja websites (#304) 5 years ago
Jordan Hotmann 83d1c2401b feat: add custom extractor for blisterreview.com (#299) 5 years ago
kik0220 d9a1e7b22b feat: add news.mynavi.jp custom parser (#287) 5 years ago
Olli Sulopuisto 44a7ec791d docs: typofix (#300) 5 years ago
Adam Pash 0a15a37f04 fix: ci artifact paths (#301) 5 years ago
Adam Pash 9698d9a0c4 dx: comment on custom parser pr fix (#278)
* dx: comment on custom parser pr fix

* fix path

* write json

* chore: rename comment script
5 years ago
Ben Ubois ed14203e97 fix: return early if creating the resource failed. (#285) 5 years ago
greenkeeper[bot] 52dfdda553 Update mocha to the latest version 🚀 (#282)
* chore(package): update mocha to version 6.0.0

* chore(package): update lockfile yarn.lock
5 years ago
Adam Pash b044cfa958 release: 2.0.0 (#275) 5 years ago
Adam Pash 2afd8c9fa8 fix: jquery doesn't like the case insensitive selector (#274) 5 years ago
Adam Pash 9bf88b0ba3 chore: refactor format output adjustments (#272)
I had previously done this in an overly complicated manner. This PR cleans
it up a bit.
5 years ago
David Brownman 867623ab33 chore: add files to package.json (#269) 5 years ago
Adam Pash ab56ce0de3 fix: custom parser generator (#271)
- swap fs import
- fix rollup config
5 years ago
Ben Ubois 0e27448866 feat: Various Character Encoding Improvements (#270)
* Support HTML5 charset tag

In HTML5 `<meta charset="">` is shorthand for `<meta http-equiv="content-type" content="">`
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/meta

* Handle more character encoding declaration methods.
5 years ago
Madison Kanna b3fa18b6d9 docs: delete extra semicolon (#266) 5 years ago
Adam Pash e033835c72 fix: parse signature in cli (#259) 5 years ago
Adam Pash 32748ad4c5 dx: add .prettierignore (#258) 5 years ago
Adam Pash 2d0f10a888 dx: add .prettierignore (#257) 5 years ago
Adam Pash 9b0664bc91 feat: add content format output options (#256) 5 years ago
Adam Pash a57f29eec3 release: 1.1.1 (#254)
see [changelog](./CHANGELOG.md) for changes.
5 years ago