217 Commits (bdb751fb53ecac944b754d118b740db52bec2ba7)
 

Author SHA1 Message Date
Adam Pash bdb751fb53 feat: more cleaning for wired (#56) 8 years ago
Janet e7e41bd242 feat: the guardian custom extractor (#41) 8 years ago
Adam Pash 332f85928f release: 1.0.2 (#54) 8 years ago
Adam Pash 81aa89f2c1 feat: youtube custom extractor (#53) 8 years ago
Adam Pash 2fb47640f2 Feat: detect platforms (#52)
Detectors for matching extractors for publishing platforms. Currently supporting Medium and Blogger.
8 years ago
Adam Pash 64c0fad2fd fix: preserve whitespace (#51)
No longer normalizing whitespace in html
8 years ago
Adam Pash 15656cb3e1 Refactor: running tests more efficiently (#49)
Only running one parser per page we're testing rather than a parser per field we're testing.
8 years ago
Adam Pash edcb7295d1 release: 1.0.1 (#48) 8 years ago
Adam Pash f9902cfa05 Fix: extension bugs (#47)
* feat: lead image on atlantic stories now included

* feat: supporting buzzfeed "longform" template

* feat: cleaning .parter-box from the atlantic
8 years ago
Adam Pash 16860f1d85 feat: improved nyt parser (#46)
NYT was one of the first, and its test was stale and it didn't have all
of its fields well defined.
8 years ago
Adam Pash d0453efbf8 feat: improvements for nyer magazine articles (#45)
adds dek and date_published for magazine template
8 years ago
Adam Pash 00f8965c1f fix: cleaning up deks (#44)
We've solidified what we consider a dek. This PR removes the dek selectors that do not fit that mold.
8 years ago
Janet b415d1d37c feat: aol custom extractor (#42)
* feat: aol custom parser

* removed work from other commits. merged with latest master
8 years ago
Matt 4cc3b68b5e feat: remove footer links (#40)
the links at the bottom of the stories feel a little spammy because of how we treat links vs. the way they are displayed on the Times, would like to clean them
8 years ago
Adam Pash e9a36d6ebd release: 1.0.0 so we can start doing proper releases (#39) 8 years ago
Adam Pash ff1963bdca feat: new cleaner for wapo (#38) 8 years ago
Adam Pash 0e6ccdf622 fix: browser cleanup (#35)
Cleaning up after the parser when it's done in the browser, before
returning result.
8 years ago
Adam Pash bd0694fbba feat: preview with optional rebuild (#36)
Now the preview script has an optional build step. Adding --no-rebuild
as an argument to the script will skip the rebuild step and just show a
preview of the parse as is with the current build.
8 years ago
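Note on #36: the optional rebuild described above amounts to checking for one flag before deciding whether to build. The sketch below is illustrative only and is not the project's preview script; `parsePreviewArgs` and the returned field names are hypothetical, and only the `--no-rebuild` flag comes from the commit message.

```javascript
// Hypothetical helper: split preview-script arguments into a rebuild
// decision and the list of URLs to preview.
function parsePreviewArgs(argv) {
  // Rebuild by default; skip only when --no-rebuild is passed.
  const rebuild = !argv.includes('--no-rebuild');
  // Treat every argument that is not a flag as a URL to preview.
  const urls = argv.filter((arg) => !arg.startsWith('--'));
  return { rebuild, urls };
}
```

A script could call it as `parsePreviewArgs(process.argv.slice(2))` and run its build step only when `rebuild` is true.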
Adam Pash 181b39b238 feat: ci speedup (#37)
minor speedup to see failing tests. linting happens first
8 years ago
Silas Burton c3d98a0d76 Feat cnn extractor (#34)
* wip: cnn custom extractor

* wip: cnn works except first paragraph

* final touches on cnn parser

* cleanup
8 years ago
Silas Burton a0570f8e94 feat: extractor for the verge (#33)
* feat: extractor for the verge's standard article template

* feat: basic support for the verge feature template

* feat: allow multiple links to be previewed

* feat: content selector arrays

Content selector arrays allow custom parsers to select multiple elements
to match and include in the result.

* feat: updated verge parser to use multimatch selectors

* lint fix

* cleanup test builds
8 years ago
Adam Pash 233ca11a33 fix: added timezone to new republic date (#32) 8 years ago
Adam Pash cfe7f34be4 fix: normalizing spaces for authors/dek/title (#31)
* fix: normalizing spaces for authors/dek/title
8 years ago
Adam Pash 9a23b24a89 feat: adjustment for huffpo. skipping overly aggressive default cleaners (#30) 8 years ago
Silas Burton be2e4b5c80 Feat: huffington post extractor (#28)
* wip: huffpo custom extractor

* wip: some huffpo cleanup
8 years ago
Adam Pash 94198c0a65 feat: new republic custom extractor (#25)
* wip: new republic custom extractor

* feat: new republic article extractor

* feat: new republic minutes article extractor
8 years ago
Janet c4d72fb735 feat: add money.cnn custom parser (#26)
* feat: add money.cnn custom parser

* added timezone to cnn custom parser
8 years ago
Adam Pash 6343946dd8 Feat: custom timezones (#29)
* using moment-timezone to allow custom timezones

* added tz to tmz, even though still so-so
8 years ago
Adam Pash 19e7345bfb feat: test builds are created for preview purposes so we aren't committing dist every time (#27) 8 years ago
Adam Pash a8face796a Fix extension bugs (#23)
* feat: cleaning supplemental elements in nytimes (visible in web only)

closes https://github.com/postlight/mercury-reader-chrome-extension/issues/102

* wip

* fix: more generous date published bits

* feat: added washington post extractor (including figure transforms)

closes https://github.com/postlight/mercury-reader-chrome-extension/issues/100

* feat: cleaning zoom lightbox from gizmodo/kinja

* lint fix
8 years ago
Adam Pash 3a2f32b0eb feat: added tmz custom parser (#22) 8 years ago
Adam Pash 783a9cfb2f fix: changed overly liberal regex for removing transparent images 8 years ago
Adam Pash 7411922c55 feat: encoding response body based on content-type charset (#21)
Also some small code organization
8 years ago
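Note on #21: decoding a response body by the charset in its Content-Type header can be sketched roughly as follows. This is a minimal illustration using the standard `TextDecoder`, not the code from the commit; `charsetFromContentType` and `decodeBody` are hypothetical names, and the UTF-8 fallback is an assumption.

```javascript
// Hypothetical helper: pull the charset label out of a Content-Type value
// such as "text/html; charset=ISO-8859-1".
function charsetFromContentType(contentType, fallback = 'utf-8') {
  if (!contentType) return fallback;
  const match = /charset=\s*"?([^\s;"]+)"?/i.exec(contentType);
  return match ? match[1].toLowerCase() : fallback;
}

// Decode raw bytes with the declared charset, falling back to UTF-8
// (an assumption) when the label is missing or not supported.
function decodeBody(bytes, contentType) {
  const charset = charsetFromContentType(contentType);
  try {
    return new TextDecoder(charset).decode(bytes);
  } catch (err) {
    // TextDecoder throws a RangeError for labels it does not know.
    return new TextDecoder('utf-8').decode(bytes);
  }
}
```

Which charset labels `TextDecoder` accepts depends on the runtime, so a real implementation may need a dedicated encoding library for legacy encodings.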
Adam Pash 88c125d022 chore: package upgrades 8 years ago
Adam Pash c30fb2e4c0 chore: updated readme 8 years ago
Adam Pash 60a6861e18 Feat: browser support (#19)
Big undertaking to support Mercury in the browser. Builds are working and all tests are passing both for web and node builds. Most code is closely shared.
8 years ago
Adam Pash eaea57461a fix: servers returning bad headers was breaking request. temporarily (#20)
using fork with a fix for this until request merges the necessary pull request
8 years ago
Adam Pash 629eada1f7 feat: recording/playing back network requests with nock (#18)
* feat: recording/playing back network requests with nock

* lint fix
8 years ago
Adam Pash 6e29848e9c feat: making yarn-friendly for package manager (#17)
* updated several commands; some fixes exposed by yarn upgrade

* removed unnec dep
8 years ago
Adam Pash e325d860fd Feat: improving ci (#16)
This commit also swaps in yarn for npm and tweaks circle ci a bit.

* appveyor.yml first go

* changing node

* ps

* narrow it down

* trying this

* fix airbnb module

* trying with yarn

* logging

* hybrid?

* trying yarn w/circle

* bump workers?

* build off?

* updating script

* tweaking script for appveyor

* bumping maxworkers

* cleaning up

* build step?

* yarn it

* added appveyor badge
8 years ago
Adam Pash 071218ab3c chore: added repo 8 years ago
Adam Pash 41c3454590 fix: circle test passing badge 8 years ago
Adam Pash 4c9910384a Feat: adding circle ci (#15)
* added circle.yml config

* set maxworkers in circle

* trying diff node versions

* multiple node

* pre nvm install

* testing parallel

* added badge to readme

* clean up circle.yml
8 years ago
Adam Pash 048d654417 feat: parser auto-generates name; lint is more specific 8 years ago
Adam Pash 65c641a879 feat: enforcing line break rules in linter 8 years ago
Adam Pash 4d1d950807 updated generator templates for new style of import/export. also some
adjustments for usability
8 years ago
Adam Pash 7fa90f59b7 making all.js export a generic function to decrease possibility of error 8 years ago
Adam Pash de5b120b79 feat: allowing extractors to support multiple domains 8 years ago
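Note on de5b120b79: one way an extractor can serve several domains is to register it under every hostname it lists. The sketch below assumes this approach and is not taken from the commit; the `supportedDomains` field, the example hostnames, and both function names are hypothetical.

```javascript
// Hypothetical extractor shape: a primary domain plus optional extras.
const ExampleExtractor = {
  domain: 'www.example.com',
  supportedDomains: ['m.example.com', 'example.org'],
};

// Register the extractor under its primary domain and each extra domain.
function registerExtractor(registry, extractor) {
  const domains = [extractor.domain, ...(extractor.supportedDomains || [])];
  for (const domain of domains) {
    registry.set(domain, extractor);
  }
  return registry;
}

// Look up the extractor for a URL by its hostname; null when none matches.
function extractorForUrl(registry, url) {
  const { hostname } = new URL(url);
  return registry.get(hostname) || null;
}
```

With a `new Map()` as the registry, `extractorForUrl(registry, 'https://m.example.com/a')` returns the same object that was registered for `www.example.com`.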
Adam Pash d038a36544 feat: custom medium extractor 8 years ago
Adam Pash 007ddec8ac feat: allowing iframes from src domain 8 years ago