Adam Pash
bdb751fb53
feat: more cleaning for wired ( #56 )
8 years ago
Janet
e7e41bd242
feat: the guardian custom extractor ( #41 )
8 years ago
Adam Pash
332f85928f
release: 1.0.2 ( #54 )
8 years ago
Adam Pash
81aa89f2c1
feat: youtube custom extractor ( #53 )
8 years ago
Adam Pash
2fb47640f2
Feat: detect platforms ( #52 )
...
Detectors for matching extractors for publishing platforms. Currently supporting Medium and Blogger.
8 years ago
Adam Pash
64c0fad2fd
fix: preserve whitespace ( #51 )
...
No longer normalizing whitespace in html
8 years ago
Adam Pash
15656cb3e1
Refactor: running tests more efficiently ( #49 )
...
Only running one parser per page we're testing rather than a parser per field we're testing.
8 years ago
Adam Pash
edcb7295d1
release: 1.0.1 ( #48 )
8 years ago
Adam Pash
f9902cfa05
Fix: extension bugs ( #47 )
...
* feat: lead image on atlantic stories now included
* feat: supporting buzzfeed "longform" template
* feat: cleaning .partner-box from the atlantic
8 years ago
Adam Pash
16860f1d85
feat: improved nyt parser ( #46 )
...
NYT was one of the first extractors; its test was stale and not all of
its fields were well defined.
8 years ago
Adam Pash
d0453efbf8
feat: improvements for nyer magazine articles ( #45 )
...
adds dek and date_published for magazine template
8 years ago
Adam Pash
00f8965c1f
fix: cleaning up deks ( #44 )
...
We've solidified what we consider a dek. This PR removes the dek selectors that do not fit that mold.
8 years ago
Janet
b415d1d37c
feat: aol custom extractor ( #42 )
...
* feat: aol custom parser
* removed work from other commits. merged with latest master
8 years ago
Matt
4cc3b68b5e
feat: remove footer links ( #40 )
...
The links at the bottom of the stories feel a little spammy because of how we treat links versus how they are displayed on the Times, so we would like to clean them.
8 years ago
Adam Pash
e9a36d6ebd
release: 1.0.0 so we can start doing proper releases ( #39 )
8 years ago
Adam Pash
ff1963bdca
feat: new cleaner for wapo ( #38 )
8 years ago
Adam Pash
0e6ccdf622
fix: browser cleanup ( #35 )
...
Cleaning up after the parser when it's done in the browser, before
returning result.
8 years ago
Adam Pash
bd0694fbba
feat: preview with optional rebuild ( #36 )
...
Now the preview script has an optional build step. Adding --no-rebuild
as an argument to the script will skip the rebuild step and just show a
preview of the parse as is with the current build.
8 years ago
Adam Pash
181b39b238
feat: ci speedup ( #37 )
...
minor speedup to see failing tests. linting happens first
8 years ago
Silas Burton
c3d98a0d76
Feat: cnn extractor ( #34 )
...
* wip: cnn custom extractor
* wip: cnn works except first paragraph
* final touches on cnn parser
* cleanup
8 years ago
Silas Burton
a0570f8e94
feat: extractor for the verge ( #33 )
...
* feat: extractor for the verge's standard article template
* feat: basic support for the verge feature template
* feat: allow multiple links to be previewed
* feat: content selector arrays
Content selector arrays allow custom parsers to select multiple elements
to match and include in the result.
* feat: updated verge parser to use multimatch selectors
* lint fix
* cleanup test builds
8 years ago
Adam Pash
233ca11a33
fix: added timezone to new republic date ( #32 )
8 years ago
Adam Pash
cfe7f34be4
fix: normalizing spaces for authors/dek/title ( #31 )
...
* fix: normalizing spaces for authors/dek/title
8 years ago
Adam Pash
9a23b24a89
feat: adjustment for huffpo. skipping overly aggressive default cleaners ( #30 )
8 years ago
Silas Burton
be2e4b5c80
Feat: huffington post extractor ( #28 )
...
* wip: huffpo custom extractor
* wip: some huffpo cleanup
8 years ago
Adam Pash
94198c0a65
feat: new republic custom extractor ( #25 )
...
* wip: new republic custom extractor
* feat: new republic article extractor
* feat: new republic minutes article extractor
8 years ago
Janet
c4d72fb735
feat: add money.cnn custom parser ( #26 )
...
* feat: add money.cnn custom parser
* added timezone to cnn custom parser
8 years ago
Adam Pash
6343946dd8
Feat: custom timezones ( #29 )
...
* using moment-timezone to allow custom timezones
* added tz to tmz, even though still so-so
8 years ago
Adam Pash
19e7345bfb
feat: test builds are created for preview purposes so we aren't committing dist every time ( #27 )
8 years ago
Adam Pash
a8face796a
Fix extension bugs ( #23 )
...
* feat: cleaning supplemental elements in nytimes (visible in web only)
closes https://github.com/postlight/mercury-reader-chrome-extension/issues/102
* wip
* fix: more generous date published bits
* feat: added washington post extractor (including figure transforms)
closes https://github.com/postlight/mercury-reader-chrome-extension/issues/100
* feat: cleaning zoom lightbox from gizmodo/kinja
* lint fix
8 years ago
Adam Pash
3a2f32b0eb
feat: added tmz custom parser ( #22 )
8 years ago
Adam Pash
783a9cfb2f
fix: changed overly liberal regex for removing transparent images
8 years ago
Adam Pash
7411922c55
feat: encoding response body based on content-type charset ( #21 )
...
Also some small code organization
8 years ago
Adam Pash
88c125d022
chore: package upgrades
8 years ago
Adam Pash
c30fb2e4c0
chore: updated readme
8 years ago
Adam Pash
60a6861e18
Feat: browser support ( #19 )
...
Big undertaking to support Mercury in the browser. Builds are working and all tests are passing both for web and node builds. Most code is closely shared.
8 years ago
Adam Pash
eaea57461a
fix: servers returning bad headers was breaking request. temporarily ( #20 )
...
using fork with a fix for this until request merges the necessary pull request
8 years ago
Adam Pash
629eada1f7
feat: recording/playing back network requests with nock ( #18 )
...
* feat: recording/playing back network requests with nock
* lint fix
8 years ago
Adam Pash
6e29848e9c
feat: making yarn-friendly for package manager ( #17 )
...
* updated several commands; some fixes exposed by yarn upgrade
* removed unnecessary dependency
8 years ago
Adam Pash
e325d860fd
Feat: improving ci ( #16 )
...
This commit also swaps in yarn for npm and tweaks circle ci a bit.
* appveyor.yml first go
* changing node
* ps
* narrow it down
* trying this
* fix airbnb module
* trying with yarn
* logging
* hybrid?
* trying yarn w/circle
* bump workers?
* build off?
* updating script
* tweaking script for appveyor
* bumping maxworkers
* cleaning up
* build step?
* yarn it
* added appveyor badge
8 years ago
Adam Pash
071218ab3c
chore: added repo
8 years ago
Adam Pash
41c3454590
fix: circle test passing badge
8 years ago
Adam Pash
4c9910384a
Feat: adding circle ci ( #15 )
...
* added circle.yml config
* set maxworkers in circle
* trying diff node versions
* multiple node
* pre nvm install
* testing parallel
* added badge to readme
* clean up circle.yml
8 years ago
Adam Pash
048d654417
feat: parser auto-generates name; lint is more specific
8 years ago
Adam Pash
65c641a879
feat: enforcing line break rules in linter
8 years ago
Adam Pash
4d1d950807
updated generator templates for new style of import/export. also some
...
adjustments for usability
8 years ago
Adam Pash
7fa90f59b7
making all.js export a generic function to decrease possibility of error
8 years ago
Adam Pash
de5b120b79
feat: allowing extractors to support multiple domains
8 years ago
Adam Pash
d038a36544
feat: custom medium extractor
8 years ago
Adam Pash
007ddec8ac
feat: allowing iframes from src domain
8 years ago