Commit Graph

113 Commits (fix-remove-moment-js)

Author SHA1 Message Date
Ralph Jbeily 46ce505727
feat: update package.json scripts to work on windows (#216)
* feat: add npm-run-all and fix test:web script

* fix: remove test script extra option

* fix: update lint script revert test script and remove npm-run-all

* chore: revert to linux/mac specific script

* fix: prepend node command so it works on windows
5 years ago
Ralph Jbeily 2e1e4d90c9
feat: add remarklint for md docs (#213)
* feat: add remarklint for md docs

* fix: remarkrc file and run linter on commit hook
5 years ago
Adam Pash 76d333f0be
deps: upgrade (#218) 5 years ago
Adam Pash e2dbd08ae7
fix: pre-commit hook on js (#212) 5 years ago
Adam Pash e4b057f9ea
chore: update node and some deps (#209)
* chore: update .nvmrc

* added prettier and pre-commit hooks

* update docker image to new node

* add karma-cli to get web tests working

* explicitly install karma... seems to fix problem

* remove pre-built phantomjs

* swap install order
5 years ago
Adam Pash c643666c88
dx: automate fixture updates (#197) 5 years ago
Adam Pash ff144952b9
dx: test/finish bot preview 5 years ago
Adam Pash d35f7bd5bf
dx: comment on PRs when fixtures have been added/changed (#192)
The goal here is to provide some sort of relatively easy preview for the
PR reviewer to see if the fixture looks good, if the parsing is working,
and to make suggestions easily.
5 years ago
Adam Pash 4478338046
docs: document release process (#186) 6 years ago
Adam Pash fd6c9d4fa3
release: 1.0.13 (#183) 6 years ago
Adam Pash 7fcd9b62eb release: 1.0.12 (#173) 7 years ago
Adam Pash a51cc81c27 release: 1.0.11 (#171) 7 years ago
Adam Pash 86d6bd1dc1 release: 1.0.10 (#169) 7 years ago
Adam Pash e56e8e24cd release: 1.0.9 (#167) 7 years ago
Adam Pash 321c087be6 release: 1.0.8 (#164) 7 years ago
Adam Pash e267d57d78 release: 1.0.7 (#160) 7 years ago
Adam Pash 9d4c883d51 release: 1.0.6 (#142) 7 years ago
Adam Pash 601b0fac16 release: 1.0.5 (#136) 7 years ago
Adam Pash dbc706410b release: 1.0.4 (#122) 7 years ago
Adam Pash a710efd2d5 release: 1.0.3 (#62) 8 years ago
Adam Pash 8070e4790b test: streamlined guardian tests w/new single-extraction (#58) 8 years ago
Adam Pash 332f85928f release: 1.0.2 (#54) 8 years ago
Adam Pash 15656cb3e1 Refactor: running tests more efficiently (#49)
Only running one parser per page we're testing rather than a parser per field we're testing.
8 years ago
Adam Pash edcb7295d1 release: 1.0.1 (#48) 8 years ago
Janet c4d72fb735 feat: add money.cnn custom parser (#26)
* feat: add money.cnn custom parser

* added timezone to cnn custom parser
8 years ago
Adam Pash 6343946dd8 Feat: custom timezones (#29)
* using moment-timezone to allow custom timezones

* added tz to tmz, even though still so-so
8 years ago
Adam Pash 7411922c55 feat: encoding response body based on content-type charset (#21)
Also some small code organization
8 years ago
Adam Pash 88c125d022 chore: package upgrades 8 years ago
Adam Pash 60a6861e18 Feat: browser support (#19)
Big undertaking to support Mercury in the browser. Builds are working and all tests are passing both for web and node builds. Most code is closely shared.
8 years ago
Adam Pash eaea57461a fix: servers returning bad headers was breaking request. temporarily (#20)
using fork with a fix for this until request merges the necessary pull request
8 years ago
Adam Pash 629eada1f7 feat: recording/playing back network requests with nock (#18)
* feat: recording/playing back network requests with nock

* lint fix
8 years ago
Adam Pash e325d860fd Feat: improving ci (#16)
This commit also swaps in yarn for npm and tweaks circle ci a bit.

* appveyor.yml first go

* changing node

* ps

* narrow it down

* trying this

* fix airbnb module

* trying with yarn

* logging

* hybrid?

* trying yarn w/circle

* bump workers?

* build off?

* updating script

* tweaking script for appveyor

* bumping maxworkers

* cleaning up

* build step?

* yarn it

* added appveyor badge
8 years ago
Adam Pash 071218ab3c chore: added repo 8 years ago
Adam Pash 048d654417 feat: parser auto-generates name; lint is more specific 8 years ago
Adam Pash 7fa90f59b7 making all.js export a generic function to decrease possibility of error 8 years ago
Adam Pash a73246306d feat: quicker lint by being more specific 8 years ago
Adam Pash 4b5c029093 feat: added all-contributors 8 years ago
Adam Pash eb0aa0b1f6 feat: some small tweaks to toy's excellent parsers ☺️
Squashed commit of the following:

commit 9638220124a325322d6cda7d16c645185d5fe827
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Oct 10 11:02:29 2016 -0700

    fix: removed eslint plugin that was adding unneeded async parens

commit ce2268c0f7c1b093c06f156730a0f1bc2aaba39c
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Oct 10 10:47:36 2016 -0700

    style: fix async in parens

commit 9591856915eddaf93170da1ce9225b8a378bdf55
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Oct 10 10:37:11 2016 -0700

    fix: remove parens around async

commit 6c56054717acc1f7e5499691780f8273f6d07bac
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Oct 10 10:35:50 2016 -0700

    fix msn fixture; adjusted yahoo test

commit 4fc117ad5fdc5528f29b0873d60a6a1709642f15
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Oct 10 10:14:38 2016 -0700

    removed dek and date_published tests; neither exist in littlethings

commit 401094b4abc52901255fd2461f5839624f11d8a3
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Oct 10 10:08:44 2016 -0700

    feat: updated buzzfeed for content extraction

commit 19548a5485f70ff9b65e3e725d2364d07734ac9c
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Oct 10 09:54:30 2016 -0700

    fix: generator should make transforms an object, not array

commit b92113f9f7c97aca9e6d3ce9243abac967d26b63
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Oct 10 08:54:38 2016 -0700

    feat: updated politico

commit c026591040f7671cb2a6dd5177a995e21d015482
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Oct 10 08:48:52 2016 -0700

    fix: typos

commit 14aa8fa4ce38ff1c2a212cd0225437ae3042c2c3
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Oct 10 08:36:12 2016 -0700

    fix: incorrect command in readme

commit fe260e6122877e2cb0130a1ecde0e503017057a3
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Oct 10 08:31:11 2016 -0700

    fix: removed dek test because there is no dek on wikia
8 years ago
Adam Pash 173f885674 feat: custom parser + generator + detailed readme instructions
Squashed commit of the following:

commit 02563daa67712c3679258ebebac60dfa9568dffb
Author: Adam Pash <adam.pash@gmail.com>
Date:   Fri Sep 30 12:25:44 2016 -0400

    updated readme, added newyorker parser for readme guide

commit 0ac613ef823efbffbf4cc9a89e5cb2489d1c4f6f
Author: Adam Pash <adam.pash@gmail.com>
Date:   Fri Sep 30 11:16:52 2016 -0400

    feat: updated parser so the saved fixture absolutizes urls

commit 85c7a2660b21f95c2205ca4a4378a7570687fed0
Author: Adam Pash <adam.pash@gmail.com>
Date:   Fri Sep 30 10:15:26 2016 -0400

    refactor: attribute selectors must be an array for custom extractors

commit f60f93d5d3d9b2f2d9ec6f28d27ae9dcf16ef01e
Author: Adam Pash <adam.pash@gmail.com>
Date:   Thu Sep 29 10:13:14 2016 -0400

    fix: whitelisting srcset and alt attributes

commit e31cb1f4e8a9fc9c3d9b20ef9f40ca6c8d6ad51a
Author: Adam Pash <adam.pash@gmail.com>
Date:   Thu Sep 29 09:44:21 2016 -0400

    some housekeeping for coverage tests

commit 39eafe420c776a1fe7f9fea634fb529a3ed75a71
Author: Adam Pash <adam.pash@gmail.com>
Date:   Wed Sep 28 17:52:08 2016 -0400

    fix: word count for multi-page articles

commit b04e0066b52f190481b1b604c64e3d0b1226ff02
Author: Adam Pash <adam.pash@gmail.com>
Date:   Thu Sep 22 10:40:23 2016 -0400

    major improvements to output

commit 3f3a880b63b47fe21953485da670b6e291ac60e5
Author: Adam Pash <adam.pash@gmail.com>
Date:   Wed Sep 21 17:27:53 2016 -0400

    updated test command

commit 14503426557a870755453572221d95c92cff4bd2
Author: Adam Pash <adam.pash@gmail.com>
Date:   Wed Sep 21 16:00:30 2016 -0400

    shortened generator command

commit 5ebd8343cd4b87b3f5787dab665bff0de96846e1
Author: Adam Pash <adam.pash@gmail.com>
Date:   Wed Sep 21 15:59:14 2016 -0400

    feat: can disable fallback to generic parser (this will be useful for testing custom parsers)
8 years ago
Adam Pash ad42055f8f feat: switched test framework to jest 8 years ago
Adam Pash 8f42e119e8 feat: generator for custom parsers and some documentation
Squashed commit of the following:

commit deaf9e60d031d9ee06e74b8c0895495b187032a5
Author: Adam Pash <adam.pash@gmail.com>
Date:   Tue Sep 20 10:31:09 2016 -0400

    chore: README for custom parsers

commit a8e8ad633e0d1576a52dbc90ce31b98fb2ec21ee
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Sep 19 23:36:09 2016 -0400

    draft of readme

commit 4f0f463f821465c282ce006378e5d55f8f41df5f
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Sep 19 17:56:34 2016 -0400

    custom extractor used to build basic parser for theatlantic

commit c5562a3cede41f56c4e723dcfa1181b49dcaae4d
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Sep 19 17:20:13 2016 -0400

    pre-commit to test custom parser generator

commit 7d50d5b7ab780b79fae38afcb87a7d1da5d139b2
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Sep 19 17:19:55 2016 -0400

    feat: added nytimes parser

commit 58b8d83a56927177984ddfdf70830bc4f328f200
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Sep 19 17:17:28 2016 -0400

    feat: can do fuzzy search or go straight to file

commit c99add753723a8e2ac64d51d7379ac8e23125526
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Sep 19 10:52:26 2016 -0400

    refactored export for custom extractors for easier renames

commit 22563413669651bb497f1bb2a92085b71f2ae324
Author: Adam Pash <adam.pash@gmail.com>
Date:   Fri Sep 16 17:36:13 2016 -0400

    feat: custom extractor generation in place

commit 2285a29908a7f82a5de3c81f6b2b902ddec9bdaa
Author: Adam Pash <adam.pash@gmail.com>
Date:   Fri Sep 16 16:42:20 2016 -0400

    good progress
8 years ago
Adam Pash f58ccec7aa fix: including babel-runtime as a bandaid for polyfill error 8 years ago
Adam Pash 59fb4c4974 fix: using transform-runtime to avoid babel-polyfill conflicts when used
in external code
8 years ago
Adam Pash 2ae2dba690 chore: renamed iris to mercury 8 years ago
Adam Pash d60d396c98 feat: added text direction to response 8 years ago
Adam Pash c76435ce62 updated name in package.json 8 years ago
Adam Pash 76df30e303 chore: cleanup 8 years ago
Adam Pash 67296691c2 refactor: page collection 8 years ago
Adam Pash 3694c2d12c chore: improve linter/babelrc 8 years ago
Adam Pash 7e2a34945f chore: refactored and linted 8 years ago
Adam Pash 7ec0ed0d31 feat: nextPageUrl handles multi-page articles
Squashed commit of the following:

commit b5070c0967a7f1a0c0c449ba7ea40aebe8fe4bb8
Author: Adam Pash <adam.pash@gmail.com>
Date:   Tue Sep 13 10:03:00 2016 -0400

    root extractor includes next page url

commit 79be83127d5342d89eef33665586fabea227d6b3
Author: Adam Pash <adam.pash@gmail.com>
Date:   Tue Sep 13 09:58:20 2016 -0400

    small score adjustment

commit 0f00507dbff43401145a892e849311518edec68a
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Sep 12 18:17:38 2016 -0400

    feat: nextPageUrl generic parser up and running

commit be91c589fc0c6d6f9b573080a76c9b1ac7af710c
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Sep 12 11:53:58 2016 -0400

    feat: pageNumFromUrl extracts the pagenum of the current url

commit ad879d7aabedadfd051c01b42d841703bf4763fa
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Sep 12 11:52:37 2016 -0400

    feat: isWordpress checks if a page is generated by wordpress
8 years ago
Adam Pash c48e3485c0 chore: code reorganization
Squashed commit of the following:

commit 636296841d5cf5e685237fe70db7a15305d8e966
Author: Adam Pash <adam.pash@gmail.com>
Date:   Fri Sep 9 13:37:21 2016 -0400

    final cleanup

commit 51f712b3074d41a1f2da91519289d4dd09719ad0
Author: Adam Pash <adam.pash@gmail.com>
Date:   Fri Sep 9 13:25:28 2016 -0400

    Another big pass

commit 3860e6d872a9adb9290093fd9c8708dfcc773c28
Author: Adam Pash <adam.pash@gmail.com>
Date:   Fri Sep 9 12:49:52 2016 -0400

    chore: started reorganizing
8 years ago
Adam Pash 8da2425e59 feat: resource fetches content from a URL and prepares for parsing
Squashed commit of the following:

commit 7ba2d2b36d175f5ccbc02f918322ea0dd44bf2c1
Author: Adam Pash <adam.pash@gmail.com>
Date:   Tue Sep 6 17:55:10 2016 -0400

    feat: resource fetches content from a URL and prepares for parsing

commit 0abdfa49eed5b363169070dac6d65d0a5818c918
Author: Adam Pash <adam.pash@gmail.com>
Date:   Tue Sep 6 17:54:07 2016 -0400

    fix: this was messing up double Esses ('ss', as in class => cla)

commit 9dc65a99631e3a68267a68b2b4629c4be8f61546
Author: Adam Pash <adam.pash@gmail.com>
Date:   Tue Sep 6 14:58:57 2016 -0400

    fix: test suite working w/new dirs

commit 993dc33a5229bfa22ea998e3c4fe105be9d91c21
Author: Adam Pash <adam.pash@gmail.com>
Date:   Tue Sep 6 14:49:39 2016 -0400

    feat: convertLazyLoadedImages puts img urls in the src

commit e7fb105443dd16d036e460ad21fbcb47191f475b
Author: Adam Pash <adam.pash@gmail.com>
Date:   Tue Sep 6 14:30:43 2016 -0400

    feat: makeLinksAbsolute to fully qualify urls

commit dbd665078af854efe84bbbfe9b55acd02e1a652f
Author: Adam Pash <adam.pash@gmail.com>
Date:   Tue Sep 6 13:38:33 2016 -0400

    feat: fetchResource to fetch a url and validate the response

commit 42d3937c8f0f8df693996c2edee93625f13dced7
Author: Adam Pash <adam.pash@gmail.com>
Date:   Tue Sep 6 10:25:34 2016 -0400

    feat: normalizing meta tags
8 years ago
Adam Pash 752331eaae feat: bundling with rollup
Squashed commit of the following:

commit 52bcf0f2dd79bcb2ee21bc134522edd259a3d35e
Author: Adam Pash <adam.pash@gmail.com>
Date:   Fri Sep 2 13:42:29 2016 -0400

    fix: converting date to ISO string

commit 11e827e27129ac229a96f66ca03f0b18dc5d289d
Author: Adam Pash <adam.pash@gmail.com>
Date:   Fri Sep 2 13:42:12 2016 -0400

    feat: bundling with rollup

commit 1ff752a3e44e5836b955f7f15c799abbbdfc9207
Author: Adam Pash <adam.pash@gmail.com>
Date:   Fri Sep 2 12:11:39 2016 -0400

    clean
8 years ago
Adam Pash 0ff3082295 feat: GenericExtractLeadImageUrl
Squashed commit of the following:

commit 22d37ebf26dbbd0a3daebbfde3509a6ce04aaf72
Author: Adam Pash <adam.pash@gmail.com>
Date:   Thu Sep 1 17:50:13 2016 -0400

    feat: GenericExtractLeadImageUrl

commit 3327a0a7929dd0e9267dc9c26f4e2aa78c32586f
Author: Adam Pash <adam.pash@gmail.com>
Date:   Thu Sep 1 15:33:42 2016 -0400

    feat: can pass custom attributes to extractFromMeta
8 years ago
Adam Pash 956fd678f7 feat: GenericDatePublishedExtractor
Squashed commit of the following:

commit 8eda4606e773147ae8dd67666d1a64d659f9fdad
Author: Adam Pash <adam.pash@gmail.com>
Date:   Thu Sep 1 12:28:06 2016 -0400

    feat: GenericDatePublishedExtractor

commit 935510fe9bc0a92f68fca7faf66019cb45330097
Author: Adam Pash <adam.pash@gmail.com>
Date:   Thu Sep 1 09:28:42 2016 -0400

    updated todo
8 years ago
Adam Pash 746d07d4a2 feat: title extraction and scaffolding for more
Squashed commit of the following:

commit 31d8b63dcb3ec9bbd6c8e7a10852fbd060e91103
Author: Adam Pash <adam.pash@gmail.com>
Date:   Wed Aug 31 15:52:27 2016 -0400

    feat: title extraction

commit 7002c552a9f5bb54630455d983b699c041c629fc
Author: Adam Pash <adam.pash@gmail.com>
Date:   Wed Aug 31 14:21:29 2016 -0400

    feat: withinComment checks if a node is inside a comment

commit 57f06ef5b499c2f747edee0c9eb276e38984de9a
Author: Adam Pash <adam.pash@gmail.com>
Date:   Wed Aug 31 13:40:36 2016 -0400

    feat: extractFromMeta function

commit 0947f21aae94fa5ce462246ed5cb53144d563931
Author: Adam Pash <adam.pash@gmail.com>
Date:   Wed Aug 31 13:32:30 2016 -0400

    fix: returning original string if no tags in string

commit dd6b032e5f9877395b9600480dd96c6fdf60cecd
Author: Adam Pash <adam.pash@gmail.com>
Date:   Wed Aug 31 12:03:58 2016 -0400

    feat: clean title function removes junk from titles

commit f33b3eef29ad7692441bd0e5aa26b11dd4411dde
Author: Adam Pash <adam.pash@gmail.com>
Date:   Wed Aug 31 12:03:35 2016 -0400

    chore: renamed function to correct name

commit 076a986b12df68a939a8efa773e01d08780d79aa
Author: Adam Pash <adam.pash@gmail.com>
Date:   Wed Aug 31 12:02:18 2016 -0400

    feat: utility method to strip tags from text

commit f3e98cdf0a0d7601fab9e8824c0cde73ded51651
Author: Adam Pash <adam.pash@gmail.com>
Date:   Wed Aug 31 11:31:33 2016 -0400

    feat: resolveSplitTitle cleans raw title text
8 years ago
Adam Pash e1ef25aab1 fix: added babel-polyfill for bug in Reflect 8 years ago
Adam Pash 93e844cdfe feat: implemented extractBestNode functionality
Squashed commit of the following:

commit 9af554dd975ff1778ed70c71fa9bde667fc5f880
Author: Adam Pash <adam.pash@gmail.com>
Date:   Tue Aug 30 15:19:32 2016 -0400

    feat: add cleanHeaders

commit 0dfea98eedc4f97fcbd78866322595c705e20521
Author: Adam Pash <adam.pash@gmail.com>
Date:   Tue Aug 30 14:30:49 2016 -0400

    fix: scoring parent nodes recursively

commit b6e5897a694adeb81e25a905aba72c0f45a8cc94
Author: Adam Pash <adam.pash@gmail.com>
Date:   Tue Aug 30 12:47:24 2016 -0400

    feat: extract clean node up and running

commit fb652c5db13db6bce7271efd68ba4b20515e9549
Author: Adam Pash <adam.pash@gmail.com>
Date:   Tue Aug 30 09:57:21 2016 -0400

    chore: added test for p tags with nested tags (e.g., img, iframe)

commit 731d0a2e4d89121dfafad195e9d0911805c4f8e4
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Aug 29 17:50:33 2016 -0400

    feat: extract clean node integrates most functions

commit 322bc6534d30feb7c1c08d3813132badc6286b40
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Aug 29 16:46:04 2016 -0400

    feat: removing empty nodes as defined in constants

commit f1d38932ea12a865814d2326970031fcb8515baa
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Aug 29 16:33:31 2016 -0400

    feat: cleaning attributes from nodes

commit 0aa73ada6854af0ecd504bfe3d926a9524787ab5
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Aug 29 16:09:56 2016 -0400

    feat: cleaning h1s from text

commit 12d4a309246285c278ce7765e4fbaa8271bb5889
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Aug 29 15:52:03 2016 -0400

    feat: removing spacer images

commit 4e74ff830cc67586560f6fc72e2cfa432a3a2647
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Aug 29 15:38:49 2016 -0400

    feat: stripping unwanted html from doc

commit c774166e90169fd0c1aa89898d3f7a975e82bf0a
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Aug 29 15:17:32 2016 -0400

    feat: removing small images, height attribute from images

commit 3a8642f42cda451669c832482c5e1611b1ff2ea9
Author: Adam Pash <adam.pash@gmail.com>
Date:   Mon Aug 29 12:57:45 2016 -0400

    feat: rewrite top level

commit a1c03e779234b0aea02206d92ec3dcc15758507e
Author: Adam Pash <adam.pash@gmail.com>
Date:   Fri Aug 26 17:34:36 2016 -0400

    in a weird place rn
8 years ago
Adam Pash 89a2cfbb82 getWeight with tests 8 years ago
Adam Pash f3aebb2a16 Basic testing in place 8 years ago
Adam Pash 8efcc70eef bringing in cheerio 8 years ago
Adam Pash b349a1eac5 using rollup 8 years ago