Commit Graph

184 Commits (dffa760c04824431cbfdf04f4a8eb4034b905891)

Author SHA1 Message Date
Gijs Kruitbosch dffa760c04 Fix issue #267 by ignoring hash URIs when making URIs absolute 9 years ago
Gijs 7be3ccb57e Merge pull request #262 from gijsk/bbc-script-testcase
Reuse test from pull request #239 which passes without modifications …
9 years ago
Wes Johnston f87a12400b Reuse test from pull request #239 which passes without modifications (modified by @gijsk to pass in the current XHTML test environment) 9 years ago
Gijs b1d360168b Merge pull request #252 from gijsk/fix-delayed-closing-tags
Fix issue #251 by making JSDOMParser deal with non-self-closed-things
9 years ago
Gijs Kruitbosch 2e1cb3f467 Fix issue #251 by making JSDOMParser expect XML and stop making excuses for 'self-closed' things, when all that does is cause trouble 9 years ago
Gijs d360226f8c Merge pull request #260 from gijsk/hid-class
Fix bug 1230050 by checking for the 'hid' class specifically, r?MattN
9 years ago
Gijs Kruitbosch a9597efc17 Fix bug 1230050 by checking for the 'hid' class specifically, r?MattN 9 years ago
Gijs 30d6db3c11 Merge pull request #256 from brendanlong/spdx-license
Fix package.json's license to be in SPDX format ("Apache-2.0").
9 years ago
Brendan Long c59a054f78 Fix package.json's license to be in SPDX format ("Apache-2.0").
See: https://docs.npmjs.com/files/package.json#license
9 years ago
Gijs e5a6d628f4 Merge pull request #254 from hsemarap/readme-change-to-fix-scrambling-of-dom
Readme change to advise about DOM modification effects of parse(). Fixes #250
9 years ago
Parameswaran D a812b329ea moved the sample code under Optional subsection 9 years ago
Parameswaran D d1e4ef0dcd Fixes #250 : scrambling of DOM on parse 9 years ago
Parameswaran D 0b5dd0a6fb Fixes #250 : scrambling of DOM on parse 9 years ago
Gijs 8510106638 Merge pull request #211 from mozilla/add-support-for-wbr-tag
Added support for the wbr html tag to JSDOMParser.
10 years ago
Nicolas Perriault 8806e999d1 Added support for the wbr html tag to JSDOMParser. 10 years ago
Gijs a801846a45 Merge pull request #204 from mozilla/tweak-great-grandparent-scoring
Updated great grandparent node scoring.
10 years ago
Gijs 5bf56177be Merge pull request #207 from mozilla/better-dm
Improved embedded video elements detection.
10 years ago
Nicolas Perriault ae0833522c Improved embedded video elements detection. 10 years ago
Nicolas Perriault 46304bb5fe Updated great grandparent node scoring. 10 years ago
Nicolas Perriault 66071e573d Merge pull request #194 from mozilla/score-intermediary-headers
Fixes #180 - Score intermediary headings.
10 years ago
Nicolas Perriault 88ef3893b5 Fixes #180 - Score intermediary headings. 10 years ago
Nicolas Perriault 6344b3f736 Merge pull request #196 from mozilla/strip-related-contents
Refs #195 - Exclude nodes likely to be related content.
10 years ago
Nicolas Perriault dc1b2c9fa0 Refs #195 - Exclude nodes likely to be related content. 10 years ago
Margaret Leibovic affa0edbdd Merge pull request #197 from mozilla/support-dailymotion-videos
Ref #195 - Add support for dailymotion videos.
10 years ago
Nicolas Perriault cc18cb5787 Ref #195 - Add support for dailymotion videos. 10 years ago
Nicolas Perriault 4721837e27 Merge pull request #193 from mozilla/score-great-grandparent-nodes
Fixes #113 - Score great grandparent nodes.
10 years ago
Nicolas Perriault 9dbc009376 Fixes #113 - Recursive node ancestor scoring. 10 years ago
Gijs f71ec9ceae Merge pull request #191 from mozilla/preserve-list-items
Fixes #183 - Preserve list items.
10 years ago
Nicolas Perriault 44879722b6 Fixes #183 - Preserve list items. 10 years ago
Alexis Métaireau 5912e0c872 Add Firefox User-Agent when generating the test case. 10 years ago
Gijs 79aa2fca87 Merge pull request #189 from mozilla/dont-remove-headings
Fixes #150 - Keep article intermediary headings.
10 years ago
Margaret Leibovic af6da2a87d Merge pull request #190 from mozilla/improved-author-meta-extraction
Improved author metadata detection.
10 years ago
Nicolas Perriault 0d696051e9 Merge pull request #188 from gijsk/improve-isprobably-readerable
Make isProbablyReaderable include <pre>, and deal with long <br>-separat...
10 years ago
Nicolas Perriault 7aee44adb2 Improved author metadata detection. 10 years ago
Gijs Kruitbosch 5f184053cd Make isProbablyReaderable include <pre>, and deal with long <br>-separated paragraphs and/or shorter-than-5-paragraph text and such. 10 years ago
Gijs Kruitbosch d9a475e8d4 Fix benchmark script, add isProbablyReaderable benchmark 10 years ago
Nicolas Perriault 2451a07a7d Fixes #150 - Keep article intermediary headings. 10 years ago
Gijs 62f5d43c70 Merge pull request #187 from leibovic/classnames
Fixes #184 - Don't strip class names from article content
10 years ago
Margaret Leibovic 319a50b4f0 Fixes #184 - Don't strip class names from article content 10 years ago
Gijs 49e40768aa Merge pull request #185 from mozilla/score-section-tags-by-default
Fixes #139 #143: Added more weight to section tags.
10 years ago
Nicolas Perriault f6ffa6acde Fixes #139 #143: Added more weight to section tags. 10 years ago
Gijs 32d8a526f9 Merge pull request #175 from mozilla/improve-title-extraction
Fixes #174 - Remove aggressive article title formatting rule.
10 years ago
Nicolas Perriault 58cd789cd3 Improved title extraction 'algorithm'. 10 years ago
Gijs 647658a47b Merge pull request #172 from mozilla/js-beautify
Fixes #130 - Using js-beautify for HTML formatting.
10 years ago
Nicolas Perriault de89036cd5 Fixes #130 - Using js-beautify for HTML formatting. 10 years ago
Gijs b37ff08bc7 Merge pull request #169 from mozilla/clean-footer-tags
Fixes #163 - Avoid including footer tag contents.
10 years ago
Nicolas Perriault 12c6a11f67 Fixes #163 - Avoid including footer tag contents. 10 years ago
Gijs 87c0bc0144 Merge pull request #167 from mozilla/better-headline-extraction
Fixes #164 - Add support for title alt semantic metadata.
10 years ago
Nicolas Perriault 6eeabf90c1 Fixes #164 - Add support for title alt semantic metadata. 10 years ago
Margaret Leibovic eb7ec7231e Merge pull request #135 from gijsk/links
Bug 1147584 - Don't strip unlikely <a>s, and replace useless <a>s with textContent
10 years ago