Evan Tseng
|
63230a307a
|
Bug 1142312 - Add two more types of unlikely candidates: cover-wrap and yom-remote, r=Gijs
|
8 years ago |
andrei-ch
|
4a0d08c56a
|
font-to-span conversion skips half the font elements on 'real' DOMs
|
8 years ago |
Evan Tseng
|
e84c0c3f07
|
Bug 1285543 - Only use "og:title" or "twitter:title" if _getArticleTitle does not return a valid title, r=Gijs
|
8 years ago |
Evan Tseng
|
33dc8fa023
|
Bug 1255978 - Remove legends candidate, r=Gijs
|
8 years ago |
Evan Tseng
|
af0aa5c59f
|
Bug 1173548 - Find out text direction from ancestors of final candidate, r=Gijs
|
8 years ago |
Evan Tseng
|
4fa0d1b207
|
Bug 1177619 - Score div nodes which have br nodes. r=Gijs
|
8 years ago |
Taylor Hunt
|
71aa562387
|
Add microformats2 class names to heuristics (#303)
Microformats updated their old `hentry` to [a newer `h-entry`](http://microformats.org/wiki/h-entry).
With the [number of IndieWeb sites breaking into the tens of thousands](http://tantek.com/2016/190/b1/state-of-indieweb-summit), this seems like a fair idea.
|
8 years ago |
Gijs
|
1a12befa41
|
Fix code style, tighten up eslint rules (#301)
|
8 years ago |
Ivan Persidsky
|
fd11f92adb
|
Use a dedicated method and backward iteration for removing nodes (#300)
This improves compat with "real" DOMs that provide a live NodeList as the return value of getElementsByTagName.
|
8 years ago |
Gijs Kruitbosch
|
140d4c4aca
|
Only compute textContent once.
|
9 years ago |
usergit
|
327bfcb93f
|
Expose textContent in the returned result
This returns the text content only, which is useful because it makes the content easily accessible.
|
9 years ago |
Gijs
|
69b81f5d70
|
Fix #287: convert getElementsByTagName result to an array (#288)
|
9 years ago |
Gijs Kruitbosch
|
46b08a5ea5
|
Address issue #277 by marking 'modal' unlikely+negative
|
9 years ago |
Peter deHaan
|
b380917b4b
|
Convert nested function declaration to function expression
|
9 years ago |
Gijs Kruitbosch
|
e830ac9dd8
|
Fix eslint issues identified in m-c
|
9 years ago |
Gijs Kruitbosch
|
dffa760c04
|
Fix issue #267 by ignoring hash URIs when making URIs absolute
|
9 years ago |
Gijs Kruitbosch
|
a9597efc17
|
Fix bug 1230050 by checking for the 'hid' class specifically, r?MattN
|
9 years ago |
Gijs
|
a801846a45
|
Merge pull request #204 from mozilla/tweak-great-grandparent-scoring
Updated great grandparent node scoring.
|
10 years ago |
Nicolas Perriault
|
ae0833522c
|
Improved embedded video elements detection.
|
10 years ago |
Nicolas Perriault
|
46304bb5fe
|
Updated great grandparent node scoring.
|
10 years ago |
Nicolas Perriault
|
88ef3893b5
|
Fixes #180 - Score intermediary headings.
|
10 years ago |
Nicolas Perriault
|
dc1b2c9fa0
|
Refs #195 - Exclude nodes likely to be related content.
|
10 years ago |
Nicolas Perriault
|
cc18cb5787
|
Ref #195 - Add support for dailymotion videos.
|
10 years ago |
Nicolas Perriault
|
9dbc009376
|
Fixes #113 - Recursive node ancestor scoring.
|
10 years ago |
Nicolas Perriault
|
44879722b6
|
Fixes #183 - Preserve list items.
|
10 years ago |
Gijs
|
79aa2fca87
|
Merge pull request #189 from mozilla/dont-remove-headings
Fixes #150 - Keep article intermediary headings.
|
10 years ago |
Margaret Leibovic
|
af6da2a87d
|
Merge pull request #190 from mozilla/improved-author-meta-extraction
Improved author metadata detection.
|
10 years ago |
Nicolas Perriault
|
7aee44adb2
|
Improved author metadata detection.
|
10 years ago |
Gijs Kruitbosch
|
5f184053cd
|
Make isProbablyReaderable include <pre>, and deal with long <br>-separated paragraphs and/or shorter-than-5-paragraph text and such.
|
10 years ago |
Nicolas Perriault
|
2451a07a7d
|
Fixes #150 - Keep article intermediary headings.
|
10 years ago |
Margaret Leibovic
|
319a50b4f0
|
Fixes #184 - Don't strip class names from article content
|
10 years ago |
Gijs
|
49e40768aa
|
Merge pull request #185 from mozilla/score-section-tags-by-default
Fixes #139 #143: Added more weight to section tags.
|
10 years ago |
Nicolas Perriault
|
f6ffa6acde
|
Fixes #139 #143: Added more weight to section tags.
|
10 years ago |
Nicolas Perriault
|
58cd789cd3
|
Improved title extraction 'algorithm'.
|
10 years ago |
Gijs
|
b37ff08bc7
|
Merge pull request #169 from mozilla/clean-footer-tags
Fixes #163 - Avoid including footer tag contents.
|
10 years ago |
Nicolas Perriault
|
12c6a11f67
|
Fixes #163 - Avoid including footer tag contents.
|
10 years ago |
Nicolas Perriault
|
6eeabf90c1
|
Fixes #164 - Add support for title alt semantic metadata.
|
10 years ago |
Gijs Kruitbosch
|
0ff82de0f4
|
Implement createTextNode, do more relaxed escaping there, update testcase.
|
10 years ago |
Margaret Leibovic
|
37a8cd4171
|
Bug 1147584 - Don't remove unlikely <a> tags, and replace <a> tags with their text content if they won't be useful links
|
10 years ago |
Gijs
|
a6014f5854
|
Merge pull request #132 from gijsk/heise-ad-prioritization
Don't look at banners and skyscrapers, remove <noscript> elements
|
10 years ago |
Gijs Kruitbosch
|
a6346a0ad4
|
Don't look at banners and skyscrapers, remove <noscript> elements
|
10 years ago |
Nicolas Perriault
|
4424b0bad7
|
Refs #128 - Add support for options to Readability constructor. r=@gijsk
|
10 years ago |
Nicolas Perriault
|
4d41f5e4ed
|
Refs #117 - Drop social/share buttons.
|
10 years ago |
Gijs Kruitbosch
|
7c60dba3b6
|
Fix Readability.js to work with jsdom's DOM implementation (in particular: no firstElementChild implementation...)
|
10 years ago |
Margaret Leibovic
|
eb3a8e8dc4
|
Bug 1150695 - Move isProbablyReaderable function to Readability.js
|
10 years ago |
Nicolas Perriault
|
f8d37e4276
|
Don't remove elements containing figures or having them as a parent.
|
10 years ago |
Nicolas Perriault
|
b6730703a1
|
Fixes #81 - Keep article images.
|
10 years ago |
Gijs
|
194a5376c8
|
Merge pull request #63 from mozilla/preserve-embedded-tweets
Preserve inline tweets as they're part of article contents.
|
10 years ago |
Gijs Kruitbosch
|
b4332328f3
|
Fix an issue where we don't track scores for the parents appropriately.
|
10 years ago |
Gijs
|
14b33b69db
|
Merge pull request #65 from mozilla/support-embed-videos
Fixes #56 - Updated support for embedded Youtube & Vimeo videos.
|
10 years ago |
Nicolas Perriault
|
ad52d8ee30
|
Fixes #53 - Fixed dot-slash relative URI resolution.
|
10 years ago |
Nicolas Perriault
|
2d5f59f3eb
|
Fixes #56 - Updated support for embedded Youtube & Vimeo videos.
|
10 years ago |
Nicolas Perriault
|
d83763c8a1
|
Preserve inline tweets as they're part of article contents.
|
10 years ago |
Nicolas Perriault
|
cf3dce6cf2
|
Refs #58 - Stripped embed tags.
|
10 years ago |
Nicolas Perriault
|
eee224560b
|
Addressed review comments from @Gijsk.
|
10 years ago |
Nicolas Perriault
|
4f9615cb9a
|
Use forEach when it makes sense.
|
10 years ago |
Gijs Kruitbosch
|
955951659d
|
Bug 1143725 - fix the Herald Sun website
|
10 years ago |
Gijs Kruitbosch
|
eb81444946
|
Improve logic to rely on children instead of childNodes
|
10 years ago |
Margaret Leibovic
|
3c2d93cd09
|
Improve byline algorithm
|
10 years ago |
Gijs Kruitbosch
|
d94f3158d3
|
Fix readability.js to do a DOM traversal rather than relying on a wonky DOMCollection, fix trims, fix a potential null access, etc.
|
10 years ago |
Margaret Leibovic
|
fc53e1a315
|
Set 'name' variable to null in _getExcerpt to avoid old values in future for loop iterations
|
10 years ago |
Margaret Leibovic
|
2c7c504a36
|
Merge pull request #32 from gijsk/regex-issues-with-class-and-id-stuff
Fix regex issues. r=margaret
|
10 years ago |
Gijs
|
aec1ce774d
|
Merge pull request #31 from gijsk/testing-generates
Allow generating tests from the web, make testing more closely match Firefox
|
10 years ago |
Gijs Kruitbosch
|
1c42f29aa5
|
Create a script to generate testcases, actually use our version of JSDOMParser
|
10 years ago |
Gijs
|
17062c1ccf
|
Fix video regular expression to support https
|
10 years ago |
Gijs
|
d9f1e884dd
|
Fix regex issues
|
10 years ago |
Margaret Leibovic
|
98ee8f7463
|
Merge pull request #27 from gijsk/fix-missing-paragraphs
Bug 1144441 - avoid leaving out paragraphs. r=margaret
|
10 years ago |
Gijs Kruitbosch
|
1d2df4a70e
|
Bug 1144441 - avoid leaving out paragraphs
|
10 years ago |
Margaret Leibovic
|
a9bd60154d
|
Bug 1144355 - Bail if we don't have a body to parse. r?Gijs
|
10 years ago |
Gijs Kruitbosch
|
d3f84a1e58
|
Fix class-related logging exception
|
10 years ago |
Gijs Kruitbosch
|
ce0ebe24e0
|
Improve logging of elements
|
10 years ago |
Margaret Leibovic
|
03d9e36161
|
Merge pull request #22 from gijsk/fix-empty-classes
Don't create/leave empty class attributes around all the nodes we're using. r=margaret
|
10 years ago |
Nicolas Perriault
|
99f338a03a
|
Added logging to test output.
|
10 years ago |
Gijs Kruitbosch
|
b62fd27ba6
|
Don't create/leave empty class attributes around all the nodes we're using.
|
10 years ago |
Gijs Kruitbosch
|
a563714567
|
Bug 1127778 - while we're at it, add more logs.
|
10 years ago |
Gijs Kruitbosch
|
3c277a1701
|
Bug 1127778 - fix paragraph reordering and add a test for it.
|
10 years ago |
Peter deHaan
|
78b61ccbcd
|
Convert `const` to `var`
Per https://github.com/mozilla/readability/issues/18#issuecomment-77229549
|
10 years ago |
srlakhe
|
a93aa7d0ad
|
Updated Readability.js
|
10 years ago |
shreyas
|
8061bf0254
|
Bug 958735 Function purgeNode moved
|
10 years ago |
Stefan Arentz (Mozilla)
|
7057e46c4f
|
Fixes #3 Let Readability.parse() also return the uri
|
10 years ago |
Stefan Arentz (Mozilla)
|
255595cc70
|
Fixes #1 Replace occurrences of let with var
|
10 years ago |
Tarek Ziade
|
55587d91ac
|
initial file
|
10 years ago |