Commit Graph

11 Commits

Author SHA1 Message Date
PalmerAL
814f0a3884 Add support for detecting lazy-loaded images (#542)
Add support for detecting lazy-loaded images using `src` or `srcset` attributes.
2019-05-08 23:48:37 +01:00
Maria Luiza Soares
8c41d92560 Assert on siteName in all test cases 2018-12-21 18:28:28 +00:00
Daniel Aleksandersen
5a69d4a8eb Improve metadata extraction (#478)
* Improve metadata extraction

* Recognize meta[property] as a space-separated list
* Recognize Dublin Core (dc|dcterm): metadata.
* Prefer Dublin Core, Open Graph, Twitter, and HTML in that order.
* _getArticleTitle() is now only used as a fallback if the document
 doesn't provide good metadata.
2018-08-25 00:28:00 +01:00
David A Roberts
9f2c5cb42e Put phrasing content into paragraphs
This removes the need for `p.readability-styled` elements.
2018-05-15 13:29:55 +01:00
Gijs Kruitbosch
ad4dd26448 Update test expectations 2017-11-30 10:41:15 +00:00
Cameron McCormack
5ad448f831 Update test expectations. 2017-11-21 10:04:59 +00:00
andrei-ch
c5ff44d8fe Clean <input>,<textarea>,<select>,<button> elements 2016-12-17 13:37:27 +00:00
Evan Tseng
e84c0c3f07 Bug 1285543 - Only use "og:title" or "twitter:title" if _getArticleTitle does not return a valid title, r=Gijs 2016-12-14 11:34:15 +00:00
Gijs Kruitbosch
dffa760c04 Fix issue #267 by ignoring hash URIs when making URIs absolute 2016-03-07 10:32:09 +00:00
Gijs Kruitbosch
2e1cb3f467 Fix issue #251 by making JSDOMParser expect XML and stop making excuses for 'self-closed' things, when all that does is cause trouble 2016-01-22 19:57:45 +00:00
Nicolas Perriault
f6ffa6acde Fixes #139 #143: Added more weight to section tags. 2015-04-24 19:55:51 +02:00