PalmerAL
814f0a3884
Add support for detecting lazy-loaded images (#542)
Add support for detecting lazy-loaded images using `src` or `srcset` attributes.
2019-05-08 23:48:37 +01:00
Maria Luiza Soares
8c41d92560
Assert on siteName in all test cases
2018-12-21 18:28:28 +00:00
Daniel Aleksandersen
5a69d4a8eb
Improve metadata extraction (#478)
* Improve metadata extraction
* Recognize meta[property] as a space-separated list
* Recognize Dublin Core (dc|dcterm): metadata.
* Prefer Dublin Core, Open Graph, Twitter, and HTML in that order.
* _getArticleTitle() is now used only as a fallback if the document
doesn't provide good metadata.
2018-08-25 00:28:00 +01:00
David A Roberts
9f2c5cb42e
Put phrasing content into paragraphs
This removes the need for `p.readability-styled` elements.
2018-05-15 13:29:55 +01:00
Gijs Kruitbosch
ad4dd26448
Update test expectations
2017-11-30 10:41:15 +00:00
Cameron McCormack
5ad448f831
Update test expectations.
2017-11-21 10:04:59 +00:00
andrei-ch
c5ff44d8fe
Clean <input>,<textarea>,<select>,<button> elements
2016-12-17 13:37:27 +00:00
Evan Tseng
e84c0c3f07
Bug 1285543 - Only use "og:title" or "twitter:title" if _getArticleTitle does not return a valid title, r=Gijs
2016-12-14 11:34:15 +00:00
Gijs Kruitbosch
dffa760c04
Fix issue #267 by ignoring hash URIs when making URIs absolute
2016-03-07 10:32:09 +00:00
Gijs Kruitbosch
2e1cb3f467
Fix issue #251 by making JSDOMParser expect XML and stop making excuses for 'self-closed' things, when all that does is cause trouble
2016-01-22 19:57:45 +00:00
Nicolas Perriault
f6ffa6acde
Fixes #139 #143: Added more weight to section tags.
2015-04-24 19:55:51 +02:00