Commit Graph

9 Commits (89572ad29a7c6bc04b944ad9a23f7982412e6a63)

Author SHA1 Message Date
Maria Luiza Soares 8c41d92560 Assert on siteName in all test cases 6 years ago
Daniel Aleksandersen 5a69d4a8eb Improve metadata extraction (#478)
* Improve metadata extraction

* Recognize meta[property] as a space-separated list
* Recognize Dublin Core (dc|dcterm): metadata.
* Prefer Dublin Core, Open Graph, Twitter, and HTML in that order.
* _getArticleTitle() is now only used as a fallback if the document
 doesn't provide good metadata.
6 years ago
David A Roberts 5ae90930cd Don't convert DIVs to Ps when more than 25% links 6 years ago
David A Roberts 9f2c5cb42e Put phrasing content into paragraphs
This removes the need for `p.readability-styled` elements.
6 years ago
Gijs Kruitbosch ad4dd26448 Update test expectations 7 years ago
Cameron McCormack 5ad448f831 Update test expectations. 7 years ago
Evan Tseng 15e1f03261 Bug 1300697 - Reader View missed first few paragraphs on New York Times website, r=Gijs 8 years ago
andrei-ch c5ff44d8fe Clean <input>,<textarea>,<select>,<button> elements 8 years ago
Evan Tseng 63230a307a Bug 1142312 - Add two more types of unlikely candidates: cover-wrap and yom-remote, r=Gijs 8 years ago