3 Commits

Author SHA1 Message Date
Daniel Aleksandersen
5a69d4a8eb Improve metadata extraction (#478)
* Improve metadata extraction

* Recognize meta[property] as a space-separated list
* Recognize Dublin Core (dc|dcterm): metadata.
* Prefer Dublin Core, Open Graph, Twitter, and HTML in that order.
* _getArticleTitle() is now only used as a fallback if the document
 doesn't provide good metadata.
2018-08-25 00:28:00 +01:00
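
The behaviour listed in the commit message above can be sketched roughly as follows. This is a minimal illustration and not the project's code: the `MetaTag` shape, the names `collectMetadata` and `pickField`, the `html` pseudo-prefix for unprefixed names, and the exact pattern are all assumptions made for the example. It works on plain objects rather than DOM nodes so that it runs standalone.

```typescript
// Hypothetical stand-in for a <meta> element: only the attributes this sketch reads.
interface MetaTag {
  property?: string;
  name?: string;
  content: string;
}

// Assumed pattern: an optional known prefix followed by a field name.
// Deliberately has no global flag, so exec() keeps no state between calls.
const KEY_PATTERN = /^(?:(dc|dcterm|og|twitter):)?([\w-]+)$/i;

// Assumed priority: Dublin Core, Open Graph, Twitter, then plain HTML names.
const PRIORITY = ["dc", "dcterm", "og", "twitter", "html"];

// Collect values keyed as "prefix:field", for example "dc:title".
function collectMetadata(tags: MetaTag[]): Map<string, string> {
  const values = new Map<string, string>();
  for (const tag of tags) {
    const content = tag.content.trim();
    if (content === "") continue;
    // Treat meta[property] as a space-separated list of keys.
    const keys = (tag.property ?? "").split(/\s+/);
    if (tag.name) keys.push(tag.name);
    for (const key of keys) {
      const match = KEY_PATTERN.exec(key);
      if (!match) continue;
      const prefix = (match[1] ?? "html").toLowerCase();
      const field = (match[2] ?? "").toLowerCase();
      values.set(`${prefix}:${field}`, content);
    }
  }
  return values;
}

// Return the highest-priority value for a field, or call the fallback
// (standing in for something like _getArticleTitle()) when none exists.
function pickField(
  values: Map<string, string>,
  field: string,
  fallback: () => string
): string {
  for (const prefix of PRIORITY) {
    const value = values.get(`${prefix}:${field}`);
    if (value !== undefined) return value;
  }
  return fallback();
}
```

With this sketch, a tag whose `property` is `og:title twitter:title` contributes the same content under both keys, and a separate `dc:title` tag takes precedence over both when `pickField` is asked for `title`.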
Gijs Kruitbosch
f782bc5f06 Avoid global flag when looking for metadata using regexes 2018-08-21 17:56:25 +02:00
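
The commit title does not state its motivation. One well-known reason to avoid the global flag is that a JavaScript regular expression created with `g` remembers its `lastIndex` between calls to `test()` or `exec()`, so reusing it can produce a false negative. The sketch below illustrates that general pitfall only; the helper name `testTwice` and the sample pattern are hypothetical and are not taken from the project.

```typescript
// Run the same pattern against the same input twice and report both results.
function testTwice(pattern: RegExp, input: string): [boolean, boolean] {
  return [pattern.test(input), pattern.test(input)];
}

// With the global flag, the first match advances lastIndex to the end of the
// match, so the second call starts searching there and finds nothing.
const withGlobal = testTwice(/title/g, "og:title");

// Without the flag, every call starts from the beginning of the input.
const withoutGlobal = testTwice(/title/, "og:title");

console.log(withGlobal, withoutGlobal);
```

Here `withGlobal` is `[true, false]` while `withoutGlobal` is `[true, true]`, which is why a shared pattern used for repeated lookups is usually safer without `g`.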
Andres Rey
fa9d8bda48 Add la-nacion test case 2017-12-11 14:00:48 +00:00