5a69d4a8eb
* Improve metadata extraction * Recognize meta[property] as a space-separated list * Recognize Dublin Core (dc|dcterm): metadata. * Prefer Dublin Core, Open Graph, Twitter, and HTML in that order. * _getArticleTitle() is now only used as fallback if document doesn't provide good metadata. |
||
---|---|---|
.. | ||
expected-metadata.json | ||
expected.html | ||
source.html |
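
The precedence described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual implementation: `collectMetadata`, `pickField`, and `PREFIX_ORDER` are hypothetical names, the `og` and `twitter` prefixes are assumed from common usage, and the input is assumed to be a plain array of objects standing in for `<meta>` elements.

```javascript
// Hypothetical sketch; names and data shapes are assumptions, not the project's API.
// Prefixes are checked in the order the commit message gives:
// Dublin Core (dc, dcterm), then Open Graph, then Twitter.
const PREFIX_ORDER = ["dc", "dcterm", "og", "twitter"];

// Build a lookup of metadata keys to content values.
// Each tag is assumed to look like { property, name, content }.
function collectMetadata(metaTags) {
  const values = {};
  for (const { property = "", name = "", content } of metaTags) {
    if (!content) continue;
    // meta[property] may hold several space-separated keys.
    const keys = property.split(/\s+/).concat(name).filter(Boolean);
    for (const key of keys) {
      values[key.toLowerCase()] = content.trim();
    }
  }
  return values;
}

// Return the best value for a field such as "title", trying each prefix
// in order, then the plain HTML name, then the caller's fallback.
function pickField(values, field, fallback) {
  for (const prefix of PREFIX_ORDER) {
    const value = values[`${prefix}:${field}`];
    if (value) return value;
  }
  return values[field] || fallback;
}

const tags = [
  { property: "og:title twitter:title", content: "Open Graph title" },
  { name: "dc:title", content: "Dublin Core title" },
  { name: "description", content: "Plain HTML description" },
];
const metadata = collectMetadata(tags);
console.log(pickField(metadata, "title", "title from page heuristics"));
console.log(pickField(metadata, "description", ""));
```

The fallback argument plays the role that `_getArticleTitle()` has in the commit message: it is consulted only when no metadata key supplies a value.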