143631b4b7 * feat: Add a custom extractor for ma.ttias.be. When parsing content for cron.weekly issues, such as the one at https://ma.ttias.be/cronweekly/issue-130/, Mercury Parser would remove headings and ordered lists that were part of the content. This resolves that as follows: * Remove "id" attributes from "h1" and "h2" elements. Those attributes would result in the elements having a low weight. * Since Mercury Parser demotes "h1" elements to "h2", demote "h2" elements to "h3". * Add class="entry-content-asset" to "ul" elements to avoid them being removed. * Removed redundant comment. * feat: Add a custom extractor for engadget.com. * Works, but I need to figure out how to make pagination work correctly. * Fixed pagination: it would only retrieve the first or second page because we would send contentOnly: true on subsequent pages (page 2). Removed failover: true from preview. * Rolled back { fallback: false } option removal. * Clarified comments. Co-authored-by: John Holdun <john@johnholdun.com> | 2 years ago | |
---|---|---|
.. | ||
1587927767738.html | 2 years ago | |
1587929444000.html | 2 years ago |
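
The three content fixes described in the commit message for ma.ttias.be could be expressed roughly as the sketch below. This is not the committed extractor: the `title` and `content` selectors are placeholders assumed for illustration, and only the three transforms follow the commit message. It relies on Mercury Parser's convention that a transform function receives a cheerio-wrapped node and may return a tag name to rename the element.

```javascript
// Hypothetical sketch of a Mercury Parser custom extractor for ma.ttias.be.
// The selectors are assumptions; only the transforms mirror the commit message.
const MaTtiasBeExtractor = {
  domain: 'ma.ttias.be',

  title: {
    selectors: ['h1'], // assumed selector, not taken from the commit
  },

  content: {
    selectors: ['.content'], // assumed selector, not taken from the commit

    transforms: {
      // Remove the "id" attribute so the heading is not given a low weight.
      h1: ($node) => {
        $node.removeAttr('id');
      },

      // Remove "id" too, and demote to h3, since h1 elements are demoted to h2.
      h2: ($node) => {
        $node.removeAttr('id');
        return 'h3'; // returning a string renames the element
      },

      // Mark lists with this class so they are not removed during cleaning.
      ul: ($node) => {
        $node.attr('class', 'entry-content-asset');
      },
    },

    clean: [],
  },
};
```

The `.html` files listed above appear to be fixture pages; a real extractor would be checked against them rather than against the placeholder selectors used here.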