10 Commits

Author | SHA1 | Message | Date
PalmerAL
3844d8f05b
Include more ancestors in candidate scoring (#611)
* include more ancestors in candidate scoring

* fix medium-3 testcase

The original source file contained two copies of the document, which
was causing incorrect results

* remove unnecessary nested elements

* fix removal of empty elements

* add option to regenerate all testcases

* update tests

* fix quanta testcase

* fix creating testcase from network

* fix early exit in testcase generation

* format HTML before comparing while testing

* upgrade js-beautify

* don't merge outer readability div
2020-08-21 10:16:58 +01:00
Gijs
d6fc38c4b4
Fix #564 by allowing 'content' as an indicator of readable content (#565)
This avoids `contentWithSidebar` causing complete removal of the content.
As a side effect, it slightly improves byline detection by not removing
content as early on as before.
2019-10-21 15:13:55 +01:00
Maria Luiza Soares
8c41d92560 Assert on siteName in all test cases 2018-12-21 18:28:28 +00:00
Gijs Kruitbosch
ad4dd26448 Update test expectations 2017-11-30 10:41:15 +00:00
Cameron McCormack
5ad448f831 Update test expectations. 2017-11-21 10:04:59 +00:00
Evan Tseng
0f147374b7 Bug 1323861 - Remove the readScript method, r=Gijs 2017-02-22 09:39:17 +00:00
Evan Tseng
15e1f03261 Bug 1300697 - Reader View missed first few paragraphs on New York Times website, r=Gijs 2017-01-21 17:46:50 +00:00
Gijs Kruitbosch
2e1cb3f467 Fix issue #251 by making JSDOMParser expect XML and stop making excuses for 'self-closed' things, when all that does is cause trouble 2016-01-22 19:57:45 +00:00
Nicolas Perriault
dc1b2c9fa0 Refs #195 - Exclude nodes likely to be related content. 2015-05-04 08:51:45 +02:00
Nicolas Perriault
cc18cb5787 Ref #195 - Add support for dailymotion videos. 2015-04-30 15:02:52 +02:00