Fix for #52: <input type="hidden"> elements are no longer counted by the "form removal" heuristic.
parent 2fab5ffa6b
commit 638f73f6a2
@@ -452,6 +452,7 @@ class Document:
     for kind in ['p', 'img', 'li', 'a', 'embed', 'input']:
         counts[kind] = len(el.findall('.//%s' % kind))
     counts["li"] -= 100
+    counts["input"] -= len(el.findall('.//input[@type="hidden"]'))
 
     # Count the text length excluding any surrounding whitespace
     content_length = text_length(el)
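
The effect of the added line can be illustrated in isolation. The following is a minimal sketch, not the project's code: it assumes the standard library's `xml.etree.ElementTree` as a stand-in for whatever parser the project uses, and `count_visible_inputs` is a hypothetical helper name introduced only for this illustration.

```python
import xml.etree.ElementTree as ET


def count_visible_inputs(el):
    """Count <input> descendants of el, ignoring type="hidden" ones.

    Hypothetical helper: mirrors the subtraction added in this commit.
    """
    total = len(el.findall('.//input'))
    hidden = len(el.findall('.//input[@type="hidden"]'))
    return total - hidden


# A form with two hidden fields and one visible text field.
form = ET.fromstring(
    '<form>'
    '<input type="hidden" name="csrf"/>'
    '<input type="hidden" name="next"/>'
    '<input type="text" name="q"/>'
    '<p>Some article text.</p>'
    '</form>'
)

print(len(form.findall('.//input')))  # all inputs, hidden ones included
print(count_visible_inputs(form))     # inputs left after the subtraction
```

Note that the attribute predicate is an exact string comparison, so in this sketch an element written as `type="HIDDEN"` would still be counted.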