Next: Continue working on `_paragraphize`; move `p` tags that are nested inside other `p` tags outside of them (do this when not converting `br` tags)
- `extract` (this kicks it all off)
x `node_is_sufficient`
- `_extract_best_node`
x `get_weight`
x `_strip_unlikely_candidates`
x `_convert_to_paragraphs`
x `_brs_to_paragraphs`
x `_paragraphize`
## Scoring
- `_get_score`
- `_set_score`
- `_add_score`
- `_score_content`
- `_score_node`
- `_score_paragraph`
## Top Candidate
- `_find_top_candidate`
- `extract_clean_node`
- `_clean_conditionally`