mirror of https://github.com/postlight/mercury-parser, synced 2024-11-17 03:25:31 +00:00
fix: cleaning embed and object nodes
parent edfb54c532
commit 52e89a0229
TODO.md (3 lines changed)
@@ -1,5 +1,4 @@
 TODO:
-- run makeLinksAbsolute on extracted content before returning
 - remove logic for fetching meta attrs with custom props
 - Resource (fetches page, validates it, cleans it, normalizes meta tags (!), converts lazy-loaded images, makes links absolute, etc)
 - extractNextPageUrl
@@ -12,6 +11,8 @@ TODO:
 - Separate constants into activity-specific folders (dom, scoring)
 
 DONE:
+x cleaning embed and object nodes
+x run makeLinksAbsolute on extracted content before returning
 x add option to fetch attrs in RootExtractor's select method
 x get custom datePublished selector to convert to date object (prob through cleaner)
 x extract and generalize cleaners
@@ -9,6 +9,8 @@ export const STRIP_OUTPUT_TAGS = [
   'link',
   'style',
   'hr',
+  'embed',
+  'object',
 ]
 
 // cleanAttributes
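To illustrate what adding 'embed' and 'object' to STRIP_OUTPUT_TAGS is meant to achieve, here is a minimal sketch of how a strip list like this could be applied. It is not the project's cleaner: the `stripTags` helper and the plain-object node shape (`{ tag, children }`) are hypothetical, and the list below contains only the entries visible in the hunk above, not the full constant.

```javascript
// Hypothetical sketch, not the project's actual cleaning code.
// Only the entries visible in the diff hunk are listed here; the real
// STRIP_OUTPUT_TAGS constant has earlier entries that the hunk does not show.
const STRIP_OUTPUT_TAGS = ['link', 'style', 'hr', 'embed', 'object'];

// Assumed node shape: either a plain string (text) or { tag, children }.
// Returns a copy of the tree with every element whose tag is in `tags`
// removed, together with everything nested inside it.
function stripTags(node, tags = STRIP_OUTPUT_TAGS) {
  if (typeof node === 'string') return node;
  const children = (node.children || [])
    .filter((child) => typeof child === 'string' || !tags.includes(child.tag))
    .map((child) => stripTags(child, tags));
  return { ...node, children };
}

const article = {
  tag: 'div',
  children: [
    { tag: 'p', children: ['Hello'] },
    { tag: 'embed', children: [] },
    { tag: 'p', children: [{ tag: 'object', children: ['flash'] }, 'World'] },
  ],
};

const cleaned = stripTags(article);
console.log(JSON.stringify(cleaned));
```

The sketch returns a new tree rather than mutating its input, so the original `article` stays available for comparison. A DOM-based cleaner would more likely remove the matching nodes in place.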