forked from Archives/langchain
Add title, lang, description to Web loader document metadata (#2955)
Title, lang and description are on almost every web page, and are incredibly useful pieces of information that currently aren't captured by the web base loader.

I thought about adding the title and description to the content of the document, as that content could be useful in search, but I left it out for right now. If you think it'd be worth adding, happy to add it.

I've found it's nice to have the title/description in the metadata to have some structured data when retrieving rows from vectordbs for use with summary and source citation, so if we do want to add it to the `page_content`, I'd advocate for it to also be included in metadata.
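As a rough illustration of the idea, here is a minimal sketch of pulling the title, lang and description out of a page and placing them in a metadata dict next to the source URL. It is not the code from this commit: it uses only Python's standard-library `html.parser`, and the names `MetadataParser` and `build_metadata`, along with the dict keys `title`, `description` and `language`, are assumptions made for the example.

```python
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collects <title>, <html lang="...">, and <meta name="description">.

    Hypothetical helper for illustration; not part of the loader itself.
    """

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._in_title = False
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "html" and attrs.get("lang"):
            self.metadata["language"] = attrs["lang"]
        elif tag == "title":
            # Start collecting text for this <title> element.
            self._in_title = True
            self._title_parts = []
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            if attrs.get("content"):
                self.metadata["description"] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
            title = "".join(self._title_parts).strip()
            # Keep the first non-empty title only (e.g. ignore later SVG titles).
            if title and "title" not in self.metadata:
                self.metadata["title"] = title

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)


def build_metadata(html, source):
    """Return a metadata dict with the source plus any fields found in the page."""
    parser = MetadataParser()
    parser.feed(html)
    parser.close()
    return {"source": source, **parser.metadata}
```

Fields that are missing from the page are simply left out of the dict in this sketch, so downstream code that builds summaries or source citations should read them with `.get()`.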
parent f7bf917baf
commit fea5619ce9