mirror of
https://github.com/hwchase17/langchain
synced 2024-10-29 17:07:25 +00:00
Key Concepts
Document
This class is a container for document information. It has two parts:
page_content: The content of the actual page itself.
metadata: The metadata associated with the document, such as the file path or the URL.
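To make the two parts concrete, here is a minimal sketch of such a container. It is an illustrative stand-in written with a plain dataclass, not the library's actual class definition; the sample text and the "source" metadata key are assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Document:
    """Illustrative stand-in for the Document container described above."""

    # The content of the actual page itself.
    page_content: str
    # Arbitrary metadata about the document (file path, URL, and so on).
    metadata: Dict[str, Any] = field(default_factory=dict)


# Hypothetical example values; "source" is an assumed metadata key.
doc = Document(
    page_content="Hello, world.",
    metadata={"source": "notes/hello.txt"},
)
```

Keeping metadata as a free-form dictionary means each loader can attach whatever provenance it has without changing the container.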
Loader
This base class defines how documents are loaded. It exposes a load method that returns Document objects.
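The loader contract can be sketched as an abstract base class with a single load method. This is a self-contained illustration under stated assumptions, not the library's own code: the names BaseLoader and PlainTextLoader, the list return type, and the "source" metadata key are all hypothetical, and Document is redefined here only so the block runs on its own.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Dict, List


@dataclass
class Document:
    """Stand-in container: page text plus free-form metadata."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


class BaseLoader(ABC):
    """Hypothetical base class: subclasses decide where documents come from."""

    @abstractmethod
    def load(self) -> List[Document]:
        """Return the loaded documents."""


class PlainTextLoader(BaseLoader):
    """Hypothetical loader that reads one UTF-8 text file into one Document."""

    def __init__(self, file_path: str) -> None:
        self.file_path = file_path

    def load(self) -> List[Document]:
        text = Path(self.file_path).read_text(encoding="utf-8")
        # Record where the text came from so it can be traced later.
        return [Document(page_content=text, metadata={"source": self.file_path})]
```

Calling PlainTextLoader("some_file.txt").load() on an existing file would return a one-element list whose Document carries the file's text and its path. Because every loader exposes the same load method, downstream code can stay independent of the source format.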
Unstructured
Unstructured is a Python package that focuses on transforming raw documents into text.