langchain/tests/integration_tests/examples
qued e4224a396b
feat: Add `UnstructuredXMLLoader` for `.xml` files (#5955)
# Unstructured XML Loader
Adds an `UnstructuredXMLLoader` class for `.xml` files. Requires
`unstructured>=0.6.7`. A plain-text representation of the text with the
XML tags is available under the `page_content` attribute of each loaded
document.

### Testing
```python
from langchain.document_loaders import UnstructuredXMLLoader

loader = UnstructuredXMLLoader(
    "example_data/factbook.xml",
)
docs = loader.load()
```
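
To give a rough idea of what a plain-text representation of an XML file can look like, here is a minimal sketch that uses only the Python standard library. It is not how `UnstructuredXMLLoader` is implemented (the loader relies on the `unstructured` package), and how tags are handled in the loader's actual output depends on that package; this sketch simply drops them. The `SAMPLE_XML` string and the `xml_to_plain_text` helper are hypothetical names introduced for illustration, and the sample is not the contents of `example_data/factbook.xml`.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, made up for this illustration.
SAMPLE_XML = """<factbook>
  <country>
    <name>Exampleland</name>
    <capital>Sample City</capital>
  </country>
</factbook>"""


def xml_to_plain_text(xml_string: str) -> str:
    """Join the non-empty text nodes of an XML document, one per line.

    Assumption: this stands in for a plain-text view of the file and is
    not the extraction logic used by `unstructured`.
    """
    root = ET.fromstring(xml_string)
    # itertext() walks every text and tail node in document order.
    pieces = (text.strip() for text in root.itertext())
    return "\n".join(piece for piece in pieces if piece)


plain_text = xml_to_plain_text(SAMPLE_XML)
print(plain_text)
```

With the real loader, the comparable string would be read from the `page_content` attribute of each document returned by `loader.load()`.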


## Who can review?

@hwchase17 
@eyurtsev
1 year ago

| File | Last commit | Age |
| --- | --- | --- |
| default-encoding.py | Add PythonLoader which auto-detects encoding of Python files (#3311) | 1 year ago |
| example-utf8.html | Add ability to pass kwargs to loader classes in `DirectoryLoader`, add ability to modify encoding and BeautifulSoup behaviour in `BSHTMLLoader` (#2275) | 2 years ago |
| example.html | Add HTML document_loader that includes page title metadata (#1720) | 2 years ago |
| example.json | JSON loader (#4067) | 1 year ago |
| facebook_chat.json | Refactor TelegramChatLoader and FacebookChatLoader classes and add tests (#3863) | 1 year ago |
| factbook.xml | feat: Add `UnstructuredXMLLoader` for `.xml` files (#5955) | 1 year ago |
| fake.odt | feat: add loader for open office odt files (#4405) | 1 year ago |
| hello.msg | Harrison/msg files (#2375) | 2 years ago |
| hello.pdf | Harrison/format agent instructions (#973) | 2 years ago |
| layout-parser-paper.pdf | Harrison/remote paths pdf (#1544) | 2 years ago |
| non-utf8-encoding.py | Add PythonLoader which auto-detects encoding of Python files (#3311) | 1 year ago |
| sitemap.xml | Harrison/sitemap local (#4704) | 1 year ago |
| slack_export.zip | Add Slack Directory Loader (#2841) | 2 years ago |
| stanley-cups.csv | feat: Add `UnstructuredCSVLoader` for CSV files (#5844) | 1 year ago |
| stanley-cups.xlsx | feat: add `UnstructuredExcelLoader` for `.xlsx` and `.xls` files (#5617) | 1 year ago |
| whatsapp_chat.txt | Update WhatsAppChatLoader to include the character ~ in the sender name (#4420) | 1 year ago |