Commit Graph

5 Commits (main)

Author SHA1 Message Date
Harrison Chase 4766b20223
clean up loaders (#1178) 1 year ago
Matt Robinson 2bee8d4941
feat: add support for `.ppt` files in `UnstructuredPowerPointLoader` (#1124)
###  Summary

Adds support for older `.ppt` file in the PowerPoint loader. 

### Testing

The following should work on `unstructured==0.4.11` using the example
docs from the `unstructured` repo.

```python
from langchain.document_loaders import UnstructuredPowerPointLoader

filename = "../unstructured/example-docs/fake-power-point.pptx"
loader = UnstructuredPowerPointLoader(filename)
loader.load()

filename = "../unstructured/example-docs/fake-power-point.ppt"
loader = UnstructuredPowerPointLoader(filename)
loader.load()
```

Now downgrade `unstructured` to version `0.4.10`. The following should
work:

```python
from langchain.document_loaders import UnstructuredPowerPointLoader

filename = "../unstructured/example-docs/fake-power-point.pptx"
loader = UnstructuredPowerPointLoader(filename)
loader.load()
```

and the following should give you a `ValueError` and invite you to
upgrade `unstructured`.


```python
from langchain.document_loaders import UnstructuredPowerPointLoader

filename = "../unstructured/example-docs/fake-power-point.ppt"
loader = UnstructuredPowerPointLoader(filename)
loader.load()
```
1 year ago
Harrison Chase 0998577dfe
Harrison/unstructured structured (#1004) 1 year ago
Harrison Chase 637c0d6508
Harrison/obsidian (#920) 1 year ago
Harrison Chase 2ec25ddd4c
add unstructured examples (#913) 1 year ago