Commit Graph

17 Commits

Author SHA1 Message Date
Harrison Chase
0998577dfe
Harrison/unstructured structured (#1004) 2023-02-12 07:36:11 -08:00
Harrison Chase
bbb06ca4cf
pdfminer (#1003) 2023-02-12 07:29:26 -08:00
Francisco Ingham
0b6aa6a024
Added initial capital letter to bullet points that had it missing (#1000)
Co-authored-by: Francisco Ingham <>
2023-02-11 20:31:34 -08:00
Harrison Chase
2e96704d59
Harrison/airbyte (#989)
Co-authored-by: zanderchase <zanderchase@gmail.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MacBook-Pro.local>
2023-02-10 18:08:00 -08:00
zanderchase
c2d1d903fa
Zander/online pdf loader (#984) 2023-02-10 15:42:30 -08:00
Matt Robinson
07a407d89a
feat: adds UnstructuredURLLoader for loading data from urls (#979)
### Summary

Adds an `UnstructuredURLLoader` that supports loading data from a list of
URLs.


### Testing

```python
from langchain.document_loaders import UnstructuredURLLoader

# Pages to fetch; the loader depends on the `unstructured` package.
urls = [
    "https://www.understandingwar.org/backgrounder/russian-offensive-campaign-assessment-february-8-2023",
    "https://www.understandingwar.org/backgrounder/russian-offensive-campaign-assessment-february-9-2023",
]
loader = UnstructuredURLLoader(urls=urls)
# load() returns a list of Document objects built from the fetched pages.
raw_documents = loader.load()
```
2023-02-10 10:18:38 -08:00
Harrison Chase
c64f98e2bb
Harrison/format agent instructions (#973)
Co-authored-by: Andrew White <white.d.andrew@gmail.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
Co-authored-by: Peng Qu <82029664+pengqu123@users.noreply.github.com>
2023-02-10 10:07:26 -08:00
Harrison Chase
5469d898a9
Harrison/everynote (#974)
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2023-02-10 08:02:35 -08:00
Harrison Chase
01fa2d8117
Harrison/youtube fixes (#955)
Co-authored-by: Ji <jizhang.work@gmail.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2023-02-09 08:12:22 -08:00
zanderchase
8e126bc9bd
adding webpage loading logic (#942) 2023-02-09 07:52:50 -08:00
Harrison Chase
3e1901e1aa
gutenberg books (#946)
Co-authored-by: zanderchase <zander@unfold.ag>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2023-02-08 12:00:47 -08:00
Harrison Chase
44ecec3896
Harrison/add roam loader (#939) 2023-02-08 00:35:33 -08:00
Harrison Chase
637c0d6508
Harrison/obsidian (#920) 2023-02-06 22:21:16 -08:00
Ankush Gola
6bd1529cb7
add GoogleDriveLoader (#914)
only deal with docs files for now
2023-02-06 21:44:35 -08:00
Harrison Chase
2ec25ddd4c
add unstructured examples (#913) 2023-02-06 18:13:46 -08:00
Harrison Chase
71e662e88d
update docs (#905) 2023-02-06 00:26:20 -08:00
Harrison Chase
53d56d7650
Harrison/unstructured support (#903) 2023-02-05 23:02:07 -08:00