Commit Graph

60 Commits

Author SHA1 Message Date
Matt Robinson
3dfe1cf60e
feat: document loader for epublications (#2202)
### Summary

Adds a new document loader for processing e-publications. Works with
`unstructured>=0.5.4`. You need to have
[`pandoc`](https://pandoc.org/installing.html) installed for this loader
to work.

### Testing

```python
from langchain.document_loaders import UnstructuredEPubLoader

loader = UnstructuredEPubLoader("winter-sports.epub", mode="elements")
data = loader.load()
data[0]
```
2023-03-30 20:45:31 -07:00
Max Caldwell
3dc49a04a3
[Documents] Updated Figma docs and added example (#2172)
- Current docs are pointing to the wrong module, fixed
- Added some explanation on how to find the necessary parameters
- Added chat-based codegen example w/ retrievers

Picture of the new page:
![Screenshot 2023-03-29 at 20-11-29 Figma — 🦜🔗 LangChain 0 0
126](https://user-images.githubusercontent.com/2172753/228719338-c7ec5b11-01c2-4378-952e-38bc809f217b.png)

Please let me know if you'd like any tweaks! I wasn't sure if the
example was too heavy for the page or not but decided "hey, I probably
would want to see it" and so included it.

Co-authored-by: maxtheman <max@maxs-mbp.lan>
2023-03-29 22:11:45 -07:00
Chase Adams
b5449a866d
docs: tiny fix on docs verbiage (#2124)
Changed `RecursiveCharaterTextSplitter` =>
`RecursiveCharacterTextSplitter`. GH's diff doesn't handle the long
string well.
2023-03-28 22:56:29 -07:00
Harrison Chase
410bf37fb8
Harrison/big query (#2100)
Co-authored-by: lu-cashmoney <lucas.corley@gmail.com>
2023-03-28 08:17:22 -07:00
Stéphane Busso
0bee219cb3
feat: Add Notion database document loader (#2056)
This PR adds Notion DB loader for langchain. 

It reads content from pages within a Notion Database. It uses the Notion
API to query the database and read the pages. It also reads the metadata
from the pages and stores it in the Document object.
2023-03-28 08:07:09 -07:00
Harrison Chase
d5825bd3e8
Harrison/whatsapp loader (#2085)
Co-authored-by: Moshe <hello@moshemalka.me>
2023-03-27 23:43:45 -07:00
Harrison Chase
f74a1bebf5
Harrison/duckdb (#2064)
Co-authored-by: Trent Hauck <trent@trenthauck.com>
2023-03-27 19:51:34 -07:00
Harrison Chase
30e3b31b04
Harrison/document cleanup (#2062)
Co-authored-by: Delip Rao <delip@users.noreply.github.com>
2023-03-27 16:32:55 -07:00
Harrison Chase
a0cd6672aa
Harrison/site map (#2061)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
2023-03-27 16:28:08 -07:00
Harrison Chase
705431aecc
big docs refactor (#1978)
Co-authored-by: Ankush Gola <ankush.gola@gmail.com>
2023-03-26 19:49:46 -07:00