Commit Graph

630 Commits

Author SHA1 Message Date
b5d3e99f9b TextLoader autodetect encodings + better exception handling (#4481) 2023-05-18 20:38:39 +02:00
Harrison Chase
243886be93
Harrison/virtual time (#4658)
Co-authored-by: ifsheldon <39153080+ifsheldon@users.noreply.github.com>
Co-authored-by: maple.liang <maple.liang@gempoll.com>
2023-05-14 10:29:17 -07:00
Harrison Chase
ef49c659f6
add embedding router (#4644) 2023-05-13 21:47:01 -07:00
Harrison Chase
c09bb00959
Harrison/summary memory history (#4649)
Co-authored-by: engkheng <60956360+outday29@users.noreply.github.com>
2023-05-13 21:46:11 -07:00
Harrison Chase
44ae673388
Harrison/multithreading directory loader (#4650)
Co-authored-by: PawelFaron <42373772+PawelFaron@users.noreply.github.com>
Co-authored-by: Pawel Faron <ext-pawel.faron@vaisala.com>
2023-05-13 21:46:02 -07:00
Harrison Chase
873b0c7eb6
Harrison/structured chat mem (#4652)
Co-authored-by: d 3 n 7 <29033313+d3n7@users.noreply.github.com>
2023-05-13 21:45:42 -07:00
Harrison Chase
279605b4d3
Harrison/metaphor search (#4657)
Co-authored-by: Jeffrey Wang <jeffreyzhiyuanwang@gmail.com>
2023-05-13 21:45:05 -07:00
Harrison Chase
9aa9fe7021
Harrison/spark connect example (#4659)
Co-authored-by: Mike Wang <62768671+skcoirz@users.noreply.github.com>
2023-05-13 21:44:54 -07:00
Leonid Ganeline
3ce78ef6c4
docs: document_loaders classification (#4069)
**Problem statement:** the
[document_loaders](https://python.langchain.com/en/latest/modules/indexes/document_loaders.html#)
section is too long and hard to comprehend.
**Proposal:** group document_loaders by 3 classes: (see `Files changed`
tab)

UPDATE: I've completely reworked the document_loader classification.
Now this PR changes only one file! 

FYI @eyurtsev @hwchase17
2023-05-13 19:17:32 -07:00
Harrison Chase
1e322ffc1c change heading 2023-05-13 09:52:23 -07:00
Davis Chase
9ab7101182
WIP: FLARE-inspired chain (#4612)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-05-13 09:28:28 -07:00
Harrison Chase
daa3e6dedb
Harrison/prompt constructor methods (#4616) 2023-05-13 09:23:51 -07:00
Harrison Chase
6265cbfb11
Harrison/standard llm interface (#4615) 2023-05-13 09:05:31 -07:00
Harrison Chase
7d425cbf38
improve sql prompt (#4611)
Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
Co-authored-by: Taqi Jaffri <tjaffri@gmail.com>
2023-05-12 21:55:03 -07:00
Tim Asp
ed0d557ede
docs: fix pdf docs hierarchy and formatting (#4593)
# Fix pdf loader docs page


![image](https://github.com/hwchase17/langchain/assets/707699/4a11f379-00ed-4f7a-9870-71f74e0cadc6)

Using h1's messes with hierarchy, this fixes that, and moves the
PyPDFium2 loader out of the middle of PDFMiner docs
2023-05-12 15:03:01 -04:00
Davis Chase
a4a9d1f403
Improve vespa interface (#4546)
![Screenshot 2023-05-11 at 7 50 31
PM](https://github.com/hwchase17/langchain/assets/130488702/bc8ab4bb-8006-44fc-ba07-df54e84ee2c1)
2023-05-12 10:11:26 -07:00
Neil Ruaro
3a2855945b
added documentation on retrieving a PG vectorstore (#4578)
This PR adds in documentation on querying an existing vectorstore in PG 

Fixes 3191 (issue)
2023-05-12 13:04:06 -04:00
Harrison Chase
5ad151ed44
Add constitutional principles from paper (#4554)
Add constitutional principles from https://arxiv.org/pdf/2212.08073.pdf

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-05-12 07:34:03 -07:00
Sai Vinay G
cf4c1394a2
feat: Added class to support huggingface text generation inference server (#4447)
[Text Generation
Inference](https://github.com/huggingface/text-generation-inference) is
a Rust, Python and gRPC server for generating text using LLMs.

This pull request add support for self hosted Text Generation Inference
servers.

feature: #4280

---------

Co-authored-by: Your Name <you@example.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-05-12 07:32:37 -07:00
Leonid Ganeline
e17d0319d5
Add arxiv retriever (#4538) 2023-05-11 22:48:38 -07:00
SimFG
7bcf238a1a
Optimize the initialization method of GPTCache (#4522)
Optimize the initialization method of GPTCache, so that users can use GPTCache more quickly.
2023-05-11 16:15:23 -07:00
kYLe
446b60d803
Fix a typo in langchain/docs/modules/models/llms/integrations/anyscale.ipynb (#4526) 2023-05-11 09:03:04 -07:00
kYLe
0d51a1f12b
Add LLMs support for Anyscale Service (#4350)
Add Anyscale service integration under LLM

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-05-11 00:39:59 -07:00
Zander Chase
d969f43ed8
Load HuggingFace Tool (#4475)
# Add option to `load_huggingface_tool`

Expose a method to load a huggingface Tool from the HF hub

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-05-11 00:07:36 -07:00
Harrison Chase
3ce29cb4a6
Harrison/new search (#4359)
Co-authored-by: Jiaping(JP) Zhang <vincentzhangv@gmail.com>
2023-05-10 17:09:16 -07:00
Davis Chase
9ec60ad832
Add azure cognitive search retriever (#4467)
All credit to @UmerHA, made a couple small changes

---------

Co-authored-by: UmerHA <40663591+UmerHA@users.noreply.github.com>
2023-05-10 15:27:27 -07:00
Davis Chase
46b100ea63
Add DocArray vector stores (#4483)
Thanks to @anna-charlotte and @jupyterjazz for the contribution! Made
few small changes to get it across the finish line

---------

Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>
Signed-off-by: jupyterjazz <saba.sturua@jina.ai>
Co-authored-by: anna-charlotte <charlotte.gerhaher@jina.ai>
Co-authored-by: jupyterjazz <saba.sturua@jina.ai>
Co-authored-by: Saba Sturua <45267439+jupyterjazz@users.noreply.github.com>
2023-05-10 15:22:16 -07:00
Davis Chase
04475bea7d
Mv plan and execute to experimental (#4459) 2023-05-10 08:31:53 -07:00
Matt Robinson
3637d6da6e
feat: add loader for open office odt files (#4405)
# ODF File Loader

Adds a data loader for handling Open Office ODT files. Requires
`unstructured>=0.6.3`.

### Testing

The following should work using the `fake.odt` example doc from the
[`unstructured` repo](https://github.com/Unstructured-IO/unstructured).

```python
from langchain.document_loaders import UnstructuredODTLoader

loader = UnstructuredODTLoader(file_path="fake.odt", mode="elements")
loader.load()

loader = UnstructuredODTLoader(file_path="fake.odt", mode="single")
loader.load()
```
2023-05-10 01:37:17 -07:00
Harrison Chase
f0cfed636f change nb name 2023-05-09 21:22:35 -07:00
Harrison Chase
6b8d144ccc
Harrison/plan and solve (#4422) 2023-05-09 21:07:56 -07:00
Leonid Ganeline
ce15ffae6a
added Wikipedia retriever (#4302)
- added `Wikipedia` retriever. It is effectively a wrapper for
`WikipediaAPIWrapper`. It wrapps load() into get_relevant_documents()
- sorted `__all__` in the `retrievers/__init__`
- added integration tests for the WikipediaRetriever
- added an example (as Jupyter notebook) for the WikipediaRetriever
2023-05-09 10:08:39 -07:00
Prayson Wilfred Daniel
2b4ba203f7
query correction from when to what (#4383)
# Minor Wording Documentation Change 

```python
agent_chain.run("When's my friend Eric's surname?")
# Answer with 'Zhu'
```

is change to 

```python
agent_chain.run("What's my friend Eric's surname?")
# Answer with 'Zhu'
```

I think when is a residual of the old query that was "When’s my friends
Eric`s birthday?".
2023-05-09 07:42:47 -07:00
BioErrorLog
04f765b838
Fix grammar in Text Splitters docs (#4373)
# Fix grammar in Text Splitters docs

Just a small fix of grammar in the documentation:

"That means there two different axes" -> "That means there are two
different axes"
2023-05-08 22:38:40 -04:00
Ankush Gola
b3ecce0545
fix json saving, update docs to reference anthropic chat model (#4364)
Fixes # (issue)
https://github.com/hwchase17/langchain/issues/4085
2023-05-08 15:30:52 -07:00
Simba Khadder
d84df25466
Add example on how to use Featureform with langchain (#4337)
Added an example on how to use Featureform to
connecting_to_a_feature_store.ipynb .
2023-05-08 10:32:17 -07:00
Zander Chase
8b284f9ad0
Pass parsed inputs through to tool _run (#4309) 2023-05-08 09:13:05 -07:00
Harrison Chase
c8b0b6e6c1
add youtube tools (#4320) 2023-05-08 08:29:30 -07:00
PawelFaron
04b74d0446
Adjusted GPT4All llm to streaming API and added support for GPT4All_J (#4131)
Fix for these issues:
https://github.com/hwchase17/langchain/issues/4126

https://github.com/hwchase17/langchain/issues/3839#issuecomment-1534258559

---------

Co-authored-by: Pawel Faron <ext-pawel.faron@vaisala.com>
2023-05-06 15:14:09 -07:00
Harrison Chase
64940e9d0f
docs for azure (#4238) 2023-05-06 10:16:00 -07:00
Myeongseop Kim
747b5f87c2
Add HumanInputLLM (#4160)
Related: #4028, I opened a new PR because (1) I was unable to unstage
mistakenly committed files (I'm not familiar with git enough to resolve
this issue), (2) I felt closing the original PR and opening a new PR
would be more appropriate if I changed the class name.

This PR creates HumanInputLLM(HumanLLM in #4028), a simple LLM wrapper
class that returns user input as the response. I also added a simple
Jupyter notebook regarding how and why to use this LLM wrapper. In the
notebook, I went over how to use this LLM wrapper and showed example of
testing `WikipediaQueryRun` using HumanInputLLM.
 
I believe this LLM wrapper will be useful especially for debugging,
educational or testing purpose.
2023-05-06 09:48:40 -07:00
Davis Chase
6cd51ef3d0
Simplify router chain constructor signatures (#4146) 2023-05-06 09:38:17 -07:00
Leonid Ganeline
9544b30821
added Wikipedia document loader (#4141)
- Added the `Wikipedia` document loader. It is based on the existing
`unilities/WikipediaAPIWrapper`
- Added a respective ut-s and example notebook
- Sorted list of classes in __init__
2023-05-06 09:32:45 -07:00
Davis Chase
5ca13cc1f0
Dev2049/pypdfium2 (#4209)
thanks @jerrytigerxu for the addition!

---------

Co-authored-by: Jere Xu <jtxu2008@gmail.com>
Co-authored-by: jerrytigerxu <jere.tiger.xu@gmailc.om>
2023-05-05 17:55:31 -07:00
Leonid Ganeline
59204a5033
docs: document_loaders improvements (#4200)
- made notebooks consistent: titles, service/format descriptions.
- corrected short names to full names, for example, `Word` -> `Microsoft
Word`
- added missed descriptions
- renamed notebook files to make ToC correctly sorted
2023-05-05 17:44:54 -07:00
Aivin V. Solatorio
6567b73e1a
JSON loader (#4067)
This implements a loader of text passages in JSON format. The `jq`
syntax is used to define a schema for accessing the relevant contents
from the JSON file. This requires dependency on the `jq` package:
https://pypi.org/project/jq/.

---------

Signed-off-by: Aivin V. Solatorio <avsolatorio@gmail.com>
2023-05-05 14:48:13 -07:00
Harrison Chase
26534457f5
simplify csv args (#4182) 2023-05-05 09:22:08 -07:00
Davis Chase
d84bb02881
Add Chroma self query (#4149)
Add internal query language -> chroma metadata filter translator
2023-05-05 08:43:08 -07:00
Vinoo Ganesh
905a2114d7
Fix: Typo in Docs (#4179)
Fixing small typo in docs
2023-05-05 08:35:49 -07:00
Harrison Chase
a9c2450330
Harrison/toml loader (#4090)
Co-authored-by: Mika Ayenson <Mikaayenson@users.noreply.github.com>
2023-05-03 23:14:39 -07:00