Commit Graph

523 Commits (fix-searx)

Author SHA1 Message Date
blob42 97bcca7518 docs: fix typo in searx search tool 1 year ago
Harrison Chase 3ee32a01ea
Harrison/prompt layer (#1547)
Co-authored-by: Jonathan Pedoeem <jonathanped@gmail.com>
Co-authored-by: AbuBakar <abubakarsohail123@gmail.com>
1 year ago
Harrison Chase c844d1fd46
Harrison/chunk size (#1549)
Co-authored-by: Florian Leuerer <31259070+floleuerer@users.noreply.github.com>
1 year ago
Harrison Chase 9405af6919
Harrison/hf inf error (#1543)
Co-authored-by: Konstantin Hebenstreit <57603012+KonstantinHebenstreit@users.noreply.github.com>
1 year ago
Harrison Chase 357d808484
Harrison/remote paths pdf (#1544)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
1 year ago
Harrison Chase cc423f40f1
Harrison/youtube loader (#1545)
Co-authored-by: Julian Wustl <57504258+Julianwustl@users.noreply.github.com>
1 year ago
Harrison Chase 523ad8d2e2
Harrison/chat history formatter1 (#1538)
Co-authored-by: Youssef A. Abukwaik <yousseb@users.noreply.github.com>
1 year ago
Harrison Chase 519f0187b6
Harrison/gdrive pdf (#1433)
Co-authored-by: LM <93918064+LuisMalhadas@users.noreply.github.com>
Co-authored-by: Luis Malhadas <luis@sia.so>
1 year ago
Florian Leuerer 64c6435545
Added client_settings support for chromadb vecstore (#1528)
# Problem

The ChromaDB vecstore only supported local connection. There was no way
to use a chromadb server.

# Fix
Added `client_settings` as Chroma attribute. 

# Usage

```
from chromadb.config import Settings
from langchain.vectorstores import Chroma

chroma_settings = Settings(chroma_api_impl="rest",
                            chroma_server_host="localhost",
                            chroma_server_http_port="80")

docsearch = Chroma.from_documents(chunks, embeddings, metadatas=metadatas, client_settings=chroma_settings, collection_name=COLLECTION_NAME)
```
1 year ago
Harrison Chase 7eba828e1b
Harrison/update regex (#1534)
Co-authored-by: Luis <57528712+LuisLechugaRuiz@users.noreply.github.com>
1 year ago
Harrison Chase 2a7215bc3b
Harrison/prompt issues (#1537) 1 year ago
Alpri Else 784d24a1d5
Support S3 Object keys with `/` in `S3FileLoader` (#1517)
Resolves https://github.com/hwchase17/langchain/issues/1510

### Problem
When loading S3 Objects with `/` in the object key (eg.
`folder/some-document.txt`) using `S3FileLoader`, the objects are
downloaded into a temporary directory and saved as a file.

This errors out when the parent directory does not exist within the
temporary directory.

See
https://github.com/hwchase17/langchain/issues/1510#issuecomment-1459583696
on how to reproduce this bug

### What this pr does
Creates parent directories based on object key. 

This also works with deeply nested keys:
`folder/subfolder/some-document.txt`
1 year ago
Harrison Chase c4a557bdd4
add concept of prompt collection (#1507) 1 year ago
Harrison Chase 7ade419a0e
allow passing of messages into prompt template (#1505) 1 year ago
Harrison Chase a4a2d79087
Harrison/rtd loader (#1513)
Co-authored-by: Youssef A. Abukwaik <yousseb@users.noreply.github.com>
1 year ago
Harrison Chase 8f21605d71
add return source docs (#1515) 1 year ago
Harrison Chase 064741db58
Harrison/fix text splitter (#1511)
Co-authored-by: ajaysolanky <ajsolanky@gmail.com>
Co-authored-by: Ajay Solanky <ajaysolanky@saw-l14668307kd.myfiosgateway.com>
1 year ago
Harrison Chase 3610ef2830
add fake embeddings class (#1503) 1 year ago
Ankush Gola 27104d4921
fix `ChatOpenAI.agenerate` (#1504) 1 year ago
Harrison Chase 8e6f599822
change to baselanguagemodel (#1496) 1 year ago
Harrison Chase f276bfad8e
Harrison/chat memory (#1495) 1 year ago
Harrison Chase 7bec461782
Harrison/memory refactor (#1478)
moves memory to own module, factors out common stuff
1 year ago
kahkeng df6865cd52
Allow no token limit for ChatGPT API (#1481)
The endpoint default is inf if we don't specify max_tokens, so unlike
regular completion API, we don't need to calculate this based on the
prompt.
1 year ago
Harrison Chase 0e21463f07
(rfc) chat models (#1424)
Co-authored-by: Ankush Gola <ankush.gola@gmail.com>
1 year ago
Juanky Soriano dec3750875
Change method to calculate number of tokens for OpenAIChat (#1457)
Solves https://github.com/hwchase17/langchain/issues/1412

Currently `OpenAIChat` inherits the way it calculates the number of
tokens, `get_num_token`, from `BaseLLM`.
In the other hand `OpenAI` inherits from `BaseOpenAI`. 

`BaseOpenAI` and `BaseLLM` uses different methodologies for doing this.
The first relies on `tiktoken` while the second on `GPT2TokenizerFast`.

The motivation of this PR is:

1. Bring consistency about the way of calculating number of tokens
`get_num_token` to the `OpenAI` family, regardless of `Chat` vs `non
Chat` scenarios.
2. Give preference to the `tiktoken` method as it's serverless friendly.
It doesn't require downloading models which might make it incompatible
with `readonly` filesystems.
1 year ago
Tim Asp 763f879536
fix always verbose on summarization checker (#1440) 1 year ago
Harrison Chase 63a5614d23
Harrison/simple memory (#1435)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
1 year ago
Harrison Chase a1b9dfc099
Harrison/similarity search chroma (#1434)
Co-authored-by: shibuiwilliam <shibuiyusuke@gmail.com>
1 year ago
Peng Qu 68ce68f290
Fix an unusual issue that occurs when using OpenAIChat for llm_math (#1410)
Fix an issue that occurs when using OpenAIChat for llm_math, refer to
the code style of the "Final Answer:" in Mrkl。 the reason is I found a
issue when I try OpenAIChat for llm_math, when I try the question in
Chinese, the model generate the format like "\n\nQuestion: What is the
square of 29?\nAnswer: 841", it translate the question first , then
answer. below is my snapshot:
<img width="945" alt="snapshot"
src="https://user-images.githubusercontent.com/82029664/222642193-10ecca77-db7b-4759-bc46-32a8f8ddc48f.png">
1 year ago
Kentaro Tanaka 6a4ee07e4f
Fix type hint of 'vectorstore_cls' arg in `SemanticSimilarityExampleSelector` (#1427)
Hello! Thank you for the amazing library you've created!

While following the tutorial at [the link(`Using an example
selector`)](https://langchain.readthedocs.io/en/latest/modules/prompts/examples/few_shot_examples.html#using-an-example-selector),
I noticed that passing Chroma as an argument to from_examples results in
a type hint error.

Error message(mypy):
```
Argument 3 to "from_examples" of "SemanticSimilarityExampleSelector" has incompatible type "Type[Chroma]"; expected "VectorStore"  [arg-type]mypy(error)
```

This pull request fixes the type hint and allows the VectorStore class
to be specified as an argument.
1 year ago
Tim Asp 23231d65a9
Add PyMuPDF PDF loader (#1426)
Different PDF libraries have different strengths and weaknesses. PyMuPDF
does a good job at extracting the most amount of content from the doc,
regardless of the source quality, extremely fast (especially compared to
Unstructured).

https://pymupdf.readthedocs.io/en/latest/index.html
1 year ago
blob42 3d54b05863
searx: add install instructions, update doc and notebooks (#1420)
- Added instructions on setting up self hosted searx
- Add notebook example with agent
- Use `localhost:8888` as example url to stay consistent since public
instances are not really usable.

Co-authored-by: blob42 <spike@w530>
1 year ago
Jon Luo 882f7964fb
fix sql misinterpretation of % in query (#1408)
% is being misinterpreted by sqlalchemy as parameter passing, so any
`LIKE 'asdf%'` will result in a value error with mysql, mariadb, and
maybe some others. This is one way to fix it - the alternative is to
simply double up %, like `LIKE 'asdf%%'` but this seemed cleaner in
terms of output.
Fixes #1383
1 year ago
Eugene Yurtsev a83a371069
Minor documentation update in initialize_agent (#1397)
Updating documentation in initialize_agent.

One thing that could benefit from further clarification is the
responsibility
breakdown by between an AgentExecutor vs. an Agent. The documentation
for an
AgentExecutor does not clarify that. From the class attributes, it
appears that
executor has access to the tools, while the agent is only aware of the
tool
names. Anyway, additional clarification would be beneficial on the
AgentExecutor class.
1 year ago
Nuno Campos 499e76b199
Allow the regular openai class to be used for ChatGPT models (#1393)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Kacper Łukawski 8947797250
Return Cohere embeddings as lists of floats (#1394)
This PR fixes the types returned by Cohere embeddings. Currently, Cohere
client returns instances of `cohere.embeddings.Embeddings`. Since the
transport layer relies on JSON, some numbers might be represented as
ints, not floats, which happens quite often. While that doesn't seem to
be an issue, it breaks some pydantic models if they require strict
floats.
1 year ago
Kacper Łukawski f032609f8d
Add `recursive` parameter to `DirectoryLoader` (#1389)
This PR allows loading a directory recursively.
1 year ago
Kacper Łukawski 9ac442624c
Add Qdrant named arguments (#1386)
This PR:
- Increases `qdrant-client` version to 1.0.4
- Introduces custom content and metadata keys (as requested in #1087)
- Moves all the `QdrantClient` parameters into the method parameters to
simplify code completion
1 year ago
Francisco Ingham 34abcd31b9
remove limit clause from prompt for compatibility with ms sql server (#1385)
For reference see:
8a35811556

Co-authored-by: Francisco Ingham <>
1 year ago
Ankush Gola fe30be6fba
add async and streaming support to `OpenAIChat` (#1378)
title says it all
1 year ago
Ryan Dao 59157b6891
Bug: Fix Python version validation in PythonAstREPLTool (#1373)
The current logic checks if the Python major version is < 8, which is
wrong. This checks if the major and minor version is < 3.9.
1 year ago
Harrison Chase e178008b75
Harrison/track token usage (#1382)
Co-authored-by: Zak King <zaking17@gmail.com>
1 year ago
Harrison Chase 1cd8996074
Harrison/summarizer chain (#1356)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
1 year ago
yakigac cfae03042d
Fix the openaichat example (#1377)
The example was wrong.
1 year ago
Harrison Chase 4b5e850361
chatgpt wrapper (#1367) 1 year ago
Christie Jacob edb3915ee7
typo in vectorstores (#1362) 1 year ago
Harrison Chase fe7dbecfe6
pandas and csv agents (#1353) 1 year ago
Ankush Gola 82baecc892
Add a SQL agent for interacting with SQL Databases and JSON Agent for interacting with large JSON blobs (#1150)
This PR adds 

* `ZeroShotAgent.as_sql_agent`, which returns an agent for interacting
with a sql database. This builds off of `SQLDatabaseChain`. The main
advantages are 1) answering general questions about the db, 2) access to
a tool for double checking queries, and 3) recovering from errors
* `ZeroShotAgent.as_json_agent` which returns an agent for interacting
with json blobs.
* Several examples in notebooks

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Jon Luo 35f1e8f569
separate columns by tabs instead of single space in sql sample rows (#1348)
Use tabs to separate columns instead of a single space - confusing when
there are spaces in a cell
1 year ago
kurehajime 6c629b54e6
Fixed arguments passed to InvalidTool.run(). (#1340)
[InvalidTool.run()](72ef69d1ba/langchain/agents/tools.py (L43))
returns "{arg}is not a valid tool, try another one.".
However, no function name is actually given in the argument.
This causes LLM to be stuck in a loop, unable to find the right tool.

This may resolve these Issues.
https://github.com/hwchase17/langchain/issues/998
https://github.com/hwchase17/langchain/issues/702
1 year ago