Commit Graph

1116 Commits

Author SHA1 Message Date
William FH
c9ae0c5808
Add lint_diff command (#2449)
It's helpful for developers to run the linter locally on just the
changed files.

This PR adds support for a `lint_diff` command.

Ruff is still run over the entire directory since it's very fast.
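
As a rough illustration of the idea (not the actual command added in this PR), the sketch below narrows a list of changed paths down to the Python files a linter could be pointed at. The function name `select_python_files` and the sample paths are hypothetical; in practice the list would come from something like `git diff --name-only`.

```python
from typing import Iterable, List


def select_python_files(changed_paths: Iterable[str]) -> List[str]:
    """Keep only the Python sources among the changed paths, preserving order."""
    return [path for path in changed_paths if path.endswith(".py")]


# Hypothetical output of `git diff --name-only`.
changed = ["langchain/agents/agent.py", "README.md", "tests/test_agent.py"]
print(select_python_files(changed))
```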
2023-04-05 09:34:24 -07:00
Harrison Chase
3d871853df
bump version to 132 (#2441) 2023-04-05 07:54:01 -07:00
Harrison Chase
00bc8df640
Harrison/tfidf retriever (#2440) 2023-04-05 07:36:49 -07:00
researchonly
a63cfad558
fixed typo Teplate -> Template (#2433)
fixed a typo in the documentation
2023-04-05 06:56:51 -07:00
Bill Chambers
f0d4f36219
Documentation Error - Typo in Docs - Update custom_mrkl_agent.ipynb (#2437)
Just a small typo in the documentation.
2023-04-05 06:56:39 -07:00
sergerdn
b410dc76aa
fix: elasticsearch (#2402)
- Create a new docker-compose file to start an Elasticsearch instance
for integration tests.
- Add new tests to `test_elasticsearch.py` to verify Elasticsearch
functionality.
- Include an optional group `test_integration` in the `pyproject.toml`
file. This group should contain dependencies for integration tests and
can be installed using the command `poetry install --with
test_integration`. Any new dependencies should be added by running
`poetry add some_new_deps --group "test_integration"`.

Note:
The new tests run in live mode, which involves end-to-end testing of the
OpenAI API. In the future, adding `pytest-vcr` to record and replay all
API requests would be a nice feature for the testing process. More info:
https://pytest-vcr.readthedocs.io/en/latest/

Fixes https://github.com/hwchase17/langchain/issues/2386
2023-04-05 06:51:32 -07:00
Ankush Gola
4d730a9bbc
improve AsyncCallbackManager (#2410) 2023-04-05 09:31:42 +02:00
Harrison Chase
af7f20fa42
Harrison/elastic search (#2419) 2023-04-04 21:29:06 -07:00
Adam Gutglick
659c67e896
Don't create a new Pinecone index if doesn't exist (#2414)
In the case no pinecone index is specified, or a wrong one is, do not
create a new one. Creating new indexes can cause unexpected costs to
users, and some code paths could cause a new one to be created on each
invocation.
This PR solves #2413.
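
A minimal sketch of the guard described above, assuming the caller already has the names of the existing indexes. The helper `resolve_index_name` is hypothetical and does not call the Pinecone client; it only shows the "fail instead of create" behaviour.

```python
from typing import Collection, Optional


def resolve_index_name(index_name: Optional[str], existing: Collection[str]) -> str:
    """Return index_name if it already exists; never create one implicitly."""
    if index_name is None:
        raise ValueError("No index name given; refusing to create a new index.")
    if index_name not in existing:
        raise ValueError(
            f"Index {index_name!r} not found among {sorted(existing)}; "
            "create it explicitly first."
        )
    return index_name
```

Raising early keeps index creation an explicit, deliberate step, so a typo in the name cannot silently incur costs.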
2023-04-04 20:42:27 -07:00
Andrei
e519a81a05
Update LlamaCpp parameters (#2411)
Add `n_batch` and `last_n_tokens_size` parameters to the LlamaCpp class.
These parameters (especially `n_batch`) significantly affect performance.
There's also a `verbose` flag that prints system timings on the `Llama`
class but I wasn't sure where to add this as it conflicts with (should
be pulled from?) the LLM base class.
2023-04-04 19:52:33 -07:00
jerwelborn
b026a62bc4
hierarchical planning agent for multi-step queries against larger openapi specs (#2170)
The specs used in chat-gpt plugins have only a few endpoints and have
unrealistically small specifications. By contrast, a spec like spotify's
has 60+ endpoints and comprises 100k+ tokens.

Here are some impressive traces from gpt-4 that string together
non-trivial sequences of API calls. As noted in `planner.py`, gpt-3 is
not as robust but can be improved with i) better retry, self-reflect,
etc. logic and ii) better few-shots iii) etc. This PR's just a first
attempt probing a few different directions that eventually can be made
more core.
 

`make me a playlist with songs from kind of blue. call it machine
blues.`

```
> Entering new AgentExecutor chain...
Action: api_planner
Action Input: I need to find the right API calls to create a playlist with songs from Kind of Blue and name it Machine Blues
Observation: 1. GET /search to find the album ID for "Kind of Blue".
2. GET /albums/{id}/tracks to get the tracks from the "Kind of Blue" album.
3. GET /me to get the current user's ID.
4. POST /users/{user_id}/playlists to create a new playlist named "Machine Blues" for the current user.
5. POST /playlists/{playlist_id}/tracks to add the tracks from "Kind of Blue" to the newly created "Machine Blues" playlist.
Thought:I have a plan to create the playlist. Now, I will execute the API calls.
Action: api_controller
Action Input: 1. GET /search to find the album ID for "Kind of Blue".
2. GET /albums/{id}/tracks to get the tracks from the "Kind of Blue" album.
3. GET /me to get the current user's ID.
4. POST /users/{user_id}/playlists to create a new playlist named "Machine Blues" for the current user.
5. POST /playlists/{playlist_id}/tracks to add the tracks from "Kind of Blue" to the newly created "Machine Blues" playlist.

> Entering new AgentExecutor chain...
Action: requests_get
Action Input: {"url": "https://api.spotify.com/v1/search?q=Kind%20of%20Blue&type=album", "output_instructions": "Extract the id of the first album in the search results"}
Observation: 1weenld61qoidwYuZ1GESA
Thought:Action: requests_get
Action Input: {"url": "https://api.spotify.com/v1/albums/1weenld61qoidwYuZ1GESA/tracks", "output_instructions": "Extract the ids of all the tracks in the album"}
Observation: ["7q3kkfAVpmcZ8g6JUThi3o"]
Thought:Action: requests_get
Action Input: {"url": "https://api.spotify.com/v1/me", "output_instructions": "Extract the id of the current user"}
Observation: 22rhrz4m4kvpxlsb5hezokzwi
Thought:Action: requests_post
Action Input: {"url": "https://api.spotify.com/v1/users/22rhrz4m4kvpxlsb5hezokzwi/playlists", "data": {"name": "Machine Blues"}, "output_instructions": "Extract the id of the newly created playlist"}
Observation: 48YP9TMcEtFu9aGN8n10lg
Thought:Action: requests_post
Action Input: {"url": "https://api.spotify.com/v1/playlists/48YP9TMcEtFu9aGN8n10lg/tracks", "data": {"uris": ["spotify:track:7q3kkfAVpmcZ8g6JUThi3o"]}, "output_instructions": "Confirm that the tracks were added to the playlist"}
Observation: The tracks were added to the playlist. The snapshot_id is "Miw4NTdmMWUxOGU5YWMxMzVmYmE3ZWE5MWZlYWNkMTc2NGVmNTI1ZjY5".
Thought:I am finished executing the plan.
Final Answer: The tracks from the "Kind of Blue" album have been added to the newly created "Machine Blues" playlist. The playlist ID is 48YP9TMcEtFu9aGN8n10lg.

> Finished chain.

Observation: The tracks from the "Kind of Blue" album have been added to the newly created "Machine Blues" playlist. The playlist ID is 48YP9TMcEtFu9aGN8n10lg.
Thought:I am finished executing the plan and have created the playlist with songs from Kind of Blue, named Machine Blues.
Final Answer: I have created a playlist called "Machine Blues" with songs from the "Kind of Blue" album. The playlist ID is 48YP9TMcEtFu9aGN8n10lg.

> Finished chain.
```

or

`give me a song in the style of tobe nwige`

```
> Entering new AgentExecutor chain...
Action: api_planner
Action Input: I need to find the right API calls to get a song in the style of Tobe Nwigwe

Observation: 1. GET /search to find the artist ID for Tobe Nwigwe.
2. GET /artists/{id}/related-artists to find similar artists to Tobe Nwigwe.
3. Pick one of the related artists and use their artist ID in the next step.
4. GET /artists/{id}/top-tracks to get the top tracks of the chosen related artist.
Thought:


I'm ready to execute the API calls.
Action: api_controller
Action Input: 1. GET /search to find the artist ID for Tobe Nwigwe.
2. GET /artists/{id}/related-artists to find similar artists to Tobe Nwigwe.
3. Pick one of the related artists and use their artist ID in the next step.
4. GET /artists/{id}/top-tracks to get the top tracks of the chosen related artist.

> Entering new AgentExecutor chain...
Action: requests_get
Action Input: {"url": "https://api.spotify.com/v1/search?q=Tobe%20Nwigwe&type=artist", "output_instructions": "Extract the artist id for Tobe Nwigwe"}
Observation: 3Qh89pgJeZq6d8uM1bTot3
Thought:Action: requests_get
Action Input: {"url": "https://api.spotify.com/v1/artists/3Qh89pgJeZq6d8uM1bTot3/related-artists", "output_instructions": "Extract the ids and names of the related artists"}
Observation: [
  {
    "id": "75WcpJKWXBV3o3cfluWapK",
    "name": "Lute"
  },
  {
    "id": "5REHfa3YDopGOzrxwTsPvH",
    "name": "Deante' Hitchcock"
  },
  {
    "id": "6NL31G53xThQXkFs7lDpL5",
    "name": "Rapsody"
  },
  {
    "id": "5MbNzCW3qokGyoo9giHA3V",
    "name": "EARTHGANG"
  },
  {
    "id": "7Hjbimq43OgxaBRpFXic4x",
    "name": "Saba"
  },
  {
    "id": "1ewyVtTZBqFYWIcepopRhp",
    "name": "Mick Jenkins"
  }
]
Thought:Action: requests_get
Action Input: {"url": "https://api.spotify.com/v1/artists/75WcpJKWXBV3o3cfluWapK/top-tracks?country=US", "output_instructions": "Extract the ids and names of the top tracks"}
Observation: [
  {
    "id": "6MF4tRr5lU8qok8IKaFOBE",
    "name": "Under The Sun (with J. Cole & Lute feat. DaBaby)"
  }
]
Thought:I am finished executing the plan.

Final Answer: The top track of the related artist Lute is "Under The Sun (with J. Cole & Lute feat. DaBaby)" with the track ID "6MF4tRr5lU8qok8IKaFOBE".

> Finished chain.

Observation: The top track of the related artist Lute is "Under The Sun (with J. Cole & Lute feat. DaBaby)" with the track ID "6MF4tRr5lU8qok8IKaFOBE".
Thought:I am finished executing the plan and have the information the user asked for.
Final Answer: The song "Under The Sun (with J. Cole & Lute feat. DaBaby)" by Lute is in the style of Tobe Nwigwe.

> Finished chain.
```
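
The two-level structure visible in the traces above (a planner that produces ordered steps, then a controller that executes them) can be sketched roughly as follows. Everything here is a hypothetical stand-in: `run_plan`, `fake_planner`, and `fake_executor` are illustrative names, and the real planner and controller are LLM-backed agents rather than plain functions.

```python
from typing import Callable, List


def run_plan(
    query: str,
    planner: Callable[[str], List[str]],
    executor: Callable[[str], str],
) -> List[str]:
    """Ask the planner for ordered steps, then execute them one at a time."""
    steps = planner(query)
    return [executor(step) for step in steps]


# Hypothetical stand-ins for the LLM-backed planner and controller.
def fake_planner(query: str) -> List[str]:
    return ["GET /search", "GET /albums/{id}/tracks"]


def fake_executor(step: str) -> str:
    return f"done: {step}"


print(run_plan("make me a playlist", fake_planner, fake_executor))
```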

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-04-04 19:49:42 -07:00
jerwelborn
d6d6f322a9
Fix requests wrapper refactor (#2417)
https://github.com/hwchase17/langchain/pull/2367
2023-04-04 18:22:35 -07:00
Harrison Chase
41832042cc
Harrison/pinecone hybrid (#2405) 2023-04-04 14:09:57 -07:00
Harrison Chase
2b975de94d
add metal retriever (#2244) 2023-04-04 12:17:13 -07:00
Harrison Chase
1f88b11c99
replicate cleanup (#2394) 2023-04-04 12:15:03 -07:00
Harrison Chase
f5da9a5161 cr 2023-04-04 07:26:47 -07:00
Harrison Chase
8a4709582f cr 2023-04-04 07:25:28 -07:00
Harrison Chase
de7afc52a9 cr 2023-04-04 07:23:53 -07:00
Harrison Chase
c7b083ab56
bump version to 131 (#2391) 2023-04-04 07:21:50 -07:00
longgui0318
dc3ac8082b
Revision of "elasticearch" spelling problem (#2378)
Revision of "elasticearch" spelling problem

Co-authored-by: gubei <>
2023-04-04 06:59:50 -07:00
Harrison Chase
0a9f04bad9
Harrison/gpt4all (#2366)
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-04-04 06:49:17 -07:00
Harrison Chase
d17dea30ce
Harrison/sql views (#2376)
Co-authored-by: Wadih Pazos <wadih@wpazos.com>
Co-authored-by: Wadih Pazos Sr <wadih@esgenio.com>
2023-04-04 06:48:45 -07:00
Harrison Chase
e90d007db3
Harrison/msg files (#2375)
Co-authored-by: Sahil Masand <masand.sahil@gmail.com>
Co-authored-by: Sahil Masand <masands@cbh.com.au>
2023-04-04 06:48:34 -07:00
Kacper Łukawski
585f60a5aa
Qdrant update to 1.1.1 & docs polishing (#2388)
This PR updates Qdrant to 1.1.1 and introduces local mode, so there is
no need to spin up the Qdrant server. At the same time, the Qdrant
example notebooks were also updated, covering more cases and answering
some commonly asked questions. All of Qdrant's integration tests were
switched to local mode, so no Docker container is required to launch
them.
2023-04-04 06:48:21 -07:00
sergerdn
90973c10b1
fix: tests with Dockerfile (#2382)
Update the Dockerfile to use the `$POETRY_HOME` argument to set the
Poetry home directory instead of adding Poetry to the PATH environment
variable.

Add instructions to the `CONTRIBUTING.md` file on how to run tests with
Docker.

Closes https://github.com/hwchase17/langchain/issues/2324
2023-04-04 06:47:19 -07:00
Harrison Chase
fe1eb8ca5f
requests wrapper (#2367) 2023-04-03 21:57:19 -07:00
Shrined
10dab053b4
Add Enum for agent types (#2321)
This pull request adds an enum class for the various types of agents
used in the project, located in the `agent_types.py` file. Currently,
the project is using hardcoded strings for the initialization of these
agents, which can lead to errors and make the code harder to maintain.
With the introduction of the new enums, the code will be more readable
and less error-prone.

The new enum members include:

- ZERO_SHOT_REACT_DESCRIPTION
- REACT_DOCSTORE
- SELF_ASK_WITH_SEARCH
- CONVERSATIONAL_REACT_DESCRIPTION
- CHAT_ZERO_SHOT_REACT_DESCRIPTION
- CHAT_CONVERSATIONAL_REACT_DESCRIPTION

In this PR, I have also replaced the hardcoded strings with the
appropriate enum members throughout the codebase, ensuring a smooth
transition to the new approach.
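
For illustration, an enum along these lines might look like the sketch below. The member names come from the list above; the class name `AgentType` and the string values are assumptions made for this example, not confirmed by the PR description.

```python
from enum import Enum


class AgentType(str, Enum):
    """Agent identifiers; the string values are assumed for this sketch."""

    ZERO_SHOT_REACT_DESCRIPTION = "zero-shot-react-description"
    REACT_DOCSTORE = "react-docstore"
    SELF_ASK_WITH_SEARCH = "self-ask-with-search"
    CONVERSATIONAL_REACT_DESCRIPTION = "conversational-react-description"
    CHAT_ZERO_SHOT_REACT_DESCRIPTION = "chat-zero-shot-react-description"
    CHAT_CONVERSATIONAL_REACT_DESCRIPTION = "chat-conversational-react-description"


# A misspelled member name fails loudly with AttributeError,
# whereas a misspelled string literal would only fail later at lookup time.
print(AgentType.REACT_DOCSTORE.value)
```

Mixing in `str` lets members compare equal to their plain string values, which can ease a migration away from hardcoded strings.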
2023-04-03 21:56:20 -07:00
Zach Jones
c969a779c9
Fix: Pass along kwargs when creating a sql agent (#2350)
Currently, `agent_toolkits.sql.create_sql_agent()` passes kwargs to the
`ZeroShotAgent` that it creates but not to `AgentExecutor` that it also
creates. This prevents the caller from providing some useful arguments
like `max_iterations` and `early_stopping_method`

This PR changes `create_sql_agent` so that it passes kwargs to both
constructors.
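
The pattern can be sketched with stub classes, as below. `StubAgent`, `StubExecutor`, and `create_agent_executor` are hypothetical stand-ins, not the real `ZeroShotAgent`, `AgentExecutor`, or `create_sql_agent`; the sketch only shows the same kwargs reaching both constructors.

```python
from typing import Any


class StubAgent:
    """Stand-in for the agent; records the kwargs it was given."""

    def __init__(self, **kwargs: Any) -> None:
        self.kwargs = kwargs


class StubExecutor:
    """Stand-in for the executor; records the kwargs it was given."""

    def __init__(self, agent: StubAgent, **kwargs: Any) -> None:
        self.agent = agent
        self.kwargs = kwargs


def create_agent_executor(**kwargs: Any) -> StubExecutor:
    """Forward the caller's kwargs to both constructors."""
    agent = StubAgent(**kwargs)
    return StubExecutor(agent=agent, **kwargs)


executor = create_agent_executor(max_iterations=3, early_stopping_method="generate")
print(executor.kwargs)
```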

---------

Co-authored-by: Zachary Jones <zjones@zetaglobal.com>
2023-04-03 21:50:51 -07:00
andrewmelis
7ed8d00bba
Remove extra word in CONTRIBUTING.md (#2370)
"via by a developer" -> "by a developer"

---

Thank you for all your hard work!
2023-04-03 21:48:58 -07:00
Yunlei Liu
9cceb4a02a
Llama.cpp doc update: fix ipynb path (#2364) 2023-04-03 16:59:52 -07:00
Mandy Gu
c841b2cc51
Expand requests tool into individual methods for load_tools (#2254)
### Motivation / Context

When exploring `load_tools(["requests"] )`, I would have expected all
request method tools to be imported instead of just `RequestsGetTool`.

### Changes

Break `_get_requests` into multiple functions by request method. Each
function returns the `BaseTool` for that particular request method.

In `load_tools`, if the tool name "requests_all" is encountered, we
replace it with all `_BASE_TOOLS` that start with `requests_`.

This way, `load_tools(["requests"])` returns:
- RequestsGetTool
- RequestsPostTool
- RequestsPatchTool
- RequestsPutTool
- RequestsDeleteTool
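
The expansion step can be sketched as follows. The registry below is a placeholder that maps names to class-name strings rather than to real tool factories, and `expand_tool_names` is a hypothetical helper, not the actual `load_tools` code.

```python
from typing import Dict, List

# Placeholder registry; the real `_BASE_TOOLS` maps names to tool factories.
_BASE_TOOLS: Dict[str, str] = {
    "requests_get": "RequestsGetTool",
    "requests_post": "RequestsPostTool",
    "requests_patch": "RequestsPatchTool",
    "requests_put": "RequestsPutTool",
    "requests_delete": "RequestsDeleteTool",
    "python_repl": "PythonREPLTool",
}


def expand_tool_names(tool_names: List[str]) -> List[str]:
    """Replace "requests_all" with every registered name starting with "requests_"."""
    expanded: List[str] = []
    for name in tool_names:
        if name == "requests_all":
            expanded.extend(k for k in _BASE_TOOLS if k.startswith("requests_"))
        else:
            expanded.append(name)
    return expanded


print(expand_tool_names(["requests_all", "python_repl"]))
```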
2023-04-03 15:59:52 -07:00
blackaxe21
28cedab1a4
Update agent_vectorstore.ipynb (#2358)
Hi, I am learning LangChain and I read that VectorDBQA was changed to
RetrievalQA. I thought I could help by making the change. If I am wrong,
could you give me some feedback? I am still learning.

source:
https://blog.langchain.dev/retrieval/#:~:text=Changed%20all%20our,a%20chat%20model
2023-04-03 15:56:59 -07:00
Harrison Chase
cb5c5d1a4d
Harrison/base language model (#2357)
Co-authored-by: Darien Schettler <50381286+darien-schettler@users.noreply.github.com>
Co-authored-by: Darien Schettler <darien_schettler@hotmail.com>
2023-04-03 15:27:57 -07:00
MohammedAlhajji
fd0d631f39
🐛 fix: missing kwargs in from_agent_and_tools in dataframe agent (#2285)
Hello! 
I've noticed a bug in `create_pandas_dataframe_agent`. When calling it
with argument `return_intermediate_steps=True`, it doesn't return the
intermediate step. I think the issue is that `kwargs` was not passed
where it needed to be passed. It should be passed into
`AgentExecutor.from_agent_and_tools`

Please correct me if my solution isn't appropriate and I will fix with
the appropriate approach.

Co-authored-by: alhajji <m.alhajji@drahim.sa>
2023-04-03 14:26:03 -07:00
Bhanu K
3fb4997ad8
Persist database regardless of notebook or script context (#2351)
`persist()` is required even if it's invoked in a script.

Without this, an error is thrown:

```
chromadb.errors.NoIndexException: Index is not initialized
```
2023-04-03 14:21:17 -07:00
Gerard Hernandez
cc50a4579e
Fix spelling and grammar in multi_input_tool.ipynb (#2337)
Changes:
- Corrected the title to use hyphens instead of spaces.
- Fixed a typo in the second paragraph where "therefor" was changed to
"Therefore".
- Added a hyphen between "comma" and "separated" in the last paragraph.

File link:
[multi_input_tool.ipynb](https://github.com/hwchase17/langchain/blob/master/docs/modules/agents/tools/multi_input_tool.ipynb)
2023-04-03 14:13:48 -07:00
videowala
00c39ea409
Fixed a typo Teplate > Template (#2348)
Nothing special. Just a simple typo fix.
2023-04-03 14:13:25 -07:00
sergerdn
870cd33701
fix: testing in Windows and add missing dev dependency (#2340)
This change addresses two issues.

First, we add `setuptools` to the dev dependencies in order to debug
tests locally with an IDE, especially with PyCharm. All dev
dependencies should be installed with `poetry install --extras "dev"`.

Second, we use PurePosixPath instead of Path for URL paths to fix issues
with testing in Windows. This ensures that forward slashes are used as
the path separator regardless of the operating system.
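
A small sketch of why this helps: `PurePosixPath` always joins with forward slashes, whereas `Path` resolves to a Windows flavour on Windows and would render backslashes. The helper name `join_url_path` is hypothetical.

```python
from pathlib import PurePosixPath


def join_url_path(base_path: str, *parts: str) -> str:
    """Join URL path segments with forward slashes on every operating system."""
    return str(PurePosixPath(base_path).joinpath(*parts))


print(join_url_path("/v1", "albums", "123", "tracks"))  # /v1/albums/123/tracks
```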

Closes https://github.com/hwchase17/langchain/issues/2334
2023-04-03 14:11:18 -07:00
Mike Lambert
393cd3c796
Bump anthropic version (#2352)
Improves async support (and a few other bug fixes I'd prefer folks be
forced to grab)
2023-04-03 13:35:50 -07:00
Harrison Chase
347ea24524
bump version to 130 (#2343) 2023-04-03 09:01:46 -07:00
Harrison Chase
6c13003dd3 cr 2023-04-03 08:44:50 -07:00
Harrison Chase
b21c485ad5
custom agent docs (#2342) 2023-04-03 08:35:48 -07:00
Harrison Chase
d85f57ef9c
Harrison/llama (#2314)
Co-authored-by: RJ Adriaansen <adriaansen@eshcc.eur.nl>
2023-04-02 14:57:45 -07:00
Frederick Ros
595ebe1796
Fixed a typo in an Error Message of SerpAPI (#2313) 2023-04-02 14:57:34 -07:00
DvirDukhan
3b75b004fc
fixed index name error found at redis new vector test (#2311)
This PR fixes a logic error in the Redis VectorStore class.
Creating a Redis vector store via `from_texts` creates a 1:1 mapping between
the object and its respective index, created in the function. The index
will index only documents adhering to the `doc:{index_name}` prefix.
Calling `add_texts` should use the same prefix, unless stated otherwise
in `keys` dictionary, and not create a new random uuid.
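
A rough sketch of the intended key scheme, under two assumptions not spelled out above: that generated keys have the form `doc:{index_name}:{id}`, and that caller-supplied `keys` arrive as a list. The helper `make_keys` is hypothetical.

```python
import uuid
from typing import List, Optional


def make_keys(index_name: str, n_texts: int, keys: Optional[List[str]] = None) -> List[str]:
    """Use caller-supplied keys if given; otherwise derive them from the index prefix."""
    if keys is not None:
        return list(keys)
    prefix = f"doc:{index_name}"  # the prefix the index is built to match
    return [f"{prefix}:{uuid.uuid4().hex}" for _ in range(n_texts)]


print(make_keys("my_index", 2))
```

Because every generated key shares the index's prefix, documents added later stay visible to the same index.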
2023-04-02 14:47:08 -07:00
Alexander Weichart
3a2782053b
feat: category support for SearxSearchWrapper (#2271)
Added an optional parameter "categories" to specify the active search
categories.
API: https://docs.searxng.org/dev/search_api.html
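
As an illustration, the search API linked above accepts `categories` as a comma-separated list, so the optional parameter could be turned into query parameters roughly like this. The helper `build_search_params` is hypothetical and is not the wrapper's actual code.

```python
from typing import Dict, List, Optional
from urllib.parse import urlencode


def build_search_params(query: str, categories: Optional[List[str]] = None) -> Dict[str, str]:
    """Build query parameters, adding `categories` only when some are given."""
    params = {"q": query, "format": "json"}
    if categories:
        params["categories"] = ",".join(categories)
    return params


print(urlencode(build_search_params("langchain", ["science", "it"])))
```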
2023-04-02 14:05:21 -07:00
Kevin Huang
e4cfaa5680
Introduces SeleniumURLLoader for JavaScript-Dependent Web Page Data Retrieval (#2291)
### Summary
This PR introduces a `SeleniumURLLoader` which, similar to
`UnstructuredURLLoader`, loads data from URLs. However, it utilizes
`selenium` to fetch page content, enabling it to work with
JavaScript-rendered pages. The `unstructured` library is also employed
for loading the HTML content.

### Testing
```bash
pip install selenium
pip install unstructured
```

```python
from langchain.document_loaders import SeleniumURLLoader

urls = [
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "https://goo.gl/maps/NDSHwePEyaHMFGwh8"
]

loader = SeleniumURLLoader(urls=urls)
data = loader.load()
```
2023-04-02 14:05:00 -07:00
Kenneth Leung
00d3ec5ed8
Reduce number of documents to return for Pinecone (#2299)
Minor change: Currently, Pinecone is returning 5 documents instead of
the 4 seen in other vectorstores and in the comments of this Pinecone
script itself. Adjusted it from 5 to 4.
2023-04-02 14:04:23 -07:00
Harrison Chase
fe572a5a0d
chat model example (#2310) 2023-04-02 14:04:09 -07:00
akmhmgc
94b2f536f3
Modify output for wikipedia api wrapper (#2287)
## Description
Thanks for the quick maintenance of this great repository!
I modified the Wikipedia API wrapper.

## Details
- Add output for missing search results
- Add tests
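
A minimal sketch of the missing-results behaviour, assuming the wrapper returns a single string. The helper `format_results` and the exact fallback message are placeholders for illustration, not taken from the PR.

```python
from typing import List


def format_results(summaries: List[str]) -> str:
    """Join page summaries, or return an explicit message when there are none."""
    if not summaries:
        # Placeholder message; the wording used by the wrapper may differ.
        return "No good Wikipedia Search Result was found"
    return "\n\n".join(summaries)


print(format_results([]))
```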
2023-04-02 14:00:27 -07:00