Commit Graph

154 Commits (5ba1c7b6018837a2cbf61ae8e7f45c8f1e3e8f72)

Author SHA1 Message Date
Enrico Shippole 4dbea7e9e4 Add Writer, Banana, Modal, StochasticAI (#1270)
Add LLM wrappers and examples for Banana, Writer, Modal, Stochastic AI

Added rigid json format for Banana and Modal
1 year ago
Dennis Antela Martinez 985f36eb3f add aleph alpha llm (#1207)
Integrate Aleph Alpha's client into Langchain to provide access to the
luminous models - more info on latest benchmarks here:
https://www.aleph-alpha.com/luminous-performance-benchmarks
1 year ago
Harrison Chase 5f3550437f rfc: callback changes (#1165)
conceptually, no reason a tool should know what an "agent action" is

unless any objections, can change in all callback handlers
1 year ago
Harrison Chase 529df2a2dd move serpapi wrapper (#1199)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
1 year ago
Naveen Tatikonda 8e2152c1d6 Add Support for OpenSearch Vector database (#1191)
### Description
This PR adds a wrapper which adds support for the OpenSearch vector
database. Using opensearch-py client we are ingesting the embeddings of
given text into opensearch cluster using Bulk API. We can perform the
`similarity_search` on the index using the 3 popular searching methods
of OpenSearch k-NN plugin:

- `Approximate k-NN Search` use approximate nearest neighbor (ANN)
algorithms from the [nmslib](https://github.com/nmslib/nmslib),
[faiss](https://github.com/facebookresearch/faiss), and
[Lucene](https://lucene.apache.org/) libraries to power k-NN search.
- `Script Scoring` extends OpenSearch’s script scoring functionality to
execute a brute force, exact k-NN search.
- `Painless Scripting` adds the distance functions as painless
extensions that can be used in more complex combinations. Also, supports
brute force, exact k-NN search like Script Scoring.

### Issues Resolved 
https://github.com/hwchase17/langchain/issues/1054

---------

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
1 year ago
Andrew White f0f9f276bd Allow k to be higher than doc size in max_marginal_relevance_search (#1187)
Fixes issue #1186. For some reason, #1117 didn't seem to fix it.
1 year ago
Harrison Chase 9d6d8f85da
Harrison/self hosted runhouse (#1154)
Co-authored-by: Donny Greenberg <dongreenberg2@gmail.com>
Co-authored-by: John Dagdelen <jdagdelen@users.noreply.github.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
Co-authored-by: Andrew White <white.d.andrew@gmail.com>
Co-authored-by: Peng Qu <82029664+pengqu123@users.noreply.github.com>
Co-authored-by: Matt Robinson <mthw.wm.robinson@gmail.com>
Co-authored-by: jeff <tangj1122@gmail.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MacBook-Pro.local>
Co-authored-by: zanderchase <zander@unfold.ag>
Co-authored-by: Charles Frye <cfrye59@gmail.com>
Co-authored-by: zanderchase <zanderchase@gmail.com>
Co-authored-by: Shahriar Tajbakhsh <sh.tajbakhsh@gmail.com>
Co-authored-by: Stefan Keselj <skeselj@princeton.edu>
Co-authored-by: Francisco Ingham <fpingham@gmail.com>
Co-authored-by: Dhruv Anand <105786647+dhruv-anand-aintech@users.noreply.github.com>
Co-authored-by: cragwolfe <cragcw@gmail.com>
Co-authored-by: Anton Troynikov <atroyn@users.noreply.github.com>
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Oliver Klingefjord <oliver@klingefjord.com>
Co-authored-by: blob42 <contact@blob42.xyz>
Co-authored-by: blob42 <spike@w530>
Co-authored-by: Enrico Shippole <henryshippole@gmail.com>
Co-authored-by: Ibis Prevedello <ibiscp@gmail.com>
Co-authored-by: jped <jonathanped@gmail.com>
Co-authored-by: Justin Torre <justintorre75@gmail.com>
Co-authored-by: Ivan Vendrov <ivan@anthropic.com>
Co-authored-by: Sasmitha Manathunga <70096033+mmz-001@users.noreply.github.com>
Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>
Co-authored-by: Matt Robinson <mrobinson@unstructuredai.io>
Co-authored-by: Jeff Huber <jeffchuber@gmail.com>
Co-authored-by: Akshay <64036106+akshayvkt@users.noreply.github.com>
Co-authored-by: Andrew Huang <jhuang16888@gmail.com>
Co-authored-by: rogerserper <124558887+rogerserper@users.noreply.github.com>
Co-authored-by: seanaedmiston <seane999@gmail.com>
Co-authored-by: Hasegawa Yuya <52068175+Hase-U@users.noreply.github.com>
Co-authored-by: Ivan Vendrov <ivendrov@gmail.com>
Co-authored-by: Chen Wu (吴尘) <henrychenwu@cmu.edu>
Co-authored-by: Dennis Antela Martinez <dennis.antela@gmail.com>
Co-authored-by: Maxime Vidal <max.vidal@hotmail.fr>
Co-authored-by: Rishabh Raizada <110235735+rishabh-ti@users.noreply.github.com>
1 year ago
CG80499 af8f5c1a49
Added constitutional chain. (#1147)
- Added self-critique constitutional chain based on this
[paper](https://www.anthropic.com/constitutional.pdf).
1 year ago
Ankush Gola 7b5e160d28
Make Tools own model, add ToolKit Concept (#1095)
Follow-up of @hinthornw's PR:

- Migrate the Tool abstraction to a separate file (`BaseTool`).
- `Tool` implementation of `BaseTool` takes in function and coroutine to
more easily maintain backwards compatibility
- Add a Toolkit abstraction that can own the generation of tools around
a shared concept or state

---------

Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Francisco Ingham <fpingham@gmail.com>
Co-authored-by: Dhruv Anand <105786647+dhruv-anand-aintech@users.noreply.github.com>
Co-authored-by: cragwolfe <cragcw@gmail.com>
Co-authored-by: Anton Troynikov <atroyn@users.noreply.github.com>
Co-authored-by: Oliver Klingefjord <oliver@klingefjord.com>
Co-authored-by: William Fu-Hinthorn <whinthorn@Williams-MBP-3.attlocal.net>
Co-authored-by: Bruno Bornsztein <bruno.bornsztein@gmail.com>
1 year ago
Francisco Ingham 3f29742adc
Sql alchemy commands used in table info (#1135)
This approach has several advantages:

* it improves the readability of the code
* removes incompatibilities between SQL dialects
* fixes a bug with `datetime` values in rows and `ast.literal_eval`

Huge thanks and credits to @jzluo for finding the weaknesses in the
current approach and for the thoughtful discussion on the best way to
implement this.

---------

Co-authored-by: Francisco Ingham <>
Co-authored-by: Jon Luo <20971593+jzluo@users.noreply.github.com>
1 year ago
Noah Gundotra 8c5fbab72d
[Integration Tests] Cast fake embeddings to ALL float values (#1102)
Pydantic validation breaks tests for example (`test_qdrant.py`) because
fake embeddings contain an integer.

This PR casts the embeddings array to all floats.

Now the `qdrant` test passes, `poetry run pytest
tests/integration_tests/vectorstores/test_qdrant.py`
1 year ago
yakigac 1ed708391e
Fix a bug that shows "KeyError 'items'" (#1118)
Fix KeyError 'items' when no result found.

## Problem

When no result found for a query, google search crashed with `KeyError
'items'`.

## Solution

I added a check for an empty response before accessing the 'items' key.
It will handle the case correctly.

## Other

my twitter: yakigac
(I don't mind even if you don't mention me for this PR. But just because
last time my real name was shout out :) )
1 year ago
Harrison Chase 5e10e19bfe
Harrison/align table (#1081)
Co-authored-by: Francisco Ingham <fpingham@gmail.com>
1 year ago
Hasegawa Yuya e08961ab25
Fixed openai embeddings to be safe by batching them based on token size calculation. (#991)
I modified the logic of the batch calculation for embedding according to
this cookbook

https://github.com/openai/openai-cookbook/blob/main/examples/Embedding_long_inputs.ipynb
1 year ago
seanaedmiston f0a258555b
Support similarity search by vector (in FAISS) (#961)
Alternate implementation to PR #960 Again - only FAISS is implemented.
If accepted can add this to other vectorstores or leave as
NotImplemented? Suggestions welcome...
1 year ago
rogerserper e46cd3b7db
Google Search API integration with serper.dev (wrapper, tests, docs, … (#909)
Adds Google Search integration with [Serper](https://serper.dev) a
low-cost alternative to SerpAPI (10x cheaper + generous free tier).
Includes documentation, tests and examples. Hopefully I am not missing
anything.

Developers can sign up for a free account at
[serper.dev](https://serper.dev) and obtain an api key.

## Usage

```python
from langchain.utilities import GoogleSerperAPIWrapper
from langchain.llms.openai import OpenAI
from langchain.agents import initialize_agent, Tool

import os
os.environ["SERPER_API_KEY"] = ""
os.environ['OPENAI_API_KEY'] = ""

llm = OpenAI(temperature=0)
search = GoogleSerperAPIWrapper()
tools = [
    Tool(
        name="Intermediate Answer",
        func=search.run
    )
]

self_ask_with_search = initialize_agent(tools, llm, agent="self-ask-with-search", verbose=True)
self_ask_with_search.run("What is the hometown of the reigning men's U.S. Open champion?")
```

### Output
```
Entering new AgentExecutor chain...
 Yes.
Follow up: Who is the reigning men's U.S. Open champion?
Intermediate answer: Current champions Carlos Alcaraz, 2022 men's singles champion.
Follow up: Where is Carlos Alcaraz from?
Intermediate answer: El Palmar, Spain
So the final answer is: El Palmar, Spain

> Finished chain.

'El Palmar, Spain'
```
1 year ago
Ankush Gola caa8e4742e
Enable streaming for OpenAI LLM (#986)
* Support a callback `on_llm_new_token` that users can implement when
`OpenAI.streaming` is set to `True`
1 year ago
Harrison Chase 88bebb4caa
Harrison/llm integrations (#1039)
Co-authored-by: jped <jonathanped@gmail.com>
Co-authored-by: Justin Torre <justintorre75@gmail.com>
Co-authored-by: Ivan Vendrov <ivan@anthropic.com>
1 year ago
Harrison Chase ec727bf166
Align table info (#999) (#1034)
Currently the chain is getting the column names and types on the one
side and the example rows on the other. It is easier for the llm to read
the table information if the column name and examples are shown together
so that it can easily understand to which columns do the examples refer
to. For an instantiation of this, please refer to the changes in the
`sqlite.ipynb` notebook.

Also changed `eval` for `ast.literal_eval` when interpreting the results
from the sample row query since it is a better practice.

---------

Co-authored-by: Francisco Ingham <>

---------

Co-authored-by: Francisco Ingham <fpingham@gmail.com>
1 year ago
Enrico Shippole f30dcc6359
Add GooseAI, CerebriumAI, Petals, ForefrontAI (#981)
Add GooseAI, CerebriumAI, Petals, ForefrontAI
1 year ago
Anton Troynikov d43d430d86
Chroma persistence (#1028)
This PR adds persistence to the Chroma vector store.

Users can supply a `persist_directory` with any of the `Chroma` creation
methods. If supplied, the store will be automatically persisted at that
directory.

If a user creates a new `Chroma` instance with the same persistence
directory, it will get loaded up automatically. If they use `from_texts`
or `from_documents` in this way, the documents will be loaded into the
existing store.

There is the chance of some funky behavior if the user passes a
different embedding function from the one used to create the collection
- we will make this easier in future updates. For now, we log a warning.
1 year ago
Anton Troynikov 78abd277ff
Chroma in LangChain (#1010)
Chroma is a simple to use, open-source, zero-config, zero setup
vectorstore.

Simply `pip install chromadb`, and you're good to go. 

Out-of-the-box Chroma is suitable for most LangChain workloads, but is
highly flexible. I tested to 1M embs on my M1 mac, with out issues and
reasonably fast query times.

Look out for future releases as we integrate more Chroma features with
LangChain!
1 year ago
Shahriar Tajbakhsh b7747017d7
Import of `declarative_base` when SQLAlchemy <1.4 (#883)
In
[pyproject.toml](https://github.com/hwchase17/langchain/blob/master/pyproject.toml),
the expectation is `SQLAlchemy = "^1"`. But, the way `declarative_base`
is imported in
[cache.py](https://github.com/hwchase17/langchain/blob/master/langchain/cache.py)
will only work with SQLAlchemy >=1.4. This PR makes sure Langchain can
be run in environments with SQLAlchemy <1.4
1 year ago
Harrison Chase c64f98e2bb
Harrison/format agent instructions (#973)
Co-authored-by: Andrew White <white.d.andrew@gmail.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
Co-authored-by: Peng Qu <82029664+pengqu123@users.noreply.github.com>
1 year ago
Harrison Chase 91c6cea227
Harrison/batch embeds (#972)
Co-authored-by: John Dagdelen <jdagdelen@users.noreply.github.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
1 year ago
Ankush Gola bc7e56e8df
Add asyncio support for LLM (OpenAI), Chain (LLMChain, LLMMathChain), and Agent (#841)
Supporting asyncio in langchain primitives allows for users to run them
concurrently and creates more seamless integration with
asyncio-supported frameworks (FastAPI, etc.)

Summary of changes:

**LLM**
* Add `agenerate` and `_agenerate`
* Implement in OpenAI by leveraging `client.Completions.acreate`

**Chain**
* Add `arun`, `acall`, `_acall`
* Implement them in `LLMChain` and `LLMMathChain` for now

**Agent**
* Refactor and leverage async chain and llm methods
* Add ability for `Tools` to contain async coroutine
* Implement async SerpaPI `arun`

Create demo notebook.

Open questions:
* Should all the async stuff go in separate classes? I've seen both
patterns (keeping the same class and having async and sync methods vs.
having class separation)
1 year ago
Harrison Chase bc53c928fc
Harrison/athropic (#921)
Co-authored-by: Mike Lambert <mlambert@gmail.com>
Co-authored-by: mrbean <sam@you.com>
Co-authored-by: mrbean <43734688+sam-h-bean@users.noreply.github.com>
Co-authored-by: Ivan Vendrov <ivendrov@gmail.com>
1 year ago
Harrison Chase 1e56879d38
Harrison/save faiss (#916)
Co-authored-by: Shrey Joshi <shreyjoshi2004@gmail.com>
1 year ago
Harrison Chase e2b834e427
Harrison/prompt template prefix (#888)
Co-authored-by: Gabriel Simmons <simmons.gabe@gmail.com>
1 year ago
Harrison Chase f95cedc443
Harrison/sql rows (#915)
Co-authored-by: Jon Luo <20971593+jzluo@users.noreply.github.com>
1 year ago
Harrison Chase ba5a2f06b9
Harrison/inference endpoint (#861)
Co-authored-by: Eno Reyes <enoreyes@gmail.com>
1 year ago
Kevin Huo 31b054f69d
Add pinecone integration test (#911)
Basic integration test for pinecone
1 year ago
Harrison Chase 93a091cfb8
Optionally return shell output on incorrect command (#894) (#899)
This allows the LLM to correct its previous command by looking at the
error message output to the shell.

Additionally, this uses subprocess.run because that is now recommended
over subprocess.check_output:

https://docs.python.org/3/library/subprocess.html#using-the-subprocess-module

Co-authored-by: Amos Ng <me@amos.ng>
1 year ago
Harrison Chase a2b699dcd2
prompt template from string (#884) 1 year ago
Harrison Chase 8df6b68093
fix length based example selector (#862) 1 year ago
Harrison Chase 3f48eed5bd
Harrison/milvus (#856)
Signed-off-by: Filip Haltmayer <filip.haltmayer@zilliz.com>
Signed-off-by: Frank Liu <frank.liu@zilliz.com>
Co-authored-by: Filip Haltmayer <81822489+filip-halt@users.noreply.github.com>
Co-authored-by: Frank Liu <frank@frankzliu.com>
1 year ago
kahkeng 4a8f5cdf4b
Add alternative token-based text splitter (#816)
This does not involve a separator, and will naively chunk input text at
the appropriate boundaries in token space.

This is helpful if we have strict token length limits that we need to
strictly follow the specified chunk size, and we can't use aggressive
separators like spaces to guarantee the absence of long strings.

CharacterTextSplitter will let these strings through without splitting
them, which could cause overflow errors downstream.

Splitting at arbitrary token boundaries is not ideal but is hopefully
mitigated by having a decent overlap quantity. Also this results in
chunks which has exact number of tokens desired, instead of sometimes
overcounting if we concatenate shorter strings.

Potentially also helps with #528.
1 year ago
Harrison Chase 23d5f64bda
Harrison/ngram example (#846)
Co-authored-by: Sean Spriggens <ssprigge@syr.edu>
1 year ago
Harrison Chase d564308e0f
rfc: instruct embeddings (#811)
Co-authored-by: seanaedmiston <seane999@gmail.com>
1 year ago
Harrison Chase 7b4882a2f4
Harrison/tf embeddings (#817)
Co-authored-by: Ryohei Kuroki <10434946+yakigac@users.noreply.github.com>
1 year ago
Jason Liu 54f9e4287f
Pass kwargs from initialize_agent into agent classmethod (#799)
# Problem
I noticed that in order to change the prefix of the prompt in the
`zero-shot-react-description` agent
we had to dig around to subset strings deep into the agent's attributes.
It requires the user to inspect a long chain of attributes and classes.

`initialize_agent -> AgentExecutor -> Agent -> LLMChain -> Prompt from
Agent.create_prompt`

``` python
agent = initialize_agent(
    tools=tools,
    llm=fake_llm,
    agent="zero-shot-react-description"
)
prompt_str = agent.agent.llm_chain.prompt.template
new_prompt_str = change_prefix(prompt_str)
agent.agent.llm_chain.prompt.template = new_prompt_str
```

# Implemented Solution

`initialize_agent` accepts `**kwargs` but passes it to `AgentExecutor`
but not `ZeroShotAgent`, by simply giving the kwargs to the agent class
methods we can support changing the prefix and suffix for one agent
while allowing future agents to take advantage of `initialize_agent`.


```
agent = initialize_agent(
    tools=tools,
    llm=fake_llm,
    agent="zero-shot-react-description",
    agent_kwargs={"prefix": prefix, "suffix": suffix}
)
```

To be fair, this was before finding docs around custom agents here:
https://langchain.readthedocs.io/en/latest/modules/agents/examples/custom_agent.html?highlight=custom%20#custom-llmchain
but i find that my use case just needed to change the prefix a little.


# Changes

* Pass kwargs to Agent class method
* Added a test to check suffix and prefix

---------

Co-authored-by: Jason Liu <jason@jxnl.coA>
1 year ago
Roy Williams 6086292252
Centralize logic for loading from LangChainHub, add ability to pin dependencies (#805)
It's generally considered to be a good practice to pin dependencies to
prevent surprise breakages when a new version of a dependency is
released. This commit adds the ability to pin dependencies when loading
from LangChainHub.

Centralizing this logic and using urllib fixes an issue identified by
some windows users highlighted in this video -
https://youtu.be/aJ6IQUh8MLQ?t=537
1 year ago
Harrison Chase 1ad7973cc6
Harrison/tool decorator (#790)
Co-authored-by: Jason Liu <jxnl@users.noreply.github.com>
Co-authored-by: Jason Liu <jason@jxnl.coA>
1 year ago
Harrison Chase 248c297f1b
Sample row in table info for SQLDatabase (#769) (#782)
The agents usually benefit from understanding what the data looks like
to be able to filter effectively. Sending just one row in the table info
allows the agent to understand the data before querying and get better
results.

---------

Co-authored-by: Francisco Ingham <>

---------

Co-authored-by: Francisco Ingham <fpingham@gmail.com>
1 year ago
Amos Ng 6ad360bdef
Suggestions for better debugging (#765)
Please feel free to disregard any changes you disagree with
1 year ago
Ankush Gola 57609845df
add tracing support to langchain (#741)
* add implementations of `BaseCallbackHandler` to support tracing:
`SharedTracer` which is thread-safe and `Tracer` which is not and is
meant to be used locally.
* Tracers persist runs to locally running `langchain-server`

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Amos Ng fa6826e417
Fix sqlalchemy warnings when running tests (#733)
This has been bugging me when running my own tests that call langchain
methods :P
1 year ago
scadEfUr e3df8ab6dc
move hyde into chains (#728)
Co-authored-by: scadEfUr <>
1 year ago
Harrison Chase 0ffeabd14f
Harrison/serialize llm chain (#671) 1 year ago
Harrison Chase cbc146720b
verbose flag (#683) 1 year ago