Commit Graph

134 Commits (d43d430d86cd5ba6b3d6ccdb7e59d0227c8db026)

Author SHA1 Message Date
Anton Troynikov d43d430d86
Chroma persistence (#1028)
This PR adds persistence to the Chroma vector store.

Users can supply a `persist_directory` with any of the `Chroma` creation
methods. If supplied, the store will be automatically persisted at that
directory.

If a user creates a new `Chroma` instance with the same persistence
directory, it will get loaded up automatically. If they use `from_texts`
or `from_documents` in this way, the documents will be loaded into the
existing store.

There is the chance of some funky behavior if the user passes a
different embedding function from the one used to create the collection
- we will make this easier in future updates. For now, we log a warning.
2 years ago
Anton Troynikov 78abd277ff
Chroma in LangChain (#1010)
Chroma is a simple to use, open-source, zero-config, zero setup
vectorstore.

Simply `pip install chromadb`, and you're good to go. 

Out-of-the-box Chroma is suitable for most LangChain workloads, but is
highly flexible. I tested to 1M embs on my M1 mac, with out issues and
reasonably fast query times.

Look out for future releases as we integrate more Chroma features with
LangChain!
2 years ago
Shahriar Tajbakhsh b7747017d7
Import of `declarative_base` when SQLAlchemy <1.4 (#883)
In
[pyproject.toml](https://github.com/hwchase17/langchain/blob/master/pyproject.toml),
the expectation is `SQLAlchemy = "^1"`. But, the way `declarative_base`
is imported in
[cache.py](https://github.com/hwchase17/langchain/blob/master/langchain/cache.py)
will only work with SQLAlchemy >=1.4. This PR makes sure Langchain can
be run in environments with SQLAlchemy <1.4
2 years ago
Harrison Chase c64f98e2bb
Harrison/format agent instructions (#973)
Co-authored-by: Andrew White <white.d.andrew@gmail.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
Co-authored-by: Peng Qu <82029664+pengqu123@users.noreply.github.com>
2 years ago
Harrison Chase 91c6cea227
Harrison/batch embeds (#972)
Co-authored-by: John Dagdelen <jdagdelen@users.noreply.github.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2 years ago
Ankush Gola bc7e56e8df
Add asyncio support for LLM (OpenAI), Chain (LLMChain, LLMMathChain), and Agent (#841)
Supporting asyncio in langchain primitives allows for users to run them
concurrently and creates more seamless integration with
asyncio-supported frameworks (FastAPI, etc.)

Summary of changes:

**LLM**
* Add `agenerate` and `_agenerate`
* Implement in OpenAI by leveraging `client.Completions.acreate`

**Chain**
* Add `arun`, `acall`, `_acall`
* Implement them in `LLMChain` and `LLMMathChain` for now

**Agent**
* Refactor and leverage async chain and llm methods
* Add ability for `Tools` to contain async coroutine
* Implement async SerpaPI `arun`

Create demo notebook.

Open questions:
* Should all the async stuff go in separate classes? I've seen both
patterns (keeping the same class and having async and sync methods vs.
having class separation)
2 years ago
Harrison Chase bc53c928fc
Harrison/athropic (#921)
Co-authored-by: Mike Lambert <mlambert@gmail.com>
Co-authored-by: mrbean <sam@you.com>
Co-authored-by: mrbean <43734688+sam-h-bean@users.noreply.github.com>
Co-authored-by: Ivan Vendrov <ivendrov@gmail.com>
2 years ago
Harrison Chase 1e56879d38
Harrison/save faiss (#916)
Co-authored-by: Shrey Joshi <shreyjoshi2004@gmail.com>
2 years ago
Harrison Chase e2b834e427
Harrison/prompt template prefix (#888)
Co-authored-by: Gabriel Simmons <simmons.gabe@gmail.com>
2 years ago
Harrison Chase f95cedc443
Harrison/sql rows (#915)
Co-authored-by: Jon Luo <20971593+jzluo@users.noreply.github.com>
2 years ago
Harrison Chase ba5a2f06b9
Harrison/inference endpoint (#861)
Co-authored-by: Eno Reyes <enoreyes@gmail.com>
2 years ago
Kevin Huo 31b054f69d
Add pinecone integration test (#911)
Basic integration test for pinecone
2 years ago
Harrison Chase 93a091cfb8
Optionally return shell output on incorrect command (#894) (#899)
This allows the LLM to correct its previous command by looking at the
error message output to the shell.

Additionally, this uses subprocess.run because that is now recommended
over subprocess.check_output:

https://docs.python.org/3/library/subprocess.html#using-the-subprocess-module

Co-authored-by: Amos Ng <me@amos.ng>
2 years ago
Harrison Chase a2b699dcd2
prompt template from string (#884) 2 years ago
Harrison Chase 8df6b68093
fix length based example selector (#862) 2 years ago
Harrison Chase 3f48eed5bd
Harrison/milvus (#856)
Signed-off-by: Filip Haltmayer <filip.haltmayer@zilliz.com>
Signed-off-by: Frank Liu <frank.liu@zilliz.com>
Co-authored-by: Filip Haltmayer <81822489+filip-halt@users.noreply.github.com>
Co-authored-by: Frank Liu <frank@frankzliu.com>
2 years ago
kahkeng 4a8f5cdf4b
Add alternative token-based text splitter (#816)
This does not involve a separator, and will naively chunk input text at
the appropriate boundaries in token space.

This is helpful if we have strict token length limits that we need to
strictly follow the specified chunk size, and we can't use aggressive
separators like spaces to guarantee the absence of long strings.

CharacterTextSplitter will let these strings through without splitting
them, which could cause overflow errors downstream.

Splitting at arbitrary token boundaries is not ideal but is hopefully
mitigated by having a decent overlap quantity. Also this results in
chunks which has exact number of tokens desired, instead of sometimes
overcounting if we concatenate shorter strings.

Potentially also helps with #528.
2 years ago
Harrison Chase 23d5f64bda
Harrison/ngram example (#846)
Co-authored-by: Sean Spriggens <ssprigge@syr.edu>
2 years ago
Harrison Chase d564308e0f
rfc: instruct embeddings (#811)
Co-authored-by: seanaedmiston <seane999@gmail.com>
2 years ago
Harrison Chase 7b4882a2f4
Harrison/tf embeddings (#817)
Co-authored-by: Ryohei Kuroki <10434946+yakigac@users.noreply.github.com>
2 years ago
Jason Liu 54f9e4287f
Pass kwargs from initialize_agent into agent classmethod (#799)
# Problem
I noticed that in order to change the prefix of the prompt in the
`zero-shot-react-description` agent
we had to dig around to subset strings deep into the agent's attributes.
It requires the user to inspect a long chain of attributes and classes.

`initialize_agent -> AgentExecutor -> Agent -> LLMChain -> Prompt from
Agent.create_prompt`

``` python
agent = initialize_agent(
    tools=tools,
    llm=fake_llm,
    agent="zero-shot-react-description"
)
prompt_str = agent.agent.llm_chain.prompt.template
new_prompt_str = change_prefix(prompt_str)
agent.agent.llm_chain.prompt.template = new_prompt_str
```

# Implemented Solution

`initialize_agent` accepts `**kwargs` but passes it to `AgentExecutor`
but not `ZeroShotAgent`, by simply giving the kwargs to the agent class
methods we can support changing the prefix and suffix for one agent
while allowing future agents to take advantage of `initialize_agent`.


```
agent = initialize_agent(
    tools=tools,
    llm=fake_llm,
    agent="zero-shot-react-description",
    agent_kwargs={"prefix": prefix, "suffix": suffix}
)
```

To be fair, this was before finding docs around custom agents here:
https://langchain.readthedocs.io/en/latest/modules/agents/examples/custom_agent.html?highlight=custom%20#custom-llmchain
but i find that my use case just needed to change the prefix a little.


# Changes

* Pass kwargs to Agent class method
* Added a test to check suffix and prefix

---------

Co-authored-by: Jason Liu <jason@jxnl.coA>
2 years ago
Roy Williams 6086292252
Centralize logic for loading from LangChainHub, add ability to pin dependencies (#805)
It's generally considered to be a good practice to pin dependencies to
prevent surprise breakages when a new version of a dependency is
released. This commit adds the ability to pin dependencies when loading
from LangChainHub.

Centralizing this logic and using urllib fixes an issue identified by
some windows users highlighted in this video -
https://youtu.be/aJ6IQUh8MLQ?t=537
2 years ago
Harrison Chase 1ad7973cc6
Harrison/tool decorator (#790)
Co-authored-by: Jason Liu <jxnl@users.noreply.github.com>
Co-authored-by: Jason Liu <jason@jxnl.coA>
2 years ago
Harrison Chase 248c297f1b
Sample row in table info for SQLDatabase (#769) (#782)
The agents usually benefit from understanding what the data looks like
to be able to filter effectively. Sending just one row in the table info
allows the agent to understand the data before querying and get better
results.

---------

Co-authored-by: Francisco Ingham <>

---------

Co-authored-by: Francisco Ingham <fpingham@gmail.com>
2 years ago
Amos Ng 6ad360bdef
Suggestions for better debugging (#765)
Please feel free to disregard any changes you disagree with
2 years ago
Ankush Gola 57609845df
add tracing support to langchain (#741)
* add implementations of `BaseCallbackHandler` to support tracing:
`SharedTracer` which is thread-safe and `Tracer` which is not and is
meant to be used locally.
* Tracers persist runs to locally running `langchain-server`

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2 years ago
Amos Ng fa6826e417
Fix sqlalchemy warnings when running tests (#733)
This has been bugging me when running my own tests that call langchain
methods :P
2 years ago
scadEfUr e3df8ab6dc
move hyde into chains (#728)
Co-authored-by: scadEfUr <>
2 years ago
Harrison Chase 0ffeabd14f
Harrison/serialize llm chain (#671) 2 years ago
Harrison Chase cbc146720b
verbose flag (#683) 2 years ago
dham e04b063ff4
add faiss local saving/loading (#676)
- This uses the faiss built-in `write_index` and `load_index` to save
and load faiss indexes locally
- Also fixes #674
- The save/load functions also use the faiss library, so I refactored
the dependency into a function
2 years ago
Harrison Chase a2eeaf3d43
strip whitespace (#680) 2 years ago
Harrison Chase 0b204d8c21
Harrison/quadrant (#665)
Co-authored-by: Kacper Łukawski <kacperlukawski@users.noreply.github.com>
2 years ago
Harrison Chase 54d7f1c933
fix caching (#658) 2 years ago
Harrison Chase 4d4cff0530
Harrison/cohere experimental (#638)
Co-authored-by: inyourhead <44607279+xettrisomeman@users.noreply.github.com>
2 years ago
Harrison Chase 1ac3319e45
simplify parsing of the final answer (#621) 2 years ago
Harrison Chase ffc7e04d44
Harrison/wolfram alpha (#579)
Co-authored-by: Nicolas <nicolascamara29@gmail.com>
2 years ago
Harrison Chase 1511606799
Harrison/fix splitting (#563)
fix issue where text splitting could possibly create empty docs
2 years ago
Harrison Chase 1192cc0767
smart text splitter (#530)
smart text splitter that iteratively tries different separators until it
works!
2 years ago
Harrison Chase 9833fcfe32
fix caching (#555) 2 years ago
Harrison Chase 330a5b42d4
fix map reduce chain (#550) 2 years ago
Harrison Chase 4974f49bb7
add return_direct flag to tool (#537)
adds a return_direct flag to tools, which just returns the tool output
as the final output
2 years ago
Harrison Chase 1631981f84
Harrison/fix and test caching (#538) 2 years ago
Harrison Chase 9e04c34e20
Add BaseCallbackHandler and CallbackManager (#478)
Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>
2 years ago
Harrison Chase 0db05b6725
Harrison/add human prefix (#520)
Co-authored-by: Andrew Huang <jhuang16888@gmail.com>
2 years ago
Harrison Chase 985496f4be
Docs refactor (#480)
Big docs refactor! Motivation is to make it easier for people to find
resources they are looking for. To accomplish this, there are now three
main sections:

- Getting Started: steps for getting started, walking through most core
functionality
- Modules: these are different modules of functionality that langchain
provides. Each part here has a "getting started", "how to", "key
concepts" and "reference" section (except in a few select cases where it
didnt easily fit).
- Use Cases: this is to separate use cases (like summarization, question
answering, evaluation, etc) from the modules, and provide a different
entry point to the code base.

There is also a full reference section, as well as extra resources
(glossary, gallery, etc)

Co-authored-by: Shreya Rajpal <ShreyaR@users.noreply.github.com>
2 years ago
Harrison Chase 0072686aab
Harrison/new search engine (#477)
Co-authored-by: Nicolas <nicolascamara29@gmail.com>
2 years ago
Harrison Chase d0f194de73
add logic for agent stopping (#420) 2 years ago
Harrison Chase 95157d0aad
Add schema property to sql database utility class (#448) (#462)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>

Signed-off-by: Diwank Singh Tomer <diwank.singh@gmail.com>
Co-authored-by: Nuno Campos <nuno@boringbits.io>
Co-authored-by: Diwank Singh Tomer <diwank.singh@gmail.com>
2 years ago
Harrison Chase 0c5d3fd894
version 0.0.49 (#436) 2 years ago