Commit Graph

273 Commits

Author SHA1 Message Date
Shrined
10dab053b4
Add Enum for agent types (#2321)
This pull request adds an enum class for the various types of agents
used in the project, located in the `agent_types.py` file. Currently,
the project is using hardcoded strings for the initialization of these
agents, which can lead to errors and make the code harder to maintain.
With the introduction of the new enums, the code will be more readable
and less error-prone.

The new enum members include:

- ZERO_SHOT_REACT_DESCRIPTION
- REACT_DOCSTORE
- SELF_ASK_WITH_SEARCH
- CONVERSATIONAL_REACT_DESCRIPTION
- CHAT_ZERO_SHOT_REACT_DESCRIPTION
- CHAT_CONVERSATIONAL_REACT_DESCRIPTION

In this PR, I have also replaced the hardcoded strings with the
appropriate enum members throughout the codebase, ensuring a smooth
transition to the new approach.
2023-04-03 21:56:20 -07:00
Harrison Chase
d85f57ef9c
Harrison/llama (#2314)
Co-authored-by: RJ Adriaansen <adriaansen@eshcc.eur.nl>
2023-04-02 14:57:45 -07:00
akmhmgc
94b2f536f3
Modify output for wikipedia api wrapper (#2287)
## Description
Thanks for the quick maintenance for great repository!!
I modified wikipedia api wrapper

## Details
- Add output for missing search results
- Add tests
2023-04-02 14:00:27 -07:00
Harrison Chase
acfda4d1d8
Harrison/multiline commands (#2280)
Co-authored-by: Marc Päpper <mpaepper@users.noreply.github.com>
2023-04-01 12:54:06 -07:00
leo-gan
579ad85785
skip unit tests that fail in Windows (#2238)
Issue #2174
Several unit tests fail in Windows.
Added pytest attribute to skip these tests automatically.
2023-04-01 12:52:21 -07:00
Sam Cordner-Matthews
1ddd6dbf0b
Add ability to pass kwargs to loader classes in DirectoryLoader, add ability to modify encoding and BeautifulSoup behaviour in BSHTMLLoader (#2275)
Solves #2247. Noted that the only test I added checks for the
BeautifulSoup behaviour change. Happy to add a test for
`DirectoryLoader` if deemed necessary.
2023-04-01 12:48:27 -07:00
Harrison Chase
2d3918c152
make requests more general (#2209) 2023-03-30 20:41:56 -07:00
Harrison Chase
5c907d9998
Harrison/base agent without docs (#2166) 2023-03-29 22:11:25 -07:00
Harrison Chase
f5a4bf0ce4
remove prep (#2136)
agents should be stateless or async stuff may not work
2023-03-29 14:38:21 -07:00
Harrison Chase
b35260ed47
Harrison/memory base (#2122)
@3coins + @zoltan-fedor.... heres the pr + some minor changes i made.
thoguhts? can try to get it into tmrws release

---------

Co-authored-by: Zoltan Fedor <zoltan.0.fedor@gmail.com>
Co-authored-by: Piyush Jain <piyushjain@duck.com>
2023-03-29 10:10:09 -07:00
Ankush Gola
ccee1aedd2
add async support for anthropic (#2114)
should not be merged in before
https://github.com/anthropics/anthropic-sdk-python/pull/11 gets released
2023-03-28 22:49:14 -04:00
Harrison Chase
e2c26909f2
Harrison/memory check (#2119)
Co-authored-by: JIAQIA <jqq1716@gmail.com>
2023-03-28 15:40:36 -07:00
Harrison Chase
3e879b47c1
Harrison/gitbook (#2044)
Co-authored-by: Irene López <45119610+ireneisdoomed@users.noreply.github.com>
2023-03-28 15:28:33 -07:00
Honkware
aff33d52c5
Add OpenWeatherMap API Tool (#2083)
Added tool for OpenWeatherMap API
2023-03-28 12:02:14 -07:00
Charlie Holtz
f16c1fb6df
Add replicate take 2 (#2077)
This PR adds a replicate integration to langchain. 

It's an updated version of
https://github.com/hwchase17/langchain/pull/1993, but with updates to
match latest replicate-python code.
https://github.com/replicate/replicate-python.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Zeke Sikelianos <zeke@sikelianos.com>
2023-03-28 11:56:57 -07:00
Harrison Chase
f281033362
rm pandas dependency (#2102) 2023-03-28 08:38:19 -07:00
Harrison Chase
410bf37fb8
Harrison/big query (#2100)
Co-authored-by: lu-cashmoney <lucas.corley@gmail.com>
2023-03-28 08:17:22 -07:00
Harrison Chase
eff5eed719
Harrison/jina (#2043)
Co-authored-by: numb3r3 <wangfelix87@gmail.com>
Co-authored-by: felix-wang <35718120+numb3r3@users.noreply.github.com>
2023-03-28 08:16:17 -07:00
Harrison Chase
9e74df2404
Fix issue#1645: Parse llm_output even there's newline (#2092) (#2099)
Fix issue#1645: Parse either whitespace or newline after 'Action Input:'
in llm_output in mrkl agent.
Unittests added accordingly.

Co-authored-by: ₿ingnan.ΞTH <brillliantz@outlook.com>
2023-03-28 08:14:09 -07:00
b7f392fdd6
[agent_executor] convenience func: lookup tool by name (#2001)
A quick convenience function to lookup a tool by name

Co-authored-by: blob42 <spike@w530>
2023-03-27 23:10:34 -07:00
Harrison Chase
f74a1bebf5
Harrison/duckdb (#2064)
Co-authored-by: Trent Hauck <trent@trenthauck.com>
2023-03-27 19:51:34 -07:00
Ankush Gola
b7ebb8fe30
enable streaming in anthropic llm wrapper (#2065) 2023-03-27 20:25:00 -04:00
Harrison Chase
30e3b31b04
Harrison/document cleanup (#2062)
Co-authored-by: Delip Rao <delip@users.noreply.github.com>
2023-03-27 16:32:55 -07:00
Harrison Chase
a0cd6672aa
Harrison/site map (#2061)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
2023-03-27 16:28:08 -07:00
Daniel Chalef
6598beacdb
PydanticOutputParser unit test (#2047)
Unit test for PydanticOutputParser

---------

Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
2023-03-27 14:32:56 -07:00
Harrison Chase
705431aecc
big docs refactor (#1978)
Co-authored-by: Ankush Gola <ankush.gola@gmail.com>
2023-03-26 19:49:46 -07:00
Mario Kostelac
e7d6de6b1c
(ChatOpenAI) Add model_name to LLMResult.llm_output (#1960)
This makes sure OpenAI and ChatOpenAI have the same llm_output, and
allow tracking usage per model. Same work for OpenAI was done in
https://github.com/hwchase17/langchain/pull/1713.
2023-03-24 08:51:16 -07:00
Harrison Chase
6e1b5b8f7e
Harrison/figma doc loader (#1908)
Co-authored-by: Ismail Pelaseyed <homanp@gmail.com>
2023-03-22 19:57:46 -07:00
Eli
12f868b292
Propagate "filter" arg in Chroma similarity_search (#1869)
Technically a duplicate fix to #1619 but with unit tests and a small
documentation update
- Propagate `filter` arg in Chroma `similarity_search` to delegated call
to `similarity_search_with_score`
- Add `filter` arg to `similarity_search_by_vector`
- Clarify doc strings on FakeEmbeddings
2023-03-22 19:40:10 -07:00
Maurício Maia
f155d9d3ec
Add metadata filter to PGVector search (#1872)
Add ability to filter pgvector documents by metadata.
2023-03-22 15:21:40 -07:00
Maurício Maia
2212520a6c
Add PGVector collection metadata (#1887)
The `CollectionStore` for `PGVector` has a `cmetadata` field but it's
never used. This PR add the ability to save metadata information to the
collection.
2023-03-22 11:27:07 -07:00
Harrison Chase
ce5d97bcb3
Harrison/guarded output parser (#1804)
Co-authored-by: jerwelborn <jeremy.welborn@gmail.com>
2023-03-21 22:07:23 -07:00
Matt Tucker
a92344f476
Use regex match for bash process error output test assertion. (#1837)
I was getting the same issue reported in #1339 by
[MacYang555](https://github.com/MacYang555) when running the test suite
on my Mac. I implemented the fix they suggested to use a regex match in
the output assertion for the scenario under test.

Resolves #1339
2023-03-21 09:06:52 -07:00
LeoGrin
3701b2901e
use namespace argument in Pinecone constructor (#1757)
Fix #1756

Use the `namespace` argument of `Pinecone.from_exisiting_index` to set
the default value of `namespace` for other methods. Leads to more
expected behavior and easier integration in chains.

For the test, I've added a line to delete and rebuild the
`langchain-demo` index at the beginning of the test. I'm not 100% sure
if it's a good idea but it makes the test reproducible.
2023-03-18 19:55:38 -07:00
Mario Kostelac
aff44d0a98
(OpenAI) Add model_name to LLMResult.llm_output (#1713)
Given that different models have very different latencies and pricings,
it's benefitial to pass the information about the model that generated
the response. Such information allows implementing custom callback
managers and track usage and price per model.

Addresses https://github.com/hwchase17/langchain/issues/1557.
2023-03-16 21:55:55 -07:00
Daniel Chalef
b157e0c1c3
Add HTML document_loader that includes page title metadata (#1720)
This `BSHTMLLoader` document_loader loads an HTML document, extracts
text and adds the page title to the returned Document's metadata. The
loader uses the already installed bs4 package to extract both text
content and the page title.

Included in this PR is an example HTML file and an integration test that
tests against this file.

---------

Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
2023-03-16 21:47:17 -07:00
Kacper Łukawski
4a327dd1d6
Implement basic metadata filtering in Qdrant (#1689)
This PR implements a basic metadata filtering mechanism similar to the
ones in Chroma and Pinecone. It still cannot express complex conditions,
as there are no operators, but some users requested to have that feature
available.
2023-03-15 07:31:39 -07:00
Harrison Chase
0b29e68c17
Harrison/pgvector (#1679)
Co-authored-by: Aman Kumar <krsingh.aman@gmail.com>
2023-03-14 21:13:58 -07:00
Xin Qiu
4e13cef05a
feat: add redisearch vectorstore (#1307)
# Description

Add `RediSearch` vectorstore for LangChain

RediSearch: [RediSearch quick
start](https://redis.io/docs/stack/search/quick_start/)

# How to use

```
from langchain.vectorstores.redisearch import RediSearch

rds = RediSearch.from_documents(docs, embeddings,redisearch_url="redis://localhost:6379")
```
2023-03-14 18:06:03 -07:00
Jon Luo
0a1b1806e9
sql: do not hard code the LIMIT clause in the table_info section (#1563)
Seeing a lot of issues in Discord in which the LLM is not using the
correct LIMIT clause for different SQL dialects. ie, it's using `LIMIT`
for mssql instead of `TOP`, or instead of `ROWNUM` for Oracle, etc.
I think this could be due to us specifying the LIMIT statement in the
example rows portion of `table_info`. So the LLM is seeing the `LIMIT`
statement used in the prompt.
Since we can't specify each dialect's method here, I think it's fine to
just replace the `SELECT... LIMIT 3;` statement with `3 rows from
table_name table:`, and wrap everything in a block comment directly
following the `CREATE` statement. The Rajkumar et al paper wrapped the
example rows and `SELECT` statement in a block comment as well anyway.
Thoughts @fpingham?
2023-03-13 23:08:27 -07:00
Tim Asp
b3234bf3b0
cleanup: unify 3 different pdf loaders, rename PagedPDFSplitter (#1615)
`OnlinePDFLoader` and `PagedPDFSplitter` lived separate from the rest of
the pdf loaders.

Because they're all similar, I propose moving all to `pdy.py` and the
same docs/examples page.

Additionally, `PagedPDFSplitter` naming doesn't match the pattern the
rest of the loaders follow, so I renamed to `PyPDFLoader` and had it
inherit from `BasePDFLoader` so it can now load from remote file
sources.
2023-03-13 23:06:50 -07:00
Luis
562d9891ea
Add regex dict: (#1616)
This class enables us to send a dictionary containing an output key and
the expected format, which in turn allows us to retrieve the result of
the matching formats and extract specific information from it.

To exclude irrelevant information from our return dictionary, we can
prompt the LLM to use a specific command that notifies us when it
doesn't know the answer. We refer to this variable as the
"no_update_value".

Regarding the updated regular expression pattern
(r"{}:\s?([^.'\n']*).?"), it enables us to retrieve a format as 'Output
Key':'value'.

We have improved the regex by adding an optional space between ':' and
'value' with "s?", and by excluding points and line jumps from the
matches using "[^.'\n']*".
2023-03-13 23:05:39 -07:00
Harrison Chase
aed9f9febe
Harrison/return intermediate (#1633)
Co-authored-by: Mario Kostelac <mario@intercom.io>
2023-03-13 07:54:29 -07:00
yakigac
acd86d33bc
Add read only shared memory (#1491)
Provide shared memory capability for the Agent.
Inspired by #1293 .

## Problem

If both Agent and Tools (i.e., LLMChain) use the same memory, both of
them will save the context. It can be annoying in some cases.


## Solution

Create a memory wrapper that ignores the save and clear, thereby
preventing updates from Agent or Tools.
2023-03-12 09:34:36 -07:00
Harrison Chase
c9b5a30b37
move output parsing (#1605) 2023-03-11 16:41:03 -08:00
Harrison Chase
cb04ba0136
Add support for intermediate steps to SQLDatabaseSequentialChain (#1583) (#1601)
for https://github.com/hwchase17/langchain/issues/1582

I simply added the `return_intermediate_steps` and changed the
`output_keys` function.

I added 2 simple tests, 1 for SQLDatabaseSequentialChain without the
intermediate steps and 1 with

Co-authored-by: brad-nemetski <115185478+brad-nemetski@users.noreply.github.com>
2023-03-11 15:44:41 -08:00
Harrison Chase
f95d551f7a
Harrison/shallow metadata (#1599)
Co-authored-by: Jesse Zhang <jessetanzhang@gmail.com>
2023-03-11 09:18:25 -08:00
Harrison Chase
9f78717b3c
Harrison/callbacks (#1587) 2023-03-10 12:53:09 -08:00
Harrison Chase
3ee32a01ea
Harrison/prompt layer (#1547)
Co-authored-by: Jonathan Pedoeem <jonathanped@gmail.com>
Co-authored-by: AbuBakar <abubakarsohail123@gmail.com>
2023-03-08 21:24:27 -08:00
Harrison Chase
c844d1fd46
Harrison/chunk size (#1549)
Co-authored-by: Florian Leuerer <31259070+floleuerer@users.noreply.github.com>
2023-03-08 21:24:18 -08:00
Harrison Chase
357d808484
Harrison/remote paths pdf (#1544)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
2023-03-08 20:53:37 -08:00
Harrison Chase
cc423f40f1
Harrison/youtube loader (#1545)
Co-authored-by: Julian Wustl <57504258+Julianwustl@users.noreply.github.com>
2023-03-08 20:53:27 -08:00
Harrison Chase
7ade419a0e
allow passing of messages into prompt template (#1505) 2023-03-07 21:10:12 -08:00
Harrison Chase
064741db58
Harrison/fix text splitter (#1511)
Co-authored-by: ajaysolanky <ajsolanky@gmail.com>
Co-authored-by: Ajay Solanky <ajaysolanky@saw-l14668307kd.myfiosgateway.com>
2023-03-07 15:42:28 -08:00
Ankush Gola
27104d4921
fix ChatOpenAI.agenerate (#1504) 2023-03-07 15:22:05 -08:00
Harrison Chase
7bec461782
Harrison/memory refactor (#1478)
moves memory to own module, factors out common stuff
2023-03-07 07:59:37 -08:00
Harrison Chase
0e21463f07
(rfc) chat models (#1424)
Co-authored-by: Ankush Gola <ankush.gola@gmail.com>
2023-03-06 08:34:24 -08:00
Harrison Chase
63a5614d23
Harrison/simple memory (#1435)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
2023-03-04 08:15:52 -08:00
Harrison Chase
a1b9dfc099
Harrison/similarity search chroma (#1434)
Co-authored-by: shibuiwilliam <shibuiyusuke@gmail.com>
2023-03-04 08:10:15 -08:00
Tim Asp
23231d65a9
Add PyMuPDF PDF loader (#1426)
Different PDF libraries have different strengths and weaknesses. PyMuPDF
does a good job at extracting the most amount of content from the doc,
regardless of the source quality, extremely fast (especially compared to
Unstructured).

https://pymupdf.readthedocs.io/en/latest/index.html
2023-03-03 20:59:28 -08:00
Nuno Campos
499e76b199
Allow the regular openai class to be used for ChatGPT models (#1393)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-03-02 09:04:18 -08:00
Kacper Łukawski
9ac442624c
Add Qdrant named arguments (#1386)
This PR:
- Increases `qdrant-client` version to 1.0.4
- Introduces custom content and metadata keys (as requested in #1087)
- Moves all the `QdrantClient` parameters into the method parameters to
simplify code completion
2023-03-02 07:05:14 -08:00
Ankush Gola
fe30be6fba
add async and streaming support to OpenAIChat (#1378)
title says it all
2023-03-01 21:55:43 -08:00
Harrison Chase
1cd8996074
Harrison/summarizer chain (#1356)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
2023-03-01 20:59:07 -08:00
Ankush Gola
82baecc892
Add a SQL agent for interacting with SQL Databases and JSON Agent for interacting with large JSON blobs (#1150)
This PR adds 

* `ZeroShotAgent.as_sql_agent`, which returns an agent for interacting
with a sql database. This builds off of `SQLDatabaseChain`. The main
advantages are 1) answering general questions about the db, 2) access to
a tool for double checking queries, and 3) recovering from errors
* `ZeroShotAgent.as_json_agent` which returns an agent for interacting
with json blobs.
* Several examples in notebooks

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-02-28 19:44:39 -08:00
Harrison Chase
786852e9e6
partial variables (#1308) 2023-02-28 08:40:35 -08:00
Tim Asp
72ef69d1ba
Add new iFixit document loader (#1333)
iFixit is a wikipedia-like site that has a huge amount of open content
on how to fix things, questions/answers for common troubleshooting and
"things" related content that is more technical in nature. All content
is licensed under CC-BY-SA-NC 3.0

Adding docs from iFixit as context for user questions like "I dropped my
phone in water, what do I do?" or "My macbook pro is making a whining
noise, what's wrong with it?" can yield significantly better responses
than context free response from LLMs.
2023-02-27 20:40:20 -08:00
Harrison Chase
166cda2cc6
Harrison/deeplake (#1316)
Co-authored-by: Davit Buniatyan <d@activeloop.ai>
2023-02-26 22:35:04 -08:00
Harrison Chase
aaad6cc954
Harrison/atlas db (#1315)
Co-authored-by: Brandon Duderstadt <brandonduderstadt@gmail.com>
2023-02-26 22:11:38 -08:00
Enrico Shippole
9becdeaadf
Add Writer, Banana, Modal, StochasticAI (#1270)
Add LLM wrappers and examples for Banana, Writer, Modal, Stochastic AI

Added rigid json format for Banana and Modal
2023-02-24 06:58:58 -08:00
Dennis Antela Martinez
53c67e04d4
add aleph alpha llm (#1207)
Integrate Aleph Alpha's client into Langchain to provide access to the
luminous models - more info on latest benchmarks here:
https://www.aleph-alpha.com/luminous-performance-benchmarks
2023-02-22 10:37:36 -08:00
Harrison Chase
b7708bbec6
rfc: callback changes (#1165)
conceptually, no reason a tool should know what an "agent action" is

unless any objections, can change in all callback handlers
2023-02-20 22:54:15 -08:00
Harrison Chase
44c8d8a9ac
move serpapi wrapper (#1199)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
2023-02-20 21:15:45 -08:00
Naveen Tatikonda
0118706fd6
Add Support for OpenSearch Vector database (#1191)
### Description
This PR adds a wrapper which adds support for the OpenSearch vector
database. Using opensearch-py client we are ingesting the embeddings of
given text into opensearch cluster using Bulk API. We can perform the
`similarity_search` on the index using the 3 popular searching methods
of OpenSearch k-NN plugin:

- `Approximate k-NN Search` use approximate nearest neighbor (ANN)
algorithms from the [nmslib](https://github.com/nmslib/nmslib),
[faiss](https://github.com/facebookresearch/faiss), and
[Lucene](https://lucene.apache.org/) libraries to power k-NN search.
- `Script Scoring` extends OpenSearch’s script scoring functionality to
execute a brute force, exact k-NN search.
- `Painless Scripting` adds the distance functions as painless
extensions that can be used in more complex combinations. Also, supports
brute force, exact k-NN search like Script Scoring.

### Issues Resolved 
https://github.com/hwchase17/langchain/issues/1054

---------

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
2023-02-20 18:39:34 -08:00
Andrew White
c5015d77e2
Allow k to be higher than doc size in max_marginal_relevance_search (#1187)
Fixes issue #1186. For some reason, #1117 didn't seem to fix it.
2023-02-20 16:39:13 -08:00
Harrison Chase
9d6d8f85da
Harrison/self hosted runhouse (#1154)
Co-authored-by: Donny Greenberg <dongreenberg2@gmail.com>
Co-authored-by: John Dagdelen <jdagdelen@users.noreply.github.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
Co-authored-by: Andrew White <white.d.andrew@gmail.com>
Co-authored-by: Peng Qu <82029664+pengqu123@users.noreply.github.com>
Co-authored-by: Matt Robinson <mthw.wm.robinson@gmail.com>
Co-authored-by: jeff <tangj1122@gmail.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MacBook-Pro.local>
Co-authored-by: zanderchase <zander@unfold.ag>
Co-authored-by: Charles Frye <cfrye59@gmail.com>
Co-authored-by: zanderchase <zanderchase@gmail.com>
Co-authored-by: Shahriar Tajbakhsh <sh.tajbakhsh@gmail.com>
Co-authored-by: Stefan Keselj <skeselj@princeton.edu>
Co-authored-by: Francisco Ingham <fpingham@gmail.com>
Co-authored-by: Dhruv Anand <105786647+dhruv-anand-aintech@users.noreply.github.com>
Co-authored-by: cragwolfe <cragcw@gmail.com>
Co-authored-by: Anton Troynikov <atroyn@users.noreply.github.com>
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Oliver Klingefjord <oliver@klingefjord.com>
Co-authored-by: blob42 <contact@blob42.xyz>
Co-authored-by: blob42 <spike@w530>
Co-authored-by: Enrico Shippole <henryshippole@gmail.com>
Co-authored-by: Ibis Prevedello <ibiscp@gmail.com>
Co-authored-by: jped <jonathanped@gmail.com>
Co-authored-by: Justin Torre <justintorre75@gmail.com>
Co-authored-by: Ivan Vendrov <ivan@anthropic.com>
Co-authored-by: Sasmitha Manathunga <70096033+mmz-001@users.noreply.github.com>
Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>
Co-authored-by: Matt Robinson <mrobinson@unstructuredai.io>
Co-authored-by: Jeff Huber <jeffchuber@gmail.com>
Co-authored-by: Akshay <64036106+akshayvkt@users.noreply.github.com>
Co-authored-by: Andrew Huang <jhuang16888@gmail.com>
Co-authored-by: rogerserper <124558887+rogerserper@users.noreply.github.com>
Co-authored-by: seanaedmiston <seane999@gmail.com>
Co-authored-by: Hasegawa Yuya <52068175+Hase-U@users.noreply.github.com>
Co-authored-by: Ivan Vendrov <ivendrov@gmail.com>
Co-authored-by: Chen Wu (吴尘) <henrychenwu@cmu.edu>
Co-authored-by: Dennis Antela Martinez <dennis.antela@gmail.com>
Co-authored-by: Maxime Vidal <max.vidal@hotmail.fr>
Co-authored-by: Rishabh Raizada <110235735+rishabh-ti@users.noreply.github.com>
2023-02-19 09:53:45 -08:00
CG80499
af8f5c1a49
Added constitutional chain. (#1147)
- Added self-critique constitutional chain based on this
[paper](https://www.anthropic.com/constitutional.pdf).
2023-02-18 19:31:51 -08:00
Ankush Gola
7b5e160d28
Make Tools own model, add ToolKit Concept (#1095)
Follow-up of @hinthornw's PR:

- Migrate the Tool abstraction to a separate file (`BaseTool`).
- `Tool` implementation of `BaseTool` takes in function and coroutine to
more easily maintain backwards compatibility
- Add a Toolkit abstraction that can own the generation of tools around
a shared concept or state

---------

Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Francisco Ingham <fpingham@gmail.com>
Co-authored-by: Dhruv Anand <105786647+dhruv-anand-aintech@users.noreply.github.com>
Co-authored-by: cragwolfe <cragcw@gmail.com>
Co-authored-by: Anton Troynikov <atroyn@users.noreply.github.com>
Co-authored-by: Oliver Klingefjord <oliver@klingefjord.com>
Co-authored-by: William Fu-Hinthorn <whinthorn@Williams-MBP-3.attlocal.net>
Co-authored-by: Bruno Bornsztein <bruno.bornsztein@gmail.com>
2023-02-18 13:40:43 -08:00
Francisco Ingham
3f29742adc
Sql alchemy commands used in table info (#1135)
This approach has several advantages:

* it improves the readability of the code
* removes incompatibilities between SQL dialects
* fixes a bug with `datetime` values in rows and `ast.literal_eval`

Huge thanks and credits to @jzluo for finding the weaknesses in the
current approach and for the thoughtful discussion on the best way to
implement this.

---------

Co-authored-by: Francisco Ingham <>
Co-authored-by: Jon Luo <20971593+jzluo@users.noreply.github.com>
2023-02-18 10:58:29 -08:00
Noah Gundotra
8c5fbab72d
[Integration Tests] Cast fake embeddings to ALL float values (#1102)
Pydantic validation breaks tests for example (`test_qdrant.py`) because
fake embeddings contain an integer.

This PR casts the embeddings array to all floats.

Now the `qdrant` test passes, `poetry run pytest
tests/integration_tests/vectorstores/test_qdrant.py`
2023-02-17 15:18:09 -08:00
yakigac
1ed708391e
Fix a bug that shows "KeyError 'items'" (#1118)
Fix KeyError 'items' when no result found.

## Problem

When no result found for a query, google search crashed with `KeyError
'items'`.

## Solution

I added a check for an empty response before accessing the 'items' key.
It will handle the case correctly.

## Other

my twitter: yakigac
(I don't mind even if you don't mention me for this PR. But just because
last time my real name was shout out :) )
2023-02-17 13:04:02 -08:00
Harrison Chase
5e10e19bfe
Harrison/align table (#1081)
Co-authored-by: Francisco Ingham <fpingham@gmail.com>
2023-02-15 23:53:37 -08:00
Hasegawa Yuya
e08961ab25
Fixed openai embeddings to be safe by batching them based on token size calculation. (#991)
I modified the logic of the batch calculation for embedding according to
this cookbook

https://github.com/openai/openai-cookbook/blob/main/examples/Embedding_long_inputs.ipynb
2023-02-15 23:02:32 -08:00
seanaedmiston
f0a258555b
Support similarity search by vector (in FAISS) (#961)
Alternate implementation to PR #960 Again - only FAISS is implemented.
If accepted can add this to other vectorstores or leave as
NotImplemented? Suggestions welcome...
2023-02-15 22:50:00 -08:00
rogerserper
e46cd3b7db
Google Search API integration with serper.dev (wrapper, tests, docs, … (#909)
Adds Google Search integration with [Serper](https://serper.dev) a
low-cost alternative to SerpAPI (10x cheaper + generous free tier).
Includes documentation, tests and examples. Hopefully I am not missing
anything.

Developers can sign up for a free account at
[serper.dev](https://serper.dev) and obtain an api key.

## Usage

```python
from langchain.utilities import GoogleSerperAPIWrapper
from langchain.llms.openai import OpenAI
from langchain.agents import initialize_agent, Tool

import os
os.environ["SERPER_API_KEY"] = ""
os.environ['OPENAI_API_KEY'] = ""

llm = OpenAI(temperature=0)
search = GoogleSerperAPIWrapper()
tools = [
    Tool(
        name="Intermediate Answer",
        func=search.run
    )
]

self_ask_with_search = initialize_agent(tools, llm, agent="self-ask-with-search", verbose=True)
self_ask_with_search.run("What is the hometown of the reigning men's U.S. Open champion?")
```

### Output
```
Entering new AgentExecutor chain...
 Yes.
Follow up: Who is the reigning men's U.S. Open champion?
Intermediate answer: Current champions Carlos Alcaraz, 2022 men's singles champion.
Follow up: Where is Carlos Alcaraz from?
Intermediate answer: El Palmar, Spain
So the final answer is: El Palmar, Spain

> Finished chain.

'El Palmar, Spain'
```
2023-02-15 22:47:17 -08:00
Ankush Gola
caa8e4742e
Enable streaming for OpenAI LLM (#986)
* Support a callback `on_llm_new_token` that users can implement when
`OpenAI.streaming` is set to `True`
2023-02-14 15:06:14 -08:00
Harrison Chase
88bebb4caa
Harrison/llm integrations (#1039)
Co-authored-by: jped <jonathanped@gmail.com>
Co-authored-by: Justin Torre <justintorre75@gmail.com>
Co-authored-by: Ivan Vendrov <ivan@anthropic.com>
2023-02-13 22:06:25 -08:00
Harrison Chase
ec727bf166
Align table info (#999) (#1034)
Currently the chain is getting the column names and types on the one
side and the example rows on the other. It is easier for the llm to read
the table information if the column name and examples are shown together
so that it can easily understand to which columns do the examples refer
to. For an instantiation of this, please refer to the changes in the
`sqlite.ipynb` notebook.

Also changed `eval` for `ast.literal_eval` when interpreting the results
from the sample row query since it is a better practice.

---------

Co-authored-by: Francisco Ingham <>

---------

Co-authored-by: Francisco Ingham <fpingham@gmail.com>
2023-02-13 21:48:41 -08:00
Enrico Shippole
f30dcc6359
Add GooseAI, CerebriumAI, Petals, ForefrontAI (#981)
Add GooseAI, CerebriumAI, Petals, ForefrontAI
2023-02-13 21:20:19 -08:00
Anton Troynikov
d43d430d86
Chroma persistence (#1028)
This PR adds persistence to the Chroma vector store.

Users can supply a `persist_directory` with any of the `Chroma` creation
methods. If supplied, the store will be automatically persisted at that
directory.

If a user creates a new `Chroma` instance with the same persistence
directory, it will get loaded up automatically. If they use `from_texts`
or `from_documents` in this way, the documents will be loaded into the
existing store.

There is the chance of some funky behavior if the user passes a
different embedding function from the one used to create the collection
- we will make this easier in future updates. For now, we log a warning.
2023-02-13 21:09:06 -08:00
Anton Troynikov
78abd277ff
Chroma in LangChain (#1010)
Chroma is a simple to use, open-source, zero-config, zero setup
vectorstore.

Simply `pip install chromadb`, and you're good to go. 

Out-of-the-box Chroma is suitable for most LangChain workloads, but is
highly flexible. I tested to 1M embs on my M1 mac, with out issues and
reasonably fast query times.

Look out for future releases as we integrate more Chroma features with
LangChain!
2023-02-12 17:43:48 -08:00
Shahriar Tajbakhsh
b7747017d7
Import of declarative_base when SQLAlchemy <1.4 (#883)
In
[pyproject.toml](https://github.com/hwchase17/langchain/blob/master/pyproject.toml),
the expectation is `SQLAlchemy = "^1"`. But, the way `declarative_base`
is imported in
[cache.py](https://github.com/hwchase17/langchain/blob/master/langchain/cache.py)
will only work with SQLAlchemy >=1.4. This PR makes sure Langchain can
be run in environments with SQLAlchemy <1.4
2023-02-10 18:33:47 -08:00
Harrison Chase
c64f98e2bb
Harrison/format agent instructions (#973)
Co-authored-by: Andrew White <white.d.andrew@gmail.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
Co-authored-by: Peng Qu <82029664+pengqu123@users.noreply.github.com>
2023-02-10 10:07:26 -08:00
Harrison Chase
91c6cea227
Harrison/batch embeds (#972)
Co-authored-by: John Dagdelen <jdagdelen@users.noreply.github.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2023-02-10 06:59:50 -08:00
Ankush Gola
bc7e56e8df
Add asyncio support for LLM (OpenAI), Chain (LLMChain, LLMMathChain), and Agent (#841)
Supporting asyncio in langchain primitives allows for users to run them
concurrently and creates more seamless integration with
asyncio-supported frameworks (FastAPI, etc.)

Summary of changes:

**LLM**
* Add `agenerate` and `_agenerate`
* Implement in OpenAI by leveraging `client.Completions.acreate`

**Chain**
* Add `arun`, `acall`, `_acall`
* Implement them in `LLMChain` and `LLMMathChain` for now

**Agent**
* Refactor and leverage async chain and llm methods
* Add ability for `Tools` to contain async coroutine
* Implement async SerpaPI `arun`

Create demo notebook.

Open questions:
* Should all the async stuff go in separate classes? I've seen both
patterns (keeping the same class and having async and sync methods vs.
having class separation)
2023-02-07 21:21:57 -08:00
Harrison Chase
bc53c928fc
Harrison/athropic (#921)
Co-authored-by: Mike Lambert <mlambert@gmail.com>
Co-authored-by: mrbean <sam@you.com>
Co-authored-by: mrbean <43734688+sam-h-bean@users.noreply.github.com>
Co-authored-by: Ivan Vendrov <ivendrov@gmail.com>
2023-02-06 22:29:25 -08:00
Harrison Chase
1e56879d38
Harrison/save faiss (#916)
Co-authored-by: Shrey Joshi <shreyjoshi2004@gmail.com>
2023-02-06 21:44:50 -08:00
Harrison Chase
e2b834e427
Harrison/prompt template prefix (#888)
Co-authored-by: Gabriel Simmons <simmons.gabe@gmail.com>
2023-02-06 19:09:28 -08:00
Harrison Chase
f95cedc443
Harrison/sql rows (#915)
Co-authored-by: Jon Luo <20971593+jzluo@users.noreply.github.com>
2023-02-06 18:56:18 -08:00
Harrison Chase
ba5a2f06b9
Harrison/inference endpoint (#861)
Co-authored-by: Eno Reyes <enoreyes@gmail.com>
2023-02-06 18:14:25 -08:00