Commit Graph

326 Commits

Author SHA1 Message Date
Zander Chase
334c162f16
Add Other File Utilities (#3209)
Add other file utilities, including:
- List Directory
- Search for file
- Move
- Copy
- Remove file

Bundle as toolkit
Add a notebook that connects to the Chat Agent, which somewhat supports
multi-arg input tools
Update original read/write files to return the original dir paths and
better handle unsupported file paths.
Add unit tests
2023-04-28 10:53:37 -07:00
Zander Chase
da7b51455c
Dynamic tool -> single purpose (#3697)
I think the logic of
https://github.com/hwchase17/langchain/pull/3684#pullrequestreview-1405358565
is too confusing.

I prefer this alternative because:
- All `Tool()` implementations by default will be treated the same as
before. No breaking changes.
- Less reliance on pydantic magic
- The decorator (which only is typed as returning a callable) can infer
schema and generate a structured tool
- Either way, the recommended way to create a custom tool is through
inheriting from the base tool
2023-04-28 09:38:41 -07:00
Zander Chase
4654c58f72
Add validation on agent instantiation for multi-input tools (#3681)
Tradeoffs here:
- No lint-time checking for compatibility
- Differs from JS package
- The signature inference, etc. in the base tool isn't simple
- The `args_schema` is optional 

Pros:
- Forwards compatibility retained
- Doesn't break backwards compatibility
- User doesn't have to think about which class to subclass (single base
tool or dynamic `Tool` interface regardless of input)
- No need to change the load_tools, etc. interfaces

Co-authored-by: Hasan Patel <mangafield@gmail.com>
2023-04-27 15:36:11 -07:00
Davis Chase
b807a114e4
Add query parsing unit tests (#3672) 2023-04-27 13:42:12 -07:00
Eugene Yurtsev
708787dddb
Blob: Add validator and use future annotations (#3650)
Minor changes to the Blob schema.

---------

Co-authored-by: Zander Chase <130414180+vowelparrot@users.noreply.github.com>
2023-04-27 14:33:59 -04:00
Eugene Yurtsev
c5a4b4fea1
Suppress duckdb warning in unit tests explicitly (#3653)
This catches the warning raised when using duckdb, asserts that it's as expected.

The goal is to resolve all existing warnings to make unit-testing much stricter.
2023-04-27 14:29:41 -04:00
Eugene Yurtsev
e6c8cce050
Add unit-test to catch changes to required deps (#3662)
This adds a unit test that can catch changes to required dependencies
2023-04-27 13:04:17 -04:00
Eugene Yurtsev
055f58960a
Fix pytest collection warning (#3651)
Fixes a pytest collection warning because the test class starts with the
prefix "Test"
2023-04-27 09:51:43 -07:00
Eugene Yurtsev
5d02010763
Introduce Blob and Blob Loader interface (#3603)
This PR introduces a Blob data type and a Blob loader interface.

This is the first of a sequence of PRs that follows this proposal: 

https://github.com/hwchase17/langchain/pull/2833

The primary goals of these abstractions are:

* Decouple content loading from content parsing code.
* Help remove duplicated content loading code from document loaders.
* Make lazy loading a default for langchain.
2023-04-27 09:45:25 -04:00
Zander Chase
ee670c448e
Persistent Bash Shell (#3580)
Clean up linting and make more idiomatic by using an output parser

---------

Co-authored-by: FergusFettes <fergusfettes@gmail.com>
2023-04-26 15:20:28 -07:00
Roma
2b4e9a3efa
Add unit test for _merge_splits function (#3513)
This commit adds a new unit test for the _merge_splits function in the
text splitter. The new test verifies that the function merges text into
chunks of the correct size and overlap, using a specified separator. The
test passes on the current implementation of the function.
2023-04-25 10:02:59 -07:00
Mindaugas Sharskus
a4d85f7fd5
[Fix #3365]: Changed regex to cover new line before action serious (#3367)
Fix for: [Changed regex to cover new line before action
serious.](https://github.com/hwchase17/langchain/issues/3365)
---

This PR fixes the issue where `ValueError: Could not parse LLM output:`
was thrown on what seems to be valid input.

Changed regex to cover new lines before action serious (after the
keywords "Action:" and "Action Input:").

regex101: https://regex101.com/r/CXl1kB/1

---------

Co-authored-by: msarskus <msarskus@cisco.com>
2023-04-24 22:05:31 -07:00
Davis Chase
b2564a6391
fix #3884 (#3475)
fixes mar bug #3384
2023-04-24 19:54:15 -07:00
Zander Chase
49122a96e7
Structured Tool Bugfixes (#3324)
- Proactively raise error if a tool subclasses BaseTool, defines its
own schema, but fails to add the type-hints
- fix the auto-inferred schema of the decorator to strip the
unneeded virtual kwargs from the schema dict

Helps avoid silent instances of #3297
2023-04-24 09:58:29 -07:00
Davis Chase
46542dc774
Contextual compression retriever (#2915)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-04-20 17:01:14 -07:00
Harrison Chase
9a0356d276
Harrison/file chat history (#3198)
Co-authored-by: Young Lee <joybro201@gmail.com>
2023-04-19 21:05:20 -07:00
Zander Chase
4adfd790f0
Update File Management Tools to Include Root Directory (#3112)
- Permit the specification of a `root_dir` to the read/write file tools
to specify a working directory
- Add validation for attempts to read/write outside the directory (e.g.,
through `../../` or symlinks or `/abs/path`'s that don't lie in the
correct path)
- Add some tests for all


One question is whether we should make a default root directory for
these? tradeoffs either way
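
A minimal sketch of the kind of check described above, assuming a hypothetical helper named `resolve_in_root` (this is not the tools' actual code): resolve the requested path against the root directory and reject anything that ends up outside it.

```python
from pathlib import Path


def resolve_in_root(root_dir: str, user_path: str) -> Path:
    """Resolve user_path against root_dir; raise ValueError if it escapes.

    `resolve_in_root` is a hypothetical name used only for illustration.
    """
    root = Path(root_dir).resolve()
    # Joining with an absolute path discards the root, and resolve() follows
    # symlinks and collapses "..", so all three escape routes end up as a
    # path that is not located under the root.
    candidate = (root / user_path).resolve()
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"Path {user_path!r} is outside of {root_dir!r}")
    return candidate
```

With this shape, `resolve_in_root(root, "notes/a.txt")` returns a path under the root, while `"../../etc/passwd"` or `"/etc/passwd"` raise `ValueError`.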
2023-04-19 16:46:10 -07:00
engkheng
dbbc340f25
Validate input_variables when using jinja2 templates (#3140)
`langchain.prompts.PromptTemplate` and
`langchain.prompts.FewShotPromptTemplate` do not validate
`input_variables` when initialized as `jinja2` template.

```python
# Using langchain v0.0.144
from langchain.prompts import PromptTemplate

template = """\
Your variable: {{ foo }}
{% if bar %}
You just set bar boolean variable to true
{% endif %}
"""

# Missing variable, should raise ValueError
prompt_template = PromptTemplate(template=template, 
                                 input_variables=["bar"], 
                                 template_format="jinja2", 
                                 validate_template=True)

# Extra variable, should raise ValueError
prompt_template = PromptTemplate(template=template, 
                                 input_variables=["bar", "foo", "extra", "thing"], 
                                 template_format="jinja2", 
                                 validate_template=True)
```
2023-04-19 16:18:32 -07:00
Zander Chase
90ef705ced
Update Tool Input (#3103)
- Remove dynamic model creation in the `args()` property. _Only infer
for the decorator (and add an argument to NOT infer if someone wishes to
only pass as a string)_
- Update the validation example to make it less likely to be
misinterpreted as a "safe" way to run a repl


There is one example of "Multi-argument tools" in the custom_tools.ipynb
from yesterday, but we could add more. The output parsing for the base
MRKL agent hasn't been adapted to handle structured args at this point
in time

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-04-18 18:18:33 -07:00
Harrison Chase
aad0a498ac
Harrison/output error (#3094)
Co-authored-by: yummydum <sumita@nowcast.co.jp>
2023-04-18 08:59:56 -07:00
Harrison Chase
db968284f8
tools refactor (#2961)
Co-authored-by: vowelparrot <130414180+vowelparrot@users.noreply.github.com>
2023-04-17 21:35:29 -07:00
engkheng
19febc77d6
Support inference of input_variables from jinja2 template (#3013)
`langchain.prompts.PromptTemplate` is unable to infer `input_variables`
from jinja2 template.

```python
# Using langchain v0.0.141
from langchain.prompts import PromptTemplate

template_string = """\
Hello world
Your variable: {{ var }}
{# This will not get rendered #}

{% if verbose %}
Congrats! You just turned on verbose mode and got extra messages!
{% endif %}
"""

template = PromptTemplate.from_template(template_string, template_format="jinja2")
print(template.input_variables) # Output ['# This will not get rendered #', '% endif %', '% if verbose %']
```

---------

Co-authored-by: engkheng <ongengkheng929@example.com>
2023-04-17 20:31:03 -07:00
Nuno Campos
dac32c59e5
Nc/combining output parser (#3014)
Co-authored-by: vowelparrot <130414180+vowelparrot@users.noreply.github.com>
2023-04-17 20:29:53 -07:00
Davis Chase
19c85aa990
Factor out doc formatting and add validation (#3026)
@cnhhoang850 slightly more generic fix for #2944, works for whatever the
expected metadata keys are, not just `source`
2023-04-17 20:28:01 -07:00
vowelparrot
99c0382209
Generative Characters (#2859)
Add a time-weighted memory retriever and a notebook that approximates a
Generative Agent from https://arxiv.org/pdf/2304.03442.pdf


The "daily plan" components are removed for now since they are less
useful without a virtual world, but the memory is an interesting
component to build off.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-04-16 21:41:00 -07:00
Harrison Chase
e12e00df12
use output parsers in agents (#2987) 2023-04-16 13:15:21 -07:00
vowelparrot
5ca7ce77cd
Remove pythonrepl from LLM-MathChain (#2943)
Use numexpr evaluate instead of the python REPL to avoid malicious code
injection.

Tested against the (limited) math dataset and got the same score as
before.

For more permissive tools (like the REPL tool itself), other approaches
ought to be provided (some combination of Sanitizer + Restricted python
+ unprivileged-docker + ...), but for a calculator tool, only
mathematical expressions should be permitted.

See https://github.com/hwchase17/langchain/issues/814
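
To illustrate the "only mathematical expressions" idea without depending on numexpr, here is a standard-library stand-in (not what this commit uses): a small evaluator that walks the parsed expression and rejects everything except numbers and basic arithmetic. The name `evaluate_arithmetic` is hypothetical.

```python
import ast
import operator

# Whitelisted operators; anything else (calls, names, attributes) is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def evaluate_arithmetic(expression: str) -> float:
    """Evaluate a purely arithmetic expression; raise ValueError otherwise."""

    def _eval(node: ast.AST) -> float:
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression: {expression!r}")

    # mode="eval" accepts a single expression only, never statements.
    return _eval(ast.parse(expression, mode="eval").body)
```

An input such as `__import__('os').system('ls')` parses as a call node and is rejected. Note that this sketch does not bound the cost of large exponents.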
2023-04-16 08:50:32 -07:00
dev2049
36aa7f30e4
Move PythonRepl -> langchain.utilities (#2917) 2023-04-15 10:50:25 -07:00
Harrison Chase
705596b46a
Harrison/fix create sql agent (#2870)
Co-authored-by: Timothé Pearce <timothe.pearce@gmail.com>
2023-04-13 22:07:58 -07:00
KullTC
802363eb6a
Remove print statement from test (#2809)
Remove unnecessary print statement.
2023-04-13 09:31:48 -07:00
KullTC
64596b23b9
Return output of PythonAstREPLTool when falling back to exec() (#2780)
When the code run by the PythonAstREPLTool contains multiple statements,
it will fall back to exec() instead of using eval(). With this change, it
will also return the output of the code in the same way the
PythonREPLTool does.
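
A rough sketch of the behavior described, under the assumption that "output" means captured stdout; `run_snippet` is a hypothetical helper, not the tool's real implementation.

```python
import contextlib
import io


def run_snippet(code: str, namespace: dict) -> str:
    """Evaluate a single expression, or fall back to exec() for statements.

    Running eval()/exec() on untrusted input is unsafe; this only
    illustrates the control flow.
    """
    try:
        # A single expression can be evaluated directly.
        return str(eval(code, namespace))
    except SyntaxError:
        # Multiple statements are not valid for eval(), so use exec() and
        # capture whatever the code prints.
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)
        return buffer.getvalue()
```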
2023-04-12 21:22:46 -07:00
Joshua Snyder
59d054308c
Add type inference for output parsers (#2769)
Currently, the output type of a number of OutputParser's `parse` methods
is `Any` when it can in fact be inferred.

This PR makes BaseOutputParser use a generic type and fixes the output
types of the following parsers:
- `PydanticOutputParser`
- `OutputFixingParser`
- `RetryOutputParser`
- `RetryWithErrorOutputParser`

The output of the `StructuredOutputParser` is corrected from `BaseModel`
to `Any` since there are no type guarantees provided by the parser.
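
The generic-type approach can be sketched roughly as follows. `BaseParser` and `CommaListParser` are illustrative names, not the classes touched by this PR.

```python
from abc import ABC, abstractmethod
from typing import Generic, List, TypeVar

T = TypeVar("T")


class BaseParser(ABC, Generic[T]):
    """Hypothetical base class: the type parameter is what parse() returns."""

    @abstractmethod
    def parse(self, text: str) -> T:
        """Turn raw model output into a value of type T."""


class CommaListParser(BaseParser[List[str]]):
    """A type checker can now infer List[str] instead of Any."""

    def parse(self, text: str) -> List[str]:
        return [item.strip() for item in text.split(",") if item.strip()]
```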

Fixes issue #2715
2023-04-12 09:12:20 -07:00
Abhik Singla
955bd2e1db
Fixed Ast Python Repl for Chatgpt multiline commands (#2406)
Resolves issue https://github.com/hwchase17/langchain/issues/2252

---------

Co-authored-by: Abhik Singla <abhiksingla@microsoft.com>
2023-04-10 21:25:03 -07:00
Ankush Gola
b82cbd1be0
Use run and arun in place of combine_docs and acombine_docs (#2635)
`combine_docs` does not go through the standard chain call path which
means that chain callbacks won't be triggered, meaning QA chains won't
be traced properly, this fixes that.

Also fix several errors in the chat_vector_db notebook
2023-04-09 18:47:59 -07:00
Vashisht Madhavan
aa439ac2ff
Adding an in-context QA evaluation chain + chain of thought reasoning chain for improved accuracy (#2444)
Right now, eval chains require an answer for every question. It's
cumbersome to collect this ground truth, so this PR gets around the issue
with two things:

* Adding a context param in `ContextQAEvalChain` and simply evaluating
if the question is answered accurately from context
* Adding chain of thought explanation prompting to improve the accuracy
of this w/o GT.

This also gets to feature parity with openai/evals which has the same
contextual eval w/o GT.

TODO in follow-up:
* Better prompt inheritance. No need for a separate prompt for CoT
reasoning. How can we merge them together?

---------

Co-authored-by: Vashisht Madhavan <vashishtmadhavan@Vashs-MacBook-Pro.local>
2023-04-06 22:32:41 -07:00
William FH
f240651bd8
Add Request body (#2507)
This still doesn't handle the following

- non-JSON media types
- anyOf, allOf, oneOf's

And doesn't emit the typescript definitions for referred types yet, but
that can be saved for a separate PR.

Also, we could have better support for Swagger 2.0 specs and OpenAPI
3.0.3 (can use the same lib for the latter); recommend offline conversion
for now.
2023-04-06 13:02:42 -07:00
Zach Jones
13d1df2140
Feature: AgentExecutor execution time limit (#2399)
`AgentExecutor` already has support for limiting the number of
iterations. But the amount of time taken for each iteration can vary
quite a bit, so it is difficult to place limits on the execution time.
This PR adds a new field `max_execution_time` to the `AgentExecutor`
model. When called asynchronously, the agent loop is wrapped in an
`asyncio.timeout()` context which triggers the early stopping response
if the time limit is reached. When called synchronously, the agent loop
checks for both the max_iteration limit and the time limit after each
iteration.

When used asynchronously `max_execution_time` gives really tight control
over the max time for an execution chain. When used synchronously, the
chain can unfortunately exceed max_execution_time, but it still gives
more control than trying to estimate the number of max_iterations needed
to cap the execution time.
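
A simplified sketch of the two modes described above. The names `run_sync`, `run_async`, and `EARLY_STOP` are hypothetical, and `asyncio.wait_for` is used here as a stand-in for `asyncio.timeout()`, which requires Python 3.11+.

```python
import asyncio
import time
from typing import Awaitable, Callable

EARLY_STOP = "stopped early"  # placeholder for the early stopping response


def run_sync(
    step: Callable[[], bool], max_iterations: int, max_execution_time: float
) -> str:
    """Check both limits between iterations; a slow step can overshoot."""
    start = time.monotonic()
    iterations = 0
    while (
        iterations < max_iterations
        and time.monotonic() - start < max_execution_time
    ):
        if step():  # step() returns True once the agent is finished
            return "finished"
        iterations += 1
    return EARLY_STOP


async def run_async(
    agent_loop: Callable[[], Awaitable[str]], max_execution_time: float
) -> str:
    """Cancel the whole loop as soon as the time limit is reached."""
    try:
        return await asyncio.wait_for(agent_loop(), timeout=max_execution_time)
    except asyncio.TimeoutError:
        return EARLY_STOP
```

The asymmetry matches the description: the async version interrupts a running step, while the sync version can only notice the deadline after a step returns.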

---------

Co-authored-by: Zachary Jones <zjones@zetaglobal.com>
2023-04-06 12:54:32 -07:00
leo-gan
fd69cc7e42
Removed duplicate BaseModel dependencies (#2471)
Removed duplicate BaseModel dependencies in class inheritances.
Also, sorted imports by `isort`.
2023-04-06 12:45:16 -07:00
Harrison Chase
1e19e004af
Harrison/openapi spec (#2474)
Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>
2023-04-06 09:47:37 -07:00
Harrison Chase
26314d7004
Harrison/openapi parser (#2461)
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
2023-04-05 22:19:09 -07:00
Ankush Gola
4d730a9bbc
improve AsyncCallbackManager (#2410) 2023-04-05 09:31:42 +02:00
Harrison Chase
c7b083ab56
bump version to 131 (#2391) 2023-04-04 07:21:50 -07:00
Harrison Chase
fe1eb8ca5f
requests wrapper (#2367) 2023-04-03 21:57:19 -07:00
Shrined
10dab053b4
Add Enum for agent types (#2321)
This pull request adds an enum class for the various types of agents
used in the project, located in the `agent_types.py` file. Currently,
the project is using hardcoded strings for the initialization of these
agents, which can lead to errors and make the code harder to maintain.
With the introduction of the new enums, the code will be more readable
and less error-prone.

The new enum members include:

- ZERO_SHOT_REACT_DESCRIPTION
- REACT_DOCSTORE
- SELF_ASK_WITH_SEARCH
- CONVERSATIONAL_REACT_DESCRIPTION
- CHAT_ZERO_SHOT_REACT_DESCRIPTION
- CHAT_CONVERSATIONAL_REACT_DESCRIPTION

In this PR, I have also replaced the hardcoded strings with the
appropriate enum members throughout the codebase, ensuring a smooth
transition to the new approach.
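
For illustration only, an enum of this kind might look like the sketch below. The class name and the `REACT_DOCSTORE` value are assumptions; only the `zero-shot-react-description` string appears elsewhere in this history.

```python
from enum import Enum


class AgentType(str, Enum):
    """Illustrative subset; mixing in str keeps plain-string callers working."""

    ZERO_SHOT_REACT_DESCRIPTION = "zero-shot-react-description"
    REACT_DOCSTORE = "react-docstore"  # assumed value


# Lookup by the old hardcoded string still resolves to the enum member,
# and a misspelled string raises ValueError instead of failing later.
agent = AgentType("zero-shot-react-description")
```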
2023-04-03 21:56:20 -07:00
Harrison Chase
acfda4d1d8
Harrison/multiline commands (#2280)
Co-authored-by: Marc Päpper <mpaepper@users.noreply.github.com>
2023-04-01 12:54:06 -07:00
leo-gan
579ad85785
skip unit tests that fail in Windows (#2238)
Issue #2174
Several unit tests fail in Windows.
Added pytest attribute to skip these tests automatically.
2023-04-01 12:52:21 -07:00
Harrison Chase
2d3918c152
make requests more general (#2209) 2023-03-30 20:41:56 -07:00
Harrison Chase
5c907d9998
Harrison/base agent without docs (#2166) 2023-03-29 22:11:25 -07:00
Harrison Chase
f5a4bf0ce4
remove prep (#2136)
agents should be stateless or async stuff may not work
2023-03-29 14:38:21 -07:00
Harrison Chase
e2c26909f2
Harrison/memory check (#2119)
Co-authored-by: JIAQIA <jqq1716@gmail.com>
2023-03-28 15:40:36 -07:00
Harrison Chase
f281033362
rm pandas dependency (#2102) 2023-03-28 08:38:19 -07:00
Harrison Chase
9e74df2404
Fix issue#1645: Parse llm_output even there's newline (#2092) (#2099)
Fix issue#1645: Parse either whitespace or newline after 'Action Input:'
in llm_output in mrkl agent.
Unittests added accordingly.
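
A hedged sketch of what "either whitespace or newline" can look like in practice. The pattern and the `parse_action` helper below are illustrative, not the regex used in the mrkl agent.

```python
import re
from typing import Tuple

# \s* matches spaces and newlines alike, so the value may start on the
# same line as the keyword or on the next one.
ACTION_PATTERN = re.compile(
    r"Action\s*:\s*(.*?)\s*Action Input\s*:\s*(.*)", re.DOTALL
)


def parse_action(llm_output: str) -> Tuple[str, str]:
    """Return (action, action_input) or raise ValueError if not found."""
    match = ACTION_PATTERN.search(llm_output)
    if match is None:
        raise ValueError(f"Could not parse LLM output: {llm_output!r}")
    return match.group(1).strip(), match.group(2).strip()
```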

Co-authored-by: ₿ingnan.ΞTH <brillliantz@outlook.com>
2023-03-28 08:14:09 -07:00
b7f392fdd6
[agent_executor] convenience func: lookup tool by name (#2001)
A quick convenience function to lookup a tool by name

Co-authored-by: blob42 <spike@w530>
2023-03-27 23:10:34 -07:00
Harrison Chase
30e3b31b04
Harrison/document cleanup (#2062)
Co-authored-by: Delip Rao <delip@users.noreply.github.com>
2023-03-27 16:32:55 -07:00
Daniel Chalef
6598beacdb
PydanticOutputParser unit test (#2047)
Unit test for PydanticOutputParser

---------

Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
2023-03-27 14:32:56 -07:00
Harrison Chase
705431aecc
big docs refactor (#1978)
Co-authored-by: Ankush Gola <ankush.gola@gmail.com>
2023-03-26 19:49:46 -07:00
Harrison Chase
ce5d97bcb3
Harrison/guarded output parser (#1804)
Co-authored-by: jerwelborn <jeremy.welborn@gmail.com>
2023-03-21 22:07:23 -07:00
Matt Tucker
a92344f476
Use regex match for bash process error output test assertion. (#1837)
I was getting the same issue reported in #1339 by
[MacYang555](https://github.com/MacYang555) when running the test suite
on my Mac. I implemented the fix they suggested to use a regex match in
the output assertion for the scenario under test.

Resolves #1339
2023-03-21 09:06:52 -07:00
Jon Luo
0a1b1806e9
sql: do not hard code the LIMIT clause in the table_info section (#1563)
Seeing a lot of issues in Discord in which the LLM is not using the
correct LIMIT clause for different SQL dialects, i.e., it's using `LIMIT`
for mssql instead of `TOP`, or instead of `ROWNUM` for Oracle, etc.
I think this could be due to us specifying the LIMIT statement in the
example rows portion of `table_info`. So the LLM is seeing the `LIMIT`
statement used in the prompt.
Since we can't specify each dialect's method here, I think it's fine to
just replace the `SELECT... LIMIT 3;` statement with `3 rows from
table_name table:`, and wrap everything in a block comment directly
following the `CREATE` statement. The Rajkumar et al paper wrapped the
example rows and `SELECT` statement in a block comment as well anyway.
Thoughts @fpingham?
2023-03-13 23:08:27 -07:00
Luis
562d9891ea
Add regex dict: (#1616)
This class enables us to send a dictionary containing an output key and
the expected format, which in turn allows us to retrieve the result of
the matching formats and extract specific information from it.

To exclude irrelevant information from our return dictionary, we can
prompt the LLM to use a specific command that notifies us when it
doesn't know the answer. We refer to this variable as the
"no_update_value".

Regarding the updated regular expression pattern
(r"{}:\s?([^.'\n']*).?"), it enables us to retrieve a format as 'Output
Key':'value'.

We have improved the regex by adding an optional space between ':' and
'value' with "\s?", and by excluding periods and line breaks from the
matches using "[^.'\n']*".
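
As a rough illustration of the description above, the following sketch applies the quoted pattern once per output key. The function name and argument names are assumptions, not the class's real interface.

```python
import re
from typing import Dict, Optional

# Pattern as quoted above; "{}" is filled with the expected format label.
PATTERN = r"{}:\s?([^.'\n']*).?"


def parse_with_regex_dict(
    text: str,
    output_key_to_format: Dict[str, str],
    no_update_value: Optional[str] = None,
) -> Dict[str, str]:
    """Extract one value per output key, skipping 'no update' answers."""
    result: Dict[str, str] = {}
    for output_key, expected_format in output_key_to_format.items():
        regex = PATTERN.format(re.escape(expected_format))
        matches = re.findall(regex, text)
        if len(matches) != 1:
            raise ValueError(
                f"Expected exactly one match for {expected_format!r}, "
                f"found {len(matches)}"
            )
        if no_update_value is not None and matches[0] == no_update_value:
            continue  # the LLM signalled that it does not know the answer
        result[output_key] = matches[0]
    return result
```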
2023-03-13 23:05:39 -07:00
Harrison Chase
aed9f9febe
Harrison/return intermediate (#1633)
Co-authored-by: Mario Kostelac <mario@intercom.io>
2023-03-13 07:54:29 -07:00
yakigac
acd86d33bc
Add read only shared memory (#1491)
Provide shared memory capability for the Agent.
Inspired by #1293.

## Problem

If both Agent and Tools (i.e., LLMChain) use the same memory, both of
them will save the context. It can be annoying in some cases.


## Solution

Create a memory wrapper that ignores the save and clear, thereby
preventing updates from Agent or Tools.
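
The wrapper idea can be sketched like this. `ListMemory` and `ReadOnlyMemory` are toy classes with a made-up interface, used only to show the pattern of delegating reads and ignoring writes.

```python
from typing import List


class ListMemory:
    """Toy memory with a hypothetical load/save/clear interface."""

    def __init__(self) -> None:
        self._entries: List[str] = []

    def load(self) -> List[str]:
        return list(self._entries)

    def save(self, entry: str) -> None:
        self._entries.append(entry)

    def clear(self) -> None:
        self._entries.clear()


class ReadOnlyMemory:
    """Wraps a memory: reads pass through, save and clear do nothing."""

    def __init__(self, memory: ListMemory) -> None:
        self._memory = memory

    def load(self) -> List[str]:
        return self._memory.load()

    def save(self, entry: str) -> None:
        pass  # intentionally ignored so the wrapped memory is not updated

    def clear(self) -> None:
        pass  # intentionally ignored
```

The agent would keep the writable memory, and each tool would receive the read-only wrapper, so the context is saved once.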
2023-03-12 09:34:36 -07:00
Harrison Chase
c9b5a30b37
move output parsing (#1605) 2023-03-11 16:41:03 -08:00
Harrison Chase
f95d551f7a
Harrison/shallow metadata (#1599)
Co-authored-by: Jesse Zhang <jessetanzhang@gmail.com>
2023-03-11 09:18:25 -08:00
Harrison Chase
9f78717b3c
Harrison/callbacks (#1587) 2023-03-10 12:53:09 -08:00
Harrison Chase
cc423f40f1
Harrison/youtube loader (#1545)
Co-authored-by: Julian Wustl <57504258+Julianwustl@users.noreply.github.com>
2023-03-08 20:53:27 -08:00
Harrison Chase
7ade419a0e
allow passing of messages into prompt template (#1505) 2023-03-07 21:10:12 -08:00
Harrison Chase
064741db58
Harrison/fix text splitter (#1511)
Co-authored-by: ajaysolanky <ajsolanky@gmail.com>
Co-authored-by: Ajay Solanky <ajaysolanky@saw-l14668307kd.myfiosgateway.com>
2023-03-07 15:42:28 -08:00
Harrison Chase
7bec461782
Harrison/memory refactor (#1478)
moves memory to own module, factors out common stuff
2023-03-07 07:59:37 -08:00
Harrison Chase
0e21463f07
(rfc) chat models (#1424)
Co-authored-by: Ankush Gola <ankush.gola@gmail.com>
2023-03-06 08:34:24 -08:00
Harrison Chase
63a5614d23
Harrison/simple memory (#1435)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
2023-03-04 08:15:52 -08:00
Harrison Chase
1cd8996074
Harrison/summarizer chain (#1356)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
2023-03-01 20:59:07 -08:00
Ankush Gola
82baecc892
Add a SQL agent for interacting with SQL Databases and JSON Agent for interacting with large JSON blobs (#1150)
This PR adds 

* `ZeroShotAgent.as_sql_agent`, which returns an agent for interacting
with a sql database. This builds off of `SQLDatabaseChain`. The main
advantages are 1) answering general questions about the db, 2) access to
a tool for double checking queries, and 3) recovering from errors
* `ZeroShotAgent.as_json_agent` which returns an agent for interacting
with json blobs.
* Several examples in notebooks

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-02-28 19:44:39 -08:00
Harrison Chase
786852e9e6
partial variables (#1308) 2023-02-28 08:40:35 -08:00
Harrison Chase
b7708bbec6
rfc: callback changes (#1165)
conceptually, no reason a tool should know what an "agent action" is

unless any objections, can change in all callback handlers
2023-02-20 22:54:15 -08:00
CG80499
af8f5c1a49
Added constitutional chain. (#1147)
- Added self-critique constitutional chain based on this
[paper](https://www.anthropic.com/constitutional.pdf).
2023-02-18 19:31:51 -08:00
Ankush Gola
7b5e160d28
Make Tools own model, add ToolKit Concept (#1095)
Follow-up of @hinthornw's PR:

- Migrate the Tool abstraction to a separate file (`BaseTool`).
- `Tool` implementation of `BaseTool` takes in function and coroutine to
more easily maintain backwards compatibility
- Add a Toolkit abstraction that can own the generation of tools around
a shared concept or state

---------

Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Francisco Ingham <fpingham@gmail.com>
Co-authored-by: Dhruv Anand <105786647+dhruv-anand-aintech@users.noreply.github.com>
Co-authored-by: cragwolfe <cragcw@gmail.com>
Co-authored-by: Anton Troynikov <atroyn@users.noreply.github.com>
Co-authored-by: Oliver Klingefjord <oliver@klingefjord.com>
Co-authored-by: William Fu-Hinthorn <whinthorn@Williams-MBP-3.attlocal.net>
Co-authored-by: Bruno Bornsztein <bruno.bornsztein@gmail.com>
2023-02-18 13:40:43 -08:00
Francisco Ingham
3f29742adc
Sql alchemy commands used in table info (#1135)
This approach has several advantages:

* it improves the readability of the code
* removes incompatibilities between SQL dialects
* fixes a bug with `datetime` values in rows and `ast.literal_eval`

Huge thanks and credits to @jzluo for finding the weaknesses in the
current approach and for the thoughtful discussion on the best way to
implement this.

---------

Co-authored-by: Francisco Ingham <>
Co-authored-by: Jon Luo <20971593+jzluo@users.noreply.github.com>
2023-02-18 10:58:29 -08:00
Harrison Chase
5e10e19bfe
Harrison/align table (#1081)
Co-authored-by: Francisco Ingham <fpingham@gmail.com>
2023-02-15 23:53:37 -08:00
Ankush Gola
caa8e4742e
Enable streaming for OpenAI LLM (#986)
* Support a callback `on_llm_new_token` that users can implement when
`OpenAI.streaming` is set to `True`
2023-02-14 15:06:14 -08:00
Harrison Chase
ec727bf166
Align table info (#999) (#1034)
Currently the chain is getting the column names and types on the one
side and the example rows on the other. It is easier for the llm to read
the table information if the column name and examples are shown together
so that it can easily understand which columns the examples refer to.
For an instantiation of this, please refer to the changes in the
`sqlite.ipynb` notebook.

Also replaced `eval` with `ast.literal_eval` when interpreting the results
from the sample row query since it is a better practice.

---------

Co-authored-by: Francisco Ingham <>

---------

Co-authored-by: Francisco Ingham <fpingham@gmail.com>
2023-02-13 21:48:41 -08:00
Shahriar Tajbakhsh
b7747017d7
Import of declarative_base when SQLAlchemy <1.4 (#883)
In
[pyproject.toml](https://github.com/hwchase17/langchain/blob/master/pyproject.toml),
the expectation is `SQLAlchemy = "^1"`. But, the way `declarative_base`
is imported in
[cache.py](https://github.com/hwchase17/langchain/blob/master/langchain/cache.py)
will only work with SQLAlchemy >=1.4. This PR makes sure Langchain can
be run in environments with SQLAlchemy <1.4
2023-02-10 18:33:47 -08:00
Ankush Gola
bc7e56e8df
Add asyncio support for LLM (OpenAI), Chain (LLMChain, LLMMathChain), and Agent (#841)
Supporting asyncio in langchain primitives allows for users to run them
concurrently and creates more seamless integration with
asyncio-supported frameworks (FastAPI, etc.)

Summary of changes:

**LLM**
* Add `agenerate` and `_agenerate`
* Implement in OpenAI by leveraging `client.Completions.acreate`

**Chain**
* Add `arun`, `acall`, `_acall`
* Implement them in `LLMChain` and `LLMMathChain` for now

**Agent**
* Refactor and leverage async chain and llm methods
* Add ability for `Tools` to contain async coroutine
* Implement async SerpAPI `arun`

Create demo notebook.
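
To show why async methods help with concurrency, here is a small standalone sketch. `fake_agenerate` is a hypothetical stand-in for an async LLM call and simply sleeps instead of contacting an API.

```python
import asyncio
from typing import List


async def fake_agenerate(prompt: str) -> str:
    """Hypothetical stand-in for an async LLM call."""
    await asyncio.sleep(0.1)  # simulates network latency
    return f"response to {prompt!r}"


async def generate_concurrently(prompts: List[str]) -> List[str]:
    # gather() runs the coroutines concurrently and keeps the input order,
    # so total time is roughly one call's latency rather than the sum.
    return list(await asyncio.gather(*(fake_agenerate(p) for p in prompts)))


results = asyncio.run(generate_concurrently(["a", "b"]))
```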

Open questions:
* Should all the async stuff go in separate classes? I've seen both
patterns (keeping the same class and having async and sync methods vs.
having class separation)
2023-02-07 21:21:57 -08:00
Harrison Chase
e2b834e427
Harrison/prompt template prefix (#888)
Co-authored-by: Gabriel Simmons <simmons.gabe@gmail.com>
2023-02-06 19:09:28 -08:00
Harrison Chase
f95cedc443
Harrison/sql rows (#915)
Co-authored-by: Jon Luo <20971593+jzluo@users.noreply.github.com>
2023-02-06 18:56:18 -08:00
Harrison Chase
93a091cfb8
Optionally return shell output on incorrect command (#894) (#899)
This allows the LLM to correct its previous command by looking at the
error message output to the shell.

Additionally, this uses subprocess.run because that is now recommended
over subprocess.check_output:

https://docs.python.org/3/library/subprocess.html#using-the-subprocess-module
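
A minimal sketch of the behavior described, assuming a POSIX shell. The function name and the `return_err_output` flag are illustrative and may not match the actual implementation.

```python
import subprocess


def run_command(command: str, return_err_output: bool = False) -> str:
    """Run a shell command and return its output.

    On failure, optionally return what the command printed so that the
    caller (for example an LLM) can see the error and correct itself.
    """
    try:
        completed = subprocess.run(
            command,
            shell=True,
            check=True,  # raise CalledProcessError on non-zero exit
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into stdout
        )
    except subprocess.CalledProcessError as error:
        if return_err_output:
            return error.stdout.decode()
        return str(error)
    return completed.stdout.decode()
```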

Co-authored-by: Amos Ng <me@amos.ng>
2023-02-06 12:46:16 -08:00
Harrison Chase
a2b699dcd2
prompt template from string (#884) 2023-02-04 17:04:58 -08:00
Harrison Chase
8df6b68093
fix length based example selector (#862) 2023-02-02 22:06:56 -08:00
Jason Liu
54f9e4287f
Pass kwargs from initialize_agent into agent classmethod (#799)
# Problem
I noticed that in order to change the prefix of the prompt in the
`zero-shot-react-description` agent,
we had to dig through strings buried deep in the agent's attributes.
It requires the user to inspect a long chain of attributes and classes.

`initialize_agent -> AgentExecutor -> Agent -> LLMChain -> Prompt from
Agent.create_prompt`

``` python
agent = initialize_agent(
    tools=tools,
    llm=fake_llm,
    agent="zero-shot-react-description"
)
prompt_str = agent.agent.llm_chain.prompt.template
new_prompt_str = change_prefix(prompt_str)
agent.agent.llm_chain.prompt.template = new_prompt_str
```

# Implemented Solution

`initialize_agent` accepts `**kwargs` but passes them to `AgentExecutor`
and not to `ZeroShotAgent`. By simply giving the kwargs to the agent class
methods, we can support changing the prefix and suffix for one agent
while allowing future agents to take advantage of `initialize_agent`.


``` python
agent = initialize_agent(
    tools=tools,
    llm=fake_llm,
    agent="zero-shot-react-description",
    agent_kwargs={"prefix": prefix, "suffix": suffix}
)
```

To be fair, this was before finding docs around custom agents here:
https://langchain.readthedocs.io/en/latest/modules/agents/examples/custom_agent.html?highlight=custom%20#custom-llmchain
but I find that my use case just needed to change the prefix a little.


# Changes

* Pass kwargs to Agent class method
* Added a test to check suffix and prefix

---------

Co-authored-by: Jason Liu <jason@jxnl.coA>
2023-01-30 14:54:09 -08:00
Roy Williams
6086292252
Centralize logic for loading from LangChainHub, add ability to pin dependencies (#805)
It's generally considered to be a good practice to pin dependencies to
prevent surprise breakages when a new version of a dependency is
released. This commit adds the ability to pin dependencies when loading
from LangChainHub.

Centralizing this logic and using urllib fixes an issue identified by
some Windows users, highlighted in this video -
https://youtu.be/aJ6IQUh8MLQ?t=537
2023-01-30 14:52:17 -08:00
Harrison Chase
1ad7973cc6
Harrison/tool decorator (#790)
Co-authored-by: Jason Liu <jxnl@users.noreply.github.com>
Co-authored-by: Jason Liu <jason@jxnl.coA>
2023-01-28 18:26:24 -08:00
Harrison Chase
248c297f1b
Sample row in table info for SQLDatabase (#769) (#782)
The agents usually benefit from understanding what the data looks like
to be able to filter effectively. Sending just one row in the table info
allows the agent to understand the data before querying and get better
results.

---------

Co-authored-by: Francisco Ingham <>

---------

Co-authored-by: Francisco Ingham <fpingham@gmail.com>
2023-01-28 13:37:07 -08:00
Amos Ng
6ad360bdef
Suggestions for better debugging (#765)
Please feel free to disregard any changes you disagree with
2023-01-28 08:05:20 -08:00
Ankush Gola
57609845df
add tracing support to langchain (#741)
* add implementations of `BaseCallbackHandler` to support tracing:
`SharedTracer` which is thread-safe and `Tracer` which is not and is
meant to be used locally.
* Tracers persist runs to locally running `langchain-server`

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-01-26 17:38:13 -08:00
Amos Ng
fa6826e417
Fix sqlalchemy warnings when running tests (#733)
This has been bugging me when running my own tests that call langchain
methods :P
2023-01-25 07:14:07 -08:00
scadEfUr
e3df8ab6dc
move hyde into chains (#728)
Co-authored-by: scadEfUr <>
2023-01-24 22:23:32 -08:00
Harrison Chase
0ffeabd14f
Harrison/serialize llm chain (#671) 2023-01-24 21:36:19 -08:00
Harrison Chase
cbc146720b
verbose flag (#683) 2023-01-22 12:44:14 -08:00
Harrison Chase
a2eeaf3d43
strip whitespace (#680) 2023-01-21 16:03:48 -08:00
Harrison Chase
54d7f1c933
fix caching (#658) 2023-01-19 15:33:45 -08:00
Harrison Chase
1ac3319e45
simplify parsing of the final answer (#621) 2023-01-15 16:39:27 -08:00
Harrison Chase
1511606799
Harrison/fix splitting (#563)
fix issue where text splitting could possibly create empty docs
2023-01-08 19:19:32 -08:00
Harrison Chase
1192cc0767
smart text splitter (#530)
smart text splitter that iteratively tries different separators until it
works!
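
A simplified sketch of the idea, not the splitter's actual code: try the coarsest separator first and recurse into finer ones only for pieces that are still too long. Unlike a real splitter, this version does not merge small pieces back together or add overlap.

```python
from typing import List, Sequence


def split_text(
    text: str,
    chunk_size: int,
    separators: Sequence[str] = ("\n\n", "\n", " ", ""),
) -> List[str]:
    """Split text so that chunks fit within chunk_size where possible."""
    if len(text) <= chunk_size:
        return [text] if text else []  # drop empty pieces
    for index, separator in enumerate(separators):
        if separator == "":
            # Last resort: cut at fixed character offsets.
            return [
                text[i : i + chunk_size]
                for i in range(0, len(text), chunk_size)
            ]
        if separator in text:
            chunks: List[str] = []
            for piece in text.split(separator):
                # Pieces that are still too long get the finer separators.
                chunks.extend(
                    split_text(piece, chunk_size, separators[index + 1 :])
                )
            return chunks
    return [text]  # no separator applied; return the oversized text as-is
```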
2023-01-08 15:11:10 -08:00
Harrison Chase
9833fcfe32
fix caching (#555) 2023-01-06 07:30:10 -08:00
Harrison Chase
330a5b42d4
fix map reduce chain (#550) 2023-01-06 07:15:57 -08:00
Harrison Chase
4974f49bb7
add return_direct flag to tool (#537)
adds a return_direct flag to tools, which just returns the tool output
as the final output
2023-01-06 06:40:32 -08:00
Harrison Chase
1631981f84
Harrison/fix and test caching (#538) 2023-01-04 18:39:06 -08:00
Harrison Chase
9e04c34e20
Add BaseCallbackHandler and CallbackManager (#478)
Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>
2023-01-04 07:54:25 -08:00
Harrison Chase
0db05b6725
Harrison/add human prefix (#520)
Co-authored-by: Andrew Huang <jhuang16888@gmail.com>
2023-01-03 08:03:50 -08:00
Harrison Chase
985496f4be
Docs refactor (#480)
Big docs refactor! Motivation is to make it easier for people to find
resources they are looking for. To accomplish this, there are now three
main sections:

- Getting Started: steps for getting started, walking through most core
functionality
- Modules: these are different modules of functionality that langchain
provides. Each part here has a "getting started", "how to", "key
concepts" and "reference" section (except in a few select cases where it
didn't easily fit).
- Use Cases: this is to separate use cases (like summarization, question
answering, evaluation, etc) from the modules, and provide a different
entry point to the code base.

There is also a full reference section, as well as extra resources
(glossary, gallery, etc)

Co-authored-by: Shreya Rajpal <ShreyaR@users.noreply.github.com>
2023-01-02 08:24:09 -08:00
Harrison Chase
d0f194de73
add logic for agent stopping (#420) 2022-12-29 08:21:11 -05:00
Harrison Chase
95157d0aad
Add schema property to sql database utility class (#448) (#462)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>

Signed-off-by: Diwank Singh Tomer <diwank.singh@gmail.com>
Co-authored-by: Nuno Campos <nuno@boringbits.io>
Co-authored-by: Diwank Singh Tomer <diwank.singh@gmail.com>
2022-12-28 17:37:53 -05:00
Harrison Chase
0c5d3fd894
version 0.0.49 (#436) 2022-12-27 09:17:01 -05:00
Harrison Chase
f8b605293f
Harrison/improve memory (#432)
add AI prefix

add new type of memory

Co-authored-by: Jason <chisanch@usc.edu>
2022-12-27 08:23:51 -05:00
Harrison Chase
ee3b8e89b3
better parsing of agent output (#418) 2022-12-25 09:53:36 -05:00
Harrison Chase
20959d8c36
check memory variables (#411)
can have multiple input keys, if some come from memory
2022-12-24 08:35:46 -05:00
Harrison Chase
6b60c509ac
(WIP) add HyDE (#393)
Co-authored-by: cameronccohen <cameron.c.cohen@gmail.com>
Co-authored-by: Cameron Cohen <cameron.cohen@quantco.com>
2022-12-21 20:46:41 -05:00
Harrison Chase
c104d507bf
Harrison/improve data augmented generation docs (#390)
Co-authored-by: cameronccohen <cameron.c.cohen@gmail.com>
Co-authored-by: Cameron Cohen <cameron.cohen@quantco.com>
2022-12-20 22:24:08 -05:00
Harrison Chase
cf98f219f9
Harrison/tools exp (#372) 2022-12-18 21:51:23 -05:00
Harrison Chase
e7b625fe03
fix text splitter (#375) 2022-12-18 20:21:43 -05:00
Ankush Gola
8d0869c6d3
change run to use args and kwargs (#367)
Before, `run` was not able to be called with multiple arguments. This
expands the functionality.
2022-12-18 15:54:56 -05:00
Harrison Chase
c1b50b7b13
Harrison/map reduce merge (#344)
Co-authored-by: John Nay <JohnNay@users.noreply.github.com>
2022-12-15 17:49:14 -08:00
Harrison Chase
78b31e5966
Harrison/cache (#343) 2022-12-15 07:53:32 -08:00
Harrison Chase
8cf62ce06e
Harrison/single input (#347)
allow passing of single input into chain

Co-authored-by: thepok <richterthepok@yahoo.de>
2022-12-15 07:52:51 -08:00
Harrison Chase
9bb7195085
Harrison/llm saving (#331)
Co-authored-by: Akash Samant <70665700+asamant21@users.noreply.github.com>
2022-12-13 06:46:01 -08:00
Hunter Gerlach
482611f426
unit test / code coverage improvements (#322)
This PR has two contributions:

1. Add test for when stop token is found in middle of text

2. Add code coverage tooling and instructions
- Add pytest-cov via poetry
- Add necessary config files
- Add new make instruction for `coverage`
- Update README with coverage guidance
- Update minor README formatting/spelling

Co-authored-by: Hunter Gerlach <hunter@huntergerlach.com>
2022-12-13 05:48:53 -08:00
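The first test described in the commit above covers a stop token appearing in the middle of the text. A minimal sketch of that behaviour, assuming the intended result is to cut the text at the earliest stop token; `cut_at_stop` is a hypothetical helper written for this example, not the function under test.

```python
from typing import List


def cut_at_stop(text: str, stop: List[str]) -> str:
    """Return text truncated at the earliest occurrence of any stop token."""
    positions = [text.find(token) for token in stop if token and token in text]
    # No stop token present: the text is returned unchanged.
    return text[:min(positions)] if positions else text
```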
Shobith Alva
19a9fa16a9
Add clear() method for Memory (#305)
a simple helper to clear the buffer in `Conversation*Memory` classes
2022-12-11 07:09:06 -08:00
Harrison Chase
e02d6b2288
beta: logger (#307) 2022-12-10 23:17:19 -08:00
andersenchen
5267ebce2d
Add LLMCheckerChain (#281)
Implementation of https://github.com/jagilley/fact-checker. Works pretty
well.

<img width="993" alt="Screenshot 2022-12-07 at 4 41 47 PM"
src="https://user-images.githubusercontent.com/101075607/206302751-356a19ff-d000-4798-9aee-9c38b7f532b9.png">

Verifying this manually:
1. "Only two kinds of egg-laying mammals are left on the planet
today—the duck-billed platypus and the echidna, or spiny anteater."
https://www.scientificamerican.com/article/extreme-monotremes/
2. "An [Echidna] egg weighs 1.5 to 2 grams (0.05 to 0.07
oz)[[19]](https://en.wikipedia.org/wiki/Echidna#cite_note-19) and is
about 1.4 centimetres (0.55 in) long."
https://en.wikipedia.org/wiki/Echidna#:~:text=sleep%20is%20suppressed.-,Reproduction,a%20reptile%2Dlike%20egg%20tooth.
3. "A [platypus] lays one to three (usually two) small, leathery eggs
(similar to those of reptiles), about 11 mm (7⁄16 in) in diameter and
slightly rounder than bird eggs."
https://en.wikipedia.org/wiki/Platypus#:~:text=It%20lays%20one%20to%20three,slightly%20rounder%20than%20bird%20eggs.
4. Therefore, an Echidna is the mammal that lays the biggest eggs.


cc @hwchase17
2022-12-09 12:49:05 -08:00
Harrison Chase
3c1c7ba672
update branch name in gha (#274) 2022-12-06 22:28:50 -08:00
Akash Samant
48b093823e
Add a Transformation Chain (#257)
Arbitrary transformation chains that can be used to add dictionary
extractions from llms/other chains
2022-12-06 21:58:16 -08:00
coyotespike
b7bef36ee1
BashChain (#260)
Love the project, a ton of fun!

I think the PR is pretty self-explanatory, happy to make any changes! I
am working on using it in an `LLMBashChain` and may update as that
progresses.

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2022-12-06 21:57:50 -08:00
Harrison Chase
28be37f470
LLMRequestsChain (#267) 2022-12-06 21:55:02 -08:00
John McDonnell
68666d6a22
Gracefully degrade when model asks for nonexistent tool (#268)
Not yet tested, but very simple change, assumption is that we're cool
with just producing a generic output when tool is not found
2022-12-06 21:52:48 -08:00
Harrison Chase
f5c665a544
combine python files (#256) 2022-12-04 15:57:36 -08:00
Harrison Chase
db58032973
introduce output parser (#250) 2022-12-03 13:28:07 -08:00
Harrison Chase
a9ce04201f
Harrison/improve usability of api chain (#247)
improve usability of api chain
2022-12-02 15:44:10 -08:00
Harrison Chase
c897bd6cbd
api chain (#246)
Co-authored-by: Subhash Ramesh <33400216+thecooltechguy@users.noreply.github.com>
2022-12-02 13:39:36 -08:00
Xupeng (Tony) Tong
bb4bf9d6d0
chore: minor clean up / formatting (#233)
to get familiar with the project
2022-12-01 10:50:36 -08:00
Andrew Gleave
ea67c049f0
Support SQL statements that return no results (#222)
Adds support for statements such as insert, update etc which do not
return any rows.

`engine.execute` is deprecated and so execution has been updated to use
`connection.exec_driver_sql` as per:

https://docs.sqlalchemy.org/en/14/core/connections.html#sqlalchemy.engine.Engine.execute
2022-11-29 08:28:45 -08:00
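The core issue in the commit above is that statements such as `INSERT` or `UPDATE` produce no result set, so fetching rows must be skipped. The sketch below shows that check using the standard-library `sqlite3` module rather than SQLAlchemy's `exec_driver_sql`, purely to keep the example self-contained. `run_sql` is a hypothetical name, and returning an empty string for row-less statements is an assumption.

```python
import sqlite3


def run_sql(conn: sqlite3.Connection, command: str) -> str:
    """Execute a statement and return its rows as a string, or "" if it has none."""
    cursor = conn.execute(command)
    if cursor.description is None:
        # INSERT, UPDATE, DDL, etc.: there is no result set to fetch.
        conn.commit()
        return ""
    return str(cursor.fetchall())
```

With an in-memory database, `run_sql(conn, "INSERT INTO t VALUES (1)")` returns an empty string, while a later `SELECT` returns the stored rows.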
Akash Samant
d368c43648
Bug Fix (#221)
Quick bug fix for semantic similarity vector injection
2022-11-29 07:03:40 -08:00
Harrison Chase
b94244eb12
nits (#210)
use json.dump

move test to integration tests (since it requires huggingface_hub)
2022-11-27 13:03:09 -08:00
Akash Samant
ae72cf84b8
Save Prompts (#194) 2022-11-27 09:10:35 -08:00
Bagatur
b90e25f786
Add HuggingFace Hub Embeddings (#125)
Add support for calling HuggingFace embedding models
using the HuggingFaceHub Inference API. New class mirrors
the existing HuggingFaceHub LLM implementation. Currently
only supports 'sentence-transformers' models.

Closes #86
2022-11-27 00:24:59 -08:00
Harrison Chase
6eab5254e5
add docs for custom agents (#196) 2022-11-26 06:03:08 -08:00
Harrison Chase
08deed9002
Harrison/memory docs (#195)
update memory docs and change variables
2022-11-26 05:58:54 -08:00
Harrison Chase
b913df3774
make attrs public (#187)
since they are used outside of the class, should be public
2022-11-24 20:11:29 -08:00
Samantha Whitmore
a408ed3ea3
Samantha/add conversation chain (#166)
Add MemoryChain and ConversationChain as chains that take a docstore in
addition to the prompt, and use the docstore to stuff context into the
prompt. This can be used to have an ongoing conversation with a chatbot.

Probably needs a bit of refactoring for code quality

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2022-11-23 16:35:38 -08:00
Harrison Chase
4334ffa6f9
Harrison/clean up language (#179)
dynamic prompts are no longer a thing
2022-11-23 16:58:41 -05:00
Samantha Whitmore
09f301cd38
Add add_example method to all ExampleSelector classes, with tests (#178)
Also updated docs, and noticed an issue with the add_texts method on
VectorStores that I had missed before -- the metadatas arg should be
required to match the classmethod which initializes the VectorStores
(the add_example methods break otherwise in the ExampleSelectors)
2022-11-23 13:12:47 -08:00
Harrison Chase
d3a7429f61
(WIP) agents (#171) 2022-11-22 06:16:26 -08:00
Harrison Chase
4a4dfbfbed
Harrison/sequential chains (#168)
add support for basic sequential chains
2022-11-21 13:08:53 -08:00
Samantha Whitmore
315b0c09c6
wip: add method for both docstore and embeddings (#119)
this will break atm but wanted to get thoughts on implementation.

1. should add() be on docstore interface?
2. should InMemoryDocstore change to take a list of documents as init?
(makes this slightly easier to implement in FAISS -- if we think it is
less clean then could expose a method to get the number of documents
currently in the dict, and perform the logic of creating the necessary
dictionary in the FAISS.add_texts method.)

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2022-11-20 16:23:58 -08:00
Harrison Chase
c02eb199b6
add few shot example (#148) 2022-11-19 20:32:45 -08:00
Nicholas Larus-Stone
0c3ae78ec1
chore: update ascii colors to work with dark mode (#152) 2022-11-16 22:05:28 -08:00
Nicholas Larus-Stone
ca4b10bb74
feat: add option to ignore or restrict to SQL tables (#151)
`SQLDatabase` now accepts two `init` arguments:
1. `ignore_tables` to pass in a list of tables to not search over
2. `include_tables` to restrict to a list of tables to consider
2022-11-16 22:04:50 -08:00
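A sketch of how the two arguments from the commit above might narrow the set of tables. Only the argument names `ignore_tables` and `include_tables` come from the commit message. The helper `usable_tables`, the rejection of both arguments at once, and the error on unknown table names are assumptions for illustration.

```python
from typing import Iterable, Optional, Set


def usable_tables(
    all_tables: Iterable[str],
    include_tables: Optional[Iterable[str]] = None,
    ignore_tables: Optional[Iterable[str]] = None,
) -> Set[str]:
    """Return the table names that should be considered (hypothetical helper)."""
    all_set = set(all_tables)
    include = set(include_tables or [])
    ignore = set(ignore_tables or [])
    if include and ignore:
        # Assumption: the two filters are treated as mutually exclusive.
        raise ValueError("Cannot specify both include_tables and ignore_tables")
    missing = (include | ignore) - all_set
    if missing:
        raise ValueError(f"Tables not found in database: {sorted(missing)}")
    # include_tables restricts to a list; ignore_tables subtracts from all tables.
    return include if include else all_set - ignore
```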
Harrison Chase
1835e8a681
prompt nit (#141)
doing some cleanup, and i think this just simplifies things...
2022-11-14 21:30:33 -08:00
Harrison Chase
bbb405a492
update colors (#140) 2022-11-14 20:27:36 -08:00
Harrison Chase
f23b3ceb49
consolidate run functions (#126)
consolidating logic for when a chain is able to run with single input
text, single output text

open to feedback on naming, logic, usefulness
2022-11-13 18:14:35 -08:00
Edmar Ferreira
8a5ec894e7
Prompt from file proof of concept using plain text (#127)
This is a simple proof of concept of using external files as templates. 
I'm still feeling my way around the codebase.
As a user, I want to use files as prompts, so it will be easier to
manage and test prompts.
The future direction is to use a template engine, most likely Mako.
2022-11-13 13:15:30 -08:00
Harrison Chase
db37bd089f
model laboratory (#95) 2022-11-08 22:17:10 -08:00
Harrison Chase
eb36317f9a
Harrison/fix imports (#72)
fix imports and add section to notebook
2022-11-06 16:06:40 -08:00
Samantha Whitmore
a5b61d59e1
Refactor prompts into module, add example generation utils (#64) 2022-11-06 15:40:33 -08:00
Harrison Chase
2456a547de
mrkl (#42) 2022-11-05 14:41:53 -07:00
Samantha Whitmore
c636488fe5
DynamicPrompt class creation (#49)
Checking that this structure looks generally ok -- going to sub in logic
where the TODO comment is then add a test.
2022-11-05 12:43:21 -07:00
Harrison Chase
4cc18d6c2a
Harrison/pretty print (#57)
make stuff look nice
2022-11-03 00:41:07 -07:00
Harrison Chase
76aff023d7
FAISS and embedding support (#48)
also adds embeddings and an in memory docstore
2022-11-01 21:29:39 -07:00
Harrison Chase
e982cf4b2e
Harrison/update docstore (#47)
change docstore interface
2022-10-31 21:18:52 -07:00
Harrison Chase
160af4ba6b
Harrison/map reduce (#36) 2022-10-31 20:17:22 -07:00
Harrison Chase
fba30e07d1
factor out mock python repl (#43) 2022-10-30 18:09:04 -07:00
Harrison Chase
7b0d02ac51
prompt templating (#41)
Co-authored-by: Samantha Whitmore <whitmore.samantha@gmail.com>
2022-10-30 09:45:27 -07:00
Harrison Chase
af81e9ca9c
add sql database (#35) 2022-10-27 23:21:47 -07:00
Harrison Chase
ce7b14b843
Harrison/add react chain (#24)
from https://arxiv.org/abs/2210.03629

still need to think if docstore abstraction makes sense
2022-10-26 21:02:23 -07:00
Harrison Chase
020c42dcae
Harrison/add huggingface hub (#23)
Add support for huggingface hub

I could not find a good way to enforce stop tokens over the huggingface
hub api - that needs to hopefully be cleaned up in the future
2022-10-25 22:00:33 -07:00
Harrison Chase
1ef3ab4d0e
Harrison/add natbot (#18) 2022-10-24 19:56:26 -07:00
Harrison Chase
18aeb72012 initial commit 2022-10-24 14:51:15 -07:00