Commit Graph

168 Commits

Author SHA1 Message Date
Eugene Yurtsev
5d02010763
Introduce Blob and Blob Loader interface (#3603)
This PR introduces a Blob data type and a Blob loader interface.

This is the first of a sequence of PRs that follows this proposal: 

https://github.com/hwchase17/langchain/pull/2833

The primary goals of these abstraction are:

* Decouple content loading from content parsing code.
* Help duplicated content loading code from document loaders.
* Make lazy loading a default for langchain.
2023-04-27 09:45:25 -04:00
Zander Chase
ee670c448e
Persistent Bash Shell (#3580)
Clean up linting and make more idiomatic by using an output parser

---------

Co-authored-by: FergusFettes <fergusfettes@gmail.com>
2023-04-26 15:20:28 -07:00
Roma
2b4e9a3efa
Add unit test for _merge_splits function (#3513)
This commit adds a new unit test for the _merge_splits function in the
text splitter. The new test verifies that the function merges text into
chunks of the correct size and overlap, using a specified separator. The
test passes on the current implementation of the function.
2023-04-25 10:02:59 -07:00
Mindaugas Sharskus
a4d85f7fd5
[Fix #3365]: Changed regex to cover new line before action serious (#3367)
Fix for: [Changed regex to cover new line before action
serious.](https://github.com/hwchase17/langchain/issues/3365)
---

This PR fixes the issue where `ValueError: Could not parse LLM output:`
was thrown on seems to be valid input.

Changed regex to cover new lines before action serious (after the
keywords "Action:" and "Action Input:").

regex101: https://regex101.com/r/CXl1kB/1

---------

Co-authored-by: msarskus <msarskus@cisco.com>
2023-04-24 22:05:31 -07:00
Davis Chase
b2564a6391
fix #3884 (#3475)
fixes mar bug #3384
2023-04-24 19:54:15 -07:00
Zander Chase
49122a96e7
Structured Tool Bugfixes (#3324)
- Proactively raise error if a tool subclasses BaseTool, defines its
own schema, but fails to add the type-hints
- fix the auto-inferred schema of the decorator to strip the
unneeded virtual kwargs from the schema dict

Helps avoid silent instances of #3297
2023-04-24 09:58:29 -07:00
Davis Chase
46542dc774
Contextual compression retriever (#2915)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-04-20 17:01:14 -07:00
Harrison Chase
9a0356d276
Harrison/file chat history (#3198)
Co-authored-by: Young Lee <joybro201@gmail.com>
2023-04-19 21:05:20 -07:00
Zander Chase
4adfd790f0
Update File Management Tools to Include Root Directory (#3112)
- Permit the specification of a `root_dir` to the read/write file tools
to specify a working directory
- Add validation for attempts to read/write outside the directory (e.g.,
through `../../` or symlinks or `/abs/path`'s that don't lie in the
correct path)
- Add some tests for all


One question is whether we should make a default root directory for
these? tradeoffs either way
2023-04-19 16:46:10 -07:00
engkheng
dbbc340f25
Validate input_variables when using jinja2 templates (#3140)
`langchain.prompts.PromptTemplate` and
`langchain.prompts.FewShotPromptTemplate` do not validate
`input_variables` when initialized as `jinja2` template.

```python
# Using langchain v0.0.144
template = """"\
Your variable: {{ foo }}
{% if bar %}
You just set bar boolean variable to true
{% endif %}
"""

# Missing variable, should raise ValueError
prompt_template = PromptTemplate(template=template, 
                                 input_variables=["bar"], 
                                 template_format="jinja2", 
                                 validate_template=True)

# Extra variable, should raise ValueError
prompt_template = PromptTemplate(template=template, 
                                 input_variables=["bar", "foo", "extra", "thing"], 
                                 template_format="jinja2", 
                                 validate_template=True)
```
2023-04-19 16:18:32 -07:00
Zander Chase
90ef705ced
Update Tool Input (#3103)
- Remove dynamic model creation in the `args()` property. _Only infer
for the decorator (and add an argument to NOT infer if someone wishes to
only pass as a string)_
- Update the validation example to make it less likely to be
misinterpreted as a "safe" way to run a repl


There is one example of "Multi-argument tools" in the custom_tools.ipynb
from yesterday, but we could add more. The output parsing for the base
MRKL agent hasn't been adapted to handle structured args at this point
in time

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-04-18 18:18:33 -07:00
Harrison Chase
aad0a498ac
Harrison/output error (#3094)
Co-authored-by: yummydum <sumita@nowcast.co.jp>
2023-04-18 08:59:56 -07:00
Harrison Chase
db968284f8
tools refactor (#2961)
Co-authored-by: vowelparrot <130414180+vowelparrot@users.noreply.github.com>
2023-04-17 21:35:29 -07:00
engkheng
19febc77d6
Support inference of input_variables from jinja2 template (#3013)
`langchain.prompts.PromptTemplate` is unable to infer `input_variables`
from jinja2 template.

```python
# Using langchain v0.0.141
template_string = """\
Hello world
Your variable: {{ var }}
{# This will not get rendered #}

{% if verbose %}
Congrats! You just turned on verbose mode and got extra messages!
{% endif %}
"""

template = PromptTemplate.from_template(template_string, template_format="jinja2")
print(template.input_variables) # Output ['# This will not get rendered #', '% endif %', '% if verbose %']
```

---------

Co-authored-by: engkheng <ongengkheng929@example.com>
2023-04-17 20:31:03 -07:00
Nuno Campos
dac32c59e5
Nc/combining output parser (#3014)
Co-authored-by: vowelparrot <130414180+vowelparrot@users.noreply.github.com>
2023-04-17 20:29:53 -07:00
Davis Chase
19c85aa990
Factor out doc formatting and add validation (#3026)
@cnhhoang850 slightly more generic fix for #2944, works for whatever the
expected metadata keys are not just `source`
2023-04-17 20:28:01 -07:00
vowelparrot
99c0382209
Generative Characters (#2859)
Add a time-weighted memory retriever and a notebook that approximates a
Generative Agent from https://arxiv.org/pdf/2304.03442.pdf


The "daily plan" components are removed for now since they are less
useful without a virtual world, but the memory is an interesting
component to build off.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-04-16 21:41:00 -07:00
Harrison Chase
e12e00df12
use output parsers in agents (#2987) 2023-04-16 13:15:21 -07:00
vowelparrot
5ca7ce77cd
Remove pythonrepl from LLM-MathChain (#2943)
Use numexpr evaluate instead of the python REPL to avoid malicious code
injection.

Tested against the (limited) math dataset and got the same score as
before.

For more permissive tools (like the REPL tool itself), other approaches
ought to be provided (some combination of Sanitizer + Restricted python
+ unprivileged-docker + ...), but for a calculator tool, only
mathematical expressions should be permitted.

See https://github.com/hwchase17/langchain/issues/814
2023-04-16 08:50:32 -07:00
dev2049
36aa7f30e4
Move PythonRepl -> langchain.utilities (#2917) 2023-04-15 10:50:25 -07:00
Harrison Chase
705596b46a
Harrison/fix create sql agent (#2870)
Co-authored-by: Timothé Pearce <timothe.pearce@gmail.com>
2023-04-13 22:07:58 -07:00
KullTC
802363eb6a
Remove print statement from test (#2809)
Remove unnecessary print statement.
2023-04-13 09:31:48 -07:00
KullTC
64596b23b9
Return output of PythonAstREPLTool when falling back to exec() (#2780)
When the code ran by the PythonAstREPLTool contains multiple statements
it will fallback to exec() instead of using eval(). With this change, it
will also return the output of the code in the same way the
PythonREPLTool will.
2023-04-12 21:22:46 -07:00
Joshua Snyder
59d054308c
Add type inference for output parsers (#2769)
Currently, the output type of a number of OutputParser's `parse` methods
is `Any` when it can in fact be inferred.

This PR makes BaseOutputParser use a generic type and fixes the output
types of the following parsers:
- `PydanticOutputParser`
- `OutputFixingParser`
- `RetryOutputParser`
- `RetryWithErrorOutputParser`

The output of the `StructuredOutputParser` is corrected from `BaseModel`
to `Any` since there are no type guarantees provided by the parser.

Fixes issue #2715
2023-04-12 09:12:20 -07:00
Abhik Singla
955bd2e1db
Fixed Ast Python Repl for Chatgpt multiline commands (#2406)
Resolves issue https://github.com/hwchase17/langchain/issues/2252

---------

Co-authored-by: Abhik Singla <abhiksingla@microsoft.com>
2023-04-10 21:25:03 -07:00
Ankush Gola
b82cbd1be0
Use run and arun in place of combine_docs and acombine_docs (#2635)
`combine_docs` does not go through the standard chain call path which
means that chain callbacks won't be triggered, meaning QA chains won't
be traced properly, this fixes that.

Also fix several errors in the chat_vector_db notebook
2023-04-09 18:47:59 -07:00
Vashisht Madhavan
aa439ac2ff
Adding an in-context QA evaluation chain + chain of thought reasoning chain for improved accuracy (#2444)
Right now, eval chains require an answer for every question. It's
cumbersome to collect this ground truth so getting around this issue
with 2 things:

* Adding a context param in `ContextQAEvalChain` and simply evaluating
if the question is answered accurately from context
* Adding chain of though explanation prompting to improve the accuracy
of this w/o GT.

This also gets to feature parity with openai/evals which has the same
contextual eval w/o GT.

TODO in follow-up:
* Better prompt inheritance. No need for seperate prompt for CoT
reasoning. How can we merge them together

---------

Co-authored-by: Vashisht Madhavan <vashishtmadhavan@Vashs-MacBook-Pro.local>
2023-04-06 22:32:41 -07:00
William FH
f240651bd8
Add Request body (#2507)
This still doesn't handle the following

- non-JSON media types
- anyOf, allOf, oneOf's

And doesn't emit the typescript definitions for referred types yet, but
that can be saved for a separate PR.

Also, we could have better support for Swagger 2.0 specs and OpenAPI
3.0.3 (can use the same lib for the latter) recommend offline conversion
for now.
2023-04-06 13:02:42 -07:00
Zach Jones
13d1df2140
Feature: AgentExecutor execution time limit (#2399)
`AgentExecutor` already has support for limiting the number of
iterations. But the amount of time taken for each iteration can vary
quite a bit, so it is difficult to place limits on the execution time.
This PR adds a new field `max_execution_time` to the `AgentExecutor`
model. When called asynchronously, the agent loop is wrapped in an
`asyncio.timeout()` context which triggers the early stopping response
if the time limit is reached. When called synchronously, the agent loop
checks for both the max_iteration limit and the time limit after each
iteration.

When used asynchronously `max_execution_time` gives really tight control
over the max time for an execution chain. When used synchronously, the
chain can unfortunately exceed max_execution_time, but it still gives
more control than trying to estimate the number of max_iterations needed
to cap the execution time.

---------

Co-authored-by: Zachary Jones <zjones@zetaglobal.com>
2023-04-06 12:54:32 -07:00
leo-gan
fd69cc7e42
Removed duplicate BaseModel dependencies (#2471)
Removed duplicate BaseModel dependencies in class inheritances.
Also, sorted imports by `isort`.
2023-04-06 12:45:16 -07:00
Harrison Chase
1e19e004af
Harrison/openapi spec (#2474)
Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>
2023-04-06 09:47:37 -07:00
Harrison Chase
26314d7004
Harrison/openapi parser (#2461)
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
2023-04-05 22:19:09 -07:00
Ankush Gola
4d730a9bbc
improve AsyncCallbackManager (#2410) 2023-04-05 09:31:42 +02:00
Harrison Chase
c7b083ab56
bump version to 131 (#2391) 2023-04-04 07:21:50 -07:00
Harrison Chase
fe1eb8ca5f
requests wrapper (#2367) 2023-04-03 21:57:19 -07:00
Shrined
10dab053b4
Add Enum for agent types (#2321)
This pull request adds an enum class for the various types of agents
used in the project, located in the `agent_types.py` file. Currently,
the project is using hardcoded strings for the initialization of these
agents, which can lead to errors and make the code harder to maintain.
With the introduction of the new enums, the code will be more readable
and less error-prone.

The new enum members include:

- ZERO_SHOT_REACT_DESCRIPTION
- REACT_DOCSTORE
- SELF_ASK_WITH_SEARCH
- CONVERSATIONAL_REACT_DESCRIPTION
- CHAT_ZERO_SHOT_REACT_DESCRIPTION
- CHAT_CONVERSATIONAL_REACT_DESCRIPTION

In this PR, I have also replaced the hardcoded strings with the
appropriate enum members throughout the codebase, ensuring a smooth
transition to the new approach.
2023-04-03 21:56:20 -07:00
Harrison Chase
acfda4d1d8
Harrison/multiline commands (#2280)
Co-authored-by: Marc Päpper <mpaepper@users.noreply.github.com>
2023-04-01 12:54:06 -07:00
leo-gan
579ad85785
skip unit tests that fail in Windows (#2238)
Issue #2174
Several unit tests fail in Windows.
Added pytest attribute to skip these tests automatically.
2023-04-01 12:52:21 -07:00
Harrison Chase
2d3918c152
make requests more general (#2209) 2023-03-30 20:41:56 -07:00
Harrison Chase
5c907d9998
Harrison/base agent without docs (#2166) 2023-03-29 22:11:25 -07:00
Harrison Chase
f5a4bf0ce4
remove prep (#2136)
agents should be stateless or async stuff may not work
2023-03-29 14:38:21 -07:00
Harrison Chase
e2c26909f2
Harrison/memory check (#2119)
Co-authored-by: JIAQIA <jqq1716@gmail.com>
2023-03-28 15:40:36 -07:00
Harrison Chase
f281033362
rm pandas dependency (#2102) 2023-03-28 08:38:19 -07:00
Harrison Chase
9e74df2404
Fix issue#1645: Parse llm_output even there's newline (#2092) (#2099)
Fix issue#1645: Parse either whitespace or newline after 'Action Input:'
in llm_output in mrkl agent.
Unittests added accordingly.

Co-authored-by: ₿ingnan.ΞTH <brillliantz@outlook.com>
2023-03-28 08:14:09 -07:00
b7f392fdd6
[agent_executor] convenience func: lookup tool by name (#2001)
A quick convenience function to lookup a tool by name

Co-authored-by: blob42 <spike@w530>
2023-03-27 23:10:34 -07:00
Harrison Chase
30e3b31b04
Harrison/document cleanup (#2062)
Co-authored-by: Delip Rao <delip@users.noreply.github.com>
2023-03-27 16:32:55 -07:00
Daniel Chalef
6598beacdb
PydanticOutputParser unit test (#2047)
Unit test for PydanticOutputParser

---------

Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
2023-03-27 14:32:56 -07:00
Harrison Chase
705431aecc
big docs refactor (#1978)
Co-authored-by: Ankush Gola <ankush.gola@gmail.com>
2023-03-26 19:49:46 -07:00
Harrison Chase
ce5d97bcb3
Harrison/guarded output parser (#1804)
Co-authored-by: jerwelborn <jeremy.welborn@gmail.com>
2023-03-21 22:07:23 -07:00
Matt Tucker
a92344f476
Use regex match for bash process error output test assertion. (#1837)
I was getting the same issue reported in #1339 by
[MacYang555](https://github.com/MacYang555) when running the test suite
on my Mac. I implemented the fix they suggested to use a regex match in
the output assertion for the scenario under test.

Resolves #1339
2023-03-21 09:06:52 -07:00