Commit Graph

852 Commits

Author SHA1 Message Date
Harrison Chase
aa854988bf
bump version to 114 (#1739) 2023-03-17 08:26:06 -07:00
Harrison Chase
96ebe98dc2
Harrison/latex splitter (#1738)
Co-authored-by: Aidan Holland <thehappydinoa@gmail.com>
Co-authored-by: Jan de Boer <44832123+Janldeboer@users.noreply.github.com>
2023-03-17 08:10:27 -07:00
Harrison Chase
45f05fc939
Harrison/blackboard loader (#1737)
Co-authored-by: Aidan Holland <thehappydinoa@gmail.com>
2023-03-17 08:02:44 -07:00
Vincent Liao
cf9c3f54f7
docs: add docs link to agent toolkits (#1735)
New to Langchain, was a bit confused where I should find the toolkits
section when I'm at `agent/key_concepts` docs. I added a short link that
points to the how to section.
2023-03-17 07:59:49 -07:00
Merbin J Anselm
fbc0c85b90
fix: agent json parser fails with text in suffix (#1734)
While testing out `VectorDBQA` as a `Tool` for one of the conversation,
I happened to get a response from LLM (OpenAI) like this

<code>
Could not parse LLM output: Here's a response using the Product Search
tool:

```json
{
    "action": "Product Search",
    "action_input": "pots for plants"
}
```

This will allow you to search for pots for your plants and find a
variety of options that are available for purchase. You can use this
information to choose the pots that best fit your needs and preferences.
</code>

i.e. The response had a text before & *after* the expected JSON, leading
to `JSONDecodeError`. It's fixed now, by removing text after '```' to
remove unwanted text.

The error I encountered in this Jupyter Notebook -
[link](https://github.com/anselm94/chatbot-llm-ecommerce/blob/main/chatcommerce.ipynb)

<details>
    <summary>Error encountered</summary>
    <code>
    

---------------------------------------------------------------------------
JSONDecodeError Traceback (most recent call last)
File
~/Git/chatbot-llm-ecommerce/.venv/lib/python3.11/site-packages/langchain/agents/conversational_chat/base.py:104,
in ConversationalChatAgent._extract_tool_and_input(self, llm_output)
        103 try:
    --> 104     response = self.output_parser.parse(llm_output)
        105     return response["action"], response["action_input"]

File
~/Git/chatbot-llm-ecommerce/.venv/lib/python3.11/site-packages/langchain/agents/conversational_chat/base.py:49,
in AgentOutputParser.parse(self, text)
        48 cleaned_output = cleaned_output.strip()
    ---> 49 response = json.loads(cleaned_output)
50 return {"action": response["action"], "action_input":
response["action_input"]}

File
/opt/homebrew/Cellar/python@3.11/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/__init__.py:346,
in loads(s, cls, object_hook, parse_float, parse_int, parse_constant,
object_pairs_hook, **kw)
        343 if (cls is None and object_hook is None and
        344         parse_int is None and parse_float is None and
345 parse_constant is None and object_pairs_hook is None and not kw):
    --> 346     return _default_decoder.decode(s)
        347 if cls is None:

File
/opt/homebrew/Cellar/python@3.11/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/decoder.py:340,
in JSONDecoder.decode(self, s, _w)
        339 if end != len(s):
    --> 340     raise JSONDecodeError("Extra data", s, end)
        341 return obj

    JSONDecodeError: Extra data: line 5 column 1 (char 74)

    During handling of the above exception, another exception occurred:

ValueError Traceback (most recent call last)
    Cell In[22], line 1
    ----> 1 ask_ai.run("Yes. I need pots for my plants")

File
~/Git/chatbot-llm-ecommerce/.venv/lib/python3.11/site-packages/langchain/chains/base.py:213,
in Chain.run(self, *args, **kwargs)
        211     if len(args) != 1:
212 raise ValueError("`run` supports only one positional argument.")
    --> 213     return self(args[0])[self.output_keys[0]]
        215 if kwargs and not args:
        216     return self(kwargs)[self.output_keys[0]]

File
~/Git/chatbot-llm-ecommerce/.venv/lib/python3.11/site-packages/langchain/chains/base.py:116,
in Chain.__call__(self, inputs, return_only_outputs)
        114 except (KeyboardInterrupt, Exception) as e:
115 self.callback_manager.on_chain_error(e, verbose=self.verbose)
    --> 116     raise e
117 self.callback_manager.on_chain_end(outputs, verbose=self.verbose)
118 return self.prep_outputs(inputs, outputs, return_only_outputs)

File
~/Git/chatbot-llm-ecommerce/.venv/lib/python3.11/site-packages/langchain/chains/base.py:113,
in Chain.__call__(self, inputs, return_only_outputs)
        107 self.callback_manager.on_chain_start(
        108     {"name": self.__class__.__name__},
        109     inputs,
        110     verbose=self.verbose,
        111 )
        112 try:
    --> 113     outputs = self._call(inputs)
        114 except (KeyboardInterrupt, Exception) as e:
115 self.callback_manager.on_chain_error(e, verbose=self.verbose)

File
~/Git/chatbot-llm-ecommerce/.venv/lib/python3.11/site-packages/langchain/agents/agent.py:499,
in AgentExecutor._call(self, inputs)
        497 # We now enter the agent loop (until it returns something).
        498 while self._should_continue(iterations):
    --> 499     next_step_output = self._take_next_step(
500 name_to_tool_map, color_mapping, inputs, intermediate_steps
        501     )
        502     if isinstance(next_step_output, AgentFinish):
503 return self._return(next_step_output, intermediate_steps)

File
~/Git/chatbot-llm-ecommerce/.venv/lib/python3.11/site-packages/langchain/agents/agent.py:409,
in AgentExecutor._take_next_step(self, name_to_tool_map, color_mapping,
inputs, intermediate_steps)
404 """Take a single step in the thought-action-observation loop.
        405
406 Override this to take control of how the agent makes and acts on
choices.
        407 """
        408 # Call the LLM to see what to do.
    --> 409 output = self.agent.plan(intermediate_steps, **inputs)
410 # If the tool chosen is the finishing tool, then we end and return.
        411 if isinstance(output, AgentFinish):

File
~/Git/chatbot-llm-ecommerce/.venv/lib/python3.11/site-packages/langchain/agents/agent.py:105,
in Agent.plan(self, intermediate_steps, **kwargs)
        94 """Given input, decided what to do.
        95
        96 Args:
    (...)
        102     Action specifying what tool to use.
        103 """
104 full_inputs = self.get_full_inputs(intermediate_steps, **kwargs)
    --> 105 action = self._get_next_action(full_inputs)
        106 if action.tool == self.finish_tool_name:
107 return AgentFinish({"output": action.tool_input}, action.log)

File
~/Git/chatbot-llm-ecommerce/.venv/lib/python3.11/site-packages/langchain/agents/agent.py:67,
in Agent._get_next_action(self, full_inputs)
65 def _get_next_action(self, full_inputs: Dict[str, str]) ->
AgentAction:
        66     full_output = self.llm_chain.predict(**full_inputs)
---> 67 parsed_output = self._extract_tool_and_input(full_output)
        68     while parsed_output is None:
        69         full_output = self._fix_text(full_output)

File
~/Git/chatbot-llm-ecommerce/.venv/lib/python3.11/site-packages/langchain/agents/conversational_chat/base.py:107,
in ConversationalChatAgent._extract_tool_and_input(self, llm_output)
        105     return response["action"], response["action_input"]
        106 except Exception:
--> 107 raise ValueError(f"Could not parse LLM output: {llm_output}")

ValueError: Could not parse LLM output: Here's a response using the
Product Search tool:

    ```json
    {
        "action": "Product Search",
        "action_input": "pots for plants"
    }
    ```

This will allow you to search for pots for your plants and find a
variety of options that are available for purchase. You can use this
information to choose the pots that best fit your needs and preferences.

</details>
2023-03-17 07:59:39 -07:00
Harrison Chase
276940fd9b
Harrison/official method (#1728)
Co-authored-by: Aratako <127325395+Aratako@users.noreply.github.com>
2023-03-16 23:20:08 -07:00
Piyush Jain
cdff6c8181
Sagemaker Endpoint LLM (#1686)
Updates #965

---------

Co-authored-by: Nimisha Mehta <116048415+nimimeht@users.noreply.github.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
2023-03-16 21:58:06 -07:00
alekhyablue
cd45adbea2
adding new agent types in comments (#1711) 2023-03-16 21:56:08 -07:00
Mario Kostelac
aff44d0a98
(OpenAI) Add model_name to LLMResult.llm_output (#1713)
Given that different models have very different latencies and pricings,
it's benefitial to pass the information about the model that generated
the response. Such information allows implementing custom callback
managers and track usage and price per model.

Addresses https://github.com/hwchase17/langchain/issues/1557.
2023-03-16 21:55:55 -07:00
libra
8a95fdaee1
Fix all the bug in init Tool in docs (#1725)
Fix all the example in the docs when init `Tool`

Test by render with jupyter
2023-03-16 21:55:44 -07:00
Alexandros Mavrogiannis
5d8dc83ede
Bump duckdb-engine to 0.7.0 (#1726)
Resolves https://github.com/hwchase17/langchain/issues/1272
Resolves https://github.com/hwchase17/langchain/issues/1578
2023-03-16 21:55:35 -07:00
Daniel Chalef
b157e0c1c3
Add HTML document_loader that includes page title metadata (#1720)
This `BSHTMLLoader` document_loader loads an HTML document, extracts
text and adds the page title to the returned Document's metadata. The
loader uses the already installed bs4 package to extract both text
content and the page title.

Included in this PR is an example HTML file and an integration test that
tests against this file.

---------

Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
2023-03-16 21:47:17 -07:00
Harrison Chase
40e9488055
fix async in agent (#1723) 2023-03-16 21:43:22 -07:00
jerwelborn
55efbb8a7e
pydantic/json parsing (#1722)
```
class Joke(BaseModel):
    setup: str = Field(description="question to set up a joke")
    punchline: str = Field(description="answer to resolve the joke")

joke_query = "Tell me a joke."

# Or, an example with compound type fields.
#class FloatArray(BaseModel):
#    values: List[float] = Field(description="list of floats")
#
#float_array_query = "Write out a few terms of fiboacci."

model = OpenAI(model_name='text-davinci-003', temperature=0.0)
parser = PydanticOutputParser(pydantic_object=Joke)
prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()}
)

_input = prompt.format_prompt(query=joke_query)
print("Prompt:\n", _input.to_string())
output = model(_input.to_string())
print("Completion:\n", output)
parsed_output = parser.parse(output)
print("Parsed completion:\n", parsed_output)
```

```
Prompt:
 Answer the user query.
The output should be formatted as a JSON instance that conforms to the JSON schema below.  For example, the object {"foo":  ["bar", "baz"]} conforms to the schema {"foo": {"description": "a list of strings field", "type": "string"}}.

Here is the output schema:
---
{"setup": {"description": "question to set up a joke", "type": "string"}, "punchline": {"description": "answer to resolve the joke", "type": "string"}}
---

Tell me a joke.

Completion:
 {"setup": "Why don't scientists trust atoms?", "punchline": "Because they make up everything!"}

Parsed completion:
 setup="Why don't scientists trust atoms?" punchline='Because they make up everything!'
```

Ofc, works only with LMs of sufficient capacity. DaVinci is reliable but
not always.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-03-16 21:43:11 -07:00
Alex Strick van Linschoten
d6bbf395af
Loosen PyYAML dependency (#1698)
Hitting some dependency issues relating to this strict pinning. Unsure
of the knock-on effects, but wanted to propose this loosening down a
couple of versions.
2023-03-16 17:05:36 -07:00
Jonathan Pedoeem
606605925d
Adding ability to return_pl_id to all PromptLayer Models in LangChain (#1699)
PromptLayer now has support for [several different tracking
features.](https://magniv.notion.site/Track-4deee1b1f7a34c1680d085f82567dab9)
In order to use any of these features you need to have a request id
associated with the request.

In this PR we add a boolean argument called `return_pl_id` which will
add `pl_request_id` to the `generation_info` dictionary associated with
a generation.

We also updated the relevant documentation.
2023-03-16 17:05:23 -07:00
Jeff Huber
f93c011456
fallback to {} for None metadata from Chroma (#1714)
The basic vector store example started breaking because `Document`
required `not None` for metadata, but Chroma stores metadata as `None`
if none is provided. This creates a fallback which fixes the basic
tutorial
https://langchain.readthedocs.io/en/latest/modules/indexes/examples/vectorstores.html

Here is the error that was generated

```
Running Chroma using direct local API.
Using DuckDB in-memory for database. Data will be transient.
Traceback (most recent call last):
  File "/Users/jeff/src/temp/langchainchroma/test.py", line 17, in <module>
    docs = docsearch.similarity_search(query)
  File "/Users/jeff/src/langchain/langchain/vectorstores/chroma.py", line 133, in similarity_search
    docs_and_scores = self.similarity_search_with_score(query, k)
  File "/Users/jeff/src/langchain/langchain/vectorstores/chroma.py", line 182, in similarity_search_with_score
    return _results_to_docs_and_scores(results)
  File "/Users/jeff/src/langchain/langchain/vectorstores/chroma.py", line 24, in _results_to_docs_and_scores
    return [
  File "/Users/jeff/src/langchain/langchain/vectorstores/chroma.py", line 27, in <listcomp>
    (Document(page_content=result[0], metadata=result[1]), result[2])
  File "pydantic/main.py", line 331, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for Document
metadata
  none is not an allowed value (type=type_error.none.not_allowed)
Exiting: Cleaning up .chroma directory
```
2023-03-16 12:06:47 -07:00
Harrison Chase
3c24684522
harrison/bump-version-00113 (#1701) 2023-03-15 14:49:47 -07:00
Harrison Chase
b84d190fd0
Harrison/gr int (#1700)
Co-authored-by: Shreya Rajpal <ShreyaR@users.noreply.github.com>
2023-03-15 13:22:20 -07:00
Harrison Chase
aad4bff098
Harrison/headers (#1696)
Co-authored-by: Tim Asp <707699+timothyasp@users.noreply.github.com>
2023-03-15 13:13:21 -07:00
Harrison Chase
3ea6d9c4d2
add docs for save/load messages (#1697) 2023-03-15 13:13:08 -07:00
Pandazki
ced412e1c1
fix: correct a small mistake in SimpleChatModel. (#1685) 2023-03-15 08:00:26 -07:00
Piyush Jain
1279c8de39
Fixed typo, clarified language (#1682) 2023-03-15 08:00:11 -07:00
at-b612
c7779c800a
Added Mynd URL to gallery (#1684) 2023-03-15 07:59:59 -07:00
Jithin James
6f4f771897
docs: add path to state_of_the_union.txt in indexes/getting_started page (#1691)
add the state_of_the_union.txt file so that its easier to follow through
with the example.

---------

Co-authored-by: Jithin James <jjmachan@pop-os.localdomain>
2023-03-15 07:59:47 -07:00
Kacper Łukawski
4a327dd1d6
Implement basic metadata filtering in Qdrant (#1689)
This PR implements a basic metadata filtering mechanism similar to the
ones in Chroma and Pinecone. It still cannot express complex conditions,
as there are no operators, but some users requested to have that feature
available.
2023-03-15 07:31:39 -07:00
Ankush Gola
d4edd3c312
Zapier Integration (#1654)
* Zapier Wrapper and Tools (implemented by Zapier Team)
* Zapier Toolkit, examples with mrkl agent

---------

Co-authored-by: Mike Knoop <mikeknoop@gmail.com>
Co-authored-by: Robert Lewis <robert.lewis@zapier.com>
2023-03-14 23:06:17 -07:00
Harrison Chase
e72074f78a
Harrison/ifixit (#1680)
Co-authored-by: David Rans <david@ifixit.com>
2023-03-14 21:17:50 -07:00
Harrison Chase
0b29e68c17
Harrison/pgvector (#1679)
Co-authored-by: Aman Kumar <krsingh.aman@gmail.com>
2023-03-14 21:13:58 -07:00
Harrison Chase
4d7fdb8957
Harrison/gml save (#1676)
Co-authored-by: Satoru Sakamoto <51464932+satoru814@users.noreply.github.com>
2023-03-14 20:00:22 -07:00
Harrison Chase
656efe6ef3
Harrison/fix nb (#1678) 2023-03-14 19:34:23 -07:00
Harrison Chase
362586fe8b
save messages (#1653)
@yakigac this is my alternative to
https://github.com/hwchase17/langchain/pull/1648 - thoughts?
2023-03-14 18:15:55 -07:00
Matt Robinson
63aa28e2a6
feat: allow the unstructured kwargs to be passed in to Unstructured document loaders (#1667)
### Summary

Allows users to pass in `**unstructured_kwargs` to Unstructured document
loaders. Implemented with the `strategy` kwargs in mind, but will pass
in other kwargs like `include_page_breaks` as well. The two currently
supported strategies are `"hi_res"`, which is more accurate but takes
longer, and `"fast"`, which processes faster but with lower accuracy.
The `"hi_res"` strategy is the default. For PDFs, if `detectron2` is not
available and the user selects `"hi_res"`, the loader will fallback to
using the `"fast"` strategy.


### Testing

#### Make sure the `strategy` kwarg works

Run the following in iPython to verify that the `"fast"` strategy is
indeed faster.

```python
from langchain.document_loaders import UnstructuredFileLoader

loader = UnstructuredFileLoader("layout-parser-paper-fast.pdf", strategy="fast", mode="elements")
%timeit loader.load()

loader = UnstructuredFileLoader("layout-parser-paper-fast.pdf", mode="elements")
%timeit loader.load()
```

On my system I get:

```python
In [3]: from langchain.document_loaders import UnstructuredFileLoader

In [4]: loader = UnstructuredFileLoader("layout-parser-paper-fast.pdf", strategy="fast", mode="elements")

In [5]: %timeit loader.load()
247 ms ± 369 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [6]: loader = UnstructuredFileLoader("layout-parser-paper-fast.pdf", mode="elements")

In [7]: %timeit loader.load()
2.45 s ± 31 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```

#### Make sure older versions of `unstructured` still work

Run `pip install unstructured==0.5.3` and then verify the following runs
without error:

```python
from langchain.document_loaders import UnstructuredFileLoader

loader = UnstructuredFileLoader("layout-parser-paper-fast.pdf",  mode="elements")
loader.load()
```
2023-03-14 18:15:28 -07:00
Matthias Kern
c3dfbdf0da
Remove outdated code from Chat VectorDB QA example (#1670) 2023-03-14 18:13:51 -07:00
Bilel MEDIMEGH
a2280f321f
Docs: Fix typo in memory/key_concepts.md (#1671)
dialouge -> dialogue
2023-03-14 18:12:01 -07:00
Xin Qiu
4e13cef05a
feat: add redisearch vectorstore (#1307)
# Description

Add `RediSearch` vectorstore for LangChain

RediSearch: [RediSearch quick
start](https://redis.io/docs/stack/search/quick_start/)

# How to use

```
from langchain.vectorstores.redisearch import RediSearch

rds = RediSearch.from_documents(docs, embeddings,redisearch_url="redis://localhost:6379")
```
2023-03-14 18:06:03 -07:00
Harrison Chase
e5c1659864
bump ver (#1668) 2023-03-14 13:05:17 -07:00
Harrison Chase
2d098e8869
Harrison/agent eval (#1620)
Co-authored-by: jerwelborn <jeremy.welborn@gmail.com>
2023-03-14 12:37:48 -07:00
Harrison Chase
8965a2f0af
bump and hotfix (#1665) 2023-03-14 11:12:53 -07:00
Harrison Chase
e222ea4ee8
update rtd config (#1664) 2023-03-14 10:40:06 -07:00
Harrison Chase
e326939759
bump version 110 (#1662) 2023-03-14 10:21:35 -07:00
Harrison Chase
7cf46b3fee
Harrison/convo agent (#1642) 2023-03-14 09:42:24 -07:00
Abhinav Upadhyay
84cd825a0e
Add a batch_size param to the add_texts API of pinecone wrapper (#1658)
A safe default value of batch_size is required by the pinecone python
client otherwise if the user of add_texts passes too many documents in a
single call, they would get a 400 error from pinecone.
2023-03-14 09:40:22 -07:00
Jon Luo
0a1b1806e9
sql: do not hard code the LIMIT clause in the table_info section (#1563)
Seeing a lot of issues in Discord in which the LLM is not using the
correct LIMIT clause for different SQL dialects. ie, it's using `LIMIT`
for mssql instead of `TOP`, or instead of `ROWNUM` for Oracle, etc.
I think this could be due to us specifying the LIMIT statement in the
example rows portion of `table_info`. So the LLM is seeing the `LIMIT`
statement used in the prompt.
Since we can't specify each dialect's method here, I think it's fine to
just replace the `SELECT... LIMIT 3;` statement with `3 rows from
table_name table:`, and wrap everything in a block comment directly
following the `CREATE` statement. The Rajkumar et al paper wrapped the
example rows and `SELECT` statement in a block comment as well anyway.
Thoughts @fpingham?
2023-03-13 23:08:27 -07:00
Brian Thorne
9ee2713272
Bugfix - allow custom input variables in chat zero shot agent's prompt (#1624)
I was trying out the `chat-zero-shot-react-description` agent for
[qabot](dbbd31bb27/qabot/agents/data_query_chain.py (L35-L52))
but langchain 0.0.108 doesn't correctly use custom 'input_variables` in
the prompt template.
2023-03-13 23:07:35 -07:00
Tim Asp
b3234bf3b0
cleanup: unify 3 different pdf loaders, rename PagedPDFSplitter (#1615)
`OnlinePDFLoader` and `PagedPDFSplitter` lived separate from the rest of
the pdf loaders.

Because they're all similar, I propose moving all to `pdy.py` and the
same docs/examples page.

Additionally, `PagedPDFSplitter` naming doesn't match the pattern the
rest of the loaders follow, so I renamed to `PyPDFLoader` and had it
inherit from `BasePDFLoader` so it can now load from remote file
sources.
2023-03-13 23:06:50 -07:00
Luis
562d9891ea
Add regex dict: (#1616)
This class enables us to send a dictionary containing an output key and
the expected format, which in turn allows us to retrieve the result of
the matching formats and extract specific information from it.

To exclude irrelevant information from our return dictionary, we can
prompt the LLM to use a specific command that notifies us when it
doesn't know the answer. We refer to this variable as the
"no_update_value".

Regarding the updated regular expression pattern
(r"{}:\s?([^.'\n']*).?"), it enables us to retrieve a format as 'Output
Key':'value'.

We have improved the regex by adding an optional space between ':' and
'value' with "s?", and by excluding points and line jumps from the
matches using "[^.'\n']*".
2023-03-13 23:05:39 -07:00
Harrison Chase
56aff797c0
docs req (#1647) 2023-03-13 16:03:32 -07:00
Harrison Chase
d53ff270e0
bump version to 109 (#1646) 2023-03-13 15:52:35 -07:00
Harrison Chase
df6c33d4b3
Harrison/new output parser (#1617) 2023-03-13 15:08:39 -07:00