The doc loaders index was picking up a bunch of subheadings because I
mistakenly made the MD titles H1s. Fixed that.
Also fixed the easy minor warnings from docs_build.
I was testing out the WhatsApp Document loader, and noticed that
sometimes the date is of the following format (notice the additional
underscore):
```
3/24/23, 1:54_PM - +91 99999 99999 joined using this group's invite link
3/24/23, 6:29_PM - +91 99999 99999: When are we starting then?
```
Weirdly, the underscore is visible in Vim but not in editors like
VSCode. I presume it is some unusual character or line terminator.
Nevertheless, I think handling this edge case will make the document
loader more robust.
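As a rough illustration of the idea (not the loader's actual implementation), a message pattern can tolerate one arbitrary separator character between the time and AM/PM. The pattern, the `parse_line` helper, and the normalized timestamp format below are assumptions made for this sketch.

```python
import re

# Hypothetical pattern for lines like "3/24/23, 6:29 PM - sender: text".
# [\W_]? accepts at most one separator before AM/PM: a space, an underscore,
# or a non-standard whitespace character that some editors do not display.
MESSAGE_RE = re.compile(
    r"(\d{1,2}/\d{1,2}/\d{2,4}, \d{1,2}:\d{2})[\W_]?(AM|PM) - (.*?): (.*)"
)


def parse_line(line):
    """Return (timestamp, sender, text), or None for non-message lines."""
    match = MESSAGE_RE.match(line)
    if match is None:
        return None
    date_time, meridiem, sender, text = match.groups()
    # Normalize the timestamp so downstream code sees a single format.
    return f"{date_time} {meridiem}", sender, text
```

System lines such as "joined using this group's invite link" have no `sender: text` part, so this sketch returns `None` for them.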
Adds a loader for Slack Exports which can be a very valuable source of
knowledge to use for internal QA bots and other use cases.
```py
# Export data from your Slack Workspace first.
from langchain.document_loaders import SlackDirectoryLoader
SLACK_WORKSPACE_URL = "https://awesome.slack.com"
# Pass the path to the exported data and the workspace URL to the loader.
loader = SlackDirectoryLoader("Slack_Exports", SLACK_WORKSPACE_URL)
docs = loader.load()
```
---------
Co-authored-by: Mikhail Dubov <mikhail@chattermill.io>
When the code run by the PythonAstREPLTool contains multiple statements,
it falls back to exec() instead of using eval(). With this change, it
also returns the output of the code in the same way the
PythonREPLTool does.
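A minimal sketch of the general technique, assuming a hypothetical `run_code` helper (this is not the tool's actual code): run all leading statements with exec(), evaluate a trailing expression with eval(), and otherwise return whatever was printed.

```python
import ast
import io
from contextlib import redirect_stdout


def run_code(source, namespace=None):
    """Run source; return the last expression's value or captured stdout."""
    namespace = {} if namespace is None else namespace
    tree = ast.parse(source)
    if not tree.body:
        return ""
    *head, last = tree.body
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        # Execute everything except the final statement.
        exec(compile(ast.Module(body=head, type_ignores=[]), "<sketch>", "exec"), namespace)
        if isinstance(last, ast.Expr):
            # A trailing expression can be evaluated for its value.
            value = eval(compile(ast.Expression(body=last.value), "<sketch>", "eval"), namespace)
            if value is not None:
                return value
        else:
            # A trailing statement (loop, assignment, ...) needs exec().
            exec(compile(ast.Module(body=[last], type_ignores=[]), "<sketch>", "exec"), namespace)
    # Nothing to return as a value, so fall back to the printed output.
    return buffer.getvalue()
```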
In #2399 we added the ability to set `max_execution_time` when creating
an AgentExecutor. This PR adds the `max_execution_time` argument to the
built-in pandas, sql, and openapi agents.
Co-authored-by: Zachary Jones <zjones@zetaglobal.com>
### Summary
Adds support for processing non-HTML document types in the URL loader.
For example, the URL loader can now process a PDF or Markdown file
hosted at a URL.
### Testing
```python
from langchain.document_loaders import UnstructuredURLLoader
urls = ["https://www.understandingwar.org/sites/default/files/Russian%20Offensive%20Campaign%20Assessment%2C%20April%2011%2C%202023.pdf"]
loader = UnstructuredURLLoader(urls=urls, strategy="fast")
docs = loader.load()
print(docs[0].page_content[:1000])
```
Updated the "load_memory_variables" function of the
ConversationBufferWindowMemory to support a window size of 0 (k=0).
Previous behavior would return the full memory instead of an empty
array.
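The pitfall is easy to reproduce in plain Python: a negative-index slice with `k=0` becomes `[-0:]`, which is the whole list. The `window` helper below is a hypothetical illustration, assuming two messages per exchange; it is not the memory class's actual code.

```python
def window(buffer, k):
    """Return the last k exchanges (2 messages each); k=0 returns nothing."""
    # buffer[-0:] is the same as buffer[0:], so k=0 needs an explicit guard.
    return buffer[-k * 2:] if k > 0 else []


messages = ["hi", "hello", "how are you?", "fine"]
unguarded = messages[-0 * 2:]  # the full list, not an empty one
```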
Eval chain is currently very sensitive to differences in phrasing,
punctuation, and tangential information. This prompt has worked better
for me on my examples.
A more general question: do we have any framework for evaluating default
prompt changes? Could we start doing some regression testing?
Currently, the output type of a number of OutputParser's `parse` methods
is `Any` when it can in fact be inferred.
This PR makes BaseOutputParser use a generic type and fixes the output
types of the following parsers:
- `PydanticOutputParser`
- `OutputFixingParser`
- `RetryOutputParser`
- `RetryWithErrorOutputParser`
The output of the `StructuredOutputParser` is corrected from `BaseModel`
to `Any` since there are no type guarantees provided by the parser.
Fixes issue #2715
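For readers unfamiliar with the pattern, here is a minimal sketch of a generic parser base class. `BaseParser` and `CommaListParser` are hypothetical names for illustration, not the library's classes.

```python
from abc import ABC, abstractmethod
from typing import Generic, List, TypeVar

T = TypeVar("T")


class BaseParser(ABC, Generic[T]):
    """Hypothetical stand-in for a generic output parser base class."""

    @abstractmethod
    def parse(self, text: str) -> T:
        """Turn raw model output into a value of type T."""


class CommaListParser(BaseParser[List[str]]):
    """Type checkers infer List[str] for parse() instead of Any."""

    def parse(self, text: str) -> List[str]:
        return [item.strip() for item in text.split(",")]


items = CommaListParser().parse("red, green, blue")
```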
This PR proposes
- An NLAToolkit method to instantiate from an AI Plugin URL
- A notebook that shows how to use that alongside an example of using a
Retriever object to lookup specs and route queries to them on the fly
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
**Problem:**
The `from_documents` method in Qdrant vectorstore is unnecessary because
it does not change any default behavior from the abstract base class
method of `from_documents` (contrast this with the method in Chroma
which makes a change from default and turns `embeddings` into an
Optional parameter).
Also, the docstrings need some cleanup.
**Solution:**
Remove unnecessary method and improve docstrings.
---------
Co-authored-by: Vijay Rajaram <vrajaram3@gatech.edu>
This change allows the user to initialize the ZapierNLAWrapper with a
valid Zapier NLA OAuth access token, which is then used to make
requests back to the Zapier NLA API.
When a `zapier_nla_oauth_access_token` is passed to the ZapierNLAWrapper,
the `ZAPIER_NLA_API_KEY` environment variable no longer needs to be set.
Leaving it set does not affect the behavior, because the
`zapier_nla_oauth_access_token` takes precedence over
`ZAPIER_NLA_API_KEY`.
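The precedence rule can be sketched as follows. `resolve_credentials` and its return shape are hypothetical and only illustrate the ordering; how the wrapper actually sends each credential is not shown here.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_credentials(
    oauth_access_token: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, str]:
    """Pick the credential to use; the OAuth token wins when both exist."""
    if oauth_access_token:
        return "oauth", oauth_access_token
    api_key = env.get("ZAPIER_NLA_API_KEY")
    if api_key:
        return "api_key", api_key
    raise ValueError("Provide an OAuth access token or set ZAPIER_NLA_API_KEY.")
```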
Currently, the function still fails if `continue_on_failure` is set to
True, because `elements` is not set.
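A sketch of this kind of bug and one way to avoid it, assuming a hypothetical `load_all` helper and `partition` callable (not the loader's real code): when the failure is only logged, the loop must skip the rest of the iteration so that `elements` is never read unset.

```python
import logging

logger = logging.getLogger(__name__)


def load_all(urls, partition, continue_on_failure=True):
    """Collect text per URL; `partition` is a hypothetical parsing callable."""
    texts = []
    for url in urls:
        try:
            elements = partition(url)
        except Exception as exc:
            if not continue_on_failure:
                raise
            logger.error("Error fetching or processing %s: %s", url, exc)
            # Skip this URL; otherwise `elements` would be unset or stale.
            continue
        texts.append("\n\n".join(str(el) for el in elements))
    return texts
```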
---------
Co-authored-by: leecjohnny <johnny-lee1255@users.noreply.github.com>
Add more missed imports for integration tests. Bump `pytest` to the
current latest version.
Fix `tests/integration_tests/vectorstores/test_elasticsearch.py` to
update its cassette (easy fix).
Related PR: https://github.com/hwchase17/langchain/pull/2560
Remove placeholder methods whose only purpose is to `cast()` the return
value so that its type is not inferred as the parent `VectorStore`
class. With `TypeVar`s, these methods are unnecessary.
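A minimal sketch of the `TypeVar` approach, using simplified, hypothetical classes rather than the library's real signatures: a bound `TypeVar` on `cls` lets the inherited classmethod return the subclass type without an override.

```python
from typing import List, Type, TypeVar

VST = TypeVar("VST", bound="VectorStore")


class VectorStore:
    """Hypothetical, minimal base class for this sketch."""

    def __init__(self, texts: List[str]) -> None:
        self.texts = texts

    @classmethod
    def from_texts(cls: Type[VST], texts: List[str]) -> VST:
        # Returning VST lets type checkers infer the calling subclass.
        return cls(texts)


class InMemoryStore(VectorStore):
    def count(self) -> int:
        return len(self.texts)


# Inferred as InMemoryStore, so no cast() wrapper method is needed.
store = InMemoryStore.from_texts(["a", "b"])
```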
This PR proposes an update to the OpenAPI Planner and Planner Prompts to
make PATCH and DELETE available to the planner and executor. I followed
the same patterns as for GET and POST, and made some updates to the
examples available to the Planner and Orchestrator.
Of note, I tried to write prompts for DELETE such that the model will
only execute that job if the User specifically asks for a 'Delete' (see
the Prompt_planner.py examples to see specificity), or if the User had
previously authorized the Delete in the Conversation memory. Although
PATCH also modifies existing data, I considered it lower risk and so did
not try to enforce the same restrictions on the Planner.
When using llama.cpp together with an agent such as
zero-shot-react-description, the missing branch causes the `stop`
parameter to be left empty, resulting in an unexpected output format
from the model. This patch fixes that issue.
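One plausible shape of such branch logic is sketched below. `resolve_stop` and its parameter names are hypothetical, and which branch was actually missing is an assumption; the point is that stop words passed per call must reach the model.

```python
from typing import List, Optional


def resolve_stop(
    default_stop: Optional[List[str]],
    call_stop: Optional[List[str]],
) -> List[str]:
    """Decide which stop sequences to send to the model."""
    if default_stop and call_stop is not None:
        raise ValueError("`stop` found in both the defaults and the call.")
    if default_stop:
        return default_stop
    if call_stop is not None:
        # The case an agent relies on: stop words supplied per call.
        return call_stop
    return []
```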
I fixed an issue where an error would always occur when making a request
using the `TextRequestsWrapper` with the async API.
The error is caused by leaving the scope of the context manager, which
closes the connection before the response body is read.
The correct usage is as described in the [official
tutorial](https://docs.aiohttp.org/en/stable/client_quickstart.html#make-a-request),
where the text method must also be called inside the context scope.
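The scoping problem can be reproduced without a network using a standard-library stand-in. `FakeResponse` and `fake_get` below are hypothetical and only imitate a response whose body becomes unreadable once its context exits; they are not aiohttp APIs.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeResponse:
    """Stand-in for a response whose body is readable only while open."""

    def __init__(self, body):
        self._body = body
        self.closed = False

    async def text(self):
        if self.closed:
            raise ConnectionError("Connection closed")
        return self._body


@asynccontextmanager
async def fake_get(url):
    response = FakeResponse(f"body of {url}")
    try:
        yield response
    finally:
        # Leaving the context releases the connection.
        response.closed = True


async def aget_broken(url):
    async with fake_get(url) as response:
        pass
    # Too late: the context has exited, so reading the body fails.
    return await response.text()


async def aget_fixed(url):
    async with fake_get(url) as response:
        # Read the body while the context is still open.
        return await response.text()
```

Calling `asyncio.run(aget_fixed("http://example.com"))` returns the body, while `aget_broken` raises `ConnectionError` in this sketch.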
<details>
<summary>Stacktrace</summary>
```
File "/home/vscode/.cache/pypoetry/virtualenvs/codehex-workspace-xS3fZVNL-py3.11/lib/python3.11/site-packages/langchain/tools/base.py", line 116, in arun
raise e
File "/home/vscode/.cache/pypoetry/virtualenvs/codehex-workspace-xS3fZVNL-py3.11/lib/python3.11/site-packages/langchain/tools/base.py", line 110, in arun
observation = await self._arun(tool_input)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vscode/.cache/pypoetry/virtualenvs/codehex-workspace-xS3fZVNL-py3.11/lib/python3.11/site-packages/langchain/agents/tools.py", line 22, in _arun
return await self.coroutine(tool_input)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vscode/.cache/pypoetry/virtualenvs/codehex-workspace-xS3fZVNL-py3.11/lib/python3.11/site-packages/langchain/chains/base.py", line 234, in arun
return (await self.acall(args[0]))[self.output_keys[0]]
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vscode/.cache/pypoetry/virtualenvs/codehex-workspace-xS3fZVNL-py3.11/lib/python3.11/site-packages/langchain/chains/base.py", line 154, in acall
raise e
File "/home/vscode/.cache/pypoetry/virtualenvs/codehex-workspace-xS3fZVNL-py3.11/lib/python3.11/site-packages/langchain/chains/base.py", line 148, in acall
outputs = await self._acall(inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/src/tools/example.py", line 153, in _acall
api_response = await self.requests_wrapper.aget("http://example.com")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vscode/.cache/pypoetry/virtualenvs/codehex-workspace-xS3fZVNL-py3.11/lib/python3.11/site-packages/langchain/requests.py", line 130, in aget
return await response.text()
^^^^^^^^^^^^^^^^^^^^^
File "/home/vscode/.cache/pypoetry/virtualenvs/codehex-workspace-xS3fZVNL-py3.11/lib/python3.11/site-packages/aiohttp/client_reqrep.py", line 1081, in text
await self.read()
File "/home/vscode/.cache/pypoetry/virtualenvs/codehex-workspace-xS3fZVNL-py3.11/lib/python3.11/site-packages/aiohttp/client_reqrep.py", line 1037, in read
self._body = await self.content.read()
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/vscode/.cache/pypoetry/virtualenvs/codehex-workspace-xS3fZVNL-py3.11/lib/python3.11/site-packages/aiohttp/streams.py", line 349, in read
raise self._exception
aiohttp.client_exceptions.ClientConnectionError: Connection closed
```
</details>
I've added a Bilibili loader. Bilibili is a very active video site in
China, and I think this loader would be a useful addition.
Example:
```python
from langchain.document_loaders.bilibili import BiliBiliLoader
loader = BiliBiliLoader(
    [
        "https://www.bilibili.com/video/BV1xt411o7Xu/",
        "https://www.bilibili.com/video/av330407025/",
    ]
)
docs = loader.load()
```
Co-authored-by: 了空 <568250549@qq.com>