Previously, the client expected a strict 'prompt' or 'messages' format
and wouldn't permit running a chat model or llm on prompts or messages
(respectively).
Since many datasets may want to specify custom key: string , relax this
requirement.
Also, add support for running a chat model on raw prompts and LLM on
chat messages through their respective fallbacks.
# Your PR Title (What it does)
<!--
Thank you for contributing to LangChain! Your PR will appear in our next
release under the title you set. Please make sure it highlights your
valuable contribution.
Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.
After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.
-->
<!-- Remove if not applicable -->
Fixes # (issue)
## Before submitting
<!-- If you're adding a new integration, include an integration test and
an example notebook showing its use! -->
## Who can review?
Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:
<!-- For a quicker response, figure out the right person to tag with @
@hwchase17 - project lead
Tracing / Callbacks
- @agola11
Async
- @agola11
DataLoaders
- @eyurtsev
Models
- @hwchase17
- @agola11
Agents / Tools / Toolkits
- @vowelparrot
VectorStores / Retrievers / Memory
- @dev2049
-->
# Fix subclassing OpenAIEmbeddings
Fixes#4498
## Before submitting
- Problem: Due to annotated type `Tuple[()]`.
- Fix: Change the annotated type to "Iterable[str]". Even though
tiktoken use
[Collection[str]](095924e02c/tiktoken/core.py (L80))
type annotation, but pydantic doesn't support Collection type, and
[Iterable](https://docs.pydantic.dev/latest/usage/types/#typing-iterables)
is the closest to Collection.
# fix: agenerate miss run_manager args in llm.py
<!--
Thank you for contributing to LangChain! Your PR will appear in our next
release under the title you set. Please make sure it highlights your
valuable contribution.
Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.
After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.
-->
<!-- Remove if not applicable -->
Fixes # (issue)
fix: agenerate miss run_manager args in llm.py
<!-- For a quicker response, figure out the right person to tag with @
@hwchase17 - project lead
Tracing / Callbacks
- @agola11
Async
- @agola11
DataLoaders
- @eyurtsev
Models
- @hwchase17
- @agola11
Agents / Tools / Toolkits
- @vowelparrot
VectorStores / Retrievers / Memory
- @dev2049
-->
ArxivAPIWrapper searches and downloads PDFs to get related information.
But I found that it doesn't delete the downloaded file. The reason why
this is a problem is that a lot of PDF files remain on the server. For
example, one size is about 28M.
So, I added a delete line because it's too big to maintain on the
server.
# Clean up downloaded PDF files
- Changes: Added new line to delete downloaded file
- Background: To get the information on arXiv's paper, ArxivAPIWrapper
class downloads a PDF.
It's a natural approach, but the wrapper retains a lot of PDF files on
the server.
- Problem: One size of PDFs is about 28M. It's too big to maintain on a
small server like AWS.
- Dependency: import os
Thank you.
---------
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
# Get the memory importance score from regex matched group
In `GenerativeAgentMemory`, the `_score_memory_importance()` will make a
prompt to get a rating score. The prompt is:
```
prompt = PromptTemplate.from_template(
"On the scale of 1 to 10, where 1 is purely mundane"
+ " (e.g., brushing teeth, making bed) and 10 is"
+ " extremely poignant (e.g., a break up, college"
+ " acceptance), rate the likely poignancy of the"
+ " following piece of memory. Respond with a single integer."
+ "\nMemory: {memory_content}"
+ "\nRating: "
)
```
For some LLM, it will respond with, for example, `Rating: 8`. Thus we
might want to get the score from the matched regex group.
# The cohere embedding model do not use large, small. It is deprecated.
Changed the modules default model
Fixes#4694
Co-authored-by: rajib76 <rajib76@yahoo.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
**Feature**: This PR adds `from_template_file` class method to
BaseStringMessagePromptTemplate. This is useful to help user to create
message prompt templates directly from template files, including
`ChatMessagePromptTemplate`, `HumanMessagePromptTemplate`,
`AIMessagePromptTemplate` & `SystemMessagePromptTemplate`.
**Tests**: Unit tests have been added in this PR.
Co-authored-by: charosen <charosen@bupt.cn>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
# Removed usage of deprecated methods
Replaced `SQLDatabaseChain` deprecated direct initialisation with
`from_llm` method
## Who can review?
@hwchase17
@agola11
---------
Co-authored-by: imeckr <chandanroutray2012@gmail.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
# Fixed query checker for SQLDatabaseChain
When `SQLDatabaseChain`'s llm attribute was deprecated, the query
checker stopped working if `SQLDatabaseChain` is initialised via
`from_llm` method. With this fix, `SQLDatabaseChain`'s query checker
would use the same `llm` as used in the `llm_chain`
## Who can review?
@hwchase17 - project lead
Co-authored-by: imeckr <chandanroutray2012@gmail.com>
Adds some basic unit tests for the ConfluenceLoader that can be extended
later. Ports this [PR from
llama-hub](https://github.com/emptycrown/llama-hub/pull/208) and adapts
it to `langchain`.
@Jflick58 and @zywilliamli adding you here as potential reviewers
---------
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
# Improve the Chroma get() method by adding the optional "include"
parameter.
The Chroma get() method excludes embeddings by default. You can
customize the response by specifying the "include" parameter to
selectively retrieve the desired data from the collection.
---------
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
# Fix Telegram API loader + add tests.
I was testing this integration and it was broken with next error:
```python
message_threads = loader._get_message_threads(df)
KeyError: False
```
Also, this particular loader didn't have any tests / related group in
poetry, so I added those as well.
@hwchase17 / @eyurtsev please take a look on this fix PR.
---------
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
Add client methods to read / list runs and sessions.
Update walkthrough to:
- Let the user create a dataset from the runs without going to the UI
- Use the new CLI command to start the server
Improve the error message when `docker` isn't found
# Cassandra support for chat history
### Description
- Store chat messages in cassandra
### Dependency
- cassandra-driver - Python Module
## Before submitting
- Added Integration Test
## Who can review?
@hwchase17
@agola11
# Your PR Title (What it does)
<!--
Thank you for contributing to LangChain! Your PR will appear in our next
release under the title you set. Please make sure it highlights your
valuable contribution.
Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.
After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.
-->
<!-- Remove if not applicable -->
Fixes # (issue)
## Before submitting
<!-- If you're adding a new integration, include an integration test and
an example notebook showing its use! -->
## Who can review?
Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:
<!-- For a quicker response, figure out the right person to tag with @
@hwchase17 - project lead
Tracing / Callbacks
- @agola11
Async
- @agola11
DataLoaders
- @eyurtsev
Models
- @hwchase17
- @agola11
Agents / Tools / Toolkits
- @vowelparrot
VectorStores / Retrievers / Memory
- @dev2049
-->
Co-authored-by: Jinto Jose <129657162+jj701@users.noreply.github.com>
Adds the basic retrievers for Milvus and Zilliz. Hybrid search support
will be added in the future.
Signed-off-by: Filip Haltmayer <filip.haltmayer@zilliz.com>
# Fix DeepLake Overwrite Flag Issue
Fixes Issue #4682: essentially, setting overwrite to False in the
DeepLake constructor still triggers an overwrite, because the logic is
just checking for the presence of "overwrite" in kwargs. The fix is
simple--just add some checks to inspect if "overwrite" in kwargs AND
kwargs["overwrite"]==True.
Added a new test in
tests/integration_tests/vectorstores/test_deeplake.py to reflect the
desired behavior.
Co-authored-by: Anirudh Suresh <ani@Anirudhs-MBP.cable.rcn.com>
Co-authored-by: Anirudh Suresh <ani@Anirudhs-MacBook-Pro.local>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
Making headless an optional argument for
create_async_playwright_browser() and create_sync_playwright_browser()
By default no functionality is changed.
This allows for disabled people to use a web browser intelligently with
their voice, for example, while still seeing the content on the screen.
As well as many other use cases
---------
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
This PR adds exponential back-off to the Google PaLM api to gracefully
handle rate limiting errors.
---------
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
# Add summarization task type for HuggingFace APIs
Add summarization task type for HuggingFace APIs.
This task type is described by [HuggingFace inference
API](https://huggingface.co/docs/api-inference/detailed_parameters#summarization-task)
My project utilizes LangChain to connect multiple LLMs, including
various HuggingFace models that support the summarization task.
Integrating this task type is highly convenient and beneficial.
Fixes#4720
This reverts commit 5111bec540.
This PR introduced a bug in the async API (the `url` param isn't bound);
it also didn't update the synchronous API correctly, which makes it
error-prone (the behavior of the async and sync endpoints would be
different)
Fixes some bugs I found while testing with more advanced datasets and
queries. Includes using the output of PowerBI to parse the error and
give that back to the LLM.
# Add GraphQL Query Support
This PR introduces a GraphQL API Wrapper tool that allows LLM agents to
query GraphQL databases. The tool utilizes the httpx and gql Python
packages to interact with GraphQL APIs and provides a simple interface
for running queries with LLM agents.
@vowelparrot
---------
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
# Added support for streaming output response to
HuggingFaceTextgenInference LLM class
Current implementation does not support streaming output. Updated to
incorporate this feature. Tagging @agola11 for visibility.
Instead of halting the entire program if this tool encounters an error,
it should pass the error back to the agent to decide what to do.
This may be best suited for @vowelparrot to review.