Todo
- [x] copy over integration tests
- [x] update docs with new instructions in #15513
- [x] add linear ticket to bump core -> community, community->langchain,
and core->openai deps
- [ ] (optional): add `pip install langchain-openai` command to each
notebook using it
- [x] Update docstrings to not need `openai` install
- [x] Add serialization
- [x] deprecate old models
Contributor steps:
- [x] Add secret names to manual integrations workflow in
.github/workflows/_integration_test.yml
- [x] Add secrets to release workflow (for pre-release testing) in
.github/workflows/_release.yml
Maintainer steps (Contributors should not do these):
- [x] set up pypi and test pypi projects
- [x] add credential secrets to Github Actions
- [ ] add package to conda-forge
Functional changes to existing classes:
- now relies on openai client v1 (1.6.1) via concrete dep in
langchain-openai package
Codebase organization
- some function calling stuff moved to
`langchain_core.utils.function_calling` in order to be used in both
community and langchain-openai
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Removes unused `Params` in `libs/langchain/langchain/llms/mlflow.py`.
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
The example code for `llms.Mlflow` is outdated.
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** `MarkdownHeaderTextSplitter` currently strips header
lines from chunked content. Many applications require these header lines
are preserved. This adds an optional parameter to preserve those headers
in the chunked content.
- **Issue:** #2836 (relevant)
- **Dependencies:** -
- **Tag maintainer:** @baskaryan
- **Twitter handle:** @finnless
Unit tests and new examples in notebook included.
cc @rlancemartin
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Adds `WasmChat` integration. `WasmChat` runs GGUF models locally or via
chat service in lightweight and secure WebAssembly containers. In this
PR, `WasmChatService` is introduced as the first step of the
integration. `WasmChatService` is driven by
[llama-api-server](https://github.com/second-state/llama-utils) and
[WasmEdge Runtime](https://wasmedge.org/).
---------
Signed-off-by: Xin Liu <sam@secondstate.io>
Follow up on https://github.com/langchain-ai/langchain/pull/13048.
This PR intends to simplify the Qdrant async implementation by replacing
the internal GRPC methods with the `QdrantAsyncClient` methods.
This is a backward compatible change with no additional steps required
after merge.
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Fixes#14347
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** Added the traceback of the previous error to keep the
initial error type,
- **Issue:** #14347 ,
- **Dependencies:** None,
- **Tag maintainer:**
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Julien Raffy <julien.raffy@emeria.eu>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** the ability to add all extra parameter of vectorstore
and using them SemanticSimilarityExampleSelector.
- **Issue:** #14583
- **Dependencies:** no dependensies
- **Tag maintainer:**
- **Twitter handle:** @AmirMalekiz
---------
Co-authored-by: Amir Maleki <amaleki@fb.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Description: Add support for setting the `score_threshold` for
similarity search in SupabaseVectoreStore.
This pull request addresses issue #14438
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** changed json.py to handle additional cases of partial
json string to be parsed, basically by dropping the last character in
the string until a valid json string is found or the string is empty.
Also added additional test cases.
- **Issue:** function parse_partial_json could not parse cases where the
key is present but the value is not.
---------
Co-authored-by: Nuno Campos <nuno@langchain.dev>
Because Milvus' collection_name doesn't support UFT8 characters in other
languages, I want the `collection_descriotion`.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
**Description:** Fix for processing for serpapi response for Google Maps
API
**Issue:** Due to the fact corresponding
[api](https://serpapi.com/google-maps-api) returns 'local_results' as
list, and old version requested `res["local_results"].keys()` of the
list. As the result we got exception: ```AttributeError: 'list' object
has no attribute 'keys'```.
Way to reproduce wrong behaviour:
```
params = {
"engine": "google_maps",
"type": "search",
"google_domain": "google.de",
"ll": "@51.1917,10.525,14z",
"hl": "de",
"gl": "de",
}
search = SerpAPIWrapper(params=params)
results = search.run("cafe")
```
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Ran <rccalman@gmail.com>
Because Milvus doesn't support nullable fields, but document metadata is
very rich, so it makes more sense to store it as json.
https://github.com/milvus-io/pymilvus/issues/1705#issuecomment-1731112372
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
BigQuery vector search lets you use GoogleSQL to do semantic search,
using vector indexes for fast but approximate results, or using brute
force for exact results.
This PR integrates LangChain vectorstore with BigQuery Vector Search.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Vlad Kolesnikov <vladkol@google.com>
- **Description:** replace score_threshold with args
- **Issue:** needs a way to pass more options to similarity search
- **Dependencies:** None
- **Twitter handle:** @workbot
---------
Co-authored-by: JY <jyjy@jaguardb>
- **Description:** Tool now supports querying over 200 million
scientific articles, vastly expanding its reach beyond the 2 million
articles accessible through Arxiv. This update significantly broadens
access to the entire scope of scientific literature.
- **Dependencies:** semantischolar
https://github.com/danielnsilva/semanticscholar
- **Twitter handle:** @shauryr
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
…tch]: import models from community
ran
```bash
git grep -l 'from langchain\.chat_models' | xargs -L 1 sed -i '' "s/from\ langchain\.chat_models/from\ langchain_community.chat_models/g"
git grep -l 'from langchain\.llms' | xargs -L 1 sed -i '' "s/from\ langchain\.llms/from\ langchain_community.llms/g"
git grep -l 'from langchain\.embeddings' | xargs -L 1 sed -i '' "s/from\ langchain\.embeddings/from\ langchain_community.embeddings/g"
git checkout master libs/langchain/tests/unit_tests/llms
git checkout master libs/langchain/tests/unit_tests/chat_models
git checkout master libs/langchain/tests/unit_tests/embeddings/test_imports.py
make format
cd libs/langchain; make format
cd ../experimental; make format
cd ../core; make format
```
- easier to write custom logic/loops with automatic tracing
- if you don't want to streaming support write a regular function and
pass to RunnableLambda
- if you do want streaming write a generator and pass it to
RunnableGenerator
```py
import json
from typing import AsyncIterator
from langchain_core.messages import BaseMessage, FunctionMessage, HumanMessage
from langchain_core.agents import AgentAction, AgentFinish
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables import Runnable, RunnableGenerator, RunnablePassthrough
from langchain_core.tools import BaseTool
from langchain.agents.output_parsers import OpenAIFunctionsAgentOutputParser
from langchain.chat_models import ChatOpenAI
from langchain.tools.render import format_tool_to_openai_function
def _get_tavily():
from langchain.tools.tavily_search import TavilySearchResults
from langchain.utilities.tavily_search import TavilySearchAPIWrapper
tavily_search = TavilySearchAPIWrapper()
return TavilySearchResults(api_wrapper=tavily_search)
async def _agent_executor_generator(
input: AsyncIterator[list[BaseMessage]],
*,
max_iterations: int = 10,
tools: dict[str, BaseTool],
agent: Runnable[list[BaseMessage], BaseMessage],
parser: Runnable[BaseMessage, AgentAction | AgentFinish],
) -> AsyncIterator[BaseMessage]:
messages = [m async for mm in input for m in mm]
for _ in range(max_iterations):
next_message = await agent.ainvoke(messages)
yield next_message
messages.append(next_message)
parsed = await parser.ainvoke(next_message)
if isinstance(parsed, AgentAction):
result = await tools[parsed.tool].ainvoke(parsed.tool_input)
next_message = FunctionMessage(name=parsed.tool, content=json.dumps(result))
yield next_message
messages.append(next_message)
elif isinstance(parsed, AgentFinish):
return
def get_agent_executor(tools: list[BaseTool], system_message: str):
llm = ChatOpenAI(model="gpt-4-1106-preview", temperature=0, streaming=True)
prompt = ChatPromptTemplate.from_messages(
[
("system", system_message),
MessagesPlaceholder(variable_name="messages"),
]
)
llm_with_tools = llm.bind(
functions=[format_tool_to_openai_function(t) for t in tools]
)
agent = {"messages": RunnablePassthrough()} | prompt | llm_with_tools
parser = OpenAIFunctionsAgentOutputParser()
executor = RunnableGenerator(_agent_executor_generator)
return executor.bind(
tools={tool.name for tool in tools}, agent=agent, parser=parser
)
agent = get_agent_executor([_get_tavily()], "You are a very nice agent!")
async def main():
async for message in agent.astream(
[HumanMessage(content="whats the weather in sf tomorrow?")]
):
print(message)
if __name__ == "__main__":
import asyncio
asyncio.run(main())
```
results in this trace
https://smith.langchain.com/public/fa17f05d-9724-4d08-8fa1-750f8fcd051b/r
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Description:** SingleFileFacebookMessengerChatLoader did not handle
the case for when messages had stickers and/or photos so fixed that.
- **Issue:** #15356
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** updates/enhancements to IBM
[watsonx.ai](https://www.ibm.com/products/watsonx-ai) LLM provider
(prompt tuned models and prompt templates deployments support)
- **Dependencies:**
[ibm-watsonx-ai](https://pypi.org/project/ibm-watsonx-ai/),
- **Tag maintainer:** : @hwchase17 , @eyurtsev , @baskaryan
- **Twitter handle:** details in comment below.
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally. ✅
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
The fix#14221 has broken default gitlab url which is forcing the users
to specify GITLAB_URL for default one. With this fix if GITLAB_URL is
not set, the default gitlab url will be taken.
- **Description:** Add the GITHUB URL instead of None
- **Issue:** the issue #14221 has broken the default github URL
- **Dependencies:** None
- **Tag maintainer:** @hwchase17
- **Twitter handle:** manjunath_shiva
- **Description:** This PR adds `api_base` to `_client_params` in the
`chat_model` of LiteLLM to ensure it's included in API calls.
Previously, `api_base` was set on the client but was not included in the
parameters passed to the completion function. This change ensures that
`api_base` is correctly passed to all API calls.
- **Issue:** #14338
- **Tag maintainer:** @hwchase17 @agola11
- **Twitter handle:** @LMS_David_RS
Sometimes, the tool_schema is like:
` {'action_name': 'search_items', 'action': {'term': 'pizza'}}`
sometimes, specially with gpt3.5 it comes like:
`{'action_name': 'search_items', 'term': 'pizza'}`
and it fails.
This PR is a way to make it work in both scenarios.
issues releated: #6624
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Co-authored-by: Lucca Zenobio <lucca.zenobio@ifood.com.br>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
This change addresses the issue where DashScopeEmbeddingAPI limits
requests to 25 lines of data, and DashScopeEmbeddings did not handle
cases with more than 25 lines, leading to errors. I have implemented a
fix to manage data exceeding this limit efficiently.
---------
Co-authored-by: xuxiang <xuxiang@aliyun.com>
Adding to my previously, already merged PR I made some further
improvements:
* Added documentation to the existing Pydantic Parser notebook, with an
example using LCEL and `with_retry()` on `OutputParserException`.
* Added an additional output example to the prompt
* More lenient parser in terms of LLM output format
* Amended unit test
FYI @hwchase17
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** Update _retrieve_ref inside json_schema.py to include
an isdigit() check
- **Issue:** This library is used inside dereference_refs inside
langchain_community.agent_toolkits.openapi.spec. When I read in a yaml
file which has references for "400", "401" etc; the line "out =
out[component]" causes a KeyError. The isdigit() check ensures that if
it is an integer like "400" or "401"; it converts it into integer before
using it as a key to prevent the error.
- **Dependencies:** No dependencies
- **Tag maintainer:** @baskaryan
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Description:** Using PGVector vector store, it was only possible to
filter for values equals, in or not in metadata. Extended this feature
to work with the following keywords : IN, NIN, BETWEEN, GT, LT, NE, EQ,
LIKE, CONTAINS, OR, AND
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
The regex used to match "Action" and "Action Input" in the output parser
has been updated. Previously, the regex did not correctly handle
multi-line inputs for "Action Input". The updated code uses the
're.DOTALL' flag to ensure multi-line inputs are correctly captured.
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:**
- This PR introduces a significant enhancement to the LangChain project
by integrating a new chat model powered by the third-generation base
large model, ChatGLM3, via the zhipuai API.
- This advanced model supports functionalities like function calls, code
interpretation, and intelligent Agent capabilities.
- The additions include the chat model itself, comprehensive
documentation in the form of Python notebook docs, and thorough testing
with both unit and integrated tests.
- **Dependencies:** This update relies on the ZhipuAI package as a key
dependency.
- **Twitter handle:** If this PR receives spotlight attention, we would
be honored to receive a mention for our integration of the advanced
ChatGLM3 model via the ZhipuAI API. Kindly tag us at @kaiwu.
To ensure quality and standards, we have performed extensive linting and
testing. Commands such as make format, make lint, and make test have
been run from the root of the modified package to ensure compliance with
LangChain's coding standards.
TO DO: Continue refining and enhancing both the unit tests and
integrated tests.
---------
Co-authored-by: jing <jingguo92@gmail.com>
Co-authored-by: hyy1987 <779003812@qq.com>
Co-authored-by: jianchuanqi <qijianchuan@hotmail.com>
Co-authored-by: lirq <whuclarence@gmail.com>
Co-authored-by: whucalrence <81530213+whucalrence@users.noreply.github.com>
Co-authored-by: Jing Guo <48378126+JaneCrystall@users.noreply.github.com>
Description: Volcano Ark is an enterprise-grade large-model service
platform for developers, providing a full range of functions and
services such as model training, inference, evaluation, fine-tuning. You
can visit its homepage at https://www.volcengine.com/docs/82379/1099455
for details. This change could help developers use the platform for
embedding.
Issue: None
Dependencies: volcengine
Tag maintainer: @baskaryan
Twitter handle: @hinnnnnnnnnnnns
---------
Co-authored-by: lujingxuansc <lujingxuansc@bytedance.com>
**Description:** the MWDumpLoader implementation currently does not
support the lazy_load method, and the files are usually very large. We
are proposing refactoring the load function, extracting two private
functions with the functionality of loading the dump file and parsing a
single page, to reuse the code in the lazy_load implementation.
**Description:**
This PR adds the `**kwargs` parameter to six calls in the `chroma.py`
package. All functions already were able to receive `kwargs` but they
were discarded before.
**Issue:**
When passing `kwargs` to functions in the `chroma.py` package they are
being ignored.
For example:
```
chroma_instance.similarity_search_with_score(
query,
k=100,
include=["metadatas", "documents", "distances", "embeddings"], # this parameter gets ignored
)
```
The `include` parameter does not get passed on to the next function and
does not have any effect.
**Dependencies:**
None
- **Description:**
- support custom kwargs in object initialization. For instantance, QPS
differs from multiple object(chat/completion/embedding with diverse
models), for which global env is not a good choice for configuration.
- **Issue:** no
- **Dependencies:** no
- **Twitter handle:** no
@baskaryan PTAL
These can happen for edge cases not covered by `default` handler (eg.
"strange" keys in dicts)
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- Any direct usage of ThreadPoolExecutor or asyncio.run_in_executor
needs manual handling of context vars
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Description:** fix parse issue for AIMessageChunk when using
- **Issue:** https://github.com/langchain-ai/langchain/issues/14511
- **Dependencies:** none
- **Twitter handle:** none
Taken from this fix:
https://github.com/gpt-engineer-org/gpt-engineer/issues/804#issuecomment-1769853850
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** fixes and upgrades for the Tongyi LLM and ChatTongyi
Model
- Fixed typos; it should be `Tongyi`, not `OpenAI`.
- Fixed a bug in `stream_generate_with_retry`; it's a real stream
generator now.
- Fixed a bug in `validate_environment`; the `dashscope_api_key` should
be properly handled when set by environment variables or initialization
parameters.
- Changed the `dashscope` response to incremental output by setting the
parameter `incremental_output`, which eliminates the need for the
prefix-removal trick.
- Removed some unused parameters, like `n`, `prefix_messages`.
- Added `_stream` method.
- Added async methods support, such as `_astream`, `_agenerate`,
`_abatch`.
- **Dependencies:** No new dependencies.
- **Tag maintainer:** @hwchase17
> PS: Some may be confused about the terms `dashscope`, `tongyi`, and
`Qwen`:
> - `dashscope`: A platform to deploy LLMs and provide APIs to invoke
the LLM.
> - `tongyi`: A brand name or overall term about Alibaba Cloud's LLM/AI.
> - `Qwen`: An LLM that is open-sourced and deployed in `dashscope`.
>
> We use the `dashscope` SDK to interact with the `tongyi`-`Qwen` LLM.
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Correcting a small typo ('the' instead of 'then') and changing another
'the' (instead of 'then' too, it was a hard day for the 'n' key :D) to
'also' to match better with what is done in the code
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- do not match text after - in the middle of a sentence
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
…parse
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
```shell
Python 3.11.6 (main, Nov 2 2023, 04:39:43) [Clang 14.0.3 (clang-1403.0.22.14.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> s = {'name': 'gc', 'arguments': '{"prompt":"hi\nbob."}'}
>>> import json
>>> json.loads(s['arguments'])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
^^^^^^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Invalid control character at: line 1 column 14 (char 13)
>>> json.loads(s['arguments'].replace('\n', '\\n'))
{'prompt': 'hi\nbob.'}
>>>
```
---------
Co-authored-by: Nuno Campos <nuno@langchain.dev>
While using `chain.batch`, the default implementation uses a
`ThreadPoolExecutor` and run the chains in separate threads. An issue
with this approach is that that [the token counting
callback](https://python.langchain.com/docs/modules/callbacks/token_counting)
fails to work as a consequence of the context not being propagated
between threads. This PR adds context propagation to the new threads and
adds some thread synchronization in the OpenAI callback. With this
change, the token counting callback works as intended.
Having the context propagation change would be highly beneficial for
those implementing custom callbacks for similar functionalities as well.
---------
Co-authored-by: Nuno Campos <nuno@langchain.dev>
- Enables strict=False by default
- Uses partial json recovery logic by default
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
- **Description:** The Github error prompt is confused because of JWT
enctrypt to somebody not familiar with Github connection method. This PR
is to add some useful error prompt to help users troubleshooting.
- **Issue:**
https://github.com/langchain-ai/langchain/issues/14550#issuecomment-1867445049
- **Dependencies:** None,
- **Twitter handle:** None
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
…ableBinding
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
…unnableAssign or RunnablePick
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
**Description:** Implement stream and astream methods for RunnableLambda
to make streaming work for functions returning Runnable
- **Issue:** https://github.com/langchain-ai/langchain/issues/11998
- **Dependencies:** No new dependencies
- **Twitter handle:** https://twitter.com/qtangs
---------
Co-authored-by: Nuno Campos <nuno@langchain.dev>
**Description:**
Adding async methods to booth OllamaLLM and ChatOllama to enable async
streaming and async .on_llm_new_token callbacks.
**Issue:**
ChatOllama is not working in combination with an AsyncCallbackManager
because the .on_llm_new_token method is not awaited.
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- Added ensure_ascii property to ElasticsearchChatMessageHistory
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Ivan Chetverikov <ivan.chetverikov@raftds.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
**Description**: The parameter chunk_type was being hard coded to
"extractive_answers", so that when "snippet" was being passed, it was
being ignored. This change simply doesn't do that.
Added the call function get_summaries_as_docs inside of Arxivloader
- **Description:** Added a function that returns the documents from
get_summaries_as_docs, as the call signature is present in the parent
file but never used from Arxivloader, this can be used from Arxivloader
itself just like .load() as both the signatures are same.
- **Issue:** Reduces time to load papers as no pdf is processed only
metadata is pulled from Arxiv allowing users for faster load times on
bulk loads. Users can then choose one or more paper and use ID directly
with .load() to load pdf thereby loading all the contents of the paper.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
## Description
Changes the behavior of `add_user_message` and `add_ai_message` to allow
for messages of those types to be passed in. Currently, if you want to
use the `add_user_message` or `add_ai_message` methods, you have to pass
in a string. For `add_message` on `ChatMessageHistory`, however, you
have to pass a `BaseMessage`. This behavior seems a bit inconsistent.
Personally, I'd love to be able to be explicit that I want to
`add_user_message` and pass in a `HumanMessage` without having to grab
the `content` attribute. This PR allows `add_user_message` to accept
`HumanMessage`s or `str`s and `add_ai_message` to accept `AIMessage`s or
`str`s to add that functionality and ensure backwards compatibility.
## Issue
* None
## Dependencies
* None
## Tag maintainer
@hinthornw
@baskaryan
## Note
`make test` results in `make: *** No rule to make target 'test'. Stop.`
- **Description:** `tools.gmail.send_message` implements a
`SendMessageSchema` that is not used anywhere. `GmailSendMessage` also
does not have an `args_schema` attribute (this led to issues when
invoking the tool with an OpenAI functions agent, at least for me). Here
we add the missing attribute and a minimal test for the tool.
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter handle:** N/A
---------
Co-authored-by: Chester Curme <chestercurme@microsoft.com>
- **Description:** In response to user feedback, this PR refactors the
Baseten integration with updated model endpoints, as well as updates
relevant documentation. This PR has been tested by end users in
production and works as expected.
- **Issue:** N/A
- **Dependencies:** This PR actually removes the dependency on the
`baseten` package!
- **Twitter handle:** https://twitter.com/basetenco
# Description
This PR adds the ability to pass a `botocore.config.Config` instance to
the boto3 client instantiated by the Bedrock LLM.
Currently, the Bedrock LLM doesn't support a way to pass a Config, which
means that some settings (e.g., timeouts and retry configuration)
require instantiating a new boto3 client with a Config and then
replacing the LLM's client:
```python
llm = Bedrock(
region_name='us-west-2',
model_id="anthropic.claude-v2",
model_kwargs={'max_tokens_to_sample': 4096, 'temperature': 0},
)
llm.client = boto_client('bedrock-runtime', region_name='us-west-2', config=Config({'read_timeout': 300}))
```
# Issue
N/A
# Dependencies
N/A
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
fix spellings
**seperate -> separate**: found more occurrences, see
https://github.com/langchain-ai/langchain/pull/14602
**initialise -> intialize**: the latter is more common in the repo
**pre-defined > predefined**: adding a comma after a prefix is a
delicate matter, but this is a generally accepted word
also, another word that appears in the repo is "fs" (stands for
filesystem), e.g., in `libs/core/langchain_core/prompts/loading.py`
` """Unified method for loading a prompt from LangChainHub or local
fs."""`
Isn't "filesystem" better?
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Ran <rccalman@gmail.com>
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Description:** This PR fixes test failures on Windows caused by path
handling differences and unescaped special characters in regex. The
failing tests are:
```
FAILED tests/unit_tests/storage/test_filesystem.py::test_yield_keys - AssertionError: assert ['key1', 'subdir\\key2'] == ['key1', 'subdir/key2']
FAILED tests/unit_tests/test_imports.py::test_importable_all - ModuleNotFoundError: No module named 'langchain_community.langchain_community\\adapters'
FAILED tests/unit_tests/tools/file_management/test_utils.py::test_get_validated_relative_path_errs_on_absolute - re.error: incomplete escape \U at position 53
FAILED tests/unit_tests/tools/file_management/test_utils.py::test_get_validated_relative_path_errs_on_parent_dir - re.error: incomplete escape \U at position 69
FAILED tests/unit_tests/tools/file_management/test_utils.py::test_get_validated_relative_path_errs_for_symlink_outside_root - re.error: incomplete escape \U at position 64
```
- **Issue:** fixes
https://github.com/langchain-ai/langchain/issues/11775 (partially)
- **Dependencies:** none
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Builds on #14040 with community refactor merged and notebook updated.
Note that with this refactor, models will be imported from
`langchain_community.chat_models.huggingface` rather than the main
`langchain` repo.
---------
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: ugm2 <unaigaraymaestre@gmail.com>
Signed-off-by: Yuchen Liang <yuchenl3@andrew.cmu.edu>
Co-authored-by: Andrew Reed <andrew.reed.r@gmail.com>
Co-authored-by: Andrew Reed <areed1242@gmail.com>
Co-authored-by: A-Roucher <aymeric.roucher@gmail.com>
Co-authored-by: Aymeric Roucher <69208727+A-Roucher@users.noreply.github.com>
Replace this entire comment with:
- **Description:** @kurtisvg has raised a point that it's a good idea to
have a fixed version for embeddings (since otherwise a user might run a
query with one version vs a vectorstore where another version was used).
In order to avoid breaking changes, I'd suggest to give users a warning,
and make a `model_name` a required argument in 1.5 months.
Surrealdb client changes from 0.3.1 to 0.3.2 broke the surrealdb vectore
integration.
This PR updates the code to work with the updated client. The change is
backwards compatible with previous versions of surrealdb client.
Also expanded the vector store implementation to store and retrieve
metadata that's included with the document object.
- **Description:** Fixed jaguar.py to import JaguarHttpClient with try
and catch
- **Issue:** the issue # Unable to use the JaguarHttpClient at run time
- **Dependencies:** It requires "pip install -U jaguardb-http-client"
- **Twitter handle:** workbot
---------
Co-authored-by: JY <jyjy@jaguardb>
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description**
For the Momento Vector Index (MVI) vector store implementation, pass
through `filter_expression` kwarg to the MVI client, if specified. This
change will enable the MVI self query implementation in a future PR.
Also fixes some integration tests.
- **Description:** Fix typo in class Docstring to replace
AZURE_OPENAI_API_ENDPOINT by AZURE_OPENAI_ENDPOINT
- **Issue:** the issue #14901
- **Dependencies:** NA
- **Twitter handle:**
Co-authored-by: Yacine Bouakkaz <Yacine.Bouakkaz@evokegroup.com>
* This PR adds `stream` implementations to Runnable Branch.
* Runnable Branch still does not support `transform` so it'll break streaming if it happens in middle or end of sequence, but will work if happens at beginning of sequence.
* Fixes use the async callback manager for async methods
* Handle BaseException rather than Exception, so more errors could be logged as errors when they are encountered
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Description: Adding Summarization to Vectara, to reflect it provides not
only vector-store type functionality but also can return a summary.
Also added:
MMR capability (in the Vectara platform side)
Updated templates
Updated documentation and IPYNB examples
Tag maintainer: @baskaryan
Twitter handle: @ofermend
---------
Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>
**What is the reproduce code?**
```python
from langchain.chains import LLMChain, load_chain
from langchain.llms import Databricks
from langchain.prompts import PromptTemplate
def transform_output(response):
# Extract the answer from the responses.
return str(response["candidates"][0]["text"])
def transform_input(**request):
full_prompt = f"""{request["prompt"]}
Be Concise.
"""
request["prompt"] = full_prompt
return request
chat_model = Databricks(
endpoint_name="llama2-13B-chat-Brambles",
transform_input_fn=transform_input,
transform_output_fn=transform_output,
verbose=True,
)
print(f"Test chat model: {chat_model('What is Apache Spark')}") # This works
llm_chain = LLMChain(llm=chat_model, prompt=PromptTemplate.from_template("{chat_input}"))
llm_chain("colorful socks") # this works
llm_chain.save("databricks_llm_chain.yaml") # transform_input_fn and transform_output_fn are not serialized into the model yaml file
loaded_chain = load_chain("databricks_llm_chain.yaml") # The Databricks LLM is recreated with transform_input_fn=None, transform_output_fn=None.
loaded_chain("colorful socks") # Thus this errors. The transform_output_fn is needed to produce the correct output
```
Error:
```
File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-6c34afab-3473-421d-877f-1ef18930ef4d/lib/python3.10/site-packages/pydantic/v1/main.py", line 341, in __init__
raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for Generation
text
str type expected (type=type_error.str)
request payload: {'query': 'What is a databricks notebook?'}'}
```
**What does the error mean?**
When the LLM generates an answer, represented by a Generation data
object. The Generation data object takes a str field called text, e.g.
Generation(text=”blah”). However, the Databricks LLM tried to put a
non-str to text, e.g. Generation(text={“candidates”:[{“text”: “blah”}]})
Thus, pydantic errors.
**Why the output format becomes incorrect after saving and loading the
Databricks LLM?**
Databrick LLM does not support serializing transform_input_fn and
transform_output_fn, so they are not serialized into the model yaml
file. When the Databricks LLM is loaded, it is recreated with
transform_input_fn=None, transform_output_fn=None. Without
transform_output_fn, the output text is not unwrapped, thus errors.
Missing transform_output_fn causes this error.
Missing transform_input_fn causes the additional prompt “Be Concise.” to
be lost after saving and loading.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
## Description
This PR intends to add support for Qdrant's new [sparse vector
retrieval](https://qdrant.tech/articles/sparse-vectors/) by introducing
a new retriever class, `QdrantSparseVectorRetriever`.
Necessary usage docs and integration tests have been added for the
retriever.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:**
This PR fixes the issue faces with duplicate input id in Clarifai
vectorstore class when ingesting documents into the vectorstore more
than the batch size.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
## Description
Similar to https://github.com/langchain-ai/langchain/issues/5861, I've
experienced `KeyError`s resulting from unsafe lookups in the
`convert_dict_to_message` function in [this
file](https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/adapters/openai.py).
While that issue focused on `KeyError 'content'`, I've opened another
issue (#14764) about how the problem still exists in the same function
but with `KeyError 'role'`. The fix for #5861 only added a safe lookup
to the specific line that was giving them trouble.. This PR fixes the
unsafe lookup in the rest of the function but the problem still exists
across the repo.
## Issues
* #14764
* #5861
## Dependencies
* None
## Checklist
[x] make format
[x] make lint
[ ] make test - Results in `make: *** No rule to make target 'test'.
Stop.`
## Maintainers
* @hinthornw
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
This PR adds support for PygmalionAI's [Aphrodite
Engine](https://github.com/PygmalionAI/aphrodite-engine), based on
vLLM's attention mechanism. At the moment, this PR does not include
support for the API servers, but they will be added in a later PR.
The only dependency as of now is `aphrodite-engine==0.4.2`. We pin the
version to prevent breakage due to changes in the aphrodite-engine
library.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Introducing an ability to work with the
[YandexGPT](https://cloud.yandex.com/en/services/yandexgpt) embeddings
models.
---------
Co-authored-by: Dmitry Tyumentsev <dmitry.tyumentsev@raftds.com>
- **Description:** Modify community chat model vertexai to handle png
and other image types encoded in base64
- **Dependencies:** added `import re` but no new dependencies.
This addresses a problem where the vertexai method
_parse_chat_history_gemini() was only recognizing image uris in jpeg
format. I made a simple change to cover other extension types.
- **Description:** The Qianfan SDK offers multiple authentication
methods, but in the `QianfanEndpoint` of Langchain, it currently only
supports authentication through AK and SK. In order to accommodate users
who wish to use alternative authentication methods, this pull request
makes AK and SK optional. This change should not impact existing users,
while allowing users to configure other authentication methods as per
the Qianfan SDK documentation.
- **Issue:** /
- **Dependencies:** No
- **Tag maintainer:** No
- **Twitter handle:**
Added Entry ID as a return value inside get_summaries_as_docs
- **Description:** Added the Entry ID as a return, so it's easier to
track the IDs of the papers that are being returned.
With the addition return of the entry ID in functions like
ArxivRetriever, it will be easier to reference the ID of the paper
itself.
- **Description:** Going forward, we have a own API `pip install
gradientai`. Therefore gradually removing the self-build packages in
llamaindex, haystack and langchain.
- **Issue:** None.
- **Dependencies:** `pip install gradientai`
- **Tag maintainer:** @michaelfeil
Very simple change in relation to the issue
https://github.com/langchain-ai/langchain/issues/14550
@baskaryan, @eyurtsev, @hwchase17.
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:** Added logic for re-calling the YandexGPT API in case of
an error
---------
Co-authored-by: Dmitry Tyumentsev <dmitry.tyumentsev@raftds.com>
Description: A new vector store Jaguar is being added. Class, test
scripts, and documentation is added.
Issue: None -- This is the first PR contributing to LangChain
Dependencies: This depends on "pip install -U jaguardb-http-client"
client http package
Tag maintainer: @baskaryan, @eyurtsev, @hwchase1
Twitter handle: @workbot
---------
Co-authored-by: JY <jyjy@jaguardb>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Addded missed docstrings. Fixed inconsistency in docstrings.
**Note** CC @efriis
There were PR errors on
`langchain_experimental/prompt_injection_identifier/hugging_face_identifier.py`
But, I didn't touch this file in this PR! Can it be some cache problems?
I fixed this error.
- **Description:** added support for chat_history for Google
GenerativeAI (to actually use the `chat` API) plus since Gemini
currently doesn't have a support for SystemMessage, added support for it
only if a user provides additional `convert_system_message_to_human`
flag during model initialization (in this case, SystemMessage would be
prepanded to the first HumanMessage)
- **Issue:** #14710
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** lkuligin
---------
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
- **Description:** This is addition to [my previous
PR](https://github.com/langchain-ai/langchain/pull/13930) with
improvements to flexibility allowing different models and notebook to
use ONNX runtime for faster speed. Since the last PR, [our
model](https://huggingface.co/laiyer/deberta-v3-base-prompt-injection)
got more than 660k downloads, and with the [public
benchmark](https://huggingface.co/spaces/laiyer/prompt-injection-benchmark)
showed much fewer false-positives than the previous one from deepset.
Additionally, on the ONNX runtime, it can be running 3x faster on the
CPU, which might be handy for builders using Langchain.
**Issue:** N/A
- **Dependencies:** N/A
- **Tag maintainer:** N/A
- **Twitter handle:** `@laiyer_ai`
Fixing issue - https://github.com/langchain-ai/langchain/issues/14494 to
avoid Kendra query ValidationException
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** Update kendra.py to avoid Kendra query
ValidationException,
- **Issue:** the issue
#https://github.com/langchain-ai/langchain/issues/14494,
- **Dependencies:** None,
- **Tag maintainer:** ,
- **Twitter handle:**
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:**
- Add a break case to `text_splitter.py::split_text_on_tokens()` to
avoid unwanted item at the end of result.
- Add a testcase to enforce the behavior.
- **Issue:**
- #14649
- #5897
- **Dependencies:** n/a,
---
**Quick illustration of change:**
```
text = "foo bar baz 123"
tokenizer = Tokenizer(
chunk_overlap=3,
tokens_per_chunk=7
)
output = split_text_on_tokens(text=text, tokenizer=tokenizer)
```
output before change: `["foo bar", "bar baz", "baz 123", "123"]`
output after change: `["foo bar", "bar baz", "baz 123"]`
This is technically a breaking change because it'll switch out default
models from `text-davinci-003` to `gpt-3.5-turbo-instruct`, but OpenAI
is shutting off those endpoints on 1/4 anyways.
Feels less disruptive to switch out the default instead.
Gpt-3.5 sometimes calls with empty string arguments instead of `{}`
I'd assume it's because the typescript representation on their backend
makes it a bit ambiguous.
- **Description:** VertexAIEmbeddings performance improvements
- **Twitter handle:** @vladkol
## Improvements
- Dynamic batch size, starting from 250, lowering down to 5. Batch size
varies across regions.
Some regions support larger batches, and it significantly improves
performance.
When running large batches of texts in `us-central1`, performance gain
can be up to 3.5x.
The dynamic batching also makes sure every batch is below 20K token
limit.
- New model parameter `embeddings_type` that translates to `task_type`
parameter of the API. Newer model versions support [different embeddings
task
types](https://cloud.google.com/vertex-ai/docs/generative-ai/embeddings/get-text-embeddings#api_changes_to_models_released_on_or_after_august_2023).
Now that it's supported again for OAI chat models .
Shame this wouldn't include it in the `.invoke()` output though (it's
not included in the message itself). Would need to do a follow-up for
that to be the case
Fixed:
- `_agenerate` return value in the YandexGPT Chat Model
- duplicate line in the documentation
Co-authored-by: Dmitry Tyumentsev <dmitry.tyumentsev@raftds.com>
Builds out a developer documentation section in the docs
- Links it from contributing.md
- Adds an initial guide on how to contribute an integration
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Adds the option for `similarity_score_threshold` when using
`MongoDBAtlasVectorSearch` as a vector store retriever.
Example use:
```
vector_search = MongoDBAtlasVectorSearch.from_documents(...)
qa_retriever = vector_search.as_retriever(
search_type="similarity_score_threshold",
search_kwargs={
"score_threshold": 0.5,
}
)
qa = RetrievalQA.from_chain_type(
llm=OpenAI(),
chain_type="stuff",
retriever=qa_retriever,
)
docs = qa({"query": "..."})
```
I've tested this feature locally, using a MongoDB Atlas Cluster with a
vector search index.
… (#14723)
- **Description:** Minor updates per marketing requests. Namely, name
decisions (AI Foundation Models / AI Playground)
- **Tag maintainer:** @hinthornw
Do want to pass around the PR for a bit and ask a few more marketing
questions before merge, but just want to make sure I'm not working in a
vacuum. No major changes to code functionality intended; the PR should
be for documentation and only minor tweaks.
Note: QA model is a bit borked across staging/prod right now. Relevant
teams have been informed and are looking into it, and I'm placeholdered
the response to that of a working version in the notebook.
Co-authored-by: Vadim Kudlay <32310964+VKudlay@users.noreply.github.com>
Replace this entire comment with:
- **Description:** added support for new Google GenerativeAI models
- **Twitter handle:** lkuligin
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Description: Added NVIDIA AI Playground Initial support for a selection of models (Llama models, Mistral, etc.)
Dependencies: These models do depend on the AI Playground services in NVIDIA NGC. API keys with a significant amount of trial compute are available (10K queries as of the time of writing).
H/t to @VKudlay
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Co-authored-by: fangkeke <3339698829@qq.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Add a new ChatGoogleGenerativeAI class in a `langchain-google-genai`
package.
Still todo: add a deprecation warning in PALM
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Leonid Kuligin <lkuligin@yandex.ru>
Co-authored-by: Bagatur <baskaryan@gmail.com>
h/t to @lkuligin
- **Description:** added new models on VertexAI
- **Twitter handle:** @lkuligin
---------
Co-authored-by: Leonid Kuligin <lkuligin@yandex.ru>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
This PR adds an example notebook for the Databricks Vector Search vector
store. It also adds an introduction to the Databricks Vector Search
product on the Databricks's provider page.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
When using local Chatglm2-6B by changing OPENAI_BASE_URL to localhost,
the token_usage in ChatOpenAI becomes None. This leads to an
AttributeError when trying to access token_usage.items().
This commit adds a check to ensure token_usage is not None before
accessing its items. This change prevents the AttributeError and allows
ChatOpenAI to work seamlessly with a local Chatglm2-6B model, aligning
with the way it operates with the OpenAI API.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
**Description:** This PR fixes `HuggingFaceHubEmbeddings` by making the
API token optional (as in the client beneath). Most models don't require
one. I also updated the notebook for TEI (text-embeddings-inference)
accordingly as requested here #14288. In addition, I fixed a mistake in
the POST call parameters.
**Tag maintainers:** @baskaryan
## Description
New YAML output parser as a drop-in replacement for the Pydantic output
parser. Yaml is a much more token-efficient format than JSON, proving to
be **~35% faster and using the same percentage fewer completion
tokens**.
☑️ Formatted
☑️ Linted
☑️ Tested (analogous to the existing`test_pydantic_parser.py`)
The YAML parser excels in situations where a list of objects is
required, where the root object needs no key:
```python
class Products(BaseModel):
__root__: list[Product]
```
I ran the prompt `Generate 10 healthy, organic products` 10 times on one
chain using the `PydanticOutputParser`, the other one using
the`YamlOutputParser` with `Products` (see below) being the targeted
model to be created.
LLMs used were Fireworks' `lama-v2-34b-code-instruct` and OpenAI
`gpt-3.5-turbo`. All runs succeeded without validation errors.
```python
class Nutrition(BaseModel):
sugar: int = Field(description="Sugar in grams")
fat: float = Field(description="% of daily fat intake")
class Product(BaseModel):
name: str = Field(description="Product name")
stats: Nutrition
class Products(BaseModel):
"""A list of products"""
products: list[Product] # Used `__root__` for the yaml chain
```
Stats after 10 runs reach were as follows:
### JSON
ø time: 7.75s
ø tokens: 380.8
### YAML
ø time: 5.12s
ø tokens: 242.2
Looking forward to feedback, tips and contributions!
- **Description:** There is a bug in RedisNum filter that filter towards
value 0 will be parsed as "*". This is a fix to it.
- **Issue:** NA
- **Dependencies:** NA
- **Tag maintainer:** NA
- **Twitter handle:** NA
This PR updates RunnableWithMessage history to support user specific
configuration for the factory.
It extends support to passing multiple named arguments into the factory
if the factory takes more than a single argument.
TIL `**` globstar doesn't work in make
Makefile changes fix that.
`__getattr__` changes allow import of all files, but raise error when
accessing anything from the module.
file deletions were corresponding libs change from #14559
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
**Description**
The `SmartLLMChain` was was fixed to output key "resolution".
Unfortunately, this prevents the ability to use multiple `SmartLLMChain`
in a `SequentialChain` because of colliding output keys. This change
simply gives the option the customize the output key to allow for
sequential chaining. The default behavior is the same as the current
behavior.
Now, it's possible to do the following:
```
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain_experimental.smart_llm import SmartLLMChain
from langchain.chains import SequentialChain
joke_prompt = PromptTemplate(
input_variables=["content"],
template="Tell me a joke about {content}.",
)
review_prompt = PromptTemplate(
input_variables=["scale", "joke"],
template="Rate the following joke from 1 to {scale}: {joke}"
)
llm = ChatOpenAI(temperature=0.9, model_name="gpt-4-32k")
joke_chain = SmartLLMChain(llm=llm, prompt=joke_prompt, output_key="joke")
review_chain = SmartLLMChain(llm=llm, prompt=review_prompt, output_key="review")
chain = SequentialChain(
chains=[joke_chain, review_chain],
input_variables=["content", "scale"],
output_variables=["review"],
verbose=True
)
response = chain.run({"content": "chickens", "scale": "10"})
print(response)
```
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
This reverts commit 38813d7090. This is a
temporary fix, as I don't see a clear way on how to use multiple keys
with `Qdrant.from_texts`.
Context: #14378
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Brace Sproul <braceasproul@gmail.com>
- **Description:** In Qdrant allows to input list of keys as the
content_payload_key to retrieve multiple fields (the generated document
will contain the dictionary {field: value} in a string),
- **Issue:** Previously we were able to retrieve only one field from the
vector database when making a search
- **Dependencies:**
- **Tag maintainer:**
- **Twitter handle:** @jb_dlb
---------
Co-authored-by: Jean Baptiste De La Broise <jeanbaptiste.delabroise@mdpi.com>
Description: This PR masked baidu qianfan - Chat_Models API Key and
added unit tests.
Issue: the issue langchain-ai#12165.
Tag maintainer: @eyurtsev
---------
Co-authored-by: xiayi <xiayi@bytedance.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
We found a request with `max_tokens=None` results in the following error
in Anthropic:
```
HTTPError: 400 Client Error: Bad Request for url: https://oregon.staging.cloud.databricks.com/serving-endpoints/corey-anthropic/invocations.
Response text: {"error_code":"INVALID_PARAMETER_VALUE","message":"INVALID_PARAMETER_VALUE: max_tokens was not of type Integer: null"}
```
This PR excludes `max_tokens` if it's None.
- **Description:** new parameters in OpenAIEmbeddings() constructor
(retry_min_seconds and retry_max_seconds) that allow parametrization by
the user of the former min_seconds and max_seconds that were hidden in
_create_retry_decorator() and _async_retry_decorator()
- **Issue:** #9298, #12986
- **Dependencies:** none
- **Tag maintainer:** @hwchase17
- **Twitter handle:** @adumont
make format ✅
make lint ✅
make test ✅
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Description :
Updated the functions with new Clarifai python SDK.
Enabled initialisation of Clarifai class with model URL.
Updated docs with new functions examples.
Remove whitespaces from the input of the ListSQLDatabaseTool for better
support.
for example, the input "table1,table2,table3" will throw an exception
whiteout the change although it's a valid input.
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** add gitlab url from env,
- **Issue:** no issue,
- **Dependencies:** no,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
This PR adds support for metadata filters of the form:
`{"filter": {"key": { "NIN" : ["list", "of", "values"]}}}`
"IN" is already supported, so this is a quick & related update to add
"NIN"
- **Description:**
1. Add system parameters to the ERNIE LLM API to set the role of the
LLM.
2. Add support for the ERNIE-Bot-turbo-AI model according from the
document https://cloud.baidu.com/doc/WENXINWORKSHOP/s/Alp0kdm0n.
3. For the function call of ErnieBotChat, align with the
QianfanChatEndpoint.
With this PR, the `QianfanChatEndpoint()` can use the `function calling`
ability with `create_ernie_fn_chain()`. The example is as the following:
```
from langchain.prompts import ChatPromptTemplate
import json
from langchain.prompts.chat import (
ChatPromptTemplate,
)
from langchain.chat_models import QianfanChatEndpoint
from langchain.chains.ernie_functions import (
create_ernie_fn_chain,
)
def get_current_news(location: str) -> str:
"""Get the current news based on the location.'
Args:
location (str): The location to query.
Returs:
str: Current news based on the location.
"""
news_info = {
"location": location,
"news": [
"I have a Book.",
"It's a nice day, today."
]
}
return json.dumps(news_info)
def get_current_weather(location: str, unit: str="celsius") -> str:
"""Get the current weather in a given location
Args:
location (str): location of the weather.
unit (str): unit of the tempuature.
Returns:
str: weather in the given location.
"""
weather_info = {
"location": location,
"temperature": "27",
"unit": unit,
"forecast": ["sunny", "windy"],
}
return json.dumps(weather_info)
template = ChatPromptTemplate.from_messages([
("user", "{user_input}"),
])
chat = QianfanChatEndpoint(model="ERNIE-Bot-4")
chain = create_ernie_fn_chain([get_current_weather, get_current_news], chat, template, verbose=True)
res = chain.run("北京今天的新闻是什么?")
print(res)
```
The result of the above code:
```
> Entering new LLMChain chain...
Prompt after formatting:
Human: 北京今天的新闻是什么?
> Finished chain.
{'name': 'get_current_news', 'arguments': {'location': '北京'}}
```
For the `ErnieBotChat`, now can use the `system` parameter to set the
role of the LLM.
```
from langchain.prompts import ChatPromptTemplate
from langchain.chains import LLMChain
from langchain.chat_models import ErnieBotChat
llm = ErnieBotChat(model_name="ERNIE-Bot-turbo-AI", system="你是一个能力很强的机器人,你的名字叫 小叮当。无论问你什么问题,你都可以给出答案。")
prompt = ChatPromptTemplate.from_messages(
[
("human", "{query}"),
]
)
chain = LLMChain(llm=llm, prompt=prompt, verbose=True)
res = chain.run(query="你是谁?")
print(res)
```
The result of the above code:
```
> Entering new LLMChain chain...
Prompt after formatting:
Human: 你是谁?
> Finished chain.
我是小叮当,一个智能机器人。我可以为你提供各种服务,包括回答问题、提供信息、进行计算等。如果你需要任何帮助,请随时告诉我,我会尽力为你提供最好的服务。
```
- **Description:** Added a notebook to illustrate how to use
`text-embeddings-inference` from huggingface. As
`HuggingFaceHubEmbeddings` was using a deprecated client, I made the
most of this PR updating that too.
- **Issue:** #13286
- **Dependencies**: None
- **Tag maintainer:** @baskaryan
- **Description:** Update code to correctly pass the kwargs
- **Issue:** #14295
- **Dependencies:** -
- **Tag maintainer:**
<--
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
#issue-14295
- **Description:** allows not enforcing function usage when a single
function is passed to an openAI function executable (or corresponding
legacy chain). This is a desired feature in the case where the model
does not have enough information to call a function, and needs to get
back to the user.
- **Issue:** N/A
- **Dependencies:** N/A
- **Tag maintainer:** N/A
Add metadata to the blob object. This makes it easier
to make a pipeline that properly propagates metadata information
from raw content to the derived content.
- Fixes `input_variables=[""]` crashing validations with a template
`"{}"`
- Uses `__cause__` for proper `Exception` chaining in
`check_valid_template`
- **Description:** Fix#11737 issue (extra_tools option of
create_pandas_dataframe_agent is not working),
- **Issue:** #11737 ,
- **Dependencies:** no,
- **Tag maintainer:** @baskaryan, @eyurtsev, @hwchase17 I needed this
method at work, so I modified it myself and used it. There is a similar
issue(#11737) and PR(#13018) of @PyroGenesis, so I combined my code at
the original PR.
You may be busy, but it would be great help for me if you checked. Thank
you.
- **Twitter handle:** @lunara_x
If you need an .ipynb example about this, please tag me.
I will share what I am working on after removing any work-related
content.
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** Enhanced `create_sync_playwright_browser` and
`create_async_playwright_browser` functions to accept a list of
arguments. These arguments are now forwarded to
`browser.chromium.launch()` for customizable browser instantiation.
- **Issue:** #13143
- **Dependencies:** None
- **Tag maintainer:** @eyurtsev,
- **Twitter handle:** Dr_Bearden
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** Adapt JinaEmbeddings to run with the new Jina AI
Embedding platform
- **Twitter handle:** https://twitter.com/JinaAI_
---------
Co-authored-by: Joan Fontanals Martinez <joan.fontanals.martinez@jina.ai>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:**
Reference library azure-search-documents has been adapted in version
11.4.0:
1. Notebook explaining Azure AI Search updated with most recent info
2. HnswVectorSearchAlgorithmConfiguration --> HnswAlgorithmConfiguration
3. PrioritizedFields(prioritized_content_fields) -->
SemanticPrioritizedFields(content_fields)
4. SemanticSettings --> SemanticSearch
5. VectorSearch(algorithm_configurations) -->
VectorSearch(configurations)
--> Changes now reflected on Langchain: default vector search config
from langchain is now compatible with officially released library from
Azure.
- **Issue:**
Issue creating a new index (due to wrong class used for default vector
search configuration) if using latest version of azure-search-documents
with current langchain version
- **Dependencies:** azure-search-documents>=11.4.0,
- **Tag maintainer:** ,
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** This PR modifies the LLM validation in OpenAI
function agents to check whether the LLM supports OpenAI functions based
on a property (`supports_oia_functions`) instead of whether the LLM
passed to the agent `isinstance` of `ChatOpenAI`. This allows classes
that extend `BaseChatModel` to be passed to these agents as long as
they've been integrated with the OpenAI APIs and have this property set,
even if they don't extend `ChatOpenAI`.
- **Issue:** N/A
- **Dependencies:** none
for issue https://github.com/langchain-ai/langchain/issues/13162
migrate openai audio api, as [openai v1.0.0 Migration
Guide](https://github.com/openai/openai-python/discussions/742)
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Double Max <max@ground-map.com>
- **Description:** In openapi/planner deal with json in markdown output
cases
- **Issue:** In some cases LLMs could return json in markdown which
can't be loaded.
- **Dependencies:**
- **Tag maintainer:** @eyurtsev
- **Twitter handle:**
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** Adds doc key to metadata field when adding document
to Azure Search.
- **Issue:** -,
- **Dependencies:** -,
- **Tag maintainer:** @eyurtsev,
- **Twitter handle:** @finnless
Right now the document key with the name FIELDS_ID is not included in
the FIELDS_METADATA field, and therefore is not included in the Document
returned from a query. This is really annoying if you want to be able to
modify that item in the vectorstore.
Other's thoughts on this are welcome.
Description: There's a copy-paste typo where on_llm_error() calls
_on_chain_error() instead of _on_llm_error().
Issue: #13580
Dependencies: None
Tag maintainer: @hwchase17
Twitter handle: @jwatte
"Run `make format`, `make lint` and `make test` to check this locally."
The test scripts don't work in a plain Ubuntu LTS 20.04 system.
It looks like the dev container pulling is stuck. Or maybe the internet
is just ornery today.
---------
Co-authored-by: jwatte <jwatte@observeinc.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
here it is validating shapely.geometry.point.Point: if not
isinstance(data_frame[page_content_column].iloc[0], gpd.GeoSeries):
raise ValueError(
f"Expected data_frame[{page_content_column}] to be a GeoSeries" you need
it to validate the geoSeries and not the shapely.geometry.point.Point
if not isinstance(data_frame[page_content_column], gpd.GeoSeries):
raise ValueError(
f"Expected data_frame[{page_content_column}] to be a GeoSeries"
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
**Description**
Implements `max_marginal_relevance_search` and
`max_marginal_relevance_search_by_vector` for the Momento Vector Index
vectorstore.
Additionally bumps the `momento` dependency in the lock file and adds
logging to the implementation.
**Dependencies**
✅ updates `momento` dependency in lock file
**Tag maintainer**
@baskaryan
**Twitter handle**
Please tag @momentohq for Momento Vector Index and @mloml for the
contribution 🙇
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Hi! I'm Alex, Python SDK Team Lead from
[Comet](https://www.comet.com/site/).
This PR contains our new integration between langchain and Comet -
`CometTracer` class which uses new `comet_llm` python package for
submitting data to Comet.
No additional dependencies for the langchain package are required
directly, but if the user wants to use `CometTracer`, `comet-llm>=2.0.0`
should be installed. Otherwise an exception will be raised from
`CometTracer.__init__`.
A test for the feature is included.
There is also an already existing callback (and .ipynb file with
example) which ideally should be deprecated in favor of a new tracer. I
wasn't sure how exactly you'd prefer to do it. For example we could open
a separate PR for that.
I'm open to your ideas :)
Running a large number of requests to Embaas' servers (or any server)
can result in intermittent network failures (both from local and
external network/service issues). This PR implements exponential backoff
retries to help mitigate this issue.
The Github utilities are fantastic, so I'm adding support for deeper
interaction with pull requests. Agents should read "regular" comments
and review comments, and the content of PR files (with summarization or
`ctags` abbreviations).
Progress:
- [x] Add functions to read pull requests and the full content of
modified files.
- [x] Function to use Github's built in code / issues search.
Out of scope:
- Smarter summarization of file contents of large pull requests (`tree`
output, or ctags).
- Smarter functions to checkout PRs and edit the files incrementally
before bulk committing all changes.
- Docs example for creating two agents:
- One watches issues: For every new issue, open a PR with your best
attempt at fixing it.
- The other watches PRs: For every new PR && every new comment on a PR,
check the status and try to finish the job.
<!-- Thank you for contributing to LangChain!
Replace this comment with:
- Description: a description of the change,
- Issue: the issue # it fixes (if applicable),
- Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!
Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use.
Maintainer responsibilities:
- General / Misc / if you don't know who to tag: @baskaryan
- DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
- Models / Prompts: @hwchase17, @baskaryan
- Memory: @hwchase17
- Agents / Tools / Toolkits: @hinthornw
- Tracing / Callbacks: @agola11
- Async: @agola11
If no one reviews your PR within a few days, feel free to @-mention the
same people again.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Allow users to pass a generic `BaseStore[str, bytes]` to
MultiVectorRetriever, removing the need to use the `create_kv_docstore`
method. This encoding will now happen internally.
@rlancemartin @eyurtsev
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
**Description:**
When a RunnableLambda only receives a synchronous callback, this
callback is wrapped into an async one since #13408. However, this
wrapping with `(*args, **kwargs)` causes the `accepts_config` check at
[/libs/core/langchain_core/runnables/config.py#L342](ee94ef55ee/libs/core/langchain_core/runnables/config.py (L342))
to fail, as this checks for the presence of a "config" argument in the
method signature.
Adding a `functools.wraps` around it, resolves it.
If we are not going to make the existing Docstore class also implement
`BaseStore[str, Document]`, IMO all base store implementations should
always be `[str, bytes]` so that they are more interchangeable.
CC @rlancemartin @eyurtsev
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** The existing version hardcoded search.windows.net in
the base url. This is not compatible with the gov cloud. I am allowing
the user to override the default for gov cloud support.,
- **Issue:** N/A, did not write up in an issue,
- **Dependencies:** None
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Nicholas Ceccarelli <nceccarelli2@moog.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** Obsidian templates can include
[variables](https://help.obsidian.md/Plugins/Templates#Template+variables)
using double curly braces. `ObsidianLoader` uses PyYaml to parse the
frontmatter of documents. This parsing throws an error when encountering
variables' curly braces. This is avoided by temporarily substituting
safe strings before parsing.
- **Issue:** #13887
- **Tag maintainer:** @hwchase17
**Description:**
Adds the document loader for [Couchbase](http://couchbase.com/), a
distributed NoSQL database.
**Dependencies:**
Added the Couchbase SDK as an optional dependency.
**Twitter handle:** nithishr
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Our PR is an integration of a Steam API Tool that
makes recommendations on steam games based on user's Steam profile and
provides information on games based on user provided queries.
- **Issue:** the issue # our PR implements:
https://github.com/langchain-ai/langchain/issues/12120
- **Dependencies:** python-steam-api library, steamspypi library and
decouple library
- **Tag maintainer:** @baskaryan, @hwchase17
- **Twitter handle:** N/A
Hello langchain Maintainers,
We are a team of 4 University of Toronto students contributing to
langchain as part of our course [CSCD01 (link to course
page)](https://cscd01.com/work/open-source-project). We hope our changes
help the community. We have run make format, make lint and make test
locally before submitting the PR. To our knowledge, our changes do not
introduce any new errors.
Our PR integrates the python-steam-api, steamspypi and decouple
packages. We have added integration tests to test our python API
integration into langchain and an example notebook is also provided.
Our amazing team that contributed to this PR: @JohnY2002, @shenceyang,
@andrewqian2001 and @muntaqamahmood
Thank you in advance to all the maintainers for reviewing our PR!
---------
Co-authored-by: Shence <ysc1412799032@163.com>
Co-authored-by: JohnY2002 <johnyuan0526@gmail.com>
Co-authored-by: Andrew Qian <andrewqian2001@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: JohnY <94477598+JohnY2002@users.noreply.github.com>
### Description
Starting from [openai version
1.0.0](17ac677995 (module-level-client)),
the camel case form of `openai.ChatCompletion` is no longer supported
and has been changed to lowercase `openai.chat.completions`. In
addition, the returned object only accepts attribute access instead of
index access:
```python
import openai
# optional; defaults to `os.environ['OPENAI_API_KEY']`
openai.api_key = '...'
# all client options can be configured just like the `OpenAI` instantiation counterpart
openai.base_url = "https://..."
openai.default_headers = {"x-foo": "true"}
completion = openai.chat.completions.create(
model="gpt-4",
messages=[
{
"role": "user",
"content": "How do I output all files in a directory using Python?",
},
],
)
print(completion.choices[0].message.content)
```
So I implemented a compatible adapter that supports both attribute
access and index access:
```python
In [1]: from langchain.adapters import openai as lc_openai
...: messages = [{"role": "user", "content": "hi"}]
In [2]: result = lc_openai.chat.completions.create(
...: messages=messages, model="gpt-3.5-turbo", temperature=0
...: )
In [3]: result.choices[0].message
Out[3]: {'role': 'assistant', 'content': 'Hello! How can I assist you today?'}
In [4]: result["choices"][0]["message"]
Out[4]: {'role': 'assistant', 'content': 'Hello! How can I assist you today?'}
In [5]: result = await lc_openai.chat.completions.acreate(
...: messages=messages, model="gpt-3.5-turbo", temperature=0
...: )
In [6]: result.choices[0].message
Out[6]: {'role': 'assistant', 'content': 'Hello! How can I assist you today?'}
In [7]: result["choices"][0]["message"]
Out[7]: {'role': 'assistant', 'content': 'Hello! How can I assist you today?'}
In [8]: for rs in lc_openai.chat.completions.create(
...: messages=messages, model="gpt-3.5-turbo", temperature=0, stream=True
...: ):
...: print(rs.choices[0].delta)
...: print(rs["choices"][0]["delta"])
...:
{'role': 'assistant', 'content': ''}
{'role': 'assistant', 'content': ''}
{'content': 'Hello'}
{'content': 'Hello'}
{'content': '!'}
{'content': '!'}
In [20]: async for rs in await lc_openai.chat.completions.acreate(
...: messages=messages, model="gpt-3.5-turbo", temperature=0, stream=True
...: ):
...: print(rs.choices[0].delta)
...: print(rs["choices"][0]["delta"])
...:
{'role': 'assistant', 'content': ''}
{'role': 'assistant', 'content': ''}
{'content': 'Hello'}
{'content': 'Hello'}
{'content': '!'}
{'content': '!'}
...
```
### Twitter handle
[lin_bob57617](https://twitter.com/lin_bob57617)
- **Description:** to support not only publicly available Hugging Face
endpoints, but also protected ones (created with "Inference Endpoints"
Hugging Face feature), I have added ability to specify custom api_url.
But if not specified, default behaviour won't change
- **Issue:** #9181,
- **Dependencies:** no extra dependencies
**Description:** The way the condition is checked in the
`return_stopped_response` function of `OpenAIAgent` may not be correct,
when the value returned is `AgentFinish` from the tools it does not work
properly.
Thanks for review, @baskaryan, @eyurtsev, @hwchase17.
- **Description:** Adds `llm_chain_kwargs` to `BaseRetrievalQA.from_llm`
so these can be passed to the LLM at runtime,
- **Issue:** https://github.com/langchain-ai/langchain/issues/14216,
---------
Signed-off-by: ugm2 <unaigaraymaestre@gmail.com>
- **Description:** As part of my conversation with Cerebrium team,
`model_api_request` will be no longer available in cerebrium lib so it
needs to be replaced.
- **Issue:** #12705 12705,
- **Dependencies:** Cerebrium team (agreed)
- **Tag maintainer:** @eyurtsev
- **Twitter handle:** No official Twitter account sorry :D
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:** Adding a possibility to use asynchronous callback
handler in human-in-the-loop validation tool. Very useful, for example,
if you want to implement a validation over Telegram bot.
**Issue:** -
**Dependencies:** -
---------
Co-authored-by: Daniyar_Supiyev <daniyar_supiyev@epam.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description** An integration to allow the Yellowbrick Data Warehouse
to function as a vector store
---------
Co-authored-by: markcusack <markcusack@markcusacksmac.lan>
Co-authored-by: markcusack <markcusack@Mark-Cusack-sMac.local>
- **Description**: This PR addresses an issue with the OpenAI API
streaming response, where initially the key (arguments) is provided but
the value is None. Subsequently, it updates with {"arguments": "{\n"},
leading to a type inconsistency that causes an exception. The specific
error encountered is ValueError: additional_kwargs["arguments"] already
exists in this message, but with a different type. This change aims to
resolve this inconsistency and ensure smooth API interactions.
- **Issue**: None.
- **Dependencies**: None.
- **Tag maintainer**: @eyurtsev
This is an updated version of #13229 based on the refactored code.
Credit goes to @superken01.
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** some vector stores have a flag for try deleting the
collection before creating it (such as ´vectorpg´). This is a useful
flag when prototyping indexing pipelines and also for integration tests.
Added the bool flag `pre_delete_collection ` to the constructor (default
False)
- **Tag maintainer:** @hemidactylus
- **Twitter handle:** nicoloboschi
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** This extends `OpenAIEmbeddings` to add support for
non-`tiktoken` based embeddings, specifically for use with the new
`text-generation-webui` API (`--extensions openai`) which does not
support `tiktoken` encodings, but rather strings
- **Issue:** Not found,
- **Dependencies:** HuggingFace `transformers.AutoTokenizer` is new
dependency for running the model without `tiktoken`
- **Tag maintainer:** @baskaryan based on last commit for
`langchain-core` refactor
- **Twitter handle:** @xychelsea
Modified the tokenization process to be model-agnostic, allowing for
both OpenAI and non-OpenAI model tokenizations, by setting the new
default `bool` flag `tiktoken_enabled` to `False`. This requeires
HuggingFace’s AutoTokenizer and handling tokenization for models
requiring different preprocessing steps to generate a chunked string
request rather than a list of integers.
Updated the embeddings generation process to accommodate non-OpenAI
models. This includes converting tokenized text into embeddings using
OpenAI’s and Hugging Face’s model architectures.
-->
Hi,
I made some code changes on the Hologres vector store to improve the
data insertion performance.
Also, this version of the code uses `hologres-vector` library. This
library is more convenient for us to update, and more efficient in
performance.
The code has passed the format/lint/spell check. I have run the unit
test for Hologres connecting to my own database.
Please check this PR again and tell me if anything needs to change.
Best,
Changgeng,
Developer @ Alibaba Cloud
Co-authored-by: Changgeng Zhao <zhaochanggeng.zcg@alibaba-inc.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** Fixes the Mathpix PDF loader API integration.
Specifically, ensures that Mathpix auth headers are provided for every
request, and ensures that we recognize all errors that can occur during
a request. Also, the option to provide API keys as kwargs never actually
worked before, but now that's fixed too.
- **Issue:** #11249
- **Dependencies:** None
- **Description:**
This PR introduces the Slack toolkit to LangChain, which allows users to
read and write to Slack using the Slack API. Specifically, we've added
the following tools.
1. get_channel: Provides a summary of all the channels in a workspace.
2. get_message: Gets the message history of a channel.
3. send_message: Sends a message to a channel.
4. schedule_message: Sends a message to a channel at a specific time and
date.
- **Issue:** This pull request addresses [Add Slack Toolkit
#11747](https://github.com/langchain-ai/langchain/issues/11747)
- **Dependencies:** package`slack_sdk`
Note: For this toolkit to function you will need to add a Slack app to
your workspace. Additional info can be found
[here](https://slack.com/help/articles/202035138-Add-apps-to-your-Slack-workspace).
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: ArianneLavada <ariannelavada@gmail.com>
Co-authored-by: ArianneLavada <84357335+ArianneLavada@users.noreply.github.com>
Co-authored-by: ariannelavada@gmail.com <you@example.com>
Unnecessarily overridden methods:
- Give the idea the subclass is doing something special (when it isn't)
- Block CTRL-click to the actual method
This PR removes some unnecessarily overridden methods in
`StdOutCallbackHandler`
Supercedes https://github.com/langchain-ai/langchain/pull/12858
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Hi,
There is some unintended behavior in Html2TextTransformer.
The current code is **directly modifying the original documents that are
passed as arguments to the function.**
Therefore, not only the return of the function but also the input
variables are being modified simultaneously.
**To resolve this, I added unit test code as well.**
reference link: [Shallow vs Deep Copying of Python
Objects](https://realpython.com/copying-python-objects/)
Thanks! ☺️
Before, we need to use `params` to pass extra parameters:
```python
from langchain.llms import Databricks
Databricks(..., params={"temperature": 0.0})
```
Now, we can directly specify extra params:
```python
from langchain.llms import Databricks
Databricks(..., temperature=0.0)
```
This PR adds an "Azure AI data" document loader, which allows Azure AI
users to load their registered data assets as a document object in
langchain.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
See PR title.
From what I can see, `poetry` will auto-include this. Please let me know
if I am missing something here.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
… properly
Fixed a bug that was causing the streaming transfer to not work
properly.
- **Description:
1、The on_llm_new_token method in the streaming callback can now be
called properly in streaming transfer mode.
2、In streaming transfer mode, LLM can now correctly output the complete
response instead of just the first token.
- **Tag maintainer: @wangxuqi
- **Twitter handle: @kGX7XJjuYxzX9Km
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
* Add support for passing a specific file to the file system blob loader
* Allow specifying a class parameter for the parser for the generic
loader
```python
class AudioLoader(GenericLoader):
@staticmethod
def get_parser(**kwargs):
return MyAudioParser(**kwargs):
```
The intent of the GenericLoader is to provide on-ramps from different
sources (e.g., web, s3, file system).
An alternative is to use pipelining syntax or creating a Pipeline
```
FileSystemBlobLoader(...) | MyAudioParser
```
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** just a little change of ErnieChatBot class
description, sugguesting user to use more suitable class
- **Issue:** none,
- **Dependencies:** none,
- **Tag maintainer:** @baskaryan ,
- **Twitter handle:** none
**Description**
`embed_with_retry` is for sync operations and not for async operations.
Use `async_embed_with_retry` for appropriate async operations.
I'm using `OpenAIEmbedding(http_client=httpx.AsyncClient())` with only
async operations.
However, I got an error when I use `embedding.aembed_documents` because
`embed_with_retry` uses sync OpenAI client with async http client.
Description
when the desc of arg in python docstring contains ":", the
`_parse_python_function_docstring` will raise **ValueError: too many
values to unpack (expected 2)**.
A sample desc would be:
"""
Args:
error_arg: this is an arg with an additional ":" symbol
"""
So, set `maxsplit` parameter to fix it.
The number of times I try to format a string (especially in lcel) is
embarrassingly high. Think this may be more actionable than the default
error message. Now I get nice helpful errors
```
KeyError: "Input to ChatPromptTemplate is missing variable 'input'. Expected: ['input'] Received: ['dialogue']"
```
**Description:** By combining the document timestamp refresh within a
single call to update(), this enables batching of multiple documents in
a single SQL statement. This is important for non-local databases where
tens of milliseconds has a huge impact on performance when doing
document-by-document SQL statements.
**Issue:** #11935
**Dependencies:** None
**Tag maintainer:** @eyurtsev
CC @baskaryan @hwchase17 @jmorganca
Having a bit of trouble importing `langchain_experimental` from a
notebook, will figure it out tomorrow
~Ah and also is blocked by #13226~
---------
Co-authored-by: Lance Martin <lance@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
## Description
Related to https://github.com/mlflow/mlflow/pull/10420. MLflow AI
gateway will be deprecated and replaced by the `mlflow.deployments`
module. Happy to split this PR if it's too large.
```
pip install git+https://github.com/langchain-ai/langchain.git@refs/pull/13699/merge#subdirectory=libs/langchain
```
## Dependencies
Install mlflow from https://github.com/mlflow/mlflow/pull/10420:
```
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/10420/merge
```
## Testing plan
The following code works fine on local and databricks:
<details><summary>Click</summary>
<p>
```python
"""
Setup
-----
mlflow deployments start-server --config-path examples/gateway/openai/config.yaml
databricks secrets create-scope <scope>
databricks secrets put-secret <scope> openai-api-key --string-value $OPENAI_API_KEY
Run
---
python /path/to/this/file.py secrets/<scope>/openai-api-key
"""
from langchain.chat_models import ChatMlflow, ChatDatabricks
from langchain.embeddings import MlflowEmbeddings, DatabricksEmbeddings
from langchain.llms import Databricks, Mlflow
from langchain.schema.messages import HumanMessage
from langchain.chains.loading import load_chain
from mlflow.deployments import get_deploy_client
import uuid
import sys
import tempfile
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
###############################
# MLflow
###############################
chat = ChatMlflow(
target_uri="http://127.0.0.1:5000", endpoint="chat", params={"temperature": 0.1}
)
print(chat([HumanMessage(content="hello")]))
embeddings = MlflowEmbeddings(target_uri="http://127.0.0.1:5000", endpoint="embeddings")
print(embeddings.embed_query("hello")[:3])
print(embeddings.embed_documents(["hello", "world"])[0][:3])
llm = Mlflow(
target_uri="http://127.0.0.1:5000",
endpoint="completions",
params={"temperature": 0.1},
)
print(llm("I am"))
llm_chain = LLMChain(
llm=llm,
prompt=PromptTemplate(
input_variables=["adjective"],
template="Tell me a {adjective} joke",
),
)
print(llm_chain.run(adjective="funny"))
# serialization/deserialization
with tempfile.TemporaryDirectory() as tmpdir:
print(tmpdir)
path = f"{tmpdir}/llm.yaml"
llm_chain.save(path)
loaded_chain = load_chain(path)
print(loaded_chain("funny"))
###############################
# Databricks
###############################
secret = sys.argv[1]
client = get_deploy_client("databricks")
# External - chat
name = f"chat-{uuid.uuid4()}"
client.create_endpoint(
name=name,
config={
"served_entities": [
{
"name": "test",
"external_model": {
"name": "gpt-4",
"provider": "openai",
"task": "llm/v1/chat",
"openai_config": {
"openai_api_key": "{{" + secret + "}}",
},
},
}
],
},
)
try:
chat = ChatDatabricks(
target_uri="databricks", endpoint=name, params={"temperature": 0.1}
)
print(chat([HumanMessage(content="hello")]))
finally:
client.delete_endpoint(endpoint=name)
# External - embeddings
name = f"embeddings-{uuid.uuid4()}"
client.create_endpoint(
name=name,
config={
"served_entities": [
{
"name": "test",
"external_model": {
"name": "text-embedding-ada-002",
"provider": "openai",
"task": "llm/v1/embeddings",
"openai_config": {
"openai_api_key": "{{" + secret + "}}",
},
},
}
],
},
)
try:
embeddings = DatabricksEmbeddings(target_uri="databricks", endpoint=name)
print(embeddings.embed_query("hello")[:3])
print(embeddings.embed_documents(["hello", "world"])[0][:3])
finally:
client.delete_endpoint(endpoint=name)
# External - completions
name = f"completions-{uuid.uuid4()}"
client.create_endpoint(
name=name,
config={
"served_entities": [
{
"name": "test",
"external_model": {
"name": "gpt-3.5-turbo-instruct",
"provider": "openai",
"task": "llm/v1/completions",
"openai_config": {
"openai_api_key": "{{" + secret + "}}",
},
},
}
],
},
)
try:
llm = Databricks(
endpoint_name=name,
model_kwargs={"temperature": 0.1},
)
print(llm("I am"))
finally:
client.delete_endpoint(endpoint=name)
# Foundation model - chat
chat = ChatDatabricks(
endpoint="databricks-llama-2-70b-chat", params={"temperature": 0.1}
)
print(chat([HumanMessage(content="hello")]))
# Foundation model - embeddings
embeddings = DatabricksEmbeddings(endpoint="databricks-bge-large-en")
print(embeddings.embed_query("hello")[:3])
# Foundation model - completions
llm = Databricks(
endpoint_name="databricks-mpt-7b-instruct", model_kwargs={"temperature": 0.1}
)
print(llm("hello"))
llm_chain = LLMChain(
llm=llm,
prompt=PromptTemplate(
input_variables=["adjective"],
template="Tell me a {adjective} joke",
),
)
print(llm_chain.run(adjective="funny"))
# serialization/deserialization
with tempfile.TemporaryDirectory() as tmpdir:
print(tmpdir)
path = f"{tmpdir}/llm.yaml"
llm_chain.save(path)
loaded_chain = load_chain(path)
print(loaded_chain("funny"))
```
Output:
```
content='Hello! How can I assist you today?'
[-0.025058426, -0.01938856, -0.027781019]
[-0.025058426, -0.01938856, -0.027781019]
sorry, but I cannot continue the sentence as it is incomplete. Can you please provide more information or context?
Sure, here's a classic one for you:
Why don't scientists trust atoms?
Because they make up everything!
/var/folders/dz/cd_nvlf14g9g__n3ph0d_0pm0000gp/T/tmpx_4no6ad
{'adjective': 'funny', 'text': "Sure, here's a classic one for you:\n\nWhy don't scientists trust atoms?\n\nBecause they make up everything!"}
content='Hello! How can I assist you today?'
[-0.025058426, -0.01938856, -0.027781019]
[-0.025058426, -0.01938856, -0.027781019]
a 23 year old female and I am currently studying for my master's degree
content="\nHello! It's nice to meet you. Is there something I can help you with or would you like to chat for a bit?"
[0.051055908203125, 0.007221221923828125, 0.003879547119140625]
[0.051055908203125, 0.007221221923828125, 0.003879547119140625]
hello back
Well, I don't really know many jokes, but I do know this funny story...
/var/folders/dz/cd_nvlf14g9g__n3ph0d_0pm0000gp/T/tmp7_ds72ex
{'adjective': 'funny', 'text': " Well, I don't really know many jokes, but I do know this funny story..."}
```
</p>
</details>
The existing workflow doesn't break:
<details><summary>click</summary>
<p>
```python
import uuid
import mlflow
from mlflow.models import ModelSignature
from mlflow.types.schema import ColSpec, Schema
class MyModel(mlflow.pyfunc.PythonModel):
def predict(self, context, model_input):
return str(uuid.uuid4())
with mlflow.start_run():
mlflow.pyfunc.log_model(
"model",
python_model=MyModel(),
pip_requirements=["mlflow==2.8.1", "cloudpickle<3"],
signature=ModelSignature(
inputs=Schema(
[
ColSpec("string", "prompt"),
ColSpec("string", "stop"),
]
),
outputs=Schema(
[
ColSpec(name=None, type="string"),
]
),
),
registered_model_name=f"lang-{uuid.uuid4()}",
)
# Manually create a serving endpoint with the registered model and run
from langchain.llms import Databricks
llm = Databricks(endpoint_name="<name>")
llm("hello") # 9d0b2491-3d13-487c-bc02-1287f06ecae7
```
</p>
</details>
## Follow-up tasks
(This PR is too large. I'll file a separate one for follow-up tasks.)
- Update `docs/docs/integrations/providers/mlflow_ai_gateway.mdx` and
`docs/docs/integrations/providers/databricks.md`.
---------
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
…parameters.
In Langchain's `dumps()` function, I've added a `**kwargs` parameter.
This allows users to pass additional parameters to the underlying
`json.dumps()` function, providing greater flexibility and control over
JSON serialization.
Many parameters available in `json.dumps()` can be useful or even
necessary in specific situations. For example, when using an Agent with
return_intermediate_steps set to true, the output is a list of
AgentAction objects. These objects can't be serialized without using
Langchain's `dumps()` function.
The issue arises when using the Agent with a language other than
English, which may contain non-ASCII characters like 'é'. The default
behavior of `json.dumps()` sets ensure_ascii to true, converting
`{"name": "José"}` into `{"name": "Jos\u00e9"}`. This can make the
output hard to read, especially in the case of intermediate steps in
agent logs.
By allowing users to pass additional parameters to `json.dumps()` via
Langchain's dumps(), we can solve this problem. For instance, users can
set `ensure_ascii=False` to maintain the original characters.
This update also enables users to pass other useful `json.dumps()`
parameters like `sort_keys`, providing even more flexibility.
The implementation takes into account edge cases where a user might pass
a "default" parameter, which is already defined by `dumps()`, or an
"indent" parameter, which is also predefined if `pretty=True` is set.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
### Description
Hello,
The [integration_test
README](https://github.com/langchain-ai/langchain/tree/master/libs/langchain/tests)
was indicating incorrect paths for the `.env.example` and `.env` files.
`tests/.env.example` ->`tests/integration_tests/.env.example`
While it’s a minor error, it could **potentially lead to confusion** for
the document’s readers, so I’ve made the necessary corrections.
Thank you! ☺️
### Related Issue
- https://github.com/langchain-ai/langchain/pull/2806
**Description:**
Added support for a Pandas DataFrame OutputParser with format
instructions, along with unit tests and a demo notebook. Namely, we've
added the ability to request data from a DataFrame, have the LLM parse
the request, and then use that request to retrieve a well-formatted
response.
Within LangChain, it seamlessly integrates with language models like
OpenAI's `text-davinci-003`, facilitating streamlined interaction using
the format instructions (just like the other output parsers).
This parser structures its requests as
`<operation/column/row>[<optional_array_params>]`. The instructions
detail permissible operations, valid columns, and array formats,
ensuring clarity and adherence to the required format.
For example:
- When the LLM receives the input: "Retrieve the mean of `num_legs` from
rows 1 to 3."
- The provided format instructions guide the LLM to structure the
request as: "mean:num_legs[1..3]".
The parser processes this formatted request, leveraging the LLM's
understanding to extract the mean of `num_legs` from rows 1 to 3 within
the Pandas DataFrame.
This integration allows users to communicate requests naturally, with
the LLM transforming these instructions into structured commands
understood by the `PandasDataFrameOutputParser`. The format instructions
act as a bridge between natural language queries and precise DataFrame
operations, optimizing communication and data retrieval.
**Issue:**
- https://github.com/langchain-ai/langchain/issues/11532
**Dependencies:**
No additional dependencies :)
**Tag maintainer:**
@baskaryan
**Twitter handle:**
No need. :)
---------
Co-authored-by: Wasee Alam <waseealam@protonmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
**Description:**
When using Vald, only insecure grpc connection was supported, so secure
connection is now supported.
In addition, grpc metadata can be added to Vald requests to enable
authentication with a token.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Response_if_no_docs_found is not implemented in
ConversationalRetrievalChain for async code paths. Implemented it and
added test cases
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
# Description
This PR implements Self-Query Retriever for MongoDB Atlas vector store.
I've implemented the comparators and operators that are supported by
MongoDB Atlas vector store according to the section titled "Atlas Vector
Search Pre-Filter" from
https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-stage/.
Namely:
```
allowed_comparators = [
Comparator.EQ,
Comparator.NE,
Comparator.GT,
Comparator.GTE,
Comparator.LT,
Comparator.LTE,
Comparator.IN,
Comparator.NIN,
]
"""Subset of allowed logical operators."""
allowed_operators = [
Operator.AND,
Operator.OR
]
```
Translations from comparators/operators to MongoDB Atlas filter
operators(you can find the syntax in the "Atlas Vector Search
Pre-Filter" section from the previous link) are done using the following
dictionary:
```
map_dict = {
Operator.AND: "$and",
Operator.OR: "$or",
Comparator.EQ: "$eq",
Comparator.NE: "$ne",
Comparator.GTE: "$gte",
Comparator.LTE: "$lte",
Comparator.LT: "$lt",
Comparator.GT: "$gt",
Comparator.IN: "$in",
Comparator.NIN: "$nin",
}
```
In visit_structured_query() the filters are passed as "pre_filter" and
not "filter" as in the MongoDB link above since langchain's
implementation of MongoDB atlas vector
store(libs\langchain\langchain\vectorstores\mongodb_atlas.py) in
_similarity_search_with_score() sets the "filter" key to have the value
of the "pre_filter" argument.
```
params["filter"] = pre_filter
```
Test cases and documentation have also been added.
# Issue
#11616
# Dependencies
No new dependencies have been added.
# Documentation
I have created the notebook mongodb_atlas_self_query.ipynb outlining the
steps to get the self-query mechanism working.
I worked closely with [@Farhan-Faisal](https://github.com/Farhan-Faisal)
on this PR.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
# Description
We implemented a simple tool for accessing the Merriam-Webster
Collegiate Dictionary API
(https://dictionaryapi.com/products/api-collegiate-dictionary).
Here's a simple usage example:
```py
from langchain.llms import OpenAI
from langchain.agents import load_tools, initialize_agent, AgentType
llm = OpenAI()
tools = load_tools(["serpapi", "merriam-webster"], llm=llm) # Serp API gives our agent access to Google
agent = initialize_agent(
tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)
agent.run("What is the english word for the german word Himbeere? Define that word.")
```
Sample output:
```
> Entering new AgentExecutor chain...
I need to find the english word for Himbeere and then get the definition of that word.
Action: Search
Action Input: "English word for Himbeere"
Observation: {'type': 'translation_result'}
Thought: Now I have the english word, I can look up the definition.
Action: MerriamWebster
Action Input: raspberry
Observation: Definitions of 'raspberry':
1. rasp-ber-ry, noun: any of various usually black or red edible berries that are aggregate fruits consisting of numerous small drupes on a fleshy receptacle and that are usually rounder and smaller than the closely related blackberries
2. rasp-ber-ry, noun: a perennial plant (genus Rubus) of the rose family that bears raspberries
3. rasp-ber-ry, noun: a sound of contempt made by protruding the tongue between the lips and expelling air forcibly to produce a vibration; broadly : an expression of disapproval or contempt
4. black raspberry, noun: a raspberry (Rubus occidentalis) of eastern North America that has a purplish-black fruit and is the source of several cultivated varieties —called also blackcap
Thought: I now know the final answer.
Final Answer: Raspberry is an english word for Himbeere and it is defined as any of various usually black or red edible berries that are aggregate fruits consisting of numerous small drupes on a fleshy receptacle and that are usually rounder and smaller than the closely related blackberries.
> Finished chain.
```
# Issue
This closes#12039.
# Dependencies
We added no extra dependencies.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Lara <63805048+larkgz@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** Update the document for drop box loader + made the
messages more verbose when loading pdf file since people were getting
confused
- **Issue:** #13952
- **Tag maintainer:** @baskaryan, @eyurtsev, @hwchase17,
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** Added a tool called RedditSearchRun and an
accompanying API wrapper, which searches Reddit for posts with support
for time filtering, post sorting, query string and subreddit filtering.
- **Issue:** #13891
- **Dependencies:** `praw` module is used to search Reddit
- **Tag maintainer:** @baskaryan , and any of the other maintainers if
needed
- **Twitter handle:** None.
Hello,
This is our first PR and we hope that our changes will be helpful to the
community. We have run `make format`, `make lint` and `make test`
locally before submitting the PR. To our knowledge, our changes do not
introduce any new errors.
Our PR integrates the `praw` package which is already used by
RedditPostsLoader in LangChain. Nonetheless, we have added integration
tests and edited unit tests to test our changes. An example notebook is
also provided. These changes were put together by me, @Anika2000,
@CharlesXu123, and @Jeremy-Cheng-stack
Thank you in advance to the maintainers for their time.
---------
Co-authored-by: What-Is-A-Username <49571870+What-Is-A-Username@users.noreply.github.com>
Co-authored-by: Anika2000 <anika.sultana@mail.utoronto.ca>
Co-authored-by: Jeremy Cheng <81793294+Jeremy-Cheng-stack@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** Added some of the more endpoints supported by serpapi
that are not suported on langchain at the moment, like google trends,
google finance, google jobs, and google lens
- **Issue:** [Add support for many of the querying endpoints with
serpapi #11811](https://github.com/langchain-ai/langchain/issues/11811)
---------
Co-authored-by: zushenglu <58179949+zushenglu@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Ian Xu <ian.xu@mail.utoronto.ca>
Co-authored-by: zushenglu <zushenglu1809@gmail.com>
Co-authored-by: KevinT928 <96837880+KevinT928@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Volc Engine MaaS serves as an enterprise-grade,
large-model service platform designed for developers. You can visit its
homepage at https://www.volcengine.com/docs/82379/1099455 for details.
This change will facilitate developers to integrate quickly with the
platform.
- **Issue:** None
- **Dependencies:** volcengine
- **Tag maintainer:** @baskaryan
- **Twitter handle:** @he1v3tica
---------
Co-authored-by: lvzhong <lvzhong@bytedance.com>
- **Description:** use post field validation for `CohereRerank`
- **Issue:** #12899 and #13058
- **Dependencies:**
- **Tag maintainer:** @baskaryan
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Update 5 pdf document loaders in
`langchain.document_loaders.pdf`, to store a url in the metadata
(instead of a temporary, local file path) if the user provides a web
path to a pdf: `PyPDFium2Loader`, `PDFMinerLoader`,
`PDFMinerPDFasHTMLLoader`, `PyMuPDFLoader`, and `PDFPlumberLoader` were
updated.
- The updates follow the approach used to update `PyPDFLoader` for the
same behavior in #12092
- The `PyMuPDFLoader` changes required additional work in updating
`langchain.document_loaders.parsers.pdf.PyMuPDFParser` to be able to
process either an `io.BufferedReader` (from local pdf) or `io.BytesIO`
(from online pdf)
- The `PDFMinerPDFasHTMLLoader` change used a simpler approach since the
metadata is assigned by the loader and not the parser
- **Issue:** Fixes#7034
- **Dependencies:** None
```python
# PyPDFium2Loader example:
# old behavior
>>> from langchain.document_loaders import PyPDFium2Loader
>>> loader = PyPDFium2Loader('https://arxiv.org/pdf/1706.03762.pdf')
>>> docs = loader.load()
>>> docs[0].metadata
{'source': '/var/folders/7z/d5dt407n673drh1f5cm8spj40000gn/T/tmpm5oqa92f/tmp.pdf', 'page': 0}
# new behavior
>>> from langchain.document_loaders import PyPDFium2Loader
>>> loader = PyPDFium2Loader('https://arxiv.org/pdf/1706.03762.pdf')
>>> docs = loader.load()
>>> docs[0].metadata
{'source': 'https://arxiv.org/pdf/1706.03762.pdf', 'page': 0}
```
- **Description:** Updated to remove deprecated parameter penalty_alpha,
and use string variation of prompt rather than json object for better
flexibility. - **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** N/A
- **Tag maintainer:** @eyurtsev
- **Twitter handle:** @symbldotai
---------
Co-authored-by: toshishjawale <toshish@symbl.ai>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Instead of using JSON-like syntax to describe node and relationship
properties we changed to a shorter and more concise schema description
Old:
```
Node properties are the following:
[{'properties': [{'property': 'name', 'type': 'STRING'}], 'labels': 'Movie'}, {'properties': [{'property': 'name', 'type': 'STRING'}], 'labels': 'Actor'}]
Relationship properties are the following:
[]
The relationships are the following:
['(:Actor)-[:ACTED_IN]->(:Movie)']
```
New:
```
Node properties are the following:
Movie {name: STRING},Actor {name: STRING}
Relationship properties are the following:
The relationships are the following:
(:Actor)-[:ACTED_IN]->(:Movie)
```
Implements
[#12115](https://github.com/langchain-ai/langchain/issues/12115)
Who can review?
@baskaryan , @eyurtsev , @hwchase17
Integrated Stack Exchange API into Langchain, enabling access to diverse
communities within the platform. This addition enhances Langchain's
capabilities by allowing users to query Stack Exchange for specialized
information and engage in discussions. The integration provides seamless
interaction with Stack Exchange content, offering content from varied
knowledge repositories.
A notebook example and test cases were included to demonstrate the
functionality and reliability of this integration.
- Add StackExchange as a tool.
- Add unit test for the StackExchange wrapper and tool.
- Add documentation for the StackExchange wrapper and tool.
If you have time, could you please review the code and provide any
feedback as necessary! My team is welcome to any suggestions.
---------
Co-authored-by: Yuval Kamani <yuvalkamani@gmail.com>
Co-authored-by: Aryan Thakur <aryanthakur@Aryans-MacBook-Pro.local>
Co-authored-by: Manas1818 <79381912+manas1818@users.noreply.github.com>
Co-authored-by: aryan-thakur <61063777+aryan-thakur@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** The class allows to only select between a few
predefined prompts from the paper. That is not ideal, since other use
cases might need a custom prompt. The changes made allow for this. To be
able to monitor those, I also added functionality to supply a custom
run_manager.
- **Issue:** no issue, but a new feature,
- **Dependencies:** none,
- **Tag maintainer:** @hwchase17,
- **Twitter handle:** @yvesloy
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Support providing whatever extra parameters you want
to the Mathpix PDF loader API request.
- **Issue:** #12773
- **Dependencies:** None
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Adds a tqdm progress bar to GooglePalmEmbeddings when
embedding a list.
- **Issue:** #13637
- **Dependencies:** TQDM as a main dependency (instead of extra)
Signed-off-by: ugm2 <unaigaraymaestre@gmail.com>
---------
Signed-off-by: ugm2 <unaigaraymaestre@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
This PR is fixing an attributeError: object endpoint has no attribute
"_public_match_client" when using gcp matching engine with private VPC
network.
@baskaryan, @eyurtsev, @hwchase17.
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** As of OpenAI's Python package 1.0, the existing
DallEAPIWrapper does not work correctly, so the example in the LangChain
Documentation link below does not work either.
https://python.langchain.com/docs/integrations/tools/dalle_image_generator
Also, since OpenAI only supports DALL-E version 2 or version 3, I
modified the DallEAPIWrapper to support it.
- **Issue:** #13825
- **Twitter handle:** ggeutzzang
- **Description:** According to the document
https://cloud.baidu.com/doc/WENXINWORKSHOP/s/6lp69is2a, add ERNIE-Bot-8K
model support for ErnieBotChat.
- **Dependencies:** Before using the ERNIE-Bot-8K, you should have the
model's access authority.
Replace this entire comment with:
- **Description:** updates `create_llm_result` function within
`openai.py` to consider latest `params`,
- **Issue:** #8928
- **Dependencies:** -,
- **Tag maintainer:** -
- **Twitter handle:** [burkomr](https://twitter.com/burkomr)
<!-- If no one reviews your PR within a few days, please @-mention one
of @baskaryan, @eyurtsev, @hwchase17. -->
---------
Co-authored-by: Burak Ömür <burakomur@retorio.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Replace this entire comment with:
- **Description:** VertexAI models are now GA, moved away from using
preview ones from the SDK
- **Issue:** #13606
---------
Co-authored-by: Nuno Campos <nuno@boringbits.io>
**Description:**
Repair Wikipedia document loader `load_max_docs` and improve test
coverage.
**Issue:**
The Wikipedia document loader was not respecting the `load_max_docs`
paramater (not reported) and would always return a maximum of 10
documents. This is because the API wrapper (in `utilities/wikipedia.py`)
wasn't passing `top_k_results` to the underlying [Wikipedia
library](https://wikipedia.readthedocs.io/en/latest/code.html#module-wikipedia).
By default this library returns 10 results.
The default number of results for the document loader has been reduced
from 100 to 25. This is because loading 100 results takes a very long
time and is an inconvenient default. It should possibly be 10.
In addition, the documentation for the loader reported that there was a
hard limit (300) on the number of documents returned. In actuality 300
is the maximum Wikipedia query character length set by the API wrapper.
Tests have been added for the document loader (previously missing) and
to test the correct numbers of documents are being returned by each
class, both by default, and when overridden. Also repaired is the
`assert_docs` test which has been updated to correctly test for the
default metadata (which includes `source` in recent releases).
**Dependencies:**
nil
**Tag maintainer:**
@leo-gan
**Twitter handle:**
@queenvictoria
### **Description:**
Previously `python_repl` was a built-in tool, but now it has been moved
to `langchain_experimental`.
When I use `load_tools` I get an error:
```python
In [1]: from langchain.agents import load_tools
In [2]: load_tools(["python_repl"])
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
Cell In[2], line 1
----> 1 load_tools(["python_repl"])
File ~/workspace/langchain/libs/langchain/langchain/agents/load_tools.py:530, in load_tools(tool_names, llm, callbacks, **kwargs)
528 tool_names.extend(requests_method_tools)
529 elif name in _BASE_TOOLS:
--> 530 tools.append(_BASE_TOOLS[name]())
531 elif name in _LLM_TOOLS:
532 if llm is None:
File ~/workspace/langchain/libs/langchain/langchain/agents/load_tools.py:84, in _get_python_repl()
83 def _get_python_repl() -> BaseTool:
---> 84 raise ImportError(
85 "This tool has been moved to langchain experiment. "
86 "This tool has access to a python REPL. "
87 "For best practices make sure to sandbox this tool. "
88 "Read https://github.com/langchain-ai/langchain/blob/master/SECURITY.md "
89 "To keep using this code as is, install langchain experimental and "
90 "update relevant imports replacing 'langchain' with 'langchain_experimental'"
91 )
ImportError: This tool has been moved to langchain experiment. This tool has access to a python REPL. For best practices make sure to sandbox this tool. Read https://github.com/langchain-ai/langchain/blob/master/SECURITY.md To keep using this code as is, install langchain experimental and update relevant imports replacing 'langchain' with 'langchain_experimental'
```
In this case, it will be very confusing. I think it is no longer a
built-in tool now, so it can be removed from `_BASE_TOOLS`
### **Issue:**
https://github.com/langchain-ai/langchain/issues/13858,
https://github.com/langchain-ai/langchain/issues/13859,
https://github.com/langchain-ai/langchain/issues/13856
### **Twitter handle:**
[lin_bob57617](https://twitter.com/lin_bob57617)
The `integrations/vectorstores/matchingengine.ipynb` example has the
"Google Vertex AI Vector Search" title. This place this Title in the
wrong order in the ToC (it is sorted by the file name).
- Renamed `integrations/vectorstores/matchingengine.ipynb` into
`integrations/vectorstores/google_vertex_ai_vector_search.ipynb`.
- Updated a correspondent comment in docstring
- Rerouted old URL to a new URL
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
It was :
`from langchain.schema.prompts import BasePromptTemplate`
but because of the breaking change in the ns, it is now
`from langchain.schema.prompt_template import BasePromptTemplate`
This bug prevents building the API Reference for the langchain_experimental
There are the following main changes in this PR:
1. Rewrite of the DocugamiLoader to not do any XML parsing of the DGML
format internally, and instead use the `dgml-utils` library we are
separately working on. This is a very lightweight dependency.
2. Added MMR search type as an option to multi-vector retriever, similar
to other retrievers. MMR is especially useful when using Docugami for
RAG since we deal with large sets of documents within which a few might
be duplicates and straight similarity based search doesn't give great
results in many cases.
We are @docugami on twitter, and I am @tjaffri
---------
Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
- **Description:** Adds a retriever implementation for [Knowledge Bases
for Amazon Bedrock](https://aws.amazon.com/bedrock/knowledge-bases/), a
new service announced at AWS re:Invent, shortly before this PR was
opened. This depends on the `bedrock-agent-runtime` service, which will
be included in a future version of `boto3` and of `botocore`. We will
open a follow-up PR documenting the minimum required versions of `boto3`
and `botocore` after that information is available.
- **Issue:** N/A
- **Dependencies:** `boto3>=1.33.2, botocore>=1.33.2`
- **Tag maintainer:** @baskaryan
- **Twitter handles:** `@pjain7` `@dead_letter_q`
This PR includes a documentation notebook under
`docs/docs/integrations/retrievers`, which I (@dlqqq) have verified
independently.
EDIT: `bedrock-agent-runtime` service is now included in
`boto3>=1.33.2`:
5cf793f493
---------
Co-authored-by: Piyush Jain <piyushjain@duck.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Addressing incorrect order being sent to callbacks / tracers, due to the
nature of threading
---------
Co-authored-by: Nuno Campos <nuno@boringbits.io>
Add arg to omit streamed_output list, in cases where final_output is
enough this saves bandwidth
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
This PR rearranges the docstring for the `AstraDB` vector store class so
as to have all useful information in the _class_ docstring for ease of
reading.
(incidentally, due to an oversight, the docstring that was in the
constructor ended up buried below some lines of code, thereby
disappearing altogether from accessibility. Apologies.)
- **Description:** Updates to `AnthropicFunctions` to be compatible with
the OpenAI `function_call` functionality.
- **Issue:** The functionality to indicate `auto`, `none` and a forced
function_call was not completely implemented in the existing code.
- **Dependencies:** None
- **Tag maintainer:** @baskaryan , and any of the other maintainers if
needed.
- **Twitter handle:** None
I have specifically tested this functionality via AWS Bedrock with the
Claude-2 and Claude-Instant models.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Description:** We are adding functionality to extract message
content from the `attributedBody` field of the database, in case the
content is not in the `text` field.
- **Issue:** Closes#13326 and #10680
- **Dependencies:** None.
- **Tag maintainer:** @eyurtsev, @hwchase17
---------
Co-authored-by: onotate <johnp.pham@mail.utoronto.ca>
- **Description:** Previously `MarkdownHeaderTextSplitter` did not
consider tilde-fenced code blocks
(https://spec.commonmark.org/0.30/#fenced-code-blocks). This PR fixes
that.
````md
# Bug caused by previous implementation:
~~~py
foo()
# This is a comment that would be considered header
bar()
~~~
````
- **Tag maintainer:** @baskaryan
Several bug fixes:
- emails: instead of `bcc` the `cc` is used.
- errors in the truncation descriptions
- no truncation of the `message_search`
Several updates:
- generalized UTC format
- truncation limit can be changed now in _call()
Fixes#13407.
This workaround consists in letting the RunnableLambda create its
self.afunc from its self.func when self.afunc is not provided; the
change has no dependency.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Nuno Campos <nuno@langchain.dev>
```
---- chunk 1
{'actions': [AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})])],
'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]}
---- chunk 2
{'messages': [FunctionMessage(content="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”", name='Search')],
'steps': [AgentStep(action=AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]), observation="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”")]}
---- chunk 3
{'actions': [AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})])],
'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})]}
---- chunk 4
{'messages': [FunctionMessage(content='25 years', name='Search')],
'steps': [AgentStep(action=AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})]), observation='25 years')]}
---- chunk 5
{'actions': [AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})])],
'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})]}
---- chunk 6
{'messages': [FunctionMessage(content='Answer: 3.991298452658078', name='Calculator')],
'steps': [AgentStep(action=AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})]), observation='Answer: 3.991298452658078')]}
---- chunk 7
{'messages': [AIMessage(content="Leonardo DiCaprio's current girlfriend is the Italian model Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 power is approximately 3.99.")],
'output': "Leonardo DiCaprio's current girlfriend is the Italian model "
'Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 '
'power is approximately 3.99.'}
---- final
{'actions': [AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]),
AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})]),
AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})])],
'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}}),
FunctionMessage(content="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”", name='Search'),
AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}}),
FunctionMessage(content='25 years', name='Search'),
AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}}),
FunctionMessage(content='Answer: 3.991298452658078', name='Calculator'),
AIMessage(content="Leonardo DiCaprio's current girlfriend is the Italian model Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 power is approximately 3.99.")],
'output': "Leonardo DiCaprio's current girlfriend is the Italian model "
'Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 '
'power is approximately 3.99.',
'steps': [AgentStep(action=AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]), observation="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”"),
AgentStep(action=AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n "__arg1": "Vittoria Ceretti age"\n}'}})]), observation='25 years'),
AgentStep(action=AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n "__arg1": "25^0.43"\n}'}})]), observation='Answer: 3.991298452658078')]}
```
- **Description:** Existing model used for Prompt Injection is quite
outdated but we fine-tuned and open-source a new model based on the same
model deberta-v3-base from Microsoft -
[laiyer/deberta-v3-base-prompt-injection](https://huggingface.co/laiyer/deberta-v3-base-prompt-injection).
It supports more up-to-date injections and less prone to
false-positives.
- **Dependencies:** No
- **Tag maintainer:** -
- **Twitter handle:** @alex_yaremchuk
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** The experimental package needs to be compatible with
the usage of importing agents
For example, if i use `from langchain.agents import
create_pandas_dataframe_agent`, running the program will prompt the
following information:
```
Traceback (most recent call last):
File "/Users/dongwm/test/main.py", line 1, in <module>
from langchain.agents import create_pandas_dataframe_agent
File "/Users/dongwm/test/venv/lib/python3.11/site-packages/langchain/agents/__init__.py", line 87, in __getattr__
raise ImportError(
ImportError: create_pandas_dataframe_agent has been moved to langchain experimental. See https://github.com/langchain-ai/langchain/discussions/11680 for more information.
Please update your import statement from: `langchain.agents.create_pandas_dataframe_agent` to `langchain_experimental.agents.create_pandas_dataframe_agent`.
```
But when I changed to `from langchain_experimental.agents import
create_pandas_dataframe_agent`, it was actually wrong:
```python
Traceback (most recent call last):
File "/Users/dongwm/test/main.py", line 2, in <module>
from langchain_experimental.agents import create_pandas_dataframe_agent
ImportError: cannot import name 'create_pandas_dataframe_agent' from 'langchain_experimental.agents' (/Users/dongwm/test/venv/lib/python3.11/site-packages/langchain_experimental/agents/__init__.py)
```
I should use `from langchain_experimental.agents.agent_toolkits import
create_pandas_dataframe_agent`. In order to solve the problem and make
it compatible, I added additional import code to the
langchain_experimental package. Now it can be like this Used `from
langchain_experimental.agents import create_pandas_dataframe_agent`
- **Twitter handle:** [lin_bob57617](https://twitter.com/lin_bob57617)
- **Description:** Adds a tqdm progress bar to OllamaEmbeddings when
embedding a list.
- **Issue:** Related to #13637, but extended to Ollama.
- **Dependencies:** `tqdm` made a necessary dependency.
Thanks to @ugm2 for helping identify a common problem. Embeddings take a
very long time to finish on local machines, and require a progress bar
to help identify if one should even attempt the workload.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** Added a line to pass the tenant parameter to
add_data_object
- **Issue:** An extra line added from the fix for #9956
- **Dependencies:** n/a
- **Tag maintainer:** @baskaryan
Tested locally, works as expected with the line change.
---------
Co-authored-by: Simon Dai <simon6752@gmail.com>
Description: Some Elastic indexes do not return a 'metadata' field in
'_source'. However, prior to this PR, the code assumed there always is a
'metadata' field. This PR adds support for cases where the field is
missing by adding it manually.
Issue: #13869
**Description:**
This PR adds Databricks Vector Search as a new vector store in
LangChain.
- [x] Add `DatabricksVectorSearch` in `langchain/vectorstores/`
- [x] Unit tests
- [x] Add
[`databricks-vectorsearch`](https://pypi.org/project/databricks-vectorsearch/)
as a new optional dependency
We ran the following checks:
- `make format` passed ✅
- `make lint` failed but the failures were caused by other files
+ Files touched by this PR passed the linter ✅
- `make test` passed ✅
- `make coverage` failed but the failures were caused by other files.
Tests added by or related to this PR all passed
+ langchain/vectorstores/databricks_vector_search.py test coverage 94% ✅
- `make spell_check` passed ✅
The example notebook and updates to the [provider's documentation
page](https://github.com/langchain-ai/langchain/blob/master/docs/docs/integrations/providers/databricks.md)
will be added later in a separate PR.
**Dependencies:**
Optional dependency:
[`databricks-vectorsearch`](https://pypi.org/project/databricks-vectorsearch/)
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
This pull request addresses an issue found in the example code within
the docstring of `libs/core/langchain_core/runnables/passthrough.py`
The original code snippet caused a `NameError` due to the missing import
of `RunnableLambda`. The error was as follows:
```
12 return "completion"
13
---> 14 chain = RunnableLambda(fake_llm) | {
15 'original': RunnablePassthrough(), # Original LLM output
16 'parsed': lambda text: text[::-1] # Parsing logic
NameError: name 'RunnableLambda' is not defined
```
To resolve this, I have modified the example code to include the
necessary import statement for `RunnableLambda`. Additionally, I have
adjusted the indentation in the code snippet to ensure consistency and
readability.
The modified code now successfully defines and utilizes
`RunnableLambda`, ensuring that users referencing the docstring will
have a functional and clear example to follow.
There are no related GitHub issues for this particular change.
Modified Code:
```python
from langchain_core.runnables import RunnablePassthrough, RunnableParallel
from langchain_core.runnables import RunnableLambda
runnable = RunnableParallel(
origin=RunnablePassthrough(),
modified=lambda x: x+1
)
runnable.invoke(1) # {'origin': 1, 'modified': 2}
def fake_llm(prompt: str) -> str: # Fake LLM for the example
return "completion"
chain = RunnableLambda(fake_llm) | {
'original': RunnablePassthrough(), # Original LLM output
'parsed': lambda text: text[::-1] # Parsing logic
}
chain.invoke('hello') # {'original': 'completion', 'parsed': 'noitelpmoc'}
```
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Added a retriever for the Outline API to ask
questions on knowledge base
- **Issue:** resolves#11814
- **Dependencies:** None
- **Tag maintainer:** @baskaryan
- **Description:** Simple change, I just added title metadata to
GoogleDriveLoader for optional File Loaders
- **Dependencies:** no dependencies
- **Tag maintainer:** @hwchase17
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
This PR provides idiomatic implementations for the exact-match and the
semantic LLM caches using Astra DB as backend through the database's
HTTP JSON API. These caches require the `astrapy` library as dependency.
Comes with integration tests and example usage in the `llm_cache.ipynb`
in the docs.
@baskaryan this is the Astra DB counterpart for the Cassandra classes
you merged some time ago, tagging you for your familiarity with the
topic. Thank you!
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
This PR adds a chat message history component that uses Astra DB for
persistence through the JSON API.
The `astrapy` package is required for this class to work.
I have added tests and a small notebook, and updated the relevant
references in the other docs pages.
(@rlancemartin this is the counterpart of the Cassandra equivalent class
you so helpfully reviewed back at the end of June)
Thank you!
- **Description:** This commit fixed the problem that Redis vector store
will change the value of a metadata from 0 to empty when saving the
document, which should be an un-intended behavior.
- **Issue:** N/A
- **Dependencies:** N/A
**Description:** Currently, if we pass in a ToolMessage back to the
chain, it crashes with error
`Got unsupported message type: `
This fixes it.
Tested locally
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:** BaseStringMessagePromptTemplate.from_template was
passing the value of partial_variables into cls(...) via **kwargs,
rather than passing it to PromptTemplate.from_template. Which resulted
in those *partial_variables being* lost and becoming required
*input_variables*.
Co-authored-by: Josep Pon Farreny <josep.pon-farreny@siemens.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Fix some circular deps:
- move PromptValue into top level module bc both PromptTemplates and
OutputParsers import
- move tracer context vars to `tracers.context` and import them in
functions in `callbacks.manager`
- add core import tests
- **Description:** We need to update the Dockerfile for templates to
also copy your README.md. This is because poetry requires that a readme
exists if it is specified in the pyproject.toml
Changes:
- remove langchain_core/schema since no clear distinction b/n schema and
non-schema modules
- make every module that doesn't end in -y plural
- where easy have 1-2 classes per file
- no more than one level of nesting in directories
- only import from top level core modules in langchain
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Description:** fix a bug that prevented as_retriever() in Vectara to
use the desired input arguments
- **Issue:** as_retriever did not pass the arguments properly
- **Tag maintainer:** @baskaryan
- **Twitter handle:** @ofermend
I encountered this during summarization with VertexAI. I was receiving
an INVALID_ARGUMENT error, as it was trying to send a list of about
17000 single characters.
The [count_tokens
method](https://github.com/googleapis/python-aiplatform/blob/main/vertexai/language_models/_language_models.py#L658)
made available by Google takes in a list of prompts. It does not fail
for small texts, but it does for longer documents because the argument
list will be exceeding Googles allowed limit. Enforcing the list type
makes it work successfully.
This change will cast the input text to count to a list of that single
text so that the input format is always correct.
[Twitter](https://www.x.com/stijn_tratsaert)
- **Description:** ERNIE-Bot-Chat-4 Large Language Model adds the
ability of `Function Calling` by passing parameters through the
`functions` parameter in the request. To simplify function calling for
ERNIE-Bot-Chat-4, the `create_ernie_fn_chain()` function has been added.
The definition and usage of the `create_ernie_fn_chain()` function is
similar to that of the `create_openai_fn_chain()` function.
Examples as the follows:
```
import json
from langchain.chains.ernie_functions import (
create_ernie_fn_chain,
)
from langchain.chat_models import ErnieBotChat
from langchain.prompts import ChatPromptTemplate
def get_current_news(location: str) -> str:
"""Get the current news based on the location.'
Args:
location (str): The location to query.
Returs:
str: Current news based on the location.
"""
news_info = {
"location": location,
"news": [
"I have a Book.",
"It's a nice day, today."
]
}
return json.dumps(news_info)
def get_current_weather(location: str, unit: str="celsius") -> str:
"""Get the current weather in a given location
Args:
location (str): location of the weather.
unit (str): unit of the tempuature.
Returns:
str: weather in the given location.
"""
weather_info = {
"location": location,
"temperature": "27",
"unit": unit,
"forecast": ["sunny", "windy"],
}
return json.dumps(weather_info)
llm = ErnieBotChat(model_name="ERNIE-Bot-4")
prompt = ChatPromptTemplate.from_messages(
[
("human", "{query}"),
]
)
chain = create_ernie_fn_chain([get_current_weather, get_current_news], llm, prompt, verbose=True)
res = chain.run("北京今天的新闻是什么?")
print(res)
```
The running results of the above program are shown below:
```
> Entering new LLMChain chain...
Prompt after formatting:
Human: 北京今天的新闻是什么?
> Finished chain.
{'name': 'get_current_news', 'thoughts': '用户想要知道北京今天的新闻。我可以使用get_current_news工具来获取这些信息。', 'arguments': {'location': '北京'}}
```
- **Description:** during search with DeepLake some people are facing
backwards compatibility issues, this PR fixes it by making search
accessible for the older datasets
---------
Co-authored-by: adolkhan <adilkhan.sarsen@alumni.nu.edu.kz>
- **Description:**
- Fixes a `key_prefix` bug where passing it in on
`Redis.from_existing(...)` did not work properly. Updates doc strings
accordingly.
- Updates Redis filter classes logic with best practices on typing,
string formatting, and handling "empty" filters.
- Fixes a bug that would prevent multiple tag filters from being applied
together in some scenarios.
- Added a whole new filter unit testing module. Also updated code
formatting for a number of modules that were failing the `make`
commands.
- **Issue:** N/A
- **Dependencies:** N/A
- **Tag maintainer:** @baskaryan
- **Twitter handle:** @tchutch94
In the `FORMAT_INSTRUCTIONS` template, 4 curly braces (escaping) are
used to get single curly brace after formatting:
```
"{{{ ... }}}}" -> format_instructions.format() -> "{{ ... }}" -> template.format() -> "{ ... }".
```
Tool's `args_schema` string contains single braces `{ ... }`, and is
also transformed to `{{{{ ... }}}}` form. But this is not really correct
since there is only one `format()` call:
```
"{{{{ ... }}}}" -> template.format() -> "{{ ... }}".
```
As a result we get double curly braces in the prompt:
````
Respond to the human as helpfully and accurately as possible. You have access to the following tools:
foo: Test tool FOO, args: {{'tool_input': {{'type': 'string'}}}} # <--- !!!
...
Provide only ONE action per $JSON_BLOB, as shown:
```
{
"action": $TOOL_NAME,
"action_input": $INPUT
}
```
````
This PR fixes curly braces escaping in the `args_schema` to have single
braces in the final prompt:
````
Respond to the human as helpfully and accurately as possible. You have access to the following tools:
foo: Test tool FOO, args: {'tool_input': {'type': 'string'}} # <--- !!!
...
Provide only ONE action per $JSON_BLOB, as shown:
```
{
"action": $TOOL_NAME,
"action_input": $INPUT
}
```
````
---------
Co-authored-by: Sergey Kozlov <sergey.kozlov@ludditelabs.io>
Hi 👋 We are working with Llama2 on Bedrock, and would like to add it to
Langchain. We saw a [pull
request](https://github.com/langchain-ai/langchain/pull/13322) to add it
to the `llm.Bedrock` class, but since it concerns a chat model, we would
like to add it to `BedrockChat` as well.
- **Description:** Add support for Llama2 to `BedrockChat` in
`chat_models`
- **Issue:** the issue # it fixes (if applicable)
[#13316](https://github.com/langchain-ai/langchain/issues/13316)
- **Dependencies:** any dependencies required for this change `None`
- **Tag maintainer:** /
- **Twitter handle:** `@SimonBockaert @WouterDurnez`
---------
Co-authored-by: wouter.durnez <wouter.durnez@showpad.com>
Co-authored-by: Simon Bockaert <simon.bockaert@showpad.com>
- **Description:** This change adds an agent to the Azure Cognitive
Services toolkit for identifying healthcare entities
- **Dependencies:** azure-ai-textanalytics (Optional)
---------
Co-authored-by: James Beck <James.Beck@sa.gov.au>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Hi!
This short PR aims at:
* Fixing `OpenAIEmbeddings`' check on `chunk_size` when used with Azure
OpenAI (thus with openai < 1.0). Azure OpenAI embeddings support at most
16 chunks per batch, I believe we are supposed to take the min between
the passed value/default value and 16, not the max - which, I suppose,
was introduced by accident while refactoring the previous version of
this check from this other PR of mine: #10707
* Porting this fix to the newest class (`AzureOpenAIEmbeddings`) for
openai >= 1.0
This fixes#13539 (closed but the issue persists).
@baskaryan @hwchase17
- **Description:** There are several mistakes in the sample code in the
doc-string of `DashVector` class, and this pull request aims to correct
them.
The correction code has been tested against latest version (at the time
of creation of this pull request) of: `langchain==0.0.336`
`dashvector==1.0.6` .
- **Issue:** No issue is created for this.
- **Dependencies:** No dependency is required for this change,
<!-- - **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below), -->
- **Twitter handle:** `zeyanglin`
<!-- Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Description:** AstraDB is going to deprecate the `$similarity`
projection property in favor of the ´includeSimilarity´ option flag. I
moved all the queries to the new format.
- **Tag maintainer:** @hemidactylus
- **Twitter handle:** nicoloboschi
Added a `search_kwargs` field to BingSearchAPIWrapper in
`bing_search.py,` enabling users to include extra keyword arguments in
Bing search queries. This update, like specifying language preferences,
adds more customization to searches. The `search_kwargs` seamlessly
merge with standard parameters in `_bing_search_results` method.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** Fix Astra integration tests that are failing. The
`delete` always return True as the deletion is successful if no errors
are thrown. I aligned the test to verify this behaviour
- **Tag maintainer:** @hemidactylus
- **Twitter handle:** nicoloboschi
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
The issue was accuring because of `openai` update in Completions. its
not accepting `api_key` and 'api_base' args.
The fix is we check for the openai version and if ats v1 then remove
these keys from args before passing them to `Compilation.create(...)`
when sending from `VLLMOpenAI`
Fixed: #13507
@eyu
@efriis
@hwchase17
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** In this pull request, we address an issue related to
assigning a schema to the SQLDatabase class when utilizing an Oracle
database. The current implementation encounters a bug where, upon
attempting to execute a query, the alter session parse is not
appropriately defined for Oracle, leading to an error,
- **Issue:** #7928,
- **Dependencies:** No dependencies,
- **Tag maintainer:** @baskaryan,
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** This change allows for the `MWDumpLoader` to load all
namespaces including custom by default instead of only loading the
[default
namespaces](https://www.mediawiki.org/wiki/Help:Namespaces#Localisation).
- **Tag maintainer:** @hwchase17
**Description:**
This commit adds embedchain retriever along with tests and docs.
Embedchain is a RAG framework to create data pipelines.
**Twitter handle:**
- [Taranjeet's twitter](https://twitter.com/taranjeetio) and
[Embedchain's twitter](https://twitter.com/embedchain)
**Reviewer**
@hwchase17
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:**
Enhance the functionality of YoutubeLoader to enable the translation of
available transcripts by refining the existing logic.
**Issue:**
Encountering a problem with YoutubeLoader (#13523) where the translation
feature is not functioning as expected.
Tag maintainers/contributors who might be interested:
@eyurtsev
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
BUG: langchain.agents.openai_assistant has a reference as
`from langchain_experimental.openai_assistant.base import
OpenAIAssistantRunnable`
should be
`from langchain.agents.openai_assistant.base import
OpenAIAssistantRunnable`
This prevents building of the API Reference docs
## Update 2023-09-08
This PR now supports further models in addition to Lllama-2 chat models.
See [this comment](#issuecomment-1668988543) for further details. The
title of this PR has been updated accordingly.
## Original PR description
This PR adds a generic `Llama2Chat` model, a wrapper for LLMs able to
serve Llama-2 chat models (like `LlamaCPP`,
`HuggingFaceTextGenInference`, ...). It implements `BaseChatModel`,
converts a list of chat messages into the [required Llama-2 chat prompt
format](https://huggingface.co/blog/llama2#how-to-prompt-llama-2) and
forwards the formatted prompt as `str` to the wrapped `LLM`. Usage
example:
```python
# uses a locally hosted Llama2 chat model
llm = HuggingFaceTextGenInference(
inference_server_url="http://127.0.0.1:8080/",
max_new_tokens=512,
top_k=50,
temperature=0.1,
repetition_penalty=1.03,
)
# Wrap llm to support Llama2 chat prompt format.
# Resulting model is a chat model
model = Llama2Chat(llm=llm)
messages = [
SystemMessage(content="You are a helpful assistant."),
MessagesPlaceholder(variable_name="chat_history"),
HumanMessagePromptTemplate.from_template("{text}"),
]
prompt = ChatPromptTemplate.from_messages(messages)
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
chain = LLMChain(llm=model, prompt=prompt, memory=memory)
# use chat model in a conversation
# ...
```
Also part of this PR are tests and a demo notebook.
- Tag maintainer: @hwchase17
- Twitter handle: `@mrt1nz`
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** Added a method `fetch_valid_documents` to
`WebResearchRetriever` class that will test the connection for every url
in `new_urls` and remove those that raise a `ConnectionError`.
- **Issue:** [Previous
PR](https://github.com/langchain-ai/langchain/pull/13353),
- **Dependencies:** None,
- **Tag maintainer:** @efriis
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
## Description
This PR adds an option to allow unsigned requests to the Neptune
database when using the `NeptuneGraph` class.
```python
graph = NeptuneGraph(
host='<my-cluster>',
port=8182,
sign=False
)
```
Also, added is an option in the `NeptuneOpenCypherQAChain` to provide
additional domain instructions to the graph query generation prompt.
This will be injected in the prompt as-is, so you should include any
provider specific tags, for example `<instructions>` or `<INSTR>`.
```python
chain = NeptuneOpenCypherQAChain.from_llm(
llm=llm,
graph=graph,
extra_instructions="""
Follow these instructions to build the query:
1. Countries contain airports, not the other way around
2. Use the airport code for identifying airports
"""
)
```
**Description**
MongoDB drivers are used in various flavors and languages. Making sure
we exercise our due diligence in identifying the "origin" of the library
calls makes it best to understand how our Atlas servers get accessed.
**Description/Issue:**
When OpenAI calls a function with no args, the args are `""` rather than
`"{}"`. Then `json.loads("")` blows up. This PR handles it correctly.
**Dependencies:** None
This PR brings a few minor improvements to the docs, namely class/method
docstrings and the demo notebook.
- A note on how to control concurrency levels to tune performance in
bulk inserts, both in the class docstring and the demo notebook;
- Slightly increased concurrency defaults after careful experimentation
(still on the conservative side even for clients running on
less-than-typical network/hardware specs)
- renamed the DB token variable to the standardized
`ASTRA_DB_APPLICATION_TOKEN` name (used elsewhere, e.g. in the Astra DB
docs)
- added a note and a reference (add_text docstring, demo notebook) on
allowed metadata field names.
Thank you!
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
fix#13356
Add supports following properties for metadata to NotionDBLoader.
- `checkbox`
- `email`
- `number`
- `select`
There are no relevant tests for this code to be updated.
- **Description:** Adds `limit_to_domains` param to the APIChain based
tools (open_meteo, TMDB, podcast_docs, and news_api)
- **Issue:** I didn't open an issue, but after upgrading to 0.0.328
using these tools would throw an error.
- **Dependencies:** N/A
- **Tag maintainer:** @baskaryan
**Note**: I included the trailing / simply because the docs here did
fc886cc303/docs/docs/use_cases/apis.ipynb (L246)
, but I checked the code and it is using `urlparse`. SoI followed the
docs since it comes down to stylee.