Commit Graph

2297 Commits (master)

Author SHA1 Message Date
Bagatur 9e7d9f9390
infra: bump langchain min test reqs (#16882) 5 months ago
Bagatur db442c635b
langchain[patch]: Release 0.1.5 (#16881) 5 months ago
Christophe Bornet a0ec045495
Add async methods to BaseStore (#16669)
- **Description:**

The BaseStore methods are currently blocking. Some implementations
(AstraDBStore, RedisStore) would benefit from having async methods.
Also once we have async methods for BaseStore, we can implement the
async `aembed_documents` in CacheBackedEmbeddings to cache the
embeddings asynchronously.

* adds async methods amget, amset, amedelete and ayield_keys to
BaseStore
  * implements the async methods for InMemoryStore
  * adds tests for InMemoryStore async methods

- **Twitter handle:** cbornet_
5 months ago
Christophe Bornet af8c5c185b
langchain[minor],community[minor]: Add async methods in BaseLoader (#16634)
Adds:
* methods `aload()` and `alazy_load()` to interface `BaseLoader`
* implementation for class `MergedDataLoader `
* support for class `BaseLoader` in async function `aindex()` with unit
tests

Note: this is compatible with existing `aload()` methods that some
loaders already had.

**Twitter handle:** @cbornet_

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
5 months ago
Bagatur b0347f3e2b
docs: add csv use case (#16756) 5 months ago
Neli Hateva c95facc293
langchain[minor], community[minor]: Implement Ontotext GraphDB QA Chain (#16019)
- **Description:** Implement Ontotext GraphDB QA Chain
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** @OntotextGraphDB
5 months ago
chyroc a08f9a7ff9
langchain[patch]: support OpenAIAssistantRunnable async (#15302)
fix https://github.com/langchain-ai/langchain/issues/15299

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Antonio Lanza 08d3fd7f2e
langchain[patch]: inconsistent results with `RecursiveCharacterTextSplitter`'s `add_start_index=True` (#16583)
This PR fixes issue #16579
5 months ago
Bagatur 5df8ab574e
infra: move indexing documentation test (#16595) 5 months ago
Bagatur f3d61a6e47
langchain[patch]: Release 0.1.4 (#16592) 5 months ago
Bagatur ef42d9d559
core[patch], community[patch], openai[patch]: consolidate openai tool… (#16485)
… converters

One way to convert anything to an OAI function:
convert_to_openai_function
One way to convert anything to an OAI tool: convert_to_openai_tool
Corresponding bind functions on OAI models: bind_functions, bind_tools
5 months ago
Anders Åhsman 355ef2a4a6
langchain[patch]: Fix doc-string grammar (#16543)
- **Description:** Small grammar fix in docstring for class
`BaseCombineDocumentsChain`.
5 months ago
Bagatur c173a69908
langchain[patch]: oai tools output parser nit (#16540)
allow positional init args
5 months ago
Martin Kolb 04651f0248
community[minor]: VectorStore integration for SAP HANA Cloud Vector Engine (#16514)
- **Description:**
This PR adds a VectorStore integration for SAP HANA Cloud Vector Engine,
which is an upcoming feature in the SAP HANA Cloud database
(https://blogs.sap.com/2023/11/02/sap-hana-clouds-vector-engine-announcement/).

  - **Issue:** N/A
- **Dependencies:** [SAP HANA Python
Client](https://pypi.org/project/hdbcli/)
  - **Twitter handle:** @sapopensource

Implementation of the integration:
`libs/community/langchain_community/vectorstores/hanavector.py`

Unit tests:
`libs/community/tests/unit_tests/vectorstores/test_hanavector.py`

Integration tests:
`libs/community/tests/integration_tests/vectorstores/test_hanavector.py`

Example notebook:
`docs/docs/integrations/vectorstores/hanavector.ipynb`

Access credentials for execution of the integration tests can be
provided to the maintainers.

---------

Co-authored-by: sascha <sascha.stoll@sap.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Ali Zendegani 80fcc50c65
langchain[patch]: Minor Fix: Enable Passing custom_headers for Authentication in GraphQL Agent/Tool (#16413)
- **Description:** 

This PR aims to enhance the `langchain` library by enabling the support
for passing `custom_headers` in the `GraphQLAPIWrapper` usage within
`langchain/agents/load_tools.py`.

While the `GraphQLAPIWrapper` from the `langchain_community` module is
inherently capable of handling `custom_headers`, its current invocation
in `load_tools.py` does not facilitate this functionality.
This limitation restricts the use of the `graphql` tool with databases
or APIs that require token-based authentication.

The absence of support for `custom_headers` in this context also leads
to a lack of error messages when attempting to interact with secured
GraphQL endpoints, making debugging and troubleshooting more
challenging.

This update modifies the `load_tools` function to correctly handle
`custom_headers`, thereby allowing secure and authenticated access to
GraphQL services requiring tokens.

Example usage after the proposed change:
```python
tools = load_tools(
    ["graphql"],
    graphql_endpoint="https://your-graphql-endpoint.com/graphql",
    custom_headers={"Authorization": f"Token {api_token}"},
)
```
  - **Issue:** None,
  - **Dependencies:** None,
  - **Twitter handle:** None
5 months ago
Krista Pratico 0e2e7d8b83
langchain[patch]: allow passing client with OpenAIAssistantRunnable (#16486)
- **Description:** This addresses the issue tagged below where if you
try to pass your own client when creating an OpenAI assistant, a
pydantic error is raised:

Example code:

```python
import openai
from langchain.agents.openai_assistant import OpenAIAssistantRunnable

client = openai.OpenAI()
interpreter_assistant = OpenAIAssistantRunnable.create_assistant(
    name="langchain assistant",
    instructions="You are a personal math tutor. Write and run code to answer math questions.",
    tools=[{"type": "code_interpreter"}],
    model="gpt-4-1106-preview",
    client=client
)

```

Error:
`pydantic.v1.errors.ConfigError: field "client" not yet prepared, so the
type is still a ForwardRef. You might need to call
OpenAIAssistantRunnable.update_forward_refs()`

It additionally updates type hints and docstrings to indicate that an
AzureOpenAI client is permissible as well.

  - **Issue:** https://github.com/langchain-ai/langchain/issues/15948
  - **Dependencies:** N/A
5 months ago
Gianfranco Demarco c69f599594
langchain[patch]: Extract _aperform_agent_action from _aiter_next_step from AgentExecutor (#15707)
- **Description:** extreact the _aperform_agent_action in the
AgentExecutor class to allow for easier overriding. Extracted logic from
_iter_next_step into a new method _perform_agent_action for consistency
and easier overriding.
- **Issue:** #15706

Closes #15706
5 months ago
i-w-a 95ee69a301
langchain[patch]: In HTMLHeaderTextSplitter set default encoding to utf-8 (#16372)
- **Description:** The HTMLHeaderTextSplitter Class now explicitly
specifies utf-8 encoding in the part of the split_text_from_file method
that calls the HTMLParser.
- **Issue:** Prevent garbled characters due to differences in encoding
of html files (except for English in particular, I noticed that problem
with Japanese).
  - **Dependencies:** No dependencies,
  - **Twitter handle:**  @i_w__a
5 months ago
Bagatur ba326b98d0
langchain[patch]: Release 0.1.3 (#16475) 5 months ago
Erick Friis cfe95ab085
multiple: update langsmith dep (#16407) 5 months ago
Bagatur af9f1738ca
langchain[patch]: Release 0.1.2 (#16388) 5 months ago
Bagatur 1dc6c1ce06
core[patch], community[patch], langchain[patch], docs: Update SQL chains/agents/docs (#16168)
Revamp SQL use cases docs. In the process update SQL chains and agents.
5 months ago
Bob Lin acc14802d1
Fix `conn` field definition in SQLiteEntityStore (#15440) 5 months ago
Ryan French 3d23a5eb36
langchain[patch]: Allow OpenSearch Query Translator to correctly work with Date types (#16022)
**Description:**

Fixes an issue where the Date type in an OpenSearch Self Querying
Retriever would fail to generate a valid query

**Issue:**
https://github.com/langchain-ai/langchain/issues/14225
5 months ago
Hamza Kyamanywa 39b3c6d94c
langchain[patch]: Add konlpy based text splitting for Korean (#16003)
- **Description:** Adds a text splitter based on
[Konlpy](https://konlpy.org/en/latest/#start) which is a Python package
for natural language processing (NLP) of the Korean language. (It is
like Spacy or NLTK for Korean)
- **Dependencies:** Konlpy would have to be installed before this
splitter is used,
  - **Twitter handle:** @untilhamza
5 months ago
Bagatur 2454fefc53
docs: agent prompt docs (#16105) 5 months ago
SN f175bf7d7b
Use env for revision id if not passed in as param; use `git describe` as backup (#16227)
Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>
5 months ago
Erick Friis b9495da92d
langchain[patch]: fix stuff documents chain api docs render (#16159) 5 months ago
SN 7d444724d7
Add revision identifier to run_on_dataset (#16167)
Allow specifying revision identifier for better project versioning
5 months ago
Eugene Yurtsev 5d8c147332
docs: Document and test PydanticOutputFunctionsParser (#15759)
This PR adds documentation and testing to
`PydanticOutputFunctionsParser(OutputFunctionsParser)`.
5 months ago
Christophe Bornet 3502a407d9
infra: Use dotenv in langchain-community's integration tests (#16137)
* Removed some env vars not used in langchain package IT
* Added Astra DB env vars in langchain package, used for cache tests
* Added conftest.py to load env vars in langchain_community IT
* Added .env.example in  langchain_community IT
5 months ago
Leonid Ganeline 2709d3e5f2
langchain[patch]: updated imports for `langchain.callbacks` (#16060)
Updated imports from 'langchain` to `core` where it is possible

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Leonid Ganeline c5f6b828ad
langchain[patch], community[minor]: move `output_parsers.ernie_functions` (#16057)
`output_parsers.ernie_functions` moved into `community`
5 months ago
Leonid Ganeline 49aff3ea5b
langchain[patch]: updated `agents` imports (#16061)
Updated imports into `langchain` to `core` where it is possible

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Leonid Ganeline 60b1bd02d7
langchain[patch]: updated imports for `output_parsers` (#16059)
Updated imports from `langchain` to `core` where it is possible
5 months ago
Leonid Ganeline 9e9ad9b0e9
langchain[patch]: updated `retrievers` imports (#16062)
Updated imports into `langchain` to `core` where it is possible

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Leonid Ganeline d350be959d
langchain[patch]: updated `chains` imports (#16064)
Updated imports into `langchain` to `core` where it is possible

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
ChengZi 8597484195
langchain[patch]: support more comparators in Milvus self-querying retriever (#16076)
- **Description:** Support IN and LIKE comparators in Milvus
self-querying retriever, based on [Boolean Expression
Rules](https://milvus.io/docs/boolean.md)
  - **Issue:** No
  - **Dependencies:** No
  - **Twitter handle:** No

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
5 months ago
William FH e5cf1e2414
Community[patch]use secret str in Tavily and HuggingFaceInferenceEmbeddings (#16109)
So the api keys don't show up in repr's 

Still need to do tests
5 months ago
Bagatur 8840a8cc95
docs: tool-use use case (#15783)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
5 months ago
Bagatur 3d34347a85
langchain[patch]: bump core dep to 0.1.9 (#16104) 5 months ago
Bagatur 62a2e9ee19
langchain[patch]: Release 0.1.1 (#16103) 5 months ago
Nuno Campos 770f57196e
Add unit test for overridden lc_namespace (#16093) 5 months ago
Bagatur 697a6f2c80
langchain[patch]: fix requests lint (#16049) 5 months ago
Christophe Bornet 15c2b4a47e
community[minor]: Add AstraDB self query retriever (#15738)
- **Description:** this change adds a self-query retriever for AstraDB
  - **Twitter handle:** cbornet_
5 months ago
Leonid Ganeline fb676d8a9b
community[minor], langchain[minor]: refactor `output_parsers` Rail (#15852)
Moved Rail parser to `community` package.
5 months ago
Varik Matevosyan efe6cfafe2
community: Added Lantern as VectorStore (#12951)
Support [Lantern](https://github.com/lanterndata/lantern) as a new
VectorStore type.

- Added Lantern as VectorStore.
It will support 3 distance functions `l2 squared`, `cosine` and
`hamming` and will use `HNSW` index.
- Added tests
- Added example notebook
5 months ago
Bagatur c697c89ca4
docs: add agent prompt creation examples (#15957) 5 months ago
Tal eb9b334a6b
Enable customizing the output parser of `OpenAIFunctionsAgent` (#15827)
- **Description:** This PR defines the output parser of
OpenAIFunctionsAgent as an attribute, enabling customization and
subclassing of the parser logic.
- **Issue:** Subclassing is currently impossible as the
`OpenAIFunctionsAgentOutputParser` class is hard coded into the `plan`
and `aplan` methods
  - **Dependencies:** None

<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
5 months ago
JanHorcicka f454e95461
langchain: fix OutputParserException (#15914) (#15916)
**Description:**

Fixes OutputParserException thrown by the output_parser when 'query' is
'Null'.

Replace this entire comment with:
- **Description:** Current implentation of output_parser throws
OutputParserException if the response from the LLM contains `query:
null`. This unfortunately happens for my use case. And since there is no
way to modify the prompt used in SelfQueryRetriever, then we have to fix
it here, so it doesn't crash.
  - **Issue:** https://github.com/langchain-ai/langchain/issues/15914

Didn't run tests. `make test` is not working. There is no `test` rule in
the `Makefile`.

Co-authored-by: Jan Horcicka <jhorcick@amazon.com>
5 months ago
William FH 129552e3d6
Rm deprecated (#15920)
Remove the usage of deprecated methods in the test runner.
5 months ago
Nuno Campos 438beb6c94
Pass config specs through ensemble retriever (#15917)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
5 months ago
Eugene Yurtsev a06db53c37
Add unit tests to test openai tools agent (#15843)
This PR adds unit testing to test openai tools agent.
5 months ago
Eugene Yurtsev feb41c5e28
langchain[patch]: Improve stream_log with AgentExecutor and Runnable Agent (#15792)
This PR fixes an issue where AgentExecutor with RunnableAgent does not allow users to see individual llm tokens if streaming=True is not set explicitly on the underlying chat model.

The majority of this PR is testing code:

1. Create a test chat model that makes it easier to test streaming and
supports AIMessages that include function invocation information.
2. Tests for the chat model
3. Tests for RunnableAgent (previously untested)
4. Tests for openai agent (previously untested)
5 months ago
Eugene Yurtsev 3a8ad90509
langchain(patch): Fix output type for pydantic output parser (#15714)
This PR fixes the output type for the pydantic output parser.

Fix for: https://github.com/langchain-ai/langserve/issues/301
5 months ago
Erick Friis d136925c49
community[patch]: fix deprecation warnings on openai subclasses (#15621) 5 months ago
Bagatur 4ac61670b2
infra: fix langchain openai test dep (#15620) 5 months ago
Bagatur 81810cec2e
langchain[minor]: Release 0.1.0 (#15619) 5 months ago
Erick Friis ebc75c5ca7
openai[minor]: implement langchain-openai package (#15503)
Todo

- [x] copy over integration tests
- [x] update docs with new instructions in #15513 
- [x] add linear ticket to bump core -> community, community->langchain,
and core->openai deps
- [ ] (optional): add `pip install langchain-openai` command to each
notebook using it
- [x] Update docstrings to not need `openai` install
- [x] Add serialization
- [x] deprecate old models

Contributor steps:

- [x] Add secret names to manual integrations workflow in
.github/workflows/_integration_test.yml
- [x] Add secrets to release workflow (for pre-release testing) in
.github/workflows/_release.yml

Maintainer steps (Contributors should not do these):

- [x] set up pypi and test pypi projects
- [x] add credential secrets to Github Actions
- [ ] add package to conda-forge


Functional changes to existing classes:

- now relies on openai client v1 (1.6.1) via concrete dep in
langchain-openai package

Codebase organization

- some function calling stuff moved to
`langchain_core.utils.function_calling` in order to be used in both
community and langchain-openai
5 months ago
Bagatur 68eb3053e7
langchain[patch]: deprecate old agent classes and methods (#15558) 5 months ago
Harrison Chase 9b9449750c
update chain docs (#15495)
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Bagatur 00dfbd2a99
core[minor], langchain[minor]: deprecate old Chain and LLM methods (#15499) 5 months ago
Bagatur f5e4f0b30b
langchain[minor]: add warnings when importing integrations (#15505)
Should be imported from community directly
5 months ago
Bagatur b2f15738dd
core[patch], langchain[patch], community[patch]: Revert #15326 (#15546) 5 months ago
Bagatur 6e90b7a91b
langchain[patch]: bump community >=0.0.8,<0.1 (#15492) 5 months ago
Bagatur 8b7d6531a5
langchain[patch]: Release 0.0.354 (#15482) 5 months ago
Bagatur 54b58c03db
infra: add minimum deps pre release check (#15485) 6 months ago
Bagatur baeac236b6
langchain[patch], experimental[patch]: update utilities imports (#15438) 6 months ago
Nolan 6c4b5a4eff
Add option to preserve headers in MarkdownHeaderTextSplitter (#14433)
- **Description:** `MarkdownHeaderTextSplitter` currently strips header
lines from chunked content. Many applications require these header lines
are preserved. This adds an optional parameter to preserve those headers
in the chunked content.
  - **Issue:** #2836 (relevant)
  - **Dependencies:** -
  - **Tag maintainer:** @baskaryan
  - **Twitter handle:** @finnless

Unit tests and new examples in notebook included.

cc @rlancemartin

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
Leonid Ganeline 2bbee894bb
fixed a dependency duplicate (#15444)
BaseModel is derived twice. Left only one.
6 months ago
William FH 65afc13b8b
[Improvement] Evals: Add git info (#15446) 6 months ago
Bagatur 93e924ec96
langchain[patch], docs: update agent toolkit imports (#15434) 6 months ago
dudub12 7e6b0056b8
SQLDatabase drop the column names in the result. (#15361)
Fix for the following bug:
https://github.com/langchain-ai/langchain/issues/15360

---------

Co-authored-by: dudu butbul <100126964+dudu-upstream@users.noreply.github.com>
6 months ago
Bagatur 1678d6ca17
langchain[patch], experimental[patch], docs: update tools imports (#15433) 6 months ago
Bob Lin e57e50b213
Remove unused `_get_python_repl` (#15389)
This part of the code can also be safely cleaned up.
6 months ago
suhas-kotaki 73a628de9a
added fix for key error: doc_id (#15428)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
6 months ago
Leonid Ganeline b8c6ebf647
refactor `utils` (#15432)
The `langchain` [still holds several
artifacts](https://api.python.langchain.com/en/latest/langchain_api_reference.html#module-langchain.utils)
that belongs to `community`. If they moved then `langchain.utils`
namespace would be removed completely.
- moved `ernie_functions` artifacts to `community`
6 months ago
Bagatur fa5d49f2c1
docs, experimental[patch], langchain[patch], community[patch]: update storage imports (#15429)
ran 
```bash
g grep -l "langchain.vectorstores" | xargs -L 1 sed -i '' "s/langchain\.vectorstores/langchain_community.vectorstores/g"
g grep -l "langchain.document_loaders" | xargs -L 1 sed -i '' "s/langchain\.document_loaders/langchain_community.document_loaders/g"
g grep -l "langchain.chat_loaders" | xargs -L 1 sed -i '' "s/langchain\.chat_loaders/langchain_community.chat_loaders/g"
g grep -l "langchain.document_transformers" | xargs -L 1 sed -i '' "s/langchain\.document_transformers/langchain_community.document_transformers/g"
g grep -l "langchain\.graphs" | xargs -L 1 sed -i '' "s/langchain\.graphs/langchain_community.graphs/g"
g grep -l "langchain\.memory\.chat_message_histories" | xargs -L 1 sed -i '' "s/langchain\.memory\.chat_message_histories/langchain_community.chat_message_histories/g"
gco master libs/langchain/tests/unit_tests/*/test_imports.py
gco master libs/langchain/tests/unit_tests/**/test_public_api.py
```
6 months ago
Nuno Campos 6810b4b0bc
Use tz-aware utc datetimes in tracer (#15187)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
6 months ago
Bagatur 480626dc99
docs, community[patch], experimental[patch], langchain[patch], cli[pa… (#15412)
…tch]: import models from community

ran
```bash
git grep -l 'from langchain\.chat_models' | xargs -L 1 sed -i '' "s/from\ langchain\.chat_models/from\ langchain_community.chat_models/g"
git grep -l 'from langchain\.llms' | xargs -L 1 sed -i '' "s/from\ langchain\.llms/from\ langchain_community.llms/g"
git grep -l 'from langchain\.embeddings' | xargs -L 1 sed -i '' "s/from\ langchain\.embeddings/from\ langchain_community.embeddings/g"
git checkout master libs/langchain/tests/unit_tests/llms
git checkout master libs/langchain/tests/unit_tests/chat_models
git checkout master libs/langchain/tests/unit_tests/embeddings/test_imports.py
make format
cd libs/langchain; make format
cd ../experimental; make format
cd ../core; make format
```
6 months ago
Bagatur 8e0d5813c2
langchain[patch], experimental[patch]: replace langchain.schema imports (#15410)
Import from core instead.

Ran:
```bash
git grep -l 'from langchain.schema\.output_parser' | xargs -L 1 sed -i '' "s/from\ langchain\.schema\.output_parser/from\ langchain_core.output_parsers/g"
git grep -l 'from langchain.schema\.messages' | xargs -L 1 sed -i '' "s/from\ langchain\.schema\.messages/from\ langchain_core.messages/g"
git grep -l 'from langchain.schema\.document' | xargs -L 1 sed -i '' "s/from\ langchain\.schema\.document/from\ langchain_core.documents/g"
git grep -l 'from langchain.schema\.runnable' | xargs -L 1 sed -i '' "s/from\ langchain\.schema\.runnable/from\ langchain_core.runnables/g"
git grep -l 'from langchain.schema\.vectorstore' | xargs -L 1 sed -i '' "s/from\ langchain\.schema\.vectorstore/from\ langchain_core.vectorstores/g"
git grep -l 'from langchain.schema\.language_model' | xargs -L 1 sed -i '' "s/from\ langchain\.schema\.language_model/from\ langchain_core.language_models/g"
git grep -l 'from langchain.schema\.embeddings' | xargs -L 1 sed -i '' "s/from\ langchain\.schema\.embeddings/from\ langchain_core.embeddings/g"
git grep -l 'from langchain.schema\.storage' | xargs -L 1 sed -i '' "s/from\ langchain\.schema\.storage/from\ langchain_core.stores/g"
git checkout master libs/langchain/tests/unit_tests/schema/
make format
cd libs/experimental
make format
cd ../langchain
make format
```
6 months ago
Lucca Zenóbio 3bd0a15506
Fix for openai multi tools input format. (#14653)
Sometimes, the tool_schema is like:
  
` {'action_name': 'search_items', 'action': {'term': 'pizza'}}`

sometimes, specially with gpt3.5 it comes like:

`{'action_name': 'search_items', 'term': 'pizza'}`

and it fails.

This PR is a way to make it work in both scenarios.

issues releated: #6624 

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

Co-authored-by: Lucca Zenobio <lucca.zenobio@ifood.com.br>
6 months ago
Thomas B 9d8468a576
Enhancement on feature/yaml output parser (#14674)
Adding to my previously, already merged PR I made some further
improvements:

* Added documentation to the existing Pydantic Parser notebook, with an
example using LCEL and `with_retry()` on `OutputParserException`.
* Added an additional output example to the prompt
* More lenient parser in terms of LLM output format
* Amended unit test

FYI @hwchase17

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
Zeeland ff10f30149
fix: syntax error in function docs (#14641)
fix syntax error in function docs
6 months ago
Rajesh Sharma dfd7b9edda
Update regex in output parser (#15082)
The regex used to match "Action" and "Action Input" in the output parser
has been updated. Previously, the regex did not correctly handle
multi-line inputs for "Action Input". The updated code uses the
're.DOTALL' flag to ensure multi-line inputs are correctly captured.

<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
Nuno Campos b9636e5c98
Catch type errors in dumps/dumpd (#15336)
These can happen for edge cases not covered by `default` handler (eg.
"strange" keys in dicts)

<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
6 months ago
Nuno Campos 99000c612e
Propagate context vars in all classes/methods (#15329)
- Any direct usage of ThreadPoolExecutor or asyncio.run_in_executor
needs manual handling of context vars

<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
6 months ago
Ankush Gola 7eec8f2487
Delete V1 tracer and refactor tracer tests to core (#15326) 6 months ago
Nuno Campos 9bb1fbcadf Lint 6 months ago
Nuno Campos eb5e250188 Propagate context vars in all classes/methods
- Any direct usage of ThreadPoolExecutor or asyncio.run_in_executor needs manual handling of context vars
6 months ago
Romain Fouilland 6f15cc64b8
langchain: minor changes to StuffDocumentsChain._get_inputs (#15321)
Correcting a small typo ('the' instead of 'then') and changing another
'the' (instead of 'then' too, it was a hard day for the 'n' key :D) to
'also' to match better with what is done in the code

<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
6 months ago
Harrison Chase c3b3b77a11
[core] add test for json parser (#15297)
this should fail, but isnt

---------

Co-authored-by: Nuno Campos <nuno@langchain.dev>
6 months ago
Bagatur 50e99ec601
langchain[patch]: Release 0.0.353 (#15322) 6 months ago
chyroc 507c195a4b
Patch: improve openai functions call parser compatibility (#15197)
```shell
Python 3.11.6 (main, Nov  2 2023, 04:39:43) [Clang 14.0.3 (clang-1403.0.22.14.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> s = {'name': 'gc', 'arguments': '{"prompt":"hi\nbob."}'}
>>> import json
>>> json.loads(s['arguments'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
               ^^^^^^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Invalid control character at: line 1 column 14 (char 13)
>>> json.loads(s['arguments'].replace('\n', '\\n'))
{'prompt': 'hi\nbob.'}
>>>
```

---------

Co-authored-by: Nuno Campos <nuno@langchain.dev>
6 months ago
Nuno Campos f74151b4e4
Make all json parsing less strict by default (#15287)
- Enables strict=False by default
- Uses partial json recovery logic by default

<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
6 months ago
Harrison Chase bc5a0ef6ca
remove chat-history (#15286) 6 months ago
Harrison Chase 90aa26a90e
[langchain] agents code changes (#15278)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
6 months ago
Harrison Chase b86803153e
[core, langchain] modelio code improvements (#15277) 6 months ago
Christopher Queen d5e1725ace
langchain: Fix for issue #14631 - .devcontainer doesnt build (#15251)
- **Description:** Fix for issue #14631
- **Issue:** This fixes [Issue
#14631](https://github.com/langchain-ai/langchain/issues/14631)
- **Twitter handle:** [@consultchrisq
](https://twitter.com/consultchrisq?lang=en)
6 months ago
Brendan Smith 9a16590aa9
langchain: Fix class name in RetryOutputParser docstring (#15268)
`OutputFixingParser` -> `RetryOutputParser`



![i'm-helping](https://github.com/langchain-ai/langchain/assets/5986636/68f1b8ce-8a6e-4e75-9cf8-e3c93ac562c2)
6 months ago
Nuno Campos 6a5a2fb9c8
Add .pick and .assign methods to Runnable (#15229)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
6 months ago
Nuno Campos f36ef0739d
Add create_conv_retrieval_chain func (#15084)
```
                                                     +----------+
                                                     | MapInput |
                                                   **+----------+****
                                               ****                  ****
                                           ****                          ***
                                         **                                 ****
                  +------------------------------------+                        **
                  | Lambda(itemgetter('chat_history')) |                         *
                  +------------------------------------+                         *
                                     *                                           *
                                     *                                           *
                                     *                                           *
                       +---------------------------+            +--------------------------------+
                       | Lambda(_get_chat_history) |            | Lambda(itemgetter('question')) |
                       +---------------------------+            +--------------------------------+
                                     *                                           *
                                     *                                           *
                                     *                                           *
                      +----------------------------+                +------------------------+
                      | ContextSet('chat_history') |                | ContextSet('question') |
                      +----------------------------+                +------------------------+
                                               ****                  ****
                                                   ****          ****
                                                       **      **
                                                     +-----------+
                                                     | MapOutput |
                                                     +-----------+
                                                           *
                                                           *
                                                           *
                                                  +----------------+
                                                  | PromptTemplate |
                                                  +----------------+
                                                           *
                                                           *
                                                           *
                                                    +-------------+
                                                    | FakeListLLM |
                                                    +-------------+
                                                           *
                                                           *
                                                           *
                                                  +-----------------+
                                                  | StrOutputParser |
                                                  +-----------------+
                                                           *
                                                           *
                                                           *
                                            +----------------------------+
                                            | ContextSet('new_question') |
                                            +----------------------------+
                                                           *
                                                           *
                                                           *
                                                +---------------------+
                                                | SequentialRetriever |
                                                +---------------------+
                                                           *
                                                           *
                                                           *
                                        +------------------------------------+
                                        | Lambda(_reduce_tokens_below_limit) |
                                        +------------------------------------+
                                                           *
                                                           *
                                                           *
                                           +-------------------------------+
                                           | ContextSet('input_documents') |
                                           +-------------------------------+
                                                           *
                                                           *
                                                           *
                                                     +----------+
                                                  ***| MapInput |****
                                           *******   +----------+    ********
                                   ********                *                 *******
                            *******                         *                       ********
                        ****                                *                               ****
+-------------------------------+            +----------------------------+            +----------------------------+
| ContextGet('input_documents') |            | ContextGet('chat_history') |            | ContextGet('new_question') |
+-------------------------------+****        +----------------------------+            +----------------------------+
                                     *********                *                 *******
                                              ********         *          ******
                                                      *****    *      ****
                                                         +-----------+
                                                         | MapOutput |
                                                         +-----------+
                                                                *
                                                                *
                                                                *
                                                        +-------------+
                                                        | FakeListLLM |
                                                        +-------------+
                                                                *
                                                                *
                                                                *
                                                          +----------+
                                                       ***| MapInput |***
                                               ********   +----------+   ******
                                        *******                 *              *****
                                ********                        *                   ******
                            ****                                *                         ***
    +-------------------------------+            +----------------------------+            +-------------+
    | ContextGet('input_documents') |            | ContextGet('new_question') |          **| Passthrough |
    +-------------------------------+            +----------------------------+   *******  +-------------+
                                     *******                 *              ******
                                            ******           *       *******
                                                  ****      *    ****
                                                     +-----------+
                                                     | MapOutput |
                                                     +-----------+
```

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
Harrison Chase 4ad77f777e
[core] prompt changes (#15186)
change it to pass all variables through all the way in invoke
6 months ago
Bagatur 56fad2e8ff
langchain[minor]: Add stuff docs runnable (#15178)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
Ran c3f8733aef
fix: correct spelling mistakes of "seperate, intialise, pre-defined" (#14647)
fix spellings

**seperate -> separate**: found more occurrences, see
https://github.com/langchain-ai/langchain/pull/14602
**initialise -> intialize**: the latter is more common in the repo
**pre-defined > predefined**: adding a comma after a prefix is a
delicate matter, but this is a generally accepted word

also, another word that appears in the repo is "fs" (stands for
filesystem), e.g., in `libs/core/langchain_core/prompts/loading.py`
` """Unified method for loading a prompt from LangChainHub or local
fs."""`
Isn't "filesystem" better?
6 months ago
Eugene Yurtsev aad3d8bd47
langchain(patch): Restrict paths in LocalFileStore cache (#15065)
This PR restricts the paths that can be resolve using the local file system cache so that all paths must be contained within the root path.
6 months ago
Ran 129a929d69
infra: Fix test filesystem paths incompatible with windows (#14388)
- **Description:** This PR fixes test failures on Windows caused by path
handling differences and unescaped special characters in regex. The
failing tests are:
```
FAILED tests/unit_tests/storage/test_filesystem.py::test_yield_keys - AssertionError: assert ['key1', 'subdir\\key2'] == ['key1', 'subdir/key2']
FAILED tests/unit_tests/test_imports.py::test_importable_all - ModuleNotFoundError: No module named 'langchain_community.langchain_community\\adapters'
FAILED tests/unit_tests/tools/file_management/test_utils.py::test_get_validated_relative_path_errs_on_absolute - re.error: incomplete escape \U at position 53
FAILED tests/unit_tests/tools/file_management/test_utils.py::test_get_validated_relative_path_errs_on_parent_dir - re.error: incomplete escape \U at position 69
FAILED tests/unit_tests/tools/file_management/test_utils.py::test_get_validated_relative_path_errs_for_symlink_outside_root - re.error: incomplete escape \U at position 64
```

- **Issue:** fixes
https://github.com/langchain-ai/langchain/issues/11775 (partially)
- **Dependencies:** none
6 months ago
Nuno Campos 71076cceaf
Move json and xml parsers to core (#15026)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
6 months ago
Nuno Campos 63e512b680
Implement streaming for all list output parsers (#14981)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
6 months ago
Nuno Campos b471166df7
Implement streaming for xml output parser (#14984)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
6 months ago
Bagatur 1ea6d83188
langchain[patch]: Release 0.0.352 (#14961) 6 months ago
Michael Feil 7b96de3d5d
community[patch]: update Gradient embeddings (#14846)
- **Description:** Going forward, we have a own API `pip install
gradientai`. Therefore gradually removing the self-build packages in
llamaindex, haystack and langchain.
  - **Issue:** None.
  - **Dependencies:** `pip install gradientai`
  - **Tag maintainer:** @michaelfeil
6 months ago
Bagatur 1069a93d18
langchain[patch]: export sagemaker LLMContentHandler (#14906)
Resolves #14904
6 months ago
Leonid Ganeline 6577b0d987
docstrings `langchain` update (#14870)
Added missed docstrings
6 months ago
Kane Sweet ea331f3136
Fix token text splitter duplicates (#14848)
- **Description:** 
- Add a break case to `text_splitter.py::split_text_on_tokens()` to
avoid unwanted item at the end of result.
    - Add a testcase to enforce the behavior.
  - **Issue:** 
    - #14649 
    - #5897
  - **Dependencies:** n/a,
 
---

**Quick illustration of change:**

```
text = "foo bar baz 123"

tokenizer = Tokenizer(
        chunk_overlap=3,
        tokens_per_chunk=7
)

output = split_text_on_tokens(text=text, tokenizer=tokenizer)
```
output before change: `["foo bar", "bar baz", "baz 123", "123"]`
output after change: `["foo bar", "bar baz", "baz 123"]`
6 months ago
Erick Friis 5f839beab9
community: replace deprecated davinci models (#14860)
This is technically a breaking change because it'll switch out default
models from `text-davinci-003` to `gpt-3.5-turbo-instruct`, but OpenAI
is shutting off those endpoints on 1/4 anyways.

Feels less disruptive to switch out the default instead.
6 months ago
Bagatur 714bef0cb6
langchain[patch]: Release 0.0.351 (#14867) 6 months ago
William FH 5fc2c578cf
[Bugfix] Ensure tool output is a str, for OAI Assistant (#14830)
Tool outputs have to be strings apparently. Ensure they are formatted
correctly before passing as intermediate steps.
 

```
BadRequestError: Error code: 400 - {'error': {'message': '1 validation error for Request\nbody -> tool_outputs -> 0 -> output\n  str type expected (type=type_error.str)', 'type': 'invalid_request_error', 'param': None, 'code': None}}
```
6 months ago
William FH bbc98a234d
Update parser (#14831)
Gpt-3.5 sometimes calls with empty string arguments instead of `{}`

I'd assume it's because the typescript representation on their backend
makes it a bit ambiguous.
6 months ago
Erick Friis 8a07c56313
docs: developer docs (#14776)
Builds out a developer documentation section in the docs

- Links it from contributing.md
- Adds an initial guide on how to contribute an integration

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
6 months ago
William FH 01693b291e
Permit updates in indexing (#14482) 6 months ago
Harrison Chase 16399fd61d
langchain[patch]: remove unused imports (#14680)
Co-authored-by: Bagatur <baskaryan@gmail.com>
6 months ago
William FH 4855964332
Fix OAI Tool Message (#14746)
See format here:
https://platform.openai.com/docs/guides/function-calling/parallel-function-calling


It expects a "name" argument, which we aren't providing by default.


![image](https://github.com/langchain-ai/langchain/assets/13333726/7cd82978-337c-40a1-b099-3bb25cd57eb4)


Alternative is to add the 'name' field directly to the message if people
prefer.
6 months ago
William FH e3132a7efc
[Evals] End project (#14324)
Also does some cleanup.

Now that we support updating/ending projects, do this automatically.
Then you can edit the name of the project in the app.
6 months ago
William FH 9d4100f915
Revert "[Hub|tracing] Tag hub prompts" (#14735)
Reverts langchain-ai/langchain#14720
6 months ago
William FH 852b9ca494
[Hub|tracing] Tag hub prompts (#14720)
If you're using the hub, you'll likely be interested in tracking the
commit/object when tracing. This PR adds it to the config
6 months ago
James Braza b9ef92f2f4
Fixed `DeprecationWarning` for `PromptTemplate.from_file` module-level calls (#14468)
Resolves https://github.com/langchain-ai/langchain/issues/14467
6 months ago
Thomas B b4e3e47c92
feat: Yaml output parser (#14496)
## Description
New YAML output parser as a drop-in replacement for the Pydantic output
parser. Yaml is a much more token-efficient format than JSON, proving to
be **~35% faster and using the same percentage fewer completion
tokens**.

☑️ Formatted
☑️ Linted
☑️ Tested (analogous to the existing`test_pydantic_parser.py`)

The YAML parser excels in situations where a list of objects is
required, where the root object needs no key:
```python
class Products(BaseModel):
   __root__: list[Product]
```

I ran the prompt `Generate 10 healthy, organic products` 10 times on one
chain using the `PydanticOutputParser`, the other one using
the`YamlOutputParser` with `Products` (see below) being the targeted
model to be created.

LLMs used were Fireworks' `lama-v2-34b-code-instruct` and OpenAI
`gpt-3.5-turbo`. All runs succeeded without validation errors.

```python
class Nutrition(BaseModel):
    sugar: int = Field(description="Sugar in grams")
    fat: float = Field(description="% of daily fat intake")

class Product(BaseModel):
    name: str = Field(description="Product name")
    stats: Nutrition

class Products(BaseModel):
    """A list of products"""

    products: list[Product] # Used `__root__` for the yaml chain
```
Stats after 10 runs reach were as follows:
### JSON
ø time: 7.75s
ø tokens: 380.8

### YAML
ø time: 5.12s
ø tokens: 242.2


Looking forward to feedback, tips and contributions!
6 months ago
Lance Martin 282362382c
Minor update to ensemble retriever to handle a mix of Documents or str (#14552) 6 months ago
Bagatur 57337b4862
langchain[patch]: Release 0.0.350 (#14612) 6 months ago
Bagatur d388863a3b
community[patch]: Release 0.0.2 (#14610) 6 months ago
Erick Friis 0a9d933bb2
infra: import checking bugfix (#14569) 6 months ago
Bagatur 14bfc5f9f4
langchain[patch]: Release 0.0.349 (#14570) 6 months ago
Erick Friis 482e2b94fa
infra: import CI speed (#14566)
Was taking 10 mins. Now a few seconds.
6 months ago
Erick Friis 5418d8bfd6
infra: import CI fix (#14562)
TIL `**` globstar doesn't work in make

Makefile changes fix that.

`__getattr__` changes allow import of all files, but raise error when
accessing anything from the module.

file deletions were corresponding libs change from #14559
6 months ago
Bagatur ed58eeb9c5
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463)
Moved the following modules to new package langchain-community in a backwards compatible fashion:

```
mv langchain/langchain/adapters community/langchain_community
mv langchain/langchain/callbacks community/langchain_community/callbacks
mv langchain/langchain/chat_loaders community/langchain_community
mv langchain/langchain/chat_models community/langchain_community
mv langchain/langchain/document_loaders community/langchain_community
mv langchain/langchain/docstore community/langchain_community
mv langchain/langchain/document_transformers community/langchain_community
mv langchain/langchain/embeddings community/langchain_community
mv langchain/langchain/graphs community/langchain_community
mv langchain/langchain/llms community/langchain_community
mv langchain/langchain/memory/chat_message_histories community/langchain_community
mv langchain/langchain/retrievers community/langchain_community
mv langchain/langchain/storage community/langchain_community
mv langchain/langchain/tools community/langchain_community
mv langchain/langchain/utilities community/langchain_community
mv langchain/langchain/vectorstores community/langchain_community
mv langchain/langchain/agents/agent_toolkits community/langchain_community
mv langchain/langchain/cache.py community/langchain_community
mv langchain/langchain/adapters community/langchain_community
mv langchain/langchain/callbacks community/langchain_community/callbacks
mv langchain/langchain/chat_loaders community/langchain_community
mv langchain/langchain/chat_models community/langchain_community
mv langchain/langchain/document_loaders community/langchain_community
mv langchain/langchain/docstore community/langchain_community
mv langchain/langchain/document_transformers community/langchain_community
mv langchain/langchain/embeddings community/langchain_community
mv langchain/langchain/graphs community/langchain_community
mv langchain/langchain/llms community/langchain_community
mv langchain/langchain/memory/chat_message_histories community/langchain_community
mv langchain/langchain/retrievers community/langchain_community
mv langchain/langchain/storage community/langchain_community
mv langchain/langchain/tools community/langchain_community
mv langchain/langchain/utilities community/langchain_community
mv langchain/langchain/vectorstores community/langchain_community
mv langchain/langchain/agents/agent_toolkits community/langchain_community
mv langchain/langchain/cache.py community/langchain_community
```

Moved the following to core
```
mv langchain/langchain/utils/json_schema.py core/langchain_core/utils
mv langchain/langchain/utils/html.py core/langchain_core/utils
mv langchain/langchain/utils/strings.py core/langchain_core/utils
cat langchain/langchain/utils/env.py >> core/langchain_core/utils/env.py
rm langchain/langchain/utils/env.py
```

See .scripts/community_split/script_integrations.sh for all changes
6 months ago
Harrison Chase f5befe3b89
manual mapping (#14422) 6 months ago
Erick Friis c24f277b7c
langchain[patch], docs[patch]: use byte store in multivectorretriever (#14474) 6 months ago
Erick Friis b3f226e8f8
core[patch], langchain[patch], experimental[patch]: import CI (#14414) 6 months ago
Erick Friis 1d7e5c51aa
langchain[patch]: xfail unstable vertex test (#14462) 6 months ago
Harrison Chase 02ee0073cf
revoke serialization (#14456) 6 months ago
Erick Friis 1d725327eb
langchain[patch]: Fix scheduled testing (#14428)
- integration tests in pyproject
- integration test fixes
6 months ago
Harrison Chase 7be3eb6fbd
fix imports from core (#14430) 6 months ago
Bagatur e4d6e55c5e
langchain[patch]: Release 0.0.348 (#14417) 6 months ago
Bagatur b2280fd874
core[patch], langchain[patch]: fix required deps (#14373) 6 months ago
Kacper Łukawski 76f30f5297
langchain[patch]: Rollback multiple keys in Qdrant (#14390)
This reverts commit 38813d7090. This is a
temporary fix, as I don't see a clear way on how to use multiple keys
with `Qdrant.from_texts`.

Context: #14378
6 months ago
Erick Friis 54040b00a4
langchain[patch]: fix ChatVertexAI streaming (#14369) 6 months ago
Bagatur db6bf8b022
langchain[patch]: Release 0.0.347 (#14368) 6 months ago
William FH e5bd32ff6d
Include run_id (#14331)
in the test run outputs
6 months ago
Bagatur cc76f0e834
langchain[patch]: import nits (#14354)
import from core instead of langchain.schema
6 months ago
Jacob Lee 867ca6d0be
Fix multi vector retriever subclassing (#14350)
Fixes #14342

@eyurtsev @baskaryan

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
6 months ago
Erick Friis 7bdfc43766
core[patch], langchain[patch]: ByteStore (#14312) 6 months ago
Jean-Baptiste dlb 38813d7090
Qdrant metadata payload keys (#13001)
- **Description:** In Qdrant allows to input list of keys as the
content_payload_key to retrieve multiple fields (the generated document
will contain the dictionary {field: value} in a string),
- **Issue:** Previously we were able to retrieve only one field from the
vector database when making a search
  - **Dependencies:** 
  - **Tag maintainer:** 
  - **Twitter handle:** @jb_dlb

---------

Co-authored-by: Jean Baptiste De La Broise <jeanbaptiste.delabroise@mdpi.com>
6 months ago
Yuchen Liang ad6dfb6220
feat: mask api key for cerebriumai llm (#14272)
- **Description:** Masking API key for CerebriumAI LLM to protect user
secrets.
 - **Issue:** #12165 
 - **Dependencies:** None
 - **Tag maintainer:** @eyurtsev

---------

Signed-off-by: Yuchen Liang <yuchenl3@andrew.cmu.edu>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
newfinder d4d64daa1e
Mask API key for baidu qianfan (#14281)
Description: This PR masked baidu qianfan - Chat_Models API Key and
added unit tests.
Issue: the issue langchain-ai#12165.
Tag maintainer: @eyurtsev

---------

Co-authored-by: xiayi <xiayi@bytedance.com>
6 months ago
cxumol 06e3316f54
feat(add): LLM integration of Cloudflare Workers AI (#14322)
Add [Text Generation by Cloudflare Workers
AI](https://developers.cloudflare.com/workers-ai/models/text-generation/).
It's a new LLM integration.

- Dependencies: N/A
6 months ago
Harutaka Kawamura 5efaedf488
Exclude `max_tokens` from request if it's None (#14334)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->


We found a request with `max_tokens=None` results in the following error
in Anthropic:

```
HTTPError: 400 Client Error: Bad Request for url: https://oregon.staging.cloud.databricks.com/serving-endpoints/corey-anthropic/invocations. 
Response text: {"error_code":"INVALID_PARAMETER_VALUE","message":"INVALID_PARAMETER_VALUE: max_tokens was not of type Integer: null"}
```

This PR excludes `max_tokens` if it's None.
6 months ago
MinjiK a1a11ffd78
Amadeus toolkit minor update (#13002)
- update `Amadeus` toolkit with ability to switch Amadeus environments 
- update minor code explanations

---------

Co-authored-by: MinjiK <minji.kim@amadeus.com>
6 months ago
Alexandre Dumont b05c46074b
OpenAIEmbeddings: retry_min_seconds/retry_max_seconds parameters (#13138)
- **Description:** new parameters in OpenAIEmbeddings() constructor
(retry_min_seconds and retry_max_seconds) that allow parametrization by
the user of the former min_seconds and max_seconds that were hidden in
_create_retry_decorator() and _async_retry_decorator()
  - **Issue:** #9298, #12986
  - **Dependencies:** none
  - **Tag maintainer:** @hwchase17
  - **Twitter handle:** @adumont

make format 
make lint 
make test 

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
mogith-pn 9e5d146409
Updated integration with Clarifai python SDK functions (#13671)
Description :

Updated the functions with new Clarifai python SDK.
Enabled initialisation of Clarifai class with model URL.
Updated docs with new functions examples.
6 months ago
dudub12 8f403ea2d7
info sql tool remove whitespaces in table names (#13712)
Remove whitespaces from the input of the ListSQLDatabaseTool for better
support.
for example, the input "table1,table2,table3" will throw an exception
whiteout the change although it's a valid input.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
balaba-max 64d5108f99
Feature: GitLab url from ENV (#14221)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** add gitlab url from env, 
  - **Issue:** no issue,
  - **Dependencies:** no,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
6 months ago
jeffpezzone 7c2ef06136
Adds "NIN" metadata filter for pgvector to all checking for set absence (#14205)
This PR adds support for metadata filters of the form:

`{"filter": {"key": { "NIN" : ["list", "of", "values"]}}}`

"IN" is already supported, so this is a quick & related update to add
"NIN"
6 months ago
lif 20d2b4a6ba
feat: Increased compatibility with new and old versions for dalle (#14222)
- **Description:** Increased compatibility with all versions openai for
dalle,

This pr add support for openai version from 0 ~ 1.3.
6 months ago
Wang Wei 7205bfdd00
feat: 1. Add system parameters, 2. Align with the QianfanChatEndpoint for function calling (#14275)
- **Description:** 
1. Add system parameters to the ERNIE LLM API to set the role of the
LLM.
2. Add support for the ERNIE-Bot-turbo-AI model according from the
document https://cloud.baidu.com/doc/WENXINWORKSHOP/s/Alp0kdm0n.
3. For the function call of ErnieBotChat, align with the
QianfanChatEndpoint.

With this PR, the `QianfanChatEndpoint()` can use the `function calling`
ability with `create_ernie_fn_chain()`. The example is as the following:

```
from langchain.prompts import ChatPromptTemplate
import json
from langchain.prompts.chat import (
    ChatPromptTemplate,
)

from langchain.chat_models import QianfanChatEndpoint
from langchain.chains.ernie_functions import (
    create_ernie_fn_chain,
)

def get_current_news(location: str) -> str:
    """Get the current news based on the location.'

    Args:
        location (str): The location to query.
    
    Returs:
        str: Current news based on the location.
    """

    news_info = {
        "location": location,
        "news": [
            "I have a Book.",
            "It's a nice day, today."
        ]
    }

    return json.dumps(news_info)

def get_current_weather(location: str, unit: str="celsius") -> str:
    """Get the current weather in a given location

    Args:
        location (str): location of the weather.
        unit (str): unit of the tempuature.
    
    Returns:
        str: weather in the given location.
    """

    weather_info = {
        "location": location,
        "temperature": "27",
        "unit": unit,
        "forecast": ["sunny", "windy"],
    }
    return json.dumps(weather_info)

template = ChatPromptTemplate.from_messages([
    ("user", "{user_input}"),
])

chat = QianfanChatEndpoint(model="ERNIE-Bot-4")
chain = create_ernie_fn_chain([get_current_weather, get_current_news], chat, template, verbose=True)
res = chain.run("北京今天的新闻是什么?")
print(res)
```

The result of the above code:
```
> Entering new LLMChain chain...
Prompt after formatting:
Human: 北京今天的新闻是什么?
> Finished chain.
{'name': 'get_current_news', 'arguments': {'location': '北京'}}
```

For the `ErnieBotChat`, now can use the `system` parameter to set the
role of the LLM.

```
from langchain.prompts import ChatPromptTemplate
from langchain.chains import LLMChain
from langchain.chat_models import ErnieBotChat

llm = ErnieBotChat(model_name="ERNIE-Bot-turbo-AI", system="你是一个能力很强的机器人,你的名字叫 小叮当。无论问你什么问题,你都可以给出答案。")
prompt = ChatPromptTemplate.from_messages(
    [
        ("human", "{query}"),
    ]
)
chain = LLMChain(llm=llm, prompt=prompt, verbose=True)
res = chain.run(query="你是谁?")
print(res)
```

The result of the above code:

```
> Entering new LLMChain chain...
Prompt after formatting:
Human: 你是谁?
> Finished chain.
我是小叮当,一个智能机器人。我可以为你提供各种服务,包括回答问题、提供信息、进行计算等。如果你需要任何帮助,请随时告诉我,我会尽力为你提供最好的服务。
```
6 months ago
Leonid Kuligin fd5be55a7b
added get_num_tokens to GooglePalm (#14282)
added get_num_tokens to GooglePalm + a little bit of refactoring
6 months ago
Massimiliano Pronesti c215a4c9ec
feat(embeddings): text-embeddings-inference (#14288)
- **Description:** Added a notebook to illustrate how to use
`text-embeddings-inference` from huggingface. As
`HuggingFaceHubEmbeddings` was using a deprecated client, I made the
most of this PR updating that too.

- **Issue:** #13286 

- **Dependencies**: None

- **Tag maintainer:** @baskaryan
6 months ago
Tim Van Wassenhove 85b88c33f3
Fixes issue-14295: Correctly pass along the kwargs (#14296)
- **Description:** Update code to correctly pass the kwargs 
  - **Issue:** #14295 
  - **Dependencies:**  - 
  - **Tag maintainer:** 

<--
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

#issue-14295
6 months ago
Jarkko Lagus 667ad6a5de
Add support for CORS options for AzureSearch (#14305)
- **Description:** Add support for setting the CORS options when using
AzureSearch indexes
6 months ago
Karim Assi 9401539e43
Allow not enforcing function usage when a single function is passed to openai function executable (#14308)
- **Description:** allows not enforcing function usage when a single
function is passed to an openAI function executable (or corresponding
legacy chain). This is a desired feature in the case where the model
does not have enough information to call a function, and needs to get
back to the user.
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Tag maintainer:** N/A
6 months ago
Ran d22c13ec48
Mask API key for Minimax LLM (#14309)
- **Description:** Added masking for the API key for Minimax LLM + tests
inspired by https://github.com/langchain-ai/langchain/pull/12418.
- **Issue:** the issue # fixes
https://github.com/langchain-ai/langchain/issues/12165
- **Dependencies:** this fix is dependent on Minimax instantiation fix
which is introduced in
https://github.com/langchain-ai/langchain/pull/13439, so merge this one
after.
  - **Tag maintainer:** @eyurtsev

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
Eugene Yurtsev a74c03da3c
Add metadata to blob (#14162)
Add metadata to the blob object. This makes it easier
to make a pipeline that properly propagates metadata information
from raw content to the derived content.
6 months ago
Lance Martin 66848871fc
Multi-modal RAG template (#14186)
* OpenCLIP embeddings
* GPT-4V

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
6 months ago
Eugene Yurtsev 80637727ea
hide api key: arcee (#14304)
Hide API key for Arcee

---------

Co-authored-by: raphael <raph.nunes95@gmail.com>
6 months ago
Bagatur b2e756c0a8
langchain[patch]: Release 0.0.346 (#14307) 6 months ago
Sean Bearden 77a15fa988
Added ability to pass arguments to the Playwright browser (#13146)
- **Description:** Enhanced `create_sync_playwright_browser` and
`create_async_playwright_browser` functions to accept a list of
arguments. These arguments are now forwarded to
`browser.chromium.launch()` for customizable browser instantiation.
  - **Issue:** #13143
  - **Dependencies:** None
  - **Tag maintainer:** @eyurtsev,
  - **Twitter handle:** Dr_Bearden

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
Joan Fontanals dcccf8fa66
adapt Jina Embeddings to new Jina AI Embedding API (#13658)
- **Description:** Adapt JinaEmbeddings to run with the new Jina AI
Embedding platform
- **Twitter handle:** https://twitter.com/JinaAI_

---------

Co-authored-by: Joan Fontanals Martinez <joan.fontanals.martinez@jina.ai>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
guillaumedelande ea0afd07ca
Update azuresearch.py following recent change from azure-search-documents library (#13472)
- **Description:** 

Reference library azure-search-documents has been adapted in version
11.4.0:

1. Notebook explaining Azure AI Search updated with most recent info
2. HnswVectorSearchAlgorithmConfiguration --> HnswAlgorithmConfiguration
3. PrioritizedFields(prioritized_content_fields) -->
SemanticPrioritizedFields(content_fields)
4. SemanticSettings --> SemanticSearch
5. VectorSearch(algorithm_configurations) -->
VectorSearch(configurations)

--> Changes now reflected on Langchain: default vector search config
from langchain is now compatible with officially released library from
Azure.

  - **Issue:**
Issue creating a new index (due to wrong class used for default vector
search configuration) if using latest version of azure-search-documents
with current langchain version
  - **Dependencies:** azure-search-documents>=11.4.0,
  - **Tag maintainer:** ,

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
6 months ago
price-deshaw 5cb3393e20
update OpenAI function agents' llm validation (#13538)
- **Description:** This PR modifies the LLM validation in OpenAI
function agents to check whether the LLM supports OpenAI functions based
on a property (`supports_oia_functions`) instead of whether the LLM
passed to the agent `isinstance` of `ChatOpenAI`. This allows classes
that extend `BaseChatModel` to be passed to these agents as long as
they've been integrated with the OpenAI APIs and have this property set,
even if they don't extend `ChatOpenAI`.
  - **Issue:** N/A
  - **Dependencies:** none
6 months ago
Max Weng 74c7b799ef
migrate openai audio api (#13557)
for issue https://github.com/langchain-ai/langchain/issues/13162
migrate openai audio api, as [openai v1.0.0 Migration
Guide](https://github.com/openai/openai-python/discussions/742)

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Double Max <max@ground-map.com>
6 months ago
Arnaud Gelas abbba6c7d8
openapi/planner.py: Deal with json in markdown output cases (#13576)
- **Description:** In openapi/planner deal with json in markdown output
cases
- **Issue:** In some cases LLMs could return json in markdown which
can't be loaded.
  - **Dependencies:**
  - **Tag maintainer:** @eyurtsev
  - **Twitter handle:**

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
Nolan b49104c2c9
Add missing doc key to metadata field in AzureSearch Vectorstore (#13328)
- **Description:** Adds doc key to metadata field when adding document
to Azure Search.
  - **Issue:** -,
  - **Dependencies:** -,
  - **Tag maintainer:** @eyurtsev,
  - **Twitter handle:** @finnless

Right now the document key with the name FIELDS_ID is not included in
the FIELDS_METADATA field, and therefore is not included in the Document
returned from a query. This is really annoying if you want to be able to
modify that item in the vectorstore.

Other's thoughts on this are welcome.
6 months ago
Jon Watte e042e5df35
fix: call _on_llm_error() (#13581)
Description: There's a copy-paste typo where on_llm_error() calls
_on_chain_error() instead of _on_llm_error().
Issue: #13580 
Dependencies: None
Tag maintainer: @hwchase17 
Twitter handle: @jwatte

"Run `make format`, `make lint` and `make test` to check this locally."
The test scripts don't work in a plain Ubuntu LTS 20.04 system.
It looks like the dev container pulling is stuck. Or maybe the internet
is just ornery today.

---------

Co-authored-by: jwatte <jwatte@observeinc.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
Hamza Ahmed fcc8e5e839
Update geodataframe.py (#13573)
here it is validating shapely.geometry.point.Point: if not
isinstance(data_frame[page_content_column].iloc[0], gpd.GeoSeries):
raise ValueError(
f"Expected data_frame[{page_content_column}] to be a GeoSeries" you need
it to validate the geoSeries and not the shapely.geometry.point.Point

if not isinstance(data_frame[page_content_column], gpd.GeoSeries):
            raise ValueError(
f"Expected data_frame[{page_content_column}] to be a GeoSeries"

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
6 months ago
Harrison Chase 2213fc9711
Harrison/bookend ai (#14258)
Co-authored-by: stvhu-bookend <142813359+stvhu-bookend@users.noreply.github.com>
6 months ago
cxumol 0d47d15a9f
add(feat): Text Embeddings by Cloudflare Workers AI (#14220)
Add [Text Embeddings by Cloudflare Workers
AI](https://developers.cloudflare.com/workers-ai/models/text-embeddings/).
It's a new integration.
Trying to align it with its langchain-js version counterpart
[here](https://api.js.langchain.com/classes/embeddings_cloudflare_workersai.CloudflareWorkersAIEmbeddings.html).
- Dependencies: N/A
- Done `make format` `make lint` `make spell_check` `make
integration_tests` and all my changes was passed
6 months ago
Harrison Chase c51001f01e
fix comet tracer (#14259) 6 months ago
Harrison Chase 4fb72ff76f
fake consistent embeddings cleanup (#14256)
delete code that could never be reached
6 months ago
Michael Landis e26906c1dc
feat: implement max marginal relevance for momento vector index (#13619)
**Description**

Implements `max_marginal_relevance_search` and
`max_marginal_relevance_search_by_vector` for the Momento Vector Index
vectorstore.

Additionally bumps the `momento` dependency in the lock file and adds
logging to the implementation.

**Dependencies**

 updates `momento` dependency in lock file

**Tag maintainer**

@baskaryan 

**Twitter handle**

Please tag @momentohq for Momento Vector Index and @mloml for the
contribution 🙇

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
6 months ago
deedy5 ee9abb6722
Bugfix duckduckgo_search news search (#13670)
- **Description:** 
Bugfix duckduckgo_search news search
  - **Issue:** 
https://github.com/langchain-ai/langchain/issues/13648
  - **Dependencies:** 
None
  - **Tag maintainer:** 
@baskaryan

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
Aliaksandr Kuzmik 676a077c4e
Add CometTracer (#13661)
Hi! I'm Alex, Python SDK Team Lead from
[Comet](https://www.comet.com/site/).

This PR contains our new integration between langchain and Comet -
`CometTracer` class which uses new `comet_llm` python package for
submitting data to Comet.

No additional dependencies for the langchain package are required
directly, but if the user wants to use `CometTracer`, `comet-llm>=2.0.0`
should be installed. Otherwise an exception will be raised from
`CometTracer.__init__`.

A test for the feature is included.

There is also an already existing callback (and .ipynb file with
example) which ideally should be deprecated in favor of a new tracer. I
wasn't sure how exactly you'd prefer to do it. For example we could open
a separate PR for that.

I'm open to your ideas :)
6 months ago
Harrison Chase 921c4b5597
Harrison/searchapi (#14252)
Co-authored-by: SebastjanPrachovskij <86522260+SebastjanPrachovskij@users.noreply.github.com>
6 months ago
Colin Ulin 9f9cb71d26
Embaas - added backoff retries for network requests (#13679)
Running a large number of requests to Embaas' servers (or any server)
can result in intermittent network failures (both from local and
external network/service issues). This PR implements exponential backoff
retries to help mitigate this issue.
6 months ago
Kastan Day 65faba91ad
langchain[patch]: Adding new Github functions for reading pull requests (#9027)
The Github utilities are fantastic, so I'm adding support for deeper
interaction with pull requests. Agents should read "regular" comments
and review comments, and the content of PR files (with summarization or
`ctags` abbreviations).

Progress:
- [x] Add functions to read pull requests and the full content of
modified files.
- [x] Function to use Github's built in code / issues search.

Out of scope:
- Smarter summarization of file contents of large pull requests (`tree`
output, or ctags).
- Smarter functions to checkout PRs and edit the files incrementally
before bulk committing all changes.
- Docs example for creating two agents:
- One watches issues: For every new issue, open a PR with your best
attempt at fixing it.
- The other watches PRs: For every new PR && every new comment on a PR,
check the status and try to finish the job.

<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
6 months ago
Jacob Lee a26c4a0930
Allow base_store to be used directly with MultiVectorRetriever (#14202)
Allow users to pass a generic `BaseStore[str, bytes]` to
MultiVectorRetriever, removing the need to use the `create_kv_docstore`
method. This encoding will now happen internally.

@rlancemartin @eyurtsev

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
6 months ago
Jacob Lee de86b84a70
Prefer byte store interface for Upstash BaseStore to match other Redis (#14201)
If we are not going to make the existing Docstore class also implement
`BaseStore[str, Document]`, IMO all base store implementations should
always be `[str, bytes]` so that they are more interchangeable.

CC @rlancemartin @eyurtsev
6 months ago
Harrison Chase 411aa9a41e
Harrison/nasa tool (#14245)
Co-authored-by: Jacob Matias <88005863+matiasjacob25@users.noreply.github.com>
Co-authored-by: Karam Daid <karam.daid@mail.utoronto.ca>
Co-authored-by: Jumana <jumana.fanous@mail.utoronto.ca>
Co-authored-by: KaramDaid <38271127+KaramDaid@users.noreply.github.com>
Co-authored-by: Anna Chester <74325334+CodeMakesMeSmile@users.noreply.github.com>
Co-authored-by: Jumana <144748640+jfanous@users.noreply.github.com>
6 months ago
nceccarelli 5fea63327b
Support Azure gov cloud in Azure Cognitive Search retriever (#13695)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
- **Description:** The existing version hardcoded search.windows.net in
the base url. This is not compatible with the gov cloud. I am allowing
the user to override the default for gov cloud support.,
  - **Issue:** N/A, did not write up in an issue,
  - **Dependencies:** None

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Nicholas Ceccarelli <nceccarelli2@moog.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
ealt e09b876863
Fixes error loading Obsidian templates (#13888)
- **Description:** Obsidian templates can include
[variables](https://help.obsidian.md/Plugins/Templates#Template+variables)
using double curly braces. `ObsidianLoader` uses PyYaml to parse the
frontmatter of documents. This parsing throws an error when encountering
variables' curly braces. This is avoided by temporarily substituting
safe strings before parsing.
  - **Issue:** #13887
  - **Tag maintainer:** @hwchase17
6 months ago
Nithish Raghunandanan eecfa3f9e5
Add Couchbase document loader (#13979)
**Description:** 
Adds the document loader for [Couchbase](http://couchbase.com/), a
distributed NoSQL database.
**Dependencies:** 
Added the Couchbase SDK as an optional dependency.
**Twitter handle:** nithishr

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
6 months ago
Muntaqa Mahmood 25f72944a0
Add: Steam API tool (#14008)
- **Description:** Our PR is an integration of a Steam API Tool that
makes recommendations on steam games based on user's Steam profile and
provides information on games based on user provided queries.
- **Issue:** the issue # our PR implements:
https://github.com/langchain-ai/langchain/issues/12120
- **Dependencies:** python-steam-api library, steamspypi library and
decouple library
  - **Tag maintainer:** @baskaryan, @hwchase17 
  - **Twitter handle:** N/A

Hello langchain Maintainers,

We are a team of 4 University of Toronto students contributing to
langchain as part of our course [CSCD01 (link to course
page)](https://cscd01.com/work/open-source-project). We hope our changes
help the community. We have run make format, make lint and make test
locally before submitting the PR. To our knowledge, our changes do not
introduce any new errors.

Our PR integrates the python-steam-api, steamspypi and decouple
packages. We have added integration tests to test our python API
integration into langchain and an example notebook is also provided.

Our amazing team that contributed to this PR: @JohnY2002, @shenceyang,
@andrewqian2001 and @muntaqamahmood

Thank you in advance to all the maintainers for reviewing our PR!

---------

Co-authored-by: Shence <ysc1412799032@163.com>
Co-authored-by: JohnY2002 <johnyuan0526@gmail.com>
Co-authored-by: Andrew Qian <andrewqian2001@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: JohnY <94477598+JohnY2002@users.noreply.github.com>
6 months ago
Bob Lin cd2028288e
Add openai v2 adapter (#14063)
### Description

Starting from [openai version
1.0.0](17ac677995 (module-level-client)),
the camel case form of `openai.ChatCompletion` is no longer supported
and has been changed to lowercase `openai.chat.completions`. In
addition, the returned object only accepts attribute access instead of
index access:

```python
import openai

# optional; defaults to `os.environ['OPENAI_API_KEY']`
openai.api_key = '...'

# all client options can be configured just like the `OpenAI` instantiation counterpart
openai.base_url = "https://..."
openai.default_headers = {"x-foo": "true"}

completion = openai.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "user",
            "content": "How do I output all files in a directory using Python?",
        },
    ],
)
print(completion.choices[0].message.content)
```

So I implemented a compatible adapter that supports both attribute
access and index access:

```python
In [1]: from langchain.adapters import openai as lc_openai
   ...: messages = [{"role": "user", "content": "hi"}]

In [2]: result = lc_openai.chat.completions.create(
   ...:     messages=messages, model="gpt-3.5-turbo", temperature=0
   ...: )

In [3]: result.choices[0].message
Out[3]: {'role': 'assistant', 'content': 'Hello! How can I assist you today?'}

In [4]: result["choices"][0]["message"]
Out[4]: {'role': 'assistant', 'content': 'Hello! How can I assist you today?'}

In [5]: result = await lc_openai.chat.completions.acreate(
   ...:     messages=messages, model="gpt-3.5-turbo", temperature=0
   ...: )

In [6]: result.choices[0].message
Out[6]: {'role': 'assistant', 'content': 'Hello! How can I assist you today?'}

In [7]: result["choices"][0]["message"]
Out[7]: {'role': 'assistant', 'content': 'Hello! How can I assist you today?'}

In [8]: for rs in lc_openai.chat.completions.create(
    ...:     messages=messages, model="gpt-3.5-turbo", temperature=0, stream=True
    ...: ):
    ...:     print(rs.choices[0].delta)
    ...:     print(rs["choices"][0]["delta"])
    ...:
{'role': 'assistant', 'content': ''}
{'role': 'assistant', 'content': ''}
{'content': 'Hello'}
{'content': 'Hello'}
{'content': '!'}
{'content': '!'}

In [20]: async for rs in await lc_openai.chat.completions.acreate(
    ...:     messages=messages, model="gpt-3.5-turbo", temperature=0, stream=True
    ...: ):
    ...:     print(rs.choices[0].delta)
    ...:     print(rs["choices"][0]["delta"])
    ...:
{'role': 'assistant', 'content': ''}
{'role': 'assistant', 'content': ''}
{'content': 'Hello'}
{'content': 'Hello'}
{'content': '!'}
{'content': '!'}
...
```

### Twitter handle

[lin_bob57617](https://twitter.com/lin_bob57617)
6 months ago
billytrend-cohere 0f02081392
Add input_type override (#14068)
Add option to override input_type for cohere's v3 embeddings models

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
6 months ago
Dmitrii Rashchenko aaabc1574f
Support of custom hugging face inference endpoints url (#14125)
- **Description:** to support not only publicly available Hugging Face
endpoints, but also protected ones (created with "Inference Endpoints"
Hugging Face feature), I have added ability to specify custom api_url.
But if not specified, default behaviour won't change
  - **Issue:** #9181,
  - **Dependencies:** no extra dependencies
6 months ago
Harrison Chase e32185193e
Harrison/embass (#14242)
Co-authored-by: Julius Lipp <lipp.julius@gmail.com>
6 months ago
umair mehmood 8504ec56e4
fixed: ModuleNotFoundError: No module named 'clarifai.auth' (#14215)
Updated the clarifai imports 

fixed: #14175 

@efriis 
@baskaryan
6 months ago
Hieu Lam ca8a022cd9
Fixed OpenAIFunctionsAgent not returning when receiving AgentFinish (#14236)
**Description:** The way the condition is checked in the
`return_stopped_response` function of `OpenAIAgent` may not be correct,
when the value returned is `AgentFinish` from the tools it does not work
properly.


Thanks for review, @baskaryan, @eyurtsev, @hwchase17.
6 months ago
Unai Garay Maestre 6826feea14
Adds `llm_chain_kwargs` to `BaseRetrievalQA.from_llm` (#14224)
- **Description:** Adds `llm_chain_kwargs` to `BaseRetrievalQA.from_llm`
so these can be passed to the LLM at runtime,
- **Issue:** https://github.com/langchain-ai/langchain/issues/14216,

---------

Signed-off-by: ugm2 <unaigaraymaestre@gmail.com>
6 months ago
James Braza 6ce5dab38c
Clarifying descriptions in `GuardrailsOutputParser` (#14228)
Upstreaming knowledge from
https://github.com/guardrails-ai/guardrails/discussions/473 to LangChain
6 months ago
geret1 50aee687c6
langchain[patch]: Cerebrium model_api_request deprecation (#12704)
- **Description:** As part of my conversation with Cerebrium team,
`model_api_request` will be no longer available in cerebrium lib so it
needs to be replaced.
  - **Issue:** #12705 12705,
  - **Dependencies:** Cerebrium team (agreed)
  - **Tag maintainer:** @eyurtsev 
  - **Twitter handle:** No official Twitter account sorry :D

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
6 months ago
William FH 246dc4f9cc
langchain[patch]: Pass kwargs to chat fireworks (#14183)
Otherwise `.bind()` isn't really any good
7 months ago
Kaiboon Ee e961c57fd2
langchain[patch]: Mask API key for Arcee LLM (#14193)
- **Description:** Mask API key for Arcee LLM and its associated unit
tests
  - **Issue:** https://github.com/langchain-ai/langchain/issues/12165
  - **Dependencies:** N/A
  - **Tag maintainer:** @eyurtsev
  - **Twitter handle:** `eekaiboon`

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Daniyar Supiyev 092f302c0f
langchain[patch]: Asynchronous human-in-the-loop callback (#14195)
**Description:** Adding a possibility to use asynchronous callback
handler in human-in-the-loop validation tool. Very useful, for example,
if you want to implement a validation over Telegram bot.
**Issue:** -
**Dependencies:** -

---------

Co-authored-by: Daniyar_Supiyev <daniyar_supiyev@epam.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Mark Cusack 16c83f786c
Adds the Yellowbrick Data Warehouse as a supported vector store (#13820)
- **Description** An integration to allow the Yellowbrick Data Warehouse
to function as a vector store

---------

Co-authored-by: markcusack <markcusack@markcusacksmac.lan>
Co-authored-by: markcusack <markcusack@Mark-Cusack-sMac.local>
7 months ago
Nicolò Boschi e204657b3c
AstraDB VectorStore: implement pre_delete_collection (#13780)
- **Description:** some vector stores have a flag for try deleting the
collection before creating it (such as ´vectorpg´). This is a useful
flag when prototyping indexing pipelines and also for integration tests.
Added the bool flag `pre_delete_collection ` to the constructor (default
False)
  - **Tag maintainer:** @hemidactylus 
  - **Twitter handle:** nicoloboschi

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
7 months ago
Chelsea E. Manning 2780d2d4dd
Extend OpenAIEmbeddings class to support non-`tiktoken` based embeddings (#13884)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
- **Description:** This extends `OpenAIEmbeddings` to add support for
non-`tiktoken` based embeddings, specifically for use with the new
`text-generation-webui` API (`--extensions openai`) which does not
support `tiktoken` encodings, but rather strings
  - **Issue:** Not found,
- **Dependencies:** HuggingFace `transformers.AutoTokenizer` is new
dependency for running the model without `tiktoken`
- **Tag maintainer:** @baskaryan based on last commit for
`langchain-core` refactor
  - **Twitter handle:** @xychelsea

Modified the tokenization process to be model-agnostic, allowing for
both OpenAI and non-OpenAI model tokenizations, by setting the new
default `bool` flag `tiktoken_enabled` to `False`. This requeires
HuggingFace’s AutoTokenizer and handling tokenization for models
requiring different preprocessing steps to generate a chunked string
request rather than a list of integers.

Updated the embeddings generation process to accommodate non-OpenAI
models. This includes converting tokenized text into embeddings using
OpenAI’s and Hugging Face’s model architectures.
 -->
7 months ago
Changgeng Zhao 9b59bde93d
Update Hologres vector store: use hologres-vector (#13767)
Hi,
I made some code changes on the Hologres vector store to improve the
data insertion performance.
Also, this version of the code uses `hologres-vector` library. This
library is more convenient for us to update, and more efficient in
performance.
The code has passed the format/lint/spell check. I have run the unit
test for Hologres connecting to my own database.
Please check this PR again and tell me if anything needs to change.

Best,
Changgeng,
Developer @ Alibaba Cloud

Co-authored-by: Changgeng Zhao <zhaochanggeng.zcg@alibaba-inc.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
7 months ago
Nicolò Boschi 0de7cf898d
Ensure AstraDB integration tests clean up the environment (#13774)
- **Description:** currently astra_db integration tests might leave
orphan collections
  - **Tag maintainer:** @hemidactylus 
  - **Twitter handle:** nicoloboschi
7 months ago
Chad Norvell 8a0951d934
Fix Mathpix PDF loader integration (#13949)
- **Description:** Fixes the Mathpix PDF loader API integration.
Specifically, ensures that Mathpix auth headers are provided for every
request, and ensures that we recognize all errors that can occur during
a request. Also, the option to provide API keys as kwargs never actually
worked before, but now that's fixed too.
  - **Issue:** #11249
  - **Dependencies:** None
7 months ago
gzyJoy 32d4bb4590
Added Slacktoolkit (#14012)
- **Description:** 
This PR introduces the Slack toolkit to LangChain, which allows users to
read and write to Slack using the Slack API. Specifically, we've added
the following tools.
1. get_channel: Provides a summary of all the channels in a workspace.
2. get_message: Gets the message history of a channel.
3. send_message: Sends a message to a channel.
4. schedule_message: Sends a message to a channel at a specific time and
date.

- **Issue:** This pull request addresses [Add Slack Toolkit
#11747](https://github.com/langchain-ai/langchain/issues/11747)
  - **Dependencies:** package`slack_sdk`
Note: For this toolkit to function you will need to add a Slack app to
your workspace. Additional info can be found
[here](https://slack.com/help/articles/202035138-Add-apps-to-your-Slack-workspace).

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: ArianneLavada <ariannelavada@gmail.com>
Co-authored-by: ArianneLavada <84357335+ArianneLavada@users.noreply.github.com>
Co-authored-by: ariannelavada@gmail.com <you@example.com>
7 months ago
Richie 99e5ee6a84
fix(vectorstores): incorrect import for mongodb atlas DriverInfo (#14060)
- **Description:** fix `import` issue for `mongodb atlas` vectore store
integration
  - **Issue:** none
  - **Dependencies:** none

while trying to follow official `langchain`'s [mongodb integration
guide](https://python.langchain.com/docs/integrations/vectorstores/mongodb_atlas),
an import error will happen.

It's caused by incorrect import location:
- `from pymongo import DriverInfo` should be `from pymongo.driver_info
import DriverInfo`
- reference: [pymongo's DriverInfo
class](https://pymongo.readthedocs.io/en/stable/api/pymongo/driver_info.html#pymongo.driver_info.DriverInfo)

Thanks!
7 months ago
James Braza 052e23be3e
Added Python `logging` tracer (#14190)
This PR creates a logging handler and adds a simple unit test of it

Supercedes https://github.com/langchain-ai/langchain/pull/12862

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
7 months ago
Bob Lin 62505043be
Closed #14069 (#14166)
### Description

Fix #14069

### Twitter handle

[lin_bob57617](https://twitter.com/lin_bob57617)
7 months ago
Yong woo Song 9938086df0
Fix Html2TextTransformer for shallow copy (#14197)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
Hi,
There is some unintended behavior in Html2TextTransformer.
The current code is **directly modifying the original documents that are
passed as arguments to the function.**
Therefore, not only the return of the function but also the input
variables are being modified simultaneously.
**To resolve this, I added unit test code as well.**

reference link: [Shallow vs Deep Copying of Python
Objects](https://realpython.com/copying-python-objects/)

Thanks! ☺️
7 months ago
h3l 818252b1f8
Fix: (issue #14127) Volc Engine MaaS import error (#14194)
- **Description:** fix Volc Engine MaaS import error
- **Issue:** [the issue # it fixes (if
applicable),](https://github.com/langchain-ai/langchain/issues/14127)
  - **Dependencies:** None
  - **Tag maintainer:** @baskaryan 
  - **Twitter handle:**

Co-authored-by: lvzhong <lvzhong@bytedance.com>
7 months ago
Bagatur 0bdb434383
langchain[patch]: Release langchain 0.0.345 (#14184) 7 months ago
Harutaka Kawamura 41ee3be95f
langchain[patch]: Support passing parameters to `llms.Databricks` and `llms.Mlflow` (#14100)
Before, we need to use `params` to pass extra parameters:

```python
from langchain.llms import Databricks

Databricks(..., params={"temperature": 0.0})
```

Now, we can directly specify extra params:

```python
from langchain.llms import Databricks

Databricks(..., temperature=0.0)
```
7 months ago
Samuel Kemp fd781c89cc
langchain[minor]: add azure ai data document loader (#13404)
This PR adds an "Azure AI data" document loader, which allows Azure AI
users to load their registered data assets as a document object in
langchain.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
James Braza 24385a00de
core[minor], langchain[patch], experimental[patch]: Added missing `py.typed` to `langchain_core` (#14143)
See PR title.

From what I can see, `poetry` will auto-include this. Please let me know
if I am missing something here.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
quantum00549 f7c257553d
langchain[patch]: fixed a bug that was causing the streaming transfer to not work… (#10827)
… properly

Fixed a bug that was causing the streaming transfer to not work
properly.
 - **Description: 
1、The on_llm_new_token method in the streaming callback can now be
called properly in streaming transfer mode.
2、In streaming transfer mode, LLM can now correctly output the complete
response instead of just the first token.
- **Tag maintainer: @wangxuqi 
- **Twitter handle: @kGX7XJjuYxzX9Km

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Eugene Yurtsev 6d0209e0aa
Improve file system blob loader and generic loader (#14004)
* Add support for passing a specific file to the file system blob loader
* Allow specifying a class parameter for the parser for the generic
loader

```python

class AudioLoader(GenericLoader):
  @staticmethod
  def get_parser(**kwargs):
     return MyAudioParser(**kwargs):
```

The intent of the GenericLoader is to provide on-ramps from different
sources (e.g., web, s3, file system).

An alternative is to use pipelining syntax or creating a Pipeline

```
FileSystemBlobLoader(...) | MyAudioParser
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Amyh102 b6d26d3f9f
infra[patch]: Add unit tests for Huggingface dataset loader (#14053)
- **Description:** Add unit tests for huggingface dataset loader and
sample huggingface dataset for future tests. Updates dependencies for
`datasets` module.
- Adds coverage for [previous pull
request](https://github.com/langchain-ai/langchain/pull/13864)
  - **Tag maintainer:** @hwchase17

---------

Co-authored-by: Amy Han <amyhan@Amys-Air.lan>
Co-authored-by: Amy Han <amyhan@Amys-MacBook-Air.local>
Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Govinda Totla 62a3473ac0
docs[patch]: add text_splitter.py test (#14025)
Description: Add HTMLHeaderTextSplitter unit test
Dependencies: none
7 months ago
axiangcoding 1b36ddf16c
docs[patch]: add deprecated note for ErnieChatBot (#14061)
- **Description:** just a little change of ErnieChatBot class
description, sugguesting user to use more suitable class
  - **Issue:** none,
  - **Dependencies:** none,
  - **Tag maintainer:** @baskaryan ,
  - **Twitter handle:** none
7 months ago
Devin Dahoon Kim 32da0a4d71
langchain[patch]: use async_embed_with_retry in _aget_len_safe_embeddings (#14110)
**Description**

`embed_with_retry` is for sync operations and not for async operations.
Use `async_embed_with_retry` for appropriate async operations.


I'm using `OpenAIEmbedding(http_client=httpx.AsyncClient())` with only
async operations.
However, I got an error when I use `embedding.aembed_documents` because
`embed_with_retry` uses sync OpenAI client with async http client.
7 months ago
lijie 371bcb7580
langchain[patch]: set maxsplit when parse python function docstring (#14121)
Description

when the desc of arg in python docstring contains ":", the
`_parse_python_function_docstring` will raise **ValueError: too many
values to unpack (expected 2)**.

A sample desc would be:
"""
Args: 
    error_arg: this is an arg with an additional ":" symbol
"""

So, set `maxsplit` parameter to fix it.
7 months ago
Harrison Chase ae646701c4
Harrison/ibm (#14133)
Co-authored-by: Mateusz Szewczyk <139469471+MateuszOssGit@users.noreply.github.com>
7 months ago
Eugene Yurtsev 943aa01c14
Improve indexing performance for Postgres (remote database) for refresh for async API (#14132)
This PR speeds up the indexing api on the async path by batching the uid
updates in the sql record manager (which may be remote).
7 months ago
William FH 71c2e184b4
[Nits] Evaluation - Some Rendering Improvements (#14097)
- Improve rendering of aggregate results at the end
- flatten reference if present
7 months ago
Mark Scannell 9b0e46dcf0
Improve indexing performance for Postgres (remote database) for refresh (#14126)
**Description:** By combining the document timestamp refresh within a
single call to update(), this enables batching of multiple documents in
a single SQL statement. This is important for non-local databases where
tens of milliseconds has a huge impact on performance when doing
document-by-document SQL statements.
**Issue:** #11935 
**Dependencies:** None
**Tag maintainer:** @eyurtsev
7 months ago
Jacob Lee 3328507f11
langchain[patch], experimental[minor]: Adds OllamaFunctions wrapper (#13330)
CC @baskaryan @hwchase17 @jmorganca 

Having a bit of trouble importing `langchain_experimental` from a
notebook, will figure it out tomorrow

~Ah and also is blocked by #13226~

---------

Co-authored-by: Lance Martin <lance@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Bagatur 4063bf144a
langchain[patch]: release 0.0.344 (#14095) 7 months ago
Harutaka Kawamura 0d08a692a3
langchain[minor]: Migrate mlflow and databricks classes to deployments APIs. (#13699)
## Description

Related to https://github.com/mlflow/mlflow/pull/10420. MLflow AI
gateway will be deprecated and replaced by the `mlflow.deployments`
module. Happy to split this PR if it's too large.

```
pip install git+https://github.com/langchain-ai/langchain.git@refs/pull/13699/merge#subdirectory=libs/langchain
```

## Dependencies

Install mlflow from https://github.com/mlflow/mlflow/pull/10420:

```
pip install git+https://github.com/mlflow/mlflow.git@refs/pull/10420/merge
```

## Testing plan

The following code works fine on local and databricks:

<details><summary>Click</summary>
<p>

```python
"""
Setup
-----
mlflow deployments start-server --config-path examples/gateway/openai/config.yaml
databricks secrets create-scope <scope>
databricks secrets put-secret <scope> openai-api-key --string-value $OPENAI_API_KEY

Run
---
python /path/to/this/file.py secrets/<scope>/openai-api-key
"""
from langchain.chat_models import ChatMlflow, ChatDatabricks
from langchain.embeddings import MlflowEmbeddings, DatabricksEmbeddings
from langchain.llms import Databricks, Mlflow
from langchain.schema.messages import HumanMessage
from langchain.chains.loading import load_chain
from mlflow.deployments import get_deploy_client
import uuid
import sys
import tempfile
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

###############################
# MLflow
###############################
chat = ChatMlflow(
    target_uri="http://127.0.0.1:5000", endpoint="chat", params={"temperature": 0.1}
)
print(chat([HumanMessage(content="hello")]))

embeddings = MlflowEmbeddings(target_uri="http://127.0.0.1:5000", endpoint="embeddings")
print(embeddings.embed_query("hello")[:3])
print(embeddings.embed_documents(["hello", "world"])[0][:3])

llm = Mlflow(
    target_uri="http://127.0.0.1:5000",
    endpoint="completions",
    params={"temperature": 0.1},
)
print(llm("I am"))

llm_chain = LLMChain(
    llm=llm,
    prompt=PromptTemplate(
        input_variables=["adjective"],
        template="Tell me a {adjective} joke",
    ),
)
print(llm_chain.run(adjective="funny"))

# serialization/deserialization
with tempfile.TemporaryDirectory() as tmpdir:
    print(tmpdir)
    path = f"{tmpdir}/llm.yaml"
    llm_chain.save(path)
    loaded_chain = load_chain(path)
    print(loaded_chain("funny"))

###############################
# Databricks
###############################
secret = sys.argv[1]
client = get_deploy_client("databricks")

# External - chat
name = f"chat-{uuid.uuid4()}"
client.create_endpoint(
    name=name,
    config={
        "served_entities": [
            {
                "name": "test",
                "external_model": {
                    "name": "gpt-4",
                    "provider": "openai",
                    "task": "llm/v1/chat",
                    "openai_config": {
                        "openai_api_key": "{{" + secret + "}}",
                    },
                },
            }
        ],
    },
)
try:
    chat = ChatDatabricks(
        target_uri="databricks", endpoint=name, params={"temperature": 0.1}
    )
    print(chat([HumanMessage(content="hello")]))
finally:
    client.delete_endpoint(endpoint=name)

# External - embeddings
name = f"embeddings-{uuid.uuid4()}"
client.create_endpoint(
    name=name,
    config={
        "served_entities": [
            {
                "name": "test",
                "external_model": {
                    "name": "text-embedding-ada-002",
                    "provider": "openai",
                    "task": "llm/v1/embeddings",
                    "openai_config": {
                        "openai_api_key": "{{" + secret + "}}",
                    },
                },
            }
        ],
    },
)
try:
    embeddings = DatabricksEmbeddings(target_uri="databricks", endpoint=name)
    print(embeddings.embed_query("hello")[:3])
    print(embeddings.embed_documents(["hello", "world"])[0][:3])
finally:
    client.delete_endpoint(endpoint=name)

# External - completions
name = f"completions-{uuid.uuid4()}"
client.create_endpoint(
    name=name,
    config={
        "served_entities": [
            {
                "name": "test",
                "external_model": {
                    "name": "gpt-3.5-turbo-instruct",
                    "provider": "openai",
                    "task": "llm/v1/completions",
                    "openai_config": {
                        "openai_api_key": "{{" + secret + "}}",
                    },
                },
            }
        ],
    },
)
try:
    llm = Databricks(
        endpoint_name=name,
        model_kwargs={"temperature": 0.1},
    )
    print(llm("I am"))
finally:
    client.delete_endpoint(endpoint=name)


# Foundation model - chat
chat = ChatDatabricks(
    endpoint="databricks-llama-2-70b-chat", params={"temperature": 0.1}
)
print(chat([HumanMessage(content="hello")]))

# Foundation model - embeddings
embeddings = DatabricksEmbeddings(endpoint="databricks-bge-large-en")
print(embeddings.embed_query("hello")[:3])

# Foundation model - completions
llm = Databricks(
    endpoint_name="databricks-mpt-7b-instruct", model_kwargs={"temperature": 0.1}
)
print(llm("hello"))
llm_chain = LLMChain(
    llm=llm,
    prompt=PromptTemplate(
        input_variables=["adjective"],
        template="Tell me a {adjective} joke",
    ),
)
print(llm_chain.run(adjective="funny"))

# serialization/deserialization
with tempfile.TemporaryDirectory() as tmpdir:
    print(tmpdir)
    path = f"{tmpdir}/llm.yaml"
    llm_chain.save(path)
    loaded_chain = load_chain(path)
    print(loaded_chain("funny"))

```

Output:

```
content='Hello! How can I assist you today?'
[-0.025058426, -0.01938856, -0.027781019]
[-0.025058426, -0.01938856, -0.027781019]
sorry, but I cannot continue the sentence as it is incomplete. Can you please provide more information or context?
Sure, here's a classic one for you:

Why don't scientists trust atoms?

Because they make up everything!
/var/folders/dz/cd_nvlf14g9g__n3ph0d_0pm0000gp/T/tmpx_4no6ad
{'adjective': 'funny', 'text': "Sure, here's a classic one for you:\n\nWhy don't scientists trust atoms?\n\nBecause they make up everything!"}
content='Hello! How can I assist you today?'
[-0.025058426, -0.01938856, -0.027781019]
[-0.025058426, -0.01938856, -0.027781019]
 a 23 year old female and I am currently studying for my master's degree
content="\nHello! It's nice to meet you. Is there something I can help you with or would you like to chat for a bit?"
[0.051055908203125, 0.007221221923828125, 0.003879547119140625]
[0.051055908203125, 0.007221221923828125, 0.003879547119140625]

hello back
 Well, I don't really know many jokes, but I do know this funny story...
/var/folders/dz/cd_nvlf14g9g__n3ph0d_0pm0000gp/T/tmp7_ds72ex
{'adjective': 'funny', 'text': " Well, I don't really know many jokes, but I do know this funny story..."}
```

</p>
</details>

The existing workflow doesn't break:

<details><summary>click</summary>
<p>

```python
import uuid

import mlflow
from mlflow.models import ModelSignature
from mlflow.types.schema import ColSpec, Schema


class MyModel(mlflow.pyfunc.PythonModel):
    def predict(self, context, model_input):
        return str(uuid.uuid4())


with mlflow.start_run():
    mlflow.pyfunc.log_model(
        "model",
        python_model=MyModel(),
        pip_requirements=["mlflow==2.8.1", "cloudpickle<3"],
        signature=ModelSignature(
            inputs=Schema(
                [
                    ColSpec("string", "prompt"),
                    ColSpec("string", "stop"),
                ]
            ),
            outputs=Schema(
                [
                    ColSpec(name=None, type="string"),
                ]
            ),
        ),
        registered_model_name=f"lang-{uuid.uuid4()}",
    )

# Manually create a serving endpoint with the registered model and run
from langchain.llms import Databricks

llm = Databricks(endpoint_name="<name>")
llm("hello")  # 9d0b2491-3d13-487c-bc02-1287f06ecae7
```

</p>
</details> 

## Follow-up tasks

(This PR is too large. I'll file a separate one for follow-up tasks.)

- Update `docs/docs/integrations/providers/mlflow_ai_gateway.mdx` and
`docs/docs/integrations/providers/databricks.md`.

---------

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Jeremy Naccache a14cf87576
core[patch]: Add **kwargs to Langchain's dumps() to allow passing of json.dumps() … (#10628)
…parameters.

In Langchain's `dumps()` function, I've added a `**kwargs` parameter.
This allows users to pass additional parameters to the underlying
`json.dumps()` function, providing greater flexibility and control over
JSON serialization.

Many parameters available in `json.dumps()` can be useful or even
necessary in specific situations. For example, when using an Agent with
return_intermediate_steps set to true, the output is a list of
AgentAction objects. These objects can't be serialized without using
Langchain's `dumps()` function.

The issue arises when using the Agent with a language other than
English, which may contain non-ASCII characters like 'é'. The default
behavior of `json.dumps()` sets ensure_ascii to true, converting
`{"name": "José"}` into `{"name": "Jos\u00e9"}`. This can make the
output hard to read, especially in the case of intermediate steps in
agent logs.

By allowing users to pass additional parameters to `json.dumps()` via
Langchain's dumps(), we can solve this problem. For instance, users can
set `ensure_ascii=False` to maintain the original characters.

This update also enables users to pass other useful `json.dumps()`
parameters like `sort_keys`, providing even more flexibility.

The implementation takes into account edge cases where a user might pass
a "default" parameter, which is already defined by `dumps()`, or an
"indent" parameter, which is also predefined if `pretty=True` is set.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
7 months ago
Yong woo Song f4d520ccb5
Fix .env file path in integration_test README.md (#14028)
<!-- Thank you for contributing to LangChain!



Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

### Description
Hello, 

The [integration_test
README](https://github.com/langchain-ai/langchain/tree/master/libs/langchain/tests)
was indicating incorrect paths for the `.env.example` and `.env` files.

`tests/.env.example` ->`tests/integration_tests/.env.example`

While it’s a minor error, it could **potentially lead to confusion** for
the document’s readers, so I’ve made the necessary corrections.

Thank you! ☺️

### Related Issue
- https://github.com/langchain-ai/langchain/pull/2806
7 months ago
Rohan Dey 41a4c06a94
Added support for a Pandas DataFrame OutputParser (#13257)
**Description:**

Added support for a Pandas DataFrame OutputParser with format
instructions, along with unit tests and a demo notebook. Namely, we've
added the ability to request data from a DataFrame, have the LLM parse
the request, and then use that request to retrieve a well-formatted
response.

Within LangChain, it seamlessly integrates with language models like
OpenAI's `text-davinci-003`, facilitating streamlined interaction using
the format instructions (just like the other output parsers).

This parser structures its requests as
`<operation/column/row>[<optional_array_params>]`. The instructions
detail permissible operations, valid columns, and array formats,
ensuring clarity and adherence to the required format.

For example:

- When the LLM receives the input: "Retrieve the mean of `num_legs` from
rows 1 to 3."
- The provided format instructions guide the LLM to structure the
request as: "mean:num_legs[1..3]".

The parser processes this formatted request, leveraging the LLM's
understanding to extract the mean of `num_legs` from rows 1 to 3 within
the Pandas DataFrame.

This integration allows users to communicate requests naturally, with
the LLM transforming these instructions into structured commands
understood by the `PandasDataFrameOutputParser`. The format instructions
act as a bridge between natural language queries and precise DataFrame
operations, optimizing communication and data retrieval.

**Issue:**

- https://github.com/langchain-ai/langchain/issues/11532

**Dependencies:**

No additional dependencies :)

**Tag maintainer:**

@baskaryan 

**Twitter handle:**

No need. :)

---------

Co-authored-by: Wasee Alam <waseealam@protonmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
7 months ago
Masanori Taniguchi 235bdb9fa7
Support Vald secure connection (#13269)
**Description:** 
When using Vald, only insecure grpc connection was supported, so secure
connection is now supported.
In addition, grpc metadata can be added to Vald requests to enable
authentication with a token.

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
7 months ago
sudranga d1d693b2a7
Fix issue where response_if_no_docs_found is not implemented on async… (#13297)
Response_if_no_docs_found is not implemented in
ConversationalRetrievalChain for async code paths. Implemented it and
added test cases

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
7 months ago
AthulVincent 67c55cb5b0
Implemented MongoDB Atlas Self-Query Retriever (#13321)
# Description 
This PR implements Self-Query Retriever for MongoDB Atlas vector store.

I've implemented the comparators and operators that are supported by
MongoDB Atlas vector store according to the section titled "Atlas Vector
Search Pre-Filter" from
https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-stage/.

Namely:
```
allowed_comparators = [
      Comparator.EQ,
      Comparator.NE,
      Comparator.GT,
      Comparator.GTE,
      Comparator.LT,
      Comparator.LTE,
      Comparator.IN,
      Comparator.NIN,
  ]

"""Subset of allowed logical operators."""
allowed_operators = [
    Operator.AND,
    Operator.OR
]
```
Translations from comparators/operators to MongoDB Atlas filter
operators(you can find the syntax in the "Atlas Vector Search
Pre-Filter" section from the previous link) are done using the following
dictionary:
```
map_dict = {
            Operator.AND: "$and",
            Operator.OR: "$or",
            Comparator.EQ: "$eq",
            Comparator.NE: "$ne",
            Comparator.GTE: "$gte",
            Comparator.LTE: "$lte",
            Comparator.LT: "$lt",
            Comparator.GT: "$gt",
            Comparator.IN: "$in",
            Comparator.NIN: "$nin",
        }
```

In visit_structured_query() the filters are passed as "pre_filter" and
not "filter" as in the MongoDB link above since langchain's
implementation of MongoDB atlas vector
store(libs\langchain\langchain\vectorstores\mongodb_atlas.py) in
_similarity_search_with_score() sets the "filter" key to have the value
of the "pre_filter" argument.
```
params["filter"] = pre_filter
```
Test cases and documentation have also been added.

# Issue
#11616 

# Dependencies
No new dependencies have been added.

# Documentation
I have created the notebook mongodb_atlas_self_query.ipynb outlining the
steps to get the self-query mechanism working.

I worked closely with [@Farhan-Faisal](https://github.com/Farhan-Faisal)
on this PR.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Josef Zoller c2e3963da4
Merriam-Webster Dictionary Tool (#12044)
# Description

We implemented a simple tool for accessing the Merriam-Webster
Collegiate Dictionary API
(https://dictionaryapi.com/products/api-collegiate-dictionary).

Here's a simple usage example:

```py
from langchain.llms import OpenAI
from langchain.agents import load_tools, initialize_agent, AgentType

llm = OpenAI()
tools = load_tools(["serpapi", "merriam-webster"], llm=llm) # Serp API gives our agent access to Google
agent = initialize_agent(
  tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)
agent.run("What is the english word for the german word Himbeere? Define that word.")
```

Sample output:

```
> Entering new AgentExecutor chain...
 I need to find the english word for Himbeere and then get the definition of that word.
Action: Search
Action Input: "English word for Himbeere"
Observation: {'type': 'translation_result'}
Thought: Now I have the english word, I can look up the definition.
Action: MerriamWebster
Action Input: raspberry
Observation: Definitions of 'raspberry':

1. rasp-ber-ry, noun: any of various usually black or red edible berries that are aggregate fruits consisting of numerous small drupes on a fleshy receptacle and that are usually rounder and smaller than the closely related blackberries
2. rasp-ber-ry, noun: a perennial plant (genus Rubus) of the rose family that bears raspberries
3. rasp-ber-ry, noun: a sound of contempt made by protruding the tongue between the lips and expelling air forcibly to produce a vibration; broadly : an expression of disapproval or contempt
4. black raspberry, noun: a raspberry (Rubus occidentalis) of eastern North America that has a purplish-black fruit and is the source of several cultivated varieties —called also blackcap

Thought: I now know the final answer.
Final Answer: Raspberry is an english word for Himbeere and it is defined as any of various usually black or red edible berries that are aggregate fruits consisting of numerous small drupes on a fleshy receptacle and that are usually rounder and smaller than the closely related blackberries.

> Finished chain.
```

# Issue

This closes #12039.

# Dependencies

We added no extra dependencies.

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Lara <63805048+larkgz@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
7 months ago
Mohammad Mohtashim f3dd4a10cf
DROP BOX Loader Documentation Update (#14047)
- **Description:** Update the document for drop box loader + made the
messages more verbose when loading pdf file since people were getting
confused
  - **Issue:** #13952
  - **Tag maintainer:** @baskaryan, @eyurtsev, @hwchase17,

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
7 months ago
Cheng (William) Huang a00db4b28f
Add multi-input Reddit search tool (#13893)
- **Description:** Added a tool called RedditSearchRun and an
accompanying API wrapper, which searches Reddit for posts with support
for time filtering, post sorting, query string and subreddit filtering.
  - **Issue:** #13891 
  - **Dependencies:** `praw` module is used to search Reddit
- **Tag maintainer:** @baskaryan , and any of the other maintainers if
needed
  - **Twitter handle:** None.

  Hello,

This is our first PR and we hope that our changes will be helpful to the
community. We have run `make format`, `make lint` and `make test`
locally before submitting the PR. To our knowledge, our changes do not
introduce any new errors.

Our PR integrates the `praw` package which is already used by
RedditPostsLoader in LangChain. Nonetheless, we have added integration
tests and edited unit tests to test our changes. An example notebook is
also provided. These changes were put together by me, @Anika2000,
@CharlesXu123, and @Jeremy-Cheng-stack

Thank you in advance to the maintainers for their time.

---------

Co-authored-by: What-Is-A-Username <49571870+What-Is-A-Username@users.noreply.github.com>
Co-authored-by: Anika2000 <anika.sultana@mail.utoronto.ca>
Co-authored-by: Jeremy Cheng <81793294+Jeremy-Cheng-stack@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
7 months ago
Jawad Arshad 00a6e8962c
langchain[minor]: Add serpapi tools (#13934)
- **Description:** Added some of the more endpoints supported by serpapi
that are not suported on langchain at the moment, like google trends,
google finance, google jobs, and google lens
- **Issue:** [Add support for many of the querying endpoints with
serpapi #11811](https://github.com/langchain-ai/langchain/issues/11811)

---------

Co-authored-by: zushenglu <58179949+zushenglu@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Ian Xu <ian.xu@mail.utoronto.ca>
Co-authored-by: zushenglu <zushenglu1809@gmail.com>
Co-authored-by: KevinT928 <96837880+KevinT928@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
h3l dbaeb163aa
langchain[minor]: add volcengine endpoint as LLM (#13942)
- **Description:** Volc Engine MaaS serves as an enterprise-grade,
large-model service platform designed for developers. You can visit its
homepage at https://www.volcengine.com/docs/82379/1099455 for details.
This change will facilitate developers to integrate quickly with the
platform.
  - **Issue:** None
  - **Dependencies:** volcengine
  - **Tag maintainer:** @baskaryan 
  - **Twitter handle:** @he1v3tica

---------

Co-authored-by: lvzhong <lvzhong@bytedance.com>
7 months ago
Mohammad Ahmad 1600ebe6c7
langchain[patch]: Mask API key for ForeFrontAI LLM (#14013)
- **Description:** Mask API key for ForeFrontAI LLM and associated unit
tests
  - **Issue:** https://github.com/langchain-ai/langchain/issues/12165
  - **Dependencies:** N/A
  - **Tag maintainer:** @eyurtsev 
  - **Twitter handle:** `__mmahmad__`

I made the API key non-optional since linting required adding validation
for None, but the key is required per documentation:
https://python.langchain.com/docs/integrations/llms/forefrontai
7 months ago
yoch a0e859df51
langchain[patch]: fix cohere reranker init #12899 (#14029)
- **Description:** use post field validation for `CohereRerank`
  - **Issue:** #12899 and #13058
  - **Dependencies:** 
  - **Tag maintainer:** @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
123-fake-st 9bd6e9df36
update pdf document loaders' metadata source to url for online pdf (#13274)
- **Description:** Update 5 pdf document loaders in
`langchain.document_loaders.pdf`, to store a url in the metadata
(instead of a temporary, local file path) if the user provides a web
path to a pdf: `PyPDFium2Loader`, `PDFMinerLoader`,
`PDFMinerPDFasHTMLLoader`, `PyMuPDFLoader`, and `PDFPlumberLoader` were
updated.
- The updates follow the approach used to update `PyPDFLoader` for the
same behavior in #12092
- The `PyMuPDFLoader` changes required additional work in updating
`langchain.document_loaders.parsers.pdf.PyMuPDFParser` to be able to
process either an `io.BufferedReader` (from local pdf) or `io.BytesIO`
(from online pdf)
- The `PDFMinerPDFasHTMLLoader` change used a simpler approach since the
metadata is assigned by the loader and not the parser
  - **Issue:** Fixes #7034
  - **Dependencies:** None


```python
# PyPDFium2Loader example:
# old behavior
>>> from langchain.document_loaders import PyPDFium2Loader
>>> loader = PyPDFium2Loader('https://arxiv.org/pdf/1706.03762.pdf')
>>> docs = loader.load()
>>> docs[0].metadata
{'source': '/var/folders/7z/d5dt407n673drh1f5cm8spj40000gn/T/tmpm5oqa92f/tmp.pdf', 'page': 0}

# new behavior
>>> from langchain.document_loaders import PyPDFium2Loader
>>> loader = PyPDFium2Loader('https://arxiv.org/pdf/1706.03762.pdf')
>>> docs = loader.load()
>>> docs[0].metadata
{'source': 'https://arxiv.org/pdf/1706.03762.pdf', 'page': 0}
```
7 months ago
Toshish Jawale 6f64cb5078
Remove deprecated param and flexibility for prompt (#13310)
- **Description:** Updated to remove deprecated parameter penalty_alpha,
and use string variation of prompt rather than json object for better
flexibility. - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** N/A
  - **Tag maintainer:** @eyurtsev
  - **Twitter handle:** @symbldotai

---------

Co-authored-by: toshishjawale <toshish@symbl.ai>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
7 months ago
Tomaz Bratanic 3eb391561b
langchain[minor]: Reduce the number of tokens required to describe a Cypher/Neo4j schema (#13851)
Instead of using JSON-like syntax to describe node and relationship
properties we changed to a shorter and more concise schema description

Old:

```
        Node properties are the following:
        [{'properties': [{'property': 'name', 'type': 'STRING'}], 'labels': 'Movie'}, {'properties': [{'property': 'name', 'type': 'STRING'}], 'labels': 'Actor'}]
        Relationship properties are the following:
        []
        The relationships are the following:
        ['(:Actor)-[:ACTED_IN]->(:Movie)']
```

New:

```
Node properties are the following:
Movie {name: STRING},Actor {name: STRING}
Relationship properties are the following:

The relationships are the following:
(:Actor)-[:ACTED_IN]->(:Movie)
```
7 months ago
Sauhaard 7ec4dbeb80
langchain[minor]: Add StackExchange API integration (#14002)
Implements
[#12115](https://github.com/langchain-ai/langchain/issues/12115)

Who can review?
@baskaryan , @eyurtsev , @hwchase17 

Integrated Stack Exchange API into Langchain, enabling access to diverse
communities within the platform. This addition enhances Langchain's
capabilities by allowing users to query Stack Exchange for specialized
information and engage in discussions. The integration provides seamless
interaction with Stack Exchange content, offering content from varied
knowledge repositories.

A notebook example and test cases were included to demonstrate the
functionality and reliability of this integration.

- Add StackExchange as a tool.
- Add unit test for the StackExchange wrapper and tool.
- Add documentation for the StackExchange wrapper and tool.

If you have time, could you please review the code and provide any
feedback as necessary! My team is welcome to any suggestions.

---------

Co-authored-by: Yuval Kamani <yuvalkamani@gmail.com>
Co-authored-by: Aryan Thakur <aryanthakur@Aryans-MacBook-Pro.local>
Co-authored-by: Manas1818 <79381912+manas1818@users.noreply.github.com>
Co-authored-by: aryan-thakur <61063777+aryan-thakur@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Bagatur d4405bc94e
langchain[patch]: Release 0.0.343 (#14037) 7 months ago
Yves Zumbühl 9c0ad0cebb
langchain[patch]: Improve HyDe with custom prompts and ability to supply the run_manager (#14016)
- **Description:** The class allows to only select between a few
predefined prompts from the paper. That is not ideal, since other use
cases might need a custom prompt. The changes made allow for this. To be
able to monitor those, I also added functionality to supply a custom
run_manager.
  - **Issue:** no issue, but a new feature,
  - **Dependencies:** none,
  - **Tag maintainer:** @hwchase17,
  - **Twitter handle:** @yvesloy

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Chad Norvell 1c4bfb8c5f
langchain[patch]: Mathpix PDF loader supports arbitrary extra params (#13950)
- **Description:** Support providing whatever extra parameters you want
to the Mathpix PDF loader API request.
  - **Issue:** #12773
  - **Dependencies:** None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Unai Garay Maestre 9e2ae866c4
langchain[patch]: Adds progress bar to GooglePalmEmbeddings (#13812)
- **Description:** Adds a tqdm progress bar to GooglePalmEmbeddings when
embedding a list.
  - **Issue:** #13637
  - **Dependencies:** TQDM as a main dependency (instead of extra)


Signed-off-by: ugm2 <unaigaraymaestre@gmail.com>

---------

Signed-off-by: ugm2 <unaigaraymaestre@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
7 months ago
David Norman a578076aea
Mask api key for Together LLM (#13981)
- **Description:** Add unit tests and mask api key for Together LLM
- **Issue:** the issue
https://github.com/langchain-ai/langchain/issues/12165 ,
  - **Dependencies:** N/A
  - **Tag maintainer:** ?,
  - **Twitter handle:** N/A

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
7 months ago
Johnny 6463d2d0bd
small fix matching engine AttributeError - object has no attribute (#13763)
This PR is fixing an attributeError: object endpoint has no attribute
"_public_match_client" when using gcp matching engine with private VPC
network.

@baskaryan, @eyurtsev, @hwchase17.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
7 months ago
Amyh102 750485eaa8
Add object parsing functionality (#13864)
* **Description:** Parses huggingface dataset Sequence objects into
strings for Document loading.
* **Issue:** Fixes #10674 
* **Tag maintainter:** @baskaryan @eyurtsev

---------

Co-authored-by: Amy Han <amyhan@Amys-Air.lan>
Co-authored-by: Amy Han <amyhan@Amys-MacBook-Air.local>
7 months ago
ggeutzzang 981f78f920
Fix: (issue #13825) Getting an error with DallEAPIWrapper (#13874)
- **Description:** As of OpenAI's Python package 1.0, the existing
DallEAPIWrapper does not work correctly, so the example in the LangChain
Documentation link below does not work either.

https://python.langchain.com/docs/integrations/tools/dalle_image_generator
Also, since OpenAI only supports DALL-E version 2 or version 3, I
modified the DallEAPIWrapper to support it.

  - **Issue:** #13825 

  - **Twitter handle:** ggeutzzang
7 months ago
Kunal 74045bf5c0
max length attribute for spacy splitter for large docs (#13875)
For large size documents spacy splitter doesn't work it throws an error
as shown in below screenshot.
Reason its default max_length is 1000000 and there is no option to
increase it. So i added it in this PR.


![image](https://github.com/langchain-ai/langchain/assets/73680423/613625c3-0e21-4834-9aad-2a73cf56eecc)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Wang Wei fe9341a29c
feat: Add ERNIE-Bot-8K model support for ErnieBotChat. (#13716)
- **Description:** According to the document
https://cloud.baidu.com/doc/WENXINWORKSHOP/s/6lp69is2a, add ERNIE-Bot-8K
model support for ErnieBotChat.
- **Dependencies:** Before using the ERNIE-Bot-8K, you should have the
model's access authority.
7 months ago
Burak Ömür 0e462b72ef
Update openai/create_llm_result function to consider kwargs (#13815)
Replace this entire comment with:
- **Description:** updates `create_llm_result` function within
`openai.py` to consider latest `params`,
  - **Issue:** #8928
  - **Dependencies:** -,
  - **Tag maintainer:** -
  - **Twitter handle:** [burkomr](https://twitter.com/burkomr)

<!-- If no one reviews your PR within a few days, please @-mention one
of @baskaryan, @eyurtsev, @hwchase17. -->

---------

Co-authored-by: Burak Ömür <burakomur@retorio.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
7 months ago
chyroc f97ab84c6b
Merge pull request #13907
* feat: mask api_key for jina
7 months ago
nhywieza 9b86fb3fcb
secretStr for baichuan chat model api key (#13946)
Merge pull request #13946
* secretStr for baichuan chat model api key
7 months ago
卢靖轩 aff1dba252
Merge pull request #13945
* feat: mask api key for nlpcloud
7 months ago
Leonid Kuligin 85bb3a418c
Switched VertexAI models from preview (#13657)
Replace this entire comment with:
- **Description:** VertexAI models are now GA, moved away from using
preview ones from the SDK
  - **Issue:** #13606

---------

Co-authored-by: Nuno Campos <nuno@boringbits.io>
7 months ago
Bagatur 14799b139a
infra[patch]: add base deps and fix docs lint (#13998) 7 months ago
Théo LEBRUN 926d4cfda7
Set default region from boto3 session for Bedrock (#13694)
- **Description:** Set default region from boto3 session for Bedrock 
- **Issue:** #13683
7 months ago
Snow 1a33e5b500
Repair Wikipedia document loader `load_max_docs` and improve test coverage. (#13769)
**Description:** 

Repair Wikipedia document loader `load_max_docs` and improve test
coverage.

**Issue:** 

The Wikipedia document loader was not respecting the `load_max_docs`
paramater (not reported) and would always return a maximum of 10
documents. This is because the API wrapper (in `utilities/wikipedia.py`)
wasn't passing `top_k_results` to the underlying [Wikipedia
library](https://wikipedia.readthedocs.io/en/latest/code.html#module-wikipedia).
By default this library returns 10 results.

The default number of results for the document loader has been reduced
from 100 to 25. This is because loading 100 results takes a very long
time and is an inconvenient default. It should possibly be 10.

In addition, the documentation for the loader reported that there was a
hard limit (300) on the number of documents returned. In actuality 300
is the maximum Wikipedia query character length set by the API wrapper.

Tests have been added for the document loader (previously missing) and
to test the correct numbers of documents are being returned by each
class, both by default, and when overridden. Also repaired is the
`assert_docs` test which has been updated to correctly test for the
default metadata (which includes `source` in recent releases).

**Dependencies:** 
nil

**Tag maintainer:**
@leo-gan

**Twitter handle:**
@queenvictoria
7 months ago
Bob Lin 04c4878306
Remove `python_repl` from _BASE_TOOLS (#13962)
### **Description:**

Previously `python_repl` was a built-in tool, but now it has been moved
to `langchain_experimental`.

When I use `load_tools` I get an error:

```python
In [1]: from langchain.agents import load_tools

In [2]: load_tools(["python_repl"])
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
Cell In[2], line 1
----> 1 load_tools(["python_repl"])

File ~/workspace/langchain/libs/langchain/langchain/agents/load_tools.py:530, in load_tools(tool_names, llm, callbacks, **kwargs)
    528     tool_names.extend(requests_method_tools)
    529 elif name in _BASE_TOOLS:
--> 530     tools.append(_BASE_TOOLS[name]())
    531 elif name in _LLM_TOOLS:
    532     if llm is None:

File ~/workspace/langchain/libs/langchain/langchain/agents/load_tools.py:84, in _get_python_repl()
     83 def _get_python_repl() -> BaseTool:
---> 84     raise ImportError(
     85         "This tool has been moved to langchain experiment. "
     86         "This tool has access to a python REPL. "
     87         "For best practices make sure to sandbox this tool. "
     88         "Read https://github.com/langchain-ai/langchain/blob/master/SECURITY.md "
     89         "To keep using this code as is, install langchain experimental and "
     90         "update relevant imports replacing 'langchain' with 'langchain_experimental'"
     91     )

ImportError: This tool has been moved to langchain experiment. This tool has access to a python REPL. For best practices make sure to sandbox this tool. Read https://github.com/langchain-ai/langchain/blob/master/SECURITY.md To keep using this code as is, install langchain experimental and update relevant imports replacing 'langchain' with 'langchain_experimental'
```

In this case, it will be very confusing. I think it is no longer a
built-in tool now, so it can be removed from `_BASE_TOOLS`

### **Issue:** 

https://github.com/langchain-ai/langchain/issues/13858,
https://github.com/langchain-ai/langchain/issues/13859,
https://github.com/langchain-ai/langchain/issues/13856
### **Twitter handle:** 

[lin_bob57617](https://twitter.com/lin_bob57617)
7 months ago
Leonid Ganeline 52eee458bb
renamed `google_vertex_ai_vector_search` notebook (#13484)
The `integrations/vectorstores/matchingengine.ipynb` example has the
"Google Vertex AI Vector Search" title. This place this Title in the
wrong order in the ToC (it is sorted by the file name).
- Renamed `integrations/vectorstores/matchingengine.ipynb` into
`integrations/vectorstores/google_vertex_ai_vector_search.ipynb`.
- Updated a correspondent comment in docstring
- Rerouted old URL to a new URL

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
7 months ago
Taqi Jaffri 144710ad9a
langchain[minor]: Updated DocugamiLoader, includes breaking changes (#13265)
There are the following main changes in this PR:

1. Rewrite of the DocugamiLoader to not do any XML parsing of the DGML
format internally, and instead use the `dgml-utils` library we are
separately working on. This is a very lightweight dependency.
2. Added MMR search type as an option to multi-vector retriever, similar
to other retrievers. MMR is especially useful when using Docugami for
RAG since we deal with large sets of documents within which a few might
be duplicates and straight similarity based search doesn't give great
results in many cases.

We are @docugami on twitter, and I am @tjaffri

---------

Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
7 months ago
Bagatur d8fe987ef5
langchain[patch]: release 0.0.342 (#13992) 7 months ago
david qiu 9fb6805be4
langchain[minor]: Add retriever for Knowledge Bases for Amazon Bedrock (#13980)
- **Description:** Adds a retriever implementation for [Knowledge Bases
for Amazon Bedrock](https://aws.amazon.com/bedrock/knowledge-bases/), a
new service announced at AWS re:Invent, shortly before this PR was
opened. This depends on the `bedrock-agent-runtime` service, which will
be included in a future version of `boto3` and of `botocore`. We will
open a follow-up PR documenting the minimum required versions of `boto3`
and `botocore` after that information is available.
  - **Issue:** N/A
  - **Dependencies:** `boto3>=1.33.2, botocore>=1.33.2`
  - **Tag maintainer:** @baskaryan
  - **Twitter handles:** `@pjain7` `@dead_letter_q`

This PR includes a documentation notebook under
`docs/docs/integrations/retrievers`, which I (@dlqqq) have verified
independently.

EDIT: `bedrock-agent-runtime` service is now included in
`boto3>=1.33.2`:
5cf793f493

---------

Co-authored-by: Piyush Jain <piyushjain@duck.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Nuno Campos 970fe23feb
Fixes for opengpts release (#13960) 7 months ago
David Duong 947daaf833
Exclude Bedrock client and credentials_profile_name fields from serialisation (#13603) 7 months ago
Bagatur 48fbc5513d
infra[patch], langchain[patch]: fix test deps and upper bound langchain dep on core(#13984) 7 months ago
Stefano Lottini 1fd724293b
Astra DB vector store, move constructor docstring to class docstring (#13784)
This PR rearranges the docstring for the `AstraDB` vector store class so
as to have all useful information in the _class_ docstring for ease of
reading.

(incidentally, due to an oversight, the docstring that was in the
constructor ended up buried below some lines of code, thereby
disappearing altogether from accessibility. Apologies.)
7 months ago
Ali Orozgani 32d794f5a3
iMessage loader: implement message content extraction from attributed… (#13634)
- **Description:** We are adding functionality to extract message
content from the `attributedBody` field of the database, in case the
content is not in the `text` field.
  - **Issue:** Closes #13326 and #10680 
  - **Dependencies:** None.
  - **Tag maintainer:** @eyurtsev, @hwchase17

---------

Co-authored-by: onotate <johnp.pham@mail.utoronto.ca>
7 months ago
William FH e5256bcb69
[Evals] Add Project Tags (#13982)
Add them to project extra
7 months ago
Nuno Campos e0bcc98436
infra[patch]: Use langchain core in-tree as a dev dependency (#13957)
Using the published version means master is broken for contributors
whenever we make changes in one lib that depend on the other.
7 months ago
unifyh 2703a1b061
Fix `MarkdownHeaderTextSplitter` not recognizing tilde-fenced code blocks (#13511)
- **Description:** Previously `MarkdownHeaderTextSplitter` did not
consider tilde-fenced code blocks
(https://spec.commonmark.org/0.30/#fenced-code-blocks). This PR fixes
that.
   ````md
   # Bug caused by previous implementation:
   ~~~py
   foo()
   # This is a comment that would be considered header
   bar()
   ~~~
   ````
 - **Tag maintainer:** @baskaryan
7 months ago
Leonid Ganeline 7929b26017
office365 toolkit bug fixes (#13618)
Several bug fixes:
- emails: instead of `bcc` the `cc` is used.
- errors in the truncation descriptions
- no truncation of the `message_search`
Several updates:
- generalized UTC format 
- truncation limit can be changed now in _call()
7 months ago
William FH 60309341bd
Eval Error Key (#13974) 7 months ago
Nuno Campos 391f200eaa
Implement stream() and astream() for agents (#12783)
```
---- chunk 1
{'actions': [AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n  "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})])],
 'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n  "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]}
---- chunk 2
{'messages': [FunctionMessage(content="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”", name='Search')],
 'steps': [AgentStep(action=AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n  "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]), observation="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”")]}
---- chunk 3
{'actions': [AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n  "__arg1": "Vittoria Ceretti age"\n}'}})])],
 'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n  "__arg1": "Vittoria Ceretti age"\n}'}})]}
---- chunk 4
{'messages': [FunctionMessage(content='25 years', name='Search')],
 'steps': [AgentStep(action=AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n  "__arg1": "Vittoria Ceretti age"\n}'}})]), observation='25 years')]}
---- chunk 5
{'actions': [AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n  "__arg1": "25^0.43"\n}'}})])],
 'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n  "__arg1": "25^0.43"\n}'}})]}
---- chunk 6
{'messages': [FunctionMessage(content='Answer: 3.991298452658078', name='Calculator')],
 'steps': [AgentStep(action=AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n  "__arg1": "25^0.43"\n}'}})]), observation='Answer: 3.991298452658078')]}
---- chunk 7
{'messages': [AIMessage(content="Leonardo DiCaprio's current girlfriend is the Italian model Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 power is approximately 3.99.")],
 'output': "Leonardo DiCaprio's current girlfriend is the Italian model "
           'Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 '
           'power is approximately 3.99.'}
---- final
{'actions': [AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n  "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]),
             AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n  "__arg1": "Vittoria Ceretti age"\n}'}})]),
             AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n  "__arg1": "25^0.43"\n}'}})])],
 'messages': [AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n  "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}}),
              FunctionMessage(content="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”", name='Search'),
              AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n  "__arg1": "Vittoria Ceretti age"\n}'}}),
              FunctionMessage(content='25 years', name='Search'),
              AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n  "__arg1": "25^0.43"\n}'}}),
              FunctionMessage(content='Answer: 3.991298452658078', name='Calculator'),
              AIMessage(content="Leonardo DiCaprio's current girlfriend is the Italian model Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 power is approximately 3.99.")],
 'output': "Leonardo DiCaprio's current girlfriend is the Italian model "
           'Vittoria Ceretti, who is 25 years old. Her age raised to the 0.43 '
           'power is approximately 3.99.',
 'steps': [AgentStep(action=AgentActionMessageLog(tool='Search', tool_input="Leo DiCaprio's current girlfriend", log="\nInvoking: `Search` with `Leo DiCaprio's current girlfriend`\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n  "__arg1": "Leo DiCaprio\'s current girlfriend"\n}'}})]), observation="According to Us, the 48-year-old actor is now “exclusively” dating Italian model Vittoria Ceretti. A source told Us that DiCaprio is “completely smitten” with Ceretti, and their relationship is “going so well that Leo's actually being exclusive.”"),
           AgentStep(action=AgentActionMessageLog(tool='Search', tool_input='Vittoria Ceretti age', log='\nInvoking: `Search` with `Vittoria Ceretti age`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Search', 'arguments': '{\n  "__arg1": "Vittoria Ceretti age"\n}'}})]), observation='25 years'),
           AgentStep(action=AgentActionMessageLog(tool='Calculator', tool_input='25^0.43', log='\nInvoking: `Calculator` with `25^0.43`\n\n\n', message_log=[AIMessageChunk(content='', additional_kwargs={'function_call': {'name': 'Calculator', 'arguments': '{\n  "__arg1": "25^0.43"\n}'}})]), observation='Answer: 3.991298452658078')]}
```
7 months ago
Michael Feil 686162670e
langchain[minor]: Adding `infinity` embedding integration. (#13928)
This adds integation to https://github.com/michaelfeil/infinity. Users
requested it in https://github.com/michaelfeil/infinity/issues/36
@saatvikshah

Follows my implementation of gradient.ai.

Feedback 1: Well done - I love your CI / repo / poetry setup - I adapted
a lot in https://github.com/michaelfeil/infinity.
Feedback 2: Not so good: The openai integration contains to much reverse
engineering - in general projects such as michaelfeil/infinity and
huggingface/text-embeddings-inference are compatible to the `pip install
openai` package.

Reverse engineering like this one is really hindering the use for me:

8e88ba16a8/libs/langchain/langchain/embeddings/openai.py (L347)

8e88ba16a8/libs/langchain/langchain/embeddings/openai.py (L351)
- it is about preventing 3rd party providers to use the same url + uses
interfaces of openai, that are not publically documented.
7 months ago
Bagatur 10a6e7cbb6
langchain[patch], core[patch]: Make common utils public (#13932)
- rename `langchain_core.chat_models.base._generate_from_stream` -> `generate_from_stream`
- rename `langchain_core.chat_models.base._agenerate_from_stream` -> `agenerate_from_stream`
- export `langchain_core.utils.utils.build_extra_kwargs` from `langchain_core.utils`
7 months ago
Tyler Titsworth afcfa2a5e7
langchain[patch]: Add progress bar option to OllamaEmbeddings (#13882)
- **Description:** Adds a tqdm progress bar to OllamaEmbeddings when
embedding a list.
- **Issue:** Related to #13637, but extended to Ollama.
- **Dependencies:** `tqdm` made a necessary dependency.

Thanks to @ugm2 for helping identify a common problem. Embeddings take a
very long time to finish on local machines, and require a progress bar
to help identify if one should even attempt the workload.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
jeremyb-data cd77fba562
Improvement: Weaviate multitenant adddocs (#13827)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
- **Description:** Added a line to pass the tenant parameter to
add_data_object
  - **Issue:** An extra line added from the fix for #9956
  - **Dependencies:** n/a
  - **Tag maintainer:** @baskaryan 

Tested locally, works as expected with the line change.

---------

Co-authored-by: Simon Dai <simon6752@gmail.com>
7 months ago
jiangying 3e30cd8261
NIT: comment typo (#13817) 7 months ago
Assaf Toledo ba62ff89cc
BUGFIX: Support for elastic indices that don't return 'metadata' in '_source' (#13903)
Description: Some Elastic indexes do not return a 'metadata' field in
'_source'. However, prior to this PR, the code assumed there always is a
'metadata' field. This PR adds support for cases where the field is
missing by adding it manually.

Issue: #13869
7 months ago
Enric Soler Rastrollo c156d0281a
BUGFIX: Use embedding key in azure_cosmos_db index creation (#13919)
Description: Implement embedding key parametrisation
Issue: https://github.com/langchain-ai/langchain/issues/13918
Dependencies: None
Tag maintainer: @hwchase17 @izzymsft
Twitter handle:@MaddogoS
7 months ago
Bagatur ac67422a3d
IMPROVEMENT: import Document from core (#13905) 7 months ago
chyroc 886bc2d50a
IMPROVEMENT: fix qianfan validate_environment typo (#13908) 7 months ago
Chengzu Ou 4b8e053fe8
FEATURE: Add Databricks Vector Search as a new vector store (#13621)
**Description:**
This PR adds Databricks Vector Search as a new vector store in
LangChain.

- [x] Add `DatabricksVectorSearch` in `langchain/vectorstores/`
- [x] Unit tests
- [x] Add
[`databricks-vectorsearch`](https://pypi.org/project/databricks-vectorsearch/)
as a new optional dependency

We ran the following checks:
- `make format` passed  
- `make lint` failed but the failures were caused by other files
    + Files touched by this PR passed the linter  
- `make test` passed  
- `make coverage` failed but the failures were caused by other files.
Tests added by or related to this PR all passed
+ langchain/vectorstores/databricks_vector_search.py test coverage 94% 
- `make spell_check` passed  

The example notebook and updates to the [provider's documentation
page](https://github.com/langchain-ai/langchain/blob/master/docs/docs/integrations/providers/databricks.md)
will be added later in a separate PR.

**Dependencies:**
Optional dependency:
[`databricks-vectorsearch`](https://pypi.org/project/databricks-vectorsearch/)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Leonid Kuligin 25387db432
BUFIX: add support for various OSS images from Vertex Model Garden (#13917)
- **Description:** add support for various OSS images from Model
Garden
  - **Issue:** #13370
7 months ago
Bagatur 46b3311190
RELEASE: 0.0.341 (#13926) 7 months ago
Dylan Williams 1983a39894
FEATURE: Add OneNote document loader (#13841)
- **Description:** Added OneNote document loader
  - **Issue:** #12125
  - **Dependencies:** msal

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Tomaz Bratanic 1ad65f7a98
BUGFIX: Fix bugs with Cypher validation (#13849)
Fixes https://github.com/langchain-ai/langchain/issues/13803. Thanks to
@sakusaku-rich
7 months ago
Harrison Chase 6a35831128
BUGFIX: export more types (#13886)
Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Yusuf Khan 935f78c944
FEATURE: Add retriever for Outline (#13889)
- **Description:** Added a retriever for the Outline API to ask
questions on knowledge base
  - **Issue:** resolves #11814
  - **Dependencies:** None
  - **Tag maintainer:** @baskaryan
7 months ago
Bagatur 0efa59cbb8
RELEASE: 0.0.339rc3 (#13852) 7 months ago
raelix c172605ea6
IMPROVEMENT: Added title metadata to GoogleDriveLoader for optional File Loaders (#13832)
- **Description:** Simple change, I just added title metadata to
GoogleDriveLoader for optional File Loaders
  - **Dependencies:** no dependencies
  - **Tag maintainer:** @hwchase17

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Stefano Lottini 19c68c7652
FEATURE: Astra DB, LLM cache classes (exact-match and semantic cache) (#13834)
This PR provides idiomatic implementations for the exact-match and the
semantic LLM caches using Astra DB as backend through the database's
HTTP JSON API. These caches require the `astrapy` library as dependency.

Comes with integration tests and example usage in the `llm_cache.ipynb`
in the docs.

@baskaryan this is the Astra DB counterpart for the Cassandra classes
you merged some time ago, tagging you for your familiarity with the
topic. Thank you!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Stefano Lottini 272df9dcae
Astra DB, chat message history (#13836)
This PR adds a chat message history component that uses Astra DB for
persistence through the JSON API.
The `astrapy` package is required for this class to work.

I have added tests and a small notebook, and updated the relevant
references in the other docs pages.

(@rlancemartin this is the counterpart of the Cassandra equivalent class
you so helpfully reviewed back at the end of June)

Thank you!
7 months ago
Bagatur 58f7e109ac
BUGFIX: Add import types and typevars from core (#13829) 7 months ago
Bagatur 751226e067
bump 0.0.339rc2 (#13787) 7 months ago
Bagatur 72c108b003
IMPROVEMENT: filter global warnings properly (#13754) 7 months ago
Bagatur 0be515f720
RELEASE: 0.0.339rc1 (#13746) 7 months ago
Bagatur 32d087fcb8
REFACTOR: combine core documents files (#13733) 7 months ago
William FH 5b90fe5b1c
Fix locking (#13725) 7 months ago
Bagatur 16af282429
BUGFIX: add prompt imports for backwards compat (#13702) 7 months ago
dandanwei d47ee1ae79
BUGFIX: redis vector store overwrites falsey metadata (#13652)
- **Description:** This commit fixed the problem that Redis vector store
will change the value of a metadata from 0 to empty when saving the
document, which should be an un-intended behavior.
  - **Issue:** N/A
  - **Dependencies:** N/A
7 months ago
Bagatur a21e84faf7
BUGFIX: llm backwards compat imports (#13698) 7 months ago
Yujie Qian ace9e64d62
IMPROVEMENT: VoyageEmbeddings embed_general_texts (#13620)
- **Description:** add method embed_general_texts in VoyageEmebddings to
support input_type
  - **Issue:** 
  - **Dependencies:** 
  - **Tag maintainer:** 
  - **Twitter handle:** @Voyage_AI_
7 months ago
tanujtiwari-at 5064890fcf
BUGFIX: handle tool message type when converting to string (#13626)
**Description:** Currently, if we pass in a ToolMessage back to the
chain, it crashes with error

`Got unsupported message type: `

This fixes it. 

Tested locally

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Erick Friis c5ae9f832d
INFRA: Lint for imports (#13632)
- Adds pydantic/import linting to core
- Adds a check for `langchain_experimental` imports to langchain
7 months ago
Erick Friis 131db4ba68
BUGFIX: anthropic models on bedrock (#13629)
Introduced in #13403
7 months ago
David Ruan 04bddbaba4
BUGFIX: Update bedrock.py to fix provider bug (#13646)
Provider check was incorrectly failing for anything other than "meta"
7 months ago
Bagatur dc53523837
IMPROVEMENT: bump core dep 0.0.3 (#13690) 7 months ago
Bagatur a208abe6b7
add callback import test (#13689) 7 months ago
Bagatur 083afba697
BUG: Add core utils imports (#13688) 7 months ago
Bagatur c61e30632e
BUG: more core fixes (#13665)
Fix some circular deps:
- move PromptValue into top level module bc both PromptTemplates and
OutputParsers import
- move tracer context vars to `tracers.context` and import them in
functions in `callbacks.manager`
- add core import tests
7 months ago
William FH 59df16ab92
Update name (#13676) 7 months ago
Bagatur 11614700a4
bump 0.0.339rc0 (#13664) 7 months ago
Bagatur d32e511826
REFACTOR: Refactor langchain_core (#13627)
Changes:
- remove langchain_core/schema since no clear distinction b/n schema and
non-schema modules
- make every module that doesn't end in -y plural
- where easy have 1-2 classes per file
- no more than one level of nesting in directories
- only import from top level core modules in langchain
7 months ago
William FH 17c6551c18
Add error rate (#13568)
To the in-memory outputs. Separate it out from the outputs so it's
present in the dataframe.describe() results
7 months ago
Nuno Campos 8329f81072
Use pytest asyncio auto mode (#13643)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
7 months ago
Bagatur 99b4f46cbe
REFACTOR: Add core as dep (#13623) 7 months ago
Harrison Chase d82cbf5e76
Separate out langchain_core package (#13577)
Co-authored-by: Nuno Campos <nuno@boringbits.io>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
7 months ago
Bagatur e620347a83
RELEASE: bump 339 (#13613) 7 months ago
Ofer Mendelevitch 52e23e50b1
BUG: Fix search_kwargs in Vectara retriever (#13299)
- **Description:** fix a bug that prevented as_retriever() in Vectara to
use the desired input arguments
  - **Issue:** as_retriever did not pass the arguments properly
  - **Tag maintainer:** @baskaryan
  - **Twitter handle:** @ofermend
7 months ago
Holt Skinner 1c08dbfb33
IMPROVEMENT: Reduce post-processing time for `DocAIParser` (#13210)
- Remove `WrappedDocument` introduced in
https://github.com/langchain-ai/langchain/pull/11413
- https://github.com/googleapis/python-documentai-toolbox/issues/198 in
Document AI Toolbox to improve initialization time for `WrappedDocument`
object.

@lkuligin

@baskaryan

@hwchase17
7 months ago
Leonid Kuligin f3fcdea574
fixed an UnboundLocalError when no documents are found (#12995)
Replace this entire comment with:
  - **Description:** fixed a bug
  - **Issue:** the issue # #12780
7 months ago
Stijn Tratsaert b6f70d776b
VertexAI LLM count_tokens method requires list of prompts (#13451)
I encountered this during summarization with VertexAI. I was receiving
an INVALID_ARGUMENT error, as it was trying to send a list of about
17000 single characters.

The [count_tokens
method](https://github.com/googleapis/python-aiplatform/blob/main/vertexai/language_models/_language_models.py#L658)
made available by Google takes in a list of prompts. It does not fail
for small texts, but it does for longer documents because the argument
list will be exceeding Googles allowed limit. Enforcing the list type
makes it work successfully.

This change will cast the input text to count to a list of that single
text so that the input format is always correct.

[Twitter](https://www.x.com/stijn_tratsaert)
7 months ago
Wang Wei fe7b40cb2a
feat: add ERNIE-Bot-4 Function Calling (#13320)
- **Description:** ERNIE-Bot-Chat-4 Large Language Model adds the
ability of `Function Calling` by passing parameters through the
`functions` parameter in the request. To simplify function calling for
ERNIE-Bot-Chat-4, the `create_ernie_fn_chain()` function has been added.
The definition and usage of the `create_ernie_fn_chain()` function is
similar to that of the `create_openai_fn_chain()` function.

Examples as the follows:

```
import json

from langchain.chains.ernie_functions import (
    create_ernie_fn_chain,
)
from langchain.chat_models import ErnieBotChat
from langchain.prompts import ChatPromptTemplate

def get_current_news(location: str) -> str:
    """Get the current news based on the location.'

    Args:
        location (str): The location to query.
    
    Returs:
        str: Current news based on the location.
    """

    news_info = {
        "location": location,
        "news": [
            "I have a Book.",
            "It's a nice day, today."
        ]
    }

    return json.dumps(news_info)

def get_current_weather(location: str, unit: str="celsius") -> str:
    """Get the current weather in a given location

    Args:
        location (str): location of the weather.
        unit (str): unit of the tempuature.
    
    Returns:
        str: weather in the given location.
    """

    weather_info = {
        "location": location,
        "temperature": "27",
        "unit": unit,
        "forecast": ["sunny", "windy"],
    }
    return json.dumps(weather_info)

llm = ErnieBotChat(model_name="ERNIE-Bot-4")
prompt = ChatPromptTemplate.from_messages(
    [
        ("human", "{query}"),
    ]
)

chain = create_ernie_fn_chain([get_current_weather, get_current_news], llm, prompt, verbose=True)
res = chain.run("北京今天的新闻是什么?")
print(res)
```

The running results of the above program are shown below:
```
> Entering new LLMChain chain...
Prompt after formatting:
Human: 北京今天的新闻是什么?



> Finished chain.
{'name': 'get_current_news', 'thoughts': '用户想要知道北京今天的新闻。我可以使用get_current_news工具来获取这些信息。', 'arguments': {'location': '北京'}}
```
7 months ago
Adilkhan Sarsen 10418ab0c1
DeepLake Backwards compatibility fix (#13388)
- **Description:** during search with DeepLake some people are facing
backwards compatibility issues, this PR fixes it by making search
accessible for the older datasets

---------

Co-authored-by: adolkhan <adilkhan.sarsen@alumni.nu.edu.kz>
7 months ago
Tyler Hutcherson 190952fe76
IMPROVEMENT: Minor redis improvements (#13381)
- **Description:**
- Fixes a `key_prefix` bug where passing it in on
`Redis.from_existing(...)` did not work properly. Updates doc strings
accordingly.
- Updates Redis filter classes logic with best practices on typing,
string formatting, and handling "empty" filters.
- Fixes a bug that would prevent multiple tag filters from being applied
together in some scenarios.
- Added a whole new filter unit testing module. Also updated code
formatting for a number of modules that were failing the `make`
commands.
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Tag maintainer:** @baskaryan 
  - **Twitter handle:** @tchutch94
7 months ago
Sergey Kozlov df03267edf
Fix tool arguments formatting in StructuredChatAgent (#10480)
In the `FORMAT_INSTRUCTIONS` template, 4 curly braces (escaping) are
used to get single curly brace after formatting:

```
"{{{ ... }}}}" -> format_instructions.format() ->  "{{ ... }}" -> template.format() -> "{ ... }".
```

Tool's `args_schema` string contains single braces `{ ... }`, and is
also transformed to `{{{{ ... }}}}` form. But this is not really correct
since there is only one `format()` call:

```
"{{{{ ... }}}}" -> template.format() -> "{{ ... }}".
```

As a result we get double curly braces in the prompt:
````
Respond to the human as helpfully and accurately as possible. You have access to the following tools:

foo: Test tool FOO, args: {{'tool_input': {{'type': 'string'}}}}    # <--- !!!
...
Provide only ONE action per $JSON_BLOB, as shown:

```
{
  "action": $TOOL_NAME,
  "action_input": $INPUT
}
```
````

This PR fixes curly braces escaping in the `args_schema` to have single
braces in the final prompt:
````
Respond to the human as helpfully and accurately as possible. You have access to the following tools:

foo: Test tool FOO, args: {'tool_input': {'type': 'string'}}    # <--- !!!
...
Provide only ONE action per $JSON_BLOB, as shown:

```
{
  "action": $TOOL_NAME,
  "action_input": $INPUT
}
```
````

---------

Co-authored-by: Sergey Kozlov <sergey.kozlov@ludditelabs.io>
7 months ago
Wouter Durnez ef7802b325
Add llama2-13b-chat-v1 support to `chat_models.BedrockChat` (#13403)
Hi 👋 We are working with Llama2 on Bedrock, and would like to add it to
Langchain. We saw a [pull
request](https://github.com/langchain-ai/langchain/pull/13322) to add it
to the `llm.Bedrock` class, but since it concerns a chat model, we would
like to add it to `BedrockChat` as well.

- **Description:** Add support for Llama2 to `BedrockChat` in
`chat_models`
- **Issue:** the issue # it fixes (if applicable)
[#13316](https://github.com/langchain-ai/langchain/issues/13316)
  - **Dependencies:** any dependencies required for this change `None`
  - **Tag maintainer:** /
  - **Twitter handle:** `@SimonBockaert @WouterDurnez`

---------

Co-authored-by: wouter.durnez <wouter.durnez@showpad.com>
Co-authored-by: Simon Bockaert <simon.bockaert@showpad.com>
7 months ago
jwbeck97 a93616e972
FEAT: Add azure cognitive health tool (#13448)
- **Description:** This change adds an agent to the Azure Cognitive
Services toolkit for identifying healthcare entities
  - **Dependencies:** azure-ai-textanalytics (Optional)

---------

Co-authored-by: James Beck <James.Beck@sa.gov.au>
Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago