Commit Graph

5010 Commits (199e64d37288330a80f9aeb19e00dff23af71d4c)

Author SHA1 Message Date
Eugene Yurtsev 8f7cc73817
ci: Add script to check for pickle usage in community (#22863)
Add script to check for pickle usage in community.
3 months ago
Eugene Yurtsev 77209f315e
community[patch]: FAISS VectorStore deserializer should be opt-in (#22861)
FAISS deserializer uses pickle module. Users have to opt-in to
de-serialize.
3 months ago
Eugene Yurtsev ce0b0f22a1
experimental[major]: Force users to opt-in into code that relies on the python repl (#22860)
This should make it obvious that a few of the agents in langchain
experimental rely on the python REPL as a tool under the hood, and will
force users to opt-in.
3 months ago
Isaac Francisco 869523ad72
[docs]: added info for TavilySearchResults (#22765) 3 months ago
ccurme 42257b120f
partners: fix numpy dep (#22858)
Following https://github.com/langchain-ai/langchain/pull/22813, which
added python 3.12 to CI, here we update numpy accordingly in partner
packages.
3 months ago
Isaac Francisco 345fd3a556
minor functionality change: adding API functionality to tavilysearch (#22761) 3 months ago
Isaac Francisco 034257e9bf
docs: improved recursive url loader docs (#22648)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
3 months ago
ccurme b626c3ca23
groq[patch]: add usage_metadata to (a)invoke and (a)stream (#22834) 3 months ago
James Braza 45b394268c
core[patch]: allowing latest `packaging` versions (#22792)
Allowing version 24 of https://github.com/pypa/packaging

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
3 months ago
Karim Lalani 276be6cdd4
[experimental][llms][OllamaFunctions] tool calling related fixes (#22339)
Fixes issues with tool calling to handle tool objects correctly. Added
support to handle ToolMessage correctly.
Added additional checks for error conditions.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
3 months ago
Christophe Bornet d04e899b56
ci: add testing with Python 3.12 (#22813)
We need to use a different version of numpy for py3.8 and py3.12 in
pyproject.
And so do projects that use that Python version range and import
langchain.

    - **Twitter handle:** _cbornet
3 months ago
HyoJin Kang b6bf2bb234
community[patch]: fix database uri type in SQLDatabase (#22661)
**Description**
sqlalchemy uses "sqlalchemy.engine.URL" type for db uri argument.
Added 'URL' type for compatibility.

**Issue**: None

**Dependencies:** None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
3 months ago
Eugene Yurtsev 5dbbdcbf8e
core[patch]: Update remaining root_validators (#22829)
This PR updates the remaining root_validators in core to either be explicit pre-init or post-init validators.
3 months ago
Eugene Yurtsev 265e650e64
community[patch]: Update root_validators embeddings: llamacpp, jina, dashscope, mosaicml, huggingface_hub, Toolkits: Connery, ChatModels: PAI_EAS, (#22828)
This PR updates root validators for:

* Embeddings: llamacpp, jina, dashscope, mosaicml, huggingface_hub
* Toolkits: Connery
* ChatModels: PAI_EAS

Following this issue:
https://github.com/langchain-ai/langchain/issues/22819
3 months ago
JonZeolla 32ba8cfab0
community[minor]: implement huggingface show_progress consistently (#22682)
- **Description:** This implements `show_progress` more consistently
(i.e. it is also added to the `HuggingFaceBgeEmbeddings` object).
- **Issue:** This implements `show_progress` more consistently in the
embeddings huggingface classes. Previously this could have been set via
`encode_kwargs`.
 - **Dependencies:** None
 - **Twitter handle:** @jonzeolla
3 months ago
Eugene Yurtsev 74e705250f
core[patch]: update some root_validators (#22787)
Update some of the @root_validators to be explicit pre=True or
pre=False, skip_on_failure=True for pydantic 2 compatibility.
3 months ago
mrhbj a1268d9e9a
community[patch]: fix hunyuan message include chinese signature error (#22795) (#22796)
… (#22795)

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
3 months ago
Mr. Lance E Sloan «UMich» 08c466c603
community[patch]: bugfix for `YoutubeLoader`'s `LINES` format (#22815)
- **Description:** A change I submitted recently introduced a bug in
`YoutubeLoader`'s `LINES` output format. In those conditions, curly
braces ("`{}`") creates a set, not a dictionary. This bugfix explicitly
specifies that a dictionary is created.
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter:** lsloan_umich
- **Mastodon:**
[lsloan@mastodon.social](https://mastodon.social/@lsloan)
3 months ago
Philippe PRADOS 23c22fcbc9
langchain[minor]: Make EmbeddingsFilters async (#22737)
Add native async implementation for EmbeddingsFilter
3 months ago
ccurme 936aedd10c
mistral[patch]: add usage_metadata to (a)invoke and (a)stream (#22781) 3 months ago
mrhbj 9212c9fcb8
community[patch]: fix hunyuan client json analysis (#22452) (#22767)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
3 months ago
Rohan Aggarwal 86e8224cf1
community[patch]: Support for old clients (Thin and Thick) Oracle Vector Store (#22766)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
Support for old clients (Thin and Thick) Oracle Vector Store


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
Support for old clients (Thin and Thick) Oracle Vector Store

- [ ] **Add tests and docs**: If you're adding a new integration, please
include
Have our own local tests

---------

Co-authored-by: rohan.aggarwal@oracle.com <rohaagga@phoenix95642.dev3sub2phx.databasede3phx.oraclevcn.com>
3 months ago
Mr. Lance E Sloan «UMich» 84dc2dd059
community[patch]: Load YouTube transcripts (captions) as fixed-duration chunks with start times (#21710)
- **Description:** Add a new format, `CHUNKS`, to
`langchain_community.document_loaders.youtube.YoutubeLoader` which
creates multiple `Document` objects from YouTube video transcripts
(captions), each of a fixed duration. The metadata of each chunk
`Document` includes the start time of each one and a URL to that time in
the video on the YouTube website.
  
I had implemented this for UMich (@umich-its-ai) in a local module, but
it makes sense to contribute this to LangChain community for all to
benefit and to simplify maintenance.

- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter:** lsloan_umich
- **Mastodon:**
[lsloan@mastodon.social](https://mastodon.social/@lsloan)

With regards to **tests and documentation**, most existing features of
the `YoutubeLoader` class are not tested. Only the
`YoutubeLoader.extract_video_id()` static method had a test. However,
while I was waiting for this PR to be reviewed and merged, I had time to
add a test for the chunking feature I've proposed in this PR.

I have added an example of using chunking to the
`docs/docs/integrations/document_loaders/youtube_transcript.ipynb`
notebook.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
3 months ago
Aayush Kataria 71811e0547
community[minor]: Adds a vector store for Azure Cosmos DB for NoSQL (#21676)
This PR add supports for Azure Cosmos DB for NoSQL vector store.

Summary:

Description: added vector store integration for Azure Cosmos DB for
NoSQL Vector Store,
Dependencies: azure-cosmos dependency,
Tag maintainer: @hwchase17, @baskaryan @efriis @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
3 months ago
Mohammad Mohtashim 36cad5d25c
[Community]: Added Metadata filter support for DocumentDB Vector Store (#22777)
- **Description:** As pointed out in this issue #22770, DocumentDB
`similarity_search` does not support filtering through metadata which
this PR adds by passing in the parameter `filter`. Also this PR fixes a
minor Documentation error.
- **Issue:** #22770

---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
3 months ago
Dmitry Stepanov 912751e268
Ollama vision support (#22734)
**Description:** Ollama vision with messages in OpenAI-style support `{
"image_url": { "url": ... } }`
**Issue:** #22460 

Added flexible solution for ChatOllama to support chat messages with
images. Works when you provide either `image_url` as a string or as a
dict with "url" inside (like OpenAI does). So it makes available to use
tuples with `ChatPromptTemplate.from_messages()`

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
3 months ago
Philippe PRADOS 0908b01cb2
langchain[minor]: Add native async implementation to LLMFilter, add concurrency to both sync and async paths (#22739)
Thank you for contributing to LangChain!

- [ ] **PR title**: "langchain: Fix chain_filter.py to be compatible
with async"


- [ ] **PR message**: 
    - **Description:** chain_filter is not compatible with async.
    - **Twitter handle:** pprados


- [X ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Signed-off-by: zhangwangda <zhangwangda94@163.com>
Co-authored-by: Prakul <discover.prakul@gmail.com>
Co-authored-by: Lei Zhang <zhanglei@apache.org>
Co-authored-by: Gin <ictgtvt@gmail.com>
Co-authored-by: wangda <38549158+daziz@users.noreply.github.com>
Co-authored-by: Max Mulatz <klappradla@posteo.net>
3 months ago
Jaeyeon Kim(김재연) ce4e29ae42
community[minor]: fix redis store docstring and streamline initialization code (#22730)
Thank you for contributing to LangChain!

### Description

Fix the example in the docstring of redis store.
Change the initilization logic and remove redundant check, enhance error
message.

### Issue

The example in docstring of how to use redis store was wrong.

![image](https://github.com/langchain-ai/langchain/assets/37469330/78c5d9ce-ee66-45b3-8dfe-ea29f125e6e9)

### Dependencies
Nothing



- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
3 months ago
am-kinetica ad101adec8
community[patch]: Kinetica Integrations handled error in querying; quotes in table names; updated gpudb API (#22724)
- [ ] **Miscellaneous updates and fixes**: 
- **Description:** Handled error in querying; quotes in table names;
updated gpudb API
- **Issue:** Threw an error with an error message difficult to
understand if a query failed or returned no records
    - **Dependencies:** Updated GPUDB API version to `7.2.0.9`


@baskaryan @hwchase17
3 months ago
Mathis Joffre ea43f40daf
community[minor]: Add support for OVHcloud AI Endpoints Embedding (#22667)
**Description:** Add support for [OVHcloud AI
Endpoints](https://endpoints.ai.cloud.ovh.net/) Embedding models.

Inspired by:
https://gist.github.com/gmasse/e1f99339e161f4830df6be5d0095349a

Signed-off-by: Joffref <mariusjoffre@gmail.com>
4 months ago
Erick Friis 2aaf86ddae
core: fix mustache falsy cases (#22747) 4 months ago
Eugene Yurtsev 5a7eac191a
core[patch]: Add missing type annotations (#22756)
Add missing type annotations.

The missing type annotations will raise exceptions with pydantic 2.
4 months ago
Eugene Yurtsev 05d31a2f00
community[patch]: Add missing type annotations (#22758)
Add missing type annotations to objects in community.
These missing type annotations will raise type errors in pydantic 2.
4 months ago
Naka Masato 3237909221
langchain[patch]: allow to use partial variables in create_sql_query_chain (#22688)
- **Description:** allow to use partial variables to pass `top_k` and
`table_info`
- **Issue:** no
- **Dependencies:** no
- **Twitter handle:** @gymnstcs

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Bharat Ramanathan 2b5631a6be
community[patch]: fix `WandbTracer` to work with new "RunV2" API (#22673)
- **Description:** This PR updates the `WandbTracer` to work with the
new RunV2 API so that wandb Traces logging works correctly for new
LangChain versions. Here's an example
[run](https://wandb.ai/parambharat/langchain-tracing/runs/wpm99ftq) from
the existing tests
- **Issue:** https://github.com/wandb/wandb/issues/7762
- **Twitter handle:** @ParamBharat

_If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17._
4 months ago
Oguz Vuruskaner f0f4532579
community[patch]: fix deepinfra inference (#22680)
This PR includes:

1. Update of default model to LLama3.
2. Handle some 400x errors with more user friendly error messages.
3. Handle user errors.
4 months ago
Lucas Tucker cb79e80b0b
docs: standardize ChatHuggingFace (#22693)
**Updated ChatHuggingFace doc string as per issue #22296**:
"langchain_huggingface: updated docstring for ChatHuggingFace in
langchain_huggingface to match that of the description (in the appendix)
provided in issue #22296. "

**Issue:** This PR is in response to issue #22296, and more specifically
ChatHuggingFace model. In particular, this PR updates the docstring for
langchain/libs/partners/hugging_face/langchain_huggingface/chat_models/huggingface.py
by adding the following sections: Instantiate, Invoke, Stream, Async,
Tool calling, and Response metadata. I used the template from the
Anthropic implementation and referenced the Appendix of the original
issue post. I also noted that: langchain_community hugging face llms do
not work with langchain_huggingface's ChatHuggingFace model (at least
for me); the .stream(messages) functionality of ChatHuggingFace only
returned a block of response.

---------

Co-authored-by: lucast2021 <lucast2021@headroyce.org>
Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Tomaz Bratanic 76a193decc
community[patch]: Add function response to graph cypher qa chain (#22690)
LLMs struggle with Graph RAG, because it's different from vector RAG in
a way that you don't provide the whole context, only the answer and the
LLM has to believe. However, that doesn't really work a lot of the time.
However, if you wrap the context as function response the accuracy is
much better.

btw... `union[LLMChain, Runnable]` is linting fun, that's why so many
ignores
4 months ago
X-HAN 34edfe4a16
community[minor]: add Volcengine Rerank (#22700)
**Description:** this PR adds Volcengine Rerank capability to Langchain,
you can find Volcengine Rerank API from
[here](https://www.volcengine.com/docs/84313/1254474) &
[here](https://www.volcengine.com/docs/84313/1254605).
[Volcengine](https://www.volcengine.com/) is a cloud service platform
developed by ByteDance, the parent company of TikTok. You can obtain
Volcengine API AK/SK from
[here](https://www.volcengine.com/docs/84313/1254553).

**Dependencies:** VolcengineRerank depends on `volcengine` python
package.

**Twitter handle:** my twitter/x account is https://x.com/LastMonopoly
and I'd like a mention, thank you!


**Tests and docs**
  1. integration test: `test_volcengine_rerank.py`
  2. example notebook: `volcengine_rerank.ipynb`

**Lint and test**: I have run `make format`, `make lint` and `make test`
from the root of the package I've modified.
4 months ago
Mohammad Mohtashim c3cce98d86
community[patch]: Small Fix in OutlookMessageLoader (Close the Message once Open) (#22744)
- **Description:** A very small fix where we close the message when it
opened
- **Issue:** #22729
4 months ago
ccurme f9fdca6cc2
openai: add `parallel_tool_calls` to api ref (#22746)
![Screenshot 2024-06-10 at 1 41 24
PM](https://github.com/langchain-ai/langchain/assets/26529506/2626bf9c-41c6-4431-b2e1-f59de1e4e468)
4 months ago
Max Mulatz 058a64c563
Community[minor]: Add language parser for Elixir (#22742)
Hi 👋 

First off, thanks a ton for your work on this 💚 Really appreciate what
you're providing here for the community.

## Description

This PR adds a basic language parser for the
[Elixir](https://elixir-lang.org/) programming language. The parser code
is based upon the approach outlined in
https://github.com/langchain-ai/langchain/pull/13318: it's using
`tree-sitter` under the hood and aligns with all the other `tree-sitter`
based parses added that PR.

The `CHUNK_QUERY` I'm using here is probably not the most sophisticated
one, but it worked for my application. It's a starting point to provide
"core" parsing support for Elixir in LangChain. It enables people to use
the language parser out in real world applications which may then lead
to further tweaking of the queries. I consider this PR just the ground
work.

- **Dependencies:** requires `tree-sitter` and `tree-sitter-languages`
from the extended dependencies
- **Twitter handle:**`@bitcrowd`

## Checklist

- [x] **PR title**: "package: description"
- [x] **Add tests and docs**
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified.

<!-- If no one reviews your PR within a few days, please @-mention one
of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. -->
4 months ago
Philippe PRADOS 2d4689d721
langchain[minor]: Add pgvector to list of supported vectorstores in self query retriever (#22678)
The fact that we outsourced pgvector to another project has an
unintended effect. The mapping dictionary found by
`_get_builtin_translator()` cannot recognize the new version of pgvector
because it comes from another package.
`SelfQueryRetriever` no longer knows `PGVector`.

I propose to fix this by creating a global dictionary that can be
populated by various database implementations. Thus, importing
`langchain_postgres` will allow the registration of the `PGvector`
mapping.

But for the moment I'm just adding a lazy import

Furthermore, the implementation of _get_builtin_translator()
reconstructs the BUILTIN_TRANSLATORS variable with each invocation,
which is not very efficient. A global map would be an optimization.

- **Twitter handle:** pprados

@eyurtsev, can you review this PR? And unlock the PR [Add async mode for
pgvector](https://github.com/langchain-ai/langchain-postgres/pull/32)
and PR [community[minor]: Add SQL storage
implementation](https://github.com/langchain-ai/langchain/pull/22207)?

Are you in favour of a global dictionary-based implementation of
Translator?
4 months ago
Enzo Poggio 8f019e91d7
community[patch]: Use Custom Logger Instead of Root Logger in get_user_agent Function (#22691)
## Description
This PR addresses a logging inconsistency in the `get_user_agent`
function. Previously, the function was using the root logger to log a
warning message when the "USER_AGENT" environment variable was not set.
This bypassed the custom logger `log` that was created at the start of
the module, leading to potential inconsistencies in logging behavior.

Changes:
- Replaced `logging.warning` with `log.warning` in the `get_user_agent`
function to ensure that the custom logger is used.

This change ensures that all logging in the `get_user_agent` function
respects the configurations of the custom logger, leading to more
consistent and predictable logging behavior.

## Dependencies

None

## Issue 

None

## Tests and docs

☝🏻 see description


## `make format`, `make lint` & `cd libs/community; make test`

```shell
> make format 
poetry run ruff format docs templates cookbook
1417 files left unchanged
poetry run ruff check --select I --fix docs templates cookbook
All checks passed!
```

```shell
> make lint
poetry run ruff check docs templates cookbook
All checks passed!
poetry run ruff format docs templates cookbook --diff
1417 files already formatted
poetry run ruff check --select I docs templates cookbook
All checks passed!
git grep 'from langchain import' docs/docs templates cookbook | grep -vE 'from langchain import (hub)' && exit 1 || exit 0
```

~cd libs/community; make test~ too much dependencies for integration ...

```shell
>  poetry run pytest tests/unit_tests   
....
==== 884 passed, 466 skipped, 4447 warnings in 15.93s ====
```

I choose you randomly : @ccurme
4 months ago
Philippe PRADOS 9aabb446c5
community[minor]: Add SQL storage implementation (#22207)
Hello @eyurtsev

- package: langchain-comminity
- **Description**: Add SQL implementation for docstore. A new
implementation, in line with my other PR ([async
PGVector](https://github.com/langchain-ai/langchain-postgres/pull/32),
[SQLChatMessageMemory](https://github.com/langchain-ai/langchain/pull/22065))
- Twitter handler: pprados

---------

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Piotr Mardziel <piotrm@gmail.com>
Co-authored-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
4 months ago
Nithish Raghunandanan f2f0e0e13d
couchbase: Add the initial version of Couchbase partner package (#22087)
Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
4 months ago
Cahid Arda Öz 6c07eb0c12
community[minor]: Add UpstashRatelimitHandler (#21885)
Adding `UpstashRatelimitHandler` callback for rate limiting based on
number of chain invocations or LLM token usage.

For more details, see [upstash/ratelimit-py
repository](https://github.com/upstash/ratelimit-py) or the notebook
guide included in this PR.

Twitter handle: @cahidarda

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
4 months ago
Erick Friis 9e03864d64
core: add error message for non-structured llm to StructuredPrompt (#22684)
previously was the blank `NotImplementedError` from
`BaseLanguageModel.with_structured_output`
4 months ago
ccurme f32d57f6f0
anthropic: refactor streaming to use events api; add streaming usage metadata (#22628)
- Refactor streaming to use raw events;
- Add `stream_usage` class attribute and kwarg to stream methods that,
if True, will include separate chunks in the stream containing usage
metadata.

There are two ways to implement streaming with anthropic's python sdk.
They have slight differences in how they surface usage metadata.
1. [Use helper
functions](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-helpers).
This is what we are doing now.
```python
count = 1
with client.messages.stream(**params) as stream:
    for text in stream.text_stream:
        snapshot = stream.current_message_snapshot
        print(f"{count}: {snapshot.usage} -- {text}")
        count = count + 1

final_snapshot = stream.get_final_message()
print(f"{count}: {final_snapshot.usage}")
```
```
1: Usage(input_tokens=8, output_tokens=1) -- Hello
2: Usage(input_tokens=8, output_tokens=1) -- !
3: Usage(input_tokens=8, output_tokens=1) --  How
4: Usage(input_tokens=8, output_tokens=1) --  can
5: Usage(input_tokens=8, output_tokens=1) --  I
6: Usage(input_tokens=8, output_tokens=1) --  assist
7: Usage(input_tokens=8, output_tokens=1) --  you
8: Usage(input_tokens=8, output_tokens=1) --  today
9: Usage(input_tokens=8, output_tokens=1) -- ?
10: Usage(input_tokens=8, output_tokens=12)
```
To do this correctly, we need to emit a new chunk at the end of the
stream containing the usage metadata.

2. [Handle raw
events](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-responses)
```python
stream = client.messages.create(**params, stream=True)
count = 1
for event in stream:
    print(f"{count}: {event}")
    count = count + 1
```
```
1: RawMessageStartEvent(message=Message(id='msg_01Vdyov2kADZTXqSKkfNJXcS', content=[], model='claude-3-haiku-20240307', role='assistant', stop_reason=None, stop_sequence=None, type='message', usage=Usage(input_tokens=8, output_tokens=1)), type='message_start')
2: RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start')
3: RawContentBlockDeltaEvent(delta=TextDelta(text='Hello', type='text_delta'), index=0, type='content_block_delta')
4: RawContentBlockDeltaEvent(delta=TextDelta(text='!', type='text_delta'), index=0, type='content_block_delta')
5: RawContentBlockDeltaEvent(delta=TextDelta(text=' How', type='text_delta'), index=0, type='content_block_delta')
6: RawContentBlockDeltaEvent(delta=TextDelta(text=' can', type='text_delta'), index=0, type='content_block_delta')
7: RawContentBlockDeltaEvent(delta=TextDelta(text=' I', type='text_delta'), index=0, type='content_block_delta')
8: RawContentBlockDeltaEvent(delta=TextDelta(text=' assist', type='text_delta'), index=0, type='content_block_delta')
9: RawContentBlockDeltaEvent(delta=TextDelta(text=' you', type='text_delta'), index=0, type='content_block_delta')
10: RawContentBlockDeltaEvent(delta=TextDelta(text=' today', type='text_delta'), index=0, type='content_block_delta')
11: RawContentBlockDeltaEvent(delta=TextDelta(text='?', type='text_delta'), index=0, type='content_block_delta')
12: RawContentBlockStopEvent(index=0, type='content_block_stop')
13: RawMessageDeltaEvent(delta=Delta(stop_reason='end_turn', stop_sequence=None), type='message_delta', usage=MessageDeltaUsage(output_tokens=12))
14: RawMessageStopEvent(type='message_stop')
```

Here we implement the second option, in part because it should make
things easier when implementing streaming tool calls in the near future.

This would add two new chunks to the stream-- one at the beginning and
one at the end-- with blank content and containing usage metadata. We
add kwargs to the stream methods and a class attribute allowing for this
behavior to be toggled. I enabled it by default. If we merge this we can
add the same kwargs / attribute to OpenAI.

Usage:
```python
from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(
    model="claude-3-haiku-20240307",
    temperature=0
)

full = None
for chunk in model.stream("hi"):
    full = chunk if full is None else full + chunk
    print(chunk)

print(f"\nFull: {full}")
```
```
content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 0, 'total_tokens': 8}
content='Hello' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='!' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' How' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' can' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' I' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' assist' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' you' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' today' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='?' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 0, 'output_tokens': 12, 'total_tokens': 12}

Full: content='Hello! How can I assist you today?' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 12, 'total_tokens': 20}
```
4 months ago
Bagatur 235d91940d
community[patch]: Release 0.2.4 (#22643) 4 months ago
William FH be79ce9336
[Core] Unified Enable/Disable Tracing (#22576) 4 months ago
Bagatur fe2e5a3b74
langchain[patch]: Release 0.2.3 (#22644) 4 months ago
Erick Friis a24a9c6427
multiple: get rid of pyproject extras (#22581)
They cause `poetry lock` to take a ton of time, and `uv pip install` can
resolve the constraints from these toml files in trivial time
(addressing problem with #19153)

This allows us to properly upgrade lockfile dependencies moving forward,
which revealed some issues that were either fixed or type-ignored (see
file comments)
4 months ago
Bagatur 4367e89c9a
core[patch]: Release 0.2.5 (#22642) 4 months ago
Eugene Yurtsev 28f744c1f5
core[patch]: Correctly order parent ids in astream events (from root to immediate parent), add defensive check for cycles (#22637)
This PR makes two changes:

1. Fixes the order of parent IDs to be from root to immediate parent
2. Adds a simple defensive check for cycles
4 months ago
Eugene Yurtsev 035a9c9609
core[minor]: Add parent_ids to astream_events API (#22563)
Include a list of parent ids for each event in astream events.
4 months ago
Nicolas Nkiere 51005e2776
core[minor]: Add an async root listener and with_alisteners method (#22151)
- [x] **Adding AsyncRootListener**: "langchain_core: Adding
AsyncRootListener"

- **Description:** Adding an AsyncBaseTracer, AsyncRootListener and
`with_alistener` function. This is to enable binding async root listener
to runnables. This currently only supported for sync listeners.
- **Issue:** None
- **Dependencies:** None

- [x] **Add tests and docs**: Added units tests and example snippet code
within the function description of `with_alistener`


- [x] **Lint and test**: Run make format_diff, make lint_diff and make
test
4 months ago
seyf97 2904c50cd5
openai[patch]: correct grammar in exception message in embeddings/base.py (#22629)
Correct the grammar error for missing transformers package ValueError
4 months ago
Anush 80560419b0
qdrant[patch]: Make path optional in from_existing_collection() (#21875)
## Description

The `path` param is used to specify the local persistence directory,
which isn't required if using Qdrant server.

This is a breaking but necessary change.
4 months ago
ccurme b57aa89f34
multiple: implement ls_params (#22621)
implement ls_params for ai21, fireworks, groq.
4 months ago
Xiangrui Meng f26ab93df8
community: support Databricks Unity Catalog functions as LangChain tools (#22555)
This PR adds support for using Databricks Unity Catalog functions as
LangChain tools, which runs inside a Databricks SQL warehouse.

* An example notebook is provided.
4 months ago
ccurme c1ef731503
anthropic: update attribute name and alias (#22625)
update name to `stop_sequences` and alias to `stop` (instead of the
other way around), since `stop_sequences` is the name used by anthropic.
4 months ago
lucasiscovici 05bf98b2f9
community[patch]: pgvector replace nin_ by not_in (#22619)
- [ ] **community**: "pgvector: replace nin_ by not_in"

- [ ] **PR message**: nin_ do not exist in sqlalchemy orm, it's not_in
4 months ago
ccurme 3999761201
multiple: add `stop` attribute (#22573) 4 months ago
ccurme e08879147b
Revert "anthropic: stream token usage" (#22624)
Reverts langchain-ai/langchain#20180
4 months ago
Bagatur 0d495f3f63
anthropic: stream token usage (#20180)
open to other ideas
<img width="1181" alt="Screenshot 2024-04-08 at 5 34 08 PM"
src="https://github.com/langchain-ai/langchain/assets/22008038/03eb11c4-5eb5-43e3-9109-a13f76098fa4">

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
4 months ago
Satyam Kumar 17b486a37b
openai, azure: update model_name in ChatResult to use name from API response (#22569)
The response.get("model", self.model_name) checks if the model key
exists in the response dictionary. If it does, it uses that value;
otherwise, it uses self.model_name.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Christophe Bornet 12ddb4fc6f
core[patch]: Use explicit classes for InMemoryByteStore and InMemoryStore (#22608)
The current implementation doesn't work well with type checking.
Instead replace with class definition that correctly works with type
checking.
4 months ago
andyjessen cfed68e06f
docs: Fix description (#22611)
This commit fixes the description of the hair_color field.
4 months ago
ccurme 1925bde32e
together: bump langchain-core (#22616)
langchain-together depends on langchain-openai ^0.1.8
langchain-openai 0.1.8 has langchain-core >= 0.2.2

Here we bump langchain-core to 0.2.2, just to pass minimum dependency
version tests.
4 months ago
ccurme 35f4aa927b
together[patch]: Release 0.1.3 (#22615) 4 months ago
andyjessen 8b40428f58
docs: Fix typo (#22603)
This commit changes minor typo in the field description.
4 months ago
Isaac Francisco ba3e219d83
community[patch]: recursive url loader fix and unit tests (#22521)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Jeffrey Mak 5fc5ed463c
community[patch]:Support filter for AzureAISearchRetriever (#22303)
**Description**: 
The AzureAISearchRetriever does not support the "$filter" argument
offered in the AISearch API:
https://learn.microsoft.com/en-us/rest/api/searchservice/documents/search-get?view=rest-searchservice-2023-11-01&tabs=HTTP
The $filter allows filtering of indexes based on values in metadata.

**Issue**: 
https://github.com/langchain-ai/langchain/issues/19885

**Dependencies**: 
No

**Twitter handle**: 
@Jeffreym9M
 

- [ ] **Add tests and docs**: Not relevant


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
4 months ago
Isaac Francisco 148088a588
docs: duckduckgosearch options listed (#22568)
Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
X-HAN 62f13f95e4
community[minor]: add DashScope Rerank (#22403)
**Description:** this PR adds DashScope Rerank capability to Langchain,
you can find DashScope Rerank API from
[here](https://help.aliyun.com/document_detail/2780058.html?spm=a2c4g.2780059.0.0.6d995024FlrJ12)
&
[here](https://help.aliyun.com/document_detail/2780059.html?spm=a2c4g.2780058.0.0.63f75024cr11N9).
[DashScope](https://dashscope.aliyun.com/) is the generative AI service
from Alibaba Cloud (Aliyun). You can create DashScope API key from
[here](https://bailian.console.aliyun.com/?apiKey=1#/api-key).

**Dependencies:** DashScopeRerank depends on `dashscope` python package.

**Twitter handle:** my twitter/x account is https://x.com/LastMonopoly
and I'd like a mention, thanks you!


**Tests and docs**
  1. integration test: `test_dashscope_rerank.py`
  2. example notebook: `dashscope_rerank.ipynb`

**Lint and test**: I have run `make format`, `make lint` and `make test`
from the root of the package I've modified.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
Ethan Yang 29064848f9
[Community]add option to delete the prompt from HF output (#22225)
This will help to solve pattern mismatching issue when parsing the
output in Agent.

https://github.com/langchain-ai/langchain/issues/21912
4 months ago
Bagatur 584a1e30ac
community[patch]: AzureSearch async functions (#22075) 4 months ago
Bagatur 1a911018bc
langchain[minor]: add universal init_model (#22039)
decisions to discuss
- only chat models
- model_provider isn't based on any existing values like llm-type,
package names, class names
- implemented as function not as a wrapper ChatModel
- function name (init_model)
- in langchain as opposed to community or core
- marked beta
4 months ago
ccurme af129974a3
community: update how OpenAIAssistantV2Runnable creates threads with tool_resources (#22549)
https://github.com/langchain-ai/langchain/issues/22503
4 months ago
Bagatur 51a0d4574e
community[patch]: Release 0.2.3 (#22562) 4 months ago
Bagatur b2daba37c7
nomic[patch]: Release 0.1.2 (#22561) 4 months ago
Zach Nussbaum 14f3014cce
embeddings: nomic embed vision (#22482)
Thank you for contributing to LangChain!

**Description:** Adds Langchain support for Nomic Embed Vision
**Twitter handle:** nomic_ai,zach_nussbaum


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Lance Martin <122662504+rlancemartin@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
leila-messallem 3280a5b49b
community[patch]: improve test setup to accurately test filtering of labels in neo4j (#22531)
**Description:** This PR addresses an issue with an existing test that
was not effectively testing the intended functionality. The previous
test setup did not adequately validate the filtering of the labels in
neo4j, because the nodes and relationship in the test data did not have
any properties set. Without properties these labels would not have been
returned, regardless of the filtering.

---------

Co-authored-by: Oskar Hane <oh@oskarhane.com>
4 months ago
Mohammad Mohtashim 7fcef2556c
[Experimental]: Async agenerate method ollama functions (#21682)
- **Description:** :
Added Async method for Generate for OllamaFunctions which was missing
and was raising errors for the users.
   
- **Issue:** 
#21422
4 months ago
Stefano Lottini 328d0c99f2
community[minor]: Add support for metadata indexing policy in Cassandra vector store (#22548)
This PR adds a constructor `metadata_indexing` parameter to the
Cassandra vector store to allow optional fine-tuning of which fields of
the metadata are to be indexed.

This is a feature supported by the underlying CassIO library. Indexing
mode of "all", "none" or deny- and allow-list based choices are
available.

The rationale is, in some cases it's advisable to programmatically
exclude some portions of the metadata from the index if one knows in
advance they won't ever be used at search-time. this keeps the index
more lightweight and performant and avoids limitations on the length of
_indexed_ strings.

I added a integration test of the feature. I also added the possibility
of running the integration test with Cassandra on an arbitrary IP
address (e.g. Dockerized), via
`CASSANDRA_CONTACT_POINTS=10.1.1.5,10.1.1.6 poetry run pytest [...]` or
similar.

While I was at it, I added a line to the `.gitignore` since the mypy
_test_ cache was not ignored yet.

My X (Twitter) handle: @rsprrs.
4 months ago
Emilien Chauvet c3d4126eb1
community[minor]: add user agent for web scraping loaders (#22480)
**Description:** This PR adds a `USER_AGENT` env variable that is to be
used for web scraping. It creates a util to get that user agent and uses
it in the classes used for scraping in [this piece of
doc](https://python.langchain.com/v0.1/docs/use_cases/web_scraping/).
Identifying your scraper is considered a good politeness practice, this
PR aims at easing it.
**Issue:** `None`
**Dependencies:** `None`
**Twitter handle:** `None`
4 months ago
Philippe PRADOS 8250c177de
community[minor]: Add native async support to SQLChatMessageHistory (#22065)
# package community: Fix SQLChatMessageHistory

## Description
Here is a rewrite of `SQLChatMessageHistory` to properly implement the
asynchronous approach. The code circumvents [issue
22021](https://github.com/langchain-ai/langchain/issues/22021) by
accepting a synchronous call to `def add_messages()` in an asynchronous
scenario. This bypasses the bug.

For the same reasons as in [PR
22](https://github.com/langchain-ai/langchain-postgres/pull/32) of
`langchain-postgres`, we use a lazy strategy for table creation. Indeed,
the promise of the constructor cannot be fulfilled without this. It is
not possible to invoke a synchronous call in a constructor. We
compensate for this by waiting for the next asynchronous method call to
create the table.

The goal of the `PostgresChatMessageHistory` class (in
`langchain-postgres`) is, among other things, to be able to recycle
database connections. The implementation of the class is problematic, as
we have demonstrated in [issue
22021](https://github.com/langchain-ai/langchain/issues/22021).

Our new implementation of `SQLChatMessageHistory` achieves this by using
a singleton of type (`Async`)`Engine` for the database connection. The
connection pool is managed by this singleton, and the code is then
reentrant.

We also accept the type `str` (optionally complemented by `async_mode`.
I know you don't like this much, but it's the only way to allow an
asynchronous connection string).

In order to unify the different classes handling database connections,
we have renamed `connection_string` to `connection`, and `Session` to
`session_maker`.

Now, a single transaction is used to add a list of messages. Thus, a
crash during this write operation will not leave the database in an
unstable state with a partially added message list. This makes the code
resilient.

We believe that the `PostgresChatMessageHistory` class is no longer
necessary and can be replaced by:
```
PostgresChatMessageHistory = SQLChatMessageHistory
```
This also fixes the bug.


## Issue
- [issue 22021](https://github.com/langchain-ai/langchain/issues/22021)
  - Bug in _exit_history()
  - Bugs in PostgresChatMessageHistory and sync usage
  - Bugs in PostgresChatMessageHistory and async usage
- [issue
36](https://github.com/langchain-ai/langchain-postgres/issues/36)
 ## Twitter handle:
pprados

## Tests
- libs/community/tests/unit_tests/chat_message_histories/test_sql.py
(add async test)

@baskaryan, @eyurtsev or @hwchase17 can you check this PR ?
And, I've been waiting a long time for validation from other PRs. Can
you take a look?
- [PR 32](https://github.com/langchain-ai/langchain-postgres/pull/32)
- [PR 15575](https://github.com/langchain-ai/langchain/pull/15575)
- [PR 13200](https://github.com/langchain-ai/langchain/pull/13200)

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
4 months ago
Vincent Min 59bef31997
community[minor]: Improve InMemoryVectorStore with ability to persist to disk and filter on metadata. (#22186)
- **Description:** The InMemoryVectorStore is a nice and simple vector
store implementation for quick development and debugging. The current
implementation is quite limited in its functionalities. This PR extends
the functionalities by adding utility function to persist the vector
store to a json file and to load it from a json file. We choose the json
file format because it allows inspection of the database contents in a
text editor, which is great for debugging. Furthermore, it adds a
`filter` keyword that can be used to filter out documents on their
`page_content` or `metadata`.
- **Issue:** -
- **Dependencies:** -
- **Twitter handle:** @Vincent_Min
4 months ago
Christophe Bornet c34ad8c163
core[patch]: Improve VectorStore API doc (#22547) 4 months ago
maang-h 89128b7a49
community[patch]: add detailed paragraph and example for BaichuanTextEmbeddings (#22031)
- **Description:** add detailed paragraph and example for
BaichuanTextEmbeddings
   - **Issue:** the issue #21983
4 months ago
Anthony Bernabeu 4e676a63b8
community[minor]: Added filter search for LanceDB (#22461)
- [ ] **community**: "vectorstore: added filtering support for LanceDB
vector store"

- [ ] **This PR adds filtering capabilities to LanceDB**:
- **Description:** In LanceDB filtering can be applied when searching
for data into the vectorstore. It is using the SQL language as mentioned
in the LanceDB documentation.
    - **Issue:** #18235 
    - **Dependencies:** No

- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
4 months ago
Erick Friis 4050d6ea2b
huggingface: remove text-generation dep (#22543) 4 months ago
Erick Friis a6fc74f379
ai21: fix core version (#22544) 4 months ago
Asaf Joseph Gardin 75cba742e5
ai21: fix ai21 unittests (#22526)
Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
4 months ago
Erick Friis 58192d617f
community: fix huggingface deprecations (#22522) 4 months ago
Christophe Bornet 8ba868d3b0
core[patch]: Add similarity_score_threshold to VectorStore search types (#22477) 4 months ago
Eugene Yurtsev 9120cf5df2
core[patch]: Deduplicate of callback handlers in merge_configs (#22478)
This PR adds deduplication of callback handlers in merge_configs.

Fix for this issue:
https://github.com/langchain-ai/langchain/issues/22227

The issue appears when the code is:

1) running python >=3.11
2) invokes a runnable from within a runnable
3) binds the callbacks to the child runnable from the parent runnable
using with_config

In this case, the same callbacks end up appearing twice: (1) the first
time from with_config, (2) the second time with langchain automatically
propagating them on behalf of the user.


Prior to this PR this will emit duplicate events:

```python
@tool
async def get_items(question: str, callbacks: Callbacks):  # <--- Accept callbacks
    """Ask question"""
    template = ChatPromptTemplate.from_messages(
        [
            (
                "human",
                "'{question}"
            )
        ]
    )
    chain = template | chat_model.with_config(
        {
            "callbacks": callbacks,  # <-- Propagate callbacks
        }
    )
    return await chain.ainvoke({"question": question})
```

Prior to this PR this will work work correctly (no duplicate events):

```python
@tool
async def get_items(question: str, callbacks: Callbacks):  # <--- Accept callbacks
    """Ask question"""
    template = ChatPromptTemplate.from_messages(
        [
            (
                "human",
                "'{question}"
            )
        ]
    )
    chain = template | chat_model
    return await chain.ainvoke({"question": question}, {"callbacks": callbacks})
```

This will also work (as long as the user is using python >= 3.11) -- as
langchain will automatically propagate callbacks

```python
@tool
async def get_items(question: str,):  
    """Ask question"""
    template = ChatPromptTemplate.from_messages(
        [
            (
                "human",
                "'{question}"
            )
        ]
    )
    chain = template | chat_model
    return await chain.ainvoke({"question": question})
```
4 months ago
Ofer Mendelevitch ad502e8d50
community[minor]: Vectara Integration Update - Streaming, FCS, Chat, updates to documentation and example notebooks (#21334)
Thank you for contributing to LangChain!

**Description:** update to the Vectara / Langchain integration to
integrate new Vectara capabilities:
- Full RAG implemented as a Runnable with as_rag()
- Vectara chat supported with as_chat()
- Both support streaming response
- Updated documentation and example notebook to reflect all the changes
- Updated Vectara templates

**Twitter handle:** ofermend

**Add tests and docs**: no new tests or docs, but updated both existing
tests and existing docs
4 months ago
Bagatur cb183a9bf1
docs: update anthropic chat model (#22483)
Related to #22296

And update anthropic to accept base_url
4 months ago
Erick Friis d700ce8545
robocorp: typo (#22509) 4 months ago
Erick Friis 39fd44579a
robocorp: release 0.0.9.post1 (#22507) 4 months ago
Erick Friis 339e3b7f55
ai21: release 0.1.6 (#22508) 4 months ago
ccurme 3c53cea760
together, upstage: bump minimum langchain-openai version (#22505) 4 months ago
Bagatur efcb04f84b
mongodb[patch]: Release 0.1.6 (#22501) 4 months ago
Bagatur 222b1ba112
groq[patch]: Release 0.1.5 (#22500) 4 months ago
Bagatur f021be510e
milvus[patch]: Release 0.1.1 (#22499) 4 months ago
Bagatur 64d68c17cd
upstage[patch]: Release 0.1.6 (#22498) 4 months ago
Bagatur 48fba40fce
experimental[patch]: Release 0.0.60 (#22497) 4 months ago
Bagatur e60f88ccdd
community[patch]: Release 0.2.2 (#22496) 4 months ago
Bagatur 85aa218564
langchain[patch]: Release 0.2.2 (#22495) 4 months ago
Bagatur 8e86080def
mistralai[patch]: Release 0.1.8 (#22494) 4 months ago
Bagatur e850de2422
huggingface[patch]: release 0.0.2 (#22493) 4 months ago
Bagatur 99a3cad258
text-splitters[patch]: Release 0.2.1 (#22490) 4 months ago
Bagatur 161b02a8be
core[patch]: Release 0.2.4 (#22489) 4 months ago
Joydeep Banik Roy 3796672c67
community, milvus, pinecone, qdrant, mongo: Broadcast operation failure while using simsimd beyond v3.7.7 (#22271)
- [ ] **Packages affected**: 
  - community: fix `cosine_similarity` to support simsimd beyond 3.7.7
- partners/milvus: fix `cosine_similarity` to support simsimd beyond
3.7.7
- partners/mongodb: fix `cosine_similarity` to support simsimd beyond
3.7.7
- partners/pinecone: fix `cosine_similarity` to support simsimd beyond
3.7.7
- partners/qdrant: fix `cosine_similarity` to support simsimd beyond
3.7.7


- [ ] **Broadcast operation failure while using simsimd beyond v3.7.7**:
- **Description:** I was using simsimd 4.3.1 and the unsupported operand
type issue popped up. When I checked out the repo and ran the tests,
they failed as well (have attached a screenshot for that). Looks like it
is a variant of https://github.com/langchain-ai/langchain/issues/18022 .
Prior to 3.7.7, simd.cdist returned an ndarray but now it returns
simsimd.DistancesTensor which is ineligible for a broadcast operation
with numpy. With this change, it also remove the need to explicitly cast
`Z` to numpy array
    - **Issue:** #19905
    - **Dependencies:** No
    - **Twitter handle:** https://x.com/GetzJoydeep

<img width="1622" alt="Screenshot 2024-05-29 at 2 50 00 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/fb27b383-a9ae-4a6f-b355-6d503b72db56">

- [ ] **Considerations**: 
1. I started with community but since similar changes were there in
Milvus, MongoDB, Pinecone, and QDrant so I modified their files as well.
If touching multiple packages in one PR is not the norm, then I can
remove them from this PR and raise separate ones
2. I have run and verified that the tests work. Since, only MongoDB had
tests, I ran theirs and verified it works as well. Screenshots attached
:
<img width="1573" alt="Screenshot 2024-05-29 at 2 52 13 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/ce87d1ea-19b6-4900-9384-61fbc1a30de9">
<img width="1614" alt="Screenshot 2024-05-29 at 3 33 51 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/6ce1d679-db4c-4291-8453-01028ab2dca5">
  

I have added a test for simsimd. I feel it may not go well with the
CI/CD setup as installing simsimd is not a dependency requirement. I
have just imported simsimd to ensure simsimd cosine similarity is
invoked. However, its not a good approach. Suggestions are welcome and I
can make the required changes on the PR. Please provide guidance on the
same as I am new to the community.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
KyrianC 03178ee74f
community[minor]: Add tools calls to `ChatEdenAI` (#22320)
### Description  
Add tools implementation to `ChatEdenAI`:
- `bind_tools()`
- `with_structured_output()`

### Documentation 
Updated `docs/docs/integrations/chat/edenai.ipynb`

### Notes
We don´t support stream with tools as of yet. If stream is called with
tools we directly yield the whole message from `generate` (implemented
the same way as Anthropic did).
4 months ago
pranavvuppala 9d4350e69a
docs : Update docstrings for OpenAI base.py (#22221)
- [x] **PR title**: Update docstrings for OpenAI base.py
-**Description:** Updated the docstring of few OpenAI functions for a
better understanding of the function.
    - **Issue:** #21983

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Anindyadeep 7a197539aa
communty[patch]: Native RAG Support in Prem AI langchain (#22238)
This PR adds native RAG support in langchain premai package. The same
has been added in the docs too.
4 months ago
Rahul Triptahi 77ad857934
community[minor]: Enable retrieval api calls in PebbloRetrievalQA (#21958)
Description: Enable app discovery and Prompt/Response apis in
PebbloSafeRetrieval
Documentation: NA
Unit test: N/A

---------

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
4 months ago
liugz18 8fd231086e
experimental[patch]: Fix graph_transformers llms #21482 (#22417)
Fix AttributeError on calling
LLMGraphTransformer.convert_to_graph_documents #21482

 since raw_schema is always a str

@baskaryan
4 months ago
ccurme 6db25b4e31
core[patch]: bump langsmith (#22476)
Noticing errors logged in some situations when tracing with Langsmith:
```python
from langchain_core.pydantic_v1 import BaseModel
from langchain_anthropic import ChatAnthropic


class AnswerWithJustification(BaseModel):
    """An answer to the user question along with justification for the answer."""
    answer: str
    justification: str


llm = ChatAnthropic(model="claude-3-haiku-20240307")
structured_llm = llm.with_structured_output(AnswerWithJustification)

list(structured_llm.stream("What weighs more a pound of bricks or a pound of feathers"))
```
```
Error in LangChainTracer.on_chain_end callback: AttributeError("'NoneType' object has no attribute 'append'")
[AnswerWithJustification(answer='A pound of bricks and a pound of feathers weigh the same amount.', justification='This is because a pound is a unit of mass, not volume. By definition, a pound of any material, whether bricks or feathers, will weigh the same - one pound. The physical size or volume of the materials does not matter when measuring by mass. So a pound of bricks and a pound of feathers both weigh exactly one pound.')]
```
4 months ago
Bagatur 17c127531a
community[patch]: deprecate all HF classes (#22444) 4 months ago
Nuno Campos 58b118544e
Use immutable sequence type for batch/batch_as_completed types (#22433)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
4 months ago
Christophe Bornet 9a8fe58ebe
community[minor]: Improve Cassandra VectorStore as_retriever (#22465)
The Vectorstore's API `as_retriever` doesn't expose explicitly the
parameters `search_type` and `search_kwargs` and so these are not well
documented.
This PR improves `as_retriever` for the Cassandra VectorStore by making
these parameters explicit.

NB: An alternative would have been to modify `as_retriever` in
`Vectorstore`. But there's probably a good reason these were not exposed
in the first place ? Is it because implementations may decide to not
support them and have fixed values when creating the
VectorStoreRetriever ?
4 months ago
Christophe Bornet 23bba18f92
core[patch]: Fix VectorStore's as_retriever mutating tags param (#22470)
The current VectorStore `as_retriever` implementation mutates the `tags`
param when it's passed in kwargs.
This fix ensures that a copy is done.
4 months ago
Michal Gregor 98b2e7b195
huggingface[patch]: Support for HuggingFacePipeline in ChatHuggingFace. (#22194)
- **Description:** Added support for using HuggingFacePipeline in
ChatHuggingFace (previously it was only usable with API endpoints,
probably by oversight).
- **Issue:** #19997 
- **Dependencies:** none
- **Twitter handle:** none

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
Fahreddin Özcan 0061ded002
community[patch]: Upstash Vector Store Namespace Support (#22251)
This PR introduces namespace support for Upstash Vector Store, which
would allow users to partition their data in the vector index.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
Guangdong Liu bc7e32f315
core(patch):fix partial_variables not working with SystemMessagePromptTemplate (#20711)
- **Issue:**  close #17560
- @baskaryan, @eyurtsev
4 months ago
Dristy Srivastava ef3df45d9d
community[minor]: Updating payload for pebblo discover API (#22309)
**Description:** Updating response for pebblo discover API. Also
updating filed name case type
**Documentation:** N/A
**Unit tests:** N/A
4 months ago
Miroslav cbd5720011
huggingface[patch]: Skip Login to HuggingFaceHub when token is not set (#22365) 4 months ago
bhardwaj-vipul f397a84a59
langchain[patch]: Fix MongoDBAtlasVectorSearch reference in self query retriever (#22401)
**Description:** 
SelfQuery Retriever with MongoDBAtlasVectorSearch (from
langchain_mongodb import MongoDBAtlasVectorSearch) and
Chroma (from langchain_chroma import Chroma) is not supported.
The imports in the [builtin
translators](8cbce684d4/libs/langchain/langchain/retrievers/self_query/base.py (L73))
points to the
[deprecated](acaf214a45/libs/community/langchain_community/vectorstores/mongodb_atlas.py (L36))
vectorstore.

**Issue:** 
#22272

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
ccurme afe89a1411
community: add standard chat model params to Ollama (#22446) 4 months ago
Ethan Yang 52da6a160d
community[patch]: Update OpenVINO embedding and reranker to support static input shape (#22171)
It can help to deploy embedding models on NPU device
4 months ago
Tom Clelford c599732e1a
text-splitters[patch]: fix HTMLSectionSplitter parsing of xslt paths (#22176)
## Description
This PR allows passing the HTMLSectionSplitter paths to xslt files. It
does so by fixing two trivial bugs with how passed paths were being
handled. It also changes the default value of the param `xslt_path` to
`None` so the special case where the file was part of the langchain
package could be handled.

## Issue
#22175
4 months ago
maang-h 01352bb55f
community[minor]: Implement MiniMaxChat interface (#22391)
- **Description:** Implement MiniMaxChat interface, include:
    - No longer inherits the LLM class (like other chat model)
    - Update request parameters (v1 -> v2)
        - update `base url`
        - update message role (system, user, assistant)
        - add `stream` function
        - no longer use `group id`
    - Implement the `_stream`, `_agenerate`, and `_astream` interfaces

[minimax v2 api
document](https://platform.minimaxi.com/document/guides/chat-model/V2?id=65e0736ab2845de20908e2dd)
4 months ago
Brandon Sharp 56e5aa4dd9
community[patch]: Airtable to allow for addtl params (#22092)
- [X] **PR title**: "community: added optional params to Airtable
table.all()"


- [X] **PR message**: 
- **Description:** Add's **kwargs to AirtableLoader to allow for kwargs:
https://pyairtable.readthedocs.io/en/latest/api.html#pyairtable.Table.all
    - **Issue:** N/A
    - **Dependencies:** N/A
    - **Twitter handle:** parakoopa88


- [X] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/


If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
Harichandan Roy 1f751343e2
community[patch]: update embeddings/oracleai.py (#22240)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"

"community/embeddings: update oracleai.py"

- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!

Adding oracle VECTOR_ARRAY_T support.

- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

Tests are not impacted.

- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Done.

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.


If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
4 months ago
maang-h 13140dc4ff
community[patch]: Update the default api_url and reqeust_body of sparkllm embedding (#22136)
- **Description:** When I was running the SparkLLMTextEmbeddings,
app_id, api_key and api_secret are all correct, but it cannot run
normally using the current URL.

    ```python
    # example
    from langchain_community.embeddings import SparkLLMTextEmbeddings

    embedding= SparkLLMTextEmbeddings(
        spark_app_id="my-app-id",
        spark_api_key="my-api-key",
        spark_api_secret="my-api-secret"
    )
    embedding= "hello"
    print(spark.embed_query(text1))
    ```

![sparkembedding](https://github.com/langchain-ai/langchain/assets/55082429/11daa853-4f67-45b2-aae2-c95caa14e38c)
   
So I updated the url and request body parameters according to
[Embedding_api](https://www.xfyun.cn/doc/spark/Embedding_api.html), now
it is runnable.
4 months ago
Yuwen Hu ba0dca46d7
community[minor]: Add IPEX-LLM BGE embedding support on both Intel CPU and GPU (#22226)
**Description:** [IPEX-LLM](https://github.com/intel-analytics/ipex-llm)
is a PyTorch library for running LLM on Intel CPU and GPU (e.g., local
PC with iGPU, discrete GPU such as Arc, Flex and Max) with very low
latency. This PR adds ipex-llm integrations to langchain for BGE
embedding support on both Intel CPU and GPU.
**Dependencies:** `ipex-llm`, `sentence-transformers`
**Contribution maintainer**: @Oscilloscope98 
**tests and docs**: 
- langchain/docs/docs/integrations/text_embedding/ipex_llm.ipynb
- langchain/docs/docs/integrations/text_embedding/ipex_llm_gpu.ipynb
-
langchain/libs/community/tests/integration_tests/embeddings/test_ipex_llm.py

---------

Co-authored-by: Shengsheng Huang <shannie.huang@gmail.com>
4 months ago
Jacob Lee c01467b1f4
core[patch]: RFC: Allow concatenation of messages with multi part content (#22002)
Anthropic's streaming treats tool calls as different content parts
(streamed back with a different index) from normal content in the
`content`.

This means that we need to update our chunk-merging logic to handle
chunks with multi-part content. The alternative is coerceing Anthropic's
responses into a string, but we generally like to preserve model
provider responses faithfully when we can. This will also likely be
useful for multimodal outputs in the future.

This current PR does unfortunately make `index` a magic field within
content parts, but Anthropic and OpenAI both use it at the moment to
determine order anyway. To avoid cases where we have content arrays with
holes and to simplify the logic, I've also restricted merging to chunks
in order.

TODO: tests

CC @baskaryan @ccurme @efriis
4 months ago
Dan 86509161b0
community: fix AzureSearch delete documents (#22315)
**Description**

Fix AzureSearch delete documents method by using FIELDS_ID variable
instead of the hard coded "id" value

**Issue:** 

This is linked to this issue:
https://github.com/langchain-ai/langchain/issues/22314

Co-authored-by: dseban <dan.seban@neoxia.com>
4 months ago
Harrison Chase 8fad2e209a
fix error message (#22437)
Was confusing when language is in Enum but not implemented
4 months ago
Bagatur 678a19a5f7
infra: bump anthropic mypy 1 (#22373) 4 months ago
Nuno Campos ceb73ad06f
core: In BaseRetriever make get_relevant_docs delegate to invoke (#22434)
- This fixes all the tracing issues with people still using
get_relevant_docs, and a change we need for 0.3 anyway

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
4 months ago
Charles John 2d81a72884
community: fix missing `apify_api_token` field in ApifyWrapper (#22421)
- **Description:** The `ApifyWrapper` class expects `apify_api_token` to
be passed as a named parameter or set as an environment variable. But
the corresponding field was missing in the class definition causing the
argument to be ignored when passed as a named param. This patch fixes
that.
4 months ago
Joan Fontanals a7ae16f912
add `embed_image` API to JinaEmbedding (#22416)
- **Description:** Add `embed_image` to JinaEmbedding to embed images
 - **Twitter handle:** https://x.com/JinaAI_
4 months ago
Nuno Campos ed8e9c437a
core: In RunnableSequence pass kwargs to the first step (#22393)
- This is a pattern that shows up occasionally in langgraph questions,
people chain a graph to something else after, and want to pass the graph
some kwargs (eg. stream_mode)
4 months ago
Bagatur a8098f5ddb
anthropic[patch]: Release 0.1.15, fix sdk tools break (#22369) 4 months ago
Erick Friis 6ffa0acf32
ai21: fix text-splitters version (#22366) 4 months ago
Bagatur 2b9f1469d8
core[patch]: Release 0.2.3 (#22329) 4 months ago
Harrison Chase ee32369265
core[patch]: fix runnable history and add docs (#22283) 4 months ago
William FH dcec133b85
[Core] Update Tracing Interops (#22318)
LangSmith and LangChain context var handling evolved in parallel since
originally we didn't expect people to want to interweave the decorator
and langchain code.

Once we get a new langsmith release, this PR will let you seemlessly
hand off between @traceable context and runnable config context so you
can arbitrarily nest code.

It's expected that this fails right now until we get another release of
the SDK
4 months ago
ccurme f34337447f
openai: update ChatOpenAI api ref (#22324)
Update to reflect that token usage is no longer default in streaming
mode.

Add detail for streaming context under Token Usage section.
4 months ago
ChengZi 2443e85533
docs: fix milvus import and update template (#22306)
docs: fix milvus import problem
update milvus-rag template with milvus-lite

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
4 months ago
WU LIFU 86698b02a9
doc: fix wrong documentation on FAISS load_local function (#22310)
### Issue: #22299 

### descriptions
The documentation appears to be wrong. When the user actually sets this
parameter "asynchronous" to be True, it fails because the __init__
function of FAISS class doesn't allow this parameter. In fact, most of
the class/instance functions of this class have both the sync/async
version, so it looks like what we need is just to remove this parameter
from the doc.

Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: Lifu Wu <lifu@nextbillion.ai>
4 months ago
maang-h 596c062cba
community[patch]: Standardize qianfan model init args name (#22322)
- **Description:**  
    - Standardize qianfan chat model intialization arguments name
        - qianfan_ak (qianfan api key)  -> api_key
        - qianfan_sk (qianfan secret key)  ->  secret_key
       
    - Delete unuse variable
- **Issue:** #20085
4 months ago
Dobiichi-Origami 10b12e1c08
community: adding tool_call_id for every ToolCall (#22323)
- **Description:** This PR contains a bugfix which result in malfunction
of multi-turn conversation in QianfanChatEndpoint and adaption for
ToolCall and ToolMessage
4 months ago
ccurme f39e1a2288
community, docs: update token usage tracking callback + how-to guides (#22145) 4 months ago
Bagatur 2bc50fb895
docs, cli[patch]: chat model template nit (#22294) 4 months ago
Bagatur aa6c31df53
cli[patch]: Release 0.0.24 (#22293) 4 months ago
Bagatur 627a337887
docs, cli[patch]: chat model doc template (#22290)
Update ChatModel integration doc template, integration docstring, and
adds langchain-cli command to easily create just doc (for updating
existing integrations):

```bash
langchain-cli integration create-doc --name "foo-bar"
```
4 months ago
ccurme 6e1df72a88
openai[patch]: Release 0.1.8 (#22291) 4 months ago
ccurme e71b0b5827
core[patch]: Release 0.2.2 (#22289) 4 months ago
Bagatur 6dd0f095c3
docs: revamp ChatOpenAI (#22253)
Can build API ref docs by running
```bash
make api_docs_clean; make api_docs_quick_preview API_PKG=openai
```
only builds openai ref, takes ~20 sec
4 months ago
Erick Friis 00c70d98c2
robocorp: release 0.0.9 (#22282) 4 months ago
Mikko Korpela fc5909ad6f
langchain-robocorp: Fix parsing of Union types (such as Optional). (#22277) 4 months ago
ccurme af1f723ada
openai: don't override stream_options default (#22242)
ChatOpenAI supports a kwarg `stream_options` which can take values
`{"include_usage": True}` and `{"include_usage": False}`.

Setting include_usage to True adds a message chunk to the end of the
stream with usage_metadata populated. In this case the final chunk no
longer includes `"finish_reason"` in the `response_metadata`. This is
the current default and is not yet released. Because this could be
disruptive to workflows, here we remove this default. The default will
now be consistent with OpenAI's API (see parameter
[here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-stream_options)).

Examples:
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI()

for chunk in llm.stream("hi"):
    print(chunk)
```
```
content='' id='run-8cff4721-2acd-4551-9bf7-1911dae46b92'
content='Hello' id='run-8cff4721-2acd-4551-9bf7-1911dae46b92'
content='!' id='run-8cff4721-2acd-4551-9bf7-1911dae46b92'
content='' response_metadata={'finish_reason': 'stop'} id='run-8cff4721-2acd-4551-9bf7-1911dae46b92'
```

```python
for chunk in llm.stream("hi", stream_options={"include_usage": True}):
    print(chunk)
```
```
content='' id='run-39ab349b-f954-464d-af6e-72a0927daa27'
content='Hello' id='run-39ab349b-f954-464d-af6e-72a0927daa27'
content='!' id='run-39ab349b-f954-464d-af6e-72a0927daa27'
content='' response_metadata={'finish_reason': 'stop'} id='run-39ab349b-f954-464d-af6e-72a0927daa27'
content='' id='run-39ab349b-f954-464d-af6e-72a0927daa27' usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17}
```

```python
llm = ChatOpenAI().bind(stream_options={"include_usage": True})

for chunk in llm.stream("hi"):
    print(chunk)
```
```
content='' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d'
content='Hello' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d'
content='!' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d'
content='' response_metadata={'finish_reason': 'stop'} id='run-59918845-04b2-41a6-8d90-f75fb4506e0d'
content='' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d' usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17}
```
4 months ago
Karim Lalani a1899439fc
[experimental][llms][ollama_functions] Update OllamaFunctions to send `tool_calls` attribute (#21625)
Update OllamaFunctions to return `tool_calls` for AIMessages when used
for tool calling.
4 months ago
Bagatur d61bdeba25
core[patch]: allow access RunnableWithFallbacks.runnable attrs (#22139)
RFC, candidate fix for #13095 #22134
4 months ago
SteveLiao 7496fe2b16
Update parent_document_retriever.py about **kwargs (#22219)
Add kwargs in add_documents function

**langchain**: Add **kwargs in parent_document_retriever"
 - **Add kwargs for `add_document` in `parent_document_retriever.py`** 


If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
4 months ago
Erick Friis 93240fac68
milvus: fix core dep (#22239) 4 months ago
ChengZi 404d92ded0
milvus: New langchain_milvus package and new milvus features (#21077)
New features:

- New langchain_milvus package in partner
- Milvus collection hybrid search retriever
- Zilliz cloud pipeline retriever
- Milvus Local guid
- Rag-milvus template

---------

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Signed-off-by: Jael Gu <mengjia.gu@zilliz.com>
Co-authored-by: Jael Gu <mengjia.gu@zilliz.com>
Co-authored-by: Jackson <jacksonxie612@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Erick Friis <erickfriis@gmail.com>
4 months ago
Leonid Ganeline d6995e814b
ai21[patch]: added `license` (#22153)
The `pyproject.toml` missed the `license` parameter. I've added it as
`MIT`
4 months ago
Maddy Adams 8332a36f69
infra: update langchainhub and add integration test (#22154)
**Description:** Update langchainhub integration test dependency and add
an integration test for pulling private prompt
**Dependencies:** langchainhub 0.1.16
4 months ago
Will Higgins 83d10df78d
community[patch]: Update firecrawl api key name (#22183)
Change 'FIREWALL' to 'FIRECRAWL' as I believe this may have been in
error. Other docs refer to 'FIRECRAWL_API_KEY'.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
hmasdev bbd7015b5d
core[patch]: Add `TypeError` handler into `get_graph` of `Runnable` (#19856)
# Description

## Problem

`Runnable.get_graph` fails when `InputType` or `OutputType` property
raises `TypeError`.

-
003c98e5b4/libs/core/langchain_core/runnables/base.py (L250-L274)
-
003c98e5b4/libs/core/langchain_core/runnables/base.py (L394-L396)

This problem prevents getting a graph of `Runnable` objects whose
`InputType` or `OutputType` property raises `TypeError` but whose
`invoke` works well, such as `langchain.output_parsers.RegexParser`,
which I have already pointed out in #19792 that a `TypeError` would
occur.

## Solution

- Add `try-except` syntax to handle `TypeError` to the codes which get
`input_node` and `output_node`.

# Issue
- #19801 

# Twitter Handle
- [hmdev3](https://twitter.com/hmdev3)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Mohammad Mohtashim 577ed68b59
mistralai[patch]: Added Json Mode for ChatMistralAI (#22213)
- **Description:** Powered
[ChatMistralAI.with_structured_output](fbfed65fb1/libs/partners/mistralai/langchain_mistralai/chat_models.py (L609))
via json mode
 

-  **Issue:** #22081

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Pavlo Paliychuk 342df7cf83
community[minor]: Add Zep Cloud components + docs + examples (#21671)
Thank you for contributing to LangChain!

- [x] **PR title**: community: Add Zep Cloud components + docs +
examples

- [x] **PR message**: 
We have recently released our new zep-cloud sdks that are compatible
with Zep Cloud (not Zep Open Source). We have also maintained our Cloud
version of langchain components (ChatMessageHistory, VectorStore) as
part of our sdks. This PRs goal is to port these components to langchain
community repo, and close the gap with the existing Zep Open Source
components already present in community repo (added
ZepCloudMemory,ZepCloudVectorStore,ZepCloudRetriever).
Also added a ZepCloudChatMessageHistory components together with an
expression language example ported from our repo. We have left the
original open source components intact on purpose as to not introduce
any breaking changes.
    - **Issue:** -
- **Dependencies:** Added optional dependency of our new cloud sdk
`zep-cloud`
    - **Twitter handle:** @paulpaliychuk51


- [x] **Add tests and docs**


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
4 months ago
Jan Soubusta cccc8fbe2f
community[patch]: DuckDB VS - expose similarity, improve performance of from_texts (#20971)
3 fixes of DuckDB vector store:
- unify defaults in constructor and from_texts (users no longer have to
specify `vector_key`).
- include search similarity into output metadata (fixes #20969)
- significantly improve performance of `from_documents`

Dependencies: added Pandas to speed up `from_documents`.
I was thinking about CSV and JSON options, but I expect trouble loading
JSON values this way and also CSV and JSON options require storing data
to disk.
Anyway, the poetry file for langchain-community already contains a
dependency on Pandas.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
4 months ago
Erick Friis 42ffcb2ff1
anthropic: release 0.1.14rc2, test release note gen (#22147) 4 months ago
Ameya Shenoy 8ba492ed6a
community[minor]: clickhouse -- ability to use secure connection (#22108)
- **Description:** this PR gives clickhouse client the ability to use a
secure connection to the clickhosue server
- **Issue:** fixes #22082
- **Dependencies:** -
- **Twitter handle:** `_codingcoffee_`

Signed-off-by: Ameya Shenoy <shenoy.ameya@gmail.com>
Co-authored-by: Shresth Rana <shresth@grapevine.in>
4 months ago
ccurme 9a010fb761
openai: read stream_options (#21548)
OpenAI recently added a `stream_options` parameter to its chat
completions API (see [release
notes](https://platform.openai.com/docs/changelog/added-chat-completions-stream-usage)).
When this parameter is set to `{"usage": True}`, an extra "empty"
message is added to the end of a stream containing token usage. Here we
propagate token usage to `AIMessage.usage_metadata`.

We enable this feature by default. Streams would now include an extra
chunk at the end, **after** the chunk with
`response_metadata={'finish_reason': 'stop'}`.

New behavior:
```
[AIMessageChunk(content='', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'),
 AIMessageChunk(content='Hello', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'),
 AIMessageChunk(content='!', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'),
 AIMessageChunk(content='', response_metadata={'finish_reason': 'stop'}, id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'),
 AIMessageChunk(content='', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde', usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17})]
```

Old behavior (accessible by passing `stream_options={"include_usage":
False}` into (a)stream:
```
[AIMessageChunk(content='', id='run-1312b971-c5ea-4d92-9015-e6604535f339'),
 AIMessageChunk(content='Hello', id='run-1312b971-c5ea-4d92-9015-e6604535f339'),
 AIMessageChunk(content='!', id='run-1312b971-c5ea-4d92-9015-e6604535f339'),
 AIMessageChunk(content='', response_metadata={'finish_reason': 'stop'}, id='run-1312b971-c5ea-4d92-9015-e6604535f339')]
```

From what I can tell this is not yet implemented in Azure, so we enable
only for ChatOpenAI.
4 months ago
Rahul Triptahi 1a485f59b9
community[patch]: Put authorized identities behind a feature flag in SharepointLoader (#22125)
Description: Put authorised identities behind a feature flag, load_auth.
Documentation: N/A
Unit tests: N/A

---------

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
4 months ago
sasha 1c9ceff503
community: add metadata to chain logging; (#22122)
Hey, I'm Sasha. The SDK engineer from [Comet](https://comet.com).
This PR updates the CometTracer class.
Added metadata to CometTracerr. From now on, both chains and spans will
send it.
4 months ago
Jirka Lhotka 7c0459faf2
community: Update costs of openai finetuned models (#22124)
- **Description:** Update costs of finetuned models and add
gpt-3-turbo-0125. Source: https://openai.com/api/pricing/
  - **Issue:** N/A
  - **Dependencies:** None
4 months ago
Eugene Yurtsev d3db83abe3
community[major]: lint for usage of xml library (#22132)
* Lint for usage of standard xml library
* Add forced opt-in for quip client
* Actual security issue is with underlying QuipClient not LangChain
integration (since the client is doing the parsing), but adding
enforcement at the LangChain level.
4 months ago
Bagatur baa3c975cb
anthropic[patch]: allow tool call mutation (#22130)
If tool_use blocks and tool_calls with overlapping IDs are present,
prefer the values of the tool_calls. Allows for mutating AIMessages just
via tool_calls.
4 months ago
Christophe Bornet c838de5027
doc: Add doc for CassandraByteStore (#22126)
Preview:
https://langchain-git-fork-cbornet-doc-cassandrabytestore-langchain.vercel.app/v0.2/docs/integrations/stores/cassandra/
4 months ago
ccurme 0ea1e89b2c
groq: read tool calls from .tool_calls attribute (#22096) 4 months ago
Eugene Yurtsev 2d693c484e
docs: fix some spelling mistakes caught by newest version of code spell (#22090)
Going to merge this even though it doesn't pass all tests, and open a
separate PR for the remaining spelling mistakes.
4 months ago
Pavel Zloi fe26f937e4
community[minor]: ManticoreSearch engine added to vectorstore (#19117)
**Description:** ManticoreSearch engine added to vectorstores
**Issue:** no issue, just a new feature
**Dependencies:** https://pypi.org/project/manticoresearch-dev/
**Twitter handle:** @EvilFreelancer

- Example notebook with test integration:

https://github.com/EvilFreelancer/langchain/blob/manticore-search-vectorstore/docs/docs/integrations/vectorstores/manticore_search.ipynb

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Erick Friis 95c3e5f85f
cli: model name substitution fix, release 0.0.23 (#22089) 4 months ago
ccurme 152c8cac33
anthropic, openai: cut pre-releases (#22083) 4 months ago
ccurme cd07521170
core: bump to 0.2.1rc (#22080) 4 months ago
ccurme fbfed65fb1
core, partners: add token usage attribute to AIMessage (#21944)
```python
class UsageMetadata(TypedDict):
    """Usage metadata for a message, such as token counts.

    Attributes:
        input_tokens: (int) count of input (or prompt) tokens
        output_tokens: (int) count of output (or completion) tokens
        total_tokens: (int) total token count
    """

    input_tokens: int
    output_tokens: int
    total_tokens: int
```
```python
class AIMessage(BaseMessage):
    ...
    usage_metadata: Optional[UsageMetadata] = None
    """If provided, token usage information associated with the message."""
    ...
```
4 months ago
Bagatur 3d26807b92
community[patch]: Release. 0.2.1 (#22073) 4 months ago
Bagatur 2d968213d7
langchain[patch]: Release 0.2.1 (#22074) 4 months ago
maang-h 9aba9e3e33
community[patch]: Update the default “API URL” and “MODEL” of sparkllm (#22070)
- **Description:** When I was running the sparkllm, I found that the
default parameters currently used could no longer run correctly.
    - original parameters & values:
         - spark_api_url: "wss://spark-api.xf-yun.com/v3.1/chat"
         - spark_llm_domain: "generalv3"
    ```python
    # example
    
    from langchain_community.chat_models import ChatSparkLLM
    
spark = ChatSparkLLM(spark_app_id="my_app_id",
spark_api_key="my_api_key", spark_api_secret="my_api_secret")
    spark.invoke("hello")
    ```

![sparkllm](https://github.com/langchain-ai/langchain/assets/55082429/5369bfdf-4305-496a-bcf5-2d3f59d39414)

So I updated them to 3.5 (same as sparkllm official website). After the
update, they can be used normally.
    - new parameters & values:
         - spark_api_url: "wss://spark-api.xf-yun.com/v3.5/chat"
         - spark_llm_domain: "generalv3.5"
4 months ago
junkeon 4fda7bf4f2
upstage[patch] : fix error handling in Layout Analysis parser (#22054)
This pull request addresses and fixes exception handling in the
UpstageLayoutAnalysisParser and enhances the test coverage by adding
error exception tests for the document loader. These improvements ensure
robust error handling and increase the reliability of the system when
dealing with external API calls and JSON responses.

### Changes Made
1. Fix Request Exception Handling:

- Issue: The existing implementation of UpstageLayoutAnalysisParser did
not properly handle exceptions thrown by the requests library, which
could lead to unhandled exceptions and potential crashes.
- Solution: Added comprehensive exception handling for
requests.RequestException to catch any request-related errors. This
includes logging the error details and raising a ValueError with a
meaningful error message.

2. Add Error Exception Tests for Document Loader:

- New Tests: Introduced new test cases to verify the robustness of the
UpstageLayoutAnalysisLoader against various error scenarios. The tests
ensure that the loader gracefully handles:
- RequestException: Simulates network issues or invalid API requests to
ensure appropriate error handling and user feedback.
- JSONDecodeError: Simulates scenarios where the API response is not a
valid JSON, ensuring the system does not crash and provides clear error
messaging.
4 months ago
JuHyung Son d9eff44400
partner-upstage[patch]: embeddings empty list bug (#22057)
Fixed an error in `embed_documents` when the input was given as an empty
list. And I have revised the document.
4 months ago
Martin Triska 2df8ac402a
community[minor]: Added propagation of document metadata from O365BaseLoader (#20663)
**Description:**
- Added propagation of document metadata from O365BaseLoader to
FileSystemBlobLoader (O365BaseLoader uses FileSystemBlobLoader under the
hood).
- This is done by passing dictionary `metadata_dict`: key=filename and
value=dictionary containing document's metadata
- Modified `FileSystemBlobLoader` to accept the `metadata_dict`, use
`mimetype` from it (if available) and pass metadata further into blob
loader.

**Issue:**
- `O365BaseLoader` under the hood downloads documents to temp folder and
then uses `FileSystemBlobLoader` on it.
- However metadata about the document in question is lost in this
process. In particular:
- `mime_type`: `FileSystemBlobLoader` guesses `mime_type` from the file
extension, but that does not work 100% of the time.
- `web_url`: this is useful to keep around since in RAG LLM we might
want to provide link to the source document. In order to work well with
document parsers, we pass the `web_url` as `source` (`web_url` is
ignored by parsers, `source` is preserved)

**Dependencies:**
None

**Twitter handle:**
@martintriska1

Please review @baskaryan

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
4 months ago
Eugene Yurtsev e5541d1da7
community[patch]: Update doc-string in CloudBlobLoader (#22069)
Update doc-string
4 months ago
Philippe PRADOS 6dd621d636
community[minor]: Add CloudBlobLoader that supports loading data from cloud buckets (#21957)
Thank you for contributing to LangChain!

- [ ] **PR title**: "Add CloudBlobLoader"
  - community: Add CloudBlobLoader

- [ ] **PR message**: Add cloud blob loader
    - **Description:** 
 Langchain provides several approaches to read different file formats:

Specific loaders (`CVSLoader`) or blob-compatible loaders
(`FileSystemBlobLoader`). The only implementation proposed for
BlobLoader is `FileSystemBlobLoader`.
      
Many projects retrieve files from cloud storage. We propose a new
implementation of `BlobLoader` to read files from the three cloud
storage systems. The interface is strictly identical to
`FileSystemBlobLoader`. The only difference is the constructor, which
takes a cloud "url" object such as `s3://my-bucket`, `az://my-bucket`,
or `gs://my-bucket`.
      
By streamlining the process, this novel implementation eliminates the
requirement to pre-download files from cloud storage to local temporary
files (which are seldom removed).
      
The code relies on the
[CloudPathLib](https://cloudpathlib.drivendata.org/stable/) library to
interpret cloud URLs. This has been added as an optional dependency.

```Python
loader = CloudBlobLoader("s3://mybucket/id")
for blob in loader.yield_blobs():
    print(blob)
```

- [X] **Dependencies:** CloudPathLib
- [X] **Twitter handle:** pprados


- [X] **Add tests and docs**: Add unit test, but it's easy to convert to
integration test, with some files in a cloud storage (see
`test_cloud_blob_loader.py`)

- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified.

Hello from Paris @hwchase17. Can you review this PR?

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
4 months ago
Christophe Bornet 74947ec894
community[minor]: Add Cassandra ByteStore (#22064) 4 months ago
Christophe Bornet fea6b99b16
community[minor]: Add async methods to CassandraChatMessageHistory (#21975) 4 months ago
Sky 12d65f17ff
community[patch]: surrealdb provide functions for MMR (Maximal Marginal Relevance) (#21185)
This PR contains 4 added functions:

- max_marginal_relevance_search_by_vector
- amax_marginal_relevance_search_by_vector
- max_marginal_relevance_search
- amax_marginal_relevance_search

I'm no langchain expert, but tried do inspect other vectorstore sources
like chroma, to build these functions for SurrealDB. If someone has some
changes for me, please let me know. Otherwise I would be happy, if these
changes are added to the repository, so that I can use the orignal repo
and not my local monkey patched version.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Bruno Alvisio 5eabe90494
community[patch]: Adding HEADER to the list of supported locations (#21946)
**Description:** adds headers to the list of supported locations when
generating the openai function schema
4 months ago
Bagatur 50186da0a1
infra: rm unused # noqa violations (#22049)
Updating #21137
4 months ago
acho98 45ed5f3f51
community[minor]: Add Clova Embeddings for LangChain Community (#21890)
- [ ] **PR title**: "Add Naver ClovaX embedding to LangChain community"
- HyperClovaX is a large language model developed by
[Naver](https://clova-x.naver.com/welcome).
It's a powerful and purpose-trained LLM.

- You can visit the embedding service provided by
[ClovaX](https://www.ncloud.com/product/aiService/clovaStudio)

- You may get CLOVA_EMB_API_KEY, CLOVA_EMB_APIGW_API_KEY,
CLOVA_EMB_APP_ID From
https://www.ncloud.com/product/aiService/clovaStudio

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
arpitkumar980 444c2a3d9f
community[patch]: sharepoint loader identity enabled (#21176)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:https://github.com/arpitkumar980/langchain.git
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
HuiyuanYan bf3aefce93
community[patch]: Update tongyi.py to support MultimodalConversation in dashscope. (#21249)
Add the support of multimodal conversation in dashscope,now we can use
multimodal language model "qwen-vl-v1", "qwen-vl-chat-v1",
"qwen-audio-turbo" to processing picture an audio. :)

- [ ] **PR title**: "community: add multimodal conversation support in
dashscope"



- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** add multimodal conversation support in dashscope
    - **Issue:** 
    - **Dependencies:** dashscope≥1.18.0
    - **Twitter handle:** none :)


- [ ] **How to use it?**:
   - ```python
     Tongyi_chat = ChatTongyi(
        top_p=0.5,
        dashscope_api_key=api_key,
        model="qwen-vl-v1"
     )
     response= Tongyi_chat.invoke(
        input = 
        [
        {
            "role": "user",
            "content": [
{"image":
"https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"},
                {"text": "这是什么?"}
            ]
        }
        ]
       )
      ```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
mochi 63284ffebf
experimental[patch], docs: refine notebook for MyScale `SelfQueryRetriever` (#22016)
- **Description:** upgrade model to `gpt-4o`
4 months ago
MSubik d948783a4c
community[patch]: standardize init args, update for javelin sdk release. (#21980)
Related to
[20085](https://github.com/langchain-ai/langchain/issues/20085) Updated
the Javelin chat model to standardize the initialization argument. Also
fixed an existing bug, where code was initialized with incorrect call to
the JavelinClient defined in the javelin_sdk, resulting in an
initialization error. See related [Javelin
Documentation](https://docs.getjavelin.io/docs/javelin-python/quickstart).
4 months ago
Mohammad Mohtashim 16617dd239
community[patch]: AzureSearchVectorStoreRetriever Fixed to account for search_kwargs (#21572)
- **Description:** Fixed `AzureSearchVectorStoreRetriever` to account
for search_kwargs. More explanation is in the mentioned issue.
- **Issue:** #21492

---------

Co-authored-by: MAC <mac@MACs-MacBook-Pro.local>
Co-authored-by: Massimiliano Pronesti <massimiliano.pronesti@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Klaudia Lemiec 45351d1bc6
docs: Chroma docstrings update (#22001)
Thank you for contributing to LangChain!

- [X] **PR title**: "docs: Chroma docstrings update"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [X] **PR message**: 
    - **Description:** Added and updated Chroma docstrings
    - **Issue:** https://github.com/langchain-ai/langchain/issues/21983


- [X] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
  - only docs


- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
4 months ago
Jerron Lim 28456c2c33
community[patch]: add args_schema to WikipediaQueryRun (#22019)
Description: This change adds args_schema (pydantic BaseModel) to
WikipediaQueryRun for correct schema formatting on LLM function calls

Issue: currently using WikipediaQueryRun with OpenAI function calling
returns the following error "TypeError: WikipediaQueryRun._run() got an
unexpected keyword argument '__arg1' ". This happens because the schema
sent to the LLM is "input: '{"__arg1":"Hunter x Hunter"}'" while the
method should be called with the "query" parameter.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Mazen Ramadan 3c1d77dd64
community[minor]: Add Scrapfly Loader community integration (#22036)
Added [Scrapfly](https://scrapfly.io/) Web Loader integration. Scrapfly
is a web scraping API that allows extracting web page data into
accessible markdown or text datasets.

- __Description__: Added Scrapfly web loader for retrieving web page
data as markdown or text.
- Dependencies: scrapfly-sdk
- Twitter: @thealchemi1st

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
ccurme b51a1eba4d
langchain, community: move OpenAIAssistantV2Runnable to community (#22044) 4 months ago
CaroFG 6b98140b38
community[patch]: update for compatibility with Meilisearch v1.8 (#21979)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Updates Meilisearch vectorstore for compatibility
with v1.8. Adds [”showRankingScore”:
true”](https://www.meilisearch.com/docs/reference/api/search#ranking-score)
in the search parameters and replaces `_semanticScore` field with `
_rankingScore`


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
4 months ago
Oleksii Pokotylo 98c0b093bb
community[patch]: Extend AzureSearch with `maximal_marginal_relevance`, `from_embeddings` (#21065)
**Description:**
- Extend AzureSearch with `maximal_marginal_relevance` (for vector and
hybrid search)
- Add construction `from_embeddings` - if the user has already embedded
the texts
- Add `add_embeddings` 
- Refactor common parts (`_simple_search`, `_results_to_documents`,
`_reorder_results_with_maximal_marginal_relevance`)
- Add `vector_search_dimensions` as a parameter to the constructor to
avoid extra calls to `embed_query` (most of the time the user applies
the same model and knows the dimension)

**Issue:** none
**Dependencies:** none

- [x] **Add tests and docs**: The docstrings have been added to the new
functions, and unified for the existing ones. The example notebook is
great in illustrating the main usage of AzureSearch, adding the new
methods would only dilute the main content.
- [x] **Lint and test**

---------

Co-authored-by: Oleksii Pokotylo <oleksii.pokotylo@pwc.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
SaschaStoll 709664a079
community[patch]: Performant filter columns option for Hanavector (#21971)
**Description:** Backwards compatible extension of the initialisation
interface of HanaDB to allow the user to specify
specific_metadata_columns that are used for metadata storage of selected
keys which yields increased filter performance. Any not-mentioned
metadata remains in the general metadata column as part of a JSON
string. Furthermore switched to executemany for batch inserts into
HanaDB.

**Issue:** N/A

**Dependencies:** no new dependencies added

**Twitter handle:** @sapopensource

---------

Co-authored-by: Martin Kolb <martin.kolb@sap.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Bagatur 16b55b0704
langchain[patch]: remove dataclasses-json dep (#22042)
vestigial dep afaict
4 months ago
Christos Boulmpasakos c3bcfad66d
text-splitters[patch]: Extend TextSplitter:keep_separator functionality (#21130)
**Description:** Added extra functionality to `CharacterTextSplitter`,
`TextSplitter` classes.
The user can select whether to append the separator to the previous
chunk with `keep_separator='end' ` or else prepend to the next chunk.
Previous functionality prepended by default to next chunk.
  
**Issue:** Fixes #20908

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
Eric Zhang e7e41eaabe
langchain: add RankLLM Reranker (#21171)
Integrate RankLLM reranker (https://github.com/castorini/rank_llm) into
LangChain

An example notebook is given in
`docs/docs/integrations/retrievers/rankllm-reranker.ipynb`

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
4 months ago
maang-h fc93bed8c4
community: Fix CSVLoader columns is None (#20701)
- **Bug code**: In
langchain_community/document_loaders/csv_loader.py:100

- **Description**: currently, when 'CSVLoader' reads the column as None
in the 'csv' file, it will report an error because the 'CSVLoader' does
not verify whether the column is of str type and does not consider how
to handle the corresponding 'row_data' when the column is' None 'in the
csv. This pr provides a solution.

- **Issue:**  Fix #20699 

- **thinking:**

1. Refer to the processing method for
'langchain_community/document_loaders/csv_loader.py:100' when **'v'**
equals'None', and apply the same method to '**k**'.
(Reference`csv.DictReader` ,**'k'** will only be None when `
len(columns) < len(number_row_data)` is established)
2. **‘k’** equals None only holds when it is the last column, and its
corresponding **'v'** type is a list. Therefore, I referred to the data
format in 'Document' and used ',' to concatenated the elements in the
list.(But I'm not sure if you accept this form, if you have any other
ideas, communicate)

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
Nithin James Padayatti 403142eaba
langchain: added revision_example prompt template (#20916)
**Description:** Added revision_example prompt template to include the
revision request and revision examples in the revision chain.
    **Issue:** Not Applicable
    **Dependencies:** Not Applicable
    **Twitter handle:**  @nithinjp09
4 months ago
Sihan Chen 1f81277b9b
community[minor]: allow enabling proxy in aiohttp session in AsyncHTML (#19499)
Allow enabling proxy in aiohttp session async html
4 months ago
Eugene Yurtsev 36813d2f00
community[patch]: Fix remaining __inits__ in community (#22037)
Fixes the __init__ files in community to use __all__ which is statically
defined.
4 months ago
Eugene Yurtsev 58360a1e53
community[patch]: Add unit test to verify that init is correctly defined (#22030)
Fix some __init__ files and add a unit test
4 months ago
Erick Friis ef53ccf54b
robocorp: release 0.0.8 (#22034) 4 months ago
Matthew Hoffman 4f2e3bd7fd
community[patch]: fix public interface for embeddings module (#21650)
## Description

The existing public interface for `langchain_community.emeddings` is
broken. In this file, `__all__` is statically defined, but is
subsequently overwritten with a dynamic expression, which type checkers
like pyright do not support. pyright actually gives the following
diagnostic on the line I am requesting we remove:


[reportUnsupportedDunderAll](https://github.com/microsoft/pyright/blob/main/docs/configuration.md#reportUnsupportedDunderAll):

```
Operation on "__all__" is not supported, so exported symbol list may be incorrect
```

Currently, I get the following errors when attempting to use publicablly
exported classes in `langchain_community.emeddings`:

```python
import langchain_community.embeddings

langchain_community.embeddings.HuggingFaceEmbeddings(...)  #  error: "HuggingFaceEmbeddings" is not exported from module "langchain_community.embeddings" (reportPrivateImportUsage)
```

This is solved easily by removing the dynamic expression.
4 months ago
Eugene Yurtsev 8d82160a8a
community[patch]: Clean up logic in import checking unit test (#22026)
Clean up unit test
4 months ago
Tomaz Bratanic d8a1f1114d
community[patch]: Handle exceptions where node props aren't consistent in neo4j schema (#22027) 4 months ago
WeichenXu b0ef5e778a
community[patch]: Fix ChatDatabricsk in case that streaming response doesn't have role field in delta chunk (#21897)
Thank you for contributing to LangChain!

- [X] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


**Description:**
Fix ChatDatabricsk in case that streaming response doesn't have role
field in delta chunk


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
4 months ago
Eugene Yurtsev aed64daabb
community[patch]: Add unit test to catch bad __all__ definitions (#21996)
This will catch all dynamic __all__ definitions.
4 months ago
Bagatur 3b0437c05b
core[patch]: Release 0.2.1 (#22003) 4 months ago
Kefan You 24b5c27bb1
community[patch]: raise_for_status logic missing in async _fetch of WebBaseLoader (#21948)
## 'raise_for_status' parameter of WebBaseLoader works in sync load but
not in async load.
In webBaseLoader:  

Sync load is calling `_scrape` and has `raise_for_status` properly
handled.
```
    def _scrape(
        self,
        url: str,
        parser: Union[str, None] = None,
        bs_kwargs: Optional[dict] = None,
    ) -> Any:
        from bs4 import BeautifulSoup

        if parser is None:
            if url.endswith(".xml"):
                parser = "xml"
            else:
                parser = self.default_parser

        self._check_parser(parser)

        html_doc = self.session.get(url, **self.requests_kwargs)
        if self.raise_for_status:
            html_doc.raise_for_status()

        if self.encoding is not None:
            html_doc.encoding = self.encoding
        elif self.autoset_encoding:
            html_doc.encoding = html_doc.apparent_encoding
        return BeautifulSoup(html_doc.text, parser, **(bs_kwargs or {}))
```
Async load is calling `_fetch` but missing `raise_for_status` logic.
```
    async def _fetch(
        self, url: str, retries: int = 3, cooldown: int = 2, backoff: float = 1.5
    ) -> str:
        async with aiohttp.ClientSession() as session:
            for i in range(retries):
                try:
                    async with session.get(
                        url,
                        headers=self.session.headers,
                        ssl=None if self.session.verify else False,
                        cookies=self.session.cookies.get_dict(),
                    ) as response:
                        return await response.text()
```

Co-authored-by: kefan.you <darkfss@sina.com>
4 months ago
Surya Rath eb096675a8
OpenAI Assistants v2 api support for OpenAIAssistantRunnable (#21484)
**Title**: "langchain: OpenAI Assistants v2 api support"

***Descriptions*** 
- [x] "attachments" support added along with backward compatibility of
"file_ids"
- [x]  "tool_resources" support added while creating new assistant

- [ ] "tool_choice" parameter support
- [ ]  Streaming support


- **Dependencies:** OpenAI v2 API (openai>=1.23.0)
- **Twitter handle:** @skanta_rath

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
4 months ago
Eugene Yurtsev 7a5d042bd2
langchain[patch]: Add unit test to detect changes to community imports (#21998)
Add unit tests for community imports
4 months ago
Eugene Yurtsev 90f4d8842f
langchain[patch]: Turn on all deprecations for 0.2 (#21999)
- Turn on all 0.2 import deprecations.
- Update error messag with URL to upgrade instructions.
4 months ago
Asaf Joseph Gardin a042e804b4
ai21: AI21 Jamba docs (#21978)
- Updated docs to have an example to use Jamba instead of J2

---------

Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
4 months ago
Pengcheng Liu 4cf523949a
community[patch]: Update model client to support vision model in Tong… (#21474)
- **Description:** Tongyi uses different client for chat model and
vision model. This PR chooses proper client based on model name to
support both chat model and vision model. Reference [tongyi
document](https://help.aliyun.com/zh/dashscope/developer-reference/tongyi-qianwen-vl-plus-api?spm=a2c4g.11186623.0.0.27404c9a7upm11)
for details.

```
from langchain_core.messages import HumanMessage
from langchain_community.chat_models import ChatTongyi

llm = ChatTongyi(model_name='qwen-vl-max')
image_message = {
    "image": "https://lilianweng.github.io/posts/2023-06-23-agent/agent-overview.png"
}
text_message = {
    "text": "summarize this picture",
}
message = HumanMessage(content=[text_message, image_message])
llm.invoke([message])
```

- **Issue:** None
- **Dependencies:** None
- **Twitter handle:** None
4 months ago
Sevin F. Varoglu 1bc0ea5496
community[patch]: update OctoAIEmbeddings to subclass OpenAIEmbeddings (#21805) 4 months ago
Eugene Yurtsev ded53297e0
core[patch]: Add unit test for RunnableGenerator for eventstream v2 (#21990)
No unit tests with runnable generator
4 months ago
Nuno Campos fb6108c8f5
core[patch]: In astream_events(version=v2) tap output of root run (#21977)
- if tap_output_iter/aiter is called multiple times for the same run
issue events only once
- if chat model run is tapped don't issue duplicate on_llm_new_token
events
- if first chunk arrives after run has ended do not emit it as a stream
event

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
4 months ago
Bagatur 72d4a8eeed
community[patch]: AzureSearch dont overwrite default async (#21989) 4 months ago
ccurme a983465694
docs: set default anthropic model (#21988)
`ChatAnthropic()` raises ValidationError.
4 months ago
ccurme 4be5537837
Revert "anthropic: set default model" (#21987)
Reverts langchain-ai/langchain#21986
4 months ago
ccurme 35439cf3bd
anthropic: set default model (#21986)
Various docs reference `ChatAnthropic()`, but this currently raises
ValidationError.
4 months ago
ccurme 0923136851
langchain: default to Runnable in MultiQueryRetriever (#21770)
- `llm_chain` becomes `Union[LLMChain, Runnable]`
- `.from_llm` creates a runnable

tested by verifying that docs/how_to/MultiQueryRetriever.ipynb runs
unchanged with sync/async invoke (and that it runs if we specifically
instantiate with LLMChain).
4 months ago
Yulong Wang 8e1aeb8ad5
community[patch]: Fix typo in arxiv tool's doc (#21970)
Fix typo in arxiv tool's doc
4 months ago
Robert Caulk 54adcd9e82
community[minor]: add AskNews retriever and AskNews tool (#21581)
We add a tool and retriever for the [AskNews](https://asknews.app)
platform with example notebooks.

The retriever can be invoked with:

```py
from langchain_community.retrievers import AskNewsRetriever

retriever = AskNewsRetriever(k=3)

retriever.invoke("impact of fed policy on the tech sector")
```

To retrieve 3 documents in then news related to fed policy impacts on
the tech sector. The included notebook also includes deeper details
about controlling filters such as category and time, as well as
including the retriever in a chain.

The tool is quite interesting, as it allows the agent to decide how to
obtain the news by forming a query and deciding how far back in time to
look for the news:

```py
from langchain_community.tools.asknews import AskNewsSearch
from langchain import hub
from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain_openai import ChatOpenAI

tool = AskNewsSearch()

instructions = """You are an assistant."""
base_prompt = hub.pull("langchain-ai/openai-functions-template")
prompt = base_prompt.partial(instructions=instructions)
llm = ChatOpenAI(temperature=0)
asknews_tool = AskNewsSearch()
tools = [asknews_tool]
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
)

agent_executor.invoke({"input": "How is the tech sector being affected by fed policy?"})
```

---------

Co-authored-by: Emre <e@emre.pm>
4 months ago
Jesse S fc79b372cb
community[minor]: add aerospike vectorstore integration (#21735)
Please let me know if you see any possible areas of improvement. I would
very much appreciate your constructive criticism if time allows.

**Description:**
- Added a aerospike vector store integration that utilizes
[Aerospike-Vector-Search](https://aerospike.com/products/vector-database-search-llm/)
add-on.
- Added both unit tests and integration tests
- Added a docker compose file for spinning up a test environment
- Added a notebook

 **Dependencies:** any dependencies required for this change
- aerospike-vector-search

 **Twitter handle:** 
- No twitter, you can use my GitHub handle or LinkedIn if you'd like

Thanks!

---------

Co-authored-by: Jesse Schumacher <jschumacher@aerospike.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Prince Canuma 3587c60396
community[patch]: Fix MLX LLM Stream (#20575)
Closes #20561

This PR fixes MLX LLM stream `AttributeError`. 

Recently, `mlx-lm` changed the token decoding logic, which affected the
LC+MLX integration.

Additionally, I made minor fixes such as: docs example broken link and
enforcing pipeline arguments (max_tokens, temp and etc) for invoke.
   
- **Issue:** #20561
    
- **Twitter handle:** @Prince_Canuma
4 months ago
Rahul Triptahi 96bd0b0844
community[patch]: Remove redundant pebblo cloud api call (#21589)
Description: removed redundant pebblo cloud api call. Changed classified
`doc` key to `ai_apps_data`.
Documentation: N/A
Unit tests: N/A
4 months ago
Param Singh d07885f8b7
community[patch]: standardized sparkllm init args (#21633)
Related to #20085 
@baskaryan 

Thank you for contributing to LangChain!

community:sparkllm[patch]: standardized init args

updated `spark_api_key` so that aliased to `api_key`. Added integration
test for `sparkllm` to test that it continues to set the same underlying
attribute.

updated temperature with Pydantic Field, added to the integration test.

Ran `make format`,`make test`, `make lint`, `make spell_check`
4 months ago
Dhruv Chawla d4359d3de6
community[patch]: Update UpTrain Callback Handler to support the new UpTrain evaluation schema (#21656)
UpTrain has a new dashboard now that makes it easier to view projects
and evaluations. Using this requires specifying both project_name and
evaluation_name when performing evaluations. I have updated the code to
support it.
4 months ago
Alex Riina c0e3c3a350
openai[patch], community[patch]: add pricing and max context window for GPT-4o (#21673)
# Add pricing and max context window for GPT-4o
- community: add cost per 1k tokens and max context window
- partners: add max context window

**Description:** adds static information about GPT-4o based on
https://openai.com/api/pricing/ and
https://platform.openai.com/docs/models/gpt-4o so that GPT-4o reporting
is accurate.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
缨缨 bd39b2ccdf
community: enable SupabaseVectorStore to support extended table fields (#21762)
Thank you for contributing to LangChain!

- [x] **PR title**: "community: enable SupabaseVectorStore to support
extended table fields"

- [x] **PR message**: 
- Added extension fields to the function _add_vectors so that users can
add other custom fields when insert a record into the database. eg:
    

![image](https://github.com/langchain-ai/langchain/assets/10885578/e1d5ca20-936e-4cab-ba69-8fdd23b8ce8f)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Leonid Ganeline e98a4fd19a
ai21[patch]: configuration fix (#21790)
added "repository" and "Source Code" parameters (these parameters are
missed only in this partner package configuration).
4 months ago
Trayan Azarov f54cbf8ff5
chroma[patch]: Chroma - remove reference to collection upon delete_collection (#21817)
**Description**:

- Reference to `Collection` object is set to `None` when deleting a
collection `delete_collection()`
- Added utility method `reset_collection()` to allow recreating the
collection
- Moved collection creation out of `__init__` into
`__ensure_collection()` to be reused by object init and
`reset_collection()`
- `_collection` is now a property to avoid breaking changes

**Issues**: 

- chroma-core/chroma#2213

**Twitter**: @t_azarov
4 months ago
Jens b0b302ec6b
community[patch]: fixed aleph alpha default emedding request (#21826)
- **Description:** In the aleph alpha client the paramater `normalize`
is *not* optional. Setting this to `None` gives an error.
- **Dependencies:** None

Co-authored-by: Jens Lücke <jens.luecke@tngtech.com>
Co-authored-by: Jens <jens.luecke@hu-berlin.de>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
Jorge Piedrahita Ortiz e6207ad4f3
community[patch]: Sambanova integration api update (#21848)
- **Description:**:
        SambaStudio generic endpoint compatibility added
        Improved error description, and handling
        streaming examples added
4 months ago
Michael Reed 7a5e1bcf99
core[patch]: Fix NPE in function_calling._get_python_function_required_args (#21863)
Example error message:
line 206, in _get_python_function_required_args
    if is_function_type and required[0] == "self":
                            ~~~~~~~~^^^
IndexError: list index out of range

Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Liuww 332ffed393
community[patch]: Adopting the lighter-weight xinference_client (#21900)
While integrating the xinference_embedding, we observed that the
downloaded dependency package is quite substantial in size. With a focus
on resource optimization and efficiency, if the project requirements are
limited to its vector processing capabilities, we recommend migrating to
the xinference_client package. This package is more streamlined,
significantly reducing the storage space requirements of the project and
maintaining a feature focus, making it particularly suitable for
scenarios that demand lightweight integration. Such an approach not only
boosts deployment efficiency but also enhances the application's
maintainability, rendering it an optimal choice for our current context.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
4 months ago
Tomaz Bratanic a43515ca65
experimental[patch]: Pass enum only to openai in llm graph transformer (#21860)
Some models like Groq return bad request if you pass in `enum` parameter
in tool definition
4 months ago
Jiří Spilka 6499897c87
community[patch]: update apify integration to attribute API activity to langchain (#21909)
**Description:** Add `Origin/langchain` to Apify's client's user-agent
to attribute API activity to LangChain (at Apify, we aim to monitor our
integrations to evaluate whether we should invest more in the LangChain
integration regarding functionality and content)

**Issue:** None
**Dependencies:** None
**Twitter handle:** None
4 months ago
Jared Van Bortel 25d1c1c9bb
nomic: implement local embeddings with the inference_mode parameter (#21934)
## Description

This PR implements local and dynamic mode in the Nomic Embed integration
using the inference_mode and device parameters. They work as documented
[here](https://docs.nomic.ai/reference/python-api/embeddings#local-inference).

<!-- If no one reviews your PR within a few days, please @-mention one
of baskaryan, efriis, eyurtsev, hwchase17. -->

---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
4 months ago
ccurme 0e72ed39a0
infra: fix CI on text-splitters (#21935) 4 months ago
ccurme 4470d3b4a0
partners: bump core in packages implementing ls_params (#21868)
These packages all import `LangSmithParams` which was released in
langchain-core==0.2.0.

N.B. we will need to release `openai` and then bump `langchain-openai`
in `together` and `upstage`.
4 months ago
ccurme 9c76739425
mistral: implement ls_params (#21867) 4 months ago
Tomaz Bratanic d85e46321a
community[patch]: Better error message for neo4j vector when text is null (#21861) 4 months ago
Stefano Lottini f2e75f9500
cli[minor]: fix import path for two Astra DB classes in the migration json data (#21926)
This PR fixes two mistakes in the import paths from community for the
json data aiding the cli migration to 0.2.

It is intended as a quick follow-up to
https://github.com/langchain-ai/langchain/pull/21913 .

@nicoloboschi FYI
4 months ago
WilliamEspegren 30bca57aae
doc list not empty (#21208)
Make sure the doc list is not empty, and set Metadata: true in param, to
enable the user to disable metadata for slightly faster crawls.
4 months ago
David Charles 8da35fba7f
langchain[minor]: add libs/partners to dev.Dockerfile (#21902)
Resolves #21886 by adding "COPY libs/partners ../partners/" to
libs/dev.Dockerfile

Twitter: @kabakongo
4 months ago
TJ 8cd6ed3e1e
community[patch]: Update documentation string in databricks chat model (#21915)
Update typos in documentation string in databricks chat model
4 months ago
Nicolò Boschi dd00aac7ad
cli[minor]: add astradb in the cli migration to 0.2 (#21913)
astradb has a new partner package but the automatic migration cli tool
doesn't take care of migration astradb integrations
4 months ago
Coozywana b6c8b6f944
Fix base.py typo (#21862)
ChatOpenaAI --> ChatOpenAI

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
4 months ago
fzowl d3624eaba1
partners: Remove unnecessary print from voyageai embeddings (#21865)
Thank you for contributing to LangChain!

Remove unnecessary print from voyageai embeddings

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
4 months ago
Bagatur 8b3c5f93f5
docs: lcel how to and cheatsheet (#21851) 4 months ago
Nuno Campos b1e7b40b6a
core: Tap output of sync iterators for astream_events (#21842)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
4 months ago
Eugene Yurtsev 67b6f6c82a
core[patch]: Check if event loop is closed in memory stream (#21841)
Check if event stream is closed in memory loop.

Using try/except here to avoid race condition, but this may incur a
small overhead in versions prios to 3.11
4 months ago
Erick Friis 2d3f4e1a16
experimental: release 0.0.59 (#21835) 4 months ago
Erick Friis 169f525cfb
community: release 0.2.0 (#21834) 4 months ago
Erick Friis e5046cbd72
langchain: release 0.2.0, fix min deps (#21833) 4 months ago
Erick Friis 1b555021f7
text-splitters: release 0.2.0 (#21832) 4 months ago
Erick Friis 0ad8de5eb7
langchain: release 0.2.0 (#21831) 4 months ago
Erick Friis 23310626b3
core: release 0.2.0 (#21829) 4 months ago
Eugene Yurtsev e3f30b4cde
docs: clean up link to bing search (#21825)
Documentation should be inlined, not linking to medium article.
4 months ago
ccurme 181dfef118
core, standard tests, partner packages: add test for model params (#21677)
1. Adds `.get_ls_params` to BaseChatModel which returns
```python
class LangSmithParams(TypedDict, total=False):
    ls_provider: str
    ls_model_name: str
    ls_model_type: Literal["chat"]
    ls_temperature: Optional[float]
    ls_max_tokens: Optional[int]
    ls_stop: Optional[List[str]]
```
by default it will only return
```python
{ls_model_type="chat", ls_stop=stop}
```

2. Add these params to inheritable metadata in
`CallbackManager.configure`

3. Implement `.get_ls_params` and populate all params for Anthropic +
all subclasses of BaseChatOpenAI

Sample trace:
https://smith.langchain.com/public/d2962673-4c83-47c7-b51e-61d07aaffb1b/r

**OpenAI**:
<img width="984" alt="Screenshot 2024-05-17 at 10 03 35 AM"
src="https://github.com/langchain-ai/langchain/assets/26529506/2ef41f74-a9df-4e0e-905d-da74fa82a910">

**Anthropic**:
<img width="978" alt="Screenshot 2024-05-17 at 10 06 07 AM"
src="https://github.com/langchain-ai/langchain/assets/26529506/39701c9f-7da5-4f1a-ab14-84e9169d63e7">

**Mistral** (and all others for which params are not yet populated):
<img width="977" alt="Screenshot 2024-05-17 at 10 08 43 AM"
src="https://github.com/langchain-ai/langchain/assets/26529506/37d7d894-fec2-4300-986f-49a5f0191b03">
4 months ago
Sen Lin eb7f07ae36
community[patch]: fix typo in ValueError message in load_local function (#21818)
**Description:**
Corrected an error in the `allow_dangerous_deserialization` message
within the `load_local` functions
4 months ago
Jorge Piedrahita Ortiz 700b1c7212
community: sambaverse api update (#21816)
- **Description:** fix sambaverse integration to make it compatible with
sambaverse API update / minor changes in docs
4 months ago
maang-h 9f8d18c028
community[patch]: Fix unintended newline in print statement in exception for BaichuanTextEmbeddings (#21820)
- **Code:** langchain_community/embeddings/baichuan.py:82
- **Description:** When I make an error using 'baichuan embeddings', the
printed error message is wrapped (there is actually no need to wrap)
```python
# example
from langchain_community.embeddings import BaichuanTextEmbeddings

# error key
BAICHUAN_API_KEY = "sk-xxxxxxxxxxxxx"
embeddings = BaichuanTextEmbeddings(baichuan_api_key=BAICHUAN_API_KEY)

text_1 = "今天天气不错"
query_result = embeddings.embed_query(text_1)
```



![unintended
newline](https://github.com/langchain-ai/langchain/assets/55082429/e1178ce8-62bb-405d-a4af-e3b28eabc158)
4 months ago
Bakar Tavadze 3b5ac44e03
langchain-robocorp[minor]: Enable passing additional headers to the action server. (#21809)
Actions can optionally receive secrets via request headers. This PR
enables this functionality.
4 months ago
Asaf Joseph Gardin f3289b898c
partners: Revert AI21 Labs docs scan feature (#21699)
Description: Reverted commit #21614

---------

Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
4 months ago
Eugene Yurtsev 8607735b80
langchain[patch],community[patch]: Move unit tests that depend on community to community (#21685) 4 months ago
Marco Lamina d0fae6cd54
community: Add token cost for GPT-4o model (#21771)
Adding [token cost for the new GPT-4o
model](https://openai.com/api/pricing/):
* Input cost US$5.00 / 1M tokens
* Output cost US$15.00 / 1M tokens
4 months ago
Massimiliano Pronesti 0c0db7c5db
feat(community): support semantic hybrid score threshold in Azure AI Search (#21527)
Support semantic hybrid search with a score threshold -- similar to what
we do for similarity search and for hybrid search (#20907).
4 months ago
Bagatur 6416d16d39
anthropic[patch]: Release 0.1.13, tool_choice support (#21773) 4 months ago
Stefano Lottini 040597e832
community: init signature revision for Cassandra LLM cache classes + small maintenance (#17765)
This PR improves on the `CassandraCache` and `CassandraSemanticCache`
classes, mainly in the constructor signature, and also introduces
several minor improvements around these classes.

### Init signature

A (sigh) breaking change is tentatively introduced to the constructor.
To me, the advantages outweigh the possible discomfort: the new syntax
places the DB-connection objects `session` and `keyspace` later in the
param list, so that they can be given a default value. This is what
enables the pattern of _not_ specifying them, provided one has
previously initialized the Cassandra connection through the versatile
utility method `cassio.init(...)`.

In this way, a much less unwieldy instantiation can be done, such as
`CassandraCache()` and `CassandraSemanticCache(embedding=xyz)`,
everything else falling back to defaults.

A downside is that, compared to the earlier signature, this might turn
out to be breaking for those doing positional instantiation. As a way to
mitigate this problem, this PR typechecks its first argument trying to
detect the legacy usage.
(And to make this point less tricky in the future, most arguments are
left to be keyword-only).

If this is considered too harsh, I'd like guidance on how to further
smoothen this transition. **Our plan is to make the pattern of optional
session/keyspace a standard across all Cassandra classes**, so that a
repeatable strategy would be ideal. A possibility would be to keep
positional arguments for legacy reasons but issue a deprecation warning
if any of them is actually used, to later remove them with 0.2 - please
advise on this point.

### Other changes

- class docstrings: enriched, completely moved to class level, added
note on `cassio.init(...)` pattern, added tiny sample usage code.
- semantic cache: revised terminology to never mention "distance" (it is
in fact a similarity!). Kept the legacy constructor param with a
deprecation warning if used.
- `llm_caching` notebook: uniform flow with the Cassandra and Astra DB
separate cases; better and Cassandra-first description; all imports made
explicit and from community where appropriate.
- cache integration tests moved to community (incl. the imported tools),
env var bugfix for `CASSANDRA_CONTACT_POINTS`.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
4 months ago
Kyle Cassidy eca8c4bcc6
Standardized openai init params (#21739)
## Patch Summary
community:openai[patch]: standardize init args

## Details
I made changes to the OpenAI Chat API wrapper test in the Langchain
open-source repository

- **File**: `libs/community/tests/unit_tests/chat_models/test_openai.py`
- **Changes**:
  - Updated `max_retries` with Pydantic Field
  - Updated the corresponding unit test
- **Related Issues**: #20085
  - Updated max_retries with Pydantic Field, updated the unit test.

---------

Co-authored-by: JuHyung Son <sonju0427@gmail.com>
4 months ago
Ethan Yang e44b448ec3
community: update openvino doc with streaming support (#21519)
Co-authored-by: Chester Curme <chester.curme@gmail.com>
4 months ago
ccurme 19e6bf814b
community: fix CI (#21766) 4 months ago
Mish Ushakov d77e60a7f4
community: updated Browserbase loader (#21757)
Thank you for contributing to LangChain!

- [x] **PR title**: "community: updated Browserbase loader"

- [x] **PR message**:
    Updates the Browserbase loader with more options and improved docs.

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
4 months ago
Eugene Yurtsev 6ed0aa3239
core[major]: only use function description (#21622)
Do not prefix function signature

---

* Reason for this is that information is already present with tool
calling models.
* This will save on tokens for those models, and makes it more obvious
what the description is!
* The @tool can get more parameters to allow a user to re-introduce the
the signature if we want
4 months ago
William FH 8498b41cda
Finish agent migration doc (#21731) 4 months ago
Cheese 0ead09f84d
community: Implement `bind_tools` for ChatTongyi (#20725)
## Description

Implement `bind_tools` in ChatTongyi. Usage example:

```py
from langchain_core.tools import tool
from langchain_community.chat_models.tongyi import ChatTongyi

@tool
def multiply(first_int: int, second_int: int) -> int:
    """Multiply two integers together."""
    return first_int * second_int

llm = ChatTongyi(model="qwen-turbo")

llm_with_tools = llm.bind_tools([multiply])

msg = llm_with_tools.invoke("What's 5 times forty two")

print(msg)
```

Streaming is also supported.

## Dependencies

No Dependency is required for this change.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
4 months ago
Bagatur 867adbf27b
docs: add aca-ds (#21746) 4 months ago
Erick Friis 06110e20b9
pinecone: bump min core version (#21742) 4 months ago
Erick Friis bd3e7d50f3
fireworks: bump min core version (#21741) 4 months ago
Erick Friis f5c31078d7
airbyte[patch]: airbyte-cdk compatible pydantic versions (#21738) 4 months ago
Erick Friis 3d33b89fa4
ibm[patch]: release 0.1.7 (#21737) 4 months ago
Erick Friis e41d801369
openai[patch]: fix embedding float precision issue (#21736)
also clean up + comment some of the embedding batching code
4 months ago
JuHyung Son 38c297a025
upstage: Support batch input in embedding request. (#21730)
**Description:** upstage embedding now supports batch input.
4 months ago
Harrison Chase 15be439719
Harrison/move flashrank rerank (#21448)
third party integration, should be in community
4 months ago
Erick Friis aca98fd150
multiple: releases with relaxed core dep (#21724) 4 months ago
Bagatur af284518bc
openai[patch]: Release 0.1.7, bump tiktoken 0.7.0 (#21723) 4 months ago
William FH ca768c8353
[Core] Check is async callable (#21714)
To permit proper coercion of objects like the following:


```python
class MyAsyncCallable:
    async def __call__(self, foo):
        return await ...

class MyAsyncGenerator:
    async def __call__(self, foo):
        await ...
        yield 
```
4 months ago
Eugene Yurtsev 5c2cfabec6
core[minor]: Add v2 implementation of astream events (#21638)
This PR introduces a v2 implementation of astream events that removes
intermediate abstractions and fixes some issues with v1 implementation.

The v2 implementation significantly reduces relevant code that's
associated with the astream events implementation together with
overhead.

After this PR, the astream events implementation:

- Uses an async callback handler
- No longer relies on BaseTracer
- No longer relies on json patch

As a result of this re-write, a number of issues were discovered with
the existing implementation.

## Changes in V2 vs. V1

### on_chat_model_end `output`

The outputs associated with `on_chat_model_end` changed depending on
whether it was within a chain or not.

As a root level runnable the output was: 

```python
"data": {"output": AIMessageChunk(content="hello world!", id='some id')}
```

As part of a chain the output was:

```
            "data": {
                "output": {
                    "generations": [
                        [
                            {
                                "generation_info": None,
                                "message": AIMessageChunk(
                                    content="hello world!", id=AnyStr()
                                ),
                                "text": "hello world!",
                                "type": "ChatGenerationChunk",
                            }
                        ]
                    ],
                    "llm_output": None,
                }
            },
```

After this PR, we will always use the simpler representation:

```python
"data": {"output": AIMessageChunk(content="hello world!", id='some id')}
```

**NOTE** Non chat models (i.e., regular LLMs) are still associated with
the more verbose format.

### Remove some `_stream` events

`on_retriever_stream` and `on_tool_stream` events were removed -- these
were not real events, but created as an artifact of implementing on top
of astream_log.

The same information is already available in the `x_on_end` events.

### Propagating Names

Names of runnables have been updated to be more consistent

```python
  model = GenericFakeChatModel(messages=infinite_cycle).configurable_fields(
        messages=ConfigurableField(
            id="messages",
            name="Messages",
            description="Messages return by the LLM",
        )
    )
```

Before:
```python
"name": "RunnableConfigurableFields",
```

After:
```python
"name": "GenericFakeChatModel",
```

### on_retriever_end

on_retriever_end will always return `output` which is a list of
documents (rather than a dict containing a key called "documents")

### Retry events

Removed the `on_retry` callback handler. It was incorrectly showing that
the failed function being retried has invoked `on_chain_end`


https://github.com/langchain-ai/langchain/pull/21638/files#diff-e512e3f84daf23029ebcceb11460f1c82056314653673e450a5831147d8cb84dL1394
4 months ago
Rajendra Kadam 54e003268e
langchain[minor]: Add PebbloRetrievalQA chain with Identity & Semantic Enforcement support (#20641)
- **Description:** PebbloRetrievalQA chain introduces identity
enforcement using vector-db metadata filtering
- **Dependencies:** None
- **Issue:** None
- **Documentation:** Adding documentation for PebbloRetrievalQA chain in
a separate PR(https://github.com/langchain-ai/langchain/pull/20746)
- **Unit tests:** New unit-tests added

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
4 months ago
Bagatur 241a6e43a5
docs: update structured how to (#21679) 4 months ago
Jib f369495fa0
mongodb: [performance] Increase DEFAULT_INSERT_BATCH_SIZE to 100,000 and introduce sizing constraints (#19608) 4 months ago
Eugene Yurtsev e69a9bedf8
core[patch]: Update mypy config (#21684)
Update mypy config to ignore checking deps from numpy and pytest (which are optional in langsmith sdk)
4 months ago
Erick Friis 9973547aef
mongodb: release 0.1.4 (#21678) 4 months ago
Jib a97473c846
mongodb[patch]: Make ObjectId JSON-serializable on generation (#21394) 4 months ago
Eugene Yurtsev 5c64c004cc
core[patch]: Add unit tests with some streaming scenarios (#21668)
Add unit tests that show differences between sync / async versions when
streaming.

The inner on_chain_chunk event is missing if mixing sync and async
functionality. Likely due to missing tap_output_iter implementation on
the sync variant of `_transform_stream_with_config`
4 months ago
Eugene Yurtsev 2ac4d2960c
core[patch]: Add unit test to catch ordering (#21669)
Add unit test to catch ordering issues
4 months ago
Zhao Blake 972d2071c6
core[patch]: Fix typo in VectorStoreExampleSelector doc-string (#21574) 4 months ago
Erick Friis 2a984e8e3f
docs: huggingface package (#21645) 4 months ago
Erick Friis c77d2f2b06
multiple: core 0.2 nonbreaking dep, check_diff community->langchain dep (#21646)
0.2 is not a breaking release for core (but it is for langchain and
community)

To keep the core+langchain+community packages in sync at 0.2, we will
relax deps throughout the ecosystem to tolerate `langchain-core` 0.2
4 months ago
Anush edd68e4ad4
qdrant: init package (#21146)
## Description

This PR introduces the new `langchain-qdrant` partner package, intending
to deprecate the community package.

## Changes

- Moved the Qdrant vector store implementation `/libs/partners/qdrant`
with integration tests.
- The conditional imports of the client library are now regular with
minor implementation improvements.
- Added a deprecation warning to
`langchain_community.vectorstores.qdrant.Qdrant`.
- Replaced references/imports from `langchain_community` with either
`langchain_core` or by moving the definitions to the `langchain_qdrant`
package itself.
- Updated the Qdrant vector store documentation to reflect the changes.

## Testing
- `QDRANT_URL` and
[`QDRANT_API_KEY`](583e36bf6b)
env values need to be set to [run integration
tests](d608c93d1f)
in the [cloud](https://cloud.qdrant.tech).
- If a Qdrant instance is running at `http://localhost:6333`, the
integration tests will use it too.
- By default, tests use an
[`in-memory`](https://github.com/qdrant/qdrant-client?tab=readme-ov-file#local-mode)
instance(Not comprehensive).

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Erick Friis <erickfriis@gmail.com>
4 months ago
Prashanth Rao 63c3a0e56c
[community][graph]: Update KuzuQAChain and docs (#21218)
This PR makes some small updates for `KuzuQAChain` for graph QA.

- Updated Cypher generation prompt (we now support `WHERE EXISTS`) and
generalize it more
- Support different LLMs for Cypher generation and QA
- Update docs and examples
4 months ago
Tomaz Bratanic 89ff6a3d3b
Add sentiment and confidence levels to diffbotgraphtransformer (#21590)
Co-authored-by: Erick Friis <erickfriis@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
4 months ago
Erick Friis 9b51ca08bc
huggingface: fix community dep checking (#21628) 4 months ago
Erick Friis 91a2ea5cd6
chroma, mongodb: fix docstrings (#21629) 4 months ago
Jofthomas afd85b60fc
huggingface: init package (#21097)
First Pr for the langchain_huggingface partner Package

- Moved some of the hugging face related class from `community` to the
new `partner package`

Still needed :
- Documentation
- Tests
- Support for the new apply_chat_template in `ChatHuggingFace`
- Confirm choice of class to support for embeddings witht he
sentence-transformer team.

cc : @efriis

---------

Co-authored-by: Cyril Kondratenko <kkn1993@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
4 months ago
Tomaz Bratanic 9fce03e7db
community[patch]: Fix neo4j enhanced schema (#21582) 4 months ago
Christophe Bornet 66a4da8ad0
community[patch]: Improve Cassandra VectorStore docsctrings (#21620) 4 months ago
adreo00 40aff1eacc
core[major]: AsyncCallbackManagerForToolRun no longer casts return object to string (#20374)
- **Description:** Stops `AsyncCallbackManagerForToolRun` from
converting the output to str
- **Issue:** #20372
- **Dependencies:** None
4 months ago
Eugene Yurtsev 25fbe356b4
community[patch]: upgrade to recent version of mypy (#21616)
This PR upgrades community to a recent version of mypy. It inserts type:
ignore on all existing failures.
4 months ago
Eugene Yurtsev b923951062
langchain[patch]: CI add lint rule for community imports (#21618)
Add a rule to check for imports from community in global scope
4 months ago
Jorge Piedrahita Ortiz 4378fbbef0
community[patch]: Fix typos in Sambanova integration doc-strings (#21617)
- **Description:** Sambanova integration docstrings updated, bad
formated

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
4 months ago
Christophe Bornet bcf53f93e1
[community]: Add missing docstring param to CassandraLoader (#21611) 4 months ago
Christophe Bornet e6fa4547b1
community[minor]: Add alazy_load to AsyncHtmlLoader (#21536)
Also fixes a bug that `_scrape` was called and was doing a second HTTP
request synchronously.

**Twitter handle:** cbornet_
4 months ago
Wang Guan b53548dcda
langchain[minor]: allow CacheBackedEmbeddings to cache queries (#20073)
Add optional caching of queries to cache backed embeddings
4 months ago
Guangdong Liu a156aace2b
core[patch]:Fix Incorrect listeners parameters for Runnable.with_listeners() and .map() (#20661)
- **Issue:** fix #20509
-  @baskaryan, @eyurtsev


![image](https://github.com/langchain-ai/langchain/assets/48236177/f799a976-b983-4d8b-b373-64392e1fd6c6)
4 months ago
junkeon 480c02bf55
upstage[minor]: add merge_and_split function for document loader (#21603)
- Introduce the `merge_and_split` function in the
`UpstageLayoutAnalysisLoader`.
- The `merge_and_split` function takes a list of documents and a
splitter as inputs.
- This function merges all documents and then divides them using the
`split_documents` method, which is a proprietary function of the
splitter.
- If the provided splitter is `None` (which is the default setting), the
function will simply merge the documents without splitting them.
4 months ago
Leonid Ganeline 500569da48
community[patch]: `vectorstores` import update (#21169)
Issue: we have several helper functions to import third-party libraries
like lancedb.import_lancedb in
[community.vectorstores](https://api.python.langchain.com/en/latest/vectorstores/langchain_community.vectorstores.lancedb.import_lancedb.html#langchain_community.vectorstores.lancedb.import_lancedb).
And we have core.utils.utils.guard_import that works exactly for this
purpose.
The import_<package> functions work inconsistently and rather be private
functions.
Change: replaced these functions with the guard_import function.

Related to #21133
4 months ago
ccurme 3003363605
langchain, community: remove cap on sqlalchemy and bump duckdb (#21509) 4 months ago
ccurme 01a3228d8e
standard tests: add test for few-shot examples (#21019) 4 months ago
Chuyuan Qu af875cff57
prompty: adding Microsoft langchain_prompty package (#21346)
Co-authored-by: Micky Liu <wayliu@microsoft.com>
Co-authored-by: wayliums <wayliums@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
Matt Florence d3ca2cc8c3
langchain: Fix broken `OpenAIModerationChain` and implement async (#18537)
Thank you for contributing to LangChain!

## PR title
lancghain[patch]: fix `OpenAIModerationChain` and implement async

## PR message
Description: fix `OpenAIModerationChain` and implement async

Issues: 
- https://github.com/langchain-ai/langchain/issues/18533 
- https://github.com/langchain-ai/langchain/issues/13685

Dependencies: none
Twitter handle: mattflo


## Add tests and docs
 
Existing documentation is broken:
https://python.langchain.com/docs/guides/safety/moderation


- [ x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Emilia Katari <emilia@outpace.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Erick Friis <erickfriis@gmail.com>
5 months ago
ccurme 4170e72a42
openai: fix loads unit test (#21542)
following changes to tests in core here:
https://github.com/langchain-ai/langchain/pull/21342/files
5 months ago
Erick Friis 3db85cbb5b
community: deps (#21508) 5 months ago
Erick Friis 8580e350be
cli: release 0.0.22 (#21507) 5 months ago
Anthony Chu c735849e76
azure-dynamic-sessions: add Python REPL tool (#21264)
Adds a Python REPL that executes code in a code interpreter session
using Azure Container Apps dynamic sessions.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
Erick Friis 02701c277f
langchain: core min version (#21506) 5 months ago
Erick Friis 13b01104c9
langchain: drop sqlalchemy max, release 0.2.0rc2 (#21504) 5 months ago
ccurme 375f447e58
community: fix builds with min dependencies (#21495) 5 months ago
Trayan Azarov ba7d53689c
community: Chroma Adding create_collection_if_not_exists flag to Chroma constructor (#21420)
- **Description:** Adds the ability to either `get_or_create` or simply
`get_collection`. This is useful when dealing with read-only Chroma
instances where users are constraint to using `get_collection`. Targeted
at Http/CloudClients mostly.
- **Issue:** chroma-core/chroma#2163
- **Dependencies:** N/A
- **Twitter handle:** `@t_azarov`




| Collection Exists | create_collection_if_not_exists | Outcome | test |

|-------------------|---------------------------------|----------------------------------------------------------------|----------------------------------------------------------|
| True | False | No errors, collection state unchanged |
`test_create_collection_if_not_exist_false_existing` |
| True | True | No errors, collection state unchanged |
`test_create_collection_if_not_exist_true_existing` |
| False | False | Error, `get_collection()` fails |
`test_create_collection_if_not_exist_false_non_existing` |
| False | True | No errors, `get_or_create_collection()` creates the
collection | `test_create_collection_if_not_exist_true_non_existing` |
5 months ago
ccurme 3bb9bec314
bedrock: add unit test for retriever (#21485)
This was implemented in
https://github.com/langchain-ai/langchain/pull/21349 but dropped before
merge.
5 months ago
Renu Rozera 4035a1d234
Add source metadata to bedrock retriever response (#21349)
Thank you for contributing to LangChain!

- [X] **PR title**: "community: Add source metadata to bedrock retriever
response"

- [X] **PR message**: 
- **Description:** Bedrock retrieve API returns extra metadata in the
response which is currently not returned in the retriever response
- **Issue:** The change adds the metadata from bedrock retrieve API
response to the bedrock retriever in a backward compatible way. Renamed
metadata to sourceMetadata as metadata term is being used in the
Document already. This is in sync with what we are doing in llama-index
as well.
    - **Dependencies:** No


- [X] **Add tests and docs**:
  1. Added unit tests
  2. Notebook already exists and does not need any change
3. Response from end to end testing, just to ensure backward
compatibility: `[Document(page_content='Exoplanets.',
metadata={'location': {'s3Location': {'uri':
's3://bucket/file_name.txt'}, 'type': 'S3'}, 'score': 0.46886647,
'source_metadata': {'x-amz-bedrock-kb-source-uri':
's3://bucket/file_name.txt', 'tag': 'space', 'team': 'Nasa', 'year':
1946.0}})]`


- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Piyush Jain <piyushjain@duck.com>
5 months ago
Erick Friis f178c67ad0
community: release 0.2.0rc1, bump deps (#21470) 5 months ago
William FH b28be5d407
Pass through Run ID Explicitly (#21469) 5 months ago
Erick Friis 83eecd54fe
experimental: 0.2 relax (#21468) 5 months ago
roiperlman 9992beaff9
community: Add arguments to whisper parser (#20378)
**Description:** Added a few additional arguments to the whisper parser,
which can be consumed by the underlying API.
The prompt is especially important to fine-tune transcriptions.

---------

Co-authored-by: Roi Perlman <roi@fivesigmalabs.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Yash cb31c3611f
Ndb enterprise (#21233)
Description: Adds NeuralDBClientVectorStore to the langchain, which is
our enterprise client.

---------

Co-authored-by: kartikTAI <129414343+kartikTAI@users.noreply.github.com>
Co-authored-by: Kartik Sarangmath <kartik@thirdai.com>
5 months ago
Oguz Vuruskaner 5b35f077f9
[community][fix](DeepInfraEmbeddings): Implement chunking for large batches (#21189)
**Description:**
This PR introduces chunking logic to the `DeepInfraEmbeddings` class to
handle large batch sizes without exceeding maximum batch size of the
backend. This enhancement ensures that embedding generation processes
large batches by breaking them down into smaller, manageable chunks,
each conforming to the maximum batch size limit.

**Issue:**
Fixes #21189

**Dependencies:**
No new dependencies introduced.
5 months ago
Sokolov Fedor f4ddf64faa
community: Add MarkdownifyTransformer to langchain_community.document_transformers (#21247)
- Added new document_transformer: MarkdonifyTransformer, that uses
`markdonify` package with customizable options to convert HTML to
Markdown. It's similar to Html2TextTransformer, but has more flexible
options and also I've noticed that sometimes MarkdownifyTransformer
performs better than html2text one, so that's why I use markdownify on
my project.
- Added docs and tests

- Usage:
```python
from langchain_community.document_transformers import MarkdownifyTransformer

markdownify = MarkdownifyTransformer()
docs_transform = markdownify.transform_documents(docs)
```

- Example of better performance on simple task, that I've noticed:
```
<html>
<head><title>Reports on product movement</title></head>
<body>
<p data-block-key="2wst7">The reports on product movement will be useful for forming supplier orders and controlling outcomes.</p>
</body>
```
**Html2TextTransformer**: 
```python
[Document(page_content='The reports on product movement will be useful for forming supplier orders and\ncontrolling outcomes.\n\n')]
# Here we can see 'and\ncontrolling', which has extra '\n' in it
```
**MarkdownifyTranformer**:
```python
[Document(page_content='Reports on product movement\n\nThe reports on product movement will be useful for forming supplier orders and controlling outcomes.')]
```

---------

Co-authored-by: Sokolov Fedor <f.sokolov@sokolov-macbook.bbrouter>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Sokolov Fedor <f.sokolov@sokolov-macbook.local>
Co-authored-by: Sokolov Fedor <f.sokolov@192.168.1.6>
5 months ago
Alex JW d3ce6aad2e
community: Instantiate GPT4AllEmbeddings with parameters (#21238)
### GPT4AllEmbeddings parameters
---

**Description:** 
As of right now the **Embed4All** class inside _GPT4AllEmbeddings_ is
instantiated as it's default which leaves no room to customize the
chosen model and it's behavior. Thus:

- GPT4AllEmbeddings can now be instantiated with custom parameters like
a different model that shall be used.

---------

Co-authored-by: AlexJauchWalser <alexander.jauch-walser@knime.com>
5 months ago
Philippe PRADOS 7be68228da
community[patch]: Make sql record manager fully compatible with async (#20735)
The `_amake_session()` method does not allow modifying the
`self.session_factory` with
anything other than `async_sessionmaker`. This prohibits advanced uses
of `index()`.

In a RAG architecture, it is necessary to import document chunks.
To keep track of the links between chunks and documents, we can use the
`index()` API.
This API proposes to use an SQL-type record manager.

In a classic use case, using `SQLRecordManager` and a vector database,
it is impossible
to guarantee the consistency of the import. Indeed, if a crash occurs
during the import
(problem with the network, ...)
there is an inconsistency between the SQL database and the vector
database.

With the
[PR](https://github.com/langchain-ai/langchain-postgres/pull/32) we are
proposing for `langchain-postgres`,
it is now possible to guarantee the consistency of the import of chunks
into
a vector database.  It's possible only if the outer session is built
with the connection.

```python
def main():
    db_url = "postgresql+psycopg://postgres:password_postgres@localhost:5432/"
    engine = create_engine(db_url, echo=True)
    embeddings = FakeEmbeddings()
    pgvector:VectorStore = PGVector(
        embeddings=embeddings,
        connection=engine,
    )

    record_manager = SQLRecordManager(
        namespace="namespace",
        engine=engine,
    )
    record_manager.create_schema()

    with engine.connect() as connection:
        session_maker = scoped_session(sessionmaker(bind=connection))
        # NOTE: Update session_factories
        record_manager.session_factory = session_maker
        pgvector.session_maker = session_maker
        with connection.begin():
            loader = CSVLoader(
                    "data/faq/faq.csv",
                    source_column="source",
                    autodetect_encoding=True,
                )
            result = index(
                source_id_key="source",
                docs_source=loader.load()[:1],
                cleanup="incremental",
                vector_store=pgvector,
                record_manager=record_manager,
            )
            print(result)
```
The same thing is possible asynchronously, but a bug in
`sql_record_manager.py`
in `_amake_session()` must first be fixed.

```python
    async def _amake_session(self) -> AsyncGenerator[AsyncSession, None]:
        """Create a session and close it after use."""

        # FIXME: REMOVE if not isinstance(self.session_factory, async_sessionmaker):~~
        if not isinstance(self.engine, AsyncEngine):
            raise AssertionError("This method is not supported for sync engines.")

        async with self.session_factory() as session:
            yield session
``` 

Then, it is possible to do the same thing asynchronously:

```python
async def main():
    db_url = "postgresql+psycopg://postgres:password_postgres@localhost:5432/"
    engine = create_async_engine(db_url, echo=True)
    embeddings = FakeEmbeddings()
    pgvector:VectorStore = PGVector(
        embeddings=embeddings,
        connection=engine,
    )
    record_manager = SQLRecordManager(
        namespace="namespace",
        engine=engine,
        async_mode=True,
    )
    await record_manager.acreate_schema()

    async with engine.connect() as connection:
        session_maker = async_scoped_session(
            async_sessionmaker(bind=connection),
            scopefunc=current_task)
        record_manager.session_factory = session_maker
        pgvector.session_maker = session_maker
        async with connection.begin():
            loader = CSVLoader(
                "data/faq/faq.csv",
                source_column="source",
                autodetect_encoding=True,
            )
            result = await aindex(
                source_id_key="source",
                docs_source=loader.load()[:1],
                cleanup="incremental",
                vector_store=pgvector,
                record_manager=record_manager,
            )
            print(result)


asyncio.run(main())
```

---------

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Sean <sean@upstage.ai>
Co-authored-by: JuHyung-Son <sonju0427@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: YISH <mokeyish@hotmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Jason_Chen <820542443@qq.com>
Co-authored-by: Joan Fontanals <joan.fontanals.martinez@jina.ai>
Co-authored-by: Pavlo Paliychuk <pavlo.paliychuk.ca@gmail.com>
Co-authored-by: fzowl <160063452+fzowl@users.noreply.github.com>
Co-authored-by: samanhappy <samanhappy@gmail.com>
Co-authored-by: Lei Zhang <zhanglei@apache.org>
Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com>
Co-authored-by: merdan <48309329+merdan-9@users.noreply.github.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Andres Algaba <andresalgaba@gmail.com>
Co-authored-by: davidefantiniIntel <115252273+davidefantiniIntel@users.noreply.github.com>
Co-authored-by: Jingpan Xiong <71321890+klaus-xiong@users.noreply.github.com>
Co-authored-by: kaka <kaka@zbyte-inc.cloud>
Co-authored-by: jingsi <jingsi@leadincloud.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Rahul Triptahi <rahul.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Shengsheng Huang <shannie.huang@gmail.com>
Co-authored-by: Michael Schock <mjschock@users.noreply.github.com>
Co-authored-by: Anish Chakraborty <anish749@users.noreply.github.com>
Co-authored-by: am-kinetica <85610855+am-kinetica@users.noreply.github.com>
Co-authored-by: Dristy Srivastava <58721149+dristysrivastava@users.noreply.github.com>
Co-authored-by: Matt <matthew.gotteiner@microsoft.com>
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
5 months ago
Andreas Motl 17e42bbd18
community[patch]: pgvector: Slight refactoring to make code a bit more reusable (#16243)
- **Description:** Improve [pgvector vector store
adapter](https://github.com/langchain-ai/langchain/blob/v0.1.1/libs/community/langchain_community/vectorstores/pgvector.py)
to make it reusable by adapters deriving from that.
  - **Issue:** NA
  - **Dependencies:** NA
  - **References:** https://github.com/crate-workbench/langchain/pull/1
  - **Addressed to:** @eyurtsev, @cbornet


Hi from the CrateDB team,

first of all, thanks a stack for conceiving and maintaining LangChain.
We are currently [preparing a
patch](https://github.com/crate-workbench/langchain/pull/1) for adding
[CrateDB](https://github.com/crate/crate) to the list of community
adapters.

Because CrateDB aims to be compatible with PostgreSQL to some degree,
the vector store subsystem in LangChain derives functionality from the
corresponding implementation for pgvector.

Therefore, in order to make the implementation more reusable, we needed
to rename the private methods `__from` and `__query_collection` to the
less private counterparts `_from` and `_query_collection`, so they can
be overwritten, in order to unlock other adapters deriving from
[pgvector](https://github.com/langchain-ai/langchain/blob/v0.1.1/libs/community/langchain_community/vectorstores/pgvector.py).

With kind regards,
Andreas.
5 months ago
Mehrdad Shokri f103927b88
bugfix(community): fix Playwright import paths. (#21395)
- **Description:** Fix import class name exporeted from
'playwright.async_api' and 'playwright.sync_api' to match the correct
name in playwright tool. Change import from inline guard_import to
helper function that calls guard_import to make code more readable in
gmail tool. Upgrade playwright version to 1.43.0
- **Issue:** #21354
- **Dependencies:** upgrade playwright version(this is not required for
the bugfix itself, just trying to keep dependencies fresh. I can remove
the playwright version upgrade if you want.)
5 months ago
Shailendra Mishra aa966b6161
Replaced bind variable in SQL with formatted string for compatibility with sql syntax. (#21439)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
5 months ago
Eugene Yurtsev f92006de3c
multiple: langchain 0.2 in master (#21191)
0.2rc 

migrations

- [x] Move memory
- [x] Move remaining retrievers
- [x] graph_qa chains
- [x] some dependency from evaluation code potentially on math utils
- [x] Move openapi chain from `langchain.chains.api.openapi` to
`langchain_community.chains.openapi`
- [x] Migrate `langchain.chains.ernie_functions` to
`langchain_community.chains.ernie_functions`
- [x] migrate `langchain/chains/llm_requests.py` to
`langchain_community.chains.llm_requests`
- [x] Moving `langchain_community.cross_enoders.base:BaseCrossEncoder`
->
`langchain_community.retrievers.document_compressors.cross_encoder:BaseCrossEncoder`
(namespace not ideal, but it needs to be moved to `langchain` to avoid
circular deps)
- [x] unit tests langchain -- add pytest.mark.community to some unit
tests that will stay in langchain
- [x] unit tests community -- move unit tests that depend on community
to community
- [x] mv integration tests that depend on community to community
- [x] mypy checks

Other todo

- [x] Make deprecation warnings not noisy (need to use warn deprecated
and check that things are implemented properly)
- [x] Update deprecation messages with timeline for code removal (likely
we actually won't be removing things until 0.4 release) -- will give
people more time to transition their code.
- [ ] Add information to deprecation warning to show users how to
migrate their code base using langchain-cli
- [ ] Remove any unnecessary requirements in langchain (e.g., is
SQLALchemy required?)

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
ccurme 6b392d6d12
robocorp: release 0.0.6 (#21441) 5 months ago
Tommi Holmgren ee35b9ba56
langchain-robocorp: remove toolkit return content max length (#21436)
Robocorp (action server) toolkit had a limitation that the content
length returned by the tool was always cut to max 5000 chars. This was
from the time when context windows were much more limited.

This PR removes the limitation. Whatever the underlying tool provides
gets sent back to the agent.

As the robocorp toolkit no longer restricts the content, the implication
is that either the Action (tool) developer or the agent developer needs
to be aware of potentially oversized tool responses. Our point of view
is this should be the agent developer's responsibility, them being in
control of the use case and aware of the context window the LLM has.
5 months ago
JuHyung Son 710e57d779
upstage: deprecate UPSTAGE_DOCUMENT_AI_API_KEY (#21363)
Description: We are merging UPSTAGE_DOCUMENT_AI_API_KEY and
UPSTAGE_API_KEY into one, and only UPSTAGE_API_KEY will be used going
forward. And we changed the base class of ChatUpstage to BaseChatOpenAI.

---------

Co-authored-by: Sean <chosh0615@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
Erick Friis 6a295d1ec0
upstage: release 0.1.4 (#21432) 5 months ago
Mateusz Szewczyk 7926cc1929
ibm: Fix llm and embeddings "verify" attribute default value (#21429)
Thank you for contributing to LangChain!

- [x] **PR title**: "langchain-ibm: Fix llm and embeddings 'verify'
attribute default value"


- [x] **PR message**: 
    - **Description:** fix default value of "verify" attribute
    - **Dependencies:** `ibm_watsonx_ai`


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
Dobiichi-Origami 5b00885b49
community: add `bind_tools` and `with_structured_output` support to `QianfanChatEndpoint` (#21412)
…Endpoint`

Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** add `bind_tools` and `with_structured_output` support
to `QianfanChatEndpoint`


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
5 months ago
Silas Xu aafaf3e193
The predict_and_parse is deprecated, instead pass an output parser directly to LLMChain. (#20130)
The `predict_and_parse` method is deprecated, instead pass an output
parser directly to LLMChain.

- [x] **PR title**: "langchain: update chain_extract.py"


![image](https://github.com/langchain-ai/langchain/assets/40889019/e950d79f-5a0f-4086-86e9-89f627990fe5)

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
5 months ago
ccurme 3c31bd0ed0
langchain: update use of predict_and_parse in LLMChainFilter (#21389)
Following https://github.com/langchain-ai/langchain/pull/20130

Removes deprecation warnings in docs here:
https://python.langchain.com/docs/modules/data_connection/retrievers/contextual_compression/

Tested using the same docs notebook + existing integration test.
5 months ago
Erick Friis bbdf0f8801
experimental[patch]: core and langchain dep (#21402) 5 months ago
Erick Friis e4aca0d052
experimental[patch]: release 0.0.58 (#21397) 5 months ago
Leonid Ganeline 791d59a2c8
community: `callbacks` guard_imports (#21173)
Issue: we have several helper functions to import third-party libraries
like import_uptrain in
[community.callbacks](https://api.python.langchain.com/en/latest/callbacks/langchain_community.callbacks.uptrain_callback.import_uptrain.html#langchain_community.callbacks.uptrain_callback.import_uptrain).
And we have core.utils.utils.guard_import that works exactly for this
purpose.
The import_<package> functions work inconsistently and rather be private
functions.
Change: replaced these functions with the guard_import function.

Related to #21133
5 months ago
Rahul Triptahi 7994cba18d
[Community][Minor]: Fetch loader_source of GoogleDriveLoader in PebbloSafeLoader. (#21314)
Description: This PR includes fix for loader_source to be fetched from
metadata in case of GdriveLoaders.
Documentation: NA
Unit Test: NA

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
5 months ago
Nuno Campos ad0f3c14c2
core: allow mermaid node labels to have any characters (#21385)
- it's only node ids that are limited

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
5 months ago
Eugene Yurtsev 6a1d61dbf1
community[patch]: Fix in memory vectorstore to take into account ids when adding docs (#21384)
Should respect `ids` if passed
5 months ago
Miroslav 04e2611fea
Added additional headers for HuggingFaceInferenceAPIEmbeddings endpoint. (#21282)
Thank you for contributing to LangChain!

- [ ] **HuggingFaceInferenceAPIEmbeddings**: "Additional Headers"
  - Where: langchain, community, embeddings. huggingface.py.
- Community: add additional headers when needed by custom HuggingFace
TEI embedding endpoints. HuggingFaceInferenceAPIEmbeddings"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Adding the `additional_headers` to be passed to
requests library if needed
    - **Dependencies:** none
 

- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. Tested with locally available TEI endpoints with and without
`additional_headers`
  2. Example  Usage
  
```python
embeddings=HuggingFaceInferenceAPIEmbeddings(
                             api_key=MY_CUSTOM_API_KEY,
                             api_url=MY_CUSTOM_TEI_URL,
                             additional_headers={
                                "Content-Type": "application/json"
                               }
)
```

 

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Massimiliano Pronesti <massimiliano.pronesti@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
5 months ago
Guangdong Liu 1fe66f5d39
community(patch) fix MoonshotChat moonshot_api_key is invaild for api key (#21361)
Description: close
https://github.com/langchain-ai/langchain/issues/21237
@baskaryan, @eyurtsev
5 months ago
Tomaz Bratanic 0bf7596839
Add simple node properties to llm graph transformer (#21369)
Add support for simple node properties in llm graph transformer.

Linter and dynamic pydantic classes aren't friends, hence I added two
ignores
5 months ago
ccurme 080af0ec53
langchain: sync -> async methods in OpenAI assistants (#21378) 5 months ago
Tomaz Bratanic ad3fd44a7f
experimental: Fix llm graph transformer bug (#21362) 5 months ago
Erick Friis bb81ae5c8c
together: fix chat model and embedding classes (#21353) 5 months ago
Hassan El Mghari d6ef5fe86a
together: add chat models, use openai base (#21337)
**Description:** Adding chat completions to the Together AI package,
which is our most popular API. Also staying backwards compatible with
the old API so folks can continue to use the completions API as well.
Also moved the embedding API to use the OpenAI library to standardize it
further.

**Twitter handle:** @nutlope

- [x] **Add tests and docs**: If you're adding a new integration, please
include
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
Jacob Lee a2d31307bb
Adds confirmation logs after creating a new project (#12618)
@efriis @hwchase17

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
Erick Friis 0fb93cd740
core: release 0.1.52 (#21350) 5 months ago
Wu Enze 32c61b3ece
community[patch]: chat message history mypy fixes #17048 (#20114)
Relates [#17048]
Description : Applied fix to redis and neo4j file.

Error was : `Cannot override writeable attribute with read-only
property`

fix with the same solution of
[[langchain/libs/community/langchain_community/chat_message_histories/elasticsearch.py](d5c412b0a9/libs/community/langchain_community/chat_message_histories/elasticsearch.py (L170-L175))]

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
5 months ago
nrpd25 95cc8e3fc3
premai[patch]:Standardized model init args (#21308)
[Standardized model init args
#20085](https://github.com/langchain-ai/langchain/issues/20085)
- Enable premai chat model to be initialized with `model_name` as an
alias for `model`, `api_key` as an alias for `premai_api_key`.
- Add initialization test `test_premai_initialization`

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
5 months ago
Nuno Campos 6f17158606
fix: core: Include in json output also fields set outside the constructor (#21342) 5 months ago
Tomaz Bratanic ac14f171ac
Add indexed properties to neo4j enhanced schema (#21335) 5 months ago
scaserini a6cdf6572f
community: add Kendra DocumentRelevanceOverrideConfigurations request parameter (#20695)
- **Description:** add **DocumentRelevanceOverrideConfigurations**
request parameter to Kendra retriever

Co-authored-by: Simone Caserini <simone.caserini@klarna.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
Nuno Campos 0345bcf4ef
Fix failing test for serialization (#21344)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
5 months ago
Trayan Azarov 93226b1945
community: Updated Chroma version range to include 0.5.0 release (#21224)
- Updated Chroma version range to allow releases in 0.5.x.
- Bumped mypy version as linting was failing
5 months ago
Jorge Piedrahita Ortiz e65652c3e8
community: add SambaNova embeddings integration (#21227)
- **Description:**  SambaNova hosted embeddings integration
5 months ago
Jorge Piedrahita Ortiz df1c10260c
community: minor changes sambanova integration (#21231)
- **Description:** fix: variable names in root validator not allowing
pass credentials as named parameters in llm instancing, also added
sambanova's sambaverse and sambastudio llms to __init__.py for module
import
5 months ago
Jan Soubusta d9a61c0fa9
fix: respect table_name argument when calling from_texts (#21252)
valid for from_documents() as well

fixes #21251
5 months ago
Pedro Lima bebf46c4a2
community: added args_schema to YahooFinanceNewsTool (#21232)
Description: this change adds args_schema (pydantic BaseModel) to
YahooFinanceNewsTool for correct schema formatting on LLM function calls

Issue: currently using YahooFinanceNewsTool with OpenAI function calling
returns the following error "TypeError("YahooFinanceNewsTool._run() got
an unexpected keyword argument '__arg1'")". This happens because the
schema sent to the LLM is "input: "{'__arg1': 'MSFT'}"" while the method
should be called with the "query" parameter.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
Mark Cusack 060987d755
community[minor]: Add indexing via locality sensitive hashing to the Yellowbrick vector store (#20856)
- **Description:** Add LSH-based indexing to the Yellowbrick vector
store module
- **Twitter handle:** @markcusack

---------

Co-authored-by: markcusack <markcusack@markcusacksmac.lan>
Co-authored-by: markcusack <markcusack@Mark-Cusack-sMac.local>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
5 months ago
Rashmi Pawar a2fdabdad2
mark NemoEmbeddings as deprecated (#21239)
The NemoEmbeddings is deprecated, instead use
langchain-nvidia-ai-endpoints NVIDIAEmbeddings interface.

cc: @mattf

---------

Co-authored-by: Daniel Glogowski <167348611+dglogo@users.noreply.github.com>
Co-authored-by: andyjessen <62343929+andyjessen@users.noreply.github.com>
Co-authored-by: Chris Germann <88305668+TAAGECH9@users.noreply.github.com>
Co-authored-by: gere <gere@kapo.zh.ch>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
5 months ago
Erick Friis 9e4b24a2d6
langchain: release 0.1.18 (#21338) 5 months ago
Erick Friis 5c000f8d79
community: release 0.0.37 (#21332) 5 months ago
Leonid Ganeline 8c13e8a79b
langchain: `qa_chain` fix (#21279)
Issue: `load_qa_chain` is placed in the __init__.py file. As a result,
it is not listed in the API Reference docs.
BTW `load_qa_chain` is heavily presented in the doc examples, but is
missed in API Ref.
Change: moved code from init.py into a new file. Related: #21266
5 months ago
Erick Friis 7ecf9996f1
community: Revert "community: langkit dependency" (#21333)
Reverts langchain-ai/langchain#21174

Hey team - going to revert this because it doesn't seem necessary for
testing. We should only be adding optional + extended_testing
dependencies for deps that have extended tests.

otherwise it just increases probability of dependency conflicts in the
community lockfile.
5 months ago
Param Singh fee91d43b7
baichuan[patch]:standardize chat init args (#21298)
Thank you for contributing to LangChain!

community:baichuan[patch]: standardize init args

updated `baichuan_api_key` so that aliased to `api_key`. Added test that
it continues to set the same underlying attribute. Test checks for
`SecretStr`

updated `temperature` with Pydantic Field, added unit test. 

Related to https://github.com/langchain-ai/langchain/issues/20085
5 months ago
Christophe Bornet 484a009012
community[minor]: Relax constraints on Cassandra VectorStore constructors (#21209)
If Session and/or keyspace are not provided, they are resolved from
cassio's context. So they are not required.
This change is fully backward compatible.
5 months ago
Leonid Ganeline 6feddfae88
community: langkit dependency (#21174)
Issue: the `langkit` package is not presented in the `pyproject.toml`
but it is a requirement for the `WhyLabsCallbackHandler`
Change: added `langkit`

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
5 months ago
Erick Friis 811e9cee8b
core: release 0.1.51 (#21328) 5 months ago
Mateusz Szewczyk 682d21c3de
ibm: Add support for ibm-watsonx-ai new major version (#21313)
Thank you for contributing to LangChain!

- [x] **PR title**: "langchain-ibm: Add support for ibm-watsonx-ai new
major version"


- [x] **PR message**: 
    - **Description:** Add support for ibm-watsonx-ai new major version
    - **Dependencies:** `ibm_watsonx_ai`


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
Chris Papademetrious ee6c922c91
langchain[minor]: enhance `LocalFileStore` to offer `update_atime` parameter that updates access times on read (#20951)
**Description:**
The `LocalFileStore` class can be used to create an on-disk
`CacheBackedEmbeddings` cache. The number of files in these embeddings
caches can grow to be quite large over time (hundreds of thousands) as
embeddings are computed for new versions of content, but the embeddings
for old/deprecated content are not removed.

A *least-recently-used* (LRU) cache policy could be applied to the
`LocalFileStore` directory to delete cache entries that have not been
referenced for some time:

```bash
# delete files that have not been accessed in the last 90 days
find embeddings_cache_dir/ -atime 90 -print0 | xargs -0 rm
```

However, most filesystems in enterprise environments disable access time
modification on read to improve performance. As a result, the access
times of these cache entry files are not updated when their values are
read.

To resolve this, this pull request updates the `LocalFileStore`
constructor to offer an `update_atime` parameter that causes access
times to be updated when a cache entry is read.

For example,

```python
file_store = LocalFileStore(temp_dir, update_atime=True)
```

The default is `False`, which retains the original behavior.

**Testing:**
I updated the LocalFileStore unit tests to test the access time update.
5 months ago
Tomaz Bratanic 5b6d1a907d
Add the extract types to diffbot graph transformer (#21315)
Before you could only extract triples (diffbot calls it facts) from
diffbot to avoid isolated nodes. However, sometimes isolated nodes can
still be useful like for prefiltering, so we want to allow users to
extract them if they want. Default behaviour is unchanged.
5 months ago
aditya thomas b868c78a12
partners[anthropic]: update unit test for key passed in from the environment (#21290)
**Description:** Update unit test for ChatAnthropic
**Issue:** Test for key passed in from the environment should not have
the key initialized in the constructor
**Dependencies:** None
5 months ago
Rohan Aggarwal 8021d2a2ab
community[minor]: Oraclevs integration (#21123)
Thank you for contributing to LangChain!

- Oracle AI Vector Search 
Oracle AI Vector Search is designed for Artificial Intelligence (AI)
workloads that allows you to query data based on semantics, rather than
keywords. One of the biggest benefit of Oracle AI Vector Search is that
semantic search on unstructured data can be combined with relational
search on business data in one single system. This is not only powerful
but also significantly more effective because you don't need to add a
specialized vector database, eliminating the pain of data fragmentation
between multiple systems.


- Oracle AI Vector Search is designed for Artificial Intelligence (AI)
workloads that allows you to query data based on semantics, rather than
keywords. One of the biggest benefit of Oracle AI Vector Search is that
semantic search on unstructured data can be combined with relational
search on business data in one single system. This is not only powerful
but also significantly more effective because you don't need to add a
specialized vector database, eliminating the pain of data fragmentation
between multiple systems.
This Pull Requests Adds the following functionalities
Oracle AI Vector Search : Vector Store
Oracle AI Vector Search : Document Loader
Oracle AI Vector Search : Document Splitter
Oracle AI Vector Search : Summary
Oracle AI Vector Search : Oracle Embeddings


- We have added unit tests and have our own local unit test suite which
verifies all the code is correct. We have made sure to add guides for
each of the components and one end to end guide that shows how the
entire thing runs.


- We have made sure that make format and make lint run clean.

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: skmishraoracle <shailendra.mishra@oracle.com>
Co-authored-by: hroyofc <harichandan.roy@oracle.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
ccurme c9e9470c5a
langchain: fix deprecation decorators on extraction chains (#21276)
Calling any of these raises
```
ValueError: A pending deprecation cannot have a scheduled removal
```
5 months ago
Wickes Wong ee1adaacaa
langchain[patch]: Fix summary buffer memory with return message flag (#21115)
## Description
Memory return could be set as `str` or `message` by `return_messages`
flag as mentioned in
https://python.langchain.com/docs/modules/memory/#whether-memory-is-a-string-or-a-list-of-messages,
where
`langchain.chains.conversation.memory.ConversationSummaryBufferMemory`
did not implement that.
This commit added `buffer_as_str` and `buffer_as_messages` function, and
`buffer` now affected by `return_messages` flag.

## Example Test Code and Output

```python
# Fix: ConversationSummaryBufferMemory with return_messages flag function
# Test code
from langchain.chains.conversation.memory import ConversationSummaryBufferMemory
from langchain_community.llms.ollama import Ollama

llm = Ollama()

# Create an instance of ConversationSummaryBufferMemory with return_messages set to True
memory = ConversationSummaryBufferMemory(return_messages=True, llm=llm)

# Add user and AI messages to the chat memory
memory.chat_memory.add_user_message("hi!")
memory.chat_memory.add_ai_message("what's up?")

# Print the buffer
print("Buffer:")
print(*map(type, memory.buffer), sep="\n")
print(memory.buffer, "\n")

# Print the buffer as a string
print("Buffer as String:")
print(type(memory.buffer_as_str))
print(memory.buffer_as_str, "\n")

# Print the buffer as messages
print("Buffer as Messages:")
print(*map(type, memory.buffer_as_messages), sep="\n")
print(memory.buffer_as_messages, "\n")

# Print the buffer after setting return_messages to False
memory.return_messages = False
print("Buffer after setting return_messages to False:")
print(type(memory.buffer))
print(memory.buffer, "\n")
```

```plaintext
Buffer:
<class 'langchain_core.messages.human.HumanMessage'>
<class 'langchain_core.messages.ai.AIMessage'>
[HumanMessage(content='hi!'), AIMessage(content="what's up?")] 

Buffer as String:
<class 'str'>
Human: hi!
AI: what's up? 

Buffer as Messages:
<class 'langchain_core.messages.human.HumanMessage'>
<class 'langchain_core.messages.ai.AIMessage'>
[HumanMessage(content='hi!'), AIMessage(content="what's up?")] 

Buffer after setting return_messages to False:
<class 'str'>
Human: hi!
AI: what's up? 
```

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
Leonid Ganeline 9639457222
community[patch]: `tools` imports (#21156)
Issue: we have several helper functions to import third-party libraries
like tools.gmail.utils.import_google in
[community.tools](https://api.python.langchain.com/en/latest/community_api_reference.html#id37).
And we have core.utils.utils.guard_import that works exactly for this
purpose.
The import_<package> functions work inconsistently and rather be private
functions.
Change: replaced these functions with the guard_import function.

Related to #21133
5 months ago
Leonid Ganeline 3ef8b24277
core[patch]: `utils.guard_import` fix (#21133)
Issues (nit): 
1. `utils.guard_import` prints wrong error message when there is an
import `error.` It prints the whole `module_name` but should be only the
first part as the pip package name. E.i. `langchain_core.utils` -> print
not `langchain-core` but `langchain_core.utils`. Also replace '_' with
'-' in the pip package name.
2. it does not handle the `ModuleNotFoundError` which raised if
`guard_import("wrong_module")`

Fixed issues; added ut-s. Controversial: I've reraised
`ModuleNotFoundError` as `ImportError`, since in case of the error, the
proposed action is the same - we need to install a missed package.
5 months ago
Erick Friis 36c2ca3c8b
mistralai: relax tokenizers dep (#21277) 5 months ago
Nuno Campos 6e1e0c7d5c
fix: core: draw_mermaid() would create subgroup for edges with same src and tgt (#21275)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
5 months ago
Eugene Yurtsev 26a37dce0a
langchain[patch]: Remove jsonpatch from poetry file (#21272)
jsonpatch is only used in langchain-core not in langchain
5 months ago
Eugene Yurtsev 335bd01e45
langchain[patch]: Update deprecation warning (#21268)
Update deprecation warning
5 months ago
Leonid Ganeline 23a05c3986
langchain: `summarize` chain fix (#21266)
Issue: `load_summarize_chain` is placed in the __init__.py file. As a
result, it doesn't listed in the API Reference docs.
Change: moved code from __init__.py into a new file.
5 months ago
ccurme 6da3d92b42
(all): update removal in deprecation warnings from 0.2 to 0.3 (#21265)
We are pushing out the removal of these to 0.3.

`find . -type f -name "*.py" -exec sed -i ''
's/removal="0\.2/removal="0.3/g' {} +`
5 months ago
Eugene Yurtsev d6e34f9ee5
langchain[patch]: Improve deprecation warnings (#21262)
* Remove spurious derprecation warning
* Make deprecation warnings consistent with 0.1 namespaces that were announced as deprecated
5 months ago
Eugene Yurtsev 487aff7e46
langchain[patch]: Revert 20794 until 0.2 release (#21257)
PR of 2079 was already released as part of 0.1.17rc.


Issue for 0.2 release:
https://github.com/langchain-ai/langchain/issues/21080
5 months ago
Eugene Yurtsev ba4a309d98
langchain[patch]: Revert breaking change until 0.2 release (#21256)
Reverts a minor breaking change until 0.2 release
5 months ago
Eugene Yurtsev 66a1e3f083
langchain[patch]: Fix flaky unit test (#21258)
Should sort the results of the import test since it depends on import order
5 months ago
Eugene Yurtsev 0989c48028
langchain[minor]: Re-add deleted ainetwork tool (#21254)
* Adding __init__.py to turn it into a package in community
* Adding proxy imports that assume that langchain_community is optional
5 months ago
Christophe Bornet 2fbe82f5e6
community[minor]: Relax constraints on CassandraChatMessageHistory constructor (#21241) 5 months ago
Chris Germann 3a8d1d8838
Hotfix RetrievalQA Docs: docs: Fix formatting (#21183)
# Newline Characters breaking formatting 

**Description**: 
As you can see in the image below, the formatting in the documentation
is broken. As far as I can see the two added `\n` characters are
breaking the documentation. Therefore I would propose to remove those

![image](https://github.com/langchain-ai/langchain/assets/88305668/23b6e726-71b2-4812-91ea-3e8600683733)

**Dependencies**:
None

**Twitter Handle**
- epu9byj

---------

Co-authored-by: gere <gere@kapo.zh.ch>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
5 months ago
Bagatur 67a5cc34c6
openai[patch]: Release 0.1.6 (#21236) 5 months ago
Erick Friis c1eb95b967
core: release 0.1.50 (#21230) 5 months ago
Nuno Campos 47ce8d5a57
core: tracer: remove numeric execution order (#21220)
- this hasn't been used in a long time and requires some additional
bookkeeping i'm going to streamline in the next pr
5 months ago
Bagatur 6ac6158a07
openai[patch]: support tool_choice="required" (#21216)
Co-authored-by: ccurme <chester.curme@gmail.com>
5 months ago
xindoo c1aa237bc2
langchain: fix syntax error in code comment for create_tool_calling_agent (#21205)
**PR message**:
- **Description:** Corrected a syntax error in the code comments within
the `create_tool_calling_agent` function in the langchain package.
- **Issue:** N/A
- **Dependencies:** No additional dependencies required.
- **Twitter handle:** N/A
5 months ago
ccurme eb0a2fd53a
mistral: release 0.1.6 (#21214) 5 months ago
ccurme 2d77e5e3a1
(standard tests): add test for basic conversation sequence (#21213) 5 months ago
Maxime Perrin 1ebb5a70ad
partners(mistralai): Removing unused variable in completion request (using tool_calls or content) (#21201)
This PR fixes #21196.

The error was occurring when calling chat completion API with a chat
history. Indeed, the Mistral API does not accept both `content` and
`tool_calls` in the same body.

This PR removes one of theses variables depending on the necessity.

---------

Co-authored-by: Maxime Perrin <mperrin@doing.fr>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
5 months ago
Christophe Bornet 683fb45c6b
community[patch]: Refactor CassandraDatabase wrapper (#21075)
* Introduce individual `fetch_` methods for easier typing.
* Rework some docstrings to google style
* Move some logic to the tool
* Merge the 2 cassandra utility files
5 months ago
Raghav Dixit 7d451d0041
community[patch]: Update lancedb.py (#21192)
very minor update in LanceDB integration, 'metric' argument was missing.
5 months ago
Bagatur d297d90ad9
core[patch]: Release 0.1.49 (#21211) 5 months ago
Nuno Campos 663747b730
core[patch]: Fixes for convert_messages (#21207)
- support two-tuples of any sequence type (eg. json.loads never produces
tuples)
- support type alias for role key
- if id is passed in in dict form use it
- if tool_calls passed in in dict form use them

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Eugene Yurtsev df49404794
langchain[patch]: Make more memory code handle community dependency as optional (#21199) 5 months ago
ccurme bd5d2c2674
langchain: import InMemoryChatMessageHistory from core (#21198) 5 months ago
Eugene Yurtsev 3cd7fced5f
langchain[patch],community[minor]: Migrate memory implementations to community (#20845)
Migrates memory implementations to community
5 months ago
Eugene Yurtsev b5c3a04e4b
langchain[patch]: chat histories to handle optional community dependence (#21194) 5 months ago
Eugene Yurtsev c9119b0e75
langchain[patch],community[minor]: Move some unit tests from langchain to community, use core for fake models (#21190) 5 months ago
Eugene Yurtsev c306364b06
langchain[patch]: Update more code to use langchain community as an optional dependency (#21170)
More code to use langchain community as an optional dependency
5 months ago
Bagatur 6fa8626e2f
openai[patch]: fix azure open lc serialization, release 0.1.5 (#21159) 5 months ago
Eugene Yurtsev 94a838740e
langchain[patch]: Migrate more code in utils to use optional langchain import (#21166)
Moving is interactive util to avoid circular deps
5 months ago
Eugene Yurtsev 23fdd320bc
langchain[patch]: Migrate more code to use optional community in agents namespace (#21167) 5 months ago
Tomaz Bratanic 9e53fa7d2e
Some more fixes to neo4j enhanced schema (#21139) 5 months ago
Erick Friis 0694538c39
ai21: fix core version (#21168) 5 months ago
Eugene Yurtsev 44602bdc20
langchain[patch],community[minor]: Move load_tools to community (#21158)
Move load tools to community
5 months ago
Eugene Yurtsev 9932f49b3e
langchain[patch]: Migrate llms to use optional community imports (#21101) 5 months ago
Eugene Yurtsev 57e8e70daa
langchain[patch]: Migrate chat models to optional community imports (#21090)
Migrate chat models to optional community imports
5 months ago
Eugene Yurtsev 2914abd747
langchain[patch]: Fix how the serializable test identifies serializable objects (#21165)
dir() will not work if we're using optional imports. The only way to do this is by using contents of __all__
5 months ago
Eugene Yurtsev 23c5d87311
langchain[patch]: Migrate utils to use optional langchain_community (#21163)
Migrate utils to use optional imports from langchain community
5 months ago
Eugene Yurtsev bec3eee3fa
langchain[patch]: Migrate retrievers to use optional langchain community imports (#21155) 5 months ago
Eugene Yurtsev 43110daea5
langchain[patch]: Update some agent tool kits to handle community import as optional (#21157)
A few things that were not caught by the migration script
5 months ago
Eugene Yurtsev 59f10ab3e0
langchain[patch]: Migrate embeddings to optional imports (#21099) 5 months ago
Eugene Yurtsev 2f709d94d7
langchain[patch]: Migrate vectorstores to use optional langchain community imports (#21150) 5 months ago
Eugene Yurtsev 7230e430db
langchain[patch]: Migrate top level files to use optional langchain community (#21152)
Migrate a few top level files to treat langchain community as an optional dependency
5 months ago
Erick Friis daab9789a8
ai21: release 0.1.4 (#21151) 5 months ago
Asaf Joseph Gardin 642975dd9f
partners: AI21 Labs Jamba Support (#20815)
Description: Added support for AI21 new model - Jamba
Twitter handle: https://github.com/AI21Labs

---------

Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
Eugene Yurtsev 7a39fe60da
langchain[patch]: Migrate utilities to handle langchain community as optional (#21149) 5 months ago
Eugene Yurtsev b879184595
langchain[patch]: embedddings distance move import of openai embeddings into local scope (#21148) 5 months ago
Eugene Yurtsev 0e5bf16d00
langchain[patch]: Migrate document loaders to use optional langchain community imports (#21095) 5 months ago
Harrison Chase 4d1c21d97d
community[patch]: Fix alternative name in deprecation notice for sql_database (#21144) 5 months ago
East Agile 2a6f78a53f
community[minor]: Rememberizer retriever (#20052)
**Description:**
This pull request introduces a new feature for LangChain: the
integration with the Rememberizer API through a custom retriever.
This enables LangChain applications to allow users to load and sync
their data from Dropbox, Google Drive, Slack, their hard drive into a
vector database that LangChain can query. Queries involve sending text
chunks generated within LangChain and retrieving a collection of
semantically relevant user data for inclusion in LLM prompts.
User knowledge dramatically improved AI applications.
The Rememberizer integration will also allow users to access general
purpose vectorized data such as Reddit channel discussions and US
patents.

**Issue:**
N/A

**Dependencies:**
N/A

**Twitter handle:**
https://twitter.com/Rememberizer
5 months ago
Eugene Yurtsev 1ce1a10f2b
langchain[patch],community[minor]: Move graph index creator (#20795)
Move graph index creator to community
5 months ago
Eugene Yurtsev aa0bc7467c
langchain[patch]: Migrate agents module into optional imports for community (#21088) 5 months ago
Eugene Yurtsev 86ff8a3fb4
langchain[patch]: Update docstore module to use optional imports from community (#21091) 5 months ago
Eugene Yurtsev d640605694
langchain[patch]: Migrate chat loaders to optional community imports (#21089)
Migrate chat loaders to optional community imports
5 months ago
Eugene Yurtsev 2fcab9acd9
langchain[patch]: Upgrade storage to treat langchain community as optional (#21105) 5 months ago
William FH ab55f6996d
[Core] Tracing: update parent run_tree's child_runs (#21049) 5 months ago
aditya thomas 12b1caf295
openai[patch]: add tests for secret_str for keys (#20982)
**Description:** Add tests to check API keys and Active Directory tokens
are masked
**Issue:** Resolves #12165 for OpenAI and Azure OpenAI models
**Dependencies:** None

Also resolves #12473 which may be closed.

Additional contributors @alex4321 (#12473) and @onesolpark (#12542)
5 months ago
Noah 45ddf4d26f
community[patch]: Update comments for lazy_load method (#21063)
- [ ] **PR message**: 
- **Description:** Refactored the lazy_load method to use asynchronous
execution for improved performance. The method now initiates scraping of
all URLs simultaneously using asyncio.gather, enhancing data fetching
efficiency. Each Document object is yielded immediately once its content
becomes available, streamlining the entire process.
    - **Issue:** N/A
- **Dependencies:** Requires the asyncio library for handling
asynchronous tasks, which should already be part of standard Python
libraries in Python 3.7 and above.
    - **Email:** [r73327118@gmail.com](mailto:r73327118@gmail.com)

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Liu Xiaodong 3b473d10f2
experimental: clean python repl input(experimental:Added code for PythonREPL) (#20930)
Update python.py(experimental:Added code for PythonREPL)

Added code for PythonREPL, defining a static method 'sanitize_input'
that takes the string 'query' as input and returns a sanitizing string.
The purpose of this method is to remove unwanted characters from the
input string, Specifically:

1. Delete the whitespace at the beginning and end of the string (' \s').
2. Remove the quotation marks (`` ` ``) at the beginning and end of the
string.
3. Remove the keyword "python" at the beginning of the string (case
insensitive) because the user may have typed it.

This method uses regular expressions (regex) to implement sanitizing.

It all started with this code:
from langchain.agents import Tool
from langchain_experimental.utilities import PythonREPL

python_repl = PythonREPL()
repl_tool = Tool(
    name="python_repl",
description="Remove redundant formatting marks at the beginning and end
of source code from input.Use a Python shell to execute python commands.
If you want to see the output of a value, you should print it out with
`print(...)`.",
    func=python_repl.run,
)

When I call the agent to write a piece of code for me and execute it
with the defined code, I must get an error: SyntaxError('invalid
syntax', ('<string>', 1, 1,'In', 1, 2))

After checking, I found that pythonREPL has less formatting of input
code than the soon-to-be deprecated pythonREPL tool, so I added this
step to it, so that no matter what code I ask the agent to write for me,
it can be executed smoothly and get the output result.
I have tried modifying the prompt words to solve this problem before,
but it did not work, and by adding a simple format check, the problem is
well resolved.
<img width="1271" alt="image"
src="https://github.com/langchain-ai/langchain/assets/164149097/c49a685f-d246-4b11-b655-fd952fc2f04c">

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Ismail Hossain Polas 1fdf63fa6c
community[patch]: update package name to bagelML (#19948)
**Description**
This pull request updates the Bagel Network package name from
"betabageldb" to "bagelML" to align with the latest changes made by the
Bagel Network team.

The following modifications have been made:

- Updated all references to the old package name ("betabageldb") with
the new package name ("bagelML") throughout the codebase.
- Modified the documentation, and any relevant scripts to reflect the
package name change.
- Tested the changes to ensure that the functionality remains intact and
no breaking changes were introduced.

By merging this pull request, our project will stay up to date with the
latest Bagel Network package naming convention, ensuring compatibility
and smooth integration with their updated library.

Please review the changes and provide any feedback or suggestions. Thank
you!
5 months ago
Tomaz Bratanic 7860e4c649
experimental[patch]: Add support for non-function calling LLMs in llm graph transformers (#21014) 5 months ago
tianzedavid 5a8909440b
docs: remove repetitive words (#21058)
remove repetitive words

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
Tomaz Bratanic c9e96bb5e2
community[patch]: Fix neo4j enhanced schema bugs (#21072) 5 months ago
junkeon 8d2909ee25
upstage[minor]: Update few codes and add upstage loader in pdf section (#21085)
**Description:** Update UpstageLayoutAnalysisParser and Loader and add
upstage loader example in pdf section
**Dependencies:** langchain_community
**Twitter handle:** [@upstageai](https://twitter.com/upstageai)

- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
5 months ago
Bagatur bef50ded63
openai[patch]: fix special token default behavior (#21131)
By default handle special sequences as regular text
5 months ago
MacanPN 0f7f448603
community[patch]: add delete() method to AzureSearch vector store (#21127)
**Issue:**
Currently `AzureSearch` vector store does not implement `delete` method.
This PR implements it. This also makes it compatible with LangChain
indexer.

**Dependencies:**
None

**Twitter handle:**
@martintriska1

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Erick Friis 14422a4220
langchain: fix core dep (#21128) 5 months ago
Erick Friis 6c938da302
langchain: release 0.1.17 (#21125) 5 months ago
Eugene Yurtsev bf95414758
langchain[minor]: enhance unit test to test imports recursively (#21122) 5 months ago
Eugene Yurtsev e4f51f59a2
langchain[patch]: Migrate tools to treat community imports as optional (#21117)
Migrate tools to treat community imports as optional
5 months ago
Eugene Yurtsev 9e788f09c6
langchain[patch]: Migrate output parsers to support optional community imports (#21103)
Migrate output parsers
5 months ago
Eugene Yurtsev 3853fe9f64
langchain[patch]: Migrate graphs to use optional community imports (#21100)
Migrate graphs to use optional community imports.
5 months ago
Eugene Yurtsev 8658d52587
langchain[patch]: Upgrade prompts to optional imports (#21078)
Upgrades prompts module to use optional imports.

This code was generated with a migration script, but had to be adjusted
manually a bit.

Testing in preparation for applying this code modification across the
rest of the modules in langchain package to reverse the dependency
between langchain community and langchain.
5 months ago
Eugene Yurtsev 9b6d04a187
langchain[patch]: Migrate document transformers (#21098)
Migrate document transformers
5 months ago
Eugene Yurtsev aec13a6123
langchain[patch]: Migrate callbacks module to use optional imports for community (#21086) 5 months ago
Erick Friis 8a62fb0570
community: release 0.0.36 (#21118) 5 months ago
Erick Friis 2407c353be
core: release 0.1.48 (#21113) 5 months ago
Charlie Marsh fd94aa8366
partner[patch]: Upgrade to Ruff v0.4.2 (#21108)
## Summary

No new diagnostics (given that the set of enabled rules hasn't changed),
but gains access to our new parser (much faster) and reduced false
positives all around.
5 months ago
Jamsheed Mistri 3e749369ef
community[minor]: bump version of LayerupSecurity, add support for untrusted_input parameter (#19985)
**Description:** update version of LayerupSecurity package for the
Layerup Security integration. Add untrusted_input parameter.
5 months ago
fubuki8087 f1c3687aa5
community[patch]: Using the right encoding to parse the web page in RecursiveUrlLoader (#20632)
As shown in #13749 , `RecursiveUrlLoader` has encoding issue. This PR is
to solve this.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Jakub Pawłowski b0b1a67771
community[patch]: Skip unexpected 404 HTTP Error in Arxiv download (#21042)
### Description:
When attempting to download PDF files from arXiv, an unexpected 404
error frequently occurs. This error halts the operation, regardless of
whether there are additional documents to process. As a solution, I
suggest implementing a mechanism to ignore and communicate this error
and continue processing the next document from the list.

Proposed Solution: To address the issue of unexpected 404 errors during
PDF downloads from arXiv, I propose implementing the following solution:

- Error Handling: Implement error handling mechanisms to catch and
handle 404 errors gracefully.
- Communication: Inform the user or logging system about the occurrence
of the 404 error.
- Continued Processing: After encountering a 404 error, continue
processing the remaining documents from the list without interruption.

This solution ensures that the application can handle unexpected errors
without terminating the entire operation. It promotes resilience and
robustness in the face of intermittent issues encountered during PDF
downloads from arXiv.

### Issue:
#20909 
### Dependencies:
none

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
Erick Friis b9c53e95b7
community: release 0.0.35 (#21104) 5 months ago
Eugene Yurtsev 3c064a757f
core[minor],langchain[patch],community[patch]: Move storage interfaces to core (#20750)
* Move storage interface to core
* Move in memory and file system implementation to core
5 months ago
Charlie Marsh 8f38b7a725
multiple: Remove unnecessary Ruff suppression comments (#21050)
## Summary

I ran `ruff check --extend-select RUF100 -n` to identify `# noqa`
comments that weren't having any effect in Ruff, and then `ruff check
--extend-select RUF100 -n --fix` on select files to remove all of the
unnecessary `# noqa: F401` violations. It's possible that these were
needed at some point in the past, but they're not necessary in Ruff
v0.1.15 (used by LangChain) or in the latest release.

Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
Erick Friis 748f2ba9ea
core: release 0.1.47 (#21094) 5 months ago
Eugene Yurtsev c8f18a2524
langchain[patch]: Update import handling in `adapters` (#21079) 5 months ago
William FH 5c63ac3dd7
[Patch] Dedent docstring (#20959)
Technically a slight prompt breaking change, but I think positive EV in
that it saves tokens and results in more sane / in-distribution prompts
5 months ago
Eugene Yurtsev 845d8e0025
langchain[patch]: Update handling of deprecation warnings (#21083)
Chains should not be emitting deprecation warnings.
5 months ago
Christophe Bornet 5c77f45b06
community[minor]: Add async methods to CassandraCache and CassandraSemanticCache (#20654) 5 months ago
William FH db14d4326d
[Core] Feat Pretty Print Tool calls (#20997)
Right now, `tool_calls` are not included in the `pretty_print()` output.
Would be nice to show!


![image](https://github.com/langchain-ai/langchain/assets/13333726/6a0ffca3-d02f-4e18-bc76-513eeca2e964)
5 months ago
Kuro Denjiro fa4124b821
community[minor]: add mintbase loader to langchain (#20089)
- [x] **Add Near NFT loader**: "community: Load NFT near block chain
using mintbase graph API"

- [x] **PR message**: 
    - **Description:** a description of the change
    - **Twitter handle:**Kurodenjiro

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Alexander Dicke d7e12750df
community[patch]: allows using `text-generation-inference` /generate route with `HuggingFaceEndpoint` (#20100)
- **Description:** allows to use the /generate route of
`text-generation-inference` with the `HuggingFaceEndpoint`
5 months ago
davidkgp 28b0b0d863
community[patch]: Fix for github issue #17690 (#20117)
…/17690

Thank you for contributing to LangChain!

- [x] **Fix Google Lens knowledge graph issue**: "langchain: community"
- Fix for [No "knowledge_graph" property in Google Lens API call from
SerpAPI](https://github.com/langchain-ai/langchain/issues/17690)


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** handled the existence of keys in the json response of
Google Lens
- **Issue:** [No "knowledge_graph" property in Google Lens API call from
SerpAPI](https://github.com/langchain-ai/langchain/issues/17690)



- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/


If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
高远 a7a4630bf4
community[patch]: Modify the text field type and add new exception handling (#20116)
Co-authored-by: gaoyuan <gaoyuan.20001218@bytedance.com>
5 months ago
Rahul Triptahi c172611647
community[patch]: Add classifier_url argument in PebbloSafeLoader and documentation update. (#21030)
Description: Add classifier_url argument in PebbloSafeLoader.
Documentation: Updated PebbloSafeLoader documentation with above change
and new links for pebblo github pages.

---------

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
5 months ago
Leonid Ganeline 08d08d7c83
docs: langchain docstrings updates (#21032)
Added missed docstings. Formatted docstrings into a consistent format.
5 months ago
Leonid Ganeline 85094cbb3a
docs: community docstring updates (#21040)
Added missed docstrings. Updated docstrings to consistent format.
5 months ago
Rodrigo Nogueira 90f19028e5
community[patch]: Add maritalk streaming (sync and async) (#19203)
Co-authored-by: RosevalJr <rdmalajr@gmail.com>
Co-authored-by: Roseval Donisete Malaquias Junior <roseval@maritaca.ai>
5 months ago
Cahid Arda Öz cc6191cb90
community[minor]: Add support for Upstash Vector (#20824)
## Description

Adding `UpstashVectorStore` to utilize [Upstash
Vector](https://upstash.com/docs/vector/overall/getstarted)!

#17012 was opened to add Upstash Vector to langchain but was closed to
wait for filtering. Now filtering is added to Upstash vector and we open
a new PR. Additionally, [embedding
feature](https://upstash.com/docs/vector/features/embeddingmodels) was
added and we add this to our vectorstore aswell.

## Dependencies

[upstash-vector](https://pypi.org/project/upstash-vector/) should be
installed to use `UpstashVectorStore`. Didn't update dependencies
because of [this comment in the previous
PR](https://github.com/langchain-ai/langchain/pull/17012#pullrequestreview-1876522450).

## Tests

Tests are added and they pass. Tests are naturally network bound since
Upstash Vector is offered through an API.

There was [a discussion in the previous PR about mocking the
unittests](https://github.com/langchain-ai/langchain/pull/17012#pullrequestreview-1891820567).
We didn't make changes to this end yet. We can update the tests if you
can explain how the tests should be mocked.

---------

Co-authored-by: ytkimirti <yusuftaha9@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Leonid Ganeline 1a2ff56cd8
core[patch[: docstring update (#21036)
Added missed docstrings. Updated docstrings to consistent format.
5 months ago
Eugene Yurtsev f479a337cc
langchain[patch]: replace deprecated imports with imports from langchain_core (#21033)
* Output of running the migration script.
* Ran only against langchain code itself and not the unit tests.
5 months ago
Eugene Yurtsev 82d4afcac0
langchain[minor]: Code to handle dynamic imports (#20893)
Proposing to centralize code for handling dynamic imports. This allows treating langchain-community as an optional dependency.

---

The proposal is to scan the code base and to replace all existing imports with dynamic imports using this functionality.
5 months ago
Erick Friis 854ae3e1de
mistralai: release 0.1.5, allow client passing in (#21034) 5 months ago
chyroc 3e241956d3
community[minor]: add coze chat model (#20770)
add coze chat model, to call coze.com apis
5 months ago
Eugene Yurtsev 29493bb598
cli[minor]: improve confirmation message with more details (#21027)
Improve confirmation message with more details
5 months ago
Eugene Yurtsev aab78a37f3
cli[patch]: Ignore imports that change the name of the class (#21026)
Not currently handeled by migration script
5 months ago
Massimiliano Pronesti ce89b34fc0
community[patch]: support hybrid search with threshold in Azure AI Search Retriever (#20907)
Support hybrid search with a score threshold -- similar to what we do
for similarity search.
5 months ago
Andrei Panferov b3efa38cc0
community[patch]: GigaChat model selection fix (#20988)
Fixed the error that the model name is never actually put into GigaChat
request payload, always defaulting to `GigaChat-Lite`.

With this fix, model selection through
```python
import os
from langchain.chat_models.gigachat import GigaChat

chat = GigaChat(
    name="GigaChat-Pro", # <- HERE!!!!!
    ...
)
```
should actually work, as intended in
[here](804390ba4b/libs/community/langchain_community/llms/gigachat.py (L36)).

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
5 months ago
Patrick McFadin 3331865f6b
community[minor]: add Cassandra Database Toolkit (#20246)
**Description**: ToolKit and Tools for accessing data in a Cassandra
Database primarily for Agent integration. Initially, this includes the
following tools:
- `cassandra_db_schema` Gathers all schema information for the connected
database or a specific schema. Critical for the agent when determining
actions.
- `cassandra_db_select_table_data` Selects data from a specific keyspace
and table. The agent can pass paramaters for a predicate and limits on
the number of returned records.
- `cassandra_db_query` Expiriemental alternative to
`cassandra_db_select_table_data` which takes a query string completely
formed by the agent instead of parameters. May be removed in future
versions.

Includes unit test and two notebooks to demonstrate usage. 

**Dependencies**: cassio
**Twitter handle**: @PatrickMcFadin

---------

Co-authored-by: Phil Miesle <phil.miesle@datastax.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Igor Brai b3e74f2b98
community[minor]: add mojeek search util (#20922)
**Description:** This pull request introduces a new feature to community
tools, enhancing its search capabilities by integrating the Mojeek
search engine
**Dependencies:** None

---------

Co-authored-by: Igor Brai <igor@mojeek.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
5 months ago
hmn falahi 4822beb298
Ignore self/cls from required args of class functions in convert_to_openai_tool (#20691)
Removed redundant self/cls from required args of class functions in
_get_python_function_required_args:

```python
class MemberTool:
    def search_member(
            self,
            keyword: str,
            *args,
            **kwargs,
    ):
        """Search on members with any keyword like first_name, last_name, email

        Args:
            keyword: Any keyword of member
        """

        headers = dict(authorization=kwargs['token'])
        members = []
        try:
            members = request_(
                method='SEARCH',
                url=f'{service_url}/apiv1/members',
                headers=headers,
                json=dict(query=keyword),
            )

        except Exception as e:
            logger.info(e.__doc__)

        return members

convert_to_openai_tool(MemberTool.search_member)
```
expected result:
```
{'type': 'function', 'function': {'name': 'search_member', 'description': 'Search on members with any keyword like first_name, last_name, username, email', 'parameters': {'type': 'object', 'properties': {'keyword': {'type': 'string', 'description': 'Any keyword of member'}}, 'required': ['keyword']}}}
```

#20685

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Eugene Yurtsev 4f4ee8e2cf
cli[patch]: Update migrations file manually (#21021)
We need to replace occurrences in the code of RunnableMap not just the
import,
so for now, we don't replace RunnableMap.
5 months ago
Tomaz Bratanic 67428c4052
community[patch]: Neo4j enhanced schema (#20983)
Scan the database for example values and provide them to an LLM for
better inference of Text2cypher
5 months ago
aditya thomas 8b59bddc03
anthropic[patch]: add tests for secret_str for api key (#20986)
**Description:** Add tests to check API keys are masked
**Issue:** Resolves
https://github.com/langchain-ai/langchain/issues/12165 for Anthropic
models
**Dependencies:** None
5 months ago
Pengcheng Liu 1fad39be1c
community[minor]: Add LarkSuite wiki document loader. (#21016)
**Description:** Add LarkSuite wiki document loader. Refer to [LarkSuite
api document
](https://open.feishu.cn/document/server-docs/docs/wiki-v2/space-node/list)for
details.
**Issue:** None
**Dependencies:** None
**Twitter handle:** None
5 months ago
Leonid Ganeline dc7c06bc07
community[minor]: import fix (#20995)
Issue: When the third-party package is not installed, whenever we need
to `pip install <package>` the ImportError is raised.
But sometimes, the `ValueError` or `ModuleNotFoundError` is raised. It
is bad for consistency.
Change: replaced the `ValueError` or `ModuleNotFoundError` with
`ImportError` when we raise an error with the `pip install <package>`
message.
Note: Ideally, we replace all `try: import... except... raise ... `with
helper functions like `import_aim` or just use the existing
[langchain_core.utils.utils.guard_import](https://api.python.langchain.com/en/latest/utils/langchain_core.utils.utils.guard_import.html#langchain_core.utils.utils.guard_import)
But it would be much bigger refactoring. @baskaryan Please, advice on
this.
5 months ago
Karim Lalani 2ddac9a7c3
experimental[minor]: Add bind_tools and with_structured_output functions to OllamaFunctions (#20881)
Implemented bind_tools for OllamaFunctions.
Made OllamaFunctions sub class of ChatOllama.
Implemented with_structured_output for OllamaFunctions.

integration unit test has been updated.
notebook has been updated.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Eugene Yurtsev d781560722
cli[minor]: Add ipynb support, add text_splitters (#20963) 5 months ago