Commit Graph

3467 Commits

Author SHA1 Message Date
Guangdong Liu
2c835baae4
code[patch]: Add in code documentation to core Runnable with_retry method (docs only) (#19192)
- **Description:** Add in code documentation to core Runnable with_retry
method (docs only)
- **Issue:** #18804 
@baskaryan @eyurtsev PTAL

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-03-19 12:52:29 -04:00
Eugene Yurtsev
4b3dd34544
core[patch]: Pass sync run manager for sync stream fallback in astream (#19280)
This PR patches the fallback in chat models and language models to pass
in the appropriate version of the run manager (sync vs. async)
2024-03-19 16:32:33 +00:00
Leonid Ganeline
d314acb2d5
core[patch]: Move globals to a module instead of a package (non breaking change) (#19159)
Classes and functions defined in __init__.py are not parsed into the API
Reference.
For example, in libs/core/langchain_core/globals/__init__.py:
`set_verbose`, `get_llm_cache`, `set_llm_cache`, ...
And the whole `langchain_core.globals` namespace is not visible in the
API Reference. The refactoring is just file renaming.
2024-03-19 12:29:12 -04:00
Al-Ekram Elahee Hridoy
50f93d86ec
core[minor]: Enhance cache flexibility in BaseChatModel (#17386)
- **Description:** Enhanced the `BaseChatModel` to support an
`Optional[Union[bool, BaseCache]]` type for the `cache` attribute,
allowing for both boolean flags and custom cache implementations.
Implemented logic within chat model methods to utilize the provided
custom cache implementation effectively. This change aims to provide
more flexibility in caching strategies for chat models.
  - **Issue:** Implements enhancement request #17242.
- **Dependencies:** No additional dependencies required for this change.
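
A minimal sketch of how an `Optional[Union[bool, BaseCache]]` attribute might be resolved. `SimpleCache` and `resolve_cache` are hypothetical names, and the meaning given to `True`, `False`, and `None` here is an assumption for illustration, not code taken from this PR.

```python
from typing import Dict, Optional, Union


class SimpleCache:
    """Hypothetical stand-in for a BaseCache implementation."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def lookup(self, prompt: str) -> Optional[str]:
        return self._store.get(prompt)

    def update(self, prompt: str, value: str) -> None:
        self._store[prompt] = value


def resolve_cache(
    cache: Union[bool, SimpleCache, None],
    global_cache: Optional[SimpleCache],
) -> Optional[SimpleCache]:
    """Pick the cache to use (assumed semantics, see the note above)."""
    if isinstance(cache, SimpleCache):
        return cache  # a custom cache instance always wins
    if cache is False:
        return None  # caching explicitly disabled
    if cache is True and global_cache is None:
        raise ValueError("cache=True requires a global cache to be set")
    return global_cache  # True or None: fall back to the global cache


custom = SimpleCache()
shared = SimpleCache()
```

Under these assumptions, `resolve_cache(custom, shared)` returns `custom`, while `resolve_cache(False, shared)` returns `None`.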

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-19 11:26:58 -04:00
Zihong
ff31cc1648
experimental: update the notebook link of semantic chunk. (#19253)
update the notebook link of semantic chunk.
2024-03-19 07:24:51 -04:00
Frederico Wu
f36418a5b0
langchain: creating assistants with file_ids (#19199)
Changing OpenAIAssistantRunnable.create_assistant to send the `file_ids`
parameter to openai.beta.assistants.create

Co-authored-by: Frederico Wu <fred.diaswu@coxautoinc.com>
2024-03-18 21:34:03 -07:00
Vittorio Rigamonti
9b2f9ee952
community: VectorStore Infinispan, adding autoconfiguration (#18967)
**Description**:
This PR enables VectorStore autoconfiguration for Infinispan: if the
metadata values are only of basic types, the protobuf
config will be automatically generated for the user.
2024-03-18 21:33:45 -07:00
Max Jakob
6f544a6a25
elasticsearch: check for deployed models (#18973)
When creating a new index, if we use a retrieval strategy that expects a
model to be deployed in Elasticsearch, check if a model with this name
is indeed deployed before creating an index. This lowers the probability
of getting into a state in which an index was created with a faulty model
ID, which can no longer be overwritten (the index has to be deleted
manually).
2024-03-18 21:32:00 -07:00
gonvee
b82644078e
community: Add keep_alive parameter to control how long the model w… (#19005)
Add `keep_alive` parameter to control how long the model will stay
loaded into memory with Ollama.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-19 04:29:01 +00:00
Roshan Santhosh
7afecec280
core: update _rm_titles to account for title argument name bug (#19036)
Issue: For functions that have an argument named 'title',
convert_pydantic_to_openai_function generates incorrect output and
omits the argument altogether. This is because the _rm_titles function
removes all instances of the key 'title' from the output.



Description: Updates the _rm_titles function to check for the presence of
the 'type' key as well before removing the 'title' key, since the title key
that we wish to omit always has a type key along with it.

Potential gap if there is a function defined which has both title and
key as argument names, in which case this would fail. Maybe we could set
a filter on the function argument names and reject those with keyword
argument names.
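
A minimal sketch of one possible reading of this fix, not the library's actual code. `rm_titles_sketch` is a hypothetical name. It assumes that a `title` key whose value is a dict containing `type` is a real argument named `title`, while a `title` key with a plain string value is schema metadata to drop.

```python
from typing import Any, Dict


def rm_titles_sketch(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Drop 'title' metadata but keep a property that is itself named 'title'."""
    cleaned: Dict[str, Any] = {}
    for key, value in schema.items():
        if key == "title":
            # Assumption: a dict with a 'type' key is a property schema,
            # i.e. an argument called 'title', so it must be kept.
            if isinstance(value, dict) and "type" in value:
                cleaned[key] = rm_titles_sketch(value)
            # Otherwise it is a plain title label and is skipped.
        elif isinstance(value, dict):
            cleaned[key] = rm_titles_sketch(value)
        else:
            cleaned[key] = value
    return cleaned


schema = {
    "title": "MyFunc",
    "type": "object",
    "properties": {
        "title": {"title": "Title", "type": "string"},
        "count": {"title": "Count", "type": "integer"},
    },
}
cleaned = rm_titles_sketch(schema)
```

In this example the top-level `"title": "MyFunc"` label is removed, but the `title` argument under `properties` survives.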


No dependencies. Passed all tests. 



---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-18 21:25:06 -07:00
Harrison Chase
efcdf54edd
Josha91 fix docstring (#19249)
Co-authored-by: Josha van Houdt <josha.van.houdt@sap.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-18 21:19:56 -07:00
Simon Stone
58c7687174
langchain: preserve document metadata in FlashrankRerank (#19148)
**Description:** Preserves document metadata in `FlashrankRerank`
    - **Issue:** #19142
    - **Dependencies:** None
    - **Twitter handle:** n/a

---------

Co-authored-by: Simon Stone <simon.stone@dartmouth.edu>
2024-03-19 04:15:18 +00:00
Aaron Jimenez
bc648f6cfc
core: Updated docstring for Context class (#19079)
- **Description:** Improves the docstring for `class Context` by
providing an overview and an example.
- **Issue:** #18803
2024-03-18 21:15:14 -07:00
Taqi Jaffri
044bc22acc
Community: Add mistral oss model support to azureml endpoints, plus configurable timeout (#19123)
- **Description:** There was no formatter for mistral models for Azure
ML endpoints. Adding that, plus a configurable timeout (it was hard
coded before)
- **Dependencies:** none
- **Twitter handle:** @tjaffri @docugami
2024-03-18 21:10:42 -07:00
Kangmoon Seo
07de4abe70
core: Fix Exception handling in XMLOutputParser (#19126)
- **Description:**
  - Exception handling in `XMLOutputParser`
    1. Add exception handling at `root = ET.fromstring(text)`, which raises
       `ET.ParseError`
    2. Fix the exception class (use the one commonly used in the `BaseOutputParser` class)
  - AS-IS: raises `ValueError` and `ET.ParseError` without handling
    ```python
    # langchain_core/output_parsers/xml.py

        text = text.strip()
        if (text.startswith("<") or text.startswith("\n<")) and (
            text.endswith(">") or text.endswith(">\n")
        ):
            root = ET.fromstring(text)
            return self._root_to_dict(root)
        else:
            raise ValueError(f"Could not parse output: {text}")
    ```
  - TO-BE: raises `OutputParserException`
    ```python
    # langchain_core/output_parsers/xml.py

        text = text.strip()
        if (text.startswith("<") or text.startswith("\n<")) and (
            text.endswith(">") or text.endswith(">\n")
        ):
            try:
                root = ET.fromstring(text)
                return self._root_to_dict(root)

            except ET.ParseError:
                raise OutputParserException(f"Could not parse output: {text}")

        else:
            raise OutputParserException(f"Could not parse output: {text}")
    ```
- **Issue:** #19107
- **Dependencies:** None
2024-03-18 21:08:32 -07:00
Hamza Muhammad Farooqi
24a0a4472a
Add docstrings for Clickhouse class methods (#19195)
2024-03-19 04:03:12 +00:00
Rohit Gupta
785f8ab174
[langchain_community] milvus vectorstores upsert: add **kwargs to make it use for other argument also (#19193)
Add **kwargs to add_documents for upsert so that it can be used to pass
other arguments as well.
Until now it was unused.


Co-authored-by: Rohit Gupta <rohit.gupta2@walmart.com>
2024-03-18 21:01:12 -07:00
Cycle
77868b1974
experimental: add buffer_size hyperparameter to SemanticChunker as in source video (#19208)
Add the buffer_size hyperparameter, which is used in the combine_sentences function
2024-03-19 03:54:20 +00:00
Shotaro Sano
ca9c8c58ea
text-splitters, infra: fix libs/langchain/dev.Dockerfile so that the text-splitter directory is copied before poetry installation (#19214)
## Description
This PR modifies the settings in `libs/langchain/dev.Dockerfile` to
ensure that the `text-splitters` directory is copied before the poetry
installation process begins.

Without this modification, the `docker build` command fails for
`dev.Dockerfile`, preventing the setup of some development environments,
including `.devcontainer`.

## Bug Details

### Repro
Run the following command:

```bash
docker build -f libs/langchain/dev.Dockerfile .
```

### Current Behavior
The docker build command fails, raising the following error:

```
...
 => [langchain-dev-dependencies 4/5] COPY libs/community/ ../community/                                                                                0.4s
 => ERROR [langchain-dev-dependencies 5/5] RUN poetry install --no-interaction --no-ansi --with dev,test,docs                                          1.1s
------                                                                                                                                                      
 > [langchain-dev-dependencies 5/5] RUN poetry install --no-interaction --no-ansi --with dev,test,docs:
#13 0.970 
#13 0.970 Directory ../text-splitters does not exist
------
executor failed running [/bin/sh -c poetry install --no-interaction --no-ansi --with dev,test,docs]: exit code: 1
```

### Expected Behavior
The `docker build` command successfully completes without the poetry
error.

### Analysis
The error occurs because the `text-splitters` directory is not copied
into the build environment, unlike the other packages under the `libs`
directory. I suspect that the `COPY` setting was overlooked since
`text-splitters` was separated in a recent PR.

## Fix
Add the following lines to the `libs/langchain/dev.Dockerfile`:

```dockerfile
# Copy the text-splitters library for installation
COPY libs/text-splitters/ ../text-splitters/
```
2024-03-18 20:45:35 -07:00
Guangdong Liu
c3310c5e7f
community: Fix Milvus got multiple values for keyword argument 'timeout' (#19232)
- **Description:** Fix Milvus got multiple values for keyword argument
'timeout'
- **Issue:**  fix #18580
- @baskaryan @eyurtsev PTAL
2024-03-18 20:44:25 -07:00
Erick Friis
95904fe443
langchain[patch]: update base imports to core (#19248)
still deprecated, but was misleading before
2024-03-19 03:17:07 +00:00
Asaf Joseph Gardin
21c45475c5
ai21[patch]: AI21 Labs bump SDK version (#19114)
Description: Added support for AI21 SDK version 2.1.2
Twitter handle: https://github.com/AI21Labs

---------

Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-18 19:47:08 -07:00
Jib
516cc44b3f
langchain-mongodb: [test-fix] add explicit index_name setting on test vector creation (#19245)
- **Description:** Tests fail to do value lookup because they do not
specify the index name
  - **Issue:** Failing integration test
 

- [x] **Add tests and docs**: Tests now pass


2024-03-18 15:52:28 -07:00
William FH
780337488e
[Enhancement] Add support for directly providing a run_id (#18990)
The root run id (~trace id's) is useful for assigning feedback, but the
current recommended approach is to use callbacks to retrieve it, which
has some drawbacks:
1. It doesn't work for streaming until after the first event.
2. It doesn't let you call other endpoints with the same trace ID in
parallel (since you have to wait until the call is completed/started to
use it).

This PR lets you provide a `run_id` in the runnable config.

Couple considerations:

1. For batch calls, we split the trace up into separate trees (to permit
better rendering). We keep the provided run ID for the first one and
generate a unique one for other elements of the batch.
2. For nested calls, the provided ID is ONLY used on the top root/trace.



### Example Usage


```python
import uuid
chain.invoke("foo", {"run_id": uuid.uuid4()})  # `chain` is any existing Runnable
```
2024-03-18 15:03:04 -07:00
Jacob Lee
bd329e9aad
core[patch]: Add LLM output to message response_metadata (#19158)
This will more easily expose token usage information.

CC @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-18 13:58:32 -07:00
Erick Friis
6fa1438334
mongodb[patch]: release 0.1.2 (#19243) 2024-03-18 13:35:45 -07:00
Leonid Ganeline
7de1d9acfd
community: llms imports fixes (#18943)
The following classes were missing from `__all__` and from other places in `__init__.py`:
- BaichuanLLM
- ChatDatabricks
- ChatMlflow
- Llamafile
- Mlflow
- Together

Added the classes to `__all__` and sorted the `__all__` list.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-18 20:24:40 +00:00
Kenzie Mihardja
21f75991d4
deprecate community docugami loader (#19230)
Deprecate the `langchain_community` DocugamiLoader; use the
DocugamiLoader from `docugami_langchain` instead.

---------

Co-authored-by: Kenzie Mihardja <kenzie28@cs.washington.edu>
2024-03-18 12:56:47 -07:00
Jib
ec026004cb
mongodb[patch]: Remove in-memory cache from cache abstractions (#18987)
## Description
* In memory cache easily gets out of sync with the server cache, so we
will remove it entirely to reduce the issues around invalidated caches.

## Dependencies
None


Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-18 19:44:34 +00:00
Jib
866d6408af
mongodb[patch]: Remove embedding retrieval from mongodb payload (#19035)
## Description
Returning the embedding is not necessary in the vector search
functionality unless specified as a debugging step. This change defaults
the behavior such that the server _only_ returns the embedding key if
explicitly requested, such as in the case of
`max_marginal_relevance_search`.


- [x] **Add tests and docs**: Added `test_from_documents_no_embedding_return`



---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-18 19:43:50 +00:00
Leonid Kuligin
366ba77459
core[minor]: moved fake llms and embeddings to core (#19226)
- **Description:** Moved fake LLMs and embeddings to core.
2024-03-18 10:01:26 -07:00
Pengfei Jiang
514fe80778
community[patch]: add stop parameter support to volcengine maas (#19052)
- **Description:** add stop parameter to volcengine maas model
- **Dependencies:** no

---------

Co-authored-by: 江鹏飞 <jiangpengfei.jiangpf@bytedance.com>
2024-03-17 01:58:50 +00:00
htaoruan
bcc771e37c
docs: ChatTongyi example error (#19013) 2024-03-17 01:55:56 +00:00
primate88
5aa68936e0
community: Fix import path for StreamingStdOutCallbackHandler example (#19170)
- Description:
- Updated the import path for `StreamingStdOutCallbackHandler` in the
streaming response example within `huggingface_endpoint.py`. This change
corrects the import statement to reflect the actual location of
`StreamingStdOutCallbackHandler` in
`langchain_core.callbacks.streaming_stdout`.
- Issue:
  - None
- Dependencies:
  - No additional dependencies are required for this change.
- Twitter handle:
  - None

## Note:
I have tested this change locally and confirmed that the
`StreamingStdOutCallbackHandler` works as expected with the updated
import path. This PR does not require the addition of new tests since it
is a correction to documentation/examples rather than functional code.
2024-03-17 00:50:37 +00:00
Bagatur
611d5a1618
openai[patch]: fix async http client (#19164)
Fix #19116
2024-03-16 17:50:22 -07:00
Nikhil Kumar
635b3372bd
community[minor]: Add support for translation in HuggingFacePipeline (#19190)
- [x] **Support for translation**: "community: Add support for
translation in `HuggingFacePipeline`"


- [x] **Add support for translation in `HuggingFacePipeline`**:
- **Description:** Add support for translation in `HuggingFacePipeline`,
which earlier used to support only text summarization and generation.
    - **Issue:** N/A
    - **Dependencies:** N/A
    - **Twitter handle:** None
2024-03-17 00:48:13 +00:00
Nikhil Kumar
a1b26dd9b6
docs: Add docs for RouterRunnable (#19191)
- [x] **Docs for `RouterRunnable`**: core: Add docs for `RouterRunnable`

- [x] **Add docs for `RouterRunnable`**:
- **Description:** Add docs for `RouterRunnable`, which was previously
missing documentation
    - **Issue:** #18803 
    - **Dependencies:** N/A
    - **Twitter handle:** None
2024-03-17 00:48:00 +00:00
k.muto
8d2c34e655
community: Fix all page numbers were the same for _BaseGoogleVertexAISearchRetriever (#19175)
- Description:
- This pull request is to fix a bug where page numbers were not set
correctly. In the current code, all chunks share the same metadata
object doc_metadata, so the page number is set with the same value for
all documents. To fix this, I changed to using separate metadata objects
for each chunk.
- Issue:
  - None
- Dependencies:
  - No additional dependencies are required for this change.
- Twitter handle:
  - @eycjur

- Test
- Even if it's not a bug, there are cases where everything ends up with
the same number of pages, so it's very difficult for me to write
integration tests.
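
The aliasing problem described above can be reproduced in a few lines. This is a minimal sketch, not the retriever's actual code. The variable names and the sample data are made up for illustration.

```python
from typing import Dict, List

pages = ["first page text", "second page text", "third page text"]

# Buggy pattern: every chunk stores a reference to the same dict,
# so the last assigned page number shows up everywhere.
doc_metadata: Dict[str, object] = {"source": "report.pdf"}
shared: List[Dict[str, object]] = []
for number, _text in enumerate(pages, start=1):
    doc_metadata["page"] = number
    shared.append(doc_metadata)

# Fixed pattern: copy the base metadata so each chunk owns its dict.
base_metadata: Dict[str, object] = {"source": "report.pdf"}
separate: List[Dict[str, object]] = []
for number, _text in enumerate(pages, start=1):
    chunk_metadata = dict(base_metadata)
    chunk_metadata["page"] = number
    separate.append(chunk_metadata)

shared_pages = [m["page"] for m in shared]      # [3, 3, 3]
separate_pages = [m["page"] for m in separate]  # [1, 2, 3]
```

A shallow copy via `dict(...)` is enough here because the metadata values are flat. Nested metadata would need `copy.deepcopy`.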
2024-03-16 22:28:56 +00:00
Cailin Wang
7cd87d2f6a
community: Add partition parameter to DashVector (#19023)
**Description**: Add a partition parameter to DashVector
**Twitter handle**: @CailinWang_

---------

Co-authored-by: root <root@Bluedot-AI>
2024-03-16 15:20:30 -07:00
Rodrigo Nogueira
e64cf1aba4
community: Add model argument for maritalk models and better error handling (#19187) 2024-03-16 15:18:56 -07:00
Sergey Kozlov
1a55e950aa
community[patch]: support fastembed v1 and v2 (#19125)
**Description:**
#18040 forces `fastembed>2.0`, which causes dependency conflicts with
the new `unstructured` package (different `onnxruntime`). There may be
other dependency conflicts. The only way to use
`langchain-community>=0.0.28` is to roll back to `unstructured 0.10.X`, but
the new `unstructured` contains many fixes.

This PR allows using both `fastembed` `v1` and `v2`.

How to reproduce:

`pyproject.toml`:
```toml
[tool.poetry]
name = "depstest"
version = "0.0.0"
description = "test"
authors = ["<dev@example.org>"]

[tool.poetry.dependencies]
python = ">=3.10,<3.12"
langchain-community = "^0.0.28"
fastembed = "^0.2.0"
unstructured = {extras = ["pdf"], version = "^0.12"}
```

```bash
$ poetry lock
```

Co-authored-by: Sergey Kozlov <sergey.kozlov@ludditelabs.io>
2024-03-15 18:33:51 -07:00
six17
fd4f536c77
text-splitters[patch]: fix json split of RecursiveJsonSplitter (#19119)
- **Description:** This modification addresses the issue of mutable
default parameters in functions. In the original code, the `chunks`
parameter is defaulted to a list containing an empty dictionary, which
is mutable. Since default parameters in Python are evaluated only once
at function definition time, modifications to the parameter would
persist across future calls. By changing the default to `None` and
checking/initializing within the function, a new list is created for
each call, thus avoiding potential issues.
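
The pitfall can be shown with a small, self-contained sketch. `collect_buggy` and `collect_fixed` are hypothetical functions written for illustration and are not the splitter's code.

```python
from typing import Dict, List, Optional


def collect_buggy(
    item: str, chunks: List[Dict[str, str]] = [{}]
) -> List[Dict[str, str]]:
    # The default list is created once, at definition time,
    # and is shared by every call that omits `chunks`.
    chunks[-1][item] = item
    return chunks


def collect_fixed(
    item: str, chunks: Optional[List[Dict[str, str]]] = None
) -> List[Dict[str, str]]:
    # A fresh list is created on each call that omits `chunks`.
    if chunks is None:
        chunks = [{}]
    chunks[-1][item] = item
    return chunks


collect_buggy("a")
buggy_second = collect_buggy("b")  # still contains "a" from the first call

collect_fixed("a")
fixed_second = collect_fixed("b")  # contains only "b"
```

The second call to `collect_buggy` sees state left behind by the first one, which is the kind of cross-call leakage the `None` default avoids.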

---------

Co-authored-by: sixiang <sixiang@lixiang.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-15 16:46:49 -07:00
aditya thomas
80eb510a7b
docs: update docstring of Together class (#19008)
**Description:** Update docstring of Together class to show example and
update API URL
**Issue:** Improves usability
**Dependencies:** None
**Lint and test**: `make format`, `make lint` and `make test` were run
2024-03-15 16:30:45 -07:00
高远
ef9813dae6
docs: add vikingdb docstrings(#19016)
Co-authored-by: gaoyuan <gaoyuan.20001218@bytedance.com>
2024-03-15 16:29:29 -07:00
wulixuan
0e0030f494
community[patch]: fix yuan2 chat model errors while invoke. (#19015)
1. fix yuan2 chat model errors while invoke.
2. update related tests.
3. fix some deprecationWarning.
2024-03-15 16:28:36 -07:00
Shuai Liu
c244e1a50b
community[patch]: Fixed bug in merging generation_info during chunk concatenation in Tongyi and ChatTongyi (#19014)
- **Description:** 

In #16218 , during the `GenerationChunk` and `ChatGenerationChunk`
concatenation, the `generation_info` merging changed from simple keys &
values replacement to using the util method
[`merge_dicts`](https://github.com/langchain-ai/langchain/blob/master/libs/core/langchain_core/utils/_merge.py):


![image](https://github.com/langchain-ai/langchain/assets/2098020/10f315bf-7fe0-43a7-a0ce-6a3834b99a15)

The `merge_dicts` method could not handle merging values of `int` or
some other types, and would raise a
[`TypeError`](https://github.com/langchain-ai/langchain/blob/master/libs/core/langchain_core/utils/_merge.py#L55).

This PR fixes this issue in the **Tongyi and ChatTongyi Model** by
adopting the `generation_info` of the last chunk
and discarding the `generation_info` of the intermediate chunks,
ensuring that `stream` and `astream` function correctly.

- **Issue:**  
    - Related issues or PRs about Tongyi & ChatTongyi: #16605, #17105 
    - Other models or cases: #18441, #17376
- **Dependencies:** No new dependencies
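
A rough sketch of the two strategies, under stated assumptions. `strict_merge` is a hypothetical stand-in for a merge that rejects clashing non-string values, not the library's `merge_dicts`. The `token_count` and `finish_reason` keys are invented sample data.

```python
from typing import Any, Dict, List, Optional


def strict_merge(left: Dict[str, Any], right: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical merge: concatenates strings, rejects other clashes."""
    merged = dict(left)
    for key, value in right.items():
        if key not in merged:
            merged[key] = value
        elif isinstance(merged[key], str) and isinstance(value, str):
            merged[key] += value
        else:
            raise TypeError(
                f"cannot merge key {key!r} of type {type(value).__name__}"
            )
    return merged


def last_generation_info(infos: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Keep only the final chunk's info and discard intermediate ones."""
    return infos[-1] if infos else None


infos = [
    {"token_count": 1},
    {"token_count": 2},
    {"finish_reason": "stop", "token_count": 3},
]

try:
    strict_merge(infos[0], infos[1])
    merge_failed = False
except TypeError:
    merge_failed = True

final_info = last_generation_info(infos)
```

Taking the last chunk sidesteps the merge entirely, at the cost of dropping anything that only intermediate chunks carried.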
2024-03-15 16:27:53 -07:00
Christophe Bornet
f2a7dda4bd
community[patch]: Use langchain-astradb for AstraDB doc loader (#19071)
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-15 22:57:25 +00:00
Holt Skinner
cee03630d9
community[patch]: Add Blended Search Support to GoogleVertexAISearchRetriever (#19082)
https://cloud.google.com/generative-ai-app-builder/docs/create-data-store-es#multi-data-stores

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-15 22:39:31 +00:00
Eugene Yurtsev
0ddfe7fc9d
langchain[patch]: make hub work with older langchainhub versions (#19076)
Make it work with older clients
2024-03-15 15:37:52 -07:00
case-k
ebc4a64f9e
docs: fix databricks document url (#19096)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-15 22:25:11 +00:00
Guangdong Liu
4468e5bdbe
docs: Add in code documentation to core Runnable with_fallbacks method (docs only) (#19104)
- Description: Add in code documentation
to core Runnable with_fallbacks method (docs only)
- Issue: #18804
@eyurtsev PTAL
2024-03-15 15:21:10 -07:00
Guangdong Liu
cced3eb9bc
community[patch]: Fix sparkllm embeddings api bug. (#19122)
- **Description:** Fix sparkllm embeddings api bug.
@baskaryan PTAL
2024-03-15 15:08:49 -07:00
kaijietti
c20aeef79a
community[patch]: implement qdrant _aembed_query and use it in other async funcs (#19155)
`amax_marginal_relevance_search ` and `asimilarity_search_with_score `
should use an async version of `_embed_query `.
2024-03-15 21:20:12 +00:00
Barun Amalkumar Halder
34d6f0557d
community[patch] : publishes duration as milliseconds to Fiddler (#19166)
**Description:** Many LLM steps complete in sub-second duration, which
can lead to non-collection of duration field for Fiddler. This PR
updates duration from seconds to milliseconds.
**Issue:** [INTERNAL] FDL-17568
**Dependencies:** NA
**Twitter handle:** behalder

Co-authored-by: Barun Halder <barun@fiddler.ai>
2024-03-15 14:04:56 -07:00
Eugene Yurtsev
745d2476a2
langchain: upgrade mypy (#19163)
Update mypy in langchain
2024-03-15 16:37:09 -04:00
Maxime Perrin
aa785fa6ec
core[minor]: allow LLMs async streaming to fallback on sync streaming (#18960)
- **Description:** Handling fallbacks when calling async streaming for a
LLM that doesn't support it.
- **Issue:** #18920 
- **Twitter handle:**@maximeperrin_
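
The general idea of such a fallback can be sketched as follows. `SyncOnlyModel` is a hypothetical class, and this simplified version is an assumption about the approach rather than the code in this PR.

```python
import asyncio
from typing import AsyncIterator, Iterator, List


class SyncOnlyModel:
    """Hypothetical model that only implements synchronous streaming."""

    def stream(self, prompt: str) -> Iterator[str]:
        for word in prompt.split():
            yield word

    async def astream(self, prompt: str) -> AsyncIterator[str]:
        # Fallback: reuse the sync generator and hand control back
        # to the event loop after each chunk.
        for chunk in self.stream(prompt):
            yield chunk
            await asyncio.sleep(0)


async def collect(model: SyncOnlyModel, prompt: str) -> List[str]:
    return [chunk async for chunk in model.astream(prompt)]


chunks = asyncio.run(collect(SyncOnlyModel(), "hello async world"))
```

Note that in this sketch each step of the sync generator still runs on the event loop thread, so a slow `stream` would block other tasks. A production fallback might offload the sync work to a thread instead.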

---------

Co-authored-by: Maxime Perrin <mperrin@doing.fr>
2024-03-15 16:06:50 -04:00
Barun Amalkumar Halder
b551d49cf5
community[patch] : adds feedback and status for Fiddler callback handler events (#19157)
**Description:** This PR updates the Fiddler events schema to also
pass user feedback and LLM status to Fiddler
   **Tickets:** [INTERNAL] FDL-17559 
   **Dependencies:**  NA
   **Twitter handle:** behalder

Co-authored-by: Barun Halder <barun@fiddler.ai>
2024-03-15 12:03:49 -07:00
Juan Felipe Arias
f5b9aedc48
community[patch]: add args_schema to sql_database tools for langGraph integration (#18595)
- **Description:** This modification adds a pydantic input definition for
the sql_database tools, which helps with function calling in
LangGraph. Since action nodes will usually check for the `args_schema`
attribute on tools, this update should make these tools compatible with
it (only implemented on the InfoSQLDatabaseTool)
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** juanfe8881
2024-03-15 19:03:36 +00:00
fengjial
c922ea36cb
community[minor]: Add Baidu VectorDB as vector store (#17997)
Co-authored-by: fengjialin <fengjialin@MacBook-Pro.local>
2024-03-15 19:01:58 +00:00
Erick Friis
781aee0068
community, langchain, infra: revert store extended test deps outside of poetry (#19153)
Reverts langchain-ai/langchain#18995

Because it makes installing dependencies in python 3.11 extended testing
take 80 minutes
2024-03-15 17:10:47 +00:00
Erick Friis
9e569d85a4
community, langchain, infra: store extended test deps outside of poetry (#18995)
poetry can't reliably handle resolving the number of optional "extended
test" dependencies we have. If we instead just rely on pip to install
extended test deps in CI, this isn't an issue.
2024-03-15 05:55:30 +00:00
Bagatur
191ddbc77e
core[patch]: rc release 0.1.33-rc.1 (#19103) 2024-03-14 20:21:54 -07:00
Nuno Campos
508f75853c
core[patch]: Change structured prompt lc id to match js (#19099) 2024-03-14 20:02:52 -07:00
Erick Friis
7ce81eb6f4
voyageai[patch]: init package (#19098)
Co-authored-by: fodizoltan <zoltan@conway.expert>
Co-authored-by: Yujie Qian <thomasq0809@gmail.com>
Co-authored-by: fzowl <160063452+fzowl@users.noreply.github.com>
2024-03-15 00:56:10 +00:00
Asaf Joseph Gardin
4d7f6fa968
ai21[patch]: AI21 Labs Batch Support in Embeddings (#18633)
Description: Added support for batching when using AI21 Embeddings model
Twitter handle: https://github.com/AI21Labs

---------

Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-14 23:10:23 +00:00
Erick Friis
d5cf360329
ibm[patch]: release 0.1.3 (#19094) 2024-03-14 15:59:42 -07:00
Mateusz Szewczyk
b15d150d22
ibm[patch]: add async tests, add tokenize support (#18898)
- **Description:** add async tests, add tokenize support
- **Dependencies:**
[ibm-watsonx-ai](https://pypi.org/project/ibm-watsonx-ai/),

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-14 22:57:05 +00:00
billytrend-cohere
7253b816cc
community: Add support for cohere SDK v5 (keeps v4 backwards compatibility) (#19084)
- **Description:** Add support for cohere SDK v5 (keeps v4 backwards
compatibility)

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-14 15:53:24 -07:00
Eugene Yurtsev
06165efb5b
core[patch]: RunnablePassthrough transform to autoupgrade to AddableDict (#19051)
Follow up on https://github.com/langchain-ai/langchain/pull/18743 which
missed RunnablePassthrough

Issues:

https://github.com/langchain-ai/langchain/issues/18741
https://github.com/langchain-ai/langgraph/issues/136
https://github.com/langchain-ai/langserve/issues/504
2024-03-14 16:59:46 -04:00
Eugene Yurtsev
6cdca4355d
community[minor]: Revamp PGVector Filtering (#18992)
This PR makes the following updates in the pgvector database:

1. Use JSONB field for metadata instead of JSON
2. Update operator syntax to include required `$` prefix before the
operators (otherwise there will be name collisions with fields)
3. The change is non-breaking, old functionality is still the default,
but it will emit a deprecation warning
4. Previous functionality has bugs associated with comparisons due to
casting to text (so lexical ordering is used incorrectly for numeric
fields)
5. Adds a GIN index on the JSONB field for more efficient querying
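
Points 2 and 4 above can be illustrated with a small, self-contained sketch. It is not the PGVector implementation: the `OPERATORS` table and the `matches` helper are hypothetical names used only to show why a `$` prefix avoids collisions with field names, and why comparing numbers as text gives the wrong order.

```python
from typing import Any, Callable, Dict

# Hypothetical operator table. The "$" prefix keeps operator names from
# colliding with metadata fields that happen to be called "gt", "eq", etc.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "$eq": lambda a, b: a == b,
    "$gt": lambda a, b: a > b,
    "$lt": lambda a, b: a < b,
}


def matches(metadata: Dict[str, Any], flt: Dict[str, Dict[str, Any]]) -> bool:
    """Return True if every field condition in `flt` holds for `metadata`."""
    for field, condition in flt.items():
        for op, expected in condition.items():
            if op not in OPERATORS:
                raise ValueError(f"Unknown operator: {op}")
            if field not in metadata:
                return False
            if not OPERATORS[op](metadata[field], expected):
                return False
    return True


# Casting to text compares character by character, which is wrong for numbers.
lexical = "10" > "9"   # "1" sorts before "9"
numeric = 10 > 9

# A metadata field named "gt" does not clash with the "$gt" operator.
doc_meta = {"price": 10, "gt": "a field that happens to be named gt"}
hit = matches(doc_meta, {"price": {"$gt": 9}})
```

Keeping the values typed (as JSONB does) is what lets the `$gt` comparison behave numerically.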
2024-03-14 16:56:00 -04:00
Guangdong Liu
d4b025c812
code[patch]: Add in code documentation to core Runnable assign method (docs only) (#18951)
- **Description:** Add in-code documentation to the core Runnable
assign method (docs only)
- **Issue:** #18804
2024-03-14 15:41:19 -04:00
Bagatur
573f48e34d
core[patch]: Release 0.1.32 (#19088) 2024-03-14 12:01:58 -07:00
YHW
69a8ef2693
core: Runnable pass kwargs to _astream_log_implementation in astream_log (#19055)
- **Description:** When the `astream_log` method of the `Runnable` class
calls `_astream_log_implementation`, it does not pass along the `kwargs`
argument. As a result, a customized APIHandler cannot implement
additional features that rely on extra arguments. By contrast, the
`astream_events` method does pass `kwargs` along.
- **Issue:** https://github.com/langchain-ai/langchain/issues/19054

Co-authored-by: hyungwookyang <hyungwookyang@worksmobile.com>
2024-03-14 14:39:46 -04:00
Nuno Campos
751fb7de20
Add new beta StructuredPrompt (#19080)
2024-03-14 10:40:34 -07:00
Anton Parkhomenko
ae73b9d839
community[patch]: Fix NotionDBLoader 400 Error by conditionally adding filter parameter (#19075)
- **Description:** This change fixes a bug where attempts to load data
from Notion using the NotionDBLoader resulted in a 400 Bad Request
error. The issue was traced to the unconditional addition of an empty
'filter' object in the request payload, which Notion's API does not
accept. The modification ensures that the 'filter' object is only
included in the payload when it is explicitly provided and not empty,
thus preventing the 400 error from occurring.
- **Issue:** Fixes
[#18009](https://github.com/langchain-ai/langchain/issues/18009)
- **Dependencies:** None
- **Twitter handle:** @gunnzolder
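
A minimal sketch of the idea described above, assuming a hypothetical `build_query_payload` helper (the name, the `page_size` default, and the example filter are illustrative, not taken from `NotionDBLoader`):

```python
from typing import Any, Dict, Optional


def build_query_payload(
    page_size: int = 100, filter_object: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Build a query payload, adding 'filter' only when it is non-empty."""
    payload: Dict[str, Any] = {"page_size": page_size}
    # An empty dict or None is falsy, so no empty 'filter' key is ever sent.
    if filter_object:
        payload["filter"] = filter_object
    return payload


without_filter = build_query_payload()
with_empty_filter = build_query_payload(filter_object={})
with_filter = build_query_payload(
    filter_object={"property": "Status", "select": {"equals": "Done"}}
)
```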

Co-authored-by: Anton Parkhomenko <anton@merge.rocks>
2024-03-14 13:56:57 +00:00
Nuno Campos
2b7c3c548d
core[minor]: Add Runnable.batch_as_completed (#17603)
This PR adds `batch as completed` method to the standard Runnable
interface. It takes in a list of inputs and yields the corresponding
outputs as the inputs are completed.
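
The behaviour can be sketched with the standard library alone. This is an illustration of the "yield as completed" pattern, not the `Runnable` implementation; the free function below and its choice to pair each output with the index of its input are assumptions of the sketch.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Iterator, List, Tuple, TypeVar

In = TypeVar("In")
Out = TypeVar("Out")


def batch_as_completed(
    func: Callable[[In], Out], inputs: List[In], max_workers: int = 4
) -> Iterator[Tuple[int, Out]]:
    """Run `func` over `inputs` concurrently, yielding results as they finish."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Remember which input each future belongs to.
        futures = {pool.submit(func, item): idx for idx, item in enumerate(inputs)}
        for future in as_completed(futures):
            yield futures[future], future.result()


# Completion order is not guaranteed, so sort before comparing.
results = sorted(batch_as_completed(lambda x: x * x, [1, 2, 3]))
```

Returning the index alongside the output lets the caller match results back to inputs even though they arrive out of order.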
2024-03-13 11:18:02 -07:00
Erick Friis
74b2c0aa01
templates, cli: more security deps (#19006) 2024-03-12 20:48:56 -07:00
Erick Friis
2ffb2144a6
experimental[patch]: release 0.0.54 (#19000) 2024-03-13 00:38:46 +00:00
Erick Friis
873d06c009
langchain[patch]: release 0.1.12 (#18999) 2024-03-13 00:22:21 +00:00
Leonid Ganeline
9c8523b529
community[patch]: flattening imports 3 (#18939)
@eyurtsev
2024-03-12 15:18:54 -07:00
Erick Friis
af50f21765
community[patch]: release 0.0.28 (#18993) 2024-03-12 21:55:29 +00:00
Erick Friis
4881bb669c
core[patch]: release 0.1.31 (#18989) 2024-03-12 19:45:21 +00:00
Erick Friis
a29e8d8594
elasticsearch[patch]: fix integration tests for release (#18980) 2024-03-12 10:22:07 -07:00
Erick Friis
0d1f6c417c
elasticsearch[patch]: release 0.1.1 (#18978) 2024-03-12 16:46:22 +00:00
Max Jakob
911ccf9aa6
docs: elasticsearch retriever (#18965)
Add documentation notebook for `ElasticsearchRetriever`.

## Dependencies
- [ ] Release new `langchain-elasticsearch` version 0.2.0 that includes
`ElasticsearchRetriever`
2024-03-12 09:42:36 -07:00
Dobiichi-Origami
471f2ed40a
community[patch]: re-arrange the addtional_kwargs of returned qianfan structure to avoid _merge_dict issue (#18889)
fix issue: https://github.com/langchain-ai/langchain/issues/18441
PTAL, thanks
@baskaryan, @efriis, @eyurtsev, @hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-12 05:43:56 +00:00
Naman Jain
75122646b5
core[patch]: fixed circular dependency with json schema (#18657)
**Description:** Circular dependencies when parsing references lead to a
`RecursionError: maximum recursion depth exceeded` error. This PR
addresses the issue by tracking previously seen refs, as in a typical
DFS, to avoid infinite recursion.
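
The "seen refs" idea can be sketched as follows. This is a simplified stand-in, not the langchain-core code: `resolve_refs` is a hypothetical helper that assumes refs of the form `#/definitions/<Name>` and leaves a ref unresolved when it is already being expanded on the current path.

```python
from typing import Any, Dict, FrozenSet


def resolve_refs(
    node: Any, definitions: Dict[str, Any], seen: FrozenSet[str] = frozenset()
) -> Any:
    """Inline `$ref` entries depth-first, stopping at refs already on the path."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str):
            name = ref.split("/")[-1]
            if name in seen:
                # Cycle detected: keep the reference instead of recursing forever.
                return {"$ref": ref}
            return resolve_refs(definitions[name], definitions, seen | {name})
        return {key: resolve_refs(value, definitions, seen) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_refs(item, definitions, seen) for item in node]
    return node


# A self-referential schema that would otherwise recurse without bound.
definitions = {
    "Node": {
        "type": "object",
        "properties": {"child": {"$ref": "#/definitions/Node"}},
    }
}
resolved = resolve_refs({"$ref": "#/definitions/Node"}, definitions)
```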

**Issue:** https://github.com/langchain-ai/langchain/issues/12163

 **Twitter handle:** https://twitter.com/theBhulawat 



---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-12 05:42:45 +00:00
Tymofii
0bec1f6877
commnity[patch]: refactor code for faiss vectorstore, update faiss vectorstore documentation (#18092)
**Description:** Refactor the code of the FAISS vectorstore and update
the related documentation.
Details:
 - replace `.format()` with f-strings for string formatting;
- refactor definition of a filtering function to make code more readable
and more flexible;
- slightly improve efficiency of
`max_marginal_relevance_search_with_score_by_vector` method by removing
unnecessary looping over the same elements;
- slightly improve efficiency of `delete` method by using set data
structure for checking if the element was already deleted;

**Issue:** fix a small inconsistency in the documentation (the old
example was incorrect and inapplicable to the FAISS vectorstore)

**Dependencies:** basic langchain-community dependencies and `faiss`
(for CPU or for GPU)

**Twitter handle:** antonenkodev
2024-03-11 22:33:03 -07:00
Roshan Santhosh
acf1ecc081
langchain[patch]: update llm_router.py (#18865)
Issue : _call method of LLMRouterChain uses predict_and_parse, which is
slated for deprecation.

Description : Instead of using predict_and_parse, this replaces it with
individual predict and parse functions.
2024-03-11 22:30:07 -07:00
Bagatur
18de77cc8c
core[minor]: add streaming support to OAI tool parsers (#18940)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-11 21:53:56 -07:00
Bagatur
e0e688a277
core[minor]: generation info on msg (#18592)
related to #16403 #17188
2024-03-12 04:43:17 +00:00
Tomaz Bratanic
cda43c5a11
experimental[patch]: Fix LLM graph transformer default prompt (#18856)
Some LLMs do not allow multiple user messages in sequence.
2024-03-11 20:11:52 -07:00
Bagatur
19721246f5
core[patch]: support labeled json schema as tools (#18935) 2024-03-11 19:51:35 -07:00
Erick Friis
0d888a65cb
core[patch]: move some attr/methods to BaseLanguageModel (#18936)
Cleans up some shared code between `BaseLLM` and `BaseChatModel`. One
functional difference to make it more consistent (see comment)
2024-03-11 14:59:45 -07:00
aditya thomas
5c2f7e6b2b
partners[openai]: update the docstring of OpenAI, OpenAIEmbeddings and ChatOpenAI classes (#18908)
**Description:** Update the docstring of OpenAI, OpenAIEmbeddings and
ChatOpenAI classes
**Issue:** Update import module paths to the current LangChain API
**Dependencies:** None
**Lint and test**: `make format` and `make lint` were run

This incorporates the review comments from langchain-ai/langchain#18637
which I closed due to an issue I had in updating that pr branch

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-11 20:48:54 +00:00
Leonid Ganeline
11195cfa42
community[patch]: speed up import times in the community package (#18928)
This PR speeds up import times in the community package
2024-03-11 16:37:36 -04:00
aditya thomas
8544f748f2
community[patch]: update AnthropicLLM deprecation message (#18869)
**Description:** Update AnthropicLLM deprecation message import path for
ChatAnthropic
**Issue:** Incorrect import path in deprecation message
**Dependencies:** None
**Lint and test**: `make format`, `make lint` and `make test` were run
2024-03-11 12:59:10 -07:00
Virat Singh
cafffe8a21
community: Add PolygonAggregates tool (#18882)
**Description:**
In this PR, I am adding a `PolygonAggregates` tool, which can be used to
get historical stock price data (called aggregates by Polygon) for a
given ticker.

Polygon
[docs](https://polygon.io/docs/stocks/get_v2_aggs_ticker__stocksticker__range__multiplier___timespan___from___to)
for this endpoint.

**Twitter**: 
[@virattt](https://twitter.com/virattt)
2024-03-11 11:58:10 -07:00
Erick Friis
93ef8ead0b
mongodb[patch]: fix core dep (#18926) 2024-03-11 10:27:29 -07:00
Mohammad Mohtashim
43db4cd20e
core[major]: On Tool End Observation Casting Fix (#18798)
This PR updates the on_tool_end handlers to return the raw output from the tool instead of casting it to a string. 

This is technically a breaking change, though its impact is expected to be somewhat minimal. It will fix behavior in `astream_events` as well.

Fixes the following issue #18760 raised by @eyurtsev

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-11 10:59:04 -04:00
Massimiliano Pronesti
8113d612bb
community[patch]: support modin document loader (#18866)
Langchain community document loaders support `pyspark`, `polars`, and
`pandas` dataframes but not `modin`'s. This PR addresses this point.
2024-03-10 18:40:04 -07:00
Pol Ruiz Farre
a7f63d8cb4
community[patch]: Fix BasePDFLoader suffix for s3 presigned urls (#18844)
BasePDFLoader doesn't parse the suffix of the file correctly when
parsing S3 presigned urls. This fix enables the proper detection and
parsing of S3 presigned URLs to prevent errors such as `OSError: [Errno
36] File name too long`.
No additional dependencies required.
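
To illustrate the problem, here is a small sketch using only the standard library. The URL and the `suffix_from_url` helper are made up for the example; the point is that the suffix should be taken from the URL path, not from the full string including the signed query parameters.

```python
import posixpath
from urllib.parse import urlparse


def suffix_from_url(url: str) -> str:
    """Return the file suffix of a URL, ignoring the query string."""
    path = urlparse(url).path
    return posixpath.splitext(path)[1]


# Hypothetical presigned URL; real ones carry much longer query strings.
url = (
    "https://bucket.s3.amazonaws.com/reports/q1.pdf"
    "?X-Amz-Signature=abc123&X-Amz-Expires=3600"
)

# Naive approach: the "suffix" swallows the whole query string.
naive_suffix = posixpath.splitext(url)[1]

# Parsing the URL first yields only the real extension.
suffix = suffix_from_url(url)
```

With a real presigned URL the naive suffix can be hundreds of characters long, which is how a temporary file name built from it can end up exceeding the file system limit.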
2024-03-11 00:58:51 +00:00
Joshua Carroll
ddaf9de169
community: Fix bug with StreamlitChatMessageHistory (#18834)
- **Description:** Fix Streamlit bug which was introduced by
https://github.com/langchain-ai/langchain/pull/18250, update integration
test
- **Issue:** https://github.com/langchain-ai/langchain/issues/18684
- **Dependencies:** None
2024-03-09 13:42:22 -08:00
Tomaz Bratanic
a28be31a96
Switch to md5 for deduplication in neo4j integrations (#18846)
Deduplicate documents using MD5 of the page_content. Also allows for
custom deduplication with graph ingestion method by providing metadata
id attribute
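
A rough sketch of this deduplication scheme, using plain dictionaries in place of LangChain `Document` objects. The `deduplicate` helper and the rule that a metadata `id` takes precedence over the content hash are assumptions made for illustration.

```python
import hashlib
from typing import Any, Dict, List


def content_id(page_content: str) -> str:
    """Derive a stable identifier from the text via its MD5 digest."""
    return hashlib.md5(page_content.encode("utf-8")).hexdigest()


def deduplicate(docs: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the first document seen for each id (custom id or content hash)."""
    unique: Dict[str, Dict[str, Any]] = {}
    for doc in docs:
        key = doc.get("metadata", {}).get("id") or content_id(doc["page_content"])
        unique.setdefault(key, doc)
    return list(unique.values())


docs = [
    {"page_content": "Neo4j is a graph database."},
    {"page_content": "Neo4j is a graph database."},            # same content
    {"page_content": "Custom id wins.", "metadata": {"id": "doc-1"}},
    {"page_content": "Different text.", "metadata": {"id": "doc-1"}},  # same id
]
deduped = deduplicate(docs)
```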

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-03-09 13:28:55 -08:00
Tomaz Bratanic
246724faab
LLM graph transformer prompt engineering (#18843)
A bit of prompt engineering to improve results
2024-03-09 11:27:16 -08:00
Erick Friis
b48865bf94
langchain[patch]: attach hub metadata (#18830) 2024-03-08 18:40:49 -08:00
Ammar
34b31a8cc7
core: add in-code docs for RunnableAssign class (#18826)
**Description:** Improves the docstring for `RunnableAssign` by
providing a concise description and a self-contained code example.
  **Issue:**  #18803
2024-03-09 02:04:52 +00:00
Leonid Ganeline
476d6dc596
community[patch]: Use getattr for toolkits imports (#18825)
This will preserve the namespace, without actually loading the underlying packages on init.
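
The mechanism behind this is the module-level `__getattr__` hook (PEP 562). The sketch below builds a throwaway module object to show it; the `_LAZY` mapping and the standard-library targets are placeholders, not the names used in `langchain_community`.

```python
import importlib
import types
from typing import Any, Dict

# Hypothetical mapping from public attribute name to the module that defines it.
_LAZY: Dict[str, str] = {"JSONDecoder": "json", "OrderedDict": "collections"}


def _lazy_getattr(name: str) -> Any:
    """Import the backing module only when the attribute is first requested."""
    if name in _LAZY:
        module = importlib.import_module(_LAZY[name])
        return getattr(module, name)
    raise AttributeError(f"module has no attribute {name!r}")


# In a real package this function would be named __getattr__ in __init__.py.
toolkits = types.ModuleType("toolkits_demo")
toolkits.__getattr__ = _lazy_getattr

# Attribute access triggers the import; nothing was imported at "init" time.
decoder_cls = toolkits.JSONDecoder
```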
2024-03-08 20:54:28 -05:00
Erick Friis
bbb609ac9d
core[patch]: fix arbitrary config keys (#18827) 2024-03-08 17:35:13 -08:00
Luis Antonio Vieira Junior
67c880af74
community[patch]: adding linearization config to AmazonTextractPDFLoader (#17489)
- **Description:** Adding an optional parameter `linearization_config`
to the `AmazonTextractPDFLoader` so the caller can define how the output
will be linearized, instead of forcing a predefined set of linearization
configs. It will still have a default configuration as this will be an
optional parameter.
- **Issue:** #17457
- **Dependencies:** The same ones that already exist for
`AmazonTextractPDFLoader`
- **Twitter handle:** [@lvieirajr19](https://twitter.com/lvieirajr19)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 17:25:22 -08:00
Anis ZAKARI
37e89ba5b1
community[patch]: Bedrock add support for mistral models (#18756)
**Description**: My previous
[PR](https://github.com/langchain-ai/langchain/pull/18521) was
mistakenly closed, so I am reopening this one. Context: AWS released two
Mistral models on Bedrock last Friday (March 1, 2024). This PR includes
some code adjustments to ensure their compatibility with the Bedrock
class.

---------

Co-authored-by: Anis ZAKARI <anis.zakari@hymaia.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-09 01:20:38 +00:00
Alexander Dicke
66576948e0
experimental[minor]: adds mixtral wrapper (#17423)
**Description:** Adds a chat wrapper for Mixtral models using the
[prompt
template](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1#instruction-format).

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 17:14:23 -08:00
Keith Chan
914af69b44
community[patch]: Update azuresearch vectorstore from_texts() method to include fields argument (#17661)
- **Description:** Update azuresearch vectorstore from_texts() method to
include fields argument, necessary for creating an Azure AI Search index
with custom fields.
- **Issue:** Currently index fields are fixed to default fields if Azure
Search index is created using from_texts() method
- **Dependencies:** None
- **Twitter handle:** None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 17:05:35 -08:00
al1p
46f0cea2b9
community[patch][: improved the suffix prompt to avoid loop (#17791)
Small improvement to the openapi prompt.
The agent was not finding the server base URL (looping through all
nodes). This small change narrows the search and enables finding the url
faster.

No dependency 

Twitter : @al1pra
2024-03-08 16:53:09 -08:00
Dmitry Kankalovich
f5117e907d
openai[patch]: Proper example for AzureOpenAI usage in error message (#17798)
# Proper example for AzureOpenAI usage in error message

The usage example given in the original error message was wrong. This
change corrects it.

Co-authored-by: Dzmitry Kankalovich <dzmitry_kankalovich@epam.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 16:52:55 -08:00
Théo LEBRUN
cf94091cd0
community[patch]: Skip nested directories when using S3DirectoryLoader (#17829)
- **Description:** `S3DirectoryLoader` fails if the prefix is a folder
(ex: `my_folder/`) because `S3FileLoader` tries to load that folder and
fails. This PR skips nested directories so the prefix can be set to a
folder instead of `my_folder/files_prefix`.
- **Issue:**
  - #11917
  - #6535
  - #4326
- **Dependencies:** none
- **Twitter handle:** @Falydoor
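
A minimal sketch of the skipping logic, assuming that folder placeholders show up in a bucket listing as zero-byte objects whose key ends in `/`. The `loadable_keys` helper and the `(key, size)` tuples are illustrative and stand in for the objects a real S3 listing would return.

```python
from typing import Iterable, List, Tuple


def loadable_keys(objects: Iterable[Tuple[str, int]]) -> List[str]:
    """Keep keys that look like files; skip zero-byte folder placeholders."""
    return [
        key
        for key, size in objects
        if not (size == 0 and key.endswith("/"))
    ]


# Hypothetical listing for prefix "my_folder/": (key, size in bytes).
listing = [
    ("my_folder/", 0),
    ("my_folder/a.txt", 12),
    ("my_folder/nested/", 0),
    ("my_folder/nested/b.txt", 7),
]
keys = loadable_keys(listing)
```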


2024-03-08 16:50:58 -08:00
Venkatesan
7a18b63dbf
community[patch]: Mongo index creation (#17748)
- **Title:** Mongodb: MongoDB connection performance improvement.
- **Description:** Makes index creation on the collection optional,
since index creation is a one-time process.
- **Issue:** The `MongoDBChatMessageHistory` object attempts to create
an index during connection, causing each request to take longer than
usual. This should be optional, controlled by a parameter.
- **Dependencies:** N/A
- **Branch to be checked:** origin/mongo_index_creation

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 16:43:17 -08:00
wt3639
5b5b37a999
community[patch]: Add embedding instruction to HuggingFaceBgeEmbeddings (#18017)
- **Description:** Add embedding instruction to
HuggingFaceBgeEmbeddings, so that it can be compatible with nomic and
other models that need embedding instruction.

---------

Co-authored-by: Tao Wu <tao.wu@rwth-aachen.de>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 16:39:29 -08:00
Erick Friis
a8de6d1533
anthropic[patch]: integration test update (#18823) 2024-03-08 13:47:31 -08:00
wewebber-merlin
d1f5bc4906
anthropic[patch]: add kwargs to format_output base (#18715)
_generate() and _agenerate() both accept **kwargs, then pass them on to
_format_output; but _format_output doesn't accept **kwargs. Attempting
to pass, e.g.,

     timeout=50

to _generate (or invoke()) results in a TypeError.
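
The failure mode can be reproduced in isolation. The functions below are simplified stand-ins with made-up bodies, not the `langchain-anthropic` code; they only show why forwarding `**kwargs` to a function that does not accept them raises a `TypeError`, and how accepting `**kwargs` fixes it.

```python
from typing import Any, Callable, Dict, Optional


def format_output_strict(data: str) -> Dict[str, str]:
    """Formatter that does not accept extra keyword arguments."""
    return {"text": data}


def format_output(data: str, **kwargs: Any) -> Dict[str, str]:
    """Formatter that tolerates (and here ignores) extra keyword arguments."""
    return {"text": data}


def generate(prompt: str, formatter: Callable[..., Dict[str, str]], **kwargs: Any) -> Dict[str, str]:
    """Stand-in for a generate step that forwards its kwargs to the formatter."""
    return formatter(prompt.upper(), **kwargs)


strict_error: Optional[TypeError] = None
try:
    generate("hi", format_output_strict, timeout=50)
except TypeError as exc:
    strict_error = exc

result = generate("hi", format_output, timeout=50)
```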


---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-08 21:47:21 +00:00
Erick Friis
aa7bce6b13
anthropic[patch]: release 0.1.4 (#18822) 2024-03-08 21:34:47 +00:00
Erick Friis
a5bcddc738
anthropic[patch]: streaming param (#18819) 2024-03-08 13:32:57 -08:00
Erick Friis
8c0b215c02
anthropic[patch]: fix format output args (#18816) 2024-03-08 12:34:11 -08:00
Ishani Vyas
2b0cbd65ba
community[patch]: Add Passio Nutrition AI Food Search Tool to Community Package (#18278)
## Add Passio Nutrition AI Food Search Tool to Community Package

### Description
We propose adding a new tool to the `community` package, enabling
integration with Passio Nutrition AI for food search functionality. This
tool will provide a simple interface for retrieving nutrition facts
through the Passio Nutrition AI API, simplifying user access to
nutrition data based on food search queries.

### Implementation Details
- **Class Structure:** Implement `NutritionAI`, extending `BaseTool`. It
includes an `_run` method that accepts a query string and, optionally, a
`CallbackManagerForToolRun`.
- **API Integration:** Use `NutritionAIAPI` for the API wrapper,
encapsulating all interactions with the Passio Nutrition AI and
providing a clean API interface.
- **Error Handling:** Implement comprehensive error handling for API
request failures.

### Expected Outcome
- **User Benefits:** Enable easy querying of nutrition facts from Passio
Nutrition AI, enhancing the utility of the `langchain_community` package
for nutrition-related projects.
- **Functionality:** Provide a straightforward method for integrating
nutrition information retrieval into users' applications.

### Dependencies
- `langchain_core` for base tooling support
- `pydantic` for data validation and settings management
- Consider `requests` or another HTTP client library if not covered by
`NutritionAIAPI`.

### Tests and Documentation
- **Unit Tests:** Include tests that mock network interactions to ensure
tool reliability without external API dependency.
- **Documentation:** Create an example notebook in
`docs/docs/integrations/tools/passio_nutrition_ai.ipynb` showing usage,
setup, and example queries.

### Contribution Guidelines Compliance
- Adhere to the project's linting and formatting standards (`make
format`, `make lint`, `make test`).
- Ensure compliance with LangChain's contribution guidelines,
particularly around dependency management and package modifications.

### Additional Notes
- Aim for the tool to be a lightweight, focused addition, not
introducing significant new dependencies or complexity.
- Potential future enhancements could include caching for common queries
to improve performance.

### Twitter Handle
- Here is our Passio AI [twitter handle](https://twitter.com/@passio_ai)
where we announce our products.


2024-03-08 20:33:22 +00:00
Kushagra
b1f22bf76c
community[minor]: added a feature to filter documents in Mongoloader (#18253)
- **Description:** added a feature to filter documents in Mongoloader
    - **Feature:** the feature #18251
    - **Dependencies:** No
    - **Twitter handle:** https://twitter.com/im_Kushagra
2024-03-08 12:06:35 -08:00
Eugene Yurtsev
1f50274df7
community[patch]: Add pgvector to docker compose and update settings used in integration test (#18815) 2024-03-08 14:39:28 -05:00
Erick Friis
ad29806255
nvidia-trt, nvidia-ai-endpoints: move to repo (#18814)
NVIDIA maintained in https://github.com/langchain-ai/langchain-nvidia
2024-03-08 19:30:50 +00:00
Christophe Bornet
e54a49b697
community[minor]: Add lazy_table_reflection param to SqlDatabase (#18742)
For some DBs with lots of tables, reflection of all the tables can take
very long. So this change will make the tables be reflected lazily when
get_table_info() is called and `lazy_table_reflection` is True.
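
The trade-off can be sketched with a toy class. `TableCatalog` and `fake_reflect` are hypothetical and do not mirror the `SQLDatabase` internals; only the `lazy_table_reflection` flag and the `get_table_info()` name come from the description above.

```python
from typing import Callable, Dict, Iterable, List, Optional


class TableCatalog:
    """Toy stand-in for a database wrapper that reflects table schemas."""

    def __init__(
        self,
        table_names: Iterable[str],
        reflect: Callable[[str], str],
        lazy_table_reflection: bool = False,
    ) -> None:
        self._table_names = list(table_names)
        self._reflect = reflect
        self._cache: Dict[str, str] = {}
        if not lazy_table_reflection:
            # Eager mode: pay the full reflection cost up front.
            for name in self._table_names:
                self._cache[name] = reflect(name)

    def get_table_info(self, table_names: Optional[Iterable[str]] = None) -> str:
        names = list(table_names) if table_names is not None else self._table_names
        for name in names:
            # Lazy mode: reflect a table the first time it is asked for.
            if name not in self._cache:
                self._cache[name] = self._reflect(name)
        return "\n".join(self._cache[name] for name in names)


reflected: List[str] = []


def fake_reflect(name: str) -> str:
    """Record the call and return a placeholder schema string."""
    reflected.append(name)
    return f"CREATE TABLE {name} (...)"


catalog = TableCatalog(
    ["users", "orders", "events"], fake_reflect, lazy_table_reflection=True
)
calls_after_init = len(reflected)
info = catalog.get_table_info(["orders"])
```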
2024-03-08 14:10:23 -05:00
Christophe Bornet
ead2a74806
community: Implement lazy_load() for JSONLoader (#18643)
Covered by `tests/unit_tests/document_loaders/test_json_loader.py`
2024-03-08 13:58:17 -05:00
Erick Friis
a88f62ec3c
langchain[patch]: getattr import from langchain.chains (#18160) 2024-03-08 10:36:14 -08:00
Eugene Yurtsev
cdfb5b4ca1
core[minor]: Chat Models to fallback astream to fallback on sync stream if available (#18748)
Allows all chat models that implement `_stream` but not `_astream` to still support async streaming.

Amongst other things this should resolve issues with streaming community model implementations through langserve since langserve is exclusively async.
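
One way such a fallback can work is to drive the synchronous generator from a worker thread so the event loop is not blocked. The sketch below uses only `asyncio`; `SyncOnlyModel` is a made-up class, and this is an illustration of the pattern rather than the code in `BaseChatModel`.

```python
import asyncio
from typing import AsyncIterator, Iterator, List


class SyncOnlyModel:
    """Toy model that implements only a synchronous stream."""

    def _stream(self, prompt: str) -> Iterator[str]:
        for token in prompt.split():
            yield token

    async def astream(self, prompt: str) -> AsyncIterator[str]:
        """Async fallback: pull chunks from the sync generator in a thread."""
        loop = asyncio.get_running_loop()
        iterator = self._stream(prompt)
        done = object()  # sentinel returned by next() when the generator ends
        while True:
            chunk = await loop.run_in_executor(None, next, iterator, done)
            if chunk is done:
                break
            yield chunk


async def collect(prompt: str) -> List[str]:
    return [chunk async for chunk in SyncOnlyModel().astream(prompt)]


tokens = asyncio.run(collect("hello async world"))
```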
2024-03-08 13:27:29 -05:00
aditya thomas
e00c1ff2b0
infra: ChatOpenAI unit tests for invoke() and ainvoke() (#18792)
**Description:** Replacing the deprecated predict() and apredict()
methods in the unit tests
**Issue:** Not applicable
**Dependencies:** None
**Lint and test**: `make format`, `make lint` and `make test` have been
run
2024-03-08 09:48:38 -08:00
Bagatur
3e29c04213
core[minor]: add BaseMessage.response_metadata (#18699) 2024-03-08 09:35:56 -08:00
Bagatur
bc6249c889
langchain[patch]: runnable agent streaming param (#18761)
Usage:

```python
agent = RunnableAgent(runnable=runnable, .., stream_runnable=False)
```
or for convenience
```python
agent_executor = AgentExecutor(agent=agent, ..., stream_runnable=False)
```
2024-03-07 20:53:53 -08:00
Tomaz Bratanic
c8c592d3f1
experimental[minor]: Add LLM graph transformer (#18733)
Add a class that constructs knowledge graphs based on text using an LLM.
2024-03-07 20:52:53 -08:00
Phat Vo
3ecb903d49
community[patch] : Tidy up and update Clarifai SDK functions (#18314)
Description :
* Tidy up, add missing docstring and fix unused params
* Enable using session token
2024-03-07 19:47:44 -08:00
Max Jakob
61a2eba081
elasticsearch[patch]: add top-level import, remove obsolete dependency (#18644)
Make `ElasticsearchRetriever` available as top-level import.

The `langchain` package depends on `langchain-community` so we do not
need to depend on it explicitly.
2024-03-07 19:38:31 -08:00
Tomaz Bratanic
010a234f1e
docs: Fix diffbot graph transformer description (#18736)
The previous docstring was invalid
2024-03-07 19:25:41 -08:00
Jan Nissen
b8922480ed
core[patch]: improve PydanticOutputParser typing (#18740)
This PR adds generic typing to `PydanticOutputParser` so we get a typed
output from `.parse` instead of `Any`. It should provide a better DX by
way of Intellisense and for anyone strictly typing.

Pre-change:

![Screenshot 2024-03-07 at 10 22
31 AM](https://github.com/langchain-ai/langchain/assets/22690160/fd22dde0-9fdc-4283-b283-4c98f0bc46e5)

Post-change:

![Screenshot 2024-03-07 at 10 26
31 AM](https://github.com/langchain-ai/langchain/assets/22690160/7e23d2b7-8f8c-494f-80b3-187530a173ee)

I haven't dug too deep, but I think a similar change could probably be
added to `JsonOutputParser` so we don't have to pull up `.parse`.
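
The typing idea can be sketched without Pydantic. `TypedParser` and `Joke` are hypothetical names, and a dataclass stands in for a Pydantic model; the point is that parameterizing the parser with a `TypeVar` lets `.parse` return the model type instead of `Any`.

```python
import json
from dataclasses import dataclass
from typing import Generic, Type, TypeVar

T = TypeVar("T")


class TypedParser(Generic[T]):
    """Parser whose `parse` return type follows the model class it was given."""

    def __init__(self, model: Type[T]) -> None:
        self.model = model

    def parse(self, text: str) -> T:
        # Type checkers infer T from the constructor argument.
        return self.model(**json.loads(text))


@dataclass
class Joke:
    setup: str
    punchline: str


parser = TypedParser(Joke)
joke = parser.parse('{"setup": "Why?", "punchline": "Because."}')
# A type checker now knows `joke` is a Joke, so `joke.setup` autocompletes.
```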

Co-authored-by: Jan Nissen <jan23@gmail.com>
2024-03-07 19:25:24 -08:00
Massimiliano Pronesti
3b975c6ebe
experimental[minor]: add support for modin in pandas agent (#18749)
Added support for Intel's
[modin](https://github.com/modin-project/modin) in
`create_pandas_dataframe_agent`.
2024-03-07 19:23:07 -08:00
Tomaz Bratanic
4bfe888717
comunity[patch]: Fix neo4j sanitizing values (#18750)
Fixing sanitization for when deeply nested lists appear
2024-03-07 19:21:52 -08:00
Eugene Yurtsev
6caceb5473
core[patch]: Automatic upgrade to AddableDict in transform and atransform (#18743)
Automatic upgrade to transform and atransform

Closes: 

https://github.com/langchain-ai/langchain/issues/18741
https://github.com/langchain-ai/langgraph/issues/136
https://github.com/langchain-ai/langserve/issues/504
2024-03-07 21:23:12 -05:00
Yunmo Koo
fee6f983ef
community[minor]: Integration for Friendli LLM and ChatFriendli ChatModel. (#17913)
## Description
- Add [Friendli](https://friendli.ai/) integration for `Friendli` LLM
and `ChatFriendli` chat model.
- Unit tests and integration tests corresponding to this change are
added.
- Documentation corresponding to this change is added.

## Dependencies
- Optional dependency
[`friendli-client`](https://pypi.org/project/friendli-client/) package
is added only for those who use the `Friendli` or `ChatFriendli` model.

## Twitter handle
- https://twitter.com/friendliai
2024-03-08 02:20:47 +00:00
Smit Parmar
aed46cd6f2
community[patch]: Added support for filter out AWS Kendra search by score confidence (#12920)
**Description:** Adds support for filtering Kendra search results by
score confidence, which makes the results more accurate.
For example:
```python
from langchain_community.retrievers import AmazonKendraRetriever

# kendra_index_id and region are assumed to be defined by the caller.
retriever = AmazonKendraRetriever(
    index_id=kendra_index_id,
    top_k=5,
    region_name=region,
    score_confidence="HIGH",
)
```
The result will not include records whose score confidence is "LOW" or "MEDIUM".
Relevant docs 
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/kendra/client/query.html
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/kendra/client/retrieve.html

**Issue:** Resolves #11801
**twitter:** [@SmitCode](https://twitter.com/SmitCode)
2024-03-07 17:28:09 -08:00
Ian
390ef6abe3
community[minor]: Add Initial Support for TiDB Vector Store (#15796)
This pull request introduces initial support for the TiDB vector store.
The current version is basic, laying the foundation for the vector store
integration. While this implementation provides the essential features,
we plan to expand and improve the TiDB vector store support with
additional enhancements in future updates.

Upcoming Enhancements:
* Support for Vector Index Creation: To enhance the efficiency and
performance of the vector store.
* Support for max marginal relevance search. 
* Customized Table Structure Support: Recognizing the need for
flexibility, we plan for more tailored and efficient data store
solutions.

Simple use case example:

```python
from typing import List, Tuple
from langchain.docstore.document import Document
from langchain_community.vectorstores import TiDBVectorStore
from langchain_openai import OpenAIEmbeddings

# Assumptions: OPENAI_API_KEY is set in the environment, and the
# connection string below is a placeholder to replace with your own.
embeddings = OpenAIEmbeddings()
tidb_connection_url = "mysql+pymysql://<user>:<password>@<host>:4000/<database>"

db = TiDBVectorStore.from_texts(
    embedding=embeddings,
    texts=[
        'Andrew like eating oranges',
        'Alexandra is from England',
        'Ketanji Brown Jackson is a judge',
    ],
    table_name="tidb_vector_langchain",
    connection_string=tidb_connection_url,
    distance_strategy="cosine",
)

# Retrieve the closest documents together with their distance scores.
query = "Can you tell me about Alexandra?"
docs_with_score: List[Tuple[Document, float]] = db.similarity_search_with_score(query)
for doc, score in docs_with_score:
    print("-" * 80)
    print("Score: ", score)
    print(doc.page_content)
    print("-" * 80)
```
2024-03-07 17:18:20 -08:00
Bagatur
3b1eb1f828
community[patch]: chat hf typing fix (#18693) 2024-03-07 17:06:38 -08:00
Jib
d60e93b6ae
langchain-mongodb: Standardize mongodb collection/index names in tests (#18755)
## **Description:**
MongoDB integration tests link to a provided Atlas Cluster. We have very
stringent permissions set against the cluster provided. In order to make
it easier to track and isolate the collections each test gets run
against, we've updated the collection names to match the test file name.
i.e. `langchain_{filename}` => `langchain_test_vectorstores`

Fixes integration test results

![image](https://github.com/langchain-ai/langchain/assets/2887713/41f911b9-55f7-4fe4-9134-5514b82009f9)

## **Dependencies:** 
Provided MONGODB_ATLAS_URI


cc: @shaneharvey, @blink1073 , @NoahStapp , @caseyclements
2024-03-07 17:16:04 -05:00
Eugene Yurtsev
ca299a8e08
Docs: Add custom parsing documentation and extending langchain (#18331)
* Added extending langchain.mdx -- we'll need to add links as we add
more custom documentation
* Added partial documentation about parsers
2024-03-07 16:30:57 -05:00
Eugene Yurtsev
8c71f92cb2
core: upgrade mypy to recent mypy (#18753)
Testing this works per package on CI
2024-03-07 15:25:19 -05:00
Eugene Yurtsev
e188d4ecb0
Add dangerous parameter to requests tool (#18697)
The tools are already documented as dangerous. Not clear whether adding
an opt-in parameter is necessary or not
2024-03-07 15:10:56 -05:00