Commit Graph

3440 Commits (f1313339ac6f45bcf0afcf572d372c7e3660c0af)

Author SHA1 Message Date
Kenzie Mihardja 21f75991d4
deprecate community docugami loader (#19230)
Thank you for contributing to LangChain!

- [x] **PR title**: "community: deprecate DocugamiLoader"

- [x] **PR message**: Deprecate the langchain_community and use the
docugami_langchain DocugamiLoader

---------

Co-authored-by: Kenzie Mihardja <kenzie28@cs.washington.edu>
6 months ago
Jib ec026004cb
mongodb[patch]: Remove in-memory cache from cache abstractions (#18987)
## Description
* In memory cache easily gets out of sync with the server cache, so we
will remove it entirely to reduce the issues around invalidated caches.

## Dependencies
None

- [x]  If you're adding a new integration, please include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Co-authored-by: Erick Friis <erick@langchain.dev>
6 months ago
Jib 866d6408af
mongodb[patch]: Remove embedding retrieval from mongodb payload (#19035)
## Description
Returning the embedding is not necessary in the vector search
functionality unless specified as a debugging step. This change defaults
the behavior such that the server _only_ returns the embedding key if
explicitly requested, such as in the case of
`max_marginal_relevance_search`.


- [x] **Add tests and docs**: If you're adding a new integration, please
include
* Added `test_from_documents_no_embedding_return`


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
6 months ago
Leonid Kuligin 366ba77459
core[minor]: moved fake llms and embeddings to core (#19226)
- [ ] **PR title**: "core: moved fake llms and embeddings to core"


- [ ] **PR message**:
 - **Description:** moved fake llms and embeddings to core"
6 months ago
Pengfei Jiang 514fe80778
community[patch]: add stop parameter support to volcengine maas (#19052)
- **Description:** add stop parameter to volcengine maas model
- **Dependencies:** no

---------

Co-authored-by: 江鹏飞 <jiangpengfei.jiangpf@bytedance.com>
6 months ago
htaoruan bcc771e37c
docs: ChatTongyi example error (#19013) 6 months ago
primate88 5aa68936e0
community: Fix import path for StreamingStdOutCallbackHandler example (#19170)
- Description:
- Updated the import path for `StreamingStdOutCallbackHandler` in the
streaming response example within `huggingface_endpoint.py`. This change
corrects the import statement to reflect the actual location of
`StreamingStdOutCallbackHandler` in
`langchain_core.callbacks.streaming_stdout`.
- Issue:
  - None
- Dependencies:
  - No additional dependencies are required for this change.
- Twitter handle:
  - None

## Note:
I have tested this change locally and confirmed that the
`StreamingStdOutCallbackHandler` works as expected with the updated
import path. This PR does not require the addition of new tests since it
is a correction to documentation/examples rather than functional code.
6 months ago
Bagatur 611d5a1618
openai[patch]: fix async http client (#19164)
Fix #19116
6 months ago
Nikhil Kumar 635b3372bd
community[minor]: Add support for translation in HuggingFacePipeline (#19190)
- [x] **Support for translation**: "community: Add support for
translation in `HuggingFacePipeline`"


- [x] **Add support for translation in `HuggingFacePipeline`**:
- **Description:** Add support for translation in `HuggingFacePipeline`,
which earlier used to support only text summarization and generation.
    - **Issue:** N/A
    - **Dependencies:** N/A
    - **Twitter handle:** None
6 months ago
Nikhil Kumar a1b26dd9b6
docs: Add docs for RouterRunnable (#19191)
- [x] **Docs for `RouterRunnable`**: core: Add docs for `RouterRunnable`

- [x] **Add docs for `RouterRunnable`**:
- **Description:** Add docs for `RouterRunnable`, which was previously
missing documentation
    - **Issue:** #18803 
    - **Dependencies:** N/A
    - **Twitter handle:** None
6 months ago
k.muto 8d2c34e655
community: Fix all page numbers were the same for _BaseGoogleVertexAISearchRetriever (#19175)
- Description:
- This pull request is to fix a bug where page numbers were not set
correctly. In the current code, all chunks share the same metadata
object doc_metadata, so the page number is set with the same value for
all documents. To fix this, I changed to using separate metadata objects
for each chunk.
- Issue:
  - None
- Dependencies:
  - No additional dependencies are required for this change.
- Twitter handle:
  - @eycjur

- Test
- Even if it's not a bug, there are cases where everything ends up with
the same number of pages, so it's very difficult for me to write
integration tests.
6 months ago
Cailin Wang 7cd87d2f6a
community: Add `partition` parameter to DashVector (#19023)
**Description**: DashVector Add partition parameter
**Twitter handle**: @CailinWang_

---------

Co-authored-by: root <root@Bluedot-AI>
6 months ago
Rodrigo Nogueira e64cf1aba4
community: Add model argument for maritalk models and better error handling (#19187) 6 months ago
Sergey Kozlov 1a55e950aa
community[patch]: support fastembed v1 and v2 (#19125)
**Description:**
#18040 forces `fastembed>2.0`, and this causes dependency conflicts with
the new `unstructured` package (different `onnxruntime`). There may be
other dependency conflicts.. The only way to use
`langchain-community>=0.0.28` is rollback to `unstructured 0.10.X`. But
new `unstructured` contains many fixes.

This PR allows to use both `fastembed` `v1` and `v2`.

How to reproduce:

`pyproject.toml`:
```toml
[tool.poetry]
name = "depstest"
version = "0.0.0"
description = "test"
authors = ["<dev@example.org>"]

[tool.poetry.dependencies]
python = ">=3.10,<3.12"
langchain-community = "^0.0.28"
fastembed = "^0.2.0"
unstructured = {extras = ["pdf"], version = "^0.12"}
```

```bash
$ poetry lock
```

Co-authored-by: Sergey Kozlov <sergey.kozlov@ludditelabs.io>
6 months ago
six17 fd4f536c77
text-splitters[patch]: fix json split of RecursiveJsonSplitter (#19119)
- **Description:** This modification addresses the issue of mutable
default parameters in functions. In the original code, the `chunks`
parameter is defaulted to a list containing an empty dictionary, which
is mutable. Since default parameters in Python are evaluated only once
at function definition time, modifications to the parameter would
persist across future calls. By changing the default to `None` and
checking/initializing within the function, a new list is created for
each call, thus avoiding potential issues.

---------

Co-authored-by: sixiang <sixiang@lixiang.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
6 months ago
aditya thomas 80eb510a7b
docs: update docstring of Together class (#19008)
**Description:** Update docstring of Together class to show example and
update API URL
**Issue:** Improves usability
**Dependencies:** None
**Lint and test**: `make format`, `make lint` and `make test` were run
6 months ago
高远 ef9813dae6
docs: add vikingdb docstrings(#19016)
Co-authored-by: gaoyuan <gaoyuan.20001218@bytedance.com>
6 months ago
wulixuan 0e0030f494
community[patch]: fix yuan2 chat model errors while invoke. (#19015)
1. fix yuan2 chat model errors while invoke.
2. update related tests.
3. fix some deprecationWarning.
6 months ago
Shuai Liu c244e1a50b
community[patch]: Fixed bug in merging `generation_info` during chunk concatenation in Tongyi and ChatTongyi (#19014)
- **Description:** 

In #16218 , during the `GenerationChunk` and `ChatGenerationChunk`
concatenation, the `generation_info` merging changed from simple keys &
values replacement to using the util method
[`merge_dicts`](https://github.com/langchain-ai/langchain/blob/master/libs/core/langchain_core/utils/_merge.py):


![image](https://github.com/langchain-ai/langchain/assets/2098020/10f315bf-7fe0-43a7-a0ce-6a3834b99a15)

The `merge_dicts` method could not handle merging values of `int` or
some other types, and would raise a
[`TypeError`](https://github.com/langchain-ai/langchain/blob/master/libs/core/langchain_core/utils/_merge.py#L55).

This PR fixes this issue in the **Tongyi and ChatTongyi Model** by
adopting the `generation_info` of the last chunk
and discarding the `generation_info` of the intermediate chunks,
ensuring that `stream` and `astream` function correctly.

- **Issue:**  
    - Related issues or PRs about Tongyi & ChatTongyi: #16605, #17105 
    - Other models or cases: #18441, #17376
- **Dependencies:** No new dependencies
6 months ago
Christophe Bornet f2a7dda4bd
community[patch]: Use langchain-astradb for AstraDB doc loader (#19071)
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
6 months ago
Holt Skinner cee03630d9
community[patch]: Add Blended Search Support to `GoogleVertexAISearchRetriever` (#19082)
https://cloud.google.com/generative-ai-app-builder/docs/create-data-store-es#multi-data-stores

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
6 months ago
Eugene Yurtsev 0ddfe7fc9d
langchain[patch]: make hub work with older langchainhub versions (#19076)
Make it work with older clients
6 months ago
case-k ebc4a64f9e
docs: fix databricks document url (#19096)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
6 months ago
Guangdong Liu 4468e5bdbe
docs: Add in code documentation to core Runnable with_fallbacks method (docs only) (#19104)
- Description: [a description of the change] Add in code documentation
to core Runnable with_fallbacks method (docs only)
- Issue: the issue #18804 
@eyurtsev PTAL
6 months ago
Guangdong Liu cced3eb9bc
community[patch]: Fix sparkllm embeddings api bug. (#19122)
- **Description:** Fix sparkllm embeddings api bug.
@baskaryan PTAL
6 months ago
kaijietti c20aeef79a
community[patch]: implement qdrant _aembed_query and use it in other async funcs (#19155)
`amax_marginal_relevance_search ` and `asimilarity_search_with_score `
should use an async version of `_embed_query `.
6 months ago
Barun Amalkumar Halder 34d6f0557d
community[patch] : publishes duration as milliseconds to Fiddler (#19166)
**Description:** Many LLM steps complete in sub-second duration, which
can lead to non-collection of duration field for Fiddler. This PR
updates duration from seconds to milliseconds.
**Issue:** [INTERNAL] FDL-17568
**Dependencies:** NA
**Twitter handle:** behalder

Co-authored-by: Barun Halder <barun@fiddler.ai>
6 months ago
Eugene Yurtsev 745d2476a2
langchain: upgrade mypy (#19163)
Update mypy in langchain
6 months ago
Maxime Perrin aa785fa6ec
core[minor]: allow LLMs async streaming to fallback on sync streaming (#18960)
- **Description:** Handling fallbacks when calling async streaming for a
LLM that doesn't support it.
- **Issue:** #18920 
- **Twitter handle:**@maximeperrin_

---------

Co-authored-by: Maxime Perrin <mperrin@doing.fr>
6 months ago
Barun Amalkumar Halder b551d49cf5
community[patch] : adds feedback and status for Fiddler callback handler events (#19157)
**Description:** This PR adds updates the fiddler events schema to also
pass user feedback, and llm status to fiddler
   **Tickets:** [INTERNAL] FDL-17559 
   **Dependencies:**  NA
   **Twitter handle:** behalder

Co-authored-by: Barun Halder <barun@fiddler.ai>
6 months ago
Juan Felipe Arias f5b9aedc48
community[patch]: add args_schema to sql_database tools for langGraph integration (#18595)
- **Description:** This modification adds pydantic input definition for
sql_database tools. This helps for function calling capability in
LangGraph. Since actions nodes will usually check for the args_schema
attribute on tools, This update should make these tools compatible with
it (only implemented on the InfoSQLDatabaseTool)
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** juanfe8881
6 months ago
fengjial c922ea36cb
community[minor]: Add Baidu VectorDB as vector store (#17997)
Co-authored-by: fengjialin <fengjialin@MacBook-Pro.local>
6 months ago
Erick Friis 781aee0068
community, langchain, infra: revert store extended test deps outside of poetry (#19153)
Reverts langchain-ai/langchain#18995

Because it makes installing dependencies in python 3.11 extended testing
take 80 minutes
6 months ago
Erick Friis 9e569d85a4
community, langchain, infra: store extended test deps outside of poetry (#18995)
poetry can't reliably handle resolving the number of optional "extended
test" dependencies we have. If we instead just rely on pip to install
extended test deps in CI, this isn't an issue.
6 months ago
Bagatur 191ddbc77e
core[patch]: rc release 0.1.33-rc.1 (#19103) 6 months ago
Nuno Campos 508f75853c
core[patch]: Change structured prompt lc id to match js (#19099) 6 months ago
Erick Friis 7ce81eb6f4
voyageai[patch]: init package (#19098)
Co-authored-by: fodizoltan <zoltan@conway.expert>
Co-authored-by: Yujie Qian <thomasq0809@gmail.com>
Co-authored-by: fzowl <160063452+fzowl@users.noreply.github.com>
6 months ago
Asaf Joseph Gardin 4d7f6fa968
ai21[patch]: AI21 Labs Batch Support in Embeddings (#18633)
Description: Added support for batching when using AI21 Embeddings model
Twitter handle: https://github.com/AI21Labs

---------

Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
6 months ago
Erick Friis d5cf360329
ibm[patch]: release 0.1.3 (#19094) 6 months ago
Mateusz Szewczyk b15d150d22
ibm[patch]: add async tests, add tokenize support (#18898)
- **Description:** add async tests, add tokenize support
- **Dependencies:**
[ibm-watsonx-ai](https://pypi.org/project/ibm-watsonx-ai/),
  - **Tag maintainer:** 

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally -> 
Please make sure integration_tests passing locally -> 

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
6 months ago
billytrend-cohere 7253b816cc
community: Add support for cohere SDK v5 (keeps v4 backwards compatibility) (#19084)
- **Description:** Add support for cohere SDK v5 (keeps v4 backwards
compatibility)

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
6 months ago
Eugene Yurtsev 06165efb5b
core[patch]: RunnablePassthrough transform to autoupgrade to AddableDict (#19051)
Follow up on https://github.com/langchain-ai/langchain/pull/18743 which
missed RunnablePassthrough

Issues:

https://github.com/langchain-ai/langchain/issues/18741
https://github.com/langchain-ai/langgraph/issues/136
https://github.com/langchain-ai/langserve/issues/504
6 months ago
Eugene Yurtsev 6cdca4355d
community[minor]: Revamp PGVector Filtering (#18992)
This PR makes the following updates in the pgvector database:

1. Use JSONB field for metadata instead of JSON
2. Update operator syntax to include required `$` prefix before the
operators (otherwise there will be name collisions with fields)
3. The change is non-breaking, old functionality is still the default,
but it will emit a deprecation warning
4. Previous functionality has bugs associated with comparisons due to
casting to text (so lexical ordering is used incorrectly for numeric
fields)
5. Adds an a GIN index on the JSONB field for more efficient querying
6 months ago
Guangdong Liu d4b025c812
code[patch]: Add in code documentation to core Runnable assign method (docs only) (#18951)
**PR message**: ***Delete this entire checklist*** and replace with
- **Description:** [a description of the change](docs: Add in code
documentation to core Runnable assign method)
    - **Issue:** the issue  #18804
6 months ago
Bagatur 573f48e34d
core[patch]: Release 0.1.32 (#19088) 6 months ago
YHW 69a8ef2693
core: Runnable pass kwargs to _astream_log_implementation in astream_log (#19055)
- **Description:** When calling the `_stream_log_implementation` from
the `astream_log` method in the `Runnable` class, it is not handing over
the `kwargs` argument. Therefore, even if i want to customize APIHandler
and implement additional features with additional arguments, it is not
possible. Conversely, the `astream_events` method normally handing over
the `kwargs` argument.
- **Issue:** https://github.com/langchain-ai/langchain/issues/19054
- **Dependencies:**
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!

Co-authored-by: hyungwookyang <hyungwookyang@worksmobile.com>
6 months ago
Nuno Campos 751fb7de20
Add new beta StructuredPrompt (#19080)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
6 months ago
Anton Parkhomenko ae73b9d839
community[patch]: Fix NotionDBLoader 400 Error by conditionally adding filter parameter (#19075)
- **Description:** This change fixes a bug where attempts to load data
from Notion using the NotionDBLoader resulted in a 400 Bad Request
error. The issue was traced to the unconditional addition of an empty
'filter' object in the request payload, which Notion's API does not
accept. The modification ensures that the 'filter' object is only
included in the payload when it is explicitly provided and not empty,
thus preventing the 400 error from occurring.
- **Issue:** Fixes
[#18009](https://github.com/langchain-ai/langchain/issues/18009)
- **Dependencies:** None
- **Twitter handle:** @gunnzolder

Co-authored-by: Anton Parkhomenko <anton@merge.rocks>
6 months ago
Nuno Campos 2b7c3c548d
core[minor]: Add Runnable.batch_as_completed (#17603)
This PR adds `batch as completed` method to the standard Runnable
interface. It takes in a list of inputs and yields the corresponding
outputs as the inputs are completed.
6 months ago
Erick Friis 74b2c0aa01
templates, cli: more security deps (#19006) 6 months ago
Erick Friis 2ffb2144a6
experimental[patch]: release 0.0.54 (#19000) 6 months ago
Erick Friis 873d06c009
langchain[patch]: release 0.1.12 (#18999) 6 months ago
Leonid Ganeline 9c8523b529
community[patch]: flattening imports 3 (#18939)
@eyurtsev
6 months ago
Erick Friis af50f21765
community[patch]: release 0.0.28 (#18993) 6 months ago
Erick Friis 4881bb669c
core[patch]: release 0.1.31 (#18989) 6 months ago
Erick Friis a29e8d8594
elasticsearch[patch]: fix integration tests for release (#18980) 6 months ago
Erick Friis 0d1f6c417c
elasticsearch[patch]: release 0.1.1 (#18978) 6 months ago
Max Jakob 911ccf9aa6
docs: elasticsearch retriever (#18965)
Add documentation notebook for `ElasticsearchRetriever`.

## Dependencies
- [ ] Release new `langchain-elasticsearch` version 0.2.0 that includes
`ElasticsearchRetriever`
6 months ago
Dobiichi-Origami 471f2ed40a
community[patch]: re-arrange the addtional_kwargs of returned qianfan structure to avoid _merge_dict issue (#18889)
fix issue: https://github.com/langchain-ai/langchain/issues/18441
PTAL, thanks
@baskaryan, @efriis, @eyurtsev, @hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
6 months ago
Naman Jain 75122646b5
core[patch]: fixed circular dependency with json schema (#18657)
**Description:** Circular dependencies when parsing references leading
to `RecursionError: maximum recursion depth exceeded` issue. This PR
address the issue by handling previously seen refs as in any typical DFS
to avoid infinite depths.

**Issue:** https://github.com/langchain-ai/langchain/issues/12163

 **Twitter handle:** https://twitter.com/theBhulawat 


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
6 months ago
Tymofii 0bec1f6877
commnity[patch]: refactor code for faiss vectorstore, update faiss vectorstore documentation (#18092)
**Description:** Refactor code of FAISS vectorcstore and update the
related documentation.
Details: 
 - replace `.format()` with f-strings for strings formatting;
- refactor definition of a filtering function to make code more readable
and more flexible;
- slightly improve efficiency of
`max_marginal_relevance_search_with_score_by_vector` method by removing
unnecessary looping over the same elements;
- slightly improve efficiency of `delete` method by using set data
structure for checking if the element was already deleted;

**Issue:** fix small inconsistency in the documentation (the old example
was incorrect and unappliable to faiss vectorstore)

**Dependencies:** basic langchain-community dependencies and `faiss`
(for CPU or for GPU)

**Twitter handle:** antonenkodev
6 months ago
Roshan Santhosh acf1ecc081
langchain[patch]: update llm_router.py (#18865)
Issue : _call method of LLMRouterChain uses predict_and_parse, which is
slated for deprecation.

Description : Instead of using predict_and_parse, this replaces it with
individual predict and parse functions.
6 months ago
Bagatur 18de77cc8c
core[minor]: add streaming support to OAI tool parsers (#18940)
Co-authored-by: Erick Friis <erick@langchain.dev>
6 months ago
Bagatur e0e688a277
core[minor]: generation info on msg (#18592)
related to #16403 #17188
6 months ago
Tomaz Bratanic cda43c5a11
experimental[patch]: Fix LLM graph transformer default prompt (#18856)
Some LLMs do not allow multiple user messages in sequence.
6 months ago
Bagatur 19721246f5
core[patch]: support labeled json schema as tools (#18935) 6 months ago
Erick Friis 0d888a65cb
core[patch]: move some attr/methods to BaseLanguageModel (#18936)
Cleans up some shared code between `BaseLLM` and `BaseChatModel`. One
functional difference to make it more consistent (see comment)
6 months ago
aditya thomas 5c2f7e6b2b
partners[openai]: update the docstring of OpenAI, OpenAIEmbeddings and ChatOpenAI classes (#18908)
**Description:** Update the docstring of OpenAI, OpenAIEmbeddings and
ChatOpenAI classes
**Issue:** Update import module paths to the current LangChain API
**Dependencies:** None
**Lint and test**: `make format` and `make lint` were run

This incorporates the review comments from langchain-ai/langchain#18637
which I closed due to an issue I had in updating that pr branch

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
6 months ago
Leonid Ganeline 11195cfa42
community[patch]: speed up import times in the community package (#18928)
This PR speeds up import times in the community package
6 months ago
aditya thomas 8544f748f2
community[patch]: update AnthropicLLM deprecation message (#18869)
**Description:** Update AnthropicLLM deprecation message import path for
ChatAnthropic
**Issue:** Incorrect import path in deprecation message
**Dependencies:** None
**Lint and test**: `make format`, `make lint` and `make test` were run
6 months ago
Virat Singh cafffe8a21
community: Add PolygonAggregates tool (#18882)
**Description:**
In this PR, I am adding a `PolygonAggregates` tool, which can be used to
get historical stock price data (called aggregates by Polygon) for a
given ticker.

Polygon
[docs](https://polygon.io/docs/stocks/get_v2_aggs_ticker__stocksticker__range__multiplier___timespan___from___to)
for this endpoint.

**Twitter**: 
[@virattt](https://twitter.com/virattt)
6 months ago
Erick Friis 93ef8ead0b
mongodb[patch]: fix core dep (#18926) 6 months ago
Mohammad Mohtashim 43db4cd20e
core[major]: On Tool End Observation Casting Fix (#18798)
This PR updates the on_tool_end handlers to return the raw output from the tool instead of casting it to a string. 

This is technically a breaking change, though it's impact is expected to be somewhat minimal. It will fix behavior in `astream_events` as well.

Fixes the following issue #18760 raised by @eyurtsev

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
6 months ago
Massimiliano Pronesti 8113d612bb
community[patch]: support modin document loader (#18866)
Langchain community document loaders support `pyspark`, `polars`, and
`pandas` dataframes but not `modin`'s. This PR addresses this point.
6 months ago
Pol Ruiz Farre a7f63d8cb4
community[patch]: Fix BasePDFLoader suffix for s3 presigned urls (#18844)
BasePDFLoader doesn't parse the suffix of the file correctly when
parsing S3 presigned urls. This fix enables the proper detection and
parsing of S3 presigned URLs to prevent errors such as `OSError: [Errno
36] File name too long`.
No additional dependencies required.
6 months ago
Joshua Carroll ddaf9de169
community: Fix bug with StreamlitChatMessageHistory (#18834)
- **Description:** Fix Streamlit bug which was introduced by
https://github.com/langchain-ai/langchain/pull/18250, update integration
test
- **Issue:** https://github.com/langchain-ai/langchain/issues/18684
- **Dependencies:** None
6 months ago
Tomaz Bratanic a28be31a96
Switch to md5 for deduplication in neo4j integrations (#18846)
Deduplicate documents using MD5 of the page_content. Also allows for
custom deduplication with graph ingestion method by providing metadata
id attribute

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
Tomaz Bratanic 246724faab
LLM graph transformer prompt engineering (#18843)
A bit of prompt engineering to improve results
6 months ago
Erick Friis b48865bf94
langchain[patch]: attach hub metadata (#18830) 6 months ago
Ammar 34b31a8cc7
core: add in-code docs for RunnableAssign class (#18826)
**Description:** Improves the docstring for `RunnableAssign` by
providing a concise description and a self-contained code example.
  **Issue:**  #18803
6 months ago
Leonid Ganeline 476d6dc596
community[patch]: Use getattr for `toolkits` imports (#18825)
This will preserve the namespace, without actually loading the underlying packages on init.
6 months ago
Erick Friis bbb609ac9d
core[patch]: fix arbitrary config keys (#18827) 6 months ago
Luis Antonio Vieira Junior 67c880af74
community[patch]: adding linearization config to AmazonTextractPDFLoader (#17489)
- **Description:** Adding an optional parameter `linearization_config`
to the `AmazonTextractPDFLoader` so the caller can define how the output
will be linearized, instead of forcing a predefined set of linearization
configs. It will still have a default configuration as this will be an
optional parameter.
- **Issue:** #17457
- **Dependencies:** The same ones that already exist for
`AmazonTextractPDFLoader`
- **Twitter handle:** [@lvieirajr19](https://twitter.com/lvieirajr19)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
6 months ago
Anis ZAKARI 37e89ba5b1
community[patch]: Bedrock add support for mistral models (#18756)
*Description**: My previous
[PR](https://github.com/langchain-ai/langchain/pull/18521) was
mistakenly closed, so I am reopening this one. Context: AWS released two
Mistral models on Bedrock last Friday (March 1, 2024). This PR includes
some code adjustments to ensure their compatibility with the Bedrock
class.

---------

Co-authored-by: Anis ZAKARI <anis.zakari@hymaia.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
6 months ago
Alexander Dicke 66576948e0
experimental[minor]: adds mixtral wrapper (#17423)
**Description:** Adds a chat wrapper for Mixtral models using the
[prompt
template](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1#instruction-format).

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
6 months ago
Keith Chan 914af69b44
community[patch]: Update azuresearch vectorstore from_texts() method to include fields argument (#17661)
- **Description:** Update azuresearch vectorstore from_texts() method to
include fields argument, necessary for creating an Azure AI Search index
with custom fields.
- **Issue:** Currently index fields are fixed to default fields if Azure
Search index is created using from_texts() method
- **Dependencies:** None
- **Twitter handle:** None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
6 months ago
al1p 46f0cea2b9
community[patch][: improved the suffix prompt to avoid loop (#17791)
Small improvement to the openapi prompt.
The agent was not finding the server base URL (looping through all
nodes). This small change narrows the search and enables finding the url
faster.

No dependency 

Twitter : @al1pra
6 months ago
Dmitry Kankalovich f5117e907d
openai[patch]: Proper example for AzureOpenAI usage in error message (#17798)
# Proper example for AzureOpenAI usage in error message

The original error message is wrong in part of a usage example it gives.
Corrected to the right one.

Co-authored-by: Dzmitry Kankalovich <dzmitry_kankalovich@epam.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
6 months ago
Théo LEBRUN cf94091cd0
community[patch]: Skip nested directories when using S3DirectoryLoader (#17829)
- **Description:** `S3DirectoryLoader` is failing if prefix is a folder
(ex: `my_folder/`) because `S3FileLoader` will try to load that folder
and will fail. This PR skip nested directories so prefix can be set to
folder instead of `my_folder/files_prefix`.
- **Issue:**
  - #11917
  - #6535
  - #4326
- **Dependencies:** none
- **Twitter handle:** @Falydoor


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
6 months ago
Venkatesan 7a18b63dbf
community[patch]: Mongo index creation (#17748)
- [ ] Title: Mongodb: MongoDB connection performance improvement. 
- [ ] Message: 
- **Description:** I made collection index_creation as optional. Index
Creation is one time process.
- **Issue:** MongoDBChatMessageHistory class object is attempting to
create an index during connection, causing each request to take longer
than usual. This should be optional with a parameter.
    - **Dependencies:** N/A
    - **Branch to be checked:** origin/mongo_index_creation

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
6 months ago
wt3639 5b5b37a999
community[patch]: Add embedding instruction to HuggingFaceBgeEmbeddings (#18017)
- **Description:** Add embedding instruction to
HuggingFaceBgeEmbeddings, so that it can be compatible with nomic and
other models that need embedding instruction.

---------

Co-authored-by: Tao Wu <tao.wu@rwth-aachen.de>
Co-authored-by: Bagatur <baskaryan@gmail.com>
6 months ago
Erick Friis a8de6d1533
anthropic[patch]: integration test update (#18823) 6 months ago
wewebber-merlin d1f5bc4906
anthropic[patch]: add kwargs to format_output base (#18715)
_generate() and _agenerate() both accept **kwargs, then pass them on to
_format_output; but _format_output doesn't accept **kwargs. Attempting
to pass, e.g.,

     timeout=50

to _generate (or invoke()) results in a TypeError.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
6 months ago
Erick Friis aa7bce6b13
anthropic[patch]: release 0.1.4 (#18822) 6 months ago
Erick Friis a5bcddc738
anthropic[patch]: streaming param (#18819) 6 months ago
Erick Friis 8c0b215c02
anthropic[patch]: fix format output args (#18816) 6 months ago
Ishani Vyas 2b0cbd65ba
community[patch]: Add Passio Nutrition AI Food Search Tool to Community Package (#18278)
## Add Passio Nutrition AI Food Search Tool to Community Package

### Description
We propose adding a new tool to the `community` package, enabling
integration with Passio Nutrition AI for food search functionality. This
tool will provide a simple interface for retrieving nutrition facts
through the Passio Nutrition AI API, simplifying user access to
nutrition data based on food search queries.

### Implementation Details
- **Class Structure:** Implement `NutritionAI`, extending `BaseTool`. It
includes an `_run` method that accepts a query string and, optionally, a
`CallbackManagerForToolRun`.
- **API Integration:** Use `NutritionAIAPI` for the API wrapper,
encapsulating all interactions with the Passio Nutrition AI and
providing a clean API interface.
- **Error Handling:** Implement comprehensive error handling for API
request failures.

### Expected Outcome
- **User Benefits:** Enable easy querying of nutrition facts from Passio
Nutrition AI, enhancing the utility of the `langchain_community` package
for nutrition-related projects.
- **Functionality:** Provide a straightforward method for integrating
nutrition information retrieval into users' applications.

### Dependencies
- `langchain_core` for base tooling support
- `pydantic` for data validation and settings management
- Consider `requests` or another HTTP client library if not covered by
`NutritionAIAPI`.

### Tests and Documentation
- **Unit Tests:** Include tests that mock network interactions to ensure
tool reliability without external API dependency.
- **Documentation:** Create an example notebook in
`docs/docs/integrations/tools/passio_nutrition_ai.ipynb` showing usage,
setup, and example queries.

### Contribution Guidelines Compliance
- Adhere to the project's linting and formatting standards (`make
format`, `make lint`, `make test`).
- Ensure compliance with LangChain's contribution guidelines,
particularly around dependency management and package modifications.

### Additional Notes
- Aim for the tool to be a lightweight, focused addition, not
introducing significant new dependencies or complexity.
- Potential future enhancements could include caching for common queries
to improve performance.

### Twitter Handle
- Here is our Passio AI [twitter handle](https://twitter.com/@passio_ai)
where we announce our products.


If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
6 months ago
Kushagra b1f22bf76c
community[minor]: added a feature to filter documents in Mongoloader (#18253)
"community: added a feature to filter documents in Mongoloader"
- **Description:** added a feature to filter documents in Mongoloader
    - **Feature:** the feature #18251
    - **Dependencies:** No
    - **Twitter handle:** https://twitter.com/im_Kushagra
6 months ago
Eugene Yurtsev 1f50274df7
community[patch]: Add pgvector to docker compose and update settings used in integration test (#18815) 6 months ago
Erick Friis ad29806255
nvidia-trt, nvidia-ai-endpoints: move to repo (#18814)
NVIDIA maintained in https://github.com/langchain-ai/langchain-nvidia
6 months ago
Christophe Bornet e54a49b697
community[minor]: Add lazy_table_reflection param to SqlDatabase (#18742)
For some DBs with lots of tables, reflection of all the tables can take
very long. So this change will make the tables be reflected lazily when
get_table_info() is called and `lazy_table_reflection` is True.
6 months ago
Christophe Bornet ead2a74806
community: Implement lazy_load() for JSONLoader (#18643)
Covered by `tests/unit_tests/document_loaders/test_json_loader.py`
6 months ago
Erick Friis a88f62ec3c
langchain[patch]: getattr import from langchain.chains (#18160) 6 months ago
Eugene Yurtsev cdfb5b4ca1
core[minor]: Chat Models to fallback astream to fallback on sync stream if available (#18748)
Allows all chat models that implement _stream, but not _astream to still have async streaming to work.

Amongst other things this should resolve issues with streaming community model implementations through langserve since langserve is exclusively async.
6 months ago
aditya thomas e00c1ff2b0
infra: ChatOpenAI unit tests for invoke() and ainvoke() (#18792)
**Description:** Replacing the deprecated predict() and apredict()
methods in the unit tests
**Issue:** Not applicable
**Dependencies:** None
**Lint and test**: `make format`, `make lint` and `make test` have been
run
6 months ago
Bagatur 3e29c04213
core[minor]: add BaseMessage.response_metadata (#18699) 6 months ago
Bagatur bc6249c889
langchain[patch]: runnable agent streaming param (#18761)
Usage:

```python
agent = RunnableAgent(runnable=runnable, .., stream_runnable=False)
```
or for convenience
```python
agent_executor = AgentExecutor(agent=agent, ..., stream_runnable=False)
```
6 months ago
Tomaz Bratanic c8c592d3f1
experimental[minor]: Add LLM graph transformer (#18733)
Add a class that constructs knowledge graphs based on text using an LLM.
6 months ago
Phat Vo 3ecb903d49
community[patch] : Tidy up and update Clarifai SDK functions (#18314)
Description :
* Tidy up, add missing docstring and fix unused params
* Enable using session token
6 months ago
Max Jakob 61a2eba081
elasticsearch[patch]: add top-level import, remove obsolete dependency (#18644)
Make `ElasticsearchRetriever` available as top-level import.

The `langchain` package depends on `langchain-community` so we do not
need to depend on it explicitly.
6 months ago
Tomaz Bratanic 010a234f1e
docs: Fix diffbot graph transformer description (#18736)
The previous docstring was invalid
6 months ago
Jan Nissen b8922480ed
core[patch]: improve PydanticOutputParser typing (#18740)
This PR adds generic typing to `PydanticOutputParser` so we get a typed
output from `.parse` instead of `Any`. It should provide a better DX by
way of Intellisense and for anyone strictly typing.

Pre-change:

![Screenshot 2024-03-07 at 10 22
31 AM](https://github.com/langchain-ai/langchain/assets/22690160/fd22dde0-9fdc-4283-b283-4c98f0bc46e5)

Post-change:

![Screenshot 2024-03-07 at 10 26
31 AM](https://github.com/langchain-ai/langchain/assets/22690160/7e23d2b7-8f8c-494f-80b3-187530a173ee)

I haven't dug too deep, but I think a similar change could probably be
added to `JsonOutputParser` so we don't have to pull up `.parse`.

Co-authored-by: Jan Nissen <jan23@gmail.com>
6 months ago
Massimiliano Pronesti 3b975c6ebe
experimental[minor]: add support for modin in pandas agent (#18749)
Added support for Intel's
[modin](https://github.com/modin-project/modin) in
`create_pandas_dataframe_agent`.
6 months ago
Tomaz Bratanic 4bfe888717
comunity[patch]: Fix neo4j sanitizing values (#18750)
Fixing sanitization for when deeply nested lists appear
6 months ago
Eugene Yurtsev 6caceb5473
core[patch]: Automatic upgrade to AddableDict in transform and atransform (#18743)
Automatic upgrade to transform and atransform

Closes: 

https://github.com/langchain-ai/langchain/issues/18741
https://github.com/langchain-ai/langgraph/issues/136
https://github.com/langchain-ai/langserve/issues/504
6 months ago
Yunmo Koo fee6f983ef
community[minor]: Integration for `Friendli` LLM and `ChatFriendli` ChatModel. (#17913)
## Description
- Add [Friendli](https://friendli.ai/) integration for `Friendli` LLM
and `ChatFriendli` chat model.
- Unit tests and integration tests corresponding to this change are
added.
- Documentations corresponding to this change are added.

## Dependencies
- Optional dependency
[`friendli-client`](https://pypi.org/project/friendli-client/) package
is added only for those who use `Frienldi` or `ChatFriendli` model.

## Twitter handle
- https://twitter.com/friendliai
6 months ago
Smit Parmar aed46cd6f2
community[patch]: Added support for filter out AWS Kendra search by score confidence (#12920)
**Description:** It will add support for filter out kendra search by
score confidence which will make result more accurate.
    For example
   ```
retriever = AmazonKendraRetriever(
        index_id=kendra_index_id, top_k=5, region_name=region,
        score_confidence="HIGH"
    )
```
Result will not include the records which has score confidence "LOW" or "MEDIUM". 
Relevant docs 
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/kendra/client/query.html
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/kendra/client/retrieve.html

 **Issue:** the issue # it resolve #11801 
**twitter:** [@SmitCode](https://twitter.com/SmitCode)
6 months ago
Ian 390ef6abe3
community[minor]: Add Initial Support for TiDB Vector Store (#15796)
This pull request introduces initial support for the TiDB vector store.
The current version is basic, laying the foundation for the vector store
integration. While this implementation provides the essential features,
we plan to expand and improve the TiDB vector store support with
additional enhancements in future updates.

Upcoming Enhancements:
* Support for Vector Index Creation: To enhance the efficiency and
performance of the vector store.
* Support for max marginal relevance search. 
* Customized Table Structure Support: Recognizing the need for
flexibility, we plan for more tailored and efficient data store
solutions.

Simple use case exmaple

```python
from typing import List, Tuple
from langchain.docstore.document import Document
from langchain_community.vectorstores import TiDBVectorStore
from langchain_openai import OpenAIEmbeddings

db = TiDBVectorStore.from_texts(
    embedding=embeddings,
    texts=['Andrew like eating oranges', 'Alexandra is from England', 'Ketanji Brown Jackson is a judge'],
    table_name="tidb_vector_langchain",
    connection_string=tidb_connection_url,
    distance_strategy="cosine",
)

query = "Can you tell me about Alexandra?"
docs_with_score: List[Tuple[Document, float]] = db.similarity_search_with_score(query)
for doc, score in docs_with_score:
    print("-" * 80)
    print("Score: ", score)
    print(doc.page_content)
    print("-" * 80)
```
6 months ago
Bagatur 3b1eb1f828
community[patch]: chat hf typing fix (#18693) 6 months ago
Jib d60e93b6ae
langchain-mongodb: Standardize mongodb collection/index names in tests (#18755)
## **Description:**
MongoDB integration tests link to a provided Atlas Cluster. We have very
stringent permissions set against the cluster provided. In order to make
it easier to track and isolate the collections each test gets run
against, we've updated the collection names to map the test file name.
i.e. `langchain_{filename}` => `langchain_test_vectorstores`

Fixes integration test results

![image](https://github.com/langchain-ai/langchain/assets/2887713/41f911b9-55f7-4fe4-9134-5514b82009f9)

## **Dependencies:** 
Provided MONGODB_ATLAS_URI

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

cc: @shaneharvey, @blink1073 , @NoahStapp , @caseyclements
6 months ago
Eugene Yurtsev ca299a8e08
Docs: Add custom parsing documentation and extending langchain (#18331)
* Added extending langchain.mdx -- we'll need to add links as we add
more custom documentation
* Added partial documentation about parsers
6 months ago
Eugene Yurtsev 8c71f92cb2
core: upgrade mypy to recent mypy (#18753)
Testing this works per package on CI
6 months ago
Eugene Yurtsev e188d4ecb0
Add dangerous parameter to requests tool (#18697)
The tools are already documented as dangerous. Not clear whether adding
an opt-in parameter is necessary or not
6 months ago
Erick Friis 1beb84b061
community[patch]: move pdf text tests to integration (#18746) 6 months ago
Christophe Bornet 4a7d73b39d
community: If load() has been overridden, use it in default lazy_load() (#18690) 6 months ago
Christophe Bornet 6cd7607816
community[patch]: Implement lazy_load() for MHTMLLoader (#18648)
Covered by `tests/unit_tests/document_loaders/test_mhtml.py`
6 months ago
axiangcoding 9745b5894d
community[patch]: Chroma use uuid4 instead of uuid1 to generate random ids (#18723)
- **Description:** Chroma use uuid4 instead of uuid1 as random ids. Use
uuid1 may leak mac address, changing to uuid4 will not cause other
effects.
  - **Issue:** None
  - **Dependencies:** None
  - **Twitter handle:** None
6 months ago
Guangdong Liu ced5e7bae7
community[patch]: Fix sparkllm authentication problem. (#18651)
- **Description:** fix sparkllm authentication problem.The current
timestamp is in RFC1123 format. The time deviation must be controlled
within 300s. I changed to re-obtain the url every time I ask a question.
https://www.xfyun.cn/doc/spark/general_url_authentication.html#_1-2-%E9%89%B4%E6%9D%83%E5%8F%82%E6%95%B0
6 months ago
Erick Friis 89d32ffbbd
community[patch]: release 0.0.27 (#18708) 6 months ago
Erick Friis c09b520ce4
core[patch]: release 0.1.30 (#18706) 6 months ago
Piyush Jain 2b234a4d96
Support for claude v3 models. (#18630)
Fixes #18513.

## Description
This PR attempts to fix the support for Anthropic Claude v3 models in
BedrockChat LLM. The changes here has updated the payload to use the
`messages` format instead of the formatted text prompt for all models;
`messages` API is backwards compatible with all models in Anthropic, so
this should not break the experience for any models.


## Notes
The PR in the current form does not support the v3 models for the
non-chat Bedrock LLM. This means, that with these changes, users won't
be able to able to use the v3 models with the Bedrock LLM. I can open a
separate PR to tackle this use-case, the intent here was to get this out
quickly, so users can start using and test the chat LLM. The Bedrock LLM
classes have also grown complex with a lot of conditions to support
various providers and models, and is ripe for a refactor to make future
changes more palatable. This refactor is likely to take longer, and
requires more thorough testing from the community. Credit to PRs
[18579](https://github.com/langchain-ai/langchain/pull/18579) and
[18548](https://github.com/langchain-ai/langchain/pull/18548) for some
of the code here.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
6 months ago
Sam Khano 1b4dcf22f3
community[minor]: Add DocumentDBVectorSearch VectorStore (#17757)
**Description:**
- Added Amazon DocumentDB Vector Search integration (HNSW index)
- Added integration tests
- Updated AWS documentation with DocumentDB Vector Search instructions
- Added notebook for DocumentDB integration with example usage

---------

Co-authored-by: EC2 Default User <ec2-user@ip-172-31-95-226.ec2.internal>
6 months ago
Vittorio Rigamonti 51f3902bc4
community[minor]: Adding support for Infinispan as VectorStore (#17861)
**Description:**
This integrates Infinispan as a vectorstore.
Infinispan is an open-source key-value data grid, it can work as single
node as well as distributed.

Vector search is supported since release 15.x 

For more: [Infinispan Home](https://infinispan.org)

Integration tests are provided as well as a demo notebook
6 months ago
Max Jakob cca0167917
elasticsearch[patch], community[patch]: update references, deprecate community classes (#18506)
Follow up on https://github.com/langchain-ai/langchain/pull/17467.

- Update all references to the Elasticsearch classes to use the partners
package.
- Deprecate community classes.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
6 months ago
Djordje 12b4a4d860
community[patch]: Opensearch delete method added - indexing supported (#18522)
- **Description:** Added delete method for OpenSearchVectorSearch,
therefore indexing supported
    - **Issue:** No
    - **Dependencies:** No
    - **Twitter handle:** stkbmf
6 months ago
Erick Friis 687d27567d
openai[patch]: unit test azure init (#18703) 6 months ago
Christophe Bornet db8db6faae
community: Implement lazy_load() for PlaywrightURLLoader (#18676)
Integration tests:
`tests/integration_tests/document_loaders/test_url_playwright.py`
6 months ago
Aaron Yi c092db862e
community[patch]: make metadata and text optional as expected in DocArray (#18678)
ValidationError: 2 validation errors for DocArrayDoc
text
Field required [type=missing, input_value={'embedding': [-0.0191128...9, 0.01005221541175212]}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.5/v/missing
metadata
Field required [type=missing, input_value={'embedding': [-0.0191128...9, 0.01005221541175212]}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.5/v/missing
```
In the `_get_doc_cls` method, the `DocArrayDoc` class is defined as
follows:

```python
class DocArrayDoc(BaseDoc):
    text: Optional[str]
    embedding: Optional[NdArray] = Field(**embeddings_params)
    metadata: Optional[dict]
```
6 months ago
Eugene Yurtsev 4c25b49229
community[major]: breaking change in some APIs to force users to opt-in for pickling (#18696)
This is a PR that adds a dangerous load parameter to force users to opt in to use pickle.

This is a PR that's meant to raise user awareness that the pickling module is involved.
6 months ago
Eugene Yurtsev 0e52961562
community[patch]: Patch tdidf retriever (CVE-2024-2057) (#18695)
This is a patch for `CVE-2024-2057`:
https://www.cve.org/CVERecord?id=CVE-2024-2057

This affects users that: 

* Use the  `TFIDFRetriever`
* Attempt to de-serialize it from an untrusted source that contains a
malicious payload
6 months ago
Erick Friis 2619420df1
mongodb[patch]: release 0.1.1 (#18692) 6 months ago
Christophe Bornet ea141511d8
core: Move document loader interfaces to core (#17723)
This is needed to be able to move document loaders to partner packages.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
6 months ago
Christophe Bornet 5985454269
Merge pull request #18539
* Implement lazy_load() for GitLoader
6 months ago
Christophe Bornet 9a6f7e213b
Merge pull request #18423
* Implement lazy_load() for BSHTMLLoader
6 months ago
Christophe Bornet b3a0c44838
Merge pull request #18673
* Implement lazy_load() for PDFMinerPDFasHTMLLoader and PyMuPDFLoader
6 months ago
Christophe Bornet 68fc0cf909
Merge pull request #18674
* Implement lazy_load() for TextLoader
6 months ago
Christophe Bornet 5b92f962f1
Merge pull request #18671
* Implement lazy_load() for MastodonTootsLoader
6 months ago
Christophe Bornet 15b1770326
Merge pull request #18421
* Implement lazy_load() for AssemblyAIAudioTranscriptLoader
6 months ago
Christophe Bornet bb284eebe4
Merge pull request #18436
* Implement lazy_load() for ConfluenceLoader
6 months ago
Christophe Bornet 691480f491
Merge pull request #18647
* Implement lazy_load() for UnstructuredBaseLoader
6 months ago