Commit Graph

944 Commits (c9e9470c5a3a12e2fe7fd14164dd5a26aec37fb0)

Author SHA1 Message Date
Erick Friis 845e407e08
community[patch]: release 0.0.12 (#15973) 6 months ago
Jonathan Algar a74f3a4979
Batch update of alt text and title attributes for images in md/mdx files across repo (#15357)
**Description:** Batch update of alt text and title attributes for
images in `md` & `mdx` files across the repo using
[alttexter](https://github.com/jonathanalgar/alttexter)/[alttexter-ghclient](https://github.com/jonathanalgar/alttexter-ghclient)
(built using LangChain/LangSmith).

**Limitation:** cannot update `ipynb` files because of [this
issue](https://github.com/langchain-ai/langchain/pull/15357#issuecomment-1885037250).
Can revisit when Docusaurus is bumped to v3.

I checked all the generated alt texts and titles and didn't find any
technical inaccuracies. That's not to say they're _perfect_, but a lot
better than what's there currently.


[Deployed](https://langchain-819yf1tbk-langchain.vercel.app/docs/modules/model_io/)
image example:


![chrome_yZQ7BF2GTj](https://github.com/langchain-ai/langchain/assets/93204286/43a9a4d4-70fd-41c4-8978-b6240ff63ffa)

You can see LangSmith traces for all the calls out to the LLM in the PRs
merged into this one:

* https://github.com/jonathanalgar/langchain/pull/6
* https://github.com/jonathanalgar/langchain/pull/4
* https://github.com/jonathanalgar/langchain/pull/3

I didn't add the following files to the PR as the images already have OK
alt texts:

*
27dca2d92f/docs/docs/integrations/providers/argilla.mdx (L3)
*
27dca2d92f/docs/docs/integrations/providers/apify.mdx (L11)

---------

Co-authored-by: github-actions <github-actions@github.com>
6 months ago
Varik Matevosyan efe6cfafe2
community: Added Lantern as VectorStore (#12951)
Support [Lantern](https://github.com/lanterndata/lantern) as a new
VectorStore type.

- Added Lantern as VectorStore.
It will support 3 distance functions `l2 squared`, `cosine` and
`hamming` and will use `HNSW` index.
- Added tests
- Added example notebook
6 months ago
Edwin Wenink 9fb09c1c30
community: fix the "page" mode in the AzureAIDocumentIntelligenceParser (bug) (#15958)
**Description**: the "page" mode in the
AzureAIDocumentIntelligenceParser is not accessible due to a wrong
membership test. The mode argument can only be a string (also see the
assertion in the `__init__`: `assert self.mode in ["single", "page",
"object", "markdown"]`, so the check `elif self.mode == ["page"]:`
always fails.
As a result, effectively the "object" mode is used when selecting the
"page" mode, which may lead to errors.

The docstring of the `AzureAIDocumentIntelligenceLoader` also ommitted
the `mode` parameter alltogether, so I added it.

**Issue**: I could not find a related issue (this class is only 3 weeks
old anyways)

**Dependencies**: this PR does not introduce or affect dependencies.

The current demo notebook and examples are not affected because they all
use the default markdown mode.
6 months ago
Mahdi Setayesh eb76f9c9fe
community: Fixing a performance issue with AzureSearch to perform batch embedding (#15594)
- **Description:** Azure Cognitive Search vector DB store performs slow
embedding as it does not utilize the batch embedding functionality. This
PR provide a fix to improve the performance of Azure Search class when
adding documents to the vector search,
  - **Issue:** #11313 ,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
6 months ago
ChengZi d5808f786c
community: Support milvus partition key. (#15740)
- **Description:** Milvus's partition key is an important feature. It
can support multi-tenancy. We hope to introduce this feature.
https://milvus.io/docs/partition_key.md
  - **Issue:** No
  - **Dependencies:** No
  - **Twitter handle:** No

---------

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
ohbeep 9b3962fc25
community: Add support of "http" URI for Milvus (#12710) (#15683)
- **Description:** Add support of HTTP URI for Milvus
  - **Issue:** #12710 
  - **Dependencies:** N/A,
6 months ago
Raunak e26e1f8b37
community: Added functions to make async calls to HuggingFaceHub's embedding endpoint in HuggingFaceHubEmbeddings class (#15737)
**Description:**
Added aembed_documents() and aembed_query() async functions in
HuggingFaceHubEmbeddings class in
langchain_community\embeddings\huggingface_hub.py file. It will support
to make async calls to HuggingFaceHub's
embedding endpoint and generate embeddings asynchronously.

Test Cases: Added test_huggingfacehub_embedding_async_documents() and
test_huggingfacehub_embedding_async_query()
functions in test_huggingface_hub.py file to test the two async
functions created in HuggingFaceHubEmbeddings class.

Documentation: Updated huggingfacehub.ipynb with steps to install
huggingface_hub package and use
HuggingFaceHubEmbeddings.

**Dependencies:** None,
**Twitter handle:** I do not have a Twitter account

---------

Co-authored-by: H161961 <Raunak.Raunak@Honeywell.com>
6 months ago
Christophe Bornet 81d1ba05dc
Add a BaseStore backed by AstraDB (#15812)
- **Description:** this change adds a `BaseStore` backed by AstraDB
  - **Twitter handle:** cbornet_
6 months ago
manishsahni2000 74d9fc2f9e
PR community:Removing knn beta content in mongodb atlas vectorstore (#15865)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
6 months ago
shahrin014 bdd90ae2ee
community: Ollama - Pass headers to post request (#15881)
## Feature
- Set additional headers in constructor
- Headers will be sent in post request

This feature is useful if deploying Ollama on a cloud service such as
hugging face, which requires authentication tokens to be passed in the
request header.

## Tests
- Test if header is passed
- Test if header is not passed
6 months ago
Xin Liu 5efec068c9
feat: Implement `stream` interface (#15875)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

Major changes:

- Rename `wasm_chat.py` to `llama_edge.py`
- Rename the `WasmChatService` class to `ChatService`
- Implement the `stream` interface for `ChatService`
- Add `test_chat_wasm_service_streaming` in the integration test
- Update `llama_edge.ipynb`

---------

Signed-off-by: Xin Liu <sam@secondstate.io>
6 months ago
Massimiliano Pronesti ec4dab0449
feat(community): make Amadeus toolkit LLM-agnostic (#15879)
- **Description:** `AmadeusToolkit` and `AmadeusClosestAirport`
contained a hardcoded call to `ChatOpenAI`. This PR makes it
LLM-independent, while guaranteeing backward compatibility.
  - **Issue:** #15847 
  - **Dependencies:** None
   
@baskaryan 

<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
6 months ago
Yacine 782dd44be9
<langchain_community.vectorstores>:<Fix pinecone.py __init__ docsrting instruction> (#15922)
- **Description:** The pinecone docstring instructs to pass the
embedding query text causing the warning below. It should be the
embeddings object.
warning message: UserWarning: Passing in `embedding` as a Callable is
deprecated. Please pass in an Embeddings object instead.
  - **Issue:** NA
  - **Dependencies:** None


@baskaryan
6 months ago
Erick Friis 623f87c888
community[patch]: pinecone bug (#15905) 6 months ago
axiangcoding d5aa277b94
community: add collection_properties parameter to Milvus (#15788)
- **Description:** add collection_properties parameter to Milvus. See
[pymilvus set_properties()
description](https://milvus.io/api-reference/pymilvus/v2.3.x/Collection/set_properties().md)
  - **Issue:** None
  - **Dependencies:** None
  - **Twitter handle:** None
6 months ago
mogith-pn 9e1ed17bfb
Community : Modified doc strings and example notebook for Clarifai (#15816)
Community : Modified doc strings and example notebook for Clarifai

Description:
1. Modified doc strings inside clarifai vectorstore class and
embeddings.
2. Modified notebook examples.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
6 months ago
Erick Friis 38523d7c57
together[minor]: add llm (#15853) 6 months ago
Erick Friis ee708739c3
community[patch]: pinecone v3 support (#15849)
Info in slack

---------

Co-authored-by: Roie Schwaber-Cohen <roie.cohen@gmail.com>
6 months ago
Erick Friis 85a4594ed7
community[patch]: more deprecations (#15782) 6 months ago
NuODaniel 70b6315b23
community[patch]: fix qianfan chat stream calling caused exception (#13800)
- **Description:** 
`QianfanChatEndpoint` extends `BaseChatModel` as a super class, which
has a default stream implement might concat the MessageChunk with
`__add__`. When call stream(), a ValueError for duplicated key will be
raise.
  - **Issues:** 
     * #13546  
     * #13548
     * merge two single test file related to qianfan.
  - **Dependencies:** no
  - **Tag maintainer:**

---------

Co-authored-by: root <liujun45@baidu.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
Bagatur ee5bd986de
community[patch]: update oai deprecation message (#15681)
addresses #15674
6 months ago
Erick Friis 4ed3d17c47
community[patch]: release 0.0.11 (#15760) 6 months ago
Ian 32ec56194b
community: fix myscale delete function bug (#15675)
Now the SQL used to delete vector doc from myscale is as follow:
```sql
DELETE FROM collection WHERE id = '1' AND id = '2' AND id = '3'
```

But the expected one should be 

```sql
DELETE FROM collection WHERE id IN ('1', '2', '3')
```
6 months ago
Christophe Bornet a466f79ac9
Fix AstraDB logical operator filtering (#15699)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
This change fixes the AstraDB logical operator filtering (`$and,`
`$or`).
The `metadata` prefix must not be added if the key is `$and` or `$or`.
6 months ago
Christophe Bornet 1f5f6381ec
Add doc for AstraDB document loader (#15703)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
See preview :
https://langchain-git-fork-cbornet-astra-loader-doc-langchain.vercel.app/docs/integrations/document_loaders/astradb
6 months ago
Erick Friis 94911ae503
community[patch]: Support different Pinecone initializations depending on the version (#15717)
Co-authored-by: DosticJelena <jelenadostic2@gmail.com>
6 months ago
Bagatur 4c47f39fcb
community[patch]: Release 0.0.10 (#15678) 6 months ago
Nuno Campos 7ce4cd0709
Do not issue beta or deprecation warnings on internal calls (#15641) 6 months ago
Earlee 98c6c9603e
community: fix: should flush after inserting data on milvus (#15568)
The inserted data cannot take effect immediately. We should flush after
inserting data on milvus.
6 months ago
chyroc a17a3638b5
Docs: fix excel document loader typo (#15470) 6 months ago
chyroc 9ae901c5e6
Feat: add CHM file loader (#15519)
fix https://github.com/langchain-ai/langchain/issues/15469
6 months ago
Nan LI 0b393315ce
community: Correct Input API Key Name in Notebook and Enhance Readability of Comments for ZhipuAI Chat Model (#15529)
- **Description:** This update rectifies an error in the notebook by
changing the input variable from `zhipu_api_key` to `api_key`. It also
includes revisions to comments to improve program readability.
- **Issue:** The input variable in the notebook example should be
`api_key` instead of `zhipu_api_key`.
- **Dependencies:** No additional dependencies are required for this
change.

To ensure quality and standards, we have performed extensive linting and
testing. Commands such as make format, make lint, and make test have
been run from the root of the modified package to ensure compliance with
LangChain's coding standards.
6 months ago
kursathalat 9ea28ee464
fix: Fix DEFAULT_API_KEY for ArgillaCallbackHandler (#15534)
- ArgillaCallbackHandler does not properly set the default values while
initializing. This PR corrects the line.
- Issue: #15531 
- Dependencies: Argilla

- Also corrected some dead links.
6 months ago
Chad Norvell d1bfb70bc4
community: Allow deleting by ID and collection in `pgvector` (#15627)
- **Description:** The `delete_collection` method deletes an entire
collection regardless of custom ID. The `delete` method deletes
everything with the provided custom IDs regardless of collection. It can
be useful to restrict deletion to both the collection and a set of
custom IDs. This change adds support for that by allowing you to
optionally specify that `delete` should be restricted to the collection
defined on the `PGVector` instance.
6 months ago
Chad Norvell f6226d464e
community: Include PDF ID in MathPix metadata (#15629)
- **Description:** Includes the PDF ID in the MathPix document metadata.
This is useful in case you need to re-request a processed PDF from the
MathPix API later.
6 months ago
Chad Norvell d2a686b165
community: Provide more actionable errors in the MathPix PDF loader (#15630)
- **Description:** The `error_info['id']` can be cross-referenced with
the MathPix API documentation to get very specific information about why
an error occurred.
6 months ago
Kai 5d05df4bce
community: Fixed bug of "system message check" in chat_models/tongyi. (#15631)
- **Description:** This PR is to fix a bug of "system message check" in
langchain_community/ chat_models/tongyi.py
- **Issue:** In term of current logic, if there's no system message in
the chat messages, an error of "System message can only be the first
message." will be wrongly raised.
  - **Dependencies:** No.
  - **Twitter handle:** I don't have a Twitter account.
6 months ago
Raunak 64f5968a81
community: Replaced hardcoded "metadata" with FIELDS_METADATA variable in semantic_hybrid_search_with_score_and_rerank (#15642)
- **Description:** This PR is to fix a bug in
semantic_hybrid_search_with_score_and_rerank() function in
langchain_community/vectorstores/azuresearch.py. The hardcoded
"metadata" name is replaced with FIELDS_METADATA variable with an if
block to check if the metadata column exists or not.
- **Issue:** Fixed #15581
- **Dependencies:** No
- **Twitter handle:** None

Co-authored-by: H161961 <Raunak.Raunak@Honeywell.com>
6 months ago
Erick Friis d136925c49
community[patch]: fix deprecation warnings on openai subclasses (#15621) 6 months ago
Bagatur c5226d7a18
docs: update cohere chat integration (#15562)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
Bagatur dbb582d227
infra: community bump min core version (#15617) 6 months ago
Bagatur 1e4b8f0453
community[patch]: Release 0.0.9 (#15615) 6 months ago
Erick Friis ebc75c5ca7
openai[minor]: implement langchain-openai package (#15503)
Todo

- [x] copy over integration tests
- [x] update docs with new instructions in #15513 
- [x] add linear ticket to bump core -> community, community->langchain,
and core->openai deps
- [ ] (optional): add `pip install langchain-openai` command to each
notebook using it
- [x] Update docstrings to not need `openai` install
- [x] Add serialization
- [x] deprecate old models

Contributor steps:

- [x] Add secret names to manual integrations workflow in
.github/workflows/_integration_test.yml
- [x] Add secrets to release workflow (for pre-release testing) in
.github/workflows/_release.yml

Maintainer steps (Contributors should not do these):

- [x] set up pypi and test pypi projects
- [x] add credential secrets to Github Actions
- [ ] add package to conda-forge


Functional changes to existing classes:

- now relies on openai client v1 (1.6.1) via concrete dep in
langchain-openai package

Codebase organization

- some function calling stuff moved to
`langchain_core.utils.function_calling` in order to be used in both
community and langchain-openai
6 months ago
Bagatur a7d023aaf0
core[patch], community[patch]: mark runnable context, lc load as beta (#15603) 6 months ago
chyroc f12b5c1222
Feat: support Milvus more params (#15447)
fix https://github.com/langchain-ai/langchain/issues/15442
6 months ago
Bagatur b2f15738dd
core[patch], langchain[patch], community[patch]: Revert #15326 (#15546) 6 months ago
Bagatur 0b579dc623
infra: update community test min reqs (#15490) 6 months ago
Bagatur 266db0efc8
community[patch]: bump core version >=0.1.5,<0.2 (#15488) 6 months ago
Bagatur a2324ee533
community[patch]: Release 0.0.8 (#15481) 6 months ago
Bagatur 54b58c03db
infra: add minimum deps pre release check (#15485) 6 months ago
Bagatur baeac236b6
langchain[patch], experimental[patch]: update utilities imports (#15438) 6 months ago
Harutaka Kawamura 73da8f863c
Remove unused `Params` (#14385)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

Removes unused `Params` in `libs/langchain/langchain/llms/mlflow.py`.

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
chyroc b65e57971e
Patch: improve type hint (#15451) 6 months ago
Harutaka Kawamura 8ebf55ebbf
Fix `llms.Mlflow` example (#14386)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

The example code for `llms.Mlflow` is outdated.

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
Xin Liu 0a7d360ba4
feat: new integration `wasm_chat` (#14787)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

Adds `WasmChat` integration. `WasmChat` runs GGUF models locally or via
chat service in lightweight and secure WebAssembly containers. In this
PR, `WasmChatService` is introduced as the first step of the
integration. `WasmChatService` is driven by
[llama-api-server](https://github.com/second-state/llama-utils) and
[WasmEdge Runtime](https://wasmedge.org/).

---------

Signed-off-by: Xin Liu <sam@secondstate.io>
6 months ago
Anush 58cc7878e9
refactor: Qdrant async improvements (#14492)
Follow up on https://github.com/langchain-ai/langchain/pull/13048.
This PR intends to simplify the Qdrant async implementation by replacing
the internal GRPC methods with the `QdrantAsyncClient` methods.
This is a backward compatible change with no additional steps required
after merge.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
JuR-0 4dab37741a
Fix Bedrock broad error catching (#14398)
Fixes #14347 

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
- **Description:** Added the traceback of the previous error to keep the
initial error type,
  - **Issue:** #14347 ,
  - **Dependencies:** None,
  - **Tag maintainer:** 

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Julien Raffy <julien.raffy@emeria.eu>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
Bob Lin e93be14c11
Improvement: Allow passing parameters to the underlying es_client. Closes: #14403 (#14435)
### Description

In https://github.com/langchain-ai/langchain/issues/14403, the user
mentioned that he hopes not to verify ssl and needs to pass more
parameters

I found that the `Elasticsearch` class [has very many
parameters](98f2af2134/elasticsearch/_sync/client/__init__.py (L131-L191)
):

<img width="1097" alt="Screenshot 2023-12-08 at 4 24 39 PM"
src="https://github.com/langchain-ai/langchain/assets/10000925/f2201554-b41a-4388-a8e8-c14a2d0466d4">

In order to adapt to more situations, I want to add the kwargs parameter
so that users can enter more `Elasticsearch` parameters. Like
[redis](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/vectorstores/redis/base.py#L253),
[tair](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/vectorstores/tair.py#L32),
[myscale](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/vectorstores/myscale.py#L112)
and so on.
6 months ago
codehound42 8aa921d3a4
Support `score_threshold` in SupabaseVectorStore similarity search (#14439)
Description: Add support for setting the `score_threshold` for
similarity search in SupabaseVectoreStore.

This pull request addresses issue #14438

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
YISH eecfa81918
Add the collection_description parameter to Milvus (#14524)
Because Milvus' collection_name doesn't support UFT8 characters in other
languages, I want the `collection_descriotion`.


<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
6 months ago
Evgenii Molov b4ec340fb3
Fix failing serpapi response processing for Google Maps API (#14817)
**Description:** Fix for processing for serpapi response for Google Maps
API
**Issue:** Due to the fact corresponding
[api](https://serpapi.com/google-maps-api) returns 'local_results' as
list, and old version requested `res["local_results"].keys()` of the
list. As the result we got exception: ```AttributeError: 'list' object
has no attribute 'keys'```.

Way to reproduce wrong behaviour:
```
    params = {
        "engine": "google_maps",
        "type": "search",
        "google_domain": "google.de",
        "ll": "@51.1917,10.525,14z",
        "hl": "de",
        "gl": "de",
    }
    search = SerpAPIWrapper(params=params)
    results = search.run("cafe")
```

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Ran <rccalman@gmail.com>
6 months ago
YISH da0f750a0b
Milvus allows to store metadata as json field (#14636)
Because Milvus doesn't support nullable fields, but document metadata is
very rich, so it makes more sense to store it as json.


https://github.com/milvus-io/pymilvus/issues/1705#issuecomment-1731112372

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
Ashley Xu 0ce7858529
feat: add Google BigQueryVectorSearch in vectorstore (#14829)
BigQuery vector search lets you use GoogleSQL to do semantic search,
using vector indexes for fast but approximate results, or using brute
force for exact results.

This PR integrates LangChain vectorstore with BigQuery Vector Search.

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Vlad Kolesnikov <vladkol@google.com>
6 months ago
JaguarDB 02f59c2035
Use args option in jaguar so it takes more options in similarity search (#15080)
- **Description:** replace score_threshold with args
  - **Issue:** needs a way to pass more options to similarity search
  - **Dependencies:** None
  - **Twitter handle:** @workbot

---------

Co-authored-by: JY <jyjy@jaguardb>
6 months ago
chyroc 37ad6ec248
Refactor: use SecretStr for tongyi chat-model (#15102) 6 months ago
Shaurya Rohatgi e1c2cd7a28
community: Semanticscholar tool to search 200M+ scientific articles (#15151)
- **Description:** Tool now supports querying over 200 million
scientific articles, vastly expanding its reach beyond the 2 million
articles accessible through Arxiv. This update significantly broadens
access to the entire scope of scientific literature.
- **Dependencies:** semantischolar
https://github.com/danielnsilva/semanticscholar
  - **Twitter handle:** @shauryr

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
dudub12 7e6b0056b8
SQLDatabase drop the column names in the result. (#15361)
Fix for the following bug:
https://github.com/langchain-ai/langchain/issues/15360

---------

Co-authored-by: dudu butbul <100126964+dudu-upstream@users.noreply.github.com>
6 months ago
chyroc 07d294b5ec
Fix: fix Bing Search empty result exception, fix #15384 (#15387)
fix https://github.com/langchain-ai/langchain/issues/15384
6 months ago
Bagatur 1678d6ca17
langchain[patch], experimental[patch], docs: update tools imports (#15433) 6 months ago
Leonid Ganeline b8c6ebf647
refactor `utils` (#15432)
The `langchain` [still holds several
artifacts](https://api.python.langchain.com/en/latest/langchain_api_reference.html#module-langchain.utils)
that belongs to `community`. If they moved then `langchain.utils`
namespace would be removed completely.
- moved `ernie_functions` artifacts to `community`
6 months ago
Bagatur fa5d49f2c1
docs, experimental[patch], langchain[patch], community[patch]: update storage imports (#15429)
ran 
```bash
g grep -l "langchain.vectorstores" | xargs -L 1 sed -i '' "s/langchain\.vectorstores/langchain_community.vectorstores/g"
g grep -l "langchain.document_loaders" | xargs -L 1 sed -i '' "s/langchain\.document_loaders/langchain_community.document_loaders/g"
g grep -l "langchain.chat_loaders" | xargs -L 1 sed -i '' "s/langchain\.chat_loaders/langchain_community.chat_loaders/g"
g grep -l "langchain.document_transformers" | xargs -L 1 sed -i '' "s/langchain\.document_transformers/langchain_community.document_transformers/g"
g grep -l "langchain\.graphs" | xargs -L 1 sed -i '' "s/langchain\.graphs/langchain_community.graphs/g"
g grep -l "langchain\.memory\.chat_message_histories" | xargs -L 1 sed -i '' "s/langchain\.memory\.chat_message_histories/langchain_community.chat_message_histories/g"
gco master libs/langchain/tests/unit_tests/*/test_imports.py
gco master libs/langchain/tests/unit_tests/**/test_public_api.py
```
6 months ago
Bagatur 480626dc99
docs, community[patch], experimental[patch], langchain[patch], cli[pa… (#15412)
…tch]: import models from community

ran
```bash
git grep -l 'from langchain\.chat_models' | xargs -L 1 sed -i '' "s/from\ langchain\.chat_models/from\ langchain_community.chat_models/g"
git grep -l 'from langchain\.llms' | xargs -L 1 sed -i '' "s/from\ langchain\.llms/from\ langchain_community.llms/g"
git grep -l 'from langchain\.embeddings' | xargs -L 1 sed -i '' "s/from\ langchain\.embeddings/from\ langchain_community.embeddings/g"
git checkout master libs/langchain/tests/unit_tests/llms
git checkout master libs/langchain/tests/unit_tests/chat_models
git checkout master libs/langchain/tests/unit_tests/embeddings/test_imports.py
make format
cd libs/langchain; make format
cd ../experimental; make format
cd ../core; make format
```
6 months ago
Mohammad Mohtashim b6c57d38fa
Langchain_community: Small Fix when loading facebook messages (#15358)
- **Description:** SingleFileFacebookMessengerChatLoader did not handle
the case for when messages had stickers and/or photos so fixed that.
  - **Issue:** #15356

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
Mateusz Szewczyk cbfaccc424
WatsonxLLM updates/enhancements (#14598)
- **Description:** updates/enhancements to IBM
[watsonx.ai](https://www.ibm.com/products/watsonx-ai) LLM provider
(prompt tuned models and prompt templates deployments support)
- **Dependencies:**
[ibm-watsonx-ai](https://pypi.org/project/ibm-watsonx-ai/),
  - **Tag maintainer:** : @hwchase17 , @eyurtsev , @baskaryan 
  - **Twitter handle:** details in comment below.

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally. 

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
Manjunath Janardhan 7a0feba9f7
GITLAB_URL should take default https://gitlab.com instead of error (#14638)
The fix #14221 has broken default gitlab url which is forcing the users
to specify GITLAB_URL for default one. With this fix if GITLAB_URL is
not set, the default gitlab url will be taken.

  - **Description:** Add the GITHUB URL instead of None
  - **Issue:** the issue #14221 has broken the default github URL 
  - **Dependencies:** None
  - **Tag maintainer:** @hwchase17 
  - **Twitter handle:** manjunath_shiva
6 months ago
David dcf047c48f
add api_base to _client_params (community version of #14393) (#14644)
- **Description:** This PR adds `api_base` to `_client_params` in the
`chat_model` of LiteLLM to ensure it's included in API calls.
Previously, `api_base` was set on the client but was not included in the
parameters passed to the completion function. This change ensures that
`api_base` is correctly passed to all API calls.
  - **Issue:** #14338
  - **Tag maintainer:** @hwchase17 @agola11
  - **Twitter handle:** @LMS_David_RS
6 months ago
xuxiang dd1d818a82
Fixing the Issue with DashScopeEmbeddings Handling More than 25 Rows of Data (#14662)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
 
This change addresses the issue where DashScopeEmbeddingAPI limits
requests to 25 lines of data, and DashScopeEmbeddings did not handle
cases with more than 25 lines, leading to errors. I have implemented a
fix to manage data exceeding this limit efficiently.

---------

Co-authored-by: xuxiang <xuxiang@aliyun.com>
6 months ago
Christophe Bornet e2a8962ba6
Add AstraDB document loader (#14747)
- **Description:** this adds the AstraDB document loader and an
integration test
  - **Twitter handle:** cbornet_
6 months ago
Igor Dvorkin 76923e5743
Restore self message sent before OSX 12 Monterey (#14818)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
6 months ago
savoiepe d006be60ec
Added more filtering options to pgvector vectorstore (#14852)
- **Description:** Using PGVector vector store, it was only possible to
filter for values equals, in or not in metadata. Extended this feature
to work with the following keywords : IN, NIN, BETWEEN, GT, LT, NE, EQ,
LIKE, CONTAINS, OR, AND

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
chyroc 32e96a471c
Refactor: use SecretStr for llm_rails embeddings (#15090) 6 months ago
chyroc b440f92d81
Refactor: use SecretStr for embaas embeddings (#15091) 6 months ago
chyroc ea6cf0f1b1
Refactor: use SecretStr for edenai embeddings (#15092) 6 months ago
chyroc 32e6e9de13
Refactor: use SecretStr for palm chat-model (#15100) 6 months ago
chyroc b6952d41e5
Refactor: use SecretStr for GPTRouter chat-model (#15101) 6 months ago
Nan LI f506b4cfd2
community: Integration of New Chat Model Based on ChatGLM3 via ZhipuAI API (#15105)
- **Description:** 
- This PR introduces a significant enhancement to the LangChain project
by integrating a new chat model powered by the third-generation base
large model, ChatGLM3, via the zhipuai API.
- This advanced model supports functionalities like function calls, code
interpretation, and intelligent Agent capabilities.
- The additions include the chat model itself, comprehensive
documentation in the form of Python notebook docs, and thorough testing
with both unit and integrated tests.
- **Dependencies:** This update relies on the ZhipuAI package as a key
dependency.
- **Twitter handle:** If this PR receives spotlight attention, we would
be honored to receive a mention for our integration of the advanced
ChatGLM3 model via the ZhipuAI API. Kindly tag us at @kaiwu.

To ensure quality and standards, we have performed extensive linting and
testing. Commands such as make format, make lint, and make test have
been run from the root of the modified package to ensure compliance with
LangChain's coding standards.

TO DO: Continue refining and enhancing both the unit tests and
integrated tests.

---------

Co-authored-by: jing <jingguo92@gmail.com>
Co-authored-by: hyy1987 <779003812@qq.com>
Co-authored-by: jianchuanqi <qijianchuan@hotmail.com>
Co-authored-by: lirq <whuclarence@gmail.com>
Co-authored-by: whucalrence <81530213+whucalrence@users.noreply.github.com>
Co-authored-by: Jing Guo <48378126+JaneCrystall@users.noreply.github.com>
6 months ago
Hin 2cf1e73d12
Feat add volcano embedding (#14693)
Description: Volcano Ark is an enterprise-grade large-model service
platform for developers, providing a full range of functions and
services such as model training, inference, evaluation, fine-tuning. You
can visit its homepage at https://www.volcengine.com/docs/82379/1099455
for details. This change could help developers use the platform for
embedding.
Issue: None
Dependencies: volcengine
Tag maintainer: @baskaryan
Twitter handle: @hinnnnnnnnnnnns

---------

Co-authored-by: lujingxuansc <lujingxuansc@bytedance.com>
6 months ago
David Křístek a010f29013
fix: call correct stream method in ollama (#15104)
Co-authored-by: David Kristek <david@David--MacBook-Pro.local>
6 months ago
Christian Janiake be578f32be
community:Lazy load wikipedia dump file (#15111)
**Description:** the MWDumpLoader implementation currently does not
support the lazy_load method, and the files are usually very large. We
are proposing refactoring the load function, extracting two private
functions with the functionality of loading the dump file and parsing a
single page, to reuse the code in the lazy_load implementation.
6 months ago
chyroc a4ae4bc361
feat: mask api_key for konko (#14010)
for https://github.com/langchain-ai/langchain/issues/12165
6 months ago
joel-teratis 62d32bd214
fix(minor): added missing **kwargs parameter to chroma query function (#14919)
**Description:**

This PR adds the `**kwargs` parameter to six calls in the `chroma.py`
package. All functions already were able to receive `kwargs` but they
were discarded before.

**Issue:**

When passing `kwargs` to functions in the `chroma.py` package they are
being ignored.

For example:

```
chroma_instance.similarity_search_with_score(
    query,
    k=100,
    include=["metadatas", "documents", "distances", "embeddings"],  # this parameter gets ignored
)
```
The `include` parameter does not get passed on to the next function and
does not have any effect.

**Dependencies:**

None
6 months ago
NuODaniel 7773943a51
community:qianfan endpoint support init params & remove useless params definietion (#15381)
- **Description:**
- support custom kwargs in object initialization. For instantance, QPS
differs from multiple object(chat/completion/embedding with diverse
models), for which global env is not a good choice for configuration.
  - **Issue:** no
  - **Dependencies:** no
  - **Twitter handle:** no

@baskaryan PTAL
6 months ago
Nuno Campos 99000c612e
Propagate context vars in all classes/methods (#15329)
- Any direct usage of ThreadPoolExecutor or asyncio.run_in_executor
needs manual handling of context vars

<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
6 months ago
Ankush Gola 7eec8f2487
Delete V1 tracer and refactor tracer tests to core (#15326) 6 months ago
chyroc 7ce338201c
Patch: improve check openai version (#15301) 6 months ago
Nuno Campos eb5e250188 Propagate context vars in all classes/methods
- Any direct usage of ThreadPoolExecutor or asyncio.run_in_executor needs manual handling of context vars
6 months ago
Shuai Liu 4b53440e70
Upgrades the Tongyi LLM and ChatTongyi Model (#14793)
- **Description:** fixes and upgrades for the Tongyi LLM and ChatTongyi
Model
      - Fixed typos; it should be `Tongyi`, not `OpenAI`.
- Fixed a bug in `stream_generate_with_retry`; it's a real stream
generator now.
- Fixed a bug in `validate_environment`; the `dashscope_api_key` should
be properly handled when set by environment variables or initialization
parameters.
- Changed the `dashscope` response to incremental output by setting the
parameter `incremental_output`, which eliminates the need for the
prefix-removal trick.
      - Removed some unused parameters, like `n`, `prefix_messages`.
      - Added `_stream` method.
- Added async methods support, such as `_astream`, `_agenerate`,
`_abatch`.
  - **Dependencies:** No new dependencies.
  - **Tag maintainer:** @hwchase17 

> PS: Some may be confused about the terms `dashscope`, `tongyi`, and
`Qwen`:
> - `dashscope`: A platform to deploy LLMs and provide APIs to invoke
the LLM.
> - `tongyi`: A brand name or overall term about Alibaba Cloud's LLM/AI.
> - `Qwen`: An LLM that is open-sourced and deployed in `dashscope`.
> 
> We use the `dashscope` SDK to interact with the `tongyi`-`Qwen` LLM.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
6 months ago
Bagatur 8bfac1a319
community[patch]: Release 0.0.7 (#15320) 6 months ago
Diego Rani Mazine ec72225265
refactor: enable connection pool usage in PGVector (#11514)
- **Description:** `PGVector` refactored to use connection pool.
  - **Issue:** #11433,
  - **Tag maintainer:** @hwchase17 @eyurtsev,

---------

Co-authored-by: Diego Rani Mazine <diego.mazine@mercadolivre.com>
Co-authored-by: Nuno Campos <nuno@langchain.dev>
6 months ago
joshy-deshaw bf5385592e
core, community: propagate context between threads (#15171)
While using `chain.batch`, the default implementation uses a
`ThreadPoolExecutor` and run the chains in separate threads. An issue
with this approach is that that [the token counting
callback](https://python.langchain.com/docs/modules/callbacks/token_counting)
fails to work as a consequence of the context not being propagated
between threads. This PR adds context propagation to the new threads and
adds some thread synchronization in the OpenAI callback. With this
change, the token counting callback works as intended.

Having the context propagation change would be highly beneficial for
those implementing custom callbacks for similar functionalities as well.

---------

Co-authored-by: Nuno Campos <nuno@langchain.dev>
6 months ago
shroominic 694bbb14cd
community: fix typo in async ollama chat (#15276)
Made a stupid typo in the last PR which got already merged😅
6 months ago
triThirty fea4888e72
community: Enhance Github error prompt (#15248)
- **Description:** The Github error prompt is confused because of JWT
enctrypt to somebody not familiar with Github connection method. This PR
is to add some useful error prompt to help users troubleshooting.
- **Issue:**
https://github.com/langchain-ai/langchain/issues/14550#issuecomment-1867445049
  - **Dependencies:** None,
  - **Twitter handle:** None
6 months ago
Bob Lin a464eb4394
community: Make doctran synchronous (#15264)
### Description

I found that the methods in [the doctran
library](https://github.com/psychic-api/doctran) have been restructured
into [synchronized
versions](14944a59f7),

And [the example
ipynb](https://github.com/psychic-api/doctran/blob/main/examples.ipynb)
also shows that the code is synchronized, but the README has not been
updated yet.

so we need to modify the code and update the documentation.

### Issue

https://github.com/langchain-ai/langchain/issues/14645
6 months ago
chyroc 6fb3cc6f27
Fix: Use `Union` instead of `|` to improve compatibility, fix #15244 (#15245) 6 months ago
chyroc 1abcf441ae
Refactor: use SecretStr for Predibase llms (#15119) 6 months ago
chyroc 0a9a73a9c9
Refactor: use SecretStr for PipelineAI llms (#15120) 6 months ago
chyroc d63ceb65b3
Refactor: use SecretStr for StochasticAI llms (#15118) 6 months ago
chyroc 674fde87d2
Refactor: use SecretStr for VolcEngineMaas llms (#15117) 6 months ago
chyroc 3cc1da2b38
Refactor: use SecretStr for Petals llms (#15121) 6 months ago
shroominic e6f0cee896
community: Async Ollama + ChatOllama (#15169)
**Description:**
Adding async methods to booth OllamaLLM and ChatOllama to enable async
streaming and async .on_llm_new_token callbacks.

**Issue:**
ChatOllama is not working in combination with an AsyncCallbackManager
because the .on_llm_new_token method is not awaited.
6 months ago
Phill Zarfos 35896faab7
community: correct spelling mistakes of "Suffle" and "reporoducibility" (#15172)
- **Description:** Correct spelling mistakes of "Suffle" and
"reporoducibility" in `DirectoryLoader` class
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** N/A
6 months ago
chyroc 3a3f880e5a
Patch: improve ollama 404 api error message, fix #15147 (#15156)
Make this issue more clearly exposed to developers
6 months ago
Ivan 59d4b80a92
[community]: Elasticsearch chat history encoding (#15055)
- Added ensure_ascii property to ElasticsearchChatMessageHistory

<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Ivan Chetverikov <ivan.chetverikov@raftds.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
7 months ago
Corey Brown 9e492620d4
Don't reassign chunk_type (#14923)
**Description**: The parameter chunk_type was being hard coded to
"extractive_answers", so that when "snippet" was being passed, it was
being ignored. This change simply doesn't do that.
7 months ago
Takuya Igei 6da2246215
Add support Vertex AI Gemini uses a public image URL (#14949)
## What

Since `langchain_google_genai.ChatGoogleGenerativeAI` supported A public
image URL, we add to support it in `langchain.chat_models.ChatVertexAI`
as well.

### Example

```py
from langchain.chat_models.vertexai import ChatVertexAI
from langchain_core.messages import HumanMessage

llm = ChatVertexAI(model_name="gemini-pro-vision")
image_message = {
    "type": "image_url",
    "image_url": {
        "url": "https://python.langchain.com/assets/images/cell-18-output-1-0c7fb8b94ff032d51bfe1880d8370104.png",
    },
}
text_message = {
    "type": "text",
    "text": "What is shown in this image?",
}
message = HumanMessage(content=[text_message, image_message])

output = llm([message])
print(output.content)
```

## Refs

-
https://python.langchain.com/docs/integrations/llms/google_vertex_ai_palm
-
https://python.langchain.com/docs/integrations/chat/google_generative_ai
7 months ago
Archan Ghosh affa3e755a
Update arxiv.py with get_summaries_as_docs inside of Arxivloader (#14953)
Added the call function get_summaries_as_docs inside of Arxivloader

- **Description:** Added a function that returns the documents from
get_summaries_as_docs, as the call signature is present in the parent
file but never used from Arxivloader, this can be used from Arxivloader
itself just like .load() as both the signatures are same.
- **Issue:** Reduces time to load papers as no pdf is processed only
metadata is pulled from Arxiv allowing users for faster load times on
bulk loads. Users can then choose one or more paper and use ID directly
with .load() to load pdf thereby loading all the contents of the paper.
7 months ago
ccurme f2782f4c86
community: add args_schema to GmailSendMessage (#14973)
- **Description:** `tools.gmail.send_message` implements a
`SendMessageSchema` that is not used anywhere. `GmailSendMessage` also
does not have an `args_schema` attribute (this led to issues when
invoking the tool with an OpenAI functions agent, at least for me). Here
we add the missing attribute and a minimal test for the tool.
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** N/A

---------

Co-authored-by: Chester Curme <chestercurme@microsoft.com>
7 months ago
Philip Kiely - Baseten 6342da333a
community: refactor Baseten integration with new API endpoints & docs (#15017)
- **Description:** In response to user feedback, this PR refactors the
Baseten integration with updated model endpoints, as well as updates
relevant documentation. This PR has been tested by end users in
production and works as expected.
  - **Issue:** N/A
- **Dependencies:** This PR actually removes the dependency on the
`baseten` package!
  - **Twitter handle:** https://twitter.com/basetenco
7 months ago
Blane Honeycutt 3fc1b3553b
Community: Adds ability to pass a Config to the boto3 client used by Bedrock (#15029)
# Description  
This PR adds the ability to pass a `botocore.config.Config` instance to
the boto3 client instantiated by the Bedrock LLM.

Currently, the Bedrock LLM doesn't support a way to pass a Config, which
means that some settings (e.g., timeouts and retry configuration)
require instantiating a new boto3 client with a Config and then
replacing the LLM's client:

```python
llm = Bedrock(
        region_name='us-west-2',
        model_id="anthropic.claude-v2",
        model_kwargs={'max_tokens_to_sample': 4096, 'temperature': 0},
)

llm.client = boto_client('bedrock-runtime', region_name='us-west-2', config=Config({'read_timeout': 300}))
```

# Issue
N/A

# Dependencies
N/A
7 months ago
Grzegorz Sajko dc71fcfabf
corrected outdated link (#15053)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
7 months ago
Ran c3f8733aef
fix: correct spelling mistakes of "seperate, intialise, pre-defined" (#14647)
fix spellings

**seperate -> separate**: found more occurrences, see
https://github.com/langchain-ai/langchain/pull/14602
**initialise -> intialize**: the latter is more common in the repo
**pre-defined > predefined**: adding a comma after a prefix is a
delicate matter, but this is a generally accepted word

also, another word that appears in the repo is "fs" (stands for
filesystem), e.g., in `libs/core/langchain_core/prompts/loading.py`
` """Unified method for loading a prompt from LangChainHub or local
fs."""`
Isn't "filesystem" better?
7 months ago
Harrison Chase 2e159931ac
add defaults for tavily (#15075) 7 months ago
chyroc 4440ec5ab3
Refactor: use SecretStr for minimax embeddings (#15067) 7 months ago
chyroc aa19ca9723
Refactor: use SecretStr for jina embeddings (#15068)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
7 months ago
Michael Goin 501cc8311d
community[patch]: Fix generation_config not setting properly for DeepSparse (#15036)
- **Description:** Tiny but important bugfix to use a more stable
interface for specifying generation_config parameters for DeepSparse LLM
7 months ago
QIAN Zifei 2460f977c5
community[minor]: Azure DocumentIntelligenceLoader/Parser support update with latest SDK (#14389)
- **Description:**
Add DocumentIntelligenceLoader & DocumentIntelligenceParser
implementation using the latest Azure Document Intelligence SDK with
markdown support.
The core logic resides in DocumentIntelligenceParser and
DocumentIntelligenceLoader is a mere wrapper of the parser.
The parser will takes api_endpoint and api_key and creates
DocumentIntelligenceClient for the user. 4 parsing modes are supported:
1. Markdown (default)
2. Single
3. Page 
4. Object

UT and notebook are also updated accordingly.

- **Dependencies:** Azure Document Intelligence SDK:
azure-ai-documentintelligence
[azure-sdk-for-python/sdk/documentintelligence/azure-ai-documentintelligence
at 7c42462ac662522a6fd21b17d2a20f4cd40d0356 · Azure/azure-sdk-for-python
(github.com)](https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FAzure%2Fazure-sdk-for-python%2Ftree%2F7c42462ac662522a6fd21b17d2a20f4cd40d0356%2Fsdk%2Fdocumentintelligence%2Fazure-ai-documentintelligence&data=05%7C01%7CZifei.Qian%40microsoft.com%7C298225aa3e31468a863108dbf07374ff%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638368150928704292%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=oE0Sl4HERnMKdbkV9KgBV46Z2xytcQAShdTWf7ZNl%2Bs%3D&reserved=0).

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
7 months ago
Ran 129a929d69
infra: Fix test filesystem paths incompatible with windows (#14388)
- **Description:** This PR fixes test failures on Windows caused by path
handling differences and unescaped special characters in regex. The
failing tests are:
```
FAILED tests/unit_tests/storage/test_filesystem.py::test_yield_keys - AssertionError: assert ['key1', 'subdir\\key2'] == ['key1', 'subdir/key2']
FAILED tests/unit_tests/test_imports.py::test_importable_all - ModuleNotFoundError: No module named 'langchain_community.langchain_community\\adapters'
FAILED tests/unit_tests/tools/file_management/test_utils.py::test_get_validated_relative_path_errs_on_absolute - re.error: incomplete escape \U at position 53
FAILED tests/unit_tests/tools/file_management/test_utils.py::test_get_validated_relative_path_errs_on_parent_dir - re.error: incomplete escape \U at position 69
FAILED tests/unit_tests/tools/file_management/test_utils.py::test_get_validated_relative_path_errs_for_symlink_outside_root - re.error: incomplete escape \U at position 64
```

- **Issue:** fixes
https://github.com/langchain-ai/langchain/issues/11775 (partially)
- **Dependencies:** none
7 months ago
Bagatur 40f42b8947
community[patch]: Release 0.0.6 (#15023) 7 months ago
Jacob Lee 1b01ee0e3c
community[minor]: add hf chat wrapper (#14736)
Builds on #14040 with community refactor merged and notebook updated.

Note that with this refactor, models will be imported from
`langchain_community.chat_models.huggingface` rather than the main
`langchain` repo.

---------

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: ugm2 <unaigaraymaestre@gmail.com>
Signed-off-by: Yuchen Liang <yuchenl3@andrew.cmu.edu>
Co-authored-by: Andrew Reed <andrew.reed.r@gmail.com>
Co-authored-by: Andrew Reed <areed1242@gmail.com>
Co-authored-by: A-Roucher <aymeric.roucher@gmail.com>
Co-authored-by: Aymeric Roucher <69208727+A-Roucher@users.noreply.github.com>
7 months ago
Leonid Kuligin b99274c9d8
community[patch]: changed default for VertexAIEmbeddings (#14614)
Replace this entire comment with:
- **Description:** @kurtisvg has raised a point that it's a good idea to
have a fixed version for embeddings (since otherwise a user might run a
query with one version vs a vectorstore where another version was used).
In order to avoid breaking changes, I'd suggest to give users a warning,
and make a `model_name` a required argument in 1.5 months.
7 months ago
Karim Lalani 228ddabc3b
community: fix for surrealdb client 0.3.2 update + store and retrieve metadata (#14997)
Surrealdb client changes from 0.3.1 to 0.3.2 broke the surrealdb vectore
integration.
This PR updates the code to work with the updated client. The change is
backwards compatible with previous versions of surrealdb client.
Also expanded the vector store implementation to store and retrieve
metadata that's included with the document object.
7 months ago
JaguarDB ca0a75e1fc
community[patch]: JaguarHttpClient conditional import (#14985)
- **Description:** Fixed jaguar.py to import JaguarHttpClient with try
and catch
- **Issue:** the issue # Unable to use the JaguarHttpClient at run time
  - **Dependencies:** It requires "pip install -U jaguardb-http-client" 
  - **Twitter handle:** workbot

---------

Co-authored-by: JY <jyjy@jaguardb>
Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Michael Landis 1c934fff0e
community[patch]: support momento vector index filter expressions (#14978)
**Description**

For the Momento Vector Index (MVI) vector store implementation, pass
through `filter_expression` kwarg to the MVI client, if specified. This
change will enable the MVI self query implementation in a future PR.

Also fixes some integration tests.
7 months ago
Yacine 300c1cbf92
community[patch]: Fix typo in class Docstring (#14982)
- **Description:** Fix typo in class Docstring to replace
AZURE_OPENAI_API_ENDPOINT by AZURE_OPENAI_ENDPOINT
  - **Issue:** the issue #14901 
  - **Dependencies:** NA
  - **Twitter handle:**

Co-authored-by: Yacine Bouakkaz <Yacine.Bouakkaz@evokegroup.com>
7 months ago
MING KANG ed5e0cfe57
community: add OCI Endpoint (#14250)
- **Description:** 
- [OCI Data
Science](https://docs.oracle.com/en-us/iaas/data-science/using/home.htm)
is a fully managed and serverless platform for data science teams to
build, train, and manage machine learning models in the Oracle Cloud
Infrastructure. This PR add integration for using LangChain with an LLM
hosted on a [OCI Data Science Model
Deployment](https://docs.oracle.com/en-us/iaas/data-science/using/model-dep-about.htm).
To authenticate,
[oracle-ads](https://accelerated-data-science.readthedocs.io/en/latest/user_guide/cli/authentication.html)
has been used to automatically load credentials for invoking endpoint.
- **Issue:** None
- **Dependencies:** `oracle-ads`
- **Tag maintainer:** @baskaryan
- **Twitter handle:** None

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
7 months ago
Erick Friis 75ba22793f
community: Vectara summarization (#14970)
Description: Adding Summarization to Vectara, to reflect it provides not
only vector-store type functionality but also can return a summary.
Also added:
MMR capability (in the Vectara platform side)

Updated templates

Updated documentation and IPYNB examples

Tag maintainer: @baskaryan
Twitter handle: @ofermend

---------

Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>
7 months ago
Liang Zhang 6479aab74f
community[patch]: Add param "task" to Databricks LLM to work around serialization of transform_output_fn (#14933)
**What is the reproduce code?**

```python
from langchain.chains import LLMChain, load_chain
from langchain.llms import Databricks
from langchain.prompts import PromptTemplate

def transform_output(response):
    # Extract the answer from the responses.
    return str(response["candidates"][0]["text"])

def transform_input(**request):
    full_prompt = f"""{request["prompt"]}
    Be Concise.
    """
    request["prompt"] = full_prompt
    return request

chat_model = Databricks(
    endpoint_name="llama2-13B-chat-Brambles",
    transform_input_fn=transform_input,
    transform_output_fn=transform_output,
    verbose=True,
)
print(f"Test chat model: {chat_model('What is Apache Spark')}") # This works

llm_chain = LLMChain(llm=chat_model, prompt=PromptTemplate.from_template("{chat_input}"))
llm_chain("colorful socks") # this works
llm_chain.save("databricks_llm_chain.yaml") # transform_input_fn and transform_output_fn are not serialized into the model yaml file
loaded_chain = load_chain("databricks_llm_chain.yaml") # The Databricks LLM is recreated with transform_input_fn=None, transform_output_fn=None.
loaded_chain("colorful socks") # Thus this errors. The transform_output_fn is needed to produce the correct output
```


Error:
```
 File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-6c34afab-3473-421d-877f-1ef18930ef4d/lib/python3.10/site-packages/pydantic/v1/main.py", line 341, in __init__
    raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for Generation
text
  str type expected (type=type_error.str)
 request payload: {'query': 'What is a databricks notebook?'}'}
```

**What does the error mean?**

When the LLM generates an answer, represented by a Generation data
object. The Generation data object takes a str field called text, e.g.
Generation(text=”blah”). However, the Databricks LLM tried to put a
non-str to text, e.g. Generation(text={“candidates”:[{“text”: “blah”}]})
Thus, pydantic errors.

**Why the output format becomes incorrect after saving and loading the
Databricks LLM?**

Databrick LLM does not support serializing transform_input_fn and
transform_output_fn, so they are not serialized into the model yaml
file. When the Databricks LLM is loaded, it is recreated with
transform_input_fn=None, transform_output_fn=None. Without
transform_output_fn, the output text is not unwrapped, thus errors.

Missing transform_output_fn causes this error.
Missing transform_input_fn causes the additional prompt “Be Concise.” to
be lost after saving and loading.
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Bagatur b03845e069
community[patch]: Release 0.0.5 (#14960) 7 months ago
Anush 60c70effe9
community[minor]: Qdrant sparse vector retriever (#14814)
## Description

This PR intends to add support for Qdrant's new [sparse vector
retrieval](https://qdrant.tech/articles/sparse-vectors/) by introducing
a new retriever class, `QdrantSparseVectorRetriever`.

Necessary usage docs and integration tests have been added for the
retriever.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
mogith-pn c53fab63a3
community[patch]: Fixed duplicate input id issue in clarifai vectorstore (#14914)
- **Description:** 
This PR fixes the issue faces with duplicate input id in Clarifai
vectorstore class when ingesting documents into the vectorstore more
than the batch size.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Sypherd 5642132c0c
community[patch]: Add safe lookup to OpenAI response adapter (#14765)
## Description
Similar to https://github.com/langchain-ai/langchain/issues/5861, I've
experienced `KeyError`s resulting from unsafe lookups in the
`convert_dict_to_message` function in [this
file](https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/adapters/openai.py).
While that issue focused on `KeyError 'content'`, I've opened another
issue (#14764) about how the problem still exists in the same function
but with `KeyError 'role'`. The fix for #5861 only added a safe lookup
to the specific line that was giving them trouble.. This PR fixes the
unsafe lookup in the rest of the function but the problem still exists
across the repo.

## Issues
* #14764
* #5861 

## Dependencies
* None

## Checklist
[x] make format
[x] make lint
[ ] make test - Results in `make: *** No rule to make target 'test'.
Stop.`

## Maintainers
* @hinthornw

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
AlpinDale b0588774f1
community[minor]: Add Aphrodite Engine support (#14759)
This PR adds support for PygmalionAI's [Aphrodite
Engine](https://github.com/PygmalionAI/aphrodite-engine), based on
vLLM's attention mechanism. At the moment, this PR does not include
support for the API servers, but they will be added in a later PR.

The only dependency as of now is `aphrodite-engine==0.4.2`. We pin the
version to prevent breakage due to changes in the aphrodite-engine
library.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Dmitry Tyumentsev d21f44b484
community[minor]: Add YandexGPT embeddings (#14767)
- **Description:** Introducing an ability to work with the
[YandexGPT](https://cloud.yandex.com/en/services/yandexgpt) embeddings
models.
---------

Co-authored-by: Dmitry Tyumentsev <dmitry.tyumentsev@raftds.com>
7 months ago
Nicolas Suzor 529144649e
community[patch]: add png support for vertexai._parse_chat_history_gemini() (#14788)
- **Description:** Modify community chat model vertexai to handle png
and other image types encoded in base64
  - **Dependencies:** added `import re` but no new dependencies.

This addresses a problem where the vertexai method
_parse_chat_history_gemini() was only recognizing image uris in jpeg
format. I made a simple change to cover other extension types.
7 months ago
Liu Jun b0c48dc983
community[patch]: make ak and sk optional in qianfan endpoint (#14835)
- **Description:** The Qianfan SDK offers multiple authentication
methods, but in the `QianfanEndpoint` of Langchain, it currently only
supports authentication through AK and SK. In order to accommodate users
who wish to use alternative authentication methods, this pull request
makes AK and SK optional. This change should not impact existing users,
while allowing users to configure other authentication methods as per
the Qianfan SDK documentation.
  - **Issue:** /
  - **Dependencies:** No
  - **Tag maintainer:** No
  - **Twitter handle:**
7 months ago
Archan Ghosh 65678b3816
community[patch]: Update arxiv.py with Entry ID as a return value (#14915)
Added Entry ID as a return value inside get_summaries_as_docs

- **Description:** Added the Entry ID as a return, so it's easier to
track the IDs of the papers that are being returned.


With the addition return of the entry ID in functions like
ArxivRetriever, it will be easier to reference the ID of the paper
itself.
7 months ago
Bagatur 345acb26ac
community[patch]: Matching engine, return doc id (#14930) 7 months ago
Michael Feil 7b96de3d5d
community[patch]: update Gradient embeddings (#14846)
- **Description:** Going forward, we have a own API `pip install
gradientai`. Therefore gradually removing the self-build packages in
llamaindex, haystack and langchain.
  - **Issue:** None.
  - **Dependencies:** `pip install gradientai`
  - **Tag maintainer:** @michaelfeil
7 months ago
Igor Dvorkin 6cc3c2452c
community[patch]: Enhance iMessage chat loader with timestamp parsing and message ownership (#14804)
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Mohammad Mohtashim e3abe12243
community[patch]: helpful error message for GitHubAPIWrapper (#14803)
Very simple change in relation to the issue
https://github.com/langchain-ai/langchain/issues/14550

@baskaryan, @eyurtsev, @hwchase17.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Dmitry Tyumentsev 50381abc42
community[patch]: Add retry logic to Yandex GPT API Calls (#14907)
**Description:** Added logic for re-calling the YandexGPT API in case of
an error

---------

Co-authored-by: Dmitry Tyumentsev <dmitry.tyumentsev@raftds.com>
7 months ago
Sirjanpreet Singh Banga 425e5e1791
community[minor]: rename ChatGPTRouter to GPTRouter (#14913)
**Description:**: Rename integration to GPTRouter 
**Tag maintainer:** @Gupta-Anubhav12 @samanyougarg @sirjan-ws-ext  
**Twitter handle:** [@SamanyouGarg](https://twitter.com/SamanyouGarg)
7 months ago
JaguarDB 992b04e475
community[minor]: added jaguar vector store (#14838)
Description: A new vector store Jaguar is being added. Class, test
scripts, and documentation is added.
Issue: None -- This is the first PR contributing to LangChain
Dependencies: This depends on "pip install -U jaguardb-http-client"
client http package
Tag maintainer: @baskaryan, @eyurtsev, @hwchase1
Twitter handle: @workbot

---------

Co-authored-by: JY <jyjy@jaguardb>
Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Sirjanpreet Singh Banga 44cb899a93
community[minor]: Integrating GPTRouter (#14900)
**Description:** Adding a langchain integration for
[GPTRouter](https://gpt-router.writesonic.com/) 🚀 ,
 **Tag maintainer:** @Gupta-Anubhav12 @samanyougarg @sirjan-ws-ext  
 **Twitter handle:** [@SamanyouGarg](https://twitter.com/SamanyouGarg)
 
Integration Tests Passing:
<img width="1137" alt="Screenshot 2023-12-19 at 5 45 31 PM"
src="https://github.com/Writesonic/langchain/assets/151817113/4a59df9a-ee30-47aa-9df9-b8c4eeb9dc76">
7 months ago
Leonid Ganeline b2fd41331e
docs: docstrings `langchain_community` update (#14889)
Addded missed docstrings. Fixed inconsistency in docstrings.

**Note** CC @efriis 
There were PR errors on
`langchain_experimental/prompt_injection_identifier/hugging_face_identifier.py`
But, I didn't touch this file in this PR! Can it be some cache problems?
I fixed this error.
7 months ago
abhjaw 6fbd068b3f
Update kendra.py to avoid Kendra query ValidationException (#14866)
Fixing issue - https://github.com/langchain-ai/langchain/issues/14494 to
avoid Kendra query ValidationException

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
- **Description:** Update kendra.py to avoid Kendra query
ValidationException,
- **Issue:** the issue
#https://github.com/langchain-ai/langchain/issues/14494,
  - **Dependencies:** None,
  - **Tag maintainer:** ,
  - **Twitter handle:** 

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
7 months ago
Erick Friis 5f839beab9
community: replace deprecated davinci models (#14860)
This is technically a breaking change because it'll switch out default
models from `text-davinci-003` to `gpt-3.5-turbo-instruct`, but OpenAI
is shutting off those endpoints on 1/4 anyways.

Feels less disruptive to switch out the default instead.
7 months ago
Bagatur 61ad0e8be9
community[patch]: Release 0.0.4 (#14864) 7 months ago
Bob Lin 5de1dc72b9
community[patch]: Update Tongyi default model_name (#14844)
<img width="1305" alt="Screenshot 2023-12-18 at 9 54 01 PM"
src="https://github.com/langchain-ai/langchain/assets/10000925/c943fd81-cd48-46eb-8dff-4680424d9ba9">

The current model is no longer available.
7 months ago
Vlad Kolesnikov 11fda490ca
community[minor]: New model parameters and dynamic batching for VertexAIEmbeddings (#13999)
- **Description:** VertexAIEmbeddings performance improvements
  - **Twitter handle:** @vladkol

## Improvements

- Dynamic batch size, starting from 250, lowering down to 5. Batch size
varies across regions.
Some regions support larger batches, and it significantly improves
performance.
When running large batches of texts in `us-central1`, performance gain
can be up to 3.5x.
The dynamic batching also makes sure every batch is below 20K token
limit.
- New model parameter `embeddings_type` that translates to `task_type`
parameter of the API. Newer model versions support [different embeddings
task
types](https://cloud.google.com/vertex-ai/docs/generative-ai/embeddings/get-text-embeddings#api_changes_to_models_released_on_or_after_august_2023).
7 months ago
William FH 2d91d2b978
community: Add logprobs in gen output (#14826)
Now that it's supported again for OAI chat models .

Shame this wouldn't include it in the `.invoke()` output though (it's
not included in the message itself). Would need to do a follow-up for
that to be the case
7 months ago
Dmitry Tyumentsev 78ae276df7
community[patch]: fix agenerate return value (#14815)
Fixed:
  -  `_agenerate` return value in the YandexGPT Chat Model
  - duplicate line in the documentation

Co-authored-by: Dmitry Tyumentsev <dmitry.tyumentsev@raftds.com>
7 months ago
sujeet f1d3f29bc4
community[patch]: support for Sybase SQL anywhere added. (#14821)
- **Description:** support for Sybase SQL anywhere added in
sql_database.py file at path
langchain\libs\community\langchain_community\utilities
- **Issue:** It will resolve default schema setting for Sybase SQL
anywhere
  - **Dependencies:** No,
  - **Tag maintainer:** @baskaryan, @eyurtsev, @hwchase17,
  - **Twitter handle:** NA

---------

Co-authored-by: learn360sujeet <121271779+learn360sujeet@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Erick Friis 8a07c56313
docs: developer docs (#14776)
Builds out a developer documentation section in the docs

- Links it from contributing.md
- Adds an initial guide on how to contribute an integration

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Noah Stapp 34e6f3ff72
community[patch]: Implement similarity_score_threshold for MongoDB Vector Store (#14740)
Adds the option for `similarity_score_threshold` when using
`MongoDBAtlasVectorSearch` as a vector store retriever.

Example use:

```
vector_search = MongoDBAtlasVectorSearch.from_documents(...)

qa_retriever = vector_search.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={
        "score_threshold": 0.5,
    }
)

qa = RetrievalQA.from_chain_type(
	llm=OpenAI(), 
	chain_type="stuff", 
	retriever=qa_retriever,
)

docs = qa({"query": "..."})
```

I've tested this feature locally, using a MongoDB Atlas Cluster with a
vector search index.
7 months ago
Dmitry Tyumentsev dcead816df
community[patch]: Update YandexGPT API (#14773)
Update LLMand Chat model to use new api version

---------

Co-authored-by: Dmitry Tyumentsev <dmitry.tyumentsev@raftds.com>
7 months ago
Lance Martin 42421860bc
Add image support for Ollama (#14713)
Support [LLaVA](https://ollama.ai/library/llava):
* Upgrade Ollama
* `ollama pull llava`

Ensure compatibility with [image prompt
template](https://github.com/langchain-ai/langchain/pull/14263)

---------

Co-authored-by: jacoblee93 <jacoblee93@gmail.com>
7 months ago
Harrison Chase 16399fd61d
langchain[patch]: remove unused imports (#14680)
Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Karim Lalani a0064330b1
community[minor]: Add SurrealDB vectorstore (#13331)
**Description:** Vectorstore implementation around
[SurrealDB](https://www.surrealdb.com)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
William FH 4855964332
Fix OAI Tool Message (#14746)
See format here:
https://platform.openai.com/docs/guides/function-calling/parallel-function-calling


It expects a "name" argument, which we aren't providing by default.


![image](https://github.com/langchain-ai/langchain/assets/13333726/7cd82978-337c-40a1-b099-3bb25cd57eb4)


Alternative is to add the 'name' field directly to the message if people
prefer.
7 months ago
William FH 93c7eb4e6b
[Tracing] String Stacktrace (#14131)
Add full stacktrace
7 months ago
Leonid Kuligin 7f42811e14
google-genai[patch], community[patch]: Added support for new Google GenerativeAI models (#14530)
Replace this entire comment with:
  - **Description:** added support for new Google GenerativeAI models
  - **Twitter handle:** lkuligin

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
7 months ago
Erick Friis 9fb26a2a71
community[patch]: fix pgvector sqlalchemy (#14726)
Fixes #14699
7 months ago
Funkeke ea99612caa
community[patch]: fix dashvector endpoint params error (#14484)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

Co-authored-by: fangkeke <3339698829@qq.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
7 months ago
Bob Lin dce3c74905
community[patch]: Correct type annotation for azure_ad_token_provider Closed: #14402 (#14432)
Description
Fix https://github.com/langchain-ai/langchain/issues/14402, Similar
changes: https://github.com/langchain-ai/langchain/pull/14166

Twitter handle
[lin_bob57617](https://twitter.com/lin_bob57617)
7 months ago
Fran Cirka 8a4162d15e
community[patch]: Fixed issue with importing Row from sqlalchemy (#14488)
- **Description:** Fixed import of Row in cache.py, 
- **Issue:** the issue # #13464
https://creditone.us.to/langchain-ai/langchain/issues/13464,
  - **Dependencies:** None,
  - **Twitter handle:** @frankybridman

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
7 months ago
Bagatur 4574749147
communty[patch]: Release 0.0.3 (#14673) 7 months ago
William FH 75b8891399
Update Vertex AI to include Gemini (#14670)
h/t to @lkuligin 
-  **Description:** added new models on VertexAI
  - **Twitter handle:** @lkuligin

---------

Co-authored-by: Leonid Kuligin <lkuligin@yandex.ru>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
7 months ago
Tomaz Bratanic ea2616ae23
Fix RRF and lucene escape characters for neo4j vector store (#14646)
* Remove Lucene special characters (fixes
https://github.com/langchain-ai/langchain/issues/14232)
* Fixes RRF normalization for hybrid search
7 months ago
Chengzu Ou df95abb7e7
docs: Add Databricks Vector Search example notebook (#14158)
This PR adds an example notebook for the Databricks Vector Search vector
store. It also adds an introduction to the Databricks Vector Search
product on the Databricks's provider page.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
葛尧 e780433f6b
Fix token_usage None issue in ChatOpenAI with local Chatglm2-6B (#14493)
When using local Chatglm2-6B by changing OPENAI_BASE_URL to localhost,
the token_usage in ChatOpenAI becomes None. This leads to an
AttributeError when trying to access token_usage.items().

This commit adds a check to ensure token_usage is not None before
accessing its items. This change prevents the AttributeError and allows
ChatOpenAI to work seamlessly with a local Chatglm2-6B model, aligning
with the way it operates with the OpenAI API.

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
7 months ago
Massimiliano Pronesti 6080c98108
fix(embeddings): huggingface hub embeddings and TEI (#14489)
**Description:** This PR fixes `HuggingFaceHubEmbeddings` by making the
API token optional (as in the client beneath). Most models don't require
one. I also updated the notebook for TEI (text-embeddings-inference)
accordingly as requested here #14288. In addition, I fixed a mistake in
the POST call parameters.

**Tag maintainers:** @baskaryan
7 months ago
Bob Lin a019183a01
create mypy cache dir if it doesn't exist (#14579)
### Description

When running `make lint` multiple times, i can see the error `mkdir:
.mypy_cache: File exists`. Use `mkdir -p` to solve this problem.
<img width="1512" alt="Screenshot 2023-12-12 at 11 22 01 AM"
src="https://github.com/langchain-ai/langchain/assets/10000925/1429383d-3283-4e22-8882-5693bc50b502">
7 months ago
dandanwei e5bd88383f
fix a bug in RedisNum filter againt value 0 (#14587)
- **Description:** There is a bug in RedisNum filter that filter towards
value 0 will be parsed as "*". This is a fix to it.
  - **Issue:** NA
  - **Dependencies:** NA
  - **Tag maintainer:** NA
  - **Twitter handle:** NA
7 months ago
Bagatur ca7da8f7ef
docs: fix links in readme (#14624) 7 months ago
Bagatur 2a10cabf66
docs: core and community readme (#14623) 7 months ago
Bagatur d388863a3b
community[patch]: Release 0.0.2 (#14610) 7 months ago
Erick Friis 0a9d933bb2
infra: import checking bugfix (#14569) 7 months ago
Erick Friis 482e2b94fa
infra: import CI speed (#14566)
Was taking 10 mins. Now a few seconds.
7 months ago
Bagatur 6a828e60ee
community[patch]: Release 0.0.1 (#14565) 7 months ago
Erick Friis 5418d8bfd6
infra: import CI fix (#14562)
TIL `**` globstar doesn't work in make

Makefile changes fix that.

`__getattr__` changes allow import of all files, but raise error when
accessing anything from the module.

file deletions were corresponding libs change from #14559
7 months ago
Bagatur a844b495c4
community[patch]: Fix agenttoolkits imports (#14559) 7 months ago
Bagatur ed58eeb9c5
community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463)
Moved the following modules to new package langchain-community in a backwards compatible fashion:

```
mv langchain/langchain/adapters community/langchain_community
mv langchain/langchain/callbacks community/langchain_community/callbacks
mv langchain/langchain/chat_loaders community/langchain_community
mv langchain/langchain/chat_models community/langchain_community
mv langchain/langchain/document_loaders community/langchain_community
mv langchain/langchain/docstore community/langchain_community
mv langchain/langchain/document_transformers community/langchain_community
mv langchain/langchain/embeddings community/langchain_community
mv langchain/langchain/graphs community/langchain_community
mv langchain/langchain/llms community/langchain_community
mv langchain/langchain/memory/chat_message_histories community/langchain_community
mv langchain/langchain/retrievers community/langchain_community
mv langchain/langchain/storage community/langchain_community
mv langchain/langchain/tools community/langchain_community
mv langchain/langchain/utilities community/langchain_community
mv langchain/langchain/vectorstores community/langchain_community
mv langchain/langchain/agents/agent_toolkits community/langchain_community
mv langchain/langchain/cache.py community/langchain_community
mv langchain/langchain/adapters community/langchain_community
mv langchain/langchain/callbacks community/langchain_community/callbacks
mv langchain/langchain/chat_loaders community/langchain_community
mv langchain/langchain/chat_models community/langchain_community
mv langchain/langchain/document_loaders community/langchain_community
mv langchain/langchain/docstore community/langchain_community
mv langchain/langchain/document_transformers community/langchain_community
mv langchain/langchain/embeddings community/langchain_community
mv langchain/langchain/graphs community/langchain_community
mv langchain/langchain/llms community/langchain_community
mv langchain/langchain/memory/chat_message_histories community/langchain_community
mv langchain/langchain/retrievers community/langchain_community
mv langchain/langchain/storage community/langchain_community
mv langchain/langchain/tools community/langchain_community
mv langchain/langchain/utilities community/langchain_community
mv langchain/langchain/vectorstores community/langchain_community
mv langchain/langchain/agents/agent_toolkits community/langchain_community
mv langchain/langchain/cache.py community/langchain_community
```

Moved the following to core
```
mv langchain/langchain/utils/json_schema.py core/langchain_core/utils
mv langchain/langchain/utils/html.py core/langchain_core/utils
mv langchain/langchain/utils/strings.py core/langchain_core/utils
cat langchain/langchain/utils/env.py >> core/langchain_core/utils/env.py
rm langchain/langchain/utils/env.py
```

See .scripts/community_split/script_integrations.sh for all changes
7 months ago