Commit Graph

516 Commits (e080281623597fb1bf5c7c8470b1fb48edb37a52)

Author SHA1 Message Date
thiswillbeyourgithub 1d082359ee
community: add support for callable filters in FAISS (#16190)
- **Description:**
Filtering in a FAISS vectorstores is very inflexible and doesn't allow
that many use case. I think supporting callable like this enables a lot:
regular expressions, condition on multiple keys etc. **Note** I had to
manually alter a test. I don't understand if it was falty to begin with
or if there is something funky going on.
- **Issue:** None
- **Dependencies:** None
- **Twitter handle:** None

Signed-off-by: thiswillbeyourgithub <26625900+thiswillbeyourgithub@users.noreply.github.com>
8 months ago
Killinsun - Ryota Takeuchi 52f4ad8216
community: Add new fields in metadata for qdrant vector store (#16608)
## Description

The PR is to return the ID and collection name from qdrant client to
metadata field in `Document` class.

## Issue

The motivation is almost same to
[11592](https://github.com/langchain-ai/langchain/issues/11592)

Returning ID is useful to update existing records in a vector store, but
we cannot know them if we use some retrievers.

In order to avoid any conflicts, breaking changes, the new fields in
metadata have a prefix `_`

## Dependencies

N/A

## Twitter handle

@kill_in_sun

<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
8 months ago
hulitaitai 32cad38ec6
<langchain_community\llms\chatglm.py>: <Correcting "history"> (#16729)
Use the real "history" provided by the original program instead of
putting "None" in the history.

- **Description:** I change one line in the code to make it return the
"history" of the chat model.
- **Issue:** At the moment it returns only the answers of the chat
model. However the chat model himself provides a history more complet
with the questions of the user.
  - **Dependencies:** no dependencies required for this change,
8 months ago
Bassem Yacoube 85e93e05ed
community[minor]: Update OctoAI LLM, Embedding and documentation (#16710)
This PR includes updates for OctoAI integrations:
- The LLM class was updated to fix a bug that occurs with multiple
sequential calls
- The Embedding class was updated to support the new GTE-Large endpoint
released on OctoAI lately
- The documentation jupyter notebook was updated to reflect using the
new LLM sdk
Thank you!
8 months ago
Volodymyr Machula 32c5be8b73
community[minor]: Connery Tool and Toolkit (#14506)
## Summary

This PR implements the "Connery Action Tool" and "Connery Toolkit".
Using them, you can integrate Connery actions into your LangChain agents
and chains.

Connery is an open-source plugin infrastructure for AI.

With Connery, you can easily create a custom plugin with a set of
actions and seamlessly integrate them into your LangChain agents and
chains. Connery will handle the rest: runtime, authorization, secret
management, access management, audit logs, and other vital features.
Additionally, Connery and our community offer a wide range of
ready-to-use open-source plugins for your convenience.

Learn more about Connery:

- GitHub: https://github.com/connery-io/connery-platform
- Documentation: https://docs.connery.io
- Twitter: https://twitter.com/connery_io

## TODOs

- [x] API wrapper
   - [x] Integration tests
- [x] Connery Action Tool
   - [x] Docs
   - [x] Example
   - [x] Integration tests
- [x] Connery Toolkit
  - [x] Docs
  - [x] Example
- [x] Formatting (`make format`)
- [x] Linting (`make lint`)
- [x] Testing (`make test`)
8 months ago
Harrison Chase 8457c31c04
community[patch]: activeloop ai tql deprecation (#14634)
Co-authored-by: AdkSarsen <adilkhan@activeloop.ai>
8 months ago
Neli Hateva c95facc293
langchain[minor], community[minor]: Implement Ontotext GraphDB QA Chain (#16019)
- **Description:** Implement Ontotext GraphDB QA Chain
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** @OntotextGraphDB
8 months ago
Elliot 39eb00d304
community[patch]: Adapt more parameters related to MemorySearchPayload for the search method of ZepChatMessageHistory (#15441)
- **Description:** To adapt more parameters related to
MemorySearchPayload for the search method of ZepChatMessageHistory,
  - **Issue:** None,
  - **Dependencies:** None,
  - **Twitter handle:** None
8 months ago
Jael Gu a1aa3a657c
community[patch]: Milvus supports add & delete texts by ids (#16256)
# Description

To support [langchain
indexing](https://python.langchain.com/docs/modules/data_connection/indexing)
as requested by users, vectorstore Milvus needs to support:
- document addition by id (`add_documents` method with `ids` argument)
- delete by id (`delete` method with `ids` argument)

Example usage:

```python
from langchain.indexes import SQLRecordManager, index
from langchain.schema import Document
from langchain_community.vectorstores import Milvus
from langchain_openai import OpenAIEmbeddings

collection_name = "test_index"
embedding = OpenAIEmbeddings()
vectorstore = Milvus(embedding_function=embedding, collection_name=collection_name)

namespace = f"milvus/{collection_name}"
record_manager = SQLRecordManager(
    namespace, db_url="sqlite:///record_manager_cache.sql"
)
record_manager.create_schema()

doc1 = Document(page_content="kitty", metadata={"source": "kitty.txt"})
doc2 = Document(page_content="doggy", metadata={"source": "doggy.txt"})

index(
    [doc1, doc1, doc2],
    record_manager,
    vectorstore,
    cleanup="incremental",  # None, "incremental", or "full"
    source_id_key="source",
)
```

# Fix issues

Fix https://github.com/milvus-io/milvus/issues/30112

---------

Signed-off-by: Jael Gu <mengjia.gu@zilliz.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
Michard Hugo e9d3527b79
community[patch]: Add missing async similarity_distance_threshold handling in RedisVectorStoreRetriever (#16359)
Add missing async similarity_distance_threshold handling in
RedisVectorStoreRetriever

- **Description:** added method `_aget_relevant_documents` to
`RedisVectorStoreRetriever` that overrides parent method to add support
of `similarity_distance_threshold` in async mode (as for sync mode)
  - **Issue:** #16099
  - **Dependencies:** N/A
  - **Twitter handle:** N/A
8 months ago
Anthony Bernabeu 2db79ab111
community[patch]: Implement TTL for DynamoDBChatMessageHistory (#15478)
- **Description:** Implement TTL for DynamoDBChatMessageHistory, 
  - **Issue:** see #15477,
  - **Dependencies:** N/A,

---------

Co-authored-by: Piyush Jain <piyushjain@duck.com>
8 months ago
taimo d3d9244fee
langchain-community: fix unicode escaping issue with SlackToolkit (#16616)
- **Description:** fix unicode escaping issue with SlackToolkit
  - **Issue:**  #16610
8 months ago
Benito Geordie f3fdc5c5da
community: Added integrations for ThirdAI's NeuralDB with Retriever and VectorStore frameworks (#15280)
**Description:** Adds ThirdAI NeuralDB retriever and vectorstore
integration. NeuralDB is a CPU-friendly and fine-tunable text retrieval
engine.
8 months ago
Pashva Mehta 22d90800c8
community: Fixed schema discrepancy in from_texts function for weaviate vectorstore (#16693)
* Description: Fixed schema discrepancy in **from_texts** function for
weaviate vectorstore which created a redundant property "key" inside a
class.
* Issue: Fixed: https://github.com/langchain-ai/langchain/issues/16692
* Twitter handle: @pashvamehta1
8 months ago
Daniel Erenrich 0600998f38
community: Wikidata tool support (#16691)
- **Description:** Adds Wikidata support to langchain. Can read out
documents from Wikidata.
  - **Issue:** N/A
- **Dependencies:** Adds implicit dependencies for
`wikibase-rest-api-client` (for turning items into docs) and
`mediawikiapi` (for hitting the search endpoint)
  - **Twitter handle:** @derenrich

You can see an example of this tool used in a chain
[here](https://nbviewer.org/urls/d.erenrich.net/upload/Wikidata_Langchain.ipynb)
or
[here](https://nbviewer.org/urls/d.erenrich.net/upload/Wikidata_Lars_Kai_Hansen.ipynb)

<!-- Thank you for contributing to LangChain!


Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
8 months ago
Christophe Bornet 2e3af04080
Use Postponed Evaluation of Annotations in Astra and Cassandra doc loaders (#16694)
Minor/cosmetic change
8 months ago
Christophe Bornet 36e432672a
community[minor]: Add async methods to AstraDBLoader (#16652) 8 months ago
Rashedul Hasan Rijul 481493dbce
community[patch]: apply embedding functions during query if defined (#16646)
**Description:** This update ensures that the user-defined embedding
function specified during vector store creation is applied during
queries. Previously, even if a custom embedding function was defined at
the time of store creation, Bagel DB would default to using the standard
embedding function during query execution. This pull request addresses
this issue by consistently using the user-defined embedding function for
queries if one has been specified earlier.
8 months ago
Serena Ruan f01fb47597
community[patch]: MLflowCallbackHandler -- Move textstat and spacy as optional dependency (#16657)
Signed-off-by: Serena Ruan <serena.rxy@gmail.com>
8 months ago
Zhuoyun(John) Xu 508bde7f40
community[patch]: Ollama - Pass headers to post request in async method (#16660)
# Description
A previous PR (https://github.com/langchain-ai/langchain/pull/15881)
added option to pass headers to ollama endpoint, but headers are not
pass to the async method.
8 months ago
João Carlos Ferra de Almeida 3e87b67a3c
community[patch]: Add Cookie Support to Fetch Method (#16673)
- **Description:** This change allows the `_fetch` method in the
`WebBaseLoader` class to utilize cookies from an existing
`requests.Session`. It ensures that when the `fetch` method is used, any
cookies in the provided session are included in the request. This
enhancement maintains compatibility with existing functionality while
extending the utility of the `fetch` method for scenarios where cookie
persistence is necessary.
- **Issue:** Not applicable (new feature),
- **Dependencies:** Requires `aiohttp` and `requests` libraries (no new
dependencies introduced),
- **Twitter handle:** N/A

Co-authored-by: Joao Almeida <joao.almeida@mercedes-benz.io>
8 months ago
Harrison Chase 27665e3546
[community] fix anthropic streaming (#16682) 8 months ago
Christophe Bornet 4915c3cd86
[Fix] Fix Cassandra Document loader default page content mapper (#16273)
We can't use `json.dumps` by default as many types returned by the
cassandra driver are not serializable. It's safer to use `str` and let
users define their own custom `page_content_mapper` if needed.
8 months ago
Pasha 4e189cd89a
community[patch]: youtube loader transcript format (#16625)
- **Description**: YoutubeLoader right now returns one document that
contains the entire transcript. I think it would be useful to add an
option to return multiple documents, where each document would contain
one line of transcript with the start time and duration in the metadata.
For example,
[AssemblyAIAudioTranscriptLoader](https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/document_loaders/assemblyai.py)
is implemented in a similar way, it allows you to choose between the
format to use for the document loader.
8 months ago
yin1991 a936472512
docs: Update documentation to use 'model_id' rather than 'model_name' to match actual API (#16615)
- **Description:** Replace 'model_name' with 'model_id' for accuracy 
- **Issue:**
[link-to-issue](https://github.com/langchain-ai/langchain/issues/16577)
  - **Dependencies:** 
  - **Twitter handle:**
8 months ago
Micah Parker 6543e585a5
community[patch]: Added support for Ollama's num_predict option in ChatOllama (#16633)
Just a simple default addition to the options payload for a ollama
generate call to support a max_new_tokens parameter.

Should fix issue: https://github.com/langchain-ai/langchain/issues/14715
8 months ago
baichuan-assistant 70ff54eace
community[minor]: Add Baichuan Text Embedding Model and Baichuan Inc introduction (#16568)
- **Description:** Adding Baichuan Text Embedding Model and Baichuan Inc
introduction.

Baichuan Text Embedding ranks #1 in C-MTEB leaderboard:
https://huggingface.co/spaces/mteb/leaderboard

Co-authored-by: BaiChuanHelper <wintergyc@WinterGYCs-MacBook-Pro.local>
8 months ago
Ghani e30c6662df
Langchain-community : EdenAI chat integration. (#16377)
- **Description:** This PR adds [EdenAI](https://edenai.co/) for the
chat model (already available in LLM & Embeddings). It supports all
[ChatModel] functionality: generate, async generate, stream, astream and
batch. A detailed notebook was added.

  - **Dependencies**: No dependencies are added as we call a rest API.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
8 months ago
Jatin Chawda a79345f199
community[patch]: Fixed tool names snake_case (#16397)
#16396
Fixed
1. golden_query
2. google_lens
3. memorize
4. merriam_webster
5. open_weather_map
6. pub_med
7. stack_exchange
8. generate_image
9. wikipedia
8 months ago
Bagatur 61e876aad8
openai[patch]: Explicitly support embedding dimensions (#16596) 8 months ago
Bagatur 5df8ab574e
infra: move indexing documentation test (#16595) 8 months ago
Bagatur 61b200947f
community[patch]: Release 0.0.16 (#16591) 8 months ago
Bagatur ef42d9d559
core[patch], community[patch], openai[patch]: consolidate openai tool… (#16485)
… converters

One way to convert anything to an OAI function:
convert_to_openai_function
One way to convert anything to an OAI tool: convert_to_openai_tool
Corresponding bind functions on OAI models: bind_functions, bind_tools
8 months ago
Brian Burgin 148347e858
community[minor]: Add LiteLLM Router Integration (#15588)
community:

  - **Description:**
- Add new ChatLiteLLMRouter class that allows a client to use a LiteLLM
Router as a LangChain chat model.
- Note: The existing ChatLiteLLM integration did not cover the LiteLLM
Router class.
    - Add tests and Jupyter notebook.
  - **Issue:** None
  - **Dependencies:** Relies on existing ChatLiteLLM integration
  - **Twitter handle:** @bburgin_0

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
Dmitry Tyumentsev e86e66bad7
community[patch]: YandexGPT models - add sleep_interval (#16566)
Added sleep between requests to prevent errors associated with
simultaneous requests.
8 months ago
Erick Friis adc008407e
exa: init pkg (#16553) 8 months ago
Rave Harpaz c4e9c9ca29
community[minor]: Add OCI Generative AI integration (#16548)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
- **Description:** Adding Oracle Cloud Infrastructure Generative AI
integration. Oracle Cloud Infrastructure (OCI) Generative AI is a fully
managed service that provides a set of state-of-the-art, customizable
large language models (LLMs) that cover a wide range of use cases, and
which is available through a single API. Using the OCI Generative AI
service you can access ready-to-use pretrained models, or create and
host your own fine-tuned custom models based on your own data on
dedicated AI clusters.
https://docs.oracle.com/en-us/iaas/Content/generative-ai/home.htm
  - **Issue:** None,
  - **Dependencies:** OCI Python SDK,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
Passed

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

we provide unit tests. However, we cannot provide integration tests due
to Oracle policies that prohibit public sharing of api keys.
 
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Arthur Cheng <arthur.cheng@oracle.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
Harel Gal a91181fe6d
community[minor]: add support for Guardrails for Amazon Bedrock (#15099)
Added support for optionally supplying 'Guardrails for Amazon Bedrock'
on both types of model invocations (batch/regular and streaming) and for
all models supported by the Amazon Bedrock service.

@baskaryan  @hwchase17

```python 
llm = Bedrock(model_id="<model_id>", client=bedrock,
                  model_kwargs={},
                  guardrails={"id": " <guardrail_id>",
                              "version": "<guardrail_version>",
                               "trace": True}, callbacks=[BedrockAsyncCallbackHandler()])

class BedrockAsyncCallbackHandler(AsyncCallbackHandler):
    """Async callback handler that can be used to handle callbacks from langchain."""

    async def on_llm_error(
            self,
            error: BaseException,
            **kwargs: Any,
    ) -> Any:
        reason = kwargs.get("reason")
        if reason == "GUARDRAIL_INTERVENED":
           # kwargs contains additional trace information sent by 'Guardrails for Bedrock' service.
            print(f"""Guardrails: {kwargs}""")


# streaming 
llm = Bedrock(model_id="<model_id>", client=bedrock,
                  model_kwargs={},
                  streaming=True,
                  guardrails={"id": "<guardrail_id>",
                              "version": "<guardrail_version>"})
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
Martin Kolb 04651f0248
community[minor]: VectorStore integration for SAP HANA Cloud Vector Engine (#16514)
- **Description:**
This PR adds a VectorStore integration for SAP HANA Cloud Vector Engine,
which is an upcoming feature in the SAP HANA Cloud database
(https://blogs.sap.com/2023/11/02/sap-hana-clouds-vector-engine-announcement/).

  - **Issue:** N/A
- **Dependencies:** [SAP HANA Python
Client](https://pypi.org/project/hdbcli/)
  - **Twitter handle:** @sapopensource

Implementation of the integration:
`libs/community/langchain_community/vectorstores/hanavector.py`

Unit tests:
`libs/community/tests/unit_tests/vectorstores/test_hanavector.py`

Integration tests:
`libs/community/tests/integration_tests/vectorstores/test_hanavector.py`

Example notebook:
`docs/docs/integrations/vectorstores/hanavector.ipynb`

Access credentials for execution of the integration tests can be
provided to the maintainers.

---------

Co-authored-by: sascha <sascha.stoll@sap.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
Unai Garay Maestre fdbfa6b2c8
Adds progress bar to VertexAIEmbeddings (#14542)
- **Description:** Adds progress bar to VertexAIEmbeddings 
- **Issue:** related issue
https://github.com/langchain-ai/langchain/issues/13637

Signed-off-by: ugm2 <unaigaraymaestre@gmail.com>

---------

Signed-off-by: ugm2 <unaigaraymaestre@gmail.com>
8 months ago
Jeremi Joslin 9e95699277
community[patch]: Fix error message when litellm is not installed (#16316)
The error message was mentioning the wrong package. I updated it to the
correct one.
8 months ago
bachr b3ed98dec0
community[patch]: avoid KeyError when language not in LANGUAGE_SEGMENTERS (#15212)
**Description:**

Handle unsupported languages in same way as when none is provided 
 
**Issue:**

The following line will throw a KeyError if the language is not
supported.
```python
self.Segmenter = LANGUAGE_SEGMENTERS[language]
```
E.g. when using `Language.CPP` we would get `KeyError: <Language.CPP:
'cpp'>`

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
chyroc 61da2ff24c
community[patch]: use SecretStr for yandex model secrets (#15463) 8 months ago
Alessio Serra d628a80a5d
community[patch]: added 'conversational' as a valid task for hugginface endopoint models (#15761)
- **Description:** added the conversational task to hugginFace endpoint
in order to use models designed for chatbot programming.
  - **Dependencies:** None

---------

Co-authored-by: Alessio Serra (ext.) <alessio.serra@partner.bmw.de>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
Karim Lalani 4c7755778d
community[patch]: SurrealDB fix for asyncio (#16092)
Code fix for asyncio
8 months ago
Raunak 476bf8b763
community[patch]: Load list of files using UnstructuredFileLoader (#16216)
- **Description:** Updated `_get_elements()` function of
`UnstructuredFileLoader `class to check if the argument self.file_path
is a file or list of files. If it is a list of files then it iterates
over the list of file paths, calls the partition function for each one,
and appends the results to the elements list. If self.file_path is not a
list, it calls the partition function as before.
  
  - **Issue:** Fixed #15607,
  - **Dependencies:** NA
  - **Twitter handle:** NA

Co-authored-by: H161961 <Raunak.Raunak@Honeywell.com>
8 months ago
Xudong Sun 019b6ebe8d
community[minor]: Add iFlyTek Spark LLM chat model support (#13389)
- **Description:** This PR enables LangChain to access the iFlyTek's
Spark LLM via the chat_models wrapper.
  - **Dependencies:** websocket-client ^1.6.1
  - **Tag maintainer:** @baskaryan 

### SparkLLM chat model usage

Get SparkLLM's app_id, api_key and api_secret from [iFlyTek SparkLLM API
Console](https://console.xfyun.cn/services/bm3) (for more info, see
[iFlyTek SparkLLM Intro](https://xinghuo.xfyun.cn/sparkapi) ), then set
environment variables `IFLYTEK_SPARK_APP_ID`, `IFLYTEK_SPARK_API_KEY`
and `IFLYTEK_SPARK_API_SECRET` or pass parameters when using it like the
demo below:

```python3
from langchain.chat_models.sparkllm import ChatSparkLLM

client = ChatSparkLLM(
    spark_app_id="<app_id>",
    spark_api_key="<api_key>",
    spark_api_secret="<api_secret>"
)
```
8 months ago
Serena Ruan 5c6e123757
community[patch]: Fix MlflowCallback with none artifacts_dir (#16487) 8 months ago
bu2kx ff3163297b
community[minor]: Add KDBAI vector store (#12797)
Addition of KDBAI vector store (https://kdb.ai).

Dependencies: `kdbai_client` v0.1.2 Python package.

Sample notebook: `docs/docs/integrations/vectorstores/kdbai.ipynb`

Tag maintainer: @bu2kx
Twitter handle: @kxsystems
8 months ago
Shivani Modi 4e160540ff
community[minor]: Adding Konko Completion endpoint (#15570)
This PR introduces update to Konko Integration with LangChain.

1. **New Endpoint Addition**: Integration of a new endpoint to utilize
completion models hosted on Konko.

2. **Chat Model Updates for Backward Compatibility**: We have updated
the chat models to ensure backward compatibility with previous OpenAI
versions.

4. **Updated Documentation**: Comprehensive documentation has been
updated to reflect these new changes, providing clear guidance on
utilizing the new features and ensuring seamless integration.

Thank you to the LangChain team for their exceptional work and for
considering this PR. Please let me know if any additional information is
needed.

---------

Co-authored-by: Shivani Modi <shivanimodi@Shivanis-MacBook-Pro.local>
Co-authored-by: Shivani Modi <shivanimodi@Shivanis-MBP.lan>
8 months ago
Noah Stapp e135e5257c
community[patch]: Include scores in MongoDB Atlas QA chain results (#14666)
Adds the ability to return similarity scores when using
`RetrievalQA.from_chain_type` with `MongoDBAtlasVectorSearch`. Requires
that `return_source_documents=True` is set.

Example use:

```
vector_search = MongoDBAtlasVectorSearch.from_documents(...)

qa = RetrievalQA.from_chain_type(
	llm=OpenAI(), 
	chain_type="stuff", 
	retriever=vector_search.as_retriever(search_kwargs={"additional": ["similarity_score"]}),
	return_source_documents=True
)

...

docs = qa({"query": "..."})

docs["source_documents"][0].metadata["score"] # score will be here
```

I've tested this feature locally, using a MongoDB Atlas Cluster with a
vector search index.
8 months ago
Serena Ruan 90f5a1c40e
community[minor]: Improve mlflow callback (#15691)
- **Description:** Allow passing run_id to MLflowCallbackHandler to
resume a run instead of creating a new run. Support recording retriever
relevant metrics. Refactor the code to fix some bugs.
---------

Signed-off-by: Serena Ruan <serena.rxy@gmail.com>
8 months ago
Facundo Santiago 92e6a641fd
feat: adding paygo api support for Azure ML / Azure AI Studio (#14560)
- **Description:** Introducing support for LLMs and Chat models running
in Azure AI studio and Azure ML using the new deployment mode
pay-as-you-go (model as a service).
- **Issue:** NA
- **Dependencies:** None.
- **Tag maintainer:** @prakharg-msft @gdyre 
- **Twitter handle:** @santiagofacundo

Examples added:
*
[docs/docs/integrations/llms/azure_ml.ipynb](https://github.com/santiagxf/langchain/blob/santiagxf/azureml-endpoints-paygo-community/docs/docs/integrations/chat/azureml_endpoint.ipynb)
*
[docs/docs/integrations/chat/azureml_chat_endpoint.ipynb](https://github.com/santiagxf/langchain/blob/santiagxf/azureml-endpoints-paygo-community/docs/docs/integrations/chat/azureml_chat_endpoint.ipynb)

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
8 months ago
Davide Menini 9ce177580a
community: normalize bedrock embeddings (#15103)
In this PR I added a post-processing function to normalize the
embeddings. This happens only if the new `normalize` flag is `True`.

---------

Co-authored-by: taamedag <Davide.Menini@swisscom.com>
8 months ago
baichuan-assistant 20fcd49348
community: Fix Baichuan Chat. (#15207)
- **Description:** Baichuan Chat (with both Baichuan-Turbo and
Baichuan-Turbo-192K models) has updated their APIs. There are breaking
changes. For example, BAICHUAN_SECRET_KEY is removed in the latest API
but is still required in Langchain. Baichuan's Langchain integration
needs to be updated to the latest version.
  - **Issue:** #15206
  - **Dependencies:** None,
  - **Twitter handle:** None

@hwchase17.

Co-authored-by: BaiChuanHelper <wintergyc@WinterGYCs-MacBook-Pro.local>
8 months ago
gcheron cfc225ecb3
community: SQLStrStore/SQLDocStore provide an easy SQL alternative to `InMemoryStore` to persist data remotely in a SQL storage (#15909)
**Description:**

- Implement `SQLStrStore` and `SQLDocStore` classes that inherits from
`BaseStore` to allow to persist data remotely on a SQL server.
- SQL is widely used and sometimes we do not want to install a caching
solution like Redis.
- Multiple issues/comments complain that there is no easy remote and
persistent solution that are not in memory (users want to replace
InMemoryStore), e.g.,
https://github.com/langchain-ai/langchain/issues/14267,
https://github.com/langchain-ai/langchain/issues/15633,
https://github.com/langchain-ai/langchain/issues/14643,
https://stackoverflow.com/questions/77385587/persist-parentdocumentretriever-of-langchain
- This is particularly painful when wanting to use
`ParentDocumentRetriever `
- This implementation is particularly useful when:
     * it's expensive to construct an InMemoryDocstore/dict
     * you want to retrieve documents from remote sources
     * you just want to reuse existing objects
- This implementation integrates well with PGVector, indeed, when using
PGVector, you already have a SQL instance running. `SQLDocStore` is a
convenient way of using this instance to store documents associated to
vectors. An integration example with ParentDocumentRetriever and
PGVector is provided in docs/docs/integrations/stores/sql.ipynb or
[here](https://github.com/gcheron/langchain/blob/sql-store/docs/docs/integrations/stores/sql.ipynb).
- It persists `str` and `Document` objects but can be easily extended.

 **Issue:**

Provide an easy SQL alternative to `InMemoryStore`.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
8 months ago
Massimiliano Pronesti e529939c54
feat(llms): support more tasks in HuggingFaceHub LLM and remove deprecated dep (#14406)
- **Description:** this PR upgrades the `HuggingFaceHub` LLM:
   * support more tasks (`translation` and `conversational`)
   * replaced the deprecated `InferenceApi` with `InferenceClient`
* adjusted the overall logic to use the "recommended" model for each
task when no model is provided, and vice-versa.
- **Tag mainter(s)**: @baskaryan @hwchase17
8 months ago
Bagatur 54149292f8
community[patch]: Release 0.0.15 (#16474) 8 months ago
Tomaz Bratanic d0a8082188
Fix neo4j sanitize (#16439)
Fix the sanitization bug and add an integration test
8 months ago
Florian MOREL 4b7969efc5
community[minor]: New documents loader for visio files (with extension .vsdx) (#16171)
**Description** : New documents loader for visio files (with extension
.vsdx)

A [visio file](https://fr.wikipedia.org/wiki/Microsoft_Visio) (with
extension .vsdx) is associated with Microsoft Visio, a diagram creation
software. It stores information about the structure, layout, and
graphical elements of a diagram. This format facilitates the creation
and sharing of visualizations in areas such as business, engineering,
and computer science.

A Visio file can contain multiple pages. Some of them may serve as the
background for others, and this can occur across multiple layers. This
loader extracts the textual content from each page and its associated
pages, enabling the extraction of all visible text from each page,
similar to what an OCR algorithm would do.

**Dependencies** : xmltodict package
8 months ago
Boris Feld 404abf139a
community: Add CometLLM tracing context var (#15765)
I also added LANGCHAIN_COMET_TRACING to enable the CometLLM tracing
integration similar to other tracing integrations. This is easier for
end-users to enable it rather than importing the callback and pass it
manually.

(This is the same content as
https://github.com/langchain-ai/langchain/pull/14650 but rebased and
squashed as something seems to confuse Github Action).
8 months ago
DL b9e7f6f38a
community[minor]: Bedrock async methods (#12477)
Description: Added support for asynchronous streaming in the Bedrock
class and corresponding tests.

Primarily:
  async def aprepare_output_stream
    async def _aprepare_input_and_invoke_stream
    async def _astream
    async def _acall

I've ensured that the code adheres to the project's linting and
formatting standards by running make format, make lint, and make test.

Issue: #12054, #11589

Dependencies: None

Tag maintainer: @baskaryan 

Twitter handle: @dominic_lovric

---------

Co-authored-by: Piyush Jain <piyushjain@duck.com>
8 months ago
Frank995 5694728816
community[patch]: Implement vector length definition at init time in PGVector for indexing (#16133)
Replace this entire comment with:
- **Description:** allow user to define tVector length in PGVector when
creating the embedding store, this allows for later indexing
  - **Issue:** #16132
  - **Dependencies:** None
8 months ago
parkererickson-tg b26a22f307
community[minor]: add TigerGraph support (#16280)
**Description:** Add support for querying TigerGraph databases through
the InquiryAI service.
**Issue**: N/A
**Dependencies:** N/A
**Twitter handle:** @TigerGraphDB
8 months ago
Alireza Kashani d1b4ead87c
community[patch]: Update grobid.py (#16298)
there is a case where "coords" does not exist in the "sentence"
therefore, the "split(";")" will lead to error.

we can fix that by adding "if sentence.get("coords") is not None:" 

the resulting empty "sbboxes" from this scenario will raise error at
"sbboxes[0]["page"]" because sbboxes are empty.

the PDF from https://pubmed.ncbi.nlm.nih.gov/23970373/ can replicate
those errors.
8 months ago
s-g-1 fbe592a5ce
community[patch]: fix typo in pgvecto_rs debug msg (#16318)
fixes typo in pip install message for the pgvecto_rs community vector
store
no issues found mentioning this
no dependents changed
8 months ago
Ian b9f5104e6c
communty[minor]: Store Message History to TiDB Database (#16304)
This pull request integrates the TiDB database into LangChain for
storing message history, marking one of several steps towards a
comprehensive integration of TiDB with LangChain.


A simple usage
```python
from datetime import datetime
from langchain_community.chat_message_histories import TiDBChatMessageHistory

history = TiDBChatMessageHistory(
    connection_string="mysql+pymysql://<host>:<PASSWORD>@<host>:4000/<db>?ssl_ca=/etc/ssl/cert.pem&ssl_verify_cert=true&ssl_verify_identity=true",
    session_id="code_gen",
    earliest_time=datetime.utcnow(),  # Optional to set earliest_time to load messages after this time point.
)

history.add_user_message("hi! How's feature going?")
history.add_ai_message("It's almot done")
```
8 months ago
Erick Friis cfe95ab085
multiple: update langsmith dep (#16407) 8 months ago
Eli Lucherini 6b2a57161a
community[patch]: allow additional kwargs in MlflowEmbeddings for compatibility with Cohere API (#15242)
- **Description:** add support for kwargs in`MlflowEmbeddings`
`embed_document()` and `embed_query()` so that all the arguments
required by Cohere API (and others?) can be passed down to the server.
  - **Issue:** #15234 
- **Dependencies:** MLflow with MLflow Deployments (`pip install
mlflow[genai]`)

**Tests**
Now this code [adapted from the
docs](https://python.langchain.com/docs/integrations/providers/mlflow#embeddings-example)
for the Cohere API works locally.

```python
"""
Setup
-----
export COHERE_API_KEY=...
mlflow deployments start-server --config-path examples/deployments/cohere/config.yaml

Run
---
python /path/to/this/file.py
"""
embeddings = MlflowCohereEmbeddings(target_uri="http://127.0.0.1:5000", endpoint="embeddings")
print(embeddings.embed_query("hello")[:3])
print(embeddings.embed_documents(["hello", "world"])[0][:3])
```

Output
```
[0.060455322, 0.028793335, -0.025848389]
[0.031707764, 0.021057129, -0.009361267]
```
8 months ago
Guillem Orellana Trullols aad2aa7188
community[patch]: BedrockChat -> Support Titan express as chat model (#15408)
Titan Express model was not supported as a chat model because LangChain
messages were not "translated" to a text prompt.

Co-authored-by: Guillem Orellana Trullols <guillem.orellana_trullols@siemens.com>
8 months ago
Katarina Supe 01c2f27ffa
community[patch]: Update Memgraph support (#16360)
- **Description:** I removed two queries to the database and left just
one whose results were formatted afterward into other type of schema
(avoided two calls to DB)
  - **Issue:** /
  - **Dependencies:** /
  - **Twitter handle:** @supe_katarina
8 months ago
Max Jakob 8569b8f680
community[patch]: ElasticsearchStore enable max inner product (#16393)
Enable max inner product for approximate retrieval strategy. For exact
strategy we lack the necessary `maxInnerProduct` function in the
Painless scripting language, this is why we do not add it there.

Similarity docs:
https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-params

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Joe McElroy <joseph.mcelroy@elastic.co>
8 months ago
Iskren Ivov Chernev fc196cab12
community[minor]: DeepInfra support for chat models (#16380)
Add deepinfra chat models support.

This is https://github.com/langchain-ai/langchain/pull/14234 re-opened
from my branch (so maintainers can edit).
8 months ago
Bagatur 85e8423312
community[patch]: Update bing results tool name (#16395)
Make BingSearchResults tool name OpenAI functions compatible (can't have
spaces).

Fixes #16368
8 months ago
Max Jakob de209af533
community[patch]: ElasticsearchStore: add relevance function selector (#16378)
Implement similarity function selector for ElasticsearchStore. The
scores coming back from Elasticsearch are already similarities (not
distances) and they are already normalized (see
[docs](https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-params)).
Hence we leave the scores untouched and just forward them.

This fixes #11539.

However, in hybrid mode (when keyword search and vector search are
involved) Elasticsearch currently returns no scores. This PR adds an
error message around this fact. We need to think a bit more to come up
with a solution for this case.

This PR also corrects a small error in the Elasticsearch integration
test.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
8 months ago
Tom Jorquera 1445ac95e8
community[patch]: Enable streaming for GPT4all (#16392)
`streaming` param was never passed to model
8 months ago
Bagatur 8779013847
community[patch]: Release 0.0.14 (#16384) 8 months ago
Bagatur 1dc6c1ce06
core[patch], community[patch], langchain[patch], docs: Update SQL chains/agents/docs (#16168)
Revamp SQL use cases docs. In the process update SQL chains and agents.
8 months ago
Luke 5396604ef4
community: Handling missing key in Google Trends API response. (#15864)
- **Description:** Handing response where _interest_over_time_ is
missing.
  - **Issue:** #15859
  - **Dependencies:** None
8 months ago
Virat Singh c2a614eddc
community: Add PolygonLastQuote Tool and Toolkit (#15990)
**Description:** 
In this PR, I am adding a `PolygonLastQuote` Tool, which can be used to
get the latest price quote for a given ticker / stock.

Additionally, I've added a Polygon Toolkit, which we can use to
encapsulate future tools that we build for Polygon.

**Twitter handle:** [@virattt](https://twitter.com/virattt)

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
8 months ago
Ofer Mendelevitch ffae98d371
template: Update Vectara templates (#15363)
fixed multi-query template for Vectara
added self-query template for Vectara

Also added prompt_name parameter to summarization

CC @efriis 
 **Twitter handle:** @ofermend
8 months ago
Carey 021b0484a8
community[patch]: add skipped test for inner product normalization (#14989)
---------

Co-authored-by: Erick Friis <erick@langchain.dev>
8 months ago
Christophe Bornet 3ccbe11363
community[minor]: Add Cassandra document loader (#16215)
- **Description:** document loader for Apache Cassandra
  - **Twitter handle:** cbornet_
8 months ago
mikeFore4 9d32af72ce
community[patch]: huggingface hub character removal bug fix (#16233)
- **Description:** Some text-generation models on huggingface repeat the
prompt in their generated response, but not all do! The tests use "gpt2"
which DOES repeat the prompt and as such, the HuggingFaceHub class is
hardcoded to remove the first few characters of the response (to match
the len(prompt)). However, if you are using a model (such as the very
popular "meta-llama/Llama-2-7b-chat-hf") that DOES NOT repeat the prompt
in it's generated text, then the beginning of the generated text will be
cut off. This code change fixes that bug by first checking whether the
prompt is repeated in the generated response and removing it
conditionally.
  - **Issue:** #16232 
  - **Dependencies:** N/A
  - **Twitter handle:** N/A
8 months ago
Andreas Motl 3613d8a2ad
community[patch]: Use SQLAlchemy's `bulk_save_objects` method to improve insert performance (#16244)
- **Description:** Improve [pgvector vector store
adapter](https://github.com/langchain-ai/langchain/blob/v0.1.1/libs/community/langchain_community/vectorstores/pgvector.py)
to save embeddings in batches, to improve its performance.
  - **Issue:** NA
  - **Dependencies:** NA
  - **References:** https://github.com/crate-workbench/langchain/pull/1


Hi again from the CrateDB team,

following up on GH-16243, this is another minor patch to the pgvector
vector store adapter. Inserting embeddings in batches, using
[SQLAlchemy's
`bulk_save_objects`](https://docs.sqlalchemy.org/en/20/orm/session_api.html#sqlalchemy.orm.Session.bulk_save_objects)
method, can deliver substantial performance gains.

With kind regards,
Andreas.

NB: As I am seeing just now that this method is a legacy feature of SA
2.0, it will need to be reworked on a future iteration. However, it is
not deprecated yet, and I haven't been able to come up with a different
implementation, yet.
8 months ago
Christophe Bornet 3502a407d9
infra: Use dotenv in langchain-community's integration tests (#16137)
* Removed some env vars not used in langchain package IT
* Added Astra DB env vars in langchain package, used for cache tests
* Added conftest.py to load env vars in langchain_community IT
* Added .env.example in  langchain_community IT
8 months ago
Tomaz Bratanic 1e80113ac9
community[patch]: Add neo4j timeout and value sanitization option (#16138)
The timeout function comes in handy when you want to kill longrunning
queries.
The value sanitization removes all lists that are larger than 128
elements. The idea here is to remove embedding properties from results.
8 months ago
Krishna Shedbalkar f238217cea
community[patch]: Basic Logging and Human input to ShellTool (#15932)
- **Description:** As Shell tool is very versatile, while integrating it
into applications as openai functions, developers have no clue about
what command is being executed using the ShellTool. All one can see is:

![image](https://github.com/langchain-ai/langchain/assets/60742358/540e274a-debc-4564-9027-046b91424df3)

Summarising my feature request:
1. There's no visibility about what command was executed.
2. There's no mechanism to prevent a command to be executed using
ShellTool, like a y/n human input which can be accepted from user to
proceed with executing the command.,
  - **Issue:** the issue #15931 it fixes if applicable,
  - **Dependencies:** There isn't any dependancy,
  - **Twitter handle:** @krishnashed
8 months ago
Christophe Bornet fb940d11df
community[patch]: Use newer MetadataVectorCassandraTable in Cassandra vector store (#15987)
as VectorTable is deprecated

Tested manually with `test_cassandra.py` vector store integration test.
8 months ago
Mohammad Mohtashim 1fa056c324
community[patch]: Don't set search path for unknown SQL dialects (#16047)
- **Description:** Made a small fix for the `SQLDatabase` highlighted in
an issue. The issue pertains to switching schema for different SQL
engines. 
  - **Issue:** #16023
@baskaryan
8 months ago
Leonid Ganeline c5f6b828ad
langchain[patch], community[minor]: move `output_parsers.ernie_functions` (#16057)
`output_parsers.ernie_functions` moved into `community`
8 months ago
Fei Wang d0e101e4e0
community[patch]: fix ollama astream (#16070)
Update ollama.py
8 months ago
BeatrixCohere b0c3e3db2b
community[patch]: Handle when documents are not provided in the Cohere response (#16144)
- **Description:** This handles the cohere response when documents
aren't included in the response
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** N/A
8 months ago
Felix Krones d91126fc64
community[patch]: missing unpack operator for or_clause in pgvector document filter (#16148)
- Fix for #16146 
- Adding unpack operation to "or" and "and" filter for pgvector
retriever. #
8 months ago
William FH e5cf1e2414
Community[patch]use secret str in Tavily and HuggingFaceInferenceEmbeddings (#16109)
So the api keys don't show up in repr's 

Still need to do tests
8 months ago
William FH f3601b0aaf
Community[Patch] Remove docs form bm25 repr (#16110)
Resolves: https://github.com/langchain-ai/langsmith-sdk/issues/356
8 months ago
Erick Friis 52114bdfac
community[patch]: release 0.0.13 (#16087) 8 months ago
James Briggs ca288d8f2c
community[patch]: add vector param to index query for pinecone vec store (#16054) 8 months ago
Antonio Morales 476fb328ee
community[patch]: implement adelete from VectorStore in Qdrant (#16005)
**Description:**
Implement `adelete` function from `VectorStore` in `Qdrant` to support
other asynchronous flows such as async indexing (`aindex`) which
requires `adelete` to be implemented. Since `Qdrant` can be passed an
async qdrant client, this can be supported easily.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
高远 061e63eef2
community[minor]: add vikingdb vecstore (#15155)
---------

Co-authored-by: gaoyuan <gaoyuan.20001218@bytedance.com>
8 months ago
andrijdavid d196646811
community[patch]: Refactor OpenAIWhisperParserLocal (#15150)
This PR addresses an issue in OpenAIWhisperParserLocal where requesting
CUDA without availability leads to an AttributeError #15143

Changes:

- Refactored Logic for CUDA Availability: The initialization now
includes a check for CUDA availability. If CUDA is not available, the
code falls back to using the CPU. This ensures seamless operation
without manual intervention.
- Parameterizing Batch Size and Chunk Size: The batch_size and
chunk_size are now configurable parameters, offering greater flexibility
and optimization options based on the specific requirements of the use
case.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
8 months ago
Zhichao HAN 5cf06db3b3
community[minor]: add JsonRequestsWrapper tool (#15374)
**Description:** This new feature enhances the flexibility of pipeline
integration, particularly when working with RESTful APIs.
``JsonRequestsWrapper`` allows for the decoding of JSON output, instead
of the only option for text output.

---------

Co-authored-by: Zhichao HAN <hanzhichao2000@hotmail.com>
8 months ago
chyroc d334efc848
community[patch]: fix top_p type hint (#15452)
fix: https://github.com/langchain-ai/langchain/issues/15341

@efriis
8 months ago
Mateusz Szewczyk 251afda549
community[patch]: fix stop (stop_sequences) param on WatsonxLLM (#15541)
- **Description:** Fix to IBM
[watsonx.ai](https://www.ibm.com/products/watsonx-ai) LLM provider (stop
(`stop_sequences`) param on watsonxLLM)
- **Dependencies:**
[ibm-watsonx-ai](https://pypi.org/project/ibm-watsonx-ai/),
8 months ago
Funkeke 7220124368
community[patch]: fix tongyi completion and params error (#15544)
fix tongyi completion json parse error and prompt's params error

---------

Co-authored-by: fangkeke <3339698829@qq.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
8 months ago
盐粒 Yanli ddf4e7c633
community[minor]: Update pgvecto_rs to use its high level sdk (#15574)
- **Description:** Update pgvecto_rs to use its high level sdk, 
  - **Issue:** fix #15173
8 months ago
YHW ce21392a21
community: add a flag that determines whether to load the milvus collection (#15693)
fix https://github.com/langchain-ai/langchain/issues/15694

---------

Co-authored-by: hyungwookyang <hyungwookyang@worksmobile.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
Mohammad Mohtashim 9e779ca846
community[patch]: Fixing the SlackGetChannel Tool Input Error (#15725)
Fixed the issue mentioned in #15698 for SlackGetChannel Tool.

@baskaryan.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
axiangcoding daa9ccae52
community[patch]: deprecate ErnieBotChat and ErnieEmbeddings classes (#15862)
- **Description:** add deprecated warning for ErnieBotChat and
ErnieEmbeddings.
- These two classes **lack maintenance** and do not use the sdk provided
by qianfan, which means hard to implement some key feature like
streaming.
- The alternative `langchain_community.chat_models.QianfanChatEndpoint`
and `langchain_community.embeddings.QianfanEmbeddingsEndpoint` can
completely replace these two classes, only need to change configuration
items.
  - **Issue:** None,
  - **Dependencies:** None,
  - **Twitter handle:** None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
JaguarDB b11fd3bedc
community[patch]: jaguar vector store fix integer-element error when joining metadata values (#15939)
- **Description:** some document loaders add integer-type metadata
values which cause error
  - **Issue:** 15937
  - **Dependencies:** none

---------

Co-authored-by: JY <jyjy@jaguardb>
8 months ago
Neo Zhao 21e0df937f
community[patch]: fix a bug that mistakenly handle zip iterator in FAISS.from_embeddings (#16020)
**Description**: `zip` is iterator that will only produce result once,
so the previous code will cause the `embeddings` to be an empty list.

**Issue**: I could not find a related issue.

**Dependencies**: this PR does not introduce or affect dependencies.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
Leonid Ganeline fb676d8a9b
community[minor], langchain[minor]: refactor `output_parsers` Rail (#15852)
Moved Rail parser to `community` package.
8 months ago
Massimiliano Pronesti e80aab2275
docs(community): update Amadeus toolkit to langchain v0.1 (#15976)
- **Description:** docs update following the changes introduced in
#15879

<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
8 months ago
Ashley Xu ce7723c1e5
community[minor]: add additional support for `BigQueryVectorSearch` (#15904)
BigQuery vector search lets you use GoogleSQL to do semantic search,
using vector indexes for fast but approximate results, or using brute
force for exact results.

This PR:
1. Add `metadata[_job_ib]` in Document returned by any similarity search
2. Add `explore_job_stats` to enable users to explore job statistics and
better the debuggability
3. Set the minimum row limit for running create vector index.
8 months ago
Mohammed Naqi 8799b028a6
community[minor]: Adding asynchronous function implementation for Doctran (#15941)
## Description 
In this update, I addressed the missing implementation for
atransform_document, which is the asynchronous counterpart of
transform_document in Doctran.

### Usage Example:
```py
# Instantiate DoctranPropertyExtractor with specified properties
property_extractor = DoctranPropertyExtractor(properties=properties)

# Asynchronously extract properties from a list of documents
extracted_document = await property_extractor.atransform_documents(
    documents, properties=properties
)

# Display metadata of the first extracted document
print(json.dumps(extracted_document[0].metadata, indent=2))

```

## Issue
- Pull request #14525 has caused a break in the aforementioned code.
Instead of removing an asynchronous implementation of a function,
consider implementing a synchronous version alongside it.
8 months ago
Raunak c0773ab329
community[patch]: Fixed 'coroutine' object is not subscriptable error (#15986)
- **Description:** Added parenthesis in return statement of
aembed_query() funtion to fix 'coroutine' object is not subscriptable
error.
  - **Dependencies:** NA

Co-authored-by: H161961 <Raunak.Raunak@Honeywell.com>
8 months ago
Karim Lalani 14244bd7e5
community[minor]: Added document loader for SurrealDB (#15995)
Added a simple document loader to work with SurrealDB.
8 months ago
Karim Lalani 768e5e33bc
community[minor]: Fix to match SurrealDB 0.3.2 SDK (#15996)
New version of SurrealDB python sdk was causing the integration to
break.
This fix addresses that change.
8 months ago
shahrin014 86321a949f
community: Ollama - Parameter structure to follow official documentation (#16035)
## Feature
- Follow parameter structure as per official documentation 
- top level parameters (e.g. model, system, template) will be passed as
top level parameters
  - other parameters will be sent in options unless options is provided

![image](https://github.com/langchain-ai/langchain/assets/17451563/d14715d9-9701-4ee3-b44b-89fffea62389)

## Tests
- Test if top level parameters handled properly
- Test if parameters that are not top level parameters are handled as
options
- Test if options is provided, it will be passed as is
8 months ago
Nir Kopler 0fa06732b7
community: add new gpt-3.5-turbo-1106 finetuned for cost calculation (#16039)
**Description:** Added the new gpt-3.5-turbo-1106 for **finetuned** cost
calculation,
**Issue:** no issue found open

By the information in OpenAI the pricing is the same as the older model
(0613)
8 months ago
Virat Singh eb6e385dc5
community: Add PolygonAPIWrapper and get_last_quote endpoint (#15971)
- **Description:** Added a `PolygonAPIWrapper` and an initial
`get_last_quote` endpoint, which allows us to get the last price quote
for a given `ticker`. Once merged, I can add a Polygon tool in `tools/`
for agents to use.
- **Twitter handle:** [@virattt](https://twitter.com/virattt)

The Polygon.io Stocks API provides REST endpoints that let you query the
latest market data from all US stock exchanges.
8 months ago
Erick Friis 74bac7bda1
community[patch]: core min 0.1.9 (#15974) 8 months ago
Erick Friis 845e407e08
community[patch]: release 0.0.12 (#15973) 8 months ago
Jonathan Algar a74f3a4979
Batch update of alt text and title attributes for images in md/mdx files across repo (#15357)
**Description:** Batch update of alt text and title attributes for
images in `md` & `mdx` files across the repo using
[alttexter](https://github.com/jonathanalgar/alttexter)/[alttexter-ghclient](https://github.com/jonathanalgar/alttexter-ghclient)
(built using LangChain/LangSmith).

**Limitation:** cannot update `ipynb` files because of [this
issue](https://github.com/langchain-ai/langchain/pull/15357#issuecomment-1885037250).
Can revisit when Docusaurus is bumped to v3.

I checked all the generated alt texts and titles and didn't find any
technical inaccuracies. That's not to say they're _perfect_, but a lot
better than what's there currently.


[Deployed](https://langchain-819yf1tbk-langchain.vercel.app/docs/modules/model_io/)
image example:


![chrome_yZQ7BF2GTj](https://github.com/langchain-ai/langchain/assets/93204286/43a9a4d4-70fd-41c4-8978-b6240ff63ffa)

You can see LangSmith traces for all the calls out to the LLM in the PRs
merged into this one:

* https://github.com/jonathanalgar/langchain/pull/6
* https://github.com/jonathanalgar/langchain/pull/4
* https://github.com/jonathanalgar/langchain/pull/3

I didn't add the following files to the PR as the images already have OK
alt texts:

*
27dca2d92f/docs/docs/integrations/providers/argilla.mdx (L3)
*
27dca2d92f/docs/docs/integrations/providers/apify.mdx (L11)

---------

Co-authored-by: github-actions <github-actions@github.com>
8 months ago
Varik Matevosyan efe6cfafe2
community: Added Lantern as VectorStore (#12951)
Support [Lantern](https://github.com/lanterndata/lantern) as a new
VectorStore type.

- Added Lantern as VectorStore.
It will support 3 distance functions `l2 squared`, `cosine` and
`hamming` and will use `HNSW` index.
- Added tests
- Added example notebook
8 months ago
Edwin Wenink 9fb09c1c30
community: fix the "page" mode in the AzureAIDocumentIntelligenceParser (bug) (#15958)
**Description**: the "page" mode in the
AzureAIDocumentIntelligenceParser is not accessible due to a wrong
membership test. The mode argument can only be a string (also see the
assertion in the `__init__`: `assert self.mode in ["single", "page",
"object", "markdown"]`, so the check `elif self.mode == ["page"]:`
always fails.
As a result, effectively the "object" mode is used when selecting the
"page" mode, which may lead to errors.

The docstring of the `AzureAIDocumentIntelligenceLoader` also ommitted
the `mode` parameter alltogether, so I added it.

**Issue**: I could not find a related issue (this class is only 3 weeks
old anyways)

**Dependencies**: this PR does not introduce or affect dependencies.

The current demo notebook and examples are not affected because they all
use the default markdown mode.
8 months ago
Mahdi Setayesh eb76f9c9fe
community: Fixing a performance issue with AzureSearch to perform batch embedding (#15594)
- **Description:** Azure Cognitive Search vector DB store performs slow
embedding as it does not utilize the batch embedding functionality. This
PR provide a fix to improve the performance of Azure Search class when
adding documents to the vector search,
  - **Issue:** #11313 ,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
8 months ago
ChengZi d5808f786c
community: Support milvus partition key. (#15740)
- **Description:** Milvus's partition key is an important feature. It
can support multi-tenancy. We hope to introduce this feature.
https://milvus.io/docs/partition_key.md
  - **Issue:** No
  - **Dependencies:** No
  - **Twitter handle:** No

---------

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
8 months ago
ohbeep 9b3962fc25
community: Add support of "http" URI for Milvus (#12710) (#15683)
- **Description:** Add support of HTTP URI for Milvus
  - **Issue:** #12710 
  - **Dependencies:** N/A,
8 months ago
Raunak e26e1f8b37
community: Added functions to make async calls to HuggingFaceHub's embedding endpoint in HuggingFaceHubEmbeddings class (#15737)
**Description:**
Added aembed_documents() and aembed_query() async functions in
HuggingFaceHubEmbeddings class in
langchain_community\embeddings\huggingface_hub.py file. It will support
to make async calls to HuggingFaceHub's
embedding endpoint and generate embeddings asynchronously.

Test Cases: Added test_huggingfacehub_embedding_async_documents() and
test_huggingfacehub_embedding_async_query()
functions in test_huggingface_hub.py file to test the two async
functions created in HuggingFaceHubEmbeddings class.

Documentation: Updated huggingfacehub.ipynb with steps to install
huggingface_hub package and use
HuggingFaceHubEmbeddings.

**Dependencies:** None,
**Twitter handle:** I do not have a Twitter account

---------

Co-authored-by: H161961 <Raunak.Raunak@Honeywell.com>
8 months ago
Christophe Bornet 81d1ba05dc
Add a BaseStore backed by AstraDB (#15812)
- **Description:** this change adds a `BaseStore` backed by AstraDB
  - **Twitter handle:** cbornet_
8 months ago
manishsahni2000 74d9fc2f9e
PR community:Removing knn beta content in mongodb atlas vectorstore (#15865)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
8 months ago
shahrin014 bdd90ae2ee
community: Ollama - Pass headers to post request (#15881)
## Feature
- Set additional headers in constructor
- Headers will be sent in post request

This feature is useful if deploying Ollama on a cloud service such as
hugging face, which requires authentication tokens to be passed in the
request header.

## Tests
- Test if header is passed
- Test if header is not passed
8 months ago
Xin Liu 5efec068c9
feat: Implement `stream` interface (#15875)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

Major changes:

- Rename `wasm_chat.py` to `llama_edge.py`
- Rename the `WasmChatService` class to `ChatService`
- Implement the `stream` interface for `ChatService`
- Add `test_chat_wasm_service_streaming` in the integration test
- Update `llama_edge.ipynb`

---------

Signed-off-by: Xin Liu <sam@secondstate.io>
8 months ago
Massimiliano Pronesti ec4dab0449
feat(community): make Amadeus toolkit LLM-agnostic (#15879)
- **Description:** `AmadeusToolkit` and `AmadeusClosestAirport`
contained a hardcoded call to `ChatOpenAI`. This PR makes it
LLM-independent, while guaranteeing backward compatibility.
  - **Issue:** #15847 
  - **Dependencies:** None
   
@baskaryan 

<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
8 months ago
Yacine 782dd44be9
<langchain_community.vectorstores>:<Fix pinecone.py __init__ docsrting instruction> (#15922)
- **Description:** The pinecone docstring instructs to pass the
embedding query text causing the warning below. It should be the
embeddings object.
warning message: UserWarning: Passing in `embedding` as a Callable is
deprecated. Please pass in an Embeddings object instead.
  - **Issue:** NA
  - **Dependencies:** None


@baskaryan
8 months ago
Erick Friis 623f87c888
community[patch]: pinecone bug (#15905) 8 months ago
axiangcoding d5aa277b94
community: add collection_properties parameter to Milvus (#15788)
- **Description:** add collection_properties parameter to Milvus. See
[pymilvus set_properties()
description](https://milvus.io/api-reference/pymilvus/v2.3.x/Collection/set_properties().md)
  - **Issue:** None
  - **Dependencies:** None
  - **Twitter handle:** None
9 months ago
mogith-pn 9e1ed17bfb
Community : Modified doc strings and example notebook for Clarifai (#15816)
Community : Modified doc strings and example notebook for Clarifai

Description:
1. Modified doc strings inside clarifai vectorstore class and
embeddings.
2. Modified notebook examples.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
9 months ago
Erick Friis 38523d7c57
together[minor]: add llm (#15853) 9 months ago
Erick Friis ee708739c3
community[patch]: pinecone v3 support (#15849)
Info in slack

---------

Co-authored-by: Roie Schwaber-Cohen <roie.cohen@gmail.com>
9 months ago
Erick Friis 85a4594ed7
community[patch]: more deprecations (#15782) 9 months ago
NuODaniel 70b6315b23
community[patch]: fix qianfan chat stream calling caused exception (#13800)
- **Description:** 
`QianfanChatEndpoint` extends `BaseChatModel` as a super class, which
has a default stream implement might concat the MessageChunk with
`__add__`. When call stream(), a ValueError for duplicated key will be
raise.
  - **Issues:** 
     * #13546  
     * #13548
     * merge two single test file related to qianfan.
  - **Dependencies:** no
  - **Tag maintainer:**

---------

Co-authored-by: root <liujun45@baidu.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
9 months ago
Bagatur ee5bd986de
community[patch]: update oai deprecation message (#15681)
addresses #15674
9 months ago
Erick Friis 4ed3d17c47
community[patch]: release 0.0.11 (#15760) 9 months ago
Ian 32ec56194b
community: fix myscale delete function bug (#15675)
Now the SQL used to delete vector doc from myscale is as follow:
```sql
DELETE FROM collection WHERE id = '1' AND id = '2' AND id = '3'
```

But the expected one should be 

```sql
DELETE FROM collection WHERE id IN ('1', '2', '3')
```
9 months ago
Christophe Bornet a466f79ac9
Fix AstraDB logical operator filtering (#15699)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
This change fixes the AstraDB logical operator filtering (`$and,`
`$or`).
The `metadata` prefix must not be added if the key is `$and` or `$or`.
9 months ago
Christophe Bornet 1f5f6381ec
Add doc for AstraDB document loader (#15703)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
See preview :
https://langchain-git-fork-cbornet-astra-loader-doc-langchain.vercel.app/docs/integrations/document_loaders/astradb
9 months ago
Erick Friis 94911ae503
community[patch]: Support different Pinecone initializations depending on the version (#15717)
Co-authored-by: DosticJelena <jelenadostic2@gmail.com>
9 months ago
Bagatur 4c47f39fcb
community[patch]: Release 0.0.10 (#15678) 9 months ago
Nuno Campos 7ce4cd0709
Do not issue beta or deprecation warnings on internal calls (#15641) 9 months ago
Earlee 98c6c9603e
community: fix: should flush after inserting data on milvus (#15568)
The inserted data cannot take effect immediately. We should flush after
inserting data on milvus.
9 months ago
chyroc a17a3638b5
Docs: fix excel document loader typo (#15470) 9 months ago
chyroc 9ae901c5e6
Feat: add CHM file loader (#15519)
fix https://github.com/langchain-ai/langchain/issues/15469
9 months ago
Nan LI 0b393315ce
community: Correct Input API Key Name in Notebook and Enhance Readability of Comments for ZhipuAI Chat Model (#15529)
- **Description:** This update rectifies an error in the notebook by
changing the input variable from `zhipu_api_key` to `api_key`. It also
includes revisions to comments to improve program readability.
- **Issue:** The input variable in the notebook example should be
`api_key` instead of `zhipu_api_key`.
- **Dependencies:** No additional dependencies are required for this
change.

To ensure quality and standards, we have performed extensive linting and
testing. Commands such as make format, make lint, and make test have
been run from the root of the modified package to ensure compliance with
LangChain's coding standards.
9 months ago
kursathalat 9ea28ee464
fix: Fix DEFAULT_API_KEY for ArgillaCallbackHandler (#15534)
- ArgillaCallbackHandler does not properly set the default values while
initializing. This PR corrects the line.
- Issue: #15531 
- Dependencies: Argilla

- Also corrected some dead links.
9 months ago
Chad Norvell d1bfb70bc4
community: Allow deleting by ID and collection in `pgvector` (#15627)
- **Description:** The `delete_collection` method deletes an entire
collection regardless of custom ID. The `delete` method deletes
everything with the provided custom IDs regardless of collection. It can
be useful to restrict deletion to both the collection and a set of
custom IDs. This change adds support for that by allowing you to
optionally specify that `delete` should be restricted to the collection
defined on the `PGVector` instance.
9 months ago
Chad Norvell f6226d464e
community: Include PDF ID in MathPix metadata (#15629)
- **Description:** Includes the PDF ID in the MathPix document metadata.
This is useful in case you need to re-request a processed PDF from the
MathPix API later.
9 months ago
Chad Norvell d2a686b165
community: Provide more actionable errors in the MathPix PDF loader (#15630)
- **Description:** The `error_info['id']` can be cross-referenced with
the MathPix API documentation to get very specific information about why
an error occurred.
9 months ago
Kai 5d05df4bce
community: Fixed bug of "system message check" in chat_models/tongyi. (#15631)
- **Description:** This PR is to fix a bug of "system message check" in
langchain_community/ chat_models/tongyi.py
- **Issue:** In term of current logic, if there's no system message in
the chat messages, an error of "System message can only be the first
message." will be wrongly raised.
  - **Dependencies:** No.
  - **Twitter handle:** I don't have a Twitter account.
9 months ago
Raunak 64f5968a81
community: Replaced hardcoded "metadata" with FIELDS_METADATA variable in semantic_hybrid_search_with_score_and_rerank (#15642)
- **Description:** This PR is to fix a bug in
semantic_hybrid_search_with_score_and_rerank() function in
langchain_community/vectorstores/azuresearch.py. The hardcoded
"metadata" name is replaced with FIELDS_METADATA variable with an if
block to check if the metadata column exists or not.
- **Issue:** Fixed #15581
- **Dependencies:** No
- **Twitter handle:** None

Co-authored-by: H161961 <Raunak.Raunak@Honeywell.com>
9 months ago
Erick Friis d136925c49
community[patch]: fix deprecation warnings on openai subclasses (#15621) 9 months ago
Bagatur c5226d7a18
docs: update cohere chat integration (#15562)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
9 months ago
Bagatur dbb582d227
infra: community bump min core version (#15617) 9 months ago
Bagatur 1e4b8f0453
community[patch]: Release 0.0.9 (#15615) 9 months ago
Erick Friis ebc75c5ca7
openai[minor]: implement langchain-openai package (#15503)
Todo

- [x] copy over integration tests
- [x] update docs with new instructions in #15513 
- [x] add linear ticket to bump core -> community, community->langchain,
and core->openai deps
- [ ] (optional): add `pip install langchain-openai` command to each
notebook using it
- [x] Update docstrings to not need `openai` install
- [x] Add serialization
- [x] deprecate old models

Contributor steps:

- [x] Add secret names to manual integrations workflow in
.github/workflows/_integration_test.yml
- [x] Add secrets to release workflow (for pre-release testing) in
.github/workflows/_release.yml

Maintainer steps (Contributors should not do these):

- [x] set up pypi and test pypi projects
- [x] add credential secrets to Github Actions
- [ ] add package to conda-forge


Functional changes to existing classes:

- now relies on openai client v1 (1.6.1) via concrete dep in
langchain-openai package

Codebase organization

- some function calling stuff moved to
`langchain_core.utils.function_calling` in order to be used in both
community and langchain-openai
9 months ago
Bagatur a7d023aaf0
core[patch], community[patch]: mark runnable context, lc load as beta (#15603) 9 months ago
chyroc f12b5c1222
Feat: support Milvus more params (#15447)
fix https://github.com/langchain-ai/langchain/issues/15442
9 months ago
Bagatur b2f15738dd
core[patch], langchain[patch], community[patch]: Revert #15326 (#15546) 9 months ago
Bagatur 0b579dc623
infra: update community test min reqs (#15490) 9 months ago
Bagatur 266db0efc8
community[patch]: bump core version >=0.1.5,<0.2 (#15488) 9 months ago
Bagatur a2324ee533
community[patch]: Release 0.0.8 (#15481) 9 months ago
Bagatur 54b58c03db
infra: add minimum deps pre release check (#15485) 9 months ago
Bagatur baeac236b6
langchain[patch], experimental[patch]: update utilities imports (#15438) 9 months ago
Harutaka Kawamura 73da8f863c
Remove unused `Params` (#14385)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

Removes unused `Params` in `libs/langchain/langchain/llms/mlflow.py`.

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
9 months ago
chyroc b65e57971e
Patch: improve type hint (#15451) 9 months ago
Harutaka Kawamura 8ebf55ebbf
Fix `llms.Mlflow` example (#14386)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

The example code for `llms.Mlflow` is outdated.

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
9 months ago
Xin Liu 0a7d360ba4
feat: new integration `wasm_chat` (#14787)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

Adds `WasmChat` integration. `WasmChat` runs GGUF models locally or via
chat service in lightweight and secure WebAssembly containers. In this
PR, `WasmChatService` is introduced as the first step of the
integration. `WasmChatService` is driven by
[llama-api-server](https://github.com/second-state/llama-utils) and
[WasmEdge Runtime](https://wasmedge.org/).

---------

Signed-off-by: Xin Liu <sam@secondstate.io>
9 months ago
Anush 58cc7878e9
refactor: Qdrant async improvements (#14492)
Follow up on https://github.com/langchain-ai/langchain/pull/13048.
This PR intends to simplify the Qdrant async implementation by replacing
the internal GRPC methods with the `QdrantAsyncClient` methods.
This is a backward compatible change with no additional steps required
after merge.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
9 months ago
JuR-0 4dab37741a
Fix Bedrock broad error catching (#14398)
Fixes #14347 

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
- **Description:** Added the traceback of the previous error to keep the
initial error type,
  - **Issue:** #14347 ,
  - **Dependencies:** None,
  - **Tag maintainer:** 

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Julien Raffy <julien.raffy@emeria.eu>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
9 months ago
Bob Lin e93be14c11
Improvement: Allow passing parameters to the underlying es_client. Closes: #14403 (#14435)
### Description

In https://github.com/langchain-ai/langchain/issues/14403, the user
mentioned that he hopes not to verify ssl and needs to pass more
parameters

I found that the `Elasticsearch` class [has very many
parameters](98f2af2134/elasticsearch/_sync/client/__init__.py (L131-L191)
):

<img width="1097" alt="Screenshot 2023-12-08 at 4 24 39 PM"
src="https://github.com/langchain-ai/langchain/assets/10000925/f2201554-b41a-4388-a8e8-c14a2d0466d4">

In order to adapt to more situations, I want to add the kwargs parameter
so that users can enter more `Elasticsearch` parameters. Like
[redis](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/vectorstores/redis/base.py#L253),
[tair](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/vectorstores/tair.py#L32),
[myscale](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/vectorstores/myscale.py#L112)
and so on.
9 months ago
codehound42 8aa921d3a4
Support `score_threshold` in SupabaseVectorStore similarity search (#14439)
Description: Add support for setting the `score_threshold` for
similarity search in SupabaseVectoreStore.

This pull request addresses issue #14438

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
9 months ago
YISH eecfa81918
Add the collection_description parameter to Milvus (#14524)
Because Milvus' collection_name doesn't support UFT8 characters in other
languages, I want the `collection_descriotion`.


<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
9 months ago
Evgenii Molov b4ec340fb3
Fix failing serpapi response processing for Google Maps API (#14817)
**Description:** Fix for processing for serpapi response for Google Maps
API
**Issue:** Due to the fact corresponding
[api](https://serpapi.com/google-maps-api) returns 'local_results' as
list, and old version requested `res["local_results"].keys()` of the
list. As the result we got exception: ```AttributeError: 'list' object
has no attribute 'keys'```.

Way to reproduce wrong behaviour:
```
    params = {
        "engine": "google_maps",
        "type": "search",
        "google_domain": "google.de",
        "ll": "@51.1917,10.525,14z",
        "hl": "de",
        "gl": "de",
    }
    search = SerpAPIWrapper(params=params)
    results = search.run("cafe")
```

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Ran <rccalman@gmail.com>
9 months ago
YISH da0f750a0b
Milvus allows to store metadata as json field (#14636)
Because Milvus doesn't support nullable fields, but document metadata is
very rich, so it makes more sense to store it as json.


https://github.com/milvus-io/pymilvus/issues/1705#issuecomment-1731112372

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
9 months ago
Ashley Xu 0ce7858529
feat: add Google BigQueryVectorSearch in vectorstore (#14829)
BigQuery vector search lets you use GoogleSQL to do semantic search,
using vector indexes for fast but approximate results, or using brute
force for exact results.

This PR integrates LangChain vectorstore with BigQuery Vector Search.

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Vlad Kolesnikov <vladkol@google.com>
9 months ago
JaguarDB 02f59c2035
Use args option in jaguar so it takes more options in similarity search (#15080)
- **Description:** replace score_threshold with args
  - **Issue:** needs a way to pass more options to similarity search
  - **Dependencies:** None
  - **Twitter handle:** @workbot

---------

Co-authored-by: JY <jyjy@jaguardb>
9 months ago
chyroc 37ad6ec248
Refactor: use SecretStr for tongyi chat-model (#15102) 9 months ago
Shaurya Rohatgi e1c2cd7a28
community: Semanticscholar tool to search 200M+ scientific articles (#15151)
- **Description:** Tool now supports querying over 200 million
scientific articles, vastly expanding its reach beyond the 2 million
articles accessible through Arxiv. This update significantly broadens
access to the entire scope of scientific literature.
- **Dependencies:** semantischolar
https://github.com/danielnsilva/semanticscholar
  - **Twitter handle:** @shauryr

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
9 months ago
dudub12 7e6b0056b8
SQLDatabase drop the column names in the result. (#15361)
Fix for the following bug:
https://github.com/langchain-ai/langchain/issues/15360

---------

Co-authored-by: dudu butbul <100126964+dudu-upstream@users.noreply.github.com>
9 months ago
chyroc 07d294b5ec
Fix: fix Bing Search empty result exception, fix #15384 (#15387)
fix https://github.com/langchain-ai/langchain/issues/15384
9 months ago
Bagatur 1678d6ca17
langchain[patch], experimental[patch], docs: update tools imports (#15433) 9 months ago
Leonid Ganeline b8c6ebf647
refactor `utils` (#15432)
The `langchain` [still holds several
artifacts](https://api.python.langchain.com/en/latest/langchain_api_reference.html#module-langchain.utils)
that belongs to `community`. If they moved then `langchain.utils`
namespace would be removed completely.
- moved `ernie_functions` artifacts to `community`
9 months ago
Bagatur fa5d49f2c1
docs, experimental[patch], langchain[patch], community[patch]: update storage imports (#15429)
ran 
```bash
g grep -l "langchain.vectorstores" | xargs -L 1 sed -i '' "s/langchain\.vectorstores/langchain_community.vectorstores/g"
g grep -l "langchain.document_loaders" | xargs -L 1 sed -i '' "s/langchain\.document_loaders/langchain_community.document_loaders/g"
g grep -l "langchain.chat_loaders" | xargs -L 1 sed -i '' "s/langchain\.chat_loaders/langchain_community.chat_loaders/g"
g grep -l "langchain.document_transformers" | xargs -L 1 sed -i '' "s/langchain\.document_transformers/langchain_community.document_transformers/g"
g grep -l "langchain\.graphs" | xargs -L 1 sed -i '' "s/langchain\.graphs/langchain_community.graphs/g"
g grep -l "langchain\.memory\.chat_message_histories" | xargs -L 1 sed -i '' "s/langchain\.memory\.chat_message_histories/langchain_community.chat_message_histories/g"
gco master libs/langchain/tests/unit_tests/*/test_imports.py
gco master libs/langchain/tests/unit_tests/**/test_public_api.py
```
9 months ago
Bagatur 480626dc99
docs, community[patch], experimental[patch], langchain[patch], cli[pa… (#15412)
…tch]: import models from community

ran
```bash
git grep -l 'from langchain\.chat_models' | xargs -L 1 sed -i '' "s/from\ langchain\.chat_models/from\ langchain_community.chat_models/g"
git grep -l 'from langchain\.llms' | xargs -L 1 sed -i '' "s/from\ langchain\.llms/from\ langchain_community.llms/g"
git grep -l 'from langchain\.embeddings' | xargs -L 1 sed -i '' "s/from\ langchain\.embeddings/from\ langchain_community.embeddings/g"
git checkout master libs/langchain/tests/unit_tests/llms
git checkout master libs/langchain/tests/unit_tests/chat_models
git checkout master libs/langchain/tests/unit_tests/embeddings/test_imports.py
make format
cd libs/langchain; make format
cd ../experimental; make format
cd ../core; make format
```
9 months ago
Mohammad Mohtashim b6c57d38fa
Langchain_community: Small Fix when loading facebook messages (#15358)
- **Description:** SingleFileFacebookMessengerChatLoader did not handle
the case for when messages had stickers and/or photos so fixed that.
  - **Issue:** #15356

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
9 months ago
Mateusz Szewczyk cbfaccc424
WatsonxLLM updates/enhancements (#14598)
- **Description:** updates/enhancements to IBM
[watsonx.ai](https://www.ibm.com/products/watsonx-ai) LLM provider
(prompt tuned models and prompt templates deployments support)
- **Dependencies:**
[ibm-watsonx-ai](https://pypi.org/project/ibm-watsonx-ai/),
  - **Tag maintainer:** : @hwchase17 , @eyurtsev , @baskaryan 
  - **Twitter handle:** details in comment below.

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally. 

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
9 months ago
Manjunath Janardhan 7a0feba9f7
GITLAB_URL should take default https://gitlab.com instead of error (#14638)
The fix #14221 has broken default gitlab url which is forcing the users
to specify GITLAB_URL for default one. With this fix if GITLAB_URL is
not set, the default gitlab url will be taken.

  - **Description:** Add the GITHUB URL instead of None
  - **Issue:** the issue #14221 has broken the default github URL 
  - **Dependencies:** None
  - **Tag maintainer:** @hwchase17 
  - **Twitter handle:** manjunath_shiva
9 months ago
David dcf047c48f
add api_base to _client_params (community version of #14393) (#14644)
- **Description:** This PR adds `api_base` to `_client_params` in the
`chat_model` of LiteLLM to ensure it's included in API calls.
Previously, `api_base` was set on the client but was not included in the
parameters passed to the completion function. This change ensures that
`api_base` is correctly passed to all API calls.
  - **Issue:** #14338
  - **Tag maintainer:** @hwchase17 @agola11
  - **Twitter handle:** @LMS_David_RS
9 months ago
xuxiang dd1d818a82
Fixing the Issue with DashScopeEmbeddings Handling More than 25 Rows of Data (#14662)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
 
This change addresses the issue where DashScopeEmbeddingAPI limits
requests to 25 lines of data, and DashScopeEmbeddings did not handle
cases with more than 25 lines, leading to errors. I have implemented a
fix to manage data exceeding this limit efficiently.

---------

Co-authored-by: xuxiang <xuxiang@aliyun.com>
9 months ago
Christophe Bornet e2a8962ba6
Add AstraDB document loader (#14747)
- **Description:** this adds the AstraDB document loader and an
integration test
  - **Twitter handle:** cbornet_
9 months ago
Igor Dvorkin 76923e5743
Restore self message sent before OSX 12 Monterey (#14818)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
9 months ago
savoiepe d006be60ec
Added more filtering options to pgvector vectorstore (#14852)
- **Description:** Using PGVector vector store, it was only possible to
filter for values equals, in or not in metadata. Extended this feature
to work with the following keywords : IN, NIN, BETWEEN, GT, LT, NE, EQ,
LIKE, CONTAINS, OR, AND

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
9 months ago
chyroc 32e96a471c
Refactor: use SecretStr for llm_rails embeddings (#15090) 9 months ago
chyroc b440f92d81
Refactor: use SecretStr for embaas embeddings (#15091) 9 months ago
chyroc ea6cf0f1b1
Refactor: use SecretStr for edenai embeddings (#15092) 9 months ago
chyroc 32e6e9de13
Refactor: use SecretStr for palm chat-model (#15100) 9 months ago
chyroc b6952d41e5
Refactor: use SecretStr for GPTRouter chat-model (#15101) 9 months ago
Nan LI f506b4cfd2
community: Integration of New Chat Model Based on ChatGLM3 via ZhipuAI API (#15105)
- **Description:** 
- This PR introduces a significant enhancement to the LangChain project
by integrating a new chat model powered by the third-generation base
large model, ChatGLM3, via the zhipuai API.
- This advanced model supports functionalities like function calls, code
interpretation, and intelligent Agent capabilities.
- The additions include the chat model itself, comprehensive
documentation in the form of Python notebook docs, and thorough testing
with both unit and integrated tests.
- **Dependencies:** This update relies on the ZhipuAI package as a key
dependency.
- **Twitter handle:** If this PR receives spotlight attention, we would
be honored to receive a mention for our integration of the advanced
ChatGLM3 model via the ZhipuAI API. Kindly tag us at @kaiwu.

To ensure quality and standards, we have performed extensive linting and
testing. Commands such as make format, make lint, and make test have
been run from the root of the modified package to ensure compliance with
LangChain's coding standards.

TO DO: Continue refining and enhancing both the unit tests and
integrated tests.

---------

Co-authored-by: jing <jingguo92@gmail.com>
Co-authored-by: hyy1987 <779003812@qq.com>
Co-authored-by: jianchuanqi <qijianchuan@hotmail.com>
Co-authored-by: lirq <whuclarence@gmail.com>
Co-authored-by: whucalrence <81530213+whucalrence@users.noreply.github.com>
Co-authored-by: Jing Guo <48378126+JaneCrystall@users.noreply.github.com>
9 months ago
Hin 2cf1e73d12
Feat add volcano embedding (#14693)
Description: Volcano Ark is an enterprise-grade large-model service
platform for developers, providing a full range of functions and
services such as model training, inference, evaluation, fine-tuning. You
can visit its homepage at https://www.volcengine.com/docs/82379/1099455
for details. This change could help developers use the platform for
embedding.
Issue: None
Dependencies: volcengine
Tag maintainer: @baskaryan
Twitter handle: @hinnnnnnnnnnnns

---------

Co-authored-by: lujingxuansc <lujingxuansc@bytedance.com>
9 months ago
David Křístek a010f29013
fix: call correct stream method in ollama (#15104)
Co-authored-by: David Kristek <david@David--MacBook-Pro.local>
9 months ago
Christian Janiake be578f32be
community:Lazy load wikipedia dump file (#15111)
**Description:** the MWDumpLoader implementation currently does not
support the lazy_load method, and the files are usually very large. We
are proposing refactoring the load function, extracting two private
functions with the functionality of loading the dump file and parsing a
single page, to reuse the code in the lazy_load implementation.
9 months ago
chyroc a4ae4bc361
feat: mask api_key for konko (#14010)
for https://github.com/langchain-ai/langchain/issues/12165
9 months ago
joel-teratis 62d32bd214
fix(minor): added missing **kwargs parameter to chroma query function (#14919)
**Description:**

This PR adds the `**kwargs` parameter to six calls in the `chroma.py`
package. All functions already were able to receive `kwargs` but they
were discarded before.

**Issue:**

When passing `kwargs` to functions in the `chroma.py` package they are
being ignored.

For example:

```
chroma_instance.similarity_search_with_score(
    query,
    k=100,
    include=["metadatas", "documents", "distances", "embeddings"],  # this parameter gets ignored
)
```
The `include` parameter does not get passed on to the next function and
does not have any effect.

**Dependencies:**

None
9 months ago
NuODaniel 7773943a51
community:qianfan endpoint support init params & remove useless params definietion (#15381)
- **Description:**
- support custom kwargs in object initialization. For instantance, QPS
differs from multiple object(chat/completion/embedding with diverse
models), for which global env is not a good choice for configuration.
  - **Issue:** no
  - **Dependencies:** no
  - **Twitter handle:** no

@baskaryan PTAL
9 months ago
Nuno Campos 99000c612e
Propagate context vars in all classes/methods (#15329)
- Any direct usage of ThreadPoolExecutor or asyncio.run_in_executor
needs manual handling of context vars

<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
9 months ago
Ankush Gola 7eec8f2487
Delete V1 tracer and refactor tracer tests to core (#15326) 9 months ago
chyroc 7ce338201c
Patch: improve check openai version (#15301) 9 months ago
Nuno Campos eb5e250188 Propagate context vars in all classes/methods
- Any direct usage of ThreadPoolExecutor or asyncio.run_in_executor needs manual handling of context vars
9 months ago
Shuai Liu 4b53440e70
Upgrades the Tongyi LLM and ChatTongyi Model (#14793)
- **Description:** fixes and upgrades for the Tongyi LLM and ChatTongyi
Model
      - Fixed typos; it should be `Tongyi`, not `OpenAI`.
- Fixed a bug in `stream_generate_with_retry`; it's a real stream
generator now.
- Fixed a bug in `validate_environment`; the `dashscope_api_key` should
be properly handled when set by environment variables or initialization
parameters.
- Changed the `dashscope` response to incremental output by setting the
parameter `incremental_output`, which eliminates the need for the
prefix-removal trick.
      - Removed some unused parameters, like `n`, `prefix_messages`.
      - Added `_stream` method.
- Added async methods support, such as `_astream`, `_agenerate`,
`_abatch`.
  - **Dependencies:** No new dependencies.
  - **Tag maintainer:** @hwchase17 

> PS: Some may be confused about the terms `dashscope`, `tongyi`, and
`Qwen`:
> - `dashscope`: A platform to deploy LLMs and provide APIs to invoke
the LLM.
> - `tongyi`: A brand name or overall term about Alibaba Cloud's LLM/AI.
> - `Qwen`: An LLM that is open-sourced and deployed in `dashscope`.
> 
> We use the `dashscope` SDK to interact with the `tongyi`-`Qwen` LLM.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
9 months ago
Bagatur 8bfac1a319
community[patch]: Release 0.0.7 (#15320) 9 months ago
Diego Rani Mazine ec72225265
refactor: enable connection pool usage in PGVector (#11514)
- **Description:** `PGVector` refactored to use connection pool.
  - **Issue:** #11433,
  - **Tag maintainer:** @hwchase17 @eyurtsev,

---------

Co-authored-by: Diego Rani Mazine <diego.mazine@mercadolivre.com>
Co-authored-by: Nuno Campos <nuno@langchain.dev>
9 months ago
joshy-deshaw bf5385592e
core, community: propagate context between threads (#15171)
While using `chain.batch`, the default implementation uses a
`ThreadPoolExecutor` and run the chains in separate threads. An issue
with this approach is that that [the token counting
callback](https://python.langchain.com/docs/modules/callbacks/token_counting)
fails to work as a consequence of the context not being propagated
between threads. This PR adds context propagation to the new threads and
adds some thread synchronization in the OpenAI callback. With this
change, the token counting callback works as intended.

Having the context propagation change would be highly beneficial for
those implementing custom callbacks for similar functionalities as well.

---------

Co-authored-by: Nuno Campos <nuno@langchain.dev>
9 months ago
shroominic 694bbb14cd
community: fix typo in async ollama chat (#15276)
Made a stupid typo in the last PR which got already merged😅
9 months ago
triThirty fea4888e72
community: Enhance Github error prompt (#15248)
- **Description:** The Github error prompt is confused because of JWT
enctrypt to somebody not familiar with Github connection method. This PR
is to add some useful error prompt to help users troubleshooting.
- **Issue:**
https://github.com/langchain-ai/langchain/issues/14550#issuecomment-1867445049
  - **Dependencies:** None,
  - **Twitter handle:** None
9 months ago
Bob Lin a464eb4394
community: Make doctran synchronous (#15264)
### Description

I found that the methods in [the doctran
library](https://github.com/psychic-api/doctran) have been restructured
into [synchronized
versions](14944a59f7),

And [the example
ipynb](https://github.com/psychic-api/doctran/blob/main/examples.ipynb)
also shows that the code is synchronized, but the README has not been
updated yet.

so we need to modify the code and update the documentation.

### Issue

https://github.com/langchain-ai/langchain/issues/14645
9 months ago
chyroc 6fb3cc6f27
Fix: Use `Union` instead of `|` to improve compatibility, fix #15244 (#15245) 9 months ago
chyroc 1abcf441ae
Refactor: use SecretStr for Predibase llms (#15119) 9 months ago
chyroc 0a9a73a9c9
Refactor: use SecretStr for PipelineAI llms (#15120) 9 months ago
chyroc d63ceb65b3
Refactor: use SecretStr for StochasticAI llms (#15118) 9 months ago
chyroc 674fde87d2
Refactor: use SecretStr for VolcEngineMaas llms (#15117) 9 months ago
chyroc 3cc1da2b38
Refactor: use SecretStr for Petals llms (#15121) 9 months ago
shroominic e6f0cee896
community: Async Ollama + ChatOllama (#15169)
**Description:**
Adding async methods to booth OllamaLLM and ChatOllama to enable async
streaming and async .on_llm_new_token callbacks.

**Issue:**
ChatOllama is not working in combination with an AsyncCallbackManager
because the .on_llm_new_token method is not awaited.
9 months ago
Phill Zarfos 35896faab7
community: correct spelling mistakes of "Suffle" and "reporoducibility" (#15172)
- **Description:** Correct spelling mistakes of "Suffle" and
"reporoducibility" in `DirectoryLoader` class
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** N/A
9 months ago
chyroc 3a3f880e5a
Patch: improve ollama 404 api error message, fix #15147 (#15156)
Make this issue more clearly exposed to developers
9 months ago
Ivan 59d4b80a92
[community]: Elasticsearch chat history encoding (#15055)
- Added ensure_ascii property to ElasticsearchChatMessageHistory

<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Ivan Chetverikov <ivan.chetverikov@raftds.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
9 months ago
Corey Brown 9e492620d4
Don't reassign chunk_type (#14923)
**Description**: The parameter chunk_type was being hard coded to
"extractive_answers", so that when "snippet" was being passed, it was
being ignored. This change simply doesn't do that.
9 months ago
Takuya Igei 6da2246215
Add support Vertex AI Gemini uses a public image URL (#14949)
## What

Since `langchain_google_genai.ChatGoogleGenerativeAI` supported A public
image URL, we add to support it in `langchain.chat_models.ChatVertexAI`
as well.

### Example

```py
from langchain.chat_models.vertexai import ChatVertexAI
from langchain_core.messages import HumanMessage

llm = ChatVertexAI(model_name="gemini-pro-vision")
image_message = {
    "type": "image_url",
    "image_url": {
        "url": "https://python.langchain.com/assets/images/cell-18-output-1-0c7fb8b94ff032d51bfe1880d8370104.png",
    },
}
text_message = {
    "type": "text",
    "text": "What is shown in this image?",
}
message = HumanMessage(content=[text_message, image_message])

output = llm([message])
print(output.content)
```

## Refs

-
https://python.langchain.com/docs/integrations/llms/google_vertex_ai_palm
-
https://python.langchain.com/docs/integrations/chat/google_generative_ai
9 months ago
Archan Ghosh affa3e755a
Update arxiv.py with get_summaries_as_docs inside of Arxivloader (#14953)
Added the call function get_summaries_as_docs inside of Arxivloader

- **Description:** Added a function that returns the documents from
get_summaries_as_docs, as the call signature is present in the parent
file but never used from Arxivloader, this can be used from Arxivloader
itself just like .load() as both the signatures are same.
- **Issue:** Reduces time to load papers as no pdf is processed only
metadata is pulled from Arxiv allowing users for faster load times on
bulk loads. Users can then choose one or more paper and use ID directly
with .load() to load pdf thereby loading all the contents of the paper.
9 months ago
ccurme f2782f4c86
community: add args_schema to GmailSendMessage (#14973)
- **Description:** `tools.gmail.send_message` implements a
`SendMessageSchema` that is not used anywhere. `GmailSendMessage` also
does not have an `args_schema` attribute (this led to issues when
invoking the tool with an OpenAI functions agent, at least for me). Here
we add the missing attribute and a minimal test for the tool.
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** N/A

---------

Co-authored-by: Chester Curme <chestercurme@microsoft.com>
9 months ago
Philip Kiely - Baseten 6342da333a
community: refactor Baseten integration with new API endpoints & docs (#15017)
- **Description:** In response to user feedback, this PR refactors the
Baseten integration with updated model endpoints, as well as updates
relevant documentation. This PR has been tested by end users in
production and works as expected.
  - **Issue:** N/A
- **Dependencies:** This PR actually removes the dependency on the
`baseten` package!
  - **Twitter handle:** https://twitter.com/basetenco
9 months ago
Blane Honeycutt 3fc1b3553b
Community: Adds ability to pass a Config to the boto3 client used by Bedrock (#15029)
# Description  
This PR adds the ability to pass a `botocore.config.Config` instance to
the boto3 client instantiated by the Bedrock LLM.

Currently, the Bedrock LLM doesn't support a way to pass a Config, which
means that some settings (e.g., timeouts and retry configuration)
require instantiating a new boto3 client with a Config and then
replacing the LLM's client:

```python
llm = Bedrock(
        region_name='us-west-2',
        model_id="anthropic.claude-v2",
        model_kwargs={'max_tokens_to_sample': 4096, 'temperature': 0},
)

llm.client = boto_client('bedrock-runtime', region_name='us-west-2', config=Config({'read_timeout': 300}))
```

# Issue
N/A

# Dependencies
N/A
9 months ago
Grzegorz Sajko dc71fcfabf
corrected outdated link (#15053)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
9 months ago
Ran c3f8733aef
fix: correct spelling mistakes of "seperate, intialise, pre-defined" (#14647)
fix spellings

**seperate -> separate**: found more occurrences, see
https://github.com/langchain-ai/langchain/pull/14602
**initialise -> intialize**: the latter is more common in the repo
**pre-defined > predefined**: adding a comma after a prefix is a
delicate matter, but this is a generally accepted word

also, another word that appears in the repo is "fs" (stands for
filesystem), e.g., in `libs/core/langchain_core/prompts/loading.py`
` """Unified method for loading a prompt from LangChainHub or local
fs."""`
Isn't "filesystem" better?
9 months ago
Harrison Chase 2e159931ac
add defaults for tavily (#15075) 9 months ago
chyroc 4440ec5ab3
Refactor: use SecretStr for minimax embeddings (#15067) 9 months ago
chyroc aa19ca9723
Refactor: use SecretStr for jina embeddings (#15068)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
9 months ago
Michael Goin 501cc8311d
community[patch]: Fix generation_config not setting properly for DeepSparse (#15036)
- **Description:** Tiny but important bugfix to use a more stable
interface for specifying generation_config parameters for DeepSparse LLM
9 months ago
QIAN Zifei 2460f977c5
community[minor]: Azure DocumentIntelligenceLoader/Parser support update with latest SDK (#14389)
- **Description:**
Add DocumentIntelligenceLoader & DocumentIntelligenceParser
implementation using the latest Azure Document Intelligence SDK with
markdown support.
The core logic resides in DocumentIntelligenceParser and
DocumentIntelligenceLoader is a mere wrapper of the parser.
The parser will takes api_endpoint and api_key and creates
DocumentIntelligenceClient for the user. 4 parsing modes are supported:
1. Markdown (default)
2. Single
3. Page 
4. Object

UT and notebook are also updated accordingly.

- **Dependencies:** Azure Document Intelligence SDK:
azure-ai-documentintelligence
[azure-sdk-for-python/sdk/documentintelligence/azure-ai-documentintelligence
at 7c42462ac662522a6fd21b17d2a20f4cd40d0356 · Azure/azure-sdk-for-python
(github.com)](https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FAzure%2Fazure-sdk-for-python%2Ftree%2F7c42462ac662522a6fd21b17d2a20f4cd40d0356%2Fsdk%2Fdocumentintelligence%2Fazure-ai-documentintelligence&data=05%7C01%7CZifei.Qian%40microsoft.com%7C298225aa3e31468a863108dbf07374ff%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638368150928704292%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=oE0Sl4HERnMKdbkV9KgBV46Z2xytcQAShdTWf7ZNl%2Bs%3D&reserved=0).

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
9 months ago
Ran 129a929d69
infra: Fix test filesystem paths incompatible with windows (#14388)
- **Description:** This PR fixes test failures on Windows caused by path
handling differences and unescaped special characters in regex. The
failing tests are:
```
FAILED tests/unit_tests/storage/test_filesystem.py::test_yield_keys - AssertionError: assert ['key1', 'subdir\\key2'] == ['key1', 'subdir/key2']
FAILED tests/unit_tests/test_imports.py::test_importable_all - ModuleNotFoundError: No module named 'langchain_community.langchain_community\\adapters'
FAILED tests/unit_tests/tools/file_management/test_utils.py::test_get_validated_relative_path_errs_on_absolute - re.error: incomplete escape \U at position 53
FAILED tests/unit_tests/tools/file_management/test_utils.py::test_get_validated_relative_path_errs_on_parent_dir - re.error: incomplete escape \U at position 69
FAILED tests/unit_tests/tools/file_management/test_utils.py::test_get_validated_relative_path_errs_for_symlink_outside_root - re.error: incomplete escape \U at position 64
```

- **Issue:** fixes
https://github.com/langchain-ai/langchain/issues/11775 (partially)
- **Dependencies:** none
9 months ago