Commit Graph

7144 Commits (bcc71d1a5754c49300aedc9d681e21bcea343015)
 

Author SHA1 Message Date
chyroc 61da2ff24c
community[patch]: use SecretStr for yandex model secrets (#15463) 8 months ago
Alessio Serra d628a80a5d
community[patch]: added 'conversational' as a valid task for hugginface endopoint models (#15761)
- **Description:** added the conversational task to hugginFace endpoint
in order to use models designed for chatbot programming.
  - **Dependencies:** None

---------

Co-authored-by: Alessio Serra (ext.) <alessio.serra@partner.bmw.de>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
Karim Lalani 4c7755778d
community[patch]: SurrealDB fix for asyncio (#16092)
Code fix for asyncio
8 months ago
BeatrixCohere 2b2285dac0
docs: Update cohere rerank and comparison docs (#16198)
- **Description:** Update the cohere rerank docs to use cohere
embeddings
  - **Issue:** n/a
  - **Dependencies:** n/a
  - **Twitter handle:** n/a
8 months ago
Raunak 476bf8b763
community[patch]: Load list of files using UnstructuredFileLoader (#16216)
- **Description:** Updated `_get_elements()` function of
`UnstructuredFileLoader `class to check if the argument self.file_path
is a file or list of files. If it is a list of files then it iterates
over the list of file paths, calls the partition function for each one,
and appends the results to the elements list. If self.file_path is not a
list, it calls the partition function as before.
  
  - **Issue:** Fixed #15607,
  - **Dependencies:** NA
  - **Twitter handle:** NA

Co-authored-by: H161961 <Raunak.Raunak@Honeywell.com>
8 months ago
Xudong Sun 019b6ebe8d
community[minor]: Add iFlyTek Spark LLM chat model support (#13389)
- **Description:** This PR enables LangChain to access the iFlyTek's
Spark LLM via the chat_models wrapper.
  - **Dependencies:** websocket-client ^1.6.1
  - **Tag maintainer:** @baskaryan 

### SparkLLM chat model usage

Get SparkLLM's app_id, api_key and api_secret from [iFlyTek SparkLLM API
Console](https://console.xfyun.cn/services/bm3) (for more info, see
[iFlyTek SparkLLM Intro](https://xinghuo.xfyun.cn/sparkapi) ), then set
environment variables `IFLYTEK_SPARK_APP_ID`, `IFLYTEK_SPARK_API_KEY`
and `IFLYTEK_SPARK_API_SECRET` or pass parameters when using it like the
demo below:

```python3
from langchain.chat_models.sparkllm import ChatSparkLLM

client = ChatSparkLLM(
    spark_app_id="<app_id>",
    spark_api_key="<api_key>",
    spark_api_secret="<api_secret>"
)
```
8 months ago
Ali Zendegani 80fcc50c65
langchain[patch]: Minor Fix: Enable Passing custom_headers for Authentication in GraphQL Agent/Tool (#16413)
- **Description:** 

This PR aims to enhance the `langchain` library by enabling the support
for passing `custom_headers` in the `GraphQLAPIWrapper` usage within
`langchain/agents/load_tools.py`.

While the `GraphQLAPIWrapper` from the `langchain_community` module is
inherently capable of handling `custom_headers`, its current invocation
in `load_tools.py` does not facilitate this functionality.
This limitation restricts the use of the `graphql` tool with databases
or APIs that require token-based authentication.

The absence of support for `custom_headers` in this context also leads
to a lack of error messages when attempting to interact with secured
GraphQL endpoints, making debugging and troubleshooting more
challenging.

This update modifies the `load_tools` function to correctly handle
`custom_headers`, thereby allowing secure and authenticated access to
GraphQL services requiring tokens.

Example usage after the proposed change:
```python
tools = load_tools(
    ["graphql"],
    graphql_endpoint="https://your-graphql-endpoint.com/graphql",
    custom_headers={"Authorization": f"Token {api_token}"},
)
```
  - **Issue:** None,
  - **Dependencies:** None,
  - **Twitter handle:** None
8 months ago
Serena Ruan 5c6e123757
community[patch]: Fix MlflowCallback with none artifacts_dir (#16487) 8 months ago
Krista Pratico 0e2e7d8b83
langchain[patch]: allow passing client with OpenAIAssistantRunnable (#16486)
- **Description:** This addresses the issue tagged below where if you
try to pass your own client when creating an OpenAI assistant, a
pydantic error is raised:

Example code:

```python
import openai
from langchain.agents.openai_assistant import OpenAIAssistantRunnable

client = openai.OpenAI()
interpreter_assistant = OpenAIAssistantRunnable.create_assistant(
    name="langchain assistant",
    instructions="You are a personal math tutor. Write and run code to answer math questions.",
    tools=[{"type": "code_interpreter"}],
    model="gpt-4-1106-preview",
    client=client
)

```

Error:
`pydantic.v1.errors.ConfigError: field "client" not yet prepared, so the
type is still a ForwardRef. You might need to call
OpenAIAssistantRunnable.update_forward_refs()`

It additionally updates type hints and docstrings to indicate that an
AzureOpenAI client is permissible as well.

  - **Issue:** https://github.com/langchain-ai/langchain/issues/15948
  - **Dependencies:** N/A
8 months ago
Eugene Yurtsev d898d2f07b
docs: Fix version in which astream_events was released (#16481)
Fix typo in version
8 months ago
bu2kx ff3163297b
community[minor]: Add KDBAI vector store (#12797)
Addition of KDBAI vector store (https://kdb.ai).

Dependencies: `kdbai_client` v0.1.2 Python package.

Sample notebook: `docs/docs/integrations/vectorstores/kdbai.ipynb`

Tag maintainer: @bu2kx
Twitter handle: @kxsystems
8 months ago
JongRok BAEK 4ec3fe4680
docs: Updated integration docs structure for chat/anthropic (#16268)
Description: 
- Added output and environment variables
- Updated the documentation for chat/anthropic, changing references from
`langchain.schema` to `langchain_core.prompts`.

Issue: https://github.com/langchain-ai/langchain/issues/15664
Dependencies: None
Twitter handle: None

Since this is my first open-source PR, please feel free to point out any
mistakes, and I'll be eager to make corrections.
8 months ago
Shivani Modi 4e160540ff
community[minor]: Adding Konko Completion endpoint (#15570)
This PR introduces update to Konko Integration with LangChain.

1. **New Endpoint Addition**: Integration of a new endpoint to utilize
completion models hosted on Konko.

2. **Chat Model Updates for Backward Compatibility**: We have updated
the chat models to ensure backward compatibility with previous OpenAI
versions.

4. **Updated Documentation**: Comprehensive documentation has been
updated to reflect these new changes, providing clear guidance on
utilizing the new features and ensuring seamless integration.

Thank you to the LangChain team for their exceptional work and for
considering this PR. Please let me know if any additional information is
needed.

---------

Co-authored-by: Shivani Modi <shivanimodi@Shivanis-MacBook-Pro.local>
Co-authored-by: Shivani Modi <shivanimodi@Shivanis-MBP.lan>
8 months ago
Gianfranco Demarco c69f599594
langchain[patch]: Extract _aperform_agent_action from _aiter_next_step from AgentExecutor (#15707)
- **Description:** extreact the _aperform_agent_action in the
AgentExecutor class to allow for easier overriding. Extracted logic from
_iter_next_step into a new method _perform_agent_action for consistency
and easier overriding.
- **Issue:** #15706

Closes #15706
8 months ago
i-w-a 95ee69a301
langchain[patch]: In HTMLHeaderTextSplitter set default encoding to utf-8 (#16372)
- **Description:** The HTMLHeaderTextSplitter Class now explicitly
specifies utf-8 encoding in the part of the split_text_from_file method
that calls the HTMLParser.
- **Issue:** Prevent garbled characters due to differences in encoding
of html files (except for English in particular, I noticed that problem
with Japanese).
  - **Dependencies:** No dependencies,
  - **Twitter handle:**  @i_w__a
8 months ago
Noah Stapp e135e5257c
community[patch]: Include scores in MongoDB Atlas QA chain results (#14666)
Adds the ability to return similarity scores when using
`RetrievalQA.from_chain_type` with `MongoDBAtlasVectorSearch`. Requires
that `return_source_documents=True` is set.

Example use:

```
vector_search = MongoDBAtlasVectorSearch.from_documents(...)

qa = RetrievalQA.from_chain_type(
	llm=OpenAI(), 
	chain_type="stuff", 
	retriever=vector_search.as_retriever(search_kwargs={"additional": ["similarity_score"]}),
	return_source_documents=True
)

...

docs = qa({"query": "..."})

docs["source_documents"][0].metadata["score"] # score will be here
```

I've tested this feature locally, using a MongoDB Atlas Cluster with a
vector search index.
8 months ago
Serena Ruan 90f5a1c40e
community[minor]: Improve mlflow callback (#15691)
- **Description:** Allow passing run_id to MLflowCallbackHandler to
resume a run instead of creating a new run. Support recording retriever
relevant metrics. Refactor the code to fix some bugs.
---------

Signed-off-by: Serena Ruan <serena.rxy@gmail.com>
8 months ago
Facundo Santiago 92e6a641fd
feat: adding paygo api support for Azure ML / Azure AI Studio (#14560)
- **Description:** Introducing support for LLMs and Chat models running
in Azure AI studio and Azure ML using the new deployment mode
pay-as-you-go (model as a service).
- **Issue:** NA
- **Dependencies:** None.
- **Tag maintainer:** @prakharg-msft @gdyre 
- **Twitter handle:** @santiagofacundo

Examples added:
*
[docs/docs/integrations/llms/azure_ml.ipynb](https://github.com/santiagxf/langchain/blob/santiagxf/azureml-endpoints-paygo-community/docs/docs/integrations/chat/azureml_endpoint.ipynb)
*
[docs/docs/integrations/chat/azureml_chat_endpoint.ipynb](https://github.com/santiagxf/langchain/blob/santiagxf/azureml-endpoints-paygo-community/docs/docs/integrations/chat/azureml_chat_endpoint.ipynb)

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
8 months ago
Davide Menini 9ce177580a
community: normalize bedrock embeddings (#15103)
In this PR I added a post-processing function to normalize the
embeddings. This happens only if the new `normalize` flag is `True`.

---------

Co-authored-by: taamedag <Davide.Menini@swisscom.com>
8 months ago
baichuan-assistant 20fcd49348
community: Fix Baichuan Chat. (#15207)
- **Description:** Baichuan Chat (with both Baichuan-Turbo and
Baichuan-Turbo-192K models) has updated their APIs. There are breaking
changes. For example, BAICHUAN_SECRET_KEY is removed in the latest API
but is still required in Langchain. Baichuan's Langchain integration
needs to be updated to the latest version.
  - **Issue:** #15206
  - **Dependencies:** None,
  - **Twitter handle:** None

@hwchase17.

Co-authored-by: BaiChuanHelper <wintergyc@WinterGYCs-MacBook-Pro.local>
8 months ago
gcheron cfc225ecb3
community: SQLStrStore/SQLDocStore provide an easy SQL alternative to `InMemoryStore` to persist data remotely in a SQL storage (#15909)
**Description:**

- Implement `SQLStrStore` and `SQLDocStore` classes that inherits from
`BaseStore` to allow to persist data remotely on a SQL server.
- SQL is widely used and sometimes we do not want to install a caching
solution like Redis.
- Multiple issues/comments complain that there is no easy remote and
persistent solution that are not in memory (users want to replace
InMemoryStore), e.g.,
https://github.com/langchain-ai/langchain/issues/14267,
https://github.com/langchain-ai/langchain/issues/15633,
https://github.com/langchain-ai/langchain/issues/14643,
https://stackoverflow.com/questions/77385587/persist-parentdocumentretriever-of-langchain
- This is particularly painful when wanting to use
`ParentDocumentRetriever `
- This implementation is particularly useful when:
     * it's expensive to construct an InMemoryDocstore/dict
     * you want to retrieve documents from remote sources
     * you just want to reuse existing objects
- This implementation integrates well with PGVector, indeed, when using
PGVector, you already have a SQL instance running. `SQLDocStore` is a
convenient way of using this instance to store documents associated to
vectors. An integration example with ParentDocumentRetriever and
PGVector is provided in docs/docs/integrations/stores/sql.ipynb or
[here](https://github.com/gcheron/langchain/blob/sql-store/docs/docs/integrations/stores/sql.ipynb).
- It persists `str` and `Document` objects but can be easily extended.

 **Issue:**

Provide an easy SQL alternative to `InMemoryStore`.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
8 months ago
dudgeon 26b2ad6d5b
Fixed typo on quickstart.ipynb (#16482)
- **Description:** Quick typo fix: `inpect` >> `inspect`
  - **Issue:** N/A
  - **Dependencies:** any dependencies required for this change,
  - **Twitter handle:** @geoffdudgeon
8 months ago
Massimiliano Pronesti e529939c54
feat(llms): support more tasks in HuggingFaceHub LLM and remove deprecated dep (#14406)
- **Description:** this PR upgrades the `HuggingFaceHub` LLM:
   * support more tasks (`translation` and `conversational`)
   * replaced the deprecated `InferenceApi` with `InferenceClient`
* adjusted the overall logic to use the "recommended" model for each
task when no model is provided, and vice-versa.
- **Tag mainter(s)**: @baskaryan @hwchase17
8 months ago
Erick Friis afb25eeec4
cli[patch]: add integration tests to default makefile (#16479) 8 months ago
Erick Friis 51c8ef6af4
templates: fix azure params in retrieval agent (#16257)
- FIX templates/retrieval-agent/retireval-agent/chain.py to use the new
Syntax for Azure env params
- cr

---------

Co-authored-by: braun-viathan <p.braun@viathan.de>
Co-authored-by: Braun-viathan <121631422+braun-viathan@users.noreply.github.com>
8 months ago
Lance Martin c3530f1c11
templates: Minor nit on HyDE (#16478) 8 months ago
Bagatur ba326b98d0
langchain[patch]: Release 0.1.3 (#16475) 8 months ago
Bagatur 54149292f8
community[patch]: Release 0.0.15 (#16474) 8 months ago
Bagatur ef6a335570
core[patch]: Release 0.1.15 (#16473) 8 months ago
Erick Friis 1f4ac62dee
cli[patch], google-vertexai[patch]: readme template (#16470) 8 months ago
Eugene Yurtsev 39d1cbfecf
Docs: Document astream_events API (#16300)
Document astream events API
8 months ago
Tomaz Bratanic d0a8082188
Fix neo4j sanitize (#16439)
Fix the sanitization bug and add an integration test
8 months ago
William FH 5de59f9236
Core[Patch] Parse tool input after on_start (#16430)
For tracing, if a validation error occurs, currently it is attributed to
the previous step of the chain. It would be nice to have the on_start
and on_error callbacks called for tools when there is a validation error
that occurs to more easily attribute the root-cause
8 months ago
Nuno Campos 226fe645f1
core[patch] Do not try to access attribute of None (#16321) 8 months ago
Florian MOREL 4b7969efc5
community[minor]: New documents loader for visio files (with extension .vsdx) (#16171)
**Description** : New documents loader for visio files (with extension
.vsdx)

A [visio file](https://fr.wikipedia.org/wiki/Microsoft_Visio) (with
extension .vsdx) is associated with Microsoft Visio, a diagram creation
software. It stores information about the structure, layout, and
graphical elements of a diagram. This format facilitates the creation
and sharing of visualizations in areas such as business, engineering,
and computer science.

A Visio file can contain multiple pages. Some of them may serve as the
background for others, and this can occur across multiple layers. This
loader extracts the textual content from each page and its associated
pages, enabling the extraction of all visible text from each page,
similar to what an OCR algorithm would do.

**Dependencies** : xmltodict package
8 months ago
KhoPhi fb41b68ea1
docs: Update with LCEL examples to Ollama & ChatOllama Integration notebook (#16194)
- **Description:** Updated the Chat/Ollama docs notebook with LCEL chain
examples

- **Issue:**  #15664 I'm a new contributor 😊

- **Dependencies:** No dependencies

- **Twitter handle:** 

Comments:

- How do I truncate the output of the stream in the notebook if and or
when it goes on and on and on for even the basic of prompts?

Edit:

Looking forward to feedback @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
Michael Gorham 3b0226b2c6
docs: Update redis_chat_message_history.ipynb (#16344)
## Problem
Spent several hours trying to figure out how to pass
`RedisChatMessageHistory` as a `GetSessionHistoryCallable` with a
different REDIS hostname. This example kept connecting to
`redis://localhost:6379`, but I wanted to connect to a server not hosted
locally.

## Cause
Assumption the user knows how to implement `BaseChatMessageHistory` and
`GetSessionHistoryCallable`

## Solution
Update documentation to show how to explicitly set the REDIS hostname
using a lambda function much like the MongoDB and SQLite examples.
8 months ago
Ian c98994c3c9
docs: Improve notebook to show how to use tidb to store history messages (#16420)
After merging [PR
#16304](https://github.com/langchain-ai/langchain/pull/16304), I
realized that our notebook example for integrating TiDB with LangChain
was too basic. To make it more useful and user-friendly, I plan to
create a detailed example. This will show how to use TiDB for saving
history messages in LangChain, offering a clearer, more practical guide
for our users
8 months ago
Eugene Yurtsev c88750d54b
Docs: Agent streaming notebooks (#15858)
Update information about streaming in the agents section. Show how to
use astream_events to get token by token streaming.
8 months ago
Eugene Yurtsev e5672bc944
docs: Re-write custom agent to show to write a tools agent (#15907)
Shows how to write a tools agent rather than a functions agent.
8 months ago
Boris Feld 404abf139a
community: Add CometLLM tracing context var (#15765)
I also added LANGCHAIN_COMET_TRACING to enable the CometLLM tracing
integration similar to other tracing integrations. This is easier for
end-users to enable it rather than importing the callback and pass it
manually.

(This is the same content as
https://github.com/langchain-ai/langchain/pull/14650 but rebased and
squashed as something seems to confuse Github Action).
8 months ago
Nicolò Boschi a500527030
infra: google-vertexai relax types-requests deps range (#16264)
- **Description:** At the moment it's not possible to include in the
same project langchain-google-vertexai and boto3 (e.g. use bedrock and
vertex in the same application) because of the dependency resolutions
conflict. boto3 is still using urllib3 1.x, meanwhile
langchain-google-vertexai -> types-requests depends on urllib3 2.x. [the
last version of types-requests that allows urllib3 1.x is
2.31.0.6](https://pypi.org/project/types-requests/#description).
In this PR I allow the vertexai package to get that version also. 
  
- **Twitter handle:** nicoloboschi
8 months ago
DL b9e7f6f38a
community[minor]: Bedrock async methods (#12477)
Description: Added support for asynchronous streaming in the Bedrock
class and corresponding tests.

Primarily:
  async def aprepare_output_stream
    async def _aprepare_input_and_invoke_stream
    async def _astream
    async def _acall

I've ensured that the code adheres to the project's linting and
formatting standards by running make format, make lint, and make test.

Issue: #12054, #11589

Dependencies: None

Tag maintainer: @baskaryan 

Twitter handle: @dominic_lovric

---------

Co-authored-by: Piyush Jain <piyushjain@duck.com>
8 months ago
Jennifer Melot d6275e47f2
docs: Updated integration docs structure for tools/arxiv (#16091) (#16250)
- **Description:** Updated docs for tools/arxiv to use `AgentExecutor`
and `invoke`
  - **Issue:** #15664
  - **Dependencies:** None
  - **Twitter handle:** None
8 months ago
Frank995 5694728816
community[patch]: Implement vector length definition at init time in PGVector for indexing (#16133)
Replace this entire comment with:
- **Description:** allow user to define tVector length in PGVector when
creating the embedding store, this allows for later indexing
  - **Issue:** #16132
  - **Dependencies:** None
8 months ago
ChengZi a950fa0487
docs: add milvus multitenancy doc (#16177)
- **Description:** add milvus multitenancy doc, it is an example for
this [pr](https://github.com/langchain-ai/langchain/pull/15740) .
  - **Issue:** No,
  - **Dependencies:** No,
  - **Twitter handle:** No

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
8 months ago
Chase VanSteenburg 1011b681dc
core[patch]: Fix f-string formatting in error message for configurable_fields (#16411)
- **Description:** Simple fix to f-string formatting. Allows more
informative ValueError output.
  - **Issue:** None needed.
  - **Dependencies:** None.
  - **Twitter handle:** @FlightP1an
8 months ago
parkererickson-tg b26a22f307
community[minor]: add TigerGraph support (#16280)
**Description:** Add support for querying TigerGraph databases through
the InquiryAI service.
**Issue**: N/A
**Dependencies:** N/A
**Twitter handle:** @TigerGraphDB
8 months ago
Christophe Bornet 8da34118bc
docs: Add documentation for Cassandra Document Loader (#16282) 8 months ago
Alireza Kashani d1b4ead87c
community[patch]: Update grobid.py (#16298)
there is a case where "coords" does not exist in the "sentence"
therefore, the "split(";")" will lead to error.

we can fix that by adding "if sentence.get("coords") is not None:" 

the resulting empty "sbboxes" from this scenario will raise error at
"sbboxes[0]["page"]" because sbboxes are empty.

the PDF from https://pubmed.ncbi.nlm.nih.gov/23970373/ can replicate
those errors.
8 months ago