Commit Graph

1598 Commits (0660c06cf15260570f5e7a51e29e0d9d2e52ec7d)

Author SHA1 Message Date
Bagatur 25b1d65305
bump 315 (#11850) 9 months ago
Nuno Campos 4321d192ea
Use a less specific return type for | on Runnables (#11762)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Harrison Chase a506302772
bearly tool (#11812) 9 months ago
Harrison Chase 4a2f0c51a1
use get_llm_cache and set_llm_cache (#11741)
Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Harrison Chase f3ad22e64a
pipe default key (#11788) 9 months ago
Eugene Yurtsev 0d37b4c27d
Add python,pandas,xorbits,spark agents to experimental (#11774)
See for contex
https://github.com/langchain-ai/langchain/discussions/11680
9 months ago
Michael Feil 233a904f2e
GradientLLM Docs update and model_id renaming. (#10963)
Related to #10800 

- Errors in the Docstring of GradientLLM / Gradient.ai LLM
- Renamed the `model_id` to `model` and adapting this in all tests.
Reason to so is to be in Sync with `GradientEmbeddings` and other LLM's.
- inmproving tests so they check the headers in the sent request.
- making the aiosession a private attribute in the docs, as in the
future `pip install gradientai` will be replacing aiosession.
- adding a example how to fine-tune on the Prompt Template as suggested
in #10800
9 months ago
Bagatur 1559ba4bfc
fix upstash test import (#11781) 9 months ago
Leonid Kuligin 9f0a718198
added candidate_count for Vertex models (#11729)
- **Description:** added support for `candidate_count` parameter on
Vertex
9 months ago
David 9d200e6cbe
Create ChatEverlyAI (#11357)
- Description: Adds the ChatEverlyAI class with llama-2 7b on [EverlyAI
Hosted
Endpoints](https://everlyai.xyz/)
- It inherits from ChatOpenAI and requires openai (probably unnecessary
but it made for a quick and easy implementation)

---------

Co-authored-by: everly-studio <127131037+everly-studio@users.noreply.github.com>
9 months ago
Hristo G 7fb25b4154
Add graceful fallback for ES vectorstore when content field is missing (#11726)
- **Description:**
- If the Elasticsearch field used for Langchain > Document.page_content
is missing because the specific document is
        somehow malformed fail gracefully.

  - **Tag maintainer:** 
    - @joemcelroy
9 months ago
Bagatur f06fcde0d7
rm duplicate zilliz import (#11777) 9 months ago
Bagatur a3330c4258
bump 314 (#11773) 9 months ago
Erick Friis 1861cc7100
General anthropic functions, steps towards experimental integration tests (#11727)
To match change in js here
https://github.com/langchain-ai/langchainjs/pull/2892

Some integration tests need a bit more work in experimental:
![Screenshot 2023-10-12 at 12 02 49
PM](https://github.com/langchain-ai/langchain/assets/9557659/262d7d22-c405-40e9-afef-669e8d585307)

Pretty sure the sqldatabase ones are an actual regression or change in
interface because it's returning a placeholder.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Nuno Campos 17c69678ab
Revert "New add Baichuan Model" (#11761)
Reverts langchain-ai/langchain#11714

This has linting and formatting issues, plus it's added to chat models
folder but doesn't subclass Chat Model base class
9 months ago
cloudscool 56653c53aa
New add Baichuan Model (#11714)
Motivation and Context
At present, the Baichuan Large Language Model is relatively popular and
efficient in performance. Due to widespread market recognition, this
model has been added to enhance the scalability of Langchain's ability
to access the big language model, so as to facilitate application access
and usage for interested users.

System Info
langchain: 0.0.295
python:3.8.3
IDE:vs code

Description
Add the following files:

1. Add baichuan_baichuaninc_endpoint.py in the
libs/langchain/langchain/chat_models
2. Modify the __init__.py file,which is located in the
libs/langchain/langchain/chat_models/__init__.py:
a. Add "from langchain.chat_models.baichuan_baichuaninc_endpoint import
BaichuanChatEndpoint"
    b. Add "BaichuanChatEndpoint" In the file's __ All__  method

Your contribution
I am willing to help implement this feature and submit a PR, but I would
appreciate guidance from the maintainers or community to ensure the
changes are made correctly and in line with the project's standards and
practices.
9 months ago
Yang, Bo 9e1e0f54d2
Add `TrainableLLM` (#11721)
- **Description:** Add `TrainableLLM` for those LLM support fine-tuning
  - **Tag maintainer:** @hwchase17

This PR add training methods to `GradientLLM`

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Burak Yılmaz 63e516c2b0
Upstash redis integration (#10871)
- **Description:** Introduced Upstash provider with following wrappers:
UpstashRedisCache, UpstashRedisEntityStore,
UpstashRedisChatMessageHistory, UpstashRedisStore
  - **Issue:** -,
  - **Dependencies:** upstash-redis python package is needed,
  - **Tag maintainer:** @baskaryan 
  - **Twitter handle:** @BurakY744

---------

Co-authored-by: Burak Yılmaz <burakyilmaz@Buraks-MacBook-Pro.local>
Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Bagatur a9db2b0b92
fix tongyi import (#11745) 9 months ago
Aaron Pham 6c61315067
fix(openllm): update with newer remote client implementation (#11740)
cc @baskaryan

---------

Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>
9 months ago
Richy Wang 11cdfe44af
Implement Alibaba Tongyi chat model apis. (#10922)
Hi there
This PR is aim to implement chat model for Alibaba Tongyi LLM model. It
contains work below:
1.Implement ChatTongyi chat model in langchain.chat_models.tongyi. Note
this is different with tongyi llm model to another PR
https://github.com/langchain-ai/langchain/pull/10878.
For detail it implements _generate() and _stream() function in
ChatTongyi.
2. Add some examples in chat/tongyi.ipynb. 
3. Add integration test in chat_models/test_tongyi.py 

Note async completion for the Text API is not yet supported.
Dependencies: dashscope. It will be installed manually cause it is not
need by everyone.
9 months ago
Adam Demjen 008348ce71
Add ElasticsearchChatMessageHistory (#10932)
**Description**

This PR adds the `ElasticsearchChatMessageHistory` implementation that
stores chat message history in the configured
[Elasticsearch](https://www.elastic.co/elasticsearch/) deployment.

```python
from langchain.memory.chat_message_histories import ElasticsearchChatMessageHistory

history = ElasticsearchChatMessageHistory(
    es_url="https://my-elasticsearch-deployment-url:9200", index="chat-history-index", session_id="123"
)

history.add_ai_message("This is me, the AI")
history.add_user_message("This is me, the human")
```

**Dependencies**
- [elasticsearch client](https://elasticsearch-py.readthedocs.io/)
required

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Jonathan Soma 48cf978391
Allow placeholders in OpenAPI endpoints #2938 (#2940)
Use regex matches when checking endpoints instead of exact matches.
`{varname}` becomes `.*`

Fixes #2938

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Predrag Gruevski 9e32120cbb
Deprecate direct access to globals like `debug` and `verbose`. (#11311)
Instead of accessing `langchain.debug`, `langchain.verbose`, or
`langchain.llm_cache`, please use the new getter/setter functions in
`langchain.globals`:
- `langchain.globals.set_debug()` and `langchain.globals.get_debug()`
- `langchain.globals.set_verbose()` and
`langchain.globals.get_verbose()`
- `langchain.globals.set_llm_cache()` and
`langchain.globals.get_llm_cache()`

Using the old globals directly will now raise a warning.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
9 months ago
Richard Adams 35965df20d
Rspace doc loader (#11511)
**Description:**

Add a document loader for the RSpace Electronic Lab Notebook
(www.researchspace.com), so that scientific documents and research notes
can be easily pulled into Langchain pipelines.

**Issue**

This is an new contribution, rather than an issue fix.

 **Dependencies:** 
  
There are no new required dependencies.
In order to use the loader, clients will need to install rspace_client
SDK using `pip install rspace_client`

---------

Co-authored-by: richarda23 <richard.c.adams@infinityworks.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Ryan Zotti 9d1867c77f
Update docs to specify Indexing-API-compatible vectorstores (#11581)
**Description:** Update Indexing API docs to specify vectorstores that
are compatible with the Indexing API. I add a unit test to remind
developers to update the documentation whenever they add or change a
vectorstore in a way that affects compatibility. For the unit test I
repurposed existing code from
[here](https://github.com/langchain-ai/langchain/blob/v0.0.311/libs/langchain/langchain/indexes/_api.py#L245-L257).

This is my first PR to an open source project. This is a trivially
simple PR whose main purpose is to make me more comfortable submitting
Langchain PRs. If this PR goes through I plan to submit PRs with more
substantive changes in the near future.

**Issue:** Resolves
[10482](https://github.com/langchain-ai/langchain/discussions/10482).

**Dependencies:** No new dependencies.

**Twitter handle:** None.
9 months ago
Richard Wang 6402c33299
Let Notion document loader support utf-8 and make it default. (#10613)
Use utf-8 encoding by default
9 months ago
Bagatur bd74eba152
add azure openai sched tests (#11723) 9 months ago
Bagatur 9c0584be74
bump 313 (#11718) 9 months ago
sudranga 361f8e1bc6
Add MMR functionality to elasticsearch retriever (#11633)
Allows MMR functionality only for the case where we have access to the
embedding function. Also allows for users to request for fields from
elasticsearch store. These are added to the document metadata.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Dmitry Tyumentsev ead9d5b55c
Add yandex stt parser (#11435)
Description: Introducing an ability to load a transcription document of
audio file using [Yandex
SpeechKit](https://cloud.yandex.com/en-ru/services/speechkit)
Issue: None
Dependencies: yandex-speechkit
Tag maintainer: @rlancemartin, @eyurtsev
9 months ago
Janos Tolgyesi 15687a28d5
Use correct tokenizer for Bedrock/Anthropic LLMs (#11561)
**Description**

This PR implements the usage of the correct tokenizer in Bedrock LLMs,
if using anthropic models.

**Issue:** #11560

**Dependencies:** optional dependency on `anthropic` python library.

**Twitter handle:** jtolgyesi


---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
kYLe 467b082c34
Modify Anyscale integration to work with Anyscale Endpoint (#11569)
**Description:** Modify Anyscale integration to work with [Anyscale
Endpoint](https://docs.endpoints.anyscale.com/)
and it supports invoke, async invoke, stream and async invoke features

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
plpycoin 51193309ea
Update readthedocs.py (#11110)
Only parse .html files
.svg .png favicon.ico will crash processing phase

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
9 months ago
Nuno Campos ca9de26f2b
Add callback function to RunnablePassthrough (#11564)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
9 months ago
Nuno Campos 7f4734c0dd
Add deploy command to repos generated by cli template (#11711)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
9 months ago
Nuno Campos 1c0857b53e
Fix default impl of aparse_result (#11702)
Should delegate to parse_result, not to aparse, as parse_result is a
method that some output parsers override

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
9 months ago
nuric 44da27c07b
Add SemaDB VST wrapper (#11484)
- **Description**: Adding vectorstore wrapper for
[SemaDB](https://rapidapi.com/semafind-semadb/api/semadb).
- **Issue**: None
- **Dependencies**: None
- **Twitter handle**: semafind

Checks performed:
- [x] `make format`
- [x] `make lint`
- [x] `make test`
- [x] `make spell_check`
- [x] `make docs_build`

Documentation added:

- SemaDB vectorstore wrapper tutorial
9 months ago
hsuyuming 0b743f005b
Feature/enhance huggingfacepipeline to handle different return type (#11394)
**Description:** Avoid huggingfacepipeline to truncate the response if
user setup return_full_text as False within huggingface pipeline.

**Dependencies:** : None
**Tag maintainer:**   Maybe @sam-h-bean ?

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Leonid Kuligin 2aba9ab47e
Retriever based on GCP DocAI Warehouse (#11400)
- **Description:** implements a retriever on top of DocAI Warehouse (to
interact with existing enterprise documents)
  https://cloud.google.com/document-ai-warehouse?hl=en
  - **Issue:** new functionality
 
@baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Erick Friis a477ddda45
Langsmith in readme update (#11497) 9 months ago
Leonid Kuligin 9e81ab47be
Added a better error description if processor name is wrong. (#11488)
Replace this entire comment with:
  - **Description:** added a better error description for this error
  - **Issue:** #11407 
  
  @baskaryan
9 months ago
Robert Yi e75766b759
fix: incorrect arguments in clickhouse docstring (#11693)
fix docstring for clickhouse
9 months ago
Eugene Yurtsev 17b5090c18
Add `type` to Agent actions (#11682)
Add `type` to agent actions.
9 months ago
April c14a8df2ee
wrap confluence attachment processing with a try-except block (#11503)
Prevents document loading from erroring out when an attachment is not
found at the url.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
eajechiloae 4ba2c8ba75
Fix ClearML callback (#11472)
Handle different field names in dicts/dataframes, fixing the ClearML
callback.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Lawrence Wu 93bb19f69a
Fix chains/loading.py error messages (#11688)
- **Description:** make the error messages consistent in
chains/loading.py
  - **Dependencies:** None
9 months ago
Harrison Chase 18ebce2032
fix tool async (#11689) 9 months ago
sudranga 9beb03e771
11474 (#11519)
No relevant documents may be found for a given question. In some use
cases, we could directly respond with a fixed message instead of doing
an LLM call with an empty context. This PR exposes this as an option:
response_if_no_docs_found.

---------

Co-authored-by: Sudharsan Rangarajan <sudranga@nile-global.com>
9 months ago
Joaquin Menendez ef99b06362
feature: add metadata information into the embedding file before uplo… (#11553)
Replace this entire comment with:
- **Description:** In this modified version of the function, if the
metadatas parameter is not None, the function includes the corresponding
metadata in the JSON object for each text. This allows the metadata to
be stored alongside the text's embedding in the vector store.
  - 
  - **Issue:** #10924
  - **Dependencies:** None
  - **Tag maintainer:** @hwchase17
@agola11
  - **Twitter handle:** @MelliJoaco

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Marcin Wątroba 51a3a86022
#11655 Add SQLAlchemyMd5Cache implementation (#11660)
- **Description:** Add SQLAlchemyMd5Cache implementation, 
  - **Issue:** the issue # #11655,
  - **Dependencies:** no deps,
  - **Tag maintainer:** @markowanga

---------

Co-authored-by: Marcin Wątroba <marcin.watroba@pwr.edu.pl>
Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Suresh Kumar Ponnusamy 70f7558db2
langchain-experimental: Add allow_list support in experimental/data_anonymizer (#11597)
- **Description:** Add allow_list support in langchain experimental
data-anonymizer package
  - **Issue:** no
  - **Dependencies:** no
  - **Tag maintainer:** @hwchase17
  - **Twitter handle:**
9 months ago
wemysschen 2363c02cf3
Bos loader (#11525)
**Description:**
Add  BaiduCloud BOS document loader.

---------

Co-authored-by: chenweixu01 <chenweixu01@baidu.com>
Co-authored-by: root <root@icoding-cwx.bcc-szzj.baidu.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Kwanghoon Choi fbb82608cd
Fixed a bug in reporting Python code validation (#11522)
- **Description:** fixed a bug in pal-chain when it reports Python
    code validation errors. When node.func does not have any ids, the
    original code tried to print node.func.id in raising ValueError.
- **Issue:** n/a,
- **Dependencies:** no dependencies,
- **Tag maintainer:** @hazzel-cn, @eyurtsev
- **Twitter handle:** @lazyswamp

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Harrison Chase 9f39c23a13
add input type for convo retrieval chain (#11679) 9 months ago
zhaozhiming d5e762d328
fix: Change the docs of JSONAgentOutputParser (#11594)
I am merely making some minor adjustments to the function documentation.
I hope to provide a small assistance to LangChain.
- **Description:** Change the docs of JSONAgentOutputParser. It will be
`JSON` better,
  - **Issue:** no,
  - **Dependencies:** no,
  - **Tag maintainer:** @hwchase17,
  - **Twitter handle:** Not worth mentioning.
9 months ago
Vinay Kakade dd0cd98861
Add support for ChatOpenAI models in Infino callback handler (#11608)
**Description:** This PR adds support for ChatOpenAI models in the
Infino callback handler. In particular, this PR implements
`on_chat_model_start` callback, so that ChatOpenAI models are supported.
With this change, Infino callback handler can be used to track latency,
errors, and prompt tokens for ChatOpenAI models too (in addition to the
support for OpenAI and other non-chat models it has today). The existing
example notebook is updated to show how to use this integration as well.
cc/ @naman-modi @savannahar68

**Issue:** https://github.com/langchain-ai/langchain/issues/11607 

**Dependencies:** None

**Tag maintainer:** @hwchase17 

**Twitter handle:** [@vkakade](https://twitter.com/vkakade)
9 months ago
Israel Ekpo d0603c86b6
Add Support for Azure Cosmos DB MongoDB vCore Vector Store #11627 (#11632)
This PR adds support for the Azure Cosmos DB MongoDB vCore Vector Store

https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/

https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/vector-search

Summary:
- **Description:** added vector store integration for Azure Cosmos DB
MongoDB vCore Vector Store,
  - **Issue:** the issue # it fixes #11627,
  - **Dependencies:** pymongo dependency,
  - **Tag maintainer:** @hwchase17,
  - **Twitter handle:** @izzyacademy

---------

Co-authored-by: Israel Ekpo <israel.ekpo@gmail.com>
Co-authored-by: Israel Ekpo <44282278+izzyacademy@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Erick Friis 28ee6a7c12
Track ChatFireworks time to first_token (#11672) 9 months ago
Eugene Yurtsev 539941281d
Fix output types for BaseChatModel (#11670)
* Should use non chunked messages for Invoke/Batch
* After this PR, stream output type is not represented, do we want to
use the union?

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
9 months ago
Eugene Yurtsev 99adcdb1c9
Add dedicated `type` attribute to be used solely for serialization purposes (#11585)
Adds standard `type` field for all messages that will be
serialized/validated by pydantic.

* The presence of `type` makes it easier for developers consuming
schemas to write client code to serialize/deserialize.
* In LangServe `type` will be used for both validation and will appear
in the generated openapi specs
9 months ago
eryk-dsai 06d5971be9
Fix issue #10985 - Skip model.to(device) if it is instantiated with bitsandbytes config (#11009)
Preventing error caused by attempting to move the model that was already
loaded on the GPU using the Accelerate module to the same or another
device. It is not possible to load model with Accelerate/PEFT to CPU for
now

Addresses:
[#10985](https://github.com/langchain-ai/langchain/issues/10985)
9 months ago
Nuno Campos 64969bc8ae
Add patch_config(configurable=) arg, make with_config(configurable=) merge it with existing (#11662)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
9 months ago
Harrison Chase ce0019b646
make utils conditional (#11646) 9 months ago
Harrison Chase 8f06085b24
make tools conditional (#11647) 9 months ago
Bassem Yacoube 5451b724fc
Adds support for llama2 and fixes MPT-7b url (#11465)
- **Description:** This is an update to OctoAI LLM provider that adds
support for llama2 endpoints hosted on OctoAI and updates MPT-7b url
with the current one.
@baskaryan
Thanks!

---------

Co-authored-by: ML Wiz <bassemgeorgi@gmail.com>
9 months ago
Todd Kerpelman 0bff399af1
Make metadata from the url_selenium loader match that of the web_base loader (#11617)
**Description:** I noticed the metadata returned by the url_selenium
loader was missing several values included by the web_base loader. (The
former returned `{source: ...}`, the latter returned `{source: ...,
title: ..., description: ..., language: ...}`.) This change fixes it so
both loaders return all 4 key value pairs.

Files have been properly formatted and all tests are passing. Note,
however, that I am not much of a python expert, so that whole "Adding
the imports inside the code so that tests pass" thing seems weird to me.
Please LMK if I did anything wrong.
9 months ago
Tarun Thotakura c9d4d53545
Fixed the assignment of custom_llm_provider argument (#11628)
- **Description:** Assigning the custom_llm_provider to the default
params function so that it will be passed to the litellm
- **Issue:** Even though the custom_llm_provider argument is being
defined it's not being assigned anywhere in the code and hence its not
being passed to litellm, therefore any litellm call which uses the
custom_llm_provider as required parameter is being failed. This
parameter is mainly used by litellm when we are doing inference via
Custom API server.
https://docs.litellm.ai/docs/providers/custom_openai_proxy
  - **Dependencies:** No dependencies are required

@krrishdholakia , @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Leonid Ganeline db67ccb0bb
docstrings cleanup (#11640)
Added missed docstrings. Some reformatting.
9 months ago
Yang, Bo 3a82bd7bdb
Use raise from statement so that users can find detailed error message (#11461)
- **Description:** Use `raise from` statement so that users can find
detailed error message
  - **Tag maintainer:** @baskaryan, @eyurtsev, @hwchase17
9 months ago
Nuno Campos 9a0ed75a95
Add configurable fields with options (#11601)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
9 months ago
Bagatur 7232e082de
bump 312 (#11621) 9 months ago
Eugene Yurtsev 58220cda72
Remove LLM Bash and related bash utilities (#11619)
Deprecate LLMBash and related bash utilities
9 months ago
Shubham Kushwaha 49de862076
Arcee.ai LLM & Retriever integration (#11579)
- **Description:** This PR introduces a new LLM and Retriever API to
https://arcee.ai for the python client
  - **Issue:** implements the integrations as requested in #11578 ,
  - **Dependencies:** no dependencies are required,
  - **Tag maintainer:** @hwchase17
  - **Twitter handle:** shwooobham 


** `make format`, `make lint` and `make test` runs locally.**
```shell
=========== 1245 passed, 277 skipped, 20 warnings in 16.26s ===========
./scripts/check_pydantic.sh .
./scripts/check_imports.sh
poetry run ruff .
[ "." = "" ] || poetry run black . --check
All done!  🍰 
1818 files would be left unchanged.
[ "." = "" ] || poetry run mypy .
Success: no issues found in 1815 source files
[ "." = "" ] || poetry run black .
All done!  🍰 
1818 files left unchanged.
[ "." = "" ] || poetry run ruff --select I --fix .
poetry run codespell --toml pyproject.toml
poetry run codespell --toml pyproject.toml -w
```


**Contributions**
1. Arcee (langchain/llms), ArceeRetriever (langchain/retrievers),
ArceeWrapper (langchain/utilities)
2. docs for Arcee (llms/arcee.py) and
ArceeRetriever(retrievers/arcee.py)
3.

cc: @jacobsolawetz @ben-epstein

---------

Co-authored-by: Shubham <shubham@sORo.local>
9 months ago
Eugene Yurtsev b56ca0c2a4
Deprecate LLMSymbolicMath from langchain core (#11615)
Deprecate LLMSymbolicMath from langchain core package.
9 months ago
Eugene Yurtsev c9bce5bbfb
Add version to langchain_experimental (#11613)
Add version to langchain experimental
9 months ago
Predrag Gruevski 22abeb9f6c
Disable loading jinja2 `PromptTemplate` from file. (#10252)
jinja2 templates are not sandboxed and are at risk for arbitrary code
execution. To mitigate this risk:
- We no longer support loading jinja2-formatted prompt template files.
- `PromptTemplate` with jinja2 may still be constructed manually, but
the class carries a security warning reminding the user to not pass
untrusted input into it.

Resolves #4394.
9 months ago
Nuno Campos c7c03d4709
Fix mutation bugs in callback manager configure (#11603)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
9 months ago
cccs-eric e2a9072b80
Fix CohereRerank configuration (#11583)
**Description:** CohereRerank is missing `cohere_api_key` as a field and
since extras are forbidden, it is not possible to pass-in the key. The
only way is to use an env variable named `COHERE_API_KEY`.

For example, if trying to create a compressor like this:
```python
cohere_api_key = "......Cohere api key......"
compressor = CohereRerank(cohere_api_key=cohere_api_key)
```
you will get the following error:
```
  File "/langchain/.venv/lib/python3.10/site-packages/pydantic/v1/main.py", line 341, in __init__
    raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for CohereRerank
cohere_api_key
  extra fields not permitted (type=value_error.extra)
```
9 months ago
Anar 55fef4b64b
implemented add files method in LLMRails (#11518)
This PR provides add files method with LLMRails. Implemented here are:

docs/extras/integrations/vectorstores/llm-rails.ipynb

---------

Co-authored-by: Anar Aliyev <aaliyev@mgmt.cloudnet.services>
9 months ago
Stephen Hankinson 316dddc7cd
fix wording of query_sql_database_tool_description (#11530)
- **Description:** Fixes minor typo for the
query_sql_database_tool_description in the db toolkit
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Tag maintainer:** @nfcampos 
  - **Twitter handle:** N/A
9 months ago
Ash Vardanian 1acfe86353
Accelerating Math Utils with SimSIMD (#11566)
LangChain relies on NumPy to compute cosine distances, which becomes a
bottleneck with the growing dimensionality and number of embeddings. To
avoid this bottleneck, in our libraries at
[Unum](https://github.com/unum-cloud), we have created a specialized
package - [SimSIMD](https://github.com/ashvardanian/simsimd), that knows
how to use newer hardware capabilities. Compared to SciPy and NumPy, it
reaches 3x-200x performance for various data types. Since publication,
several LangChain users have asked me if I can integrate it into
LangChain to accelerate their workflows, so here I am 🤗

## Benchmarking

To conduct benchmarks locally, run this in your Jupyter:

```py
import numpy as np
import scipy as sp
import simsimd as simd
import timeit as tt

def cosine_similarity_np(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    X_norm = np.linalg.norm(X, axis=1)
    Y_norm = np.linalg.norm(Y, axis=1)
    with np.errstate(divide="ignore", invalid="ignore"):
        similarity = np.dot(X, Y.T) / np.outer(X_norm, Y_norm)
    similarity[np.isnan(similarity) | np.isinf(similarity)] = 0.0
    return similarity

def cosine_similarity_sp(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    return 1 - sp.spatial.distance.cdist(X, Y, metric='cosine')

def cosine_similarity_simd(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    return 1 - simd.cdist(X, Y, metric='cosine')

X = np.random.randn(1, 1536).astype(np.float32)
Y = np.random.randn(1, 1536).astype(np.float32)
repeat = 1000

print("NumPy: {:,.0f} ops/s, SciPy: {:,.0f} ops/s, SimSIMD: {:,.0f} ops/s".format(
    repeat / tt.timeit(lambda: cosine_similarity_np(X, Y), number=repeat),
    repeat / tt.timeit(lambda: cosine_similarity_sp(X, Y), number=repeat),
    repeat / tt.timeit(lambda: cosine_similarity_simd(X, Y), number=repeat),
))
```

## Results

I ran this on an M2 Pro Macbook for various data types and different
number of rows in `X` and reformatted the results as a table for
readability:

| Data Type | NumPy | SciPy | SimSIMD |
| :--- | ---: | ---: | ---: |
| `f32, 1` | 59,114 ops/s | 80,330 ops/s | 475,351 ops/s |
| `f16, 1` | 32,880 ops/s | 82,420 ops/s | 650,177 ops/s |
| `i8, 1` | 47,916 ops/s | 115,084 ops/s | 866,958 ops/s |
| `f32, 10` | 40,135 ops/s | 24,305 ops/s | 185,373 ops/s |
| `f16, 10` | 7,041 ops/s | 17,596 ops/s | 192,058 ops/s |
| `f16, 10` | 21,989 ops/s | 25,064 ops/s | 619,131 ops/s |
| `f32, 100` | 3,536 ops/s | 3,094 ops/s | 24,206 ops/s |
| `f16, 100` | 900 ops/s | 2,014 ops/s | 23,364 ops/s |
| `i8, 100` | 5,510 ops/s | 3,214 ops/s | 143,922 ops/s |

It's important to note that SimSIMD will underperform if both matrices
are huge.
That, however, seems to be an uncommon usage pattern for LangChain
users.
You can find a much more detailed performance report for different
hardware models here:

- [Apple M2
Pro](https://ashvardanian.com/posts/simsimd-faster-scipy/#appendix-1-performance-on-apple-m2-pro).
- [4th Gen Intel Xeon
Platinum](https://ashvardanian.com/posts/simsimd-faster-scipy/#appendix-2-performance-on-4th-gen-intel-xeon-platinum-8480).
- [AWS Graviton
3](https://ashvardanian.com/posts/simsimd-faster-scipy/#appendix-3-performance-on-aws-graviton-3).
  
## Additional Notes

1. Previous version used `X = np.array(X)`, to repackage lists of lists.
It's an anti-pattern, as it will use double-precision floating-point
numbers, which are slow on both CPUs and GPUs. I have replaced it with
`X = np.array(X, dtype=np.float32)`, but a more selective approach
should be discussed.
2. In numerical computations, it's recommended to explicitly define
tolerance levels, which were previously avoided in
`np.allclose(expected, actual)` calls. For now, I've set absolute
tolerance to distance computation errors as 0.01: `np.allclose(expected,
actual, atol=1e-2)`.

---

  - **Dependencies:** adds `simsimd` dependency
  - **Tag maintainer:** @hwchase17
  - **Twitter handle:** @ashvardanian

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
benchello 5de64e6d60
Add option to specify metadata columns in CSV loader (#11576)
#### Description
This PR adds the option to specify additional metadata columns in the
CSVLoader beyond just `Source`.

The current CSV loader includes all columns in `page_content` and if we
want to have columns specified for `page_content` and `metadata` we have
to do something like the below.:
```
csv = pd.read_csv(
        "path_to_csv"
    ).to_dict("records")

documents = [
        Document(
            page_content=doc["content"],
            metadata={
                "last_modified_by": doc["last_modified_by"],
                "point_of_contact": doc["point_of_contact"],
            }
        ) for doc in csv
    ]
```
#### Usage
Example Usage:
```
csv_test  =  CSVLoader(
      file_path="path_to_csv", 
      metadata_columns=["last_modified_by", "point_of_contact"]
 )
```
Example CSV:
```
content, last_modified_by, point_of_contact
"hello world", "Person A", "Person B"
```

Example Result:
```
Document {
 page_content: "hello world"
 metadata: {
 row: '0',
 source: 'path_to_csv',
 last_modified_by: 'Person A',
 point_of_contact: 'Person B',
 }
```

---------

Co-authored-by: Ben Chello <bchello@dropbox.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Stephen Hankinson 447a523662
fix comments in output format (#11536)
- **Description:** Fixes the comments in the ConvoOutputParser. Because
the \\\\ is escaping a single \\, they render something like:
`"action_input": string \ The input to the action` in the prompt.
Changing this to \\\\\\\\ lets it escape two slashes so that it renders
a proper comment: `"action_input": string \\ The input to the action`
  - **Issue:** N/A
  - **Dependencies:** 
  - **Tag maintainer:** @hwchase17
  - **Twitter handle:**
9 months ago
Michael Landis 8e45f720a8
feat: add momento vector index as a vector store provider (#11567)
**Description**:

- Added Momento Vector Index (MVI) as a vector store provider. This
includes an implementation with docstrings, integration tests, a
notebook, and documentation on the docs pages.
- Updated the Momento dependency in pyproject.toml and the lock file to
enable access to MVI.
- Refactored the Momento cache and chat history session store to prefer
using "MOMENTO_API_KEY" over "MOMENTO_AUTH_TOKEN" for consistency with
MVI. This change is backwards compatible with the previous "auth_token"
variable usage. Updated the code and tests accordingly.

**Dependencies**:

- Updated Momento dependency in pyproject.toml.

**Testing**:

- Run the integration tests with a Momento API key. Get one at the
[Momento Console](https://console.gomomento.com) for free. MVI is
available in AWS us-west-2 with a superuser key.
- `MOMENTO_API_KEY=<your key> poetry run pytest
tests/integration_tests/vectorstores/test_momento_vector_index.py`

**Tag maintainer:**

@eyurtsev

**Twitter handle**:

Please mention @momentohq for this addition to langchain. With the
integration of Momento Vector Index, Momento caching, and session store,
Momento provides serverless support for the core langchain data needs.

Also mention @mlonml for the integration.
9 months ago
Eugene Yurtsev ca2eed36b7
LangChain cli fix a few bugs (#11573)
Code was assuming that `git` and `poetry` exist. In addition, it was not
ignoring pycache files that get generated during run time
9 months ago
Hugues Chocart 258ae1ba5f
[LLMonitor Callback Handler]: Add error handling (#11563)
Wraps every callback handler method in error handlers to avoid breaking
users' programs when an error occurs inside the handler.

Thanks @valdo99 for the suggestion 🙂
9 months ago
Eugene Yurtsev 2aabfafe1e
Module documentation for langchain runnables (#11550)
Add in code documentation for langchain runnables module.
9 months ago
Eugene Yurtsev d8fa94e6fa
RunnablePassthrough: In code documentation (#11552)
Add in code documentation for a runnable passthrough
9 months ago
Eugene Yurtsev b42f218cfc
RunnableLambda: Add in code docs (#11521)
Add in code docs for Runnable Lambda
9 months ago
maks-operlejn-ds f64522fbaf
Reset deanonymizer mapping (#11559)
@hwchase17 @baskaryan
9 months ago
maks-operlejn-ds b14b65d62a
Support all presidio entities (#11558)
https://microsoft.github.io/presidio/supported_entities/

@baskaryan @hwchase17
9 months ago
maks-operlejn-ds 4d62def9ff
Better deanonymizer matching strategy (#11557)
@baskaryan, @hwchase17
9 months ago
Ash Vardanian a992b9670d
Fix: Missing DuckDuckGo package version (#11535)
[The `duckduckgo-search` v3.9.2 was removed from
PyPi](https://pypi.org/project/duckduckgo-search/#history). That breaks
the build.

  - **Description:** refreshes the Poetry dependency to v3.9.3
  - **Tag maintainer:** @baskaryan
  - **Twitter handle:** @ashvardanian
9 months ago
Bagatur 8932ed3f07
bump 311 (#11555) 9 months ago
Bagatur e7a0def1bc
QoL improvements to query constructor (#11504)
updating query constructor and self query retriever to
- make it easier to pass in examples
- validate attributes used in query
- remove invalid parts of query
- make it easier to get + edit prompt
- make query constructor a runnable
- make self query retriever use as runnable
9 months ago
Taikono-Himazin eec53fa294
Added autodetect_encoding option to csvLoader (#11327) 9 months ago
Holt Skinner 09c66fe04f
feat: Update Google Document AI Parser (#11413)
- **Description:** Code Refactoring, Documentation Improvements for
Google Document AI PDF Parser
  - Adds Online (synchronous) processing option.
  - Adds default field mask to limit payload size.
  - Skips Human review by default.
- **Issue:** Fixes #10589

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
9 months ago
Nuno Campos 628cc4cce8
Rename RunnableMap to RunnableParallel (#11487)
- keep alias for RunnableMap
- update docs to use RunnableParallel and RunnablePassthrough.assign

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
9 months ago
Eugene Yurtsev 6a10e8ef31
Add documentation to Runnable (#11516) 9 months ago
William FH eb572f41a6
Add LangSmith Run Chat Loader (#11458) 9 months ago
David Duong 484947c492
Fetch up-to-date attributes for env-pulled kwargs during serialisation of OpenAI classes (#11499) 9 months ago
Bagatur 5470e730d2
raise openapi import error (#11495) 9 months ago
Erick Friis 29f5f70415
Rename some last hwchase17/langchain links (#11494) 9 months ago
Fabrice Pont 872836c541
feat: add markdown list parser (#11411)
**Description:** add `MarkdownListOutputParser` as a new
`ListOutputParser`
 **Issue:** #11410
9 months ago
Erick Friis 8f50b616c5
Remove optional from vectara source (#11493)
fyi @ofermend

---------

Co-authored-by: Ofer Mendelevitch <ofer@vectara.com>
Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>
9 months ago
Bagatur 53887242a1
bump 310 (#11486) 9 months ago
Jesús Vélez Santiago a1c7532298
Add async sql record manager and async indexing API (#10726)
- **Description:** Add support for a SQLRecordManager in async
environments. It includes the creation of `RecorManagerAsync` abstract
class.
- **Issue:** None
- **Dependencies:** Optional `aiosqlite`.
- **Tag maintainer:** @nfcampos 
- **Twitter handle:** @jvelezmagic

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
9 months ago
Qihui Xie 57ade13b2b
fix llm_inputs duplication problem in intermediate_steps in SQLDatabaseChain (#10279)
Use `.copy()` to fix the bug that the first `llm_inputs` element is
overwritten by the second `llm_inputs` element in `intermediate_steps`.

***Problem description:***
In [line 127](

c732d8fffd/libs/experimental/langchain_experimental/sql/base.py (L127C17-L127C17)),
the `llm_inputs` of the sql generation step is appended as the first
element of `intermediate_steps`:
```
            intermediate_steps.append(llm_inputs)  # input: sql generation
```

However, `llm_inputs` is a mutable dict, it is updated in [line
179](https://github.com/langchain-ai/langchain/blob/master/libs/experimental/langchain_experimental/sql/base.py#L179)
for the final answer step:
```
                llm_inputs["input"] = input_text
```
Then, the updated `llm_inputs` is appended as another element of
`intermediate_steps` in [line
180](c732d8fffd/libs/experimental/langchain_experimental/sql/base.py (L180)):
```
                intermediate_steps.append(llm_inputs)  # input: final answer
```

As a result, the final `intermediate_steps` returned in [line
189](c732d8fffd/libs/experimental/langchain_experimental/sql/base.py (L189C43-L189C43))
actually contains two same `llm_inputs` elements, i.e., the `llm_inputs`
for the sql generation step overwritten by the one for final answer step
by mistake. Users are not able to get the actual `llm_inputs` for the
sql generation step from `intermediate_steps`

Simply calling `.copy()` when appending `llm_inputs` to
`intermediate_steps` can solve this problem.
9 months ago
Florian d78f418c0d
Extract abstracts from Pubmed articles, even if they have no extra label (#10245)
### Description
This pull request involves modifications to the extraction method for
abstracts/summaries within the PubMed utility. A condition has been
added to verify the presence of unlabeled abstracts. Now an abstract
will be extracted even if it does not have a subtitle. In addition, the
extraction of the abstract was extended to books.

### Issue
The PubMed utility occasionally returns an empty result when extracting
abstracts from articles, despite the presence of an abstract for the
paper on PubMed. This issue arises due to the varying structure of
articles; some articles follow a "subtitle/label: text" format, while
others do not include subtitles in their abstracts. An example of the
latter case can be found at:
[https://pubmed.ncbi.nlm.nih.gov/37666905/](url)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Viktor Zhemchuzhnikov fd9da60aea
Add async support to SelfQueryRetriever (#10175)
### Description

SelfQueryRetriever is missing async support, so I am adding it.
I also removed deprecated predict_and_parse method usage here, and added
some tests.

### Issue
N/A

### Tag maintainer
Not yet

### Twitter handle
N/A
9 months ago
Theron Tau 35297ca0d3
Add feature for extracting images from pdf and recognizing text from images. (#10653)
**Description**

It is for #10423 that it will be a useful feature if we can extract
images from pdf and recognize text on them. I have implemented it with
`PyPDFLoader`, `PyPDFium2Loader`, `PyPDFDirectoryLoader`,
`PyMuPDFLoader`, `PDFMinerLoader`, and `PDFPlumberLoader`.
[RapidOCR](https://github.com/RapidAI/RapidOCR.git) is used to recognize
text on extracted images. It is time-consuming for ocr so a boolen
parameter `extract_images` is set to control whether to extract and
recognize. I have tested the time usage for each parser on my own laptop
thinkbook 14+ with AMD R7-6800H by unit test and the result is:

| extract_images | PyPDFParser | PDFMinerParser | PyMuPDFParser |
PyPDFium2Parser | PDFPlumberParser |
| ------------- | ------------- | ------------- | ------------- |
------------- | ------------- |
| False | 0.27s | 0.39s | 0.06s | 0.08s | 1.01s |
| True  | 17.01s  | 20.67s | 20.32s | 19,75s | 20.55s |

**Issue**

#10423 

**Dependencies**

rapidocr_onnxruntime in
[RapidOCR](https://github.com/RapidAI/RapidOCR/tree/main)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Bagatur 8e3fbc97ca
Add vowpal_wabbit RL chain (#11462) 9 months ago
Haris Wang f1269830a0
Fix bug in MarkdownHeaderTextSplitter for codeblock (#10262)
- Description: The previous version of the MarkdownHeaderTextSplitter
did not take into account the possibility of '#' appearing within code
blocks, which caused segmentation anomalies in these situations. This PR
has fixed this issue.
  - Issue: 
  - Dependencies: No
  - Tag maintainer: 
  - Twitter handle: 

cc @baskaryan @eyurtsev  @rlancemartin

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Eddie Cohen 656d2303f7
add in, nin for pinecone (#10303)
Description: Adds the in and nin comparators for pinecone seen
[here](https://docs.pinecone.io/docs/metadata-filtering#metadata-query-language)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Bagatur a3a2ce623e Revise vowpal_wabbit notebook 9 months ago
Bagatur 8fafa1af91 merge 9 months ago
olgavrou 3b07c0cf3d
RL Chain with VowpalWabbit (#10242)
- Description: This PR adds a new chain `rl_chain.PickBest` for learned
prompt variable injection, detailed description and usage can be found
in the example notebook added. It essentially adds a
[VowpalWabbit](https://github.com/VowpalWabbit/vowpal_wabbit) layer
before the llm call in order to learn or personalize prompt variable
selections.

Most of the code is to make the API simple and provide lots of defaults
and data wrangling that is needed to use Vowpal Wabbit, so that the user
of the chain doesn't have to worry about it.

- Dependencies:
[vowpal-wabbit-next](https://pypi.org/project/vowpal-wabbit-next/),
     - sentence-transformers (already a dep)
     - numpy (already a dep)
  - tagging @ataymano who contributed to this chain
  - Tag maintainer: @baskaryan
  - Twitter handle: @olgavrou


Added example notebook and unit tests
9 months ago
Manikanta5112 56048b909f
added ContentFormatter escape special characters for message content (#10319)
---------

Co-authored-by: Manikanta5112 <42089393+mani5112@users.noreply.github.com>
9 months ago
Leonid Ganeline d17416ec79
docstrings `callbacks` (#11456)
Added missed docstrings to the `callbacks/`

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
9 months ago
Ofer Mendelevitch 3c7653bf0f
"source" argument in constructor of Vectara (#11454)
Replace this entire comment with:
- **Description:** minor update to constructor to allow for
specification of "source"
  - **Tag maintainer:** @baskaryan
  - **Twitter handle:** @ofermend
9 months ago
Eugene Yurtsev d9018ae5f1
Improve CLI ux (#11452)
Improve UX for cli
9 months ago
Jaikanth J 9f85f7c543
fix(cache): use dumps for RedisCache (#10408)
# Description
Attempts to fix RedisCache for ChatGenerations using `loads` and `dumps`
used in SQLAlchemy cache by @hwchase17 . this is better than pickle
dump, because this won't execute any arbitrary code during
de-serialisation.

# Issues
#7722 & #8666 

# Dependencies
None, but removes the warning introduced in #8041 by @baskaryan

Handle: @jaikanthjay46
9 months ago
rodrigo-clickup 5944c1851b
Add ClickUp Toolkit (#10662)
- **Description:** Adds a toolkit to interact with the
[ClickUp](https://clickup.com/) [Public API](https://clickup.com/api/)
- **Dependencies:** None
- **Tag maintainer:** @rodrigo-georgian, @rodrigo-clickup,
@aiswaryasankarwork
- **Twitter handle:** 
- Aiswarya (https://twitter.com/Aiswarya_Sankar,
https://www.linkedin.com/in/sankaraiswarya/)
   - Rodrigo (https://www.linkedin.com/in/rodrigo-ceballos-lentini/)


---------

Co-authored-by: Aiswarya Sankar <aiswaryasankar@Aiswaryas-MacBook-Pro.local>
Co-authored-by: aiswaryasankarwork <143119412+aiswaryasankarwork@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
John Reynolds 68901e1e40
Update output_parser.py (#10430)
- Description: Updated output parser for mrkl to remove any
hallucination actions after the final answer; this was encountered when
using Anthropic claude v2 for planning; reopening PR with updated unit
tests
- Issue: #10278 
- Dependencies: N/A
- Twitter handle: @johnreynolds
9 months ago
Joshua Sundance Bailey 790010703b
ArcGISLoader: Limit number of results in query (#10615)
Description: this PR changes the `ArcGISLoader` to set
`return_all_records` to `False` when `result_record_count` is provided
as a keyword argument. Previously, `return_all_records` was `True` by
default and this made the API ignore `result_record_count`.

Issue: `ArcGISLoader` would ignore `result_record_count` unless user
also passed `return_all_records=False`.
9 months ago
mrbean 9903a70379
Add youdotcom retriever (#11304)
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
ashish-dahal 1655ff2ded
Fix PyMuPDFLoader kwargs (#11434)
- **Description:** Fix the `PyMuPDFLoader` to accept `loader_kwargs`
from the document loader's `loader_kwargs` option. This provides more
flexibility in formatting the output from documents.

- **Issue:** The `loader_kwargs` is not passed into the `load` method
from the document loader, which limits configuration options.

- **Dependencies:**  None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Leonid Kuligin e4a46747dc
integration test for DocAI parser (#11424)
- **Description:** added an integration test
  - **Issue:** #11407 

@baskaryan
9 months ago
Aashish Saini 2abbdc6ecb
Update bageldb.py (#11421)
I have restructured the code to ensure uniform handling of ImportError.
In place of previously used ValueError, I've adopted the standard
practice of raising ImportError with explanatory messages. This
modification enhances code readability and clarifies that any problems
stem from module importation.
9 months ago
maks-operlejn-ds 2aae1102b0
Instance anonymization (#10501)
### Description

Add instance anonymization - if `John Doe` will appear twice in the
text, it will be treated as the same entity.
The difference between `PresidioAnonymizer` and
`PresidioReversibleAnonymizer` is that only the second one has a
built-in memory, so it will remember anonymization mapping for multiple
texts:

```
>>> anonymizer = PresidioAnonymizer()
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Noah Rhodes. Hi Noah Rhodes!'
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Brett Russell. Hi Brett Russell!'
```
```
>>> anonymizer = PresidioReversibleAnonymizer()
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Noah Rhodes. Hi Noah Rhodes!'
>>> anonymizer.anonymize("My name is John Doe. Hi John Doe!")
'My name is Noah Rhodes. Hi Noah Rhodes!'
```

### Twitter handle
@deepsense_ai / @MaksOpp

### Tag maintainer
@baskaryan @hwchase17 @hinthornw

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Kyle Pancamo 203258b4d6
Update pdf.py comment for PyPDFLoader (#10495)
PyPDF does not chunk at the character level to my understanding.

Description: PyPDF does not chunk at the character level, but instead
breaks up content by page. Fixup comment

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Juan Daza 4236ae3851
Added Streaming Capability to SageMaker LLMs (#10535)
This PR adds the ability to declare a Streaming response in the
SageMaker LLM by leveraging the `invoke_endpoint_with_response_stream`
capability in `boto3`. It is heavily based on the AWS Blog Post
announcement linked
[here](https://aws.amazon.com/blogs/machine-learning/elevating-the-generative-ai-experience-introducing-streaming-support-in-amazon-sagemaker-hosting/).

It does not add any additional dependencies since it uses the existing
`boto3` version.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Laurentiu Piciu d9670a5945
openai_functions_multi_agent: solved the case when the "arguments" is valid JSON but it does not contain `actions` key (#10543)
Description: There are cases when the output from the LLM comes fine
(i.e. function_call["arguments"] is a valid JSON object), but it does
not contain the key "actions". So I split the validation in 2 steps:
loading arguments as JSON and then checking for "actions" in it.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Eugene Yurtsev fcccde406d
Add SymbolicMathChain to experiment in preparation for deprecation (#11129)
Move symbolic math chain to experimental
9 months ago
Holt Skinner 9f73fec057
fix: Update Google Cloud Enterprise Search to Vertex AI Search (#10513)
- Description: Google Cloud Enterprise Search was renamed to Vertex AI
Search
-
https://cloud.google.com/blog/products/ai-machine-learning/vertex-ai-search-and-conversation-is-now-generally-available
- This PR updates the documentation and Retriever class to use the new
terminology.
- Changed retriever class from `GoogleCloudEnterpriseSearchRetriever` to
`GoogleVertexAISearchRetriever`
- Updated documentation to specify that `extractive_segments` requires
the new [Enterprise
edition](https://cloud.google.com/generative-ai-app-builder/docs/about-advanced-features#enterprise-features)
to be enabled.
  - Fixed spelling errors in documentation.
- Change parameter for Retriever from `search_engine_id` to
`data_store_id`
- When this retriever was originally implemented, there was no
distinction between a data store and search engine, but now these have
been split.
- Fixed an issue blocking some users where the api_endpoint can't be set
9 months ago
Patrick Randell 1d678f805f
Additional Weaviate Filter Comparators (#10522)
### Description
When using Weaviate Self-Retrievers, certain common filter comparators
generated by user queries were unimplemented, resulting in errors. This
PR implements some of them. All linting and format commands have been
run and tests passed.
### Issue
#10474
### Dependencies
timestamp module

---------

Co-authored-by: Patrick Randell <prandell@deloitte.com.au>
9 months ago
Nuno Campos 79011f835f
Remove str() from RunnableConfigurableAlternatives (#11446) 9 months ago
Harrison Chase 31d5bd84d7
make vectorstores optional (#11393) 9 months ago
Eugene Yurtsev 8aa545901a
Update agent type docs (#11137)
In code docs for agent types
9 months ago
Eugene Yurtsev 3e31d6e35f
Start deprecation of LLMBashChain (#11300)
In preparation for migration LLMBashChain and related tools add a
derprecation warning to the code.
9 months ago
Bagatur 8b6b8bf68c
bump 309 (#11443) 9 months ago
billytrend-cohere 2ff91a46c0
Add cohere /chat integration (#11389)
Add cohere /chat integration and an iPython notebook to demonstrate the
addition.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
adrienohana ca346011b7
added interactive login for azure cognitive search vector store (#11360)
**Description:** Previously if the access to Azure Cognitive Search was
not done via an API key, the default credential was called which doesn't
allow to use an interactive login. I simply added the option to use
"INTERACTIVE" as a key name, and this will launch a login window upon
initialization of the AzureSearch object.
9 months ago
Eugene Yurtsev 5a1f614175
Add docker compose to CLI (#11406)
Add docker compose to cli
9 months ago
Predrag Gruevski e2d6c41177
Upgrade langchain dependencies. (#11420)
I was hoping this would pick up numpy 1.26, which is required to support
the new Python 3.12 release, but it didn't. It seems that some
transitive dependency requirement on numpy is preventing that, and the
highest we can currently go is 1.24.x.

But to find this out required a 15min `poetry lock`, so I figured we
might as well upgrade the dependencies we can and hopefully make the
next dependency upgrade a bit smaller.
9 months ago
Jacob Lee 71fd6428c5
Remove overridden async not implemented method on embeddings filters and add default async implementation for document compressors (#11415)
@nfcampos @eyurtsev @baskaryan

---------

Co-authored-by: Nuno Campos <nuno@boringbits.io>
9 months ago
Nuno Campos 2f490be09b
Fix .dict() for agent/chain (#11436)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
9 months ago
Nuno Campos 1e59c44d36
Nc/5oct/runnable release (#11428)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
9 months ago
Bagatur 58b7a3ba16
Rm bedrock anthropic error (#11403) 9 months ago
Predrag Gruevski c9986bc3a9
Tweak type hints to match dependency's behavior. (#11355)
Needs #11353 to merge first, and a new `langchain` to be published with
those changes.
9 months ago
William FH 940b9ae30a
Normalize Option in Scoring Chain (#11412) 9 months ago
Eugene Yurtsev 70be04a816
CLI: Readme update (#11404)
Consolidating to a single README for now, will be easier to maintain we
can differentiate between poetry and pip later. Does not seem critical.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
9 months ago
Nuno Campos fde19c8667
Add CLI command to create a new project (#7837)
First version of CLI command to create a new langchain project template

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
9 months ago
mhwang-stripe 9cea796671
Make langchain compatible with SQLAlchemy<1.4.0 (#11390)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

## Description
Currently SQLAlchemy >=1.4.0 is a hard requirement. We are unable to run
`from langchain.vectorstores import FAISS` with SQLAlchemy <1.4.0 due to
top-level imports, even if we aren't even using parts of the library
that use SQLAlchemy. See Testing section for repro. Let's make it so
that langchain is still compatible with SQLAlchemy <1.4.0, especially if
we aren't using parts of langchain that require it.

The main conflict is that SQLAlchemy removed `declarative_base` from
`sqlalchemy.ext.declarative` in 1.4.0 and moved it to `sqlalchemy.orm`.
We can fix this by try-catching the import. This is the same fix as
applied in https://github.com/langchain-ai/langchain/pull/883.

(I see that there seems to be some refactoring going on about isolating
dependencies, e.g.
c87e9fb2ce,
so if this issue will be eventually fixed by isolating imports in
langchain.vectorstores that also works).

## Issue
I can't find a matching issue.

## Dependencies
No additional dependencies

## Maintainer
@hwchase17 since you reviewed
https://github.com/langchain-ai/langchain/pull/883

## Testing
I didn't add a test, but I manually tested this.

1. Current failure:
```
langchain==0.0.305
sqlalchemy==1.3.24
```

``` python
python -i
>>> from langchain.vectorstores import FAISS
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/pay/src/zoolander/vendor3/lib/python3.8/site-packages/langchain/vectorstores/__init__.py", line 58, in <module>
    from langchain.vectorstores.pgembedding import PGEmbedding
  File "/pay/src/zoolander/vendor3/lib/python3.8/site-packages/langchain/vectorstores/pgembedding.py", line 10, in <module>
    from sqlalchemy.orm import Session, declarative_base, relationship
ImportError: cannot import name 'declarative_base' from 'sqlalchemy.orm' (/pay/src/zoolander/vendor3/lib/python3.8/site-packages/sqlalchemy/orm/__init__.py)
```

2. This fix:
```
langchain==<this PR>
sqlalchemy==1.3.24
```

``` python
python -i
>>> from langchain.vectorstores import FAISS
<succeeds>
```
9 months ago
Nuno Campos 4d66756d93
Improve output of Runnable.astream_log() (#11391)
- Make logs a dictionary keyed by run name (and counter for repeats)
- Ensure no output shows up in lc_serializable format
- Fix up repr for RunLog and RunLogPatch

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
9 months ago
Lester Solbakken a30f98f534
Add Vespa vector store (#11329)
Addition of Vespa vector store integration including notebook showing
its use.

Maintainer: @lesters 
Twitter handle: LesterSolbakken
9 months ago
Nuno Campos 58a88f3911
Add optional input_types to prompt template (#11385)
- default MessagesPlaceholder one to list of messages

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
9 months ago
Tomaz Bratanic 71290315cf
Add optional Cypher validation tool (#11078)
LLMs have trouble with consistently getting the relationship direction
accurately. That's why I organized a competition how to best and most
simple to fix it based on the existing schema as a post-processing step.
https://github.com/tomasonjo/cypher-direction-competition

I am adding the winner's code in this PR:
https://github.com/sakusaku-rich/cypher-direction-competition
9 months ago
Bagatur dd514c2781
bump 308 (#11383) 9 months ago
Leonid Kuligin 4f4e0f38fc
a better error description when GCP project is not set (#11377)
- **Description:** a little bit better error description
  - **Issue:** #10879
9 months ago
Nuno Campos 0d80226c64
Add _type to json functions output parser (#11381)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
9 months ago
Bagatur 106608bc89
add default async (#11141) 9 months ago
Nuno Campos b0893c7c6a
Use an enum for configurable_alternatives to make the generated json schema nicer (#11350) 9 months ago
Bagatur b499de2926
Anthropic system message fix (#11301)
Removes human prompt prefix before system message for anthropic models

Bedrock anthropic api enforces that Human and Assistant messages must be
interleaved (cannot have same type twice in a row). We currently treat
System Messages as human messages when converting messages -> string
prompt. Our validation when using Bedrock/BedrockChat raises an error
when this happens. For ChatAnthropic we don't validate this so no error
is raised, but perhaps the behavior is still suboptimal
9 months ago
Massimiliano Angelino 2f83350eac
Feat bedrock cohere support (#11230)
**Description:**
Added support for Cohere command model via Bedrock.
With this change it is now possible to use the `cohere.command-text-v14`
model via Bedrock API.

About Streaming: Cohere model outputs 2 additional chunks at the end of
the text being generated via streaming: a chunk containing the text
`<EOS_TOKEN>`, and a chunk indicating the end of the stream. In this
implementation I chose to ignore both chunks. An alternative solution
could be to replace `<EOS_TOKEN>` with `\n`

Tests: manually tested that the new model work with both
`llm.generate()` and `llm.stream()`.
Tested with `temperature`, `p` and `stop` parameters.

**Issue:** #11181 

**Dependencies:** No new dependencies

**Tag maintainer:** @baskaryan 

**Twitter handle:** mangelino

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
9 months ago
Daniel Butler 939bceccb0
GitHubIssuesLoader Custom API URL Support (#11378)
- **Description:** Adds support for custom API URL in the
GitHubIssuesLoader. This allows it to be used with Github enterprise
instances.
9 months ago
Bagatur 16a80779b9
bump 307 (#11380) 9 months ago
mziru 9e3c1d4463
add HTMLHeaderTextSplitter (#11039)
Description: Similar in concept to the `MarkdownHeaderTextSplitter`, the
`HTMLHeaderTextSplitter` is a "structure-aware" chunker that splits text
at the element level and adds metadata for each header "relevant" to any
given chunk. It can return chunks element by element or combine elements
with the same metadata, with the objectives of (a) keeping related text
grouped (more or less) semantically and (b) preserving context-rich
information encoded in document structures. It can be used with other
text splitters as part of a chunking pipeline.

Dependency: lxml python package

Maintainer: @hwchase17

Twitter handle: @MartinZirulnik

---------

Co-authored-by: PresidioVantage <github@presidiovantage.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
9 months ago
Predrag Gruevski 289de601c8
Use parameterized queries to select SQL schemas. (#11356) 9 months ago
Nuno Campos b0097f8908
In ProgressBarCallback update the progress counter also when runs fin… (#11332) 9 months ago
William FH 06f39be1c2
Wfh/eval max concurrency (#11368) 9 months ago
Aashish Saini 4adb2b399d
Fixed exception type in py files (#11322)
I've refactored the code to ensure that ImportError is consistently
handled. Instead of using ValueError as before, I've now followed the
standard practice of raising ImportError along with clear and
informative error messages. This change enhances the code's clarity and
explicitly signifies that any problems are associated with module
imports.
9 months ago
니콜라스 c6d7124675
Add 'device' to GPT4All (#11216)
Add device to GPT4All

- **Description:** GPT4All now supports GPU. This commit adds the option
to enable it.
- **Issue:** It closes
https://github.com/langchain-ai/langchain/issues/10486

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
9 months ago
Harrison Chase 6e848b879a
add default for async (#11367) 9 months ago
Fynn Flügge 0a4baca291
chore: add kotlin code splitter (#11364)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

- **Description:** Adds Kotlin language to `TextSplitter`

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
9 months ago
Ofer Mendelevitch b93a08079e
Updates to Vectara Implementation (#11366)
Replace this entire comment with:
  - **Description:** updates to documentation and API headers
  - **Tag maintainer:** @baskarya
  - **Twitter handle:** @ofermend
9 months ago
Erick Friis 745e3e29da
add getattr case for llms.type_to_cls_dict (#11362)
For external libraries that depend on `type_to_cls_dict`, adds a
workaround to continue using the old format.

Recommend people use `get_type_to_cls_dict()` instead and only resolve
the imports when they're used.
9 months ago
Vicente Reyes f3e13e7e5a
Use term keyword according to the official python doc glossary (#11338)
- **Description:** use term keyword according to the official python doc
glossary, see https://docs.python.org/3/glossary.html
  - **Issue:** not applicable
  - **Dependencies:** not applicable
  - **Tag maintainer:** @hwchase17
  - **Twitter handle:** vreyespue
9 months ago
Predrag Gruevski 5d6b83d9cf
Make a copy of external data instead of mutating another object's attributes. (#11349)
Fix for a bug surfaced as part of #11339. `mypy` caught this since the
types didn't match up.
9 months ago
Predrag Gruevski 42d979efdd
Improve type hints and interface for SQL execution functionality. (#11353)
The previous API of the `_execute()` function had a few rough edges that
this PR addresses:
- The `fetch` argument was type-hinted as being able to take any string,
but any string other than `"all"` or `"one"` would `raise ValueError`.
The new type hints explicitly declare that only those values are
supported.
- The return type was type-hinted as `Sequence` but using `fetch =
"one"` would actually return a single result item. This was incorrectly
suppressed using `# type: ignore`. We now always return a list.
- Using `fetch = "one"` would return a single item if data was found, or
an empty *list* if no data was found. This was confusing, and we now
always return a list to simplify.
- The return type was `Sequence[Any]` which was a bit difficult to use
since it wasn't clear what one could do with the returned rows. I'm
making the new type `Dict[str, Any]` that corresponds to the column
names and their values in the query.

I've updated the use of this method elsewhere in the file to match the
new behavior.
9 months ago
Mohammad Mohtashim 3bddd708f7
Add memory to sql chain (#8597)
continuation of PR #8550

@hwchase17 please see and merge. And also close the PR #8550.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
9 months ago
Harrison Chase feabf2e0d5
make llm imports optional (#11237) 9 months ago
Harrison Chase 88bad37ec2
fix get_tool_return (#11346) 9 months ago
Harrison Chase bdf865d8e8
better error message on parsing errors (#11342) 9 months ago
Eugene Yurtsev 2343302fc6
Remove langserve from langchain repo (#11288)
LangServe has been moved to a separate repo
9 months ago
William FH 6950b44bfc
Consolidate run collector. Add link helper (#11269)
Instead of:

```
client = Client()
with collect_runs() as cb:
    chain.invoke()
    run = cb.traced_runs[0]
    client.get_run_url(run)
```

it's
```
with tracing_v2_enabled() as cb:
    chain.invoke()
    cb.get_run_url()
```
9 months ago
Nuno Campos 0aedbcf7b2
Pass kwargs in runnable retry (#11324) 9 months ago
Jacob Lee 933655b4ac
Adds Tavily Search API retriever (#11314)
@baskaryan @efriis
9 months ago
David Duong 3ec970cc11
Mark Vertex AI classes as serialisable (#10484)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. These live is docs/extras
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17, @rlancemartin.
 -->

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
9 months ago
David Duong db36a0ee99
Make Google PaLM classes serialisable (#11121)
Similarly to Vertex classes, PaLM classes weren't marked as
serialisable. Should be working fine with LangSmith.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
9 months ago
CG80499 943e4f30d8
Add scoring chain (#11123)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
9 months ago
Predrag Gruevski cd2479dfae
Upgrade `langchain` dependency versions to resolve dependabot alerts. (#11307) 9 months ago
Nuno Campos 4df3191092
Add .configurable_fields() and .configurable_alternatives() to expose fields of a Runnable to be configured at runtime (#11282) 9 months ago
Eugene Yurtsev 5e2d5047af
add LLMBashChain to experimental (#11305)
Add LLMBashChain to experimental
9 months ago
Bagatur 38d5b63a10
Bedrock scheduled tests (#11194) 9 months ago
Eugene Yurtsev f9b565fa8c
Bump min version of numexpr (#11302)
Bump min version
9 months ago
William FH 64febf7751
Make numexpr optional (#11049)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
9 months ago
Eugene Yurtsev 20b7bd497c
Add pending deprecation warning (#11133)
This PR uses 2 dedicated LangChain warnings types for deprecations
(mirroring python's built in deprecation and pending deprecation
warnings).

These deprecation types are unslienced during initialization in
langchain achieving the same default behavior that we have with our
current warnings approach. However, because these warnings have a
dedicated type, users will be able to silence them selectively (I think
this is strictly better than our current handling of warnings).

The PR adds a deprecation warning to llm symbolic math.

---------

Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>
9 months ago
Nuno Campos 0638f7b83a
Create new RunnableSerializable base class in preparation for configurable runnables (#11279)
- Also move RunnableBranch to its own file

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
9 months ago
Bagatur 8eec43ed91
bump 306 (#11289) 9 months ago
Nuno Campos c6a720f256 Lint 9 months ago
Nuno Campos 1d46ddd16d Lint 9 months ago
Nuno Campos 17708fc156 Lint 9 months ago
Nuno Campos a3b82d1831 Move RunnableWithFallbacks to its own file 9 months ago
Nuno Campos 01dbfc2bc7 Lint 9 months ago
Nuno Campos a6afd45c63 Lint 9 months ago
Nuno Campos f7dd10b820 Lint 9 months ago
Nuno Campos 040bb2983d Lint 9 months ago
Nuno Campos 52e5a8b43e Create new RunnableSerializable class in preparation for configurable runnables
- Also move RunnableBranch to its own file
9 months ago
Yeonji-Lim 61ab1b1266
Fix typo in docstring (#11256)
Description : Remove meaningless 's' in docstring
9 months ago
Kazuki Maeda a363ab5292
rename repo namespace to langchain-ai (#11259)
### Description
renamed several repository links from `hwchase17` to `langchain-ai`.

### Why
I discovered that the README file in the devcontainer contains an old
repository name, so I took the opportunity to rename the old repository
name in all files within the repository, excluding those that do not
require changes.

### Dependencies
none

### Tag maintainer
@baskaryan

### Twitter handle
[kzk_maeda](https://twitter.com/kzk_maeda)
9 months ago
Dayuan Jiang 17cdeb72ef
minor fix: remove redundant code from OpenAIFunctionsAgent (#11245)
minor fix: remove redundant code from OpenAIFunctionsAgent (#11245)
9 months ago
Michael Goin 33eb5f8300
Update DeepSparse LLM (#11236)
**Description:** Adds streaming and many more sampling parameters to the
DeepSparse interface

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
9 months ago
Eugene Yurtsev f91ce4eddf
Bump deps in langserve (#11234)
Bump deps in langserve lockfile
9 months ago
Haozhe 4c97a10bd0
fix code injection vuln (#11233)
- **Description:** Fix a code injection vuln by adding one more keyword
into the filtering list
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Tag maintainer:** 
  - **Twitter handle:**

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
9 months ago
Eugene Yurtsev aebdb1ad01
Ignore aadd (#11235) 9 months ago
Eugene Yurtsev 8b4cb4eb60
Add type to message chunks (#11232) 9 months ago
Nuno Campos fb66b392c6
Implement RunnablePassthrough.assign(...) (#11222)
Passes through dict input and assigns additional keys

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
9 months ago
Nuno Campos 1ddf9f74b2
Add a streaming json parser (#11193)
<img width="1728" alt="Screenshot 2023-09-28 at 20 15 01"
src="https://github.com/langchain-ai/langchain/assets/56902/ed0644c3-6db7-41b9-9543-e34fce46d3e5">


<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
9 months ago
Nuno Campos ee56c616ff Remove flawed test
- It is not possible to access properties on classes, only on instances, therefore this test is not something we can implement
9 months ago
Nuno Campos f3f3f71811 Lint 9 months ago
Nuno Campos f6b0b065d3
Update json.py
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
9 months ago
Nuno Campos cbe18057b0
Update json.py
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
9 months ago
Nuno Campos aa8b4120a8 Keep exceptions when not in streaming mode 9 months ago
Nuno Campos 1f30e25681 Lint 9 months ago
Nuno Campos c9d0f2b984 Combine with existing json output parsers 9 months ago
Eugene Yurtsev b4354b7694
Make tests stricter, remove old code, fix up pydantic import when using v2 (#11231)
Make tests stricter, remove old code, fix up pydantic import when using v2 (#11231)
9 months ago
Eugene Yurtsev 572968fee3
Using langchain input types (#11204)
Using langchain input type
9 months ago
Bagatur 77c7c9ab97
bump 305 (#11224) 9 months ago
Nuno Campos 4b8442896b Make test deterministic 9 months ago
Attila Tőkés ba9371854f
OpenAI gpt-3.5-turbo-instruct cost information (#11218)
Added pricing info for `gpt-3.5-turbo-instruct` for OpenAI and Azure
OpenAI.

Co-authored-by: Attila Tőkés <atokes@rws.com>
9 months ago
Eugene Yurtsev de69ea26e8
Suppress warnings in interactive env that stem from tab completion (#11190)
Suppress warnings in interactive environments that can arise from users 
relying on tab completion (without even using deprecated modules).

jupyter seems to filter warnings by default (at least for me), but
ipython surfaces them all
9 months ago
Jon Saginaw 715ffda28b
mongodb doc loader init (#10645)
- **Description:** A Document Loader for MongoDB
  - **Issue:** n/a
  - **Dependencies:** Motor, the async driver for MongoDB
  - **Tag maintainer:** n/a
  - **Twitter handle:** pigpenblue

Note that an initial mongodb document loader was created 4 months ago,
but the [PR ](https://github.com/langchain-ai/langchain/pull/4285)was
never pulled in. @leo-gan had commented on that PR, but given it is
extremely far behind the master branch and a ton has changed in
Langchain since then (including repo name and structure), I rewrote the
branch and issued a new PR with the expectation that the old one can be
closed.

Please reference that old PR for comments/context, but it can be closed
in favor of this one. Thanks!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
9 months ago
Nuno Campos 3d8aa88e26 Add async tests and comments 9 months ago
Nuno Campos 4ad0f3de2b
Add RunnableGenerator (#11214)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
9 months ago
Guy Korland 748a757306
Clean warnings: replace type with isinstance and fix syntax (#11219)
Clean warnings: replace type with `isinstance` and fix on notebook
syntax syntax
9 months ago
Nuno Campos 091d8845d5 Backwards compat 9 months ago
Nuno Campos 4e28a7a513 Implement diff 9 months ago
Nuno Campos 5cbe2b7b6a Implement diff 9 months ago
Nuno Campos 6c0a6b70e0 WIP Add tests§ 9 months ago
Nuno Campos 63f2ef8d1c Implement str one 9 months ago
Nuno Campos f672b39cc9 Add a streaming json parser 9 months ago
Nuno Campos 2387647d30 Lint 9 months ago
Nuno Campos 0318cdd33c Add tests 9 months ago
Nuno Campos b67db8deaa Add RunnableGenerator 9 months ago
Nuno Campos e35ea565d1 Lint 9 months ago
Nuno Campos 7f589ebbc2 Lint 9 months ago
Nuno Campos 8be598f504 Fix invocation 9 months ago
Nuno Campos 6eb6c45c98 Enable creating Tools from any Runnable 9 months ago