Commit Graph

7719 Commits (c53aa5cd37da7a2b6c6630097b813064247c460c)
 

Author SHA1 Message Date
nvpranak 91bcc9c5c9
community[minor]: Nemo embeddings(#16206)
This PR is adding support for NVIDIA NeMo embeddings issue #16095.

---------

Co-authored-by: Praveen Nakshatrala <pnakshatrala@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Mattt394 7c6009b76f
experimental[patch]: Fixed typos in SmartLLMChain ideation and critique prompts (#11507)
Noticed and fixed a few typos in the SmartLLMChain default ideation and
critique prompts

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
5 months ago
Erick Friis 86d3e42853
core[minor]: add name to basemessage (#17539)
Adds an optional name param to our base message to support passing names
into LLMs.

OpenAI supports having a name on anything except tool message now
(system, ai, user/human).
5 months ago
Mateusz Szewczyk 916332ef5b
ibm: added partners package `langchain_ibm`, added llm (#16512)
- **Description:** Added `langchain_ibm` as an langchain partners
package of IBM [watsonx.ai](https://www.ibm.com/products/watsonx-ai) LLM
provider (`WatsonxLLM`)
- **Dependencies:**
[ibm-watsonx-ai](https://pypi.org/project/ibm-watsonx-ai/),
  - **Tag maintainer:** : 
---------

Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
Shawn f6d3a3546f
community[patch]: document_loaders: modified athena key logic to handle s3 uris without a prefix (#17526)
https://github.com/langchain-ai/langchain/issues/17525

### Example Code

```python
from langchain_community.document_loaders.athena import AthenaLoader

database_name = "database"
s3_output_path = "s3://bucket-no-prefix"
query="""SELECT 
  CAST(extract(hour FROM current_timestamp) AS INTEGER) AS current_hour,
  CAST(extract(minute FROM current_timestamp) AS INTEGER) AS current_minute,
  CAST(extract(second FROM current_timestamp) AS INTEGER) AS current_second;
"""
profile_name = "AdministratorAccess"

loader = AthenaLoader(
    query=query,
    database=database_name,
    s3_output_uri=s3_output_path,
    profile_name=profile_name,
)

documents = loader.load()
print(documents)
```



### Error Message and Stack Trace (if applicable)

NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject
operation: The specified key does not exist

### Description

Athena Loader errors when result s3 bucket uri has no prefix. The Loader
instance call results in a "NoSuchKey: An error occurred (NoSuchKey)
when calling the GetObject operation: The specified key does not exist."
error.

If s3_output_path contains a prefix like:

```python
s3_output_path = "s3://bucket-with-prefix/prefix"
```

Execution works without an error.

## Suggested solution

Modify:

```python
key = "/".join(tokens[1:]) + "/" + query_execution_id + ".csv"
```

to

```python
key = "/".join(tokens[1:]) + ("/" if tokens[1:] else "") + query_execution_id + ".csv"
```


9e8a3fc4ff/libs/community/langchain_community/document_loaders/athena.py (L128)


### System Info


System Information
------------------
> OS:  Darwin
> OS Version: Darwin Kernel Version 22.6.0: Fri Sep 15 13:41:30 PDT
2023; root:xnu-8796.141.3.700.8~1/RELEASE_ARM64_T8103
> Python Version:  3.9.9 (main, Jan  9 2023, 11:42:03) 
[Clang 14.0.0 (clang-1400.0.29.102)]

Package Information
-------------------
> langchain_core: 0.1.23
> langchain: 0.1.7
> langchain_community: 0.0.20
> langsmith: 0.0.87
> langchain_openai: 0.0.6
> langchainhub: 0.1.14

Packages not installed (Not Necessarily a Problem)
--------------------------------------------------
The following packages were not found:

> langgraph
> langserve

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
wulixuan c776cfc599
community[minor]: integrate with model Yuan2.0 (#15411)
1. integrate with
[`Yuan2.0`](https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/README-EN.md)
2. update `langchain.llms`
3. add a new doc for [Yuan2.0
integration](docs/docs/integrations/llms/yuan2.ipynb)

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Philippe PRADOS d07db457fc
community[patch]: Fix SQLAlchemyMd5Cache race condition (#16279)
If the SQLAlchemyMd5Cache is shared among multiple processes, it is
possible to encounter a race condition during the cache update.

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
5 months ago
Alex Peplowski 70c296ae96
community[patch]: Expose Anthropic Retry Logic (#17069)
**Description:**

Expose Anthropic's retry logic, so that `max_retries` can be configured
via langchain. Anthropic's retry logic is implemented in their Python
SDK here:
https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#retries

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
DanisJiang de9a6cdf16
experimental[patch]: Enhance protection against arbitrary code execution in PALChain (#17091)
- **Description:** Block some ways to trigger arbitrary code execution
bug in PALChain.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
5 months ago
Lyndsey 8562a1e7d4
community[patch]: support query filters for NotionDBLoader (#17217)
- **Description:** Support filtering databases in the use case where
devs do not want to query ALL entries within a DB,
- **Issue:** N/A,
- **Dependencies:** N/A,
- **Twitter handle:** I don't have Twitter but feel free to tag my
Github!

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
5 months ago
volodymyr-memsql e36bc379f2
community[patch]: Add vector index support to SingleStoreDB VectorStore (#17308)
This pull request introduces support for various Approximate Nearest
Neighbor (ANN) vector index algorithms in the VectorStore class,
starting from version 8.5 of SingleStore DB. Leveraging this enhancement
enables users to harness the power of vector indexing, significantly
boosting search speed, particularly when handling large sets of vectors.

---------

Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Kate Silverstein 0bc4a9b3fc
community[minor]: Adds Llamafile as an LLM (#17431)
* **Description:** Adds a simple LLM implementation for interacting with
[llamafile](https://github.com/Mozilla-Ocho/llamafile)-based models.
* **Dependencies:** N/A
* **Issue:** N/A

**Detail**
[llamafile](https://github.com/Mozilla-Ocho/llamafile) lets you run LLMs
locally from a single file on most computers without installing any
dependencies.

To use the llamafile LLM implementation, the user needs to:

1. Download a llamafile e.g.
https://huggingface.co/jartine/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile?download=true
2. Make the file executable.
3. Run the llamafile in 'server mode'. (All llamafiles come packaged
with a lightweight server; by default, the server listens at
`http://localhost:8080`.)


```bash
wget https://url/of/model.llamafile
chmod +x model.llamafile
./model.llamafile --server --nobrowser
```

Now, the user can invoke the LLM via the LangChain client:

```python
from langchain_community.llms.llamafile import Llamafile

llm = Llamafile()

llm.invoke("Tell me a joke.")
```
5 months ago
Rakib Hosen 5ce1827d31
community[patch]: fix import in language parser (#17538)
- **Description:** Resolving import error in language_parser.py during
"from langchain.langchain.text_splitter import Language - **Issue:** the
issue #17536
- **Dependencies:** NO
- **Twitter handle:** @iRakibHosen

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Raunak 685d62b032
community[patch]: Added functions in NetworkxEntityGraph class (#17535)
- **Description:** 
1. Added _clear_edges()_ and _get_number_of_nodes()_ functions in
NetworkxEntityGraph class.
2. Added the above two function in graph_networkx_qa.ipynb
documentation.
5 months ago
Erick Friis bfaa8c3048
anthropic[patch]: de-beta anthropic messages, release 0.0.2 (#17540) 5 months ago
Erick Friis a99c667c22
partners: version constraints (#17492)
Core should be ^0.1 by default

Careful about 0.x.y and 0.0.z packages
5 months ago
Erick Friis d7418acbe1
nomic[patch]: release 0.0.2, dimensionality (#17534)
- nomic[patch]: release 0.0.2
- x
5 months ago
Bagatur 9e8a3fc4ff
infra: rm @ from pr template (#17507) 5 months ago
shibuiwilliam c502736841
infra: add test for ensemble retriever to ensure multiple retrievers (#8401)
Add tests to ensemble retriever to ensure it works with combination of
multiple retrievers

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Qihui Xie 5738143d4b
add mongodb_store (#13801)
# Add MongoDB storage
  - **Description:** 
  Add MongoDB Storage as an option for large doc store. 

Example usage: 
```Python
# Instantiate the MongodbStore with a MongoDB connection
from langchain.storage import MongodbStore

mongo_conn_str = "mongodb://localhost:27017/"
mongodb_store = MongodbStore(mongo_conn_str, db_name="test-db",
                                collection_name="test-collection")

# Set values for keys
doc1 = Document(page_content='test1')
doc2 = Document(page_content='test2')
mongodb_store.mset([("key1", doc1), ("key2", doc2)])

# Get values for keys
values = mongodb_store.mget(["key1", "key2"])
# [doc1, doc2]

# Iterate over keys
for key in mongodb_store.yield_keys():
    print(key)

# Delete keys
mongodb_store.mdelete(["key1", "key2"])
 ```

  - **Dependencies:**
  Use `mongomock` for integration test.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
5 months ago
Mo Latif 50b48a8e6a
langchain[patch]: Invoke chain prep_inputs and prep_outputs inside try block to catch validation errors (#16644)
- **Description:** Callback manager can't catch chain input or output
validation errors because `prepare_input` and `prepare_output` are not
part of the try/raise logic, this PR fixes that logic.
 
  - **Issue:** #15954
5 months ago
Christophe Bornet a8f530bc4d
Add async methods to CacheBackedEmbeddings (#16873)
Adds async methods to CacheBackedEmbeddings
5 months ago
Bagatur dd68a8716e
infra: update rtd yaml (#17502) 5 months ago
Bagatur 1aeb52caac
infra: merge in master during api docs build (#17494) 5 months ago
Bagatur 54373fb384
infra: add api docs build GHA (#17493) 5 months ago
Bagatur 50de7a31f0
langchain[patch]: structured output chain nits (#17291) 5 months ago
Nat Noordanus 8a3b74fe1f
community[patch]: Fix pydantic ForwardRef error in BedrockBase (#17416)
- **Description:** Fixes a type annotation issue in the definition of
BedrockBase. This issue was that the annotation for the `config`
attribute includes a ForwardRef to `botocore.client.Config` which is
only imported when `TYPE_CHECKING`. This can cause pydantic to raise an
error like `pydantic.errors.ConfigError: field "config" not yet prepared
so type is still a ForwardRef, ...`.
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** `@__nat_n__`
5 months ago
Bagatur 2c076bebc9
docs: fix self query redirect (#17490) 5 months ago
Ashley Xu f746a73e26
Add the BQ job usage tracking from LangChain (#17123)
- **Description:**
Add the BQ job usage tracking from LangChain

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
5 months ago
Bagatur 5dca107621
docs: update providers (#17488) 5 months ago
JongRok BAEK 8d6cc90fc5
langchain.core : Use shallow copy for schema manipulation in JsonOutputParser.get_format_instructions (#17162)
- **Description :**  

Fix: Use shallow copy for schema manipulation in get_format_instructions

Prevents side effects on the original schema object by using a
dictionary comprehension for a safer and more controlled manipulation of
schema key-value pairs, enhancing code reliability.

  - **Issue:**  #17161 
  - **Dependencies:** None
  -  **Twitter handle:** None
5 months ago
Rave Harpaz 90f55e6bd1
Documentation/add update documentation for oci (#17473)
Thank you for contributing to LangChain!

Checklist:

- **PR title**: docs: add & update docs for Oracle Cloud Infrastructure
(OCI) integrations

- **Description**: adding and updating documentation for two
integrations - OCI Generative AI & OCI Data Science
(1) adding integration page for OCI Generative AI embeddings (@baskaryan
request,
         docs/docs/integrations/text_embedding/oci_generative_ai.ipynb)
(2) updating integration page for OCI Generative AI llms
(docs/docs/integrations/llms/oci_generative_ai.ipynb)
(3) adding platform documentation for OCI (@baskaryan request,
docs/docs/integrations/platforms/oci.mdx). this combines the
          integrations of OCI Generative AI & OCI Data Science
(4) if possible, requesting to be added to 'Featured Community
Providers' so supplying a modified
docs/docs/integrations/platforms/index.mdx to reflect the addition
- **Issue:** none

 - **Dependencies:** no new dependencies 

 - **Twitter handle:**

---------

Co-authored-by: MING KANG <ming.kang@oracle.com>
5 months ago
Bagatur b5d3416563
experimental[patch]: Release 0.0.51 (#17484) 5 months ago
Bagatur de7c4b277c
langchain[patch]: Release 0.1.7 (#17482) 5 months ago
Bagatur 39342d98d6
community[patch]: Release 0.0.20 (#17480) 5 months ago
Bagatur 89b765ec27
core[patch]: Release 0.1.23 (#17479) 5 months ago
Max Jakob ab3d944667
community[patch]: ElasticsearchStore: preserve user headers (#16830)
Users can provide an Elasticsearch connection with custom headers. This
PR makes sure these headers are preserved when adding the langchain user
agent header.
5 months ago
Erick Friis 112e10e933
infra: azure release integration testing secrets (#17476) 5 months ago
Erick Friis 9eb1b56e73
pinecone[patch]: release 0.0.2 (#17477) 5 months ago
Erick Friis 37678471c4
openai[patch]: relax tiktoken constraint, release 0.0.6 (#17472) 5 months ago
Wendy H. Chun 2df7387c91
langchain[patch]: Fix to avoid infinite loop during collapse chain in map reduce (#16253)
- **Description:** Depending on `token_max` used in
`load_summarize_chain`, it could cause an infinite loop when documents
cannot collapse under `token_max`. This change would not affect the
existing feature, but it also gives an option to users to avoid the
situation.
  - **Issue:** https://github.com/langchain-ai/langchain/issues/16251
  - **Dependencies:** None
  - **Twitter handle:** None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
wulixuan 5d06797905
community[minor]: integrate chat models with Yuan2.0 (#16575)
1. integrate chat models with
[`Yuan2.0`](https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/README-EN.md)
2. add a new doc for [Yuan2.0
integration](docs/docs/integrations/llms/yuan2.ipynb)
 
Yuan2.0 is a new generation Fundamental Large Language Model developed
by IEIT System. We have published all three models, Yuan 2.0-102B, Yuan
2.0-51B, and Yuan 2.0-2B.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Taha Khabouss 15baffc484
langchain[patch]: Ensure that the Elasticsearch Query Translator functions accurately w… (#17044)
Description:
Addresses a problem where the Date type within an Elasticsearch
SelfQueryRetriever would encounter difficulties in generating a valid
query.

Issue: #17042

---------

Co-authored-by: Max Jakob <max.jakob@elastic.co>
Co-authored-by: Bagatur <baskaryan@gmail.com>
5 months ago
Erick Friis e5c76f9dbd
pinecone[patch]: poetry update (#17471) 5 months ago
Erick Friis 10bdf2422c
pinecone[patch]: release 0.0.2rc0, remove simsimd dep (#17469) 5 months ago
Erick Friis 065cde69b1
google-genai[patch]: release 0.0.9, safety settings docs (#17432) 5 months ago
Sergey Kozlov db6f266d97
core: improve None value processing in merge_dicts() (#17462)
- **Description:** fix `None` and `0` merging in `merge_dicts()`, add
tests.
```python
from langchain_core.utils._merge import merge_dicts
assert merge_dicts({"a": None}, {"a": 0}) == {"a": 0}
```

---------

Co-authored-by: Sergey Kozlov <sergey.kozlov@ludditelabs.io>
5 months ago
Ian Gregory e5472b5eb8
Framework for supporting more languages in LanguageParser (#13318)
## Description

I am submitting this for a school project as part of a team of 5. Other
team members are @LeilaChr, @maazh10, @Megabear137, @jelalalamy. This PR
also has contributions from community members @Harrolee and @Mario928.

Initial context is in the issue we opened (#11229).

This pull request adds:

- Generic framework for expanding the languages that `LanguageParser`
can handle, using the
[tree-sitter](https://github.com/tree-sitter/py-tree-sitter#py-tree-sitter)
parsing library and existing language-specific parsers written for it
- Support for the following additional languages in `LanguageParser`:
  - C
  - C++
  - C#
  - Go
- Java (contributed by @Mario928
https://github.com/ThatsJustCheesy/langchain/pull/2)
  - Kotlin
  - Lua
  - Perl
  - Ruby
  - Rust
  - Scala
- TypeScript (contributed by @Harrolee
https://github.com/ThatsJustCheesy/langchain/pull/1)

Here is the [design
document](https://docs.google.com/document/d/17dB14cKCWAaiTeSeBtxHpoVPGKrsPye8W0o_WClz2kk)
if curious, but no need to read it.

## Issues

- Closes #11229
- Closes #10996
- Closes #8405

## Dependencies

`tree_sitter` and `tree_sitter_languages` on PyPI. We have tried to add
these as optional dependencies.

## Documentation

We have updated the list of supported languages, and also added a
section to `source_code.ipynb` detailing how to add support for
additional languages using our framework.

## Maintainer

- @hwchase17 (previously reviewed
https://github.com/langchain-ai/langchain/pull/6486)

Thanks!!

## Git commits

We will gladly squash any/all of our commits (esp merge commits) if
necessary. Let us know if this is desirable, or if you will be
squash-merging anyway.

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Maaz Hashmi <mhashmi373@gmail.com>
Co-authored-by: LeilaChr <87657694+LeilaChr@users.noreply.github.com>
Co-authored-by: Jeremy La <jeremylai511@gmail.com>
Co-authored-by: Megabear137 <zubair.alnoor27@gmail.com>
Co-authored-by: Lee Harrold <lhharrold@sep.com>
Co-authored-by: Mario928 <88029051+Mario928@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
5 months ago
merlin-quix 729c6d6827
docs: add use case for managing chat messages via Apache Kafka (#16771)
Adding a new notebook that demonstrates how to use LangChain's standard
chat features while passing the chat messages back and forth via Apache
Kafka.

This goal is to simulate an architecture where the chat front end and
the LLM are running as separate services that need to communicate with
one another over an internal nework.

It's an alternative to typical pattern of requesting a reponse from the
model via a REST API (there's more info on why you would want to do this
at the end of the notebook).

NOTE: Assuming "uses cases" is the right place for this but feel free to
propose another location.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
5 months ago
Bagatur 3925071dd6
langchain[patch], templates[patch]: fix multi query retriever, web re… (#17434)
…search retriever

Fixes #17352
5 months ago