Commit Graph

7514 Commits

Author SHA1 Message Date
wulixuan
c776cfc599
community[minor]: integrate with model Yuan2.0 (#15411)
1. integrate with
[`Yuan2.0`](https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/README-EN.md)
2. update `langchain.llms`
3. add a new doc for [Yuan2.0
integration](docs/docs/integrations/llms/yuan2.ipynb)

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-14 11:46:20 -08:00
Philippe PRADOS
d07db457fc
community[patch]: Fix SQLAlchemyMd5Cache race condition (#16279)
If the SQLAlchemyMd5Cache is shared among multiple processes, it is
possible to encounter a race condition during the cache update.

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-02-14 11:45:28 -08:00
Alex Peplowski
70c296ae96
community[patch]: Expose Anthropic Retry Logic (#17069)
**Description:**

Expose Anthropic's retry logic, so that `max_retries` can be configured
via langchain. Anthropic's retry logic is implemented in their Python
SDK here:
https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#retries

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-14 11:44:28 -08:00
DanisJiang
de9a6cdf16
experimental[patch]: Enhance protection against arbitrary code execution in PALChain (#17091)
- **Description:** Block some ways to trigger arbitrary code execution
bug in PALChain.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-02-14 11:44:07 -08:00
Lyndsey
8562a1e7d4
community[patch]: support query filters for NotionDBLoader (#17217)
- **Description:** Support filtering databases in the use case where
devs do not want to query ALL entries within a DB,
- **Issue:** N/A,
- **Dependencies:** N/A,
- **Twitter handle:** I don't have Twitter but feel free to tag my
Github!

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-02-14 11:43:41 -08:00
volodymyr-memsql
e36bc379f2
community[patch]: Add vector index support to SingleStoreDB VectorStore (#17308)
This pull request introduces support for various Approximate Nearest
Neighbor (ANN) vector index algorithms in the VectorStore class,
starting from version 8.5 of SingleStore DB. Leveraging this enhancement
enables users to harness the power of vector indexing, significantly
boosting search speed, particularly when handling large sets of vectors.

---------

Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-14 11:43:12 -08:00
Kate Silverstein
0bc4a9b3fc
community[minor]: Adds Llamafile as an LLM (#17431)
* **Description:** Adds a simple LLM implementation for interacting with
[llamafile](https://github.com/Mozilla-Ocho/llamafile)-based models.
* **Dependencies:** N/A
* **Issue:** N/A

**Detail**
[llamafile](https://github.com/Mozilla-Ocho/llamafile) lets you run LLMs
locally from a single file on most computers without installing any
dependencies.

To use the llamafile LLM implementation, the user needs to:

1. Download a llamafile e.g.
https://huggingface.co/jartine/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile?download=true
2. Make the file executable.
3. Run the llamafile in 'server mode'. (All llamafiles come packaged
with a lightweight server; by default, the server listens at
`http://localhost:8080`.)


```bash
wget https://url/of/model.llamafile
chmod +x model.llamafile
./model.llamafile --server --nobrowser
```

Now, the user can invoke the LLM via the LangChain client:

```python
from langchain_community.llms.llamafile import Llamafile

llm = Llamafile()

llm.invoke("Tell me a joke.")
```
2024-02-14 11:15:24 -08:00
Rakib Hosen
5ce1827d31
community[patch]: fix import in language parser (#17538)
- **Description:** Resolving import error in language_parser.py during
"from langchain.langchain.text_splitter import Language - **Issue:** the
issue #17536
- **Dependencies:** NO
- **Twitter handle:** @iRakibHosen

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-14 11:11:23 -08:00
Raunak
685d62b032
community[patch]: Added functions in NetworkxEntityGraph class (#17535)
- **Description:** 
1. Added _clear_edges()_ and _get_number_of_nodes()_ functions in
NetworkxEntityGraph class.
2. Added the above two function in graph_networkx_qa.ipynb
documentation.
2024-02-14 11:02:24 -08:00
Erick Friis
bfaa8c3048
anthropic[patch]: de-beta anthropic messages, release 0.0.2 (#17540) 2024-02-14 10:31:45 -08:00
Erick Friis
a99c667c22
partners: version constraints (#17492)
Core should be ^0.1 by default

Careful about 0.x.y and 0.0.z packages
2024-02-14 08:57:46 -08:00
Erick Friis
d7418acbe1
nomic[patch]: release 0.0.2, dimensionality (#17534)
- nomic[patch]: release 0.0.2
- x
2024-02-14 08:38:07 -08:00
Bagatur
9e8a3fc4ff
infra: rm @ from pr template (#17507) 2024-02-13 21:29:22 -08:00
shibuiwilliam
c502736841
infra: add test for ensemble retriever to ensure multiple retrievers (#8401)
Add tests to ensemble retriever to ensure it works with combination of
multiple retrievers

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-13 21:22:03 -08:00
Qihui Xie
5738143d4b
add mongodb_store (#13801)
# Add MongoDB storage
  - **Description:** 
  Add MongoDB Storage as an option for large doc store. 

Example usage: 
```Python
# Instantiate the MongodbStore with a MongoDB connection
from langchain.storage import MongodbStore

mongo_conn_str = "mongodb://localhost:27017/"
mongodb_store = MongodbStore(mongo_conn_str, db_name="test-db",
                                collection_name="test-collection")

# Set values for keys
doc1 = Document(page_content='test1')
doc2 = Document(page_content='test2')
mongodb_store.mset([("key1", doc1), ("key2", doc2)])

# Get values for keys
values = mongodb_store.mget(["key1", "key2"])
# [doc1, doc2]

# Iterate over keys
for key in mongodb_store.yield_keys():
    print(key)

# Delete keys
mongodb_store.mdelete(["key1", "key2"])
 ```

  - **Dependencies:**
  Use `mongomock` for integration test.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-02-13 22:33:22 -05:00
Mo Latif
50b48a8e6a
langchain[patch]: Invoke chain prep_inputs and prep_outputs inside try block to catch validation errors (#16644)
- **Description:** Callback manager can't catch chain input or output
validation errors because `prepare_input` and `prepare_output` are not
part of the try/raise logic, this PR fixes that logic.
 
  - **Issue:** #15954
2024-02-13 22:23:11 -05:00
Christophe Bornet
a8f530bc4d
Add async methods to CacheBackedEmbeddings (#16873)
Adds async methods to CacheBackedEmbeddings
2024-02-13 22:16:27 -05:00
Bagatur
dd68a8716e
infra: update rtd yaml (#17502) 2024-02-13 18:16:44 -08:00
Bagatur
1aeb52caac
infra: merge in master during api docs build (#17494) 2024-02-13 18:08:07 -08:00
Bagatur
54373fb384
infra: add api docs build GHA (#17493) 2024-02-13 16:46:58 -08:00
Bagatur
50de7a31f0
langchain[patch]: structured output chain nits (#17291) 2024-02-13 16:45:29 -08:00
Nat Noordanus
8a3b74fe1f
community[patch]: Fix pydantic ForwardRef error in BedrockBase (#17416)
- **Description:** Fixes a type annotation issue in the definition of
BedrockBase. This issue was that the annotation for the `config`
attribute includes a ForwardRef to `botocore.client.Config` which is
only imported when `TYPE_CHECKING`. This can cause pydantic to raise an
error like `pydantic.errors.ConfigError: field "config" not yet prepared
so type is still a ForwardRef, ...`.
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** `@__nat_n__`
2024-02-13 16:15:55 -08:00
Bagatur
2c076bebc9
docs: fix self query redirect (#17490) 2024-02-13 15:44:56 -08:00
Ashley Xu
f746a73e26
Add the BQ job usage tracking from LangChain (#17123)
- **Description:**
Add the BQ job usage tracking from LangChain

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-02-13 14:47:57 -08:00
Bagatur
5dca107621
docs: update providers (#17488) 2024-02-13 14:00:15 -08:00
JongRok BAEK
8d6cc90fc5
langchain.core : Use shallow copy for schema manipulation in JsonOutputParser.get_format_instructions (#17162)
- **Description :**  

Fix: Use shallow copy for schema manipulation in get_format_instructions

Prevents side effects on the original schema object by using a
dictionary comprehension for a safer and more controlled manipulation of
schema key-value pairs, enhancing code reliability.

  - **Issue:**  #17161 
  - **Dependencies:** None
  -  **Twitter handle:** None
2024-02-13 13:30:53 -08:00
Rave Harpaz
90f55e6bd1
Documentation/add update documentation for oci (#17473)
Thank you for contributing to LangChain!

Checklist:

- **PR title**: docs: add & update docs for Oracle Cloud Infrastructure
(OCI) integrations

- **Description**: adding and updating documentation for two
integrations - OCI Generative AI & OCI Data Science
(1) adding integration page for OCI Generative AI embeddings (@baskaryan
request,
         docs/docs/integrations/text_embedding/oci_generative_ai.ipynb)
(2) updating integration page for OCI Generative AI llms
(docs/docs/integrations/llms/oci_generative_ai.ipynb)
(3) adding platform documentation for OCI (@baskaryan request,
docs/docs/integrations/platforms/oci.mdx). this combines the
          integrations of OCI Generative AI & OCI Data Science
(4) if possible, requesting to be added to 'Featured Community
Providers' so supplying a modified
docs/docs/integrations/platforms/index.mdx to reflect the addition
- **Issue:** none

 - **Dependencies:** no new dependencies 

 - **Twitter handle:**

---------

Co-authored-by: MING KANG <ming.kang@oracle.com>
2024-02-13 13:26:23 -08:00
Bagatur
b5d3416563
experimental[patch]: Release 0.0.51 (#17484) 2024-02-13 13:14:38 -08:00
Bagatur
de7c4b277c
langchain[patch]: Release 0.1.7 (#17482) 2024-02-13 13:13:04 -08:00
Bagatur
39342d98d6
community[patch]: Release 0.0.20 (#17480) 2024-02-13 13:01:51 -08:00
Bagatur
89b765ec27
core[patch]: Release 0.1.23 (#17479) 2024-02-13 12:55:45 -08:00
Max Jakob
ab3d944667
community[patch]: ElasticsearchStore: preserve user headers (#16830)
Users can provide an Elasticsearch connection with custom headers. This
PR makes sure these headers are preserved when adding the langchain user
agent header.
2024-02-13 12:37:35 -08:00
Erick Friis
112e10e933
infra: azure release integration testing secrets (#17476) 2024-02-13 12:17:06 -08:00
Erick Friis
9eb1b56e73
pinecone[patch]: release 0.0.2 (#17477) 2024-02-13 12:01:45 -08:00
Erick Friis
37678471c4
openai[patch]: relax tiktoken constraint, release 0.0.6 (#17472) 2024-02-13 11:25:55 -08:00
Wendy H. Chun
2df7387c91
langchain[patch]: Fix to avoid infinite loop during collapse chain in map reduce (#16253)
- **Description:** Depending on `token_max` used in
`load_summarize_chain`, it could cause an infinite loop when documents
cannot collapse under `token_max`. This change would not affect the
existing feature, but it also gives an option to users to avoid the
situation.
  - **Issue:** https://github.com/langchain-ai/langchain/issues/16251
  - **Dependencies:** None
  - **Twitter handle:** None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-13 10:55:32 -08:00
wulixuan
5d06797905
community[minor]: integrate chat models with Yuan2.0 (#16575)
1. integrate chat models with
[`Yuan2.0`](https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/README-EN.md)
2. add a new doc for [Yuan2.0
integration](docs/docs/integrations/llms/yuan2.ipynb)
 
Yuan2.0 is a new generation Fundamental Large Language Model developed
by IEIT System. We have published all three models, Yuan 2.0-102B, Yuan
2.0-51B, and Yuan 2.0-2B.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-13 10:55:14 -08:00
Taha Khabouss
15baffc484
langchain[patch]: Ensure that the Elasticsearch Query Translator functions accurately w… (#17044)
Description:
Addresses a problem where the Date type within an Elasticsearch
SelfQueryRetriever would encounter difficulties in generating a valid
query.

Issue: #17042

---------

Co-authored-by: Max Jakob <max.jakob@elastic.co>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-13 10:54:24 -08:00
Erick Friis
e5c76f9dbd
pinecone[patch]: poetry update (#17471) 2024-02-13 10:32:29 -08:00
Erick Friis
10bdf2422c
pinecone[patch]: release 0.0.2rc0, remove simsimd dep (#17469) 2024-02-13 10:02:16 -08:00
Erick Friis
065cde69b1
google-genai[patch]: release 0.0.9, safety settings docs (#17432) 2024-02-13 10:01:25 -08:00
Sergey Kozlov
db6f266d97
core: improve None value processing in merge_dicts() (#17462)
- **Description:** fix `None` and `0` merging in `merge_dicts()`, add
tests.
```python
from langchain_core.utils._merge import merge_dicts
assert merge_dicts({"a": None}, {"a": 0}) == {"a": 0}
```

---------

Co-authored-by: Sergey Kozlov <sergey.kozlov@ludditelabs.io>
2024-02-13 08:48:02 -08:00
Ian Gregory
e5472b5eb8
Framework for supporting more languages in LanguageParser (#13318)
## Description

I am submitting this for a school project as part of a team of 5. Other
team members are @LeilaChr, @maazh10, @Megabear137, @jelalalamy. This PR
also has contributions from community members @Harrolee and @Mario928.

Initial context is in the issue we opened (#11229).

This pull request adds:

- Generic framework for expanding the languages that `LanguageParser`
can handle, using the
[tree-sitter](https://github.com/tree-sitter/py-tree-sitter#py-tree-sitter)
parsing library and existing language-specific parsers written for it
- Support for the following additional languages in `LanguageParser`:
  - C
  - C++
  - C#
  - Go
- Java (contributed by @Mario928
https://github.com/ThatsJustCheesy/langchain/pull/2)
  - Kotlin
  - Lua
  - Perl
  - Ruby
  - Rust
  - Scala
- TypeScript (contributed by @Harrolee
https://github.com/ThatsJustCheesy/langchain/pull/1)

Here is the [design
document](https://docs.google.com/document/d/17dB14cKCWAaiTeSeBtxHpoVPGKrsPye8W0o_WClz2kk)
if curious, but no need to read it.

## Issues

- Closes #11229
- Closes #10996
- Closes #8405

## Dependencies

`tree_sitter` and `tree_sitter_languages` on PyPI. We have tried to add
these as optional dependencies.

## Documentation

We have updated the list of supported languages, and also added a
section to `source_code.ipynb` detailing how to add support for
additional languages using our framework.

## Maintainer

- @hwchase17 (previously reviewed
https://github.com/langchain-ai/langchain/pull/6486)

Thanks!!

## Git commits

We will gladly squash any/all of our commits (esp merge commits) if
necessary. Let us know if this is desirable, or if you will be
squash-merging anyway.

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Maaz Hashmi <mhashmi373@gmail.com>
Co-authored-by: LeilaChr <87657694+LeilaChr@users.noreply.github.com>
Co-authored-by: Jeremy La <jeremylai511@gmail.com>
Co-authored-by: Megabear137 <zubair.alnoor27@gmail.com>
Co-authored-by: Lee Harrold <lhharrold@sep.com>
Co-authored-by: Mario928 <88029051+Mario928@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-02-13 08:45:49 -08:00
merlin-quix
729c6d6827
docs: add use case for managing chat messages via Apache Kafka (#16771)
Adding a new notebook that demonstrates how to use LangChain's standard
chat features while passing the chat messages back and forth via Apache
Kafka.

This goal is to simulate an architecture where the chat front end and
the LLM are running as separate services that need to communicate with
one another over an internal nework.

It's an alternative to typical pattern of requesting a reponse from the
model via a REST API (there's more info on why you would want to do this
at the end of the notebook).

NOTE: Assuming "uses cases" is the right place for this but feel free to
propose another location.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-02-13 08:09:15 -08:00
Bagatur
3925071dd6
langchain[patch], templates[patch]: fix multi query retriever, web re… (#17434)
…search retriever

Fixes #17352
2024-02-12 22:52:07 -08:00
Bagatur
c0ce93236a
experimental[patch]: fix zero-shot pandas agent (#17442) 2024-02-12 21:58:35 -08:00
Abhishek Jain
37e1275f9e
community[patch]: Fixed the 'aembed' method of 'CohereEmbeddings'. (#16497)
**Description:**
- The existing code was trying to find a `.embeddings` property on the
`Coroutine` returned by calling `cohere.async_client.embed`.
- Instead, the `.embeddings` property is present on the value returned
by the `Coroutine`.
- Also, it seems that the original cohere client expects a value of
`max_retries` to not be `None`. Hence, setting the default value of
`max_retries` to `3`.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-12 21:57:27 -08:00
Sridhar Ramaswamy
9f1cbbc6ed
community[minor]: Add pebblo safe document loader (#16862)
- **Description:** Pebblo opensource project enables developers to
safely load data to their Gen AI apps. It identifies semantic topics and
entities found in the loaded data and summarizes them in a
developer-friendly report.
  - **Dependencies:** none
  - **Twitter handle:** srics

@hwchase17
2024-02-12 21:56:12 -08:00
Preetam D'Souza
0834457f28
docs: Fix broken link in summarization use-case (#16554)
- **Description:** Fix broken link to `StuffDocumentsChain`
- **Issue:** N/A
- **Dependencies:** None
- **Twitter handle:**
[@preetamdsouza](https://twitter.com/preetamdsouza)
2024-02-12 21:40:57 -08:00
Sheil Naik
d70a5bbf15
docs: Fix broken link in LLMs index.mdx (#16557)
- **Description:** The
[LLMs](https://python.langchain.com/docs/modules/model_io/llms/) page
has a broken link. This fixes the link.
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** @sheilnaik
2024-02-12 21:39:56 -08:00