Commit Graph

478 Commits (68527b809daddff48ea05b8aa28c57526d1c72e9)

Author SHA1 Message Date
Nejc Habjan b4fa847a90
community[minor]: add exclude parameter to DirectoryLoader (#17316)
- **Description:** adds an `exclude` parameter to the DirectoryLoader
class, based on similar behavior in GenericLoader
- **Issue:** discussed in
https://github.com/langchain-ai/langchain/discussions/9059 and I think
in some other issues that I cannot find at the moment 🙇
  - **Dependencies:** None
  - **Twitter handle:** don't have one sorry! Just https://github/nejch

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
7 months ago
Bagatur 8f14234afb
infra: ignore flakey lua test (#17618) 7 months ago
Krista Pratico bf8e3c6dd1
community[patch]: add fixes for AzureSearch after update to stable azure-search-documents library (#17599)
- **Description:** Addresses the bugs described in linked issue where an
import was erroneously removed and the rename of a keyword argument was
missed when migrating from beta --> stable of the azure-search-documents
package
- **Issue:** https://github.com/langchain-ai/langchain/issues/17598
- **Dependencies:** N/A
- **Twitter handle:** N/A
7 months ago
William FH 64743dea14
core[patch], community[patch], langchain[patch], experimental[patch], robocorp[patch]: bump LangSmith 0.1.* (#17567) 7 months ago
morgana 9d7ca7df6e
community[patch]: update copy of metadata in rockset vectorstore integration (#17612)
- **Description:** This fixes an issue with working with RecordManager.
RecordManager was generating new hashes on documents because `add_texts`
was modifying the metadata directly. Additionally moved some tests to
unit tests since that was a more appropriate home.
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter handle:** `@_morgan_adams_`
7 months ago
Stefano Lottini 5240ecab99
astradb: bootstrapping Astra DB as Partner Package (#16875)
**Description:** This PR introduces a new "Astra DB" Partner Package.

So far only the vector store class is _duplicated_ there, all others
following once this is validated and established.

Along with the move to separate package, incidentally, the class name
will change `AstraDB` => `AstraDBVectorStore`.

The strategy has been to duplicate the module (with prospected removal
from community at LangChain 0.2). Until then, the code will be kept in
sync with minimal, known differences (there is a makefile target to
automate drift control. Out of convenience with this check, the
community package has a class `AstraDBVectorStore` aliased to `AstraDB`
at the end of the module).

With this PR several bugfixes and improvement come to the vector store,
as well as a reshuffling of the doc pages/notebooks (Astra and
Cassandra) to align with the move to a separate package.

**Dependencies:** A brand new pyproject.toml in the new package, no
changes otherwise.

**Twitter handle:** `@rsprrs`

---------

Co-authored-by: Christophe Bornet <cbornet@hotmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
7 months ago
Moshe Berchansky 20a56fe0a2
community[minor]: Add QuantizedEmbedders (#17391)
**Description:** 
* adding Quantized embedders using optimum-intel and
intel-extension-for-pytorch.
* added mdx documentation and example notebooks 
* added embedding import testing.

**Dependencies:** 
optimum = {extras = ["neural-compressor"], version = "^1.14.0", optional
= true}
intel_extension_for_pytorch = {version = "^2.2.0", optional = true}

Dependencies have been added to pyproject.toml for the community lib.  

**Twitter handle:** @peter_izsak

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Amir Karbasi bccc9241ea
community[patch]: Resolve KuzuQAChain API Changes (#16885)
- **Description:** Updates to the Kuzu API had broken this
functionality. These updates resolve those issues and add a new test to
demonstrate the updates.
- **Issue:** #11874
- **Dependencies:** No new dependencies
- **Twitter handle:** @amirk08


Test results:
```
tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_no_params PASSED                                   [ 33%]
tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_params PASSED                                      [ 66%]
tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_refresh_schema PASSED                                    [100%]

=================================================== slowest 5 durations =================================================== 
0.53s call     tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_refresh_schema
0.34s call     tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_no_params
0.28s call     tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_params
0.03s teardown tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_refresh_schema
0.02s teardown tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_params
==================================================== 3 passed in 1.27s ==================================================== 
```
7 months ago
Rafail Giavrimis a84a3add25
Community[patch]: Adjusted import to be compatible with SQLAlchemy<2 (#17520)
- **Description:** Adjusts an import to directly import `Result` from
`sqlalchemy.engine`.
- **Issue:** #17519 
- **Dependencies:** N/A
- **Twitter handle:** @grafail
7 months ago
Zachary Toliver 6746adf363
community[patch]: pass bool value for fetch_schema_from_transport in GraphQLAPIWrapper (#17552)
- **Description:** Allow a bool value to be passed to
fetch_schema_from_transport since not all GraphQL instances support this
feature, such as TigerGraph.
- **Threads:** @zacharytoliver

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Christophe Bornet 789cd5198d
community[patch]: Use astrapy built-in pagination prefetch in AstraDBLoader (#17569) 7 months ago
Christophe Bornet 387cacb881
community[minor]: Add async methods to AstraDBChatMessageHistory (#17572) 7 months ago
Christophe Bornet ff1f985a2a
community: Fix some mypy types in cassandra doc loader (#17570)
Thank you!
7 months ago
Christophe Bornet ca2d4078f3
community: Add async methods to AstraDBCache (#17415)
Adds async methods to AstraDBCache
7 months ago
Jan Cap 7ae3ce60d2
community[patch]: Fix pwd import that is not available on windows (#17532)
- **Description:** Resolving problem in
`langchain_community\document_loaders\pebblo.py` with `import pwd`.
`pwd` is not available on windows. import moved to try catch block
  - **Issue:** #17514
7 months ago
nvpranak 91bcc9c5c9
community[minor]: Nemo embeddings(#16206)
This PR is adding support for NVIDIA NeMo embeddings issue #16095.

---------

Co-authored-by: Praveen Nakshatrala <pnakshatrala@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Mateusz Szewczyk 916332ef5b
ibm: added partners package `langchain_ibm`, added llm (#16512)
- **Description:** Added `langchain_ibm` as an langchain partners
package of IBM [watsonx.ai](https://www.ibm.com/products/watsonx-ai) LLM
provider (`WatsonxLLM`)
- **Dependencies:**
[ibm-watsonx-ai](https://pypi.org/project/ibm-watsonx-ai/),
  - **Tag maintainer:** : 
---------

Co-authored-by: Erick Friis <erick@langchain.dev>
7 months ago
Shawn f6d3a3546f
community[patch]: document_loaders: modified athena key logic to handle s3 uris without a prefix (#17526)
https://github.com/langchain-ai/langchain/issues/17525

### Example Code

```python
from langchain_community.document_loaders.athena import AthenaLoader

database_name = "database"
s3_output_path = "s3://bucket-no-prefix"
query="""SELECT 
  CAST(extract(hour FROM current_timestamp) AS INTEGER) AS current_hour,
  CAST(extract(minute FROM current_timestamp) AS INTEGER) AS current_minute,
  CAST(extract(second FROM current_timestamp) AS INTEGER) AS current_second;
"""
profile_name = "AdministratorAccess"

loader = AthenaLoader(
    query=query,
    database=database_name,
    s3_output_uri=s3_output_path,
    profile_name=profile_name,
)

documents = loader.load()
print(documents)
```



### Error Message and Stack Trace (if applicable)

NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject
operation: The specified key does not exist

### Description

Athena Loader errors when result s3 bucket uri has no prefix. The Loader
instance call results in a "NoSuchKey: An error occurred (NoSuchKey)
when calling the GetObject operation: The specified key does not exist."
error.

If s3_output_path contains a prefix like:

```python
s3_output_path = "s3://bucket-with-prefix/prefix"
```

Execution works without an error.

## Suggested solution

Modify:

```python
key = "/".join(tokens[1:]) + "/" + query_execution_id + ".csv"
```

to

```python
key = "/".join(tokens[1:]) + ("/" if tokens[1:] else "") + query_execution_id + ".csv"
```


9e8a3fc4ff/libs/community/langchain_community/document_loaders/athena.py (L128)


### System Info


System Information
------------------
> OS:  Darwin
> OS Version: Darwin Kernel Version 22.6.0: Fri Sep 15 13:41:30 PDT
2023; root:xnu-8796.141.3.700.8~1/RELEASE_ARM64_T8103
> Python Version:  3.9.9 (main, Jan  9 2023, 11:42:03) 
[Clang 14.0.0 (clang-1400.0.29.102)]

Package Information
-------------------
> langchain_core: 0.1.23
> langchain: 0.1.7
> langchain_community: 0.0.20
> langsmith: 0.0.87
> langchain_openai: 0.0.6
> langchainhub: 0.1.14

Packages not installed (Not Necessarily a Problem)
--------------------------------------------------
The following packages were not found:

> langgraph
> langserve

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
wulixuan c776cfc599
community[minor]: integrate with model Yuan2.0 (#15411)
1. integrate with
[`Yuan2.0`](https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/README-EN.md)
2. update `langchain.llms`
3. add a new doc for [Yuan2.0
integration](docs/docs/integrations/llms/yuan2.ipynb)

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Philippe PRADOS d07db457fc
community[patch]: Fix SQLAlchemyMd5Cache race condition (#16279)
If the SQLAlchemyMd5Cache is shared among multiple processes, it is
possible to encounter a race condition during the cache update.

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
7 months ago
Alex Peplowski 70c296ae96
community[patch]: Expose Anthropic Retry Logic (#17069)
**Description:**

Expose Anthropic's retry logic, so that `max_retries` can be configured
via langchain. Anthropic's retry logic is implemented in their Python
SDK here:
https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#retries

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Lyndsey 8562a1e7d4
community[patch]: support query filters for NotionDBLoader (#17217)
- **Description:** Support filtering databases in the use case where
devs do not want to query ALL entries within a DB,
- **Issue:** N/A,
- **Dependencies:** N/A,
- **Twitter handle:** I don't have Twitter but feel free to tag my
Github!

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
7 months ago
volodymyr-memsql e36bc379f2
community[patch]: Add vector index support to SingleStoreDB VectorStore (#17308)
This pull request introduces support for various Approximate Nearest
Neighbor (ANN) vector index algorithms in the VectorStore class,
starting from version 8.5 of SingleStore DB. Leveraging this enhancement
enables users to harness the power of vector indexing, significantly
boosting search speed, particularly when handling large sets of vectors.

---------

Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Kate Silverstein 0bc4a9b3fc
community[minor]: Adds Llamafile as an LLM (#17431)
* **Description:** Adds a simple LLM implementation for interacting with
[llamafile](https://github.com/Mozilla-Ocho/llamafile)-based models.
* **Dependencies:** N/A
* **Issue:** N/A

**Detail**
[llamafile](https://github.com/Mozilla-Ocho/llamafile) lets you run LLMs
locally from a single file on most computers without installing any
dependencies.

To use the llamafile LLM implementation, the user needs to:

1. Download a llamafile e.g.
https://huggingface.co/jartine/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile?download=true
2. Make the file executable.
3. Run the llamafile in 'server mode'. (All llamafiles come packaged
with a lightweight server; by default, the server listens at
`http://localhost:8080`.)


```bash
wget https://url/of/model.llamafile
chmod +x model.llamafile
./model.llamafile --server --nobrowser
```

Now, the user can invoke the LLM via the LangChain client:

```python
from langchain_community.llms.llamafile import Llamafile

llm = Llamafile()

llm.invoke("Tell me a joke.")
```
7 months ago
Rakib Hosen 5ce1827d31
community[patch]: fix import in language parser (#17538)
- **Description:** Resolving import error in language_parser.py during
"from langchain.langchain.text_splitter import Language - **Issue:** the
issue #17536
- **Dependencies:** NO
- **Twitter handle:** @iRakibHosen

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Raunak 685d62b032
community[patch]: Added functions in NetworkxEntityGraph class (#17535)
- **Description:** 
1. Added _clear_edges()_ and _get_number_of_nodes()_ functions in
NetworkxEntityGraph class.
2. Added the above two function in graph_networkx_qa.ipynb
documentation.
7 months ago
Qihui Xie 5738143d4b
add mongodb_store (#13801)
# Add MongoDB storage
  - **Description:** 
  Add MongoDB Storage as an option for large doc store. 

Example usage: 
```Python
# Instantiate the MongodbStore with a MongoDB connection
from langchain.storage import MongodbStore

mongo_conn_str = "mongodb://localhost:27017/"
mongodb_store = MongodbStore(mongo_conn_str, db_name="test-db",
                                collection_name="test-collection")

# Set values for keys
doc1 = Document(page_content='test1')
doc2 = Document(page_content='test2')
mongodb_store.mset([("key1", doc1), ("key2", doc2)])

# Get values for keys
values = mongodb_store.mget(["key1", "key2"])
# [doc1, doc2]

# Iterate over keys
for key in mongodb_store.yield_keys():
    print(key)

# Delete keys
mongodb_store.mdelete(["key1", "key2"])
 ```

  - **Dependencies:**
  Use `mongomock` for integration test.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
7 months ago
Nat Noordanus 8a3b74fe1f
community[patch]: Fix pydantic ForwardRef error in BedrockBase (#17416)
- **Description:** Fixes a type annotation issue in the definition of
BedrockBase. This issue was that the annotation for the `config`
attribute includes a ForwardRef to `botocore.client.Config` which is
only imported when `TYPE_CHECKING`. This can cause pydantic to raise an
error like `pydantic.errors.ConfigError: field "config" not yet prepared
so type is still a ForwardRef, ...`.
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** `@__nat_n__`
7 months ago
Ashley Xu f746a73e26
Add the BQ job usage tracking from LangChain (#17123)
- **Description:**
Add the BQ job usage tracking from LangChain

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
7 months ago
Bagatur 39342d98d6
community[patch]: Release 0.0.20 (#17480) 7 months ago
Max Jakob ab3d944667
community[patch]: ElasticsearchStore: preserve user headers (#16830)
Users can provide an Elasticsearch connection with custom headers. This
PR makes sure these headers are preserved when adding the langchain user
agent header.
7 months ago
wulixuan 5d06797905
community[minor]: integrate chat models with Yuan2.0 (#16575)
1. integrate chat models with
[`Yuan2.0`](https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/README-EN.md)
2. add a new doc for [Yuan2.0
integration](docs/docs/integrations/llms/yuan2.ipynb)
 
Yuan2.0 is a new generation Fundamental Large Language Model developed
by IEIT System. We have published all three models, Yuan 2.0-102B, Yuan
2.0-51B, and Yuan 2.0-2B.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Ian Gregory e5472b5eb8
Framework for supporting more languages in LanguageParser (#13318)
## Description

I am submitting this for a school project as part of a team of 5. Other
team members are @LeilaChr, @maazh10, @Megabear137, @jelalalamy. This PR
also has contributions from community members @Harrolee and @Mario928.

Initial context is in the issue we opened (#11229).

This pull request adds:

- Generic framework for expanding the languages that `LanguageParser`
can handle, using the
[tree-sitter](https://github.com/tree-sitter/py-tree-sitter#py-tree-sitter)
parsing library and existing language-specific parsers written for it
- Support for the following additional languages in `LanguageParser`:
  - C
  - C++
  - C#
  - Go
- Java (contributed by @Mario928
https://github.com/ThatsJustCheesy/langchain/pull/2)
  - Kotlin
  - Lua
  - Perl
  - Ruby
  - Rust
  - Scala
- TypeScript (contributed by @Harrolee
https://github.com/ThatsJustCheesy/langchain/pull/1)

Here is the [design
document](https://docs.google.com/document/d/17dB14cKCWAaiTeSeBtxHpoVPGKrsPye8W0o_WClz2kk)
if curious, but no need to read it.

## Issues

- Closes #11229
- Closes #10996
- Closes #8405

## Dependencies

`tree_sitter` and `tree_sitter_languages` on PyPI. We have tried to add
these as optional dependencies.

## Documentation

We have updated the list of supported languages, and also added a
section to `source_code.ipynb` detailing how to add support for
additional languages using our framework.

## Maintainer

- @hwchase17 (previously reviewed
https://github.com/langchain-ai/langchain/pull/6486)

Thanks!!

## Git commits

We will gladly squash any/all of our commits (esp merge commits) if
necessary. Let us know if this is desirable, or if you will be
squash-merging anyway.

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Maaz Hashmi <mhashmi373@gmail.com>
Co-authored-by: LeilaChr <87657694+LeilaChr@users.noreply.github.com>
Co-authored-by: Jeremy La <jeremylai511@gmail.com>
Co-authored-by: Megabear137 <zubair.alnoor27@gmail.com>
Co-authored-by: Lee Harrold <lhharrold@sep.com>
Co-authored-by: Mario928 <88029051+Mario928@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
7 months ago
Abhishek Jain 37e1275f9e
community[patch]: Fixed the 'aembed' method of 'CohereEmbeddings'. (#16497)
**Description:**
- The existing code was trying to find a `.embeddings` property on the
`Coroutine` returned by calling `cohere.async_client.embed`.
- Instead, the `.embeddings` property is present on the value returned
by the `Coroutine`.
- Also, it seems that the original cohere client expects a value of
`max_retries` to not be `None`. Hence, setting the default value of
`max_retries` to `3`.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Sridhar Ramaswamy 9f1cbbc6ed
community[minor]: Add pebblo safe document loader (#16862)
- **Description:** Pebblo opensource project enables developers to
safely load data to their Gen AI apps. It identifies semantic topics and
entities found in the loaded data and summarizes them in a
developer-friendly report.
  - **Dependencies:** none
  - **Twitter handle:** srics

@hwchase17
7 months ago
mhavey 1bbb64d956
community[minor], langchian[minor]: Add Neptune Rdf graph and chain (#16650)
**Description**: This PR adds a chain for Amazon Neptune graph database
RDF format. It complements the existing Neptune Cypher chain. The PR
also includes a Neptune RDF graph class to connect to, introspect, and
query a Neptune RDF graph database from the chain. A sample notebook is
provided under docs that demonstrates the overall effect: invoking the
chain to make natural language queries against Neptune using an LLM.

**Issue**: This is a new feature
 
**Dependencies**: The RDF graph class depends on the AWS boto3 library
if using IAM authentication to connect to the Neptune database.

---------

Co-authored-by: Piyush Jain <piyushjain@duck.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Michael Feil e1cfd0f3e7
community[patch]: infinity embeddings update incorrect default url (#16759)
The default url has always been incorrect (7797 instead 7997). Here is a
update to the correct url.
7 months ago
Andreas Motl 1fdd9bd980
community/SQLDatabase: Generalize and trim software tests (#16659)
- **Description:** Improve test cases for `SQLDatabase` adapter
component, see
[suggestion](https://github.com/langchain-ai/langchain/pull/16655#pullrequestreview-1846749474).
  - **Depends on:** GH-16655
  - **Addressed to:** @baskaryan, @cbornet, @eyurtsev

_Remark: This PR is stacked upon GH-16655, so that one will need to go
in first._

Edit: Thank you for bringing in GH-17191, @eyurtsev. This is a little
aftermath, improving/streamlining the corresponding test cases.
7 months ago
Theo / Taeyoon Kang 1987f905ed
core[patch]: Support .yml extension for YAML (#16783)
- **Description:**

[AS-IS] When dealing with a yaml file, the extension must be .yaml.  

[TO-BE] In the absence of extension length constraints in the OS, the
extension of the YAML file is yaml, but control over the yml extension
must still be made.

It's as if it's an error because it's a .jpg extension in jpeg support.

  - **Issue:** - 

  - **Dependencies:**
no dependencies required for this change,
7 months ago
Kapil Sachdeva cd00a87db7
community[patch] - in FAISS vector store, support passing custom DocStore implementation when using from_xxx methods (#16801)
- **Description:** The from__xx methods of FAISS class have hardcoded
InMemoryStore implementation and thereby not let users pass a custom
DocStore implementation,
  - **Issue:** no referenced issue,
  - **Dependencies:** none,
  - **Twitter handle:** ksachdeva
7 months ago
Chris f9f5626ca4
community[patch]: Fix github search issues and PRs PaginatedList has no len() error (#16806)
**Description:** 
Bugfix: Langchain_community's GitHub Api wrapper throws a TypeError when
searching for issues and/or PRs (the `search_issues_and_prs` method).
This is because PyGithub's PageinatedList type does not support the
len() method. See https://github.com/PyGithub/PyGithub/issues/1476

![image](https://github.com/langchain-ai/langchain/assets/8849021/57390b11-ed41-4f48-ba50-f3028610789c)
  **Dependencies:** None 
  **Twitter handle**: @ChrisKeoghNZ
  
I haven't registered an issue as it would take me longer to fill the
template out than to make the fix, but I'm happy to if that's deemed
essential.

I've added a simple integration test to cover this as there were no
existing unit tests and it was going to be tricky to set them up.

Co-authored-by: Chris Keogh <chris.keogh@xero.com>
7 months ago
morgana 722aae4fd1
community: add delete method to rocksetdb vectorstore to support recordmanager (#17030)
- **Description:** This adds a delete method so that rocksetdb can be
used with `RecordManager`.
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** `@_morgan_adams_`

---------

Co-authored-by: Rockset API Bot <admin@rockset.io>
7 months ago
yin1991 c454dc36fc
community[proxy]: Enhancement/add proxy support playwrighturlloader 16751 (#16822)
- **Description:** Enhancement/add proxy support playwrighturlloader
16751
- **Issue:** [Enhancement: Add Proxy Support to PlaywrightURLLoader
Class](https://github.com/langchain-ai/langchain/issues/16751)
  - **Dependencies:** 
  - **Twitter handle:** @ootR77013489

---------

Co-authored-by: root <root@ip-172-31-46-160.ap-southeast-1.compute.internal>
Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Lingzhen Chen 30af711c34
community[patch]: update AzureSearch class to work with azure-search-documents=11.4.0 (#15659)
- **Description:** Updates
`libs/community/langchain_community/vectorstores/azuresearch.py` to
support the stable version `azure-search-documents=11.4.0`
- **Issue:** https://github.com/langchain-ai/langchain/issues/14534,
https://github.com/langchain-ai/langchain/issues/15039,
https://github.com/langchain-ai/langchain/issues/15355
  - **Dependencies:** azure-search-documents>=11.4.0

---------

Co-authored-by: Clément Tamines <Skar0@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Robby e135dc70c3
community[patch]: Invoke callback prior to yielding token (#17348)
**Description:** Invoke callback prior to yielding token in stream
method for Ollama.
**Issue:** [Callback for on_llm_new_token should be invoked before the
token is yielded by the model
#16913](https://github.com/langchain-ai/langchain/issues/16913)

Co-authored-by: Robby <h0rv@users.noreply.github.com>
7 months ago
Christophe Bornet ab025507bc
community[patch]: Add async methods to VectorStoreQATool (#16949) 7 months ago
Christophe Bornet fb7552bfcf
Add async methods to InMemoryCache (#17425)
Add async methods to InMemoryCache
7 months ago
yin1991 37ef6ac113
community[patch]: Add Pagination to GitHubIssuesLoader for Efficient GitHub Issues Retrieval (#16934)
- **Description:** Add Pagination to GitHubIssuesLoader for Efficient
GitHub Issues Retrieval
- **Issue:** [the issue # it fixes if
applicable,](https://github.com/langchain-ai/langchain/issues/16864)

---------

Co-authored-by: root <root@ip-172-31-46-160.ap-southeast-1.compute.internal>
Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Robby ece4b43a81
community[patch]: doc loaders mypy fixes (#17368)
**Description:** Fixed `type: ignore`'s for mypy for some
document_loaders.
**Issue:** [Remove "type: ignore" comments #17048
](https://github.com/langchain-ai/langchain/issues/17048)

---------

Co-authored-by: Robby <h0rv@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
7 months ago
Robby 0653aa469a
community[patch]: Invoke callback prior to yielding token (#17346)
**Description:** Invoke callback prior to yielding token in stream
method for watsonx.
**Issue:** [Callback for on_llm_new_token should be invoked before the
token is yielded by the model
#16913](https://github.com/langchain-ai/langchain/issues/16913)

Co-authored-by: Robby <h0rv@users.noreply.github.com>
7 months ago
Bagatur f7e453971d
community[patch]: remove print (#17435) 7 months ago
Spencer Kelly 54fa78c887
community[patch]: fixed vector similarity filtering (#16967)
**Description:** changed filtering so that failed filter doesn't add
document to results. Currently filtering is entirely broken and all
documents are returned whether or not they pass the filter.

fixes issue introduced in
https://github.com/langchain-ai/langchain/pull/16190
7 months ago
Abhijeeth Padarthi 584b647b96
community[minor]: AWS Athena Document Loader (#15625)
- **Description:** Adds the document loader for [AWS
Athena](https://aws.amazon.com/athena/), a serverless and interactive
analytics service.
  - **Dependencies:** Added boto3 as a dependency
7 months ago
david-tempelmann 93da18b667
community[minor]: Add mmr and similarity_score_threshold retrieval to DatabricksVectorSearch (#16829)
- **Description:** This PR adds support for `search_types="mmr"` and
`search_type="similarity_score_threshold"` to retrievers using
`DatabricksVectorSearch`,
  - **Issue:** 
  - **Dependencies:**
  - **Twitter handle:**

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
7 months ago
Massimiliano Pronesti 3894b4d9a5
community: add gpt-4-turbo and gpt-4-0125 costs (#17349)
Ref: https://openai.com/pricing
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
7 months ago
Erick Friis 3a2eb6e12b
infra: add print rule to ruff (#16221)
Added noqa for existing prints. Can slowly remove / will prevent more
being intro'd
8 months ago
Jael Gu c07c0da01a
community[patch]: Fix Milvus add texts when ids=None (#17021)
- **Description:** Fix Milvus add texts when ids=None (auto_id=True)

Signed-off-by: Jael Gu <mengjia.gu@zilliz.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
Quang Hoa 54c1fb3f25
community[patch]: Make some functions work with Milvus (#10695)
**Description**
Make some functions work with Milvus:
1. get_ids: Get primary keys by field in the metadata
2. delete: Delete one or more entities by ids
3. upsert: Update/Insert one or more entities

**Issue**
None
**Dependencies**
None
**Tag maintainer:**
@hwchase17 
**Twitter handle:**
None

---------

Co-authored-by: HoaNQ9 <hoanq.1811@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
8 months ago
kYLe c9999557bf
community[patch]: Modify LLMs/Anyscale work with OpenAI API v1 (#14206)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
- **Description:** 
1. Modify LLMs/Anyscale to work with OAI v1
2. Get rid of openai_ prefixed variables in Chat_model/ChatAnyscale
3. Modify `anyscale_api_base` to `anyscale_base_url` to follow OAI name
convention (reverted)

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
8 months ago
Charlie Marsh 24c0bab57b
infra, multiple: Upgrade configuration for Ruff v0.2.0 (#16905)
## Summary

This PR upgrades LangChain's Ruff configuration in preparation for
Ruff's v0.2.0 release. (The changes are compatible with Ruff v0.1.5,
which LangChain uses today.) Specifically, we're now warning when
linter-only options are specified under `[tool.ruff]` instead of
`[tool.ruff.lint]`.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
Leonid Ganeline 932c52c333
community[patch]: docstrings (#16810)
- added missed docstrings
- formated docstrings to the consistent form
8 months ago
Kononov Pavel 15bc201967
langchain_community: Fix typo bug (#17324)
Problem from #17095

This error wasn't in the v1.4.0
8 months ago
Bagatur 65e97c9b53
infra: mv SQLDatabase tests to community (#17276) 8 months ago
Bagatur 02ef9164b5
langchain[patch]: expose cohere rerank score, add parent doc param (#16887) 8 months ago
Bagatur 35c1bf339d
infra: rm boto3, gcaip from pyproject (#17270) 8 months ago
Alex de5e96b5f9
community[patch]: updated openai prices in mapping (#17009)
- **Description:** there are january prices update for chatgpt
[blog](https://openai.com/blog/new-embedding-models-and-api-updates),
also there are updates on their website on page
[pricing](https://openai.com/pricing)
- **Issue:** N/A

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
Armin Stepanyan 641efcf41c
community: add runtime kwargs to HuggingFacePipeline (#17005)
This PR enables changing the behaviour of huggingface pipeline between
different calls. For example, before this PR there's no way of changing
maximum generation length between different invocations of the chain.
This is desirable in cases, such as when we want to scale the maximum
output size depending on a dynamic prompt size.

Usage example:

```python
from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
hf = HuggingFacePipeline(pipeline=pipe)

hf("Say foo:", pipeline_kwargs={"max_new_tokens": 42})
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
Scott Nath a32798abd7
community: Add you.com utility, update you retriever integration docs (#17014)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

- **Description: changes to you.com files** 
    - general cleanup
- adds community/utilities/you.py, moving bulk of code from retriever ->
utility
    - removes `snippet` as endpoint
    - adds `news` as endpoint
    - adds more tests

<s>**Description: update community MAKE file** 
    - adds `integration_tests`
    - adds `coverage`</s>

- **Issue:** the issue # it fixes if applicable,
- [For New Contributors: Update Integration
Documentation](https://github.com/langchain-ai/langchain/issues/15664#issuecomment-1920099868)
- **Dependencies:** n/a
- **Twitter handle:** @scottnath
- **Mastodon handle:** scottnath@mastodon.social

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
Liang Zhang 7306600e2f
community[patch]: Support SerDe transform functions in Databricks LLM (#16752)
**Description:** Databricks LLM does not support SerDe the
transform_input_fn and transform_output_fn. After saving and loading,
the LLM will be broken. This PR serialize these functions into a hex
string using pickle, and saving the hex string in the yaml file. Using
pickle to serialize a function can be flaky, but this is a simple
workaround that unblocks many use cases. If more sophisticated SerDe is
needed, we can improve it later.

Test:
Added a simple unit test.
I did manual test on Databricks and it works well.
The saved yaml looks like:
```
llm:
      _type: databricks
      cluster_driver_port: null
      cluster_id: null
      databricks_uri: databricks
      endpoint_name: databricks-mixtral-8x7b-instruct
      extra_params: {}
      host: e2-dogfood.staging.cloud.databricks.com
      max_tokens: null
      model_kwargs: null
      n: 1
      stop: null
      task: null
      temperature: 0.0
      transform_input_fn: 80049520000000000000008c085f5f6d61696e5f5f948c0f7472616e73666f726d5f696e7075749493942e
      transform_output_fn: null
```

@baskaryan

```python
from langchain_community.embeddings import DatabricksEmbeddings
from langchain_community.llms import Databricks
from langchain.chains import RetrievalQA
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import FAISS
import mlflow

embeddings = DatabricksEmbeddings(endpoint="databricks-bge-large-en")

def transform_input(**request):
  request["messages"] = [
    {
      "role": "user",
      "content": request["prompt"]
    }
  ]
  del request["prompt"]
  return request

llm = Databricks(endpoint_name="databricks-mixtral-8x7b-instruct", transform_input_fn=transform_input)

persist_dir = "faiss_databricks_embedding"

# Create the vector db, persist the db to a local fs folder
loader = TextLoader("state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
db = FAISS.from_documents(docs, embeddings)
db.save_local(persist_dir)

def load_retriever(persist_directory):
    embeddings = DatabricksEmbeddings(endpoint="databricks-bge-large-en")
    vectorstore = FAISS.load_local(persist_directory, embeddings)
    return vectorstore.as_retriever()

retriever = load_retriever(persist_dir)
retrievalQA = RetrievalQA.from_llm(llm=llm, retriever=retriever)
with mlflow.start_run() as run:
    logged_model = mlflow.langchain.log_model(
        retrievalQA,
        artifact_path="retrieval_qa",
        loader_fn=load_retriever,
        persist_dir=persist_dir,
    )

# Load the retrievalQA chain
loaded_model = mlflow.pyfunc.load_model(logged_model.model_uri)
print(loaded_model.predict([{"query": "What did the president say about Ketanji Brown Jackson"}]))

```
8 months ago
cjpark-data ce22e10c4b
community[patch]: Fix KeyError 'embedding' (MongoDBAtlasVectorSearch) (#17178)
- **Description:**
Embedding field name was hard-coded named "embedding".
So I suggest that change `res["embedding"]` into
`res[self._embedding_key]`.
  - **Issue:** #17177,
- **Twitter handle:**
[@bagcheoljun17](https://twitter.com/bagcheoljun17)
8 months ago
Neli Hateva 9bb5157a3d
langchain[patch], community[patch]: Fixes in the Ontotext GraphDB Graph and QA Chain (#17239)
- **Description:** Fixes in the Ontotext GraphDB Graph and QA Chain
related to the error handling in case of invalid SPARQL queries, for
which `prepareQuery` doesn't throw an exception, but the server returns
400 and the query is indeed invalid
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** @OntotextGraphDB
8 months ago
ByeongUk Choi b88329e9a5
community[patch]: Implement Unique ID Enforcement in FAISS (#17244)
**Description:**
Implemented unique ID validation in the FAISS component to ensure all
document IDs are distinct. This update resolves issues related to
non-unique IDs, such as inconsistent behavior during deletion processes.
8 months ago
Bassem Yacoube 4e3ed7f043
community[patch]: octoai embeddings bug fix (#17216)
fixes a bug in octoa_embeddings provider
8 months ago
Eugene Yurtsev 780e84ae79
community[minor]: SQLDatabase Add fetch mode `cursor`, query parameters, query by selectable, expose execution options, and documentation (#17191)
- **Description:** Improve `SQLDatabase` adapter component to promote
code re-use, see
[suggestion](https://github.com/langchain-ai/langchain/pull/16246#pullrequestreview-1846590962).
  - **Needed by:** GH-16246
  - **Addressed to:** @baskaryan, @cbornet 

## Details
- Add `cursor` fetch mode
- Accept SQL query parameters
- Accept both `str` and SQLAlchemy selectables as query expression
- Expose `execution_options`
- Documentation page (notebook) about `SQLDatabase` [^1]
See [About
SQLDatabase](https://github.com/langchain-ai/langchain/blob/c1c7b763/docs/docs/integrations/tools/sql_database.ipynb).

[^1]: Apparently there hasn't been any yet?

---------

Co-authored-by: Andreas Motl <andreas.motl@crate.io>
8 months ago
Tomaz Bratanic 7e4b676d53
community[patch]: Better error propagation for neo4jgraph (#17190)
There are other errors that could happen when refreshing the schema, so
we want to propagate specific errors for more clarity
8 months ago
Luiz Ferreira 34d2daffb3
community[patch]: Fix chat openai unit test (#17124)
- **Description:** 
Actually the test named `test_openai_apredict` isn't testing the
apredict method from ChatOpenAI.
  - **Twitter handle:**
  https://twitter.com/OAlmofadas
8 months ago
Dmitry Kankalovich f92738a6f6
langchain[minor], community[minor], core[minor]: Async Cache support and AsyncRedisCache (#15817)
* This PR adds async methods to the LLM cache. 
* Adds an implementation using Redis called AsyncRedisCache.
* Adds a docker compose file at the /docker to help spin up docker
* Updates redis tests to use a context manager so flushing always happens by default
8 months ago
Bagatur 6f1403b9b6
community[patch]: Release 0.0.19 (#17207)
Co-authored-by: Erick Friis <erick@langchain.dev>
8 months ago
Bagatur af74301ab9
core[patch], community[patch]: link extraction continue on failure (#17200) 8 months ago
Erick Friis 22b6a03a28
infra: read min versions (#17135) 8 months ago
Bagatur 226f376d59
community[patch]: Release 0.0.18 (#17129)
Co-authored-by: Erick Friis <erick@langchain.dev>
8 months ago
Frank ef082c77b1
community[minor]: add github file loader to load any github file content b… (#15305)
### Description
support load any github file content based on file extension.  

Why not use [git
loader](https://python.langchain.com/docs/integrations/document_loaders/git#load-existing-repository-from-disk)
?
git loader clones the whole repo even only interested part of files,
that's too heavy. This GithubFileLoader only downloads that you are
interested files.

### Twitter handle
my twitter: @shufanhaotop

---------

Co-authored-by: Hao Fan <h_fan@apple.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
Ryan Kraus f027696b5f
community: Added new Utility runnables for NVIDIA Riva. (#15966)
**Please tag this issue with `nvidia_genai`**

- **Description:** Added new Runnables for integration NVIDIA Riva into
LCEL chains for Automatic Speech Recognition (ASR) and Text To Speech
(TTS).
- **Issue:** N/A
- **Dependencies:** To use these runnables, the NVIDIA Riva client
libraries are required. It they are not installed, an error will be
raised instructing how to install them. The Runnables can be safely
imported without the riva client libraries.
- **Twitter handle:** N/A

All of the Riva Runnables are inside a single folder in the Utilities
module. In this folder are four files:
- common.py - Contains all code that is common to both TTS and ASR
- stream.py - Contains a class representing an audio stream that allows
the end user to put data into the stream like a queue.
- asr.py - Contains the RivaASR runnable
- tts.py - Contains the RivaTTS runnable

The following Python function is an example of creating a chain that
makes use of both of these Runnables:

```python
def create(
    config: Configuration,
    audio_encoding: RivaAudioEncoding,
    sample_rate: int,
    audio_channels: int = 1,
) -> Runnable[ASRInputType, TTSOutputType]:
    """Create a new instance of the chain."""
    _LOGGER.info("Instantiating the chain.")

    # create the riva asr client
    riva_asr = RivaASR(
        url=str(config.riva_asr.service.url),
        ssl_cert=config.riva_asr.service.ssl_cert,
        encoding=audio_encoding,
        audio_channel_count=audio_channels,
        sample_rate_hertz=sample_rate,
        profanity_filter=config.riva_asr.profanity_filter,
        enable_automatic_punctuation=config.riva_asr.enable_automatic_punctuation,
        language_code=config.riva_asr.language_code,
    )

    # create the prompt template
    prompt = PromptTemplate.from_template("{user_input}")

    # model = ChatOpenAI()
    model = ChatNVIDIA(model="mixtral_8x7b")  # type: ignore

    # create the riva tts client
    riva_tts = RivaTTS(
        url=str(config.riva_asr.service.url),
        ssl_cert=config.riva_asr.service.ssl_cert,
        output_directory=config.riva_tts.output_directory,
        language_code=config.riva_tts.language_code,
        voice_name=config.riva_tts.voice_name,
    )

    # construct and return the chain
    return {"user_input": riva_asr} | prompt | model | riva_tts  # type: ignore
```

The following code is an example of creating a new audio stream for
Riva:

```python
input_stream = AudioStream(maxsize=1000)
# Send bytes into the stream
for chunk in audio_chunks:
    await input_stream.aput(chunk)
input_stream.close()
```

The following code is an example of how to execute the chain with
RivaASR and RivaTTS

```python
output_stream = asyncio.Queue()
while not input_stream.complete:
    async for chunk in chain.astream(input_stream):
        output_stream.put(chunk)    
```

Everything should be async safe and thread safe. Audio data can be put
into the input stream while the chain is running without interruptions.

---------

Co-authored-by: Hayden Wolff <hwolff@nvidia.com>
Co-authored-by: Hayden Wolff <hwolff@Haydens-Laptop.local>
Co-authored-by: Hayden Wolff <haydenwolff99@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
8 months ago
François Paupier 929f071513
community[patch]: Fix error in `LlamaCpp` community LLM with Configurable Fields, 'grammar' custom type not available (#16995)
- **Description:** Ensure the `LlamaGrammar` custom type is always
available when instantiating a `LlamaCpp` LLM
  - **Issue:** #16994 
  - **Dependencies:** None
  - **Twitter handle:** @fpaupier

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
Scott Nath 10bd901139
infra: add integration_tests and coverage to MAKEFILE (#17053)
- **Description: update community MAKE file** 
    - adds `integration_tests`
    - adds `coverage`

- **Issue:** the issue # it fixes if applicable,
    - moving out of https://github.com/langchain-ai/langchain/pull/17014
- **Dependencies:** n/a
- **Twitter handle:** @scottnath
- **Mastodon handle:** scottnath@mastodon.social

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
Bagatur 6e2ed9671f
infra: fix breebs test lint (#17075) 8 months ago
Alex Boury 334b6ebdf3
community[minor]: Breebs docs retriever (#16578)
- **Description:** Implementation of breeb retriever with integration
tests ->
libs/community/tests/integration_tests/retrievers/test_breebs.py and
documentation (notebook) ->
docs/docs/integrations/retrievers/breebs.ipynb.
  - **Dependencies:** None
8 months ago
Serena Ruan 9b279ac127
community[patch]: MLflow callback update (#16687)
Signed-off-by: Serena Ruan <serena.rxy@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
8 months ago
Mohammad Mohtashim 3c4b24b69a
community[patch]: Fix the _call of HuggingFaceHub (#16891)
Fixed the following identified issue: #16849

@baskaryan
8 months ago
Tyler Titsworth 304f3f5fc1
community[patch]: Add Progress bar to HuggingFaceEmbeddings (#16758)
- **Description:** Adds a function parameter to HuggingFaceEmbeddings
called `show_progress` that enables a `tqdm` progress bar if enabled.
Does not function if `multi_process = True`.
  - **Issue:** n/a
  - **Dependencies:** n/a
8 months ago
Supreet Takkar ae33979813
community[patch]: Allow adding ARNs as model_id to support Amazon Bedrock custom models (#16800)
- **Description:** Adds an additional class variable to `BedrockBase`
called `provider` that allows sending a model provider such as amazon,
cohere, ai21, etc.
Up until now, the model provider is extracted from the `model_id` using
the first part before the `.`, such as `amazon` for
`amazon.titan-text-express-v1` (see [supported list of Bedrock model IDs
here](https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids-arns.html)).
But for custom Bedrock models where the ARN of the provisioned
throughput must be supplied, the `model_id` is like
`arn:aws:bedrock:...` so the `model_id` cannot be extracted from this. A
model `provider` is required by the LangChain Bedrock class to perform
model-based processing. To allow the same processing to be performed for
custom-models of a specific base model type, passing this `provider`
argument can help solve the issues.
The alternative considered here was the use of
`provider.arn:aws:bedrock:...` which then requires ARN to be extracted
and passed separately when invoking the model. The proposed solution
here is simpler and also does not cause issues for current models
already using the Bedrock class.
  - **Issue:** N/A
  - **Dependencies:** N/A

---------

Co-authored-by: Piyush Jain <piyushjain@duck.com>
8 months ago
Bagatur 66e45e8ab7
community[patch]: chat model mypy fixes (#17061)
Related to #17048
8 months ago
Bagatur d93de71d08
community[patch]: chat message history mypy fixes (#17059)
Related to #17048
8 months ago
Bagatur af5ae24af2
community[patch]: callbacks mypy fixes (#17058)
Related to #17048
8 months ago
Bagatur e7b3290d30
community[patch]: fix agent_toolkits mypy (#17050)
Related to #17048
8 months ago
Erick Friis 6ffd5b15bc
pinecone: init pkg (#16556)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
8 months ago
Harrison Chase 4eda647fdd
infra: add -p to mkdir in lint steps (#17013)
Previously, if this did not find a mypy cache then it wouldnt run

this makes it always run

adding mypy ignore comments with existing uncaught issues to unblock other prs

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
8 months ago
Christophe Bornet 2ef69fe11b
Add async methods to BaseChatMessageHistory and BaseMemory (#16728)
Adds:
   * async methods to BaseChatMessageHistory
   * async methods to ChatMessageHistory
   * async methods to BaseMemory
   * async methods to BaseChatMemory
   * async methods to ConversationBufferMemory
   * tests of ConversationBufferMemory's async methods

  **Twitter handle:** cbornet_
8 months ago
Killinsun - Ryota Takeuchi bcfce146d8
community[patch]: Correct the calling to collection_name in qdrant (#16920)
## Description

In #16608, the calling `collection_name` was wrong.
I made a fix for it. 
Sorry for the inconvenience!

## Issue

https://github.com/langchain-ai/langchain/issues/16962

## Dependencies

N/A



<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Kumar Shivendu <kshivendu1@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
8 months ago
Erick Friis b1a847366c
community: revert SQL Stores (#16912)
This reverts commit cfc225ecb3.


https://github.com/langchain-ai/langchain/pull/15909#issuecomment-1922418097

These will have existed in langchain-community 0.0.16 and 0.0.17.
8 months ago