Commit Graph

7607 Commits

Author SHA1 Message Date
Guangdong Liu
73edf17b4e
community[minor]: Add Apache Doris as vector store (#17527)
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-18 12:05:58 -07:00
Bagatur
a058c8812d
community[patch]: add VoyageEmbeddings truncation (#17638) 2024-02-18 10:21:21 -07:00
Eugene Yurtsev
d7c26c89b2
ci: rename makefile -> Makefile in docker (#17648)
Minor file rename.
2024-02-16 16:59:18 -05:00
Mohammad Mohtashim
8d4547ae97
[Langchain_community]: Corrected the imports to make them compatible with SQLAlchemy <2.0 (#17653)
- Small change in imports in the sql_database module to make it work with
SQLAlchemy <2.0
 - This was identified in the following issue: #17616
2024-02-16 16:59:08 -05:00
Christophe Bornet
75465a2a3c
partners/astradb: Add dotenv to langchain-astradb integration tests (#17629) 2024-02-16 11:48:30 -05:00
Stefano Lottini
2a239710a0
docs: update astradb imports in docs/sample notebook to import from partner package (#17627)
This PR replaces the imports of the Astra DB vector store with the
newly-released partner package, in compliance with the deprecation
notice now attached to the community "legacy" store.
2024-02-16 11:30:13 -05:00
Christophe Bornet
19ebc7418e
community: Use _AstraDBCollectionEnvironment in AstraDB VectorStore (community) (#17635)
Another PR will be done for the langchain-astradb package.

Note: for future PRs, development will be done in the partner package only. This one is just to align with the rest of the components in the community package, and it fixes a bunch of issues.
2024-02-16 11:28:16 -05:00
ccurme
0b33abc8b1
docs: update documentation for RunnableWithMessageHistory (#17602)
- **Description:** Update documentation for RunnableWithMessageHistory
- **Issue:** https://github.com/langchain-ai/langchain/issues/16642

I don't have access to an Anthropic API key so I updated things to use
OpenAI. Let me know if you'd prefer another provider.
2024-02-16 11:25:49 -05:00
Mateusz Szewczyk
e25b722ea9
watsonx[patch]: Invoke callback prior to yielding token when streaming (#17625)
**Description**: Invoke callback prior to yielding token in stream
method for watsonx.
 **Issue**: https://github.com/langchain-ai/langchain/issues/16913
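
The ordering described above can be illustrated with a minimal, library-independent sketch. The names `stream_tokens` and `on_new_token` are hypothetical and are not the watsonx or LangChain API; the point is only that the callback fires before the token is handed to the consumer.

```python
from typing import Callable, Iterable, Iterator


def stream_tokens(
    raw_tokens: Iterable[str],
    on_new_token: Callable[[str], None],
) -> Iterator[str]:
    """Yield tokens, notifying the callback before each token is yielded."""
    for token in raw_tokens:
        # Callback first: it runs even if the consumer stops after this token.
        on_new_token(token)
        yield token


seen = []
gen = stream_tokens(["Hello", " world"], seen.append)
first = next(gen)   # the callback has already recorded "Hello" here
rest = list(gen)
```

If the callback were invoked after the `yield`, a consumer that stopped iterating early would never trigger it for the last token it received.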
2024-02-16 09:45:12 -05:00
Nejc Habjan
b4fa847a90
community[minor]: add exclude parameter to DirectoryLoader (#17316)
- **Description:** adds an `exclude` parameter to the DirectoryLoader
class, based on similar behavior in GenericLoader
- **Issue:** discussed in
https://github.com/langchain-ai/langchain/discussions/9059 and I think
in some other issues that I cannot find at the moment 🙇
  - **Dependencies:** None
  - **Twitter handle:** don't have one sorry! Just https://github/nejch
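
A rough sketch of what an `exclude` filter over a directory walk could look like, using only the standard library. The helper `iter_files` and its matching rules (patterns tested with `fnmatch` against the path relative to the root) are assumptions for illustration and are not the DirectoryLoader implementation.

```python
from fnmatch import fnmatch
from pathlib import Path
from typing import Iterator, Sequence


def iter_files(
    root: str, glob: str = "**/*", exclude: Sequence[str] = ()
) -> Iterator[Path]:
    """Yield files under root matching glob, skipping any that match exclude."""
    root_path = Path(root)
    for path in sorted(root_path.glob(glob)):
        if not path.is_file():
            continue
        # Compare patterns against the POSIX-style path relative to root.
        rel = path.relative_to(root_path).as_posix()
        if any(fnmatch(rel, pattern) for pattern in exclude):
            continue
        yield path
```

For example, `iter_files("docs", exclude=["*.md"])` would walk a hypothetical `docs` directory and skip every Markdown file.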

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-02-16 09:42:42 -05:00
Bagatur
8f14234afb
infra: ignore flakey lua test (#17618) 2024-02-16 05:02:58 -07:00
Krista Pratico
bf8e3c6dd1
community[patch]: add fixes for AzureSearch after update to stable azure-search-documents library (#17599)
- **Description:** Addresses the bugs described in linked issue where an
import was erroneously removed and the rename of a keyword argument was
missed when migrating from beta --> stable of the azure-search-documents
package
- **Issue:** https://github.com/langchain-ai/langchain/issues/17598
- **Dependencies:** N/A
- **Twitter handle:** N/A
2024-02-15 22:23:52 -08:00
William FH
64743dea14
core[patch], community[patch], langchain[patch], experimental[patch], robocorp[patch]: bump LangSmith 0.1.* (#17567) 2024-02-15 23:17:59 -07:00
morgana
9d7ca7df6e
community[patch]: update copy of metadata in rockset vectorstore integration (#17612)
- **Description:** This fixes an issue with working with RecordManager.
RecordManager was generating new hashes on documents because `add_texts`
was modifying the metadata directly. Additionally moved some tests to
unit tests since that was a more appropriate home.
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter handle:** `@_morgan_adams_`
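
The failure mode described above (metadata mutated in place, so later hashes of the same document differ) can be sketched without Rockset. The helper `build_rows` and the `text_key` parameter are hypothetical; the sketch only shows copying each metadata dict before adding fields to it.

```python
import copy
from typing import Any, Dict, List


def build_rows(
    texts: List[str],
    metadatas: List[Dict[str, Any]],
    text_key: str = "text",
) -> List[Dict[str, Any]]:
    """Build rows to insert without modifying the caller's metadata dicts."""
    rows = []
    for text, metadata in zip(texts, metadatas):
        row = copy.deepcopy(metadata)  # work on a copy, never on the original
        row[text_key] = text
        rows.append(row)
    return rows


metadatas = [{"source": "a.txt"}]
rows = build_rows(["hello"], metadatas)
# metadatas is unchanged, so anything hashing the original document
# (text plus metadata) keeps producing the same value.
```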
2024-02-15 23:13:40 -07:00
Erick Friis
c8d96f30bd
exa[patch]: fix lint (#17610) 2024-02-15 20:45:16 -08:00
Erick Friis
8f5c70769d
astradb[patch]: fix core dep 3 (#17617) 2024-02-15 20:42:30 -08:00
Kartheek Yakkala
44db4412c0
ci[minor] : Added graphdb in docker compose for integration tests (#17510)
This PR adds graphdb to the docker compose so it can be used in integration tests.

Co-authored-by: KARTHEEK YAKKALA <kartheekyakkala.se@gmail.com>
2024-02-15 23:03:22 -05:00
Leonid Ganeline
0835ebad70
docs: Fix bug that caused the word "Deprecated" to appear twice in doc-strings (#17615)
The current issue:
Most of the deprecation descriptions are duplicated. For example:
`[Deprecated] Chat Agent.[Deprecated] Chat Agent.` for the [ChatAgent
class](https://api.python.langchain.com/en/latest/langchain_api_reference.html#classes)
description.

NOTE: I've tested it only with new unit tests! I cannot build the API Reference
locally :(
2024-02-15 22:52:26 -05:00
Kevin
88af4fd514
docs: quickstart example returns 404 (#17609)
**Description:** 
A legacy URL in the quickstart appears to return a 404. Updated it to use the
LangChain homepage and ran through the tutorial to confirm results.
2024-02-15 16:50:41 -08:00
Erick Friis
aa31025dd7
astradb[patch]: fix core dep 2 (#17608) 2024-02-15 16:33:02 -08:00
Erick Friis
cc562e7c58
astradb[patch]: fix core dep (#17606) 2024-02-15 16:09:38 -08:00
Stefano Lottini
5240ecab99
astradb: bootstrapping Astra DB as Partner Package (#16875)
**Description:** This PR introduces a new "Astra DB" Partner Package.

So far only the vector store class is _duplicated_ there, all others
following once this is validated and established.

Along with the move to separate package, incidentally, the class name
will change `AstraDB` => `AstraDBVectorStore`.

The strategy has been to duplicate the module (with prospected removal
from community at LangChain 0.2). Until then, the code will be kept in
sync with minimal, known differences (there is a makefile target to
automate drift control. Out of convenience with this check, the
community package has a class `AstraDBVectorStore` aliased to `AstraDB`
at the end of the module).

With this PR several bugfixes and improvements come to the vector store,
as well as a reshuffling of the doc pages/notebooks (Astra and
Cassandra) to align with the move to a separate package.

**Dependencies:** A brand new pyproject.toml in the new package, no
changes otherwise.

**Twitter handle:** `@rsprrs`

---------

Co-authored-by: Christophe Bornet <cbornet@hotmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-02-15 15:50:59 -08:00
Erick Friis
f6f0ca1bae
docs: ai21 sidebars (#17600) 2024-02-15 14:43:48 -08:00
Erick Friis
6cc6faa00e
ai21: init package (#17592)
Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: etang <etang@ai21.com>
Co-authored-by: asafgardin <147075902+asafgardin@users.noreply.github.com>
2024-02-15 12:25:05 -08:00
Moshe Berchansky
20a56fe0a2
community[minor]: Add QuantizedEmbedders (#17391)
**Description:** 
* adding Quantized embedders using optimum-intel and
intel-extension-for-pytorch.
* added mdx documentation and example notebooks 
* added embedding import testing.

**Dependencies:** 
optimum = {extras = ["neural-compressor"], version = "^1.14.0", optional
= true}
intel_extension_for_pytorch = {version = "^2.2.0", optional = true}

Dependencies have been added to pyproject.toml for the community lib.  

**Twitter handle:** @peter_izsak

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-15 11:01:24 -08:00
Amir Karbasi
bccc9241ea
community[patch]: Resolve KuzuQAChain API Changes (#16885)
- **Description:** Updates to the Kuzu API had broken this
functionality. These updates resolve those issues and add a new test to
demonstrate the updates.
- **Issue:** #11874
- **Dependencies:** No new dependencies
- **Twitter handle:** @amirk08


Test results:
```
tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_no_params PASSED                                   [ 33%]
tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_params PASSED                                      [ 66%]
tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_refresh_schema PASSED                                    [100%]

=================================================== slowest 5 durations =================================================== 
0.53s call     tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_refresh_schema
0.34s call     tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_no_params
0.28s call     tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_params
0.03s teardown tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_refresh_schema
0.02s teardown tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_params
==================================================== 3 passed in 1.27s ==================================================== 
```
2024-02-15 10:18:37 -08:00
Rafail Giavrimis
a84a3add25
Community[patch]: Adjusted import to be compatible with SQLAlchemy<2 (#17520)
- **Description:** Adjusts an import to directly import `Result` from
`sqlalchemy.engine`.
- **Issue:** #17519 
- **Dependencies:** N/A
- **Twitter handle:** @grafail
2024-02-15 11:12:13 -05:00
Zachary Toliver
6746adf363
community[patch]: pass bool value for fetch_schema_from_transport in GraphQLAPIWrapper (#17552)
- **Description:** Allow a bool value to be passed to
fetch_schema_from_transport since not all GraphQL instances support this
feature, such as TigerGraph.
- **Threads:** @zacharytoliver

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-15 09:54:04 -05:00
Christophe Bornet
789cd5198d
community[patch]: Use astrapy built-in pagination prefetch in AstraDBLoader (#17569) 2024-02-15 09:52:56 -05:00
Christophe Bornet
387cacb881
community[minor]: Add async methods to AstraDBChatMessageHistory (#17572) 2024-02-15 09:48:42 -05:00
Christophe Bornet
ff1f985a2a
community: Fix some mypy types in cassandra doc loader (#17570)
Thank you!
2024-02-15 09:45:22 -05:00
Mo Latif
f3e4a0e27f
langchain[patch]: Update Chain prep_inputs docstring (#17575)
**Description**: @eyurtsev Following up on #16644 to fix the docstring,
because `prep_inputs` is no longer doing any validation.
2024-02-15 09:44:35 -05:00
William FH
53b8c86309
fix dataset link (#17565) 2024-02-14 23:18:07 -08:00
William FH
fc1617c44f
Update contact link (#17563) 2024-02-14 22:37:32 -08:00
Eugene Yurtsev
79119b4345
Docs: Add repository structure to contributors guide (#17553)
Adding another high level overview page to the contributors guide
2024-02-14 23:20:45 -05:00
Christophe Bornet
ca2d4078f3
community: Add async methods to AstraDBCache (#17415)
Adds async methods to AstraDBCache
2024-02-14 23:10:08 -05:00
Eugene Yurtsev
e438fe6be9
Docs: Contributing changes (#17551)
A few minor changes for contribution:

1) Updating link to say "Contributing" rather than "Developer's guide"
2) Minor changes after going through the contributing documentation
page.
2024-02-14 17:55:09 -05:00
Jan Cap
7ae3ce60d2
community[patch]: Fix pwd import that is not available on windows (#17532)
- **Description:** Resolves a problem in
`langchain_community\document_loaders\pebblo.py` with `import pwd`.
`pwd` is not available on Windows, so the import was moved into a try/except block.
  - **Issue:** #17514
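
A minimal sketch of the guarded-import pattern this fix refers to. The helper `current_username` and its fallbacks are assumptions for illustration and are not taken from `pebblo.py`.

```python
import getpass
import os

try:
    import pwd  # Unix-only module; the import fails on Windows
except ImportError:
    pwd = None


def current_username(default: str = "unknown") -> str:
    """Return the current user's name, using pwd only where it is available."""
    if pwd is not None:
        try:
            return pwd.getpwuid(os.getuid()).pw_name
        except KeyError:
            pass  # uid has no passwd entry (common in some containers)
    try:
        return getpass.getuser()
    except Exception:
        return default
```

With the import guarded, merely importing the module no longer fails on Windows; only the code path that actually needs `pwd` is skipped.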
2024-02-14 13:45:10 -08:00
nvpranak
91bcc9c5c9
community[minor]: Nemo embeddings (#16206)
This PR adds support for NVIDIA NeMo embeddings (issue #16095).

---------

Co-authored-by: Praveen Nakshatrala <pnakshatrala@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-14 13:25:42 -08:00
Mattt394
7c6009b76f
experimental[patch]: Fixed typos in SmartLLMChain ideation and critique prompts (#11507)
Noticed and fixed a few typos in the SmartLLMChain default ideation and
critique prompts

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-02-14 13:20:10 -08:00
Erick Friis
86d3e42853
core[minor]: add name to basemessage (#17539)
Adds an optional name param to our base message to support passing names
into LLMs.

OpenAI supports having a name on anything except tool message now
(system, ai, user/human).
2024-02-14 12:21:59 -08:00
Mateusz Szewczyk
916332ef5b
ibm: added partners package langchain_ibm, added llm (#16512)
- **Description:** Added `langchain_ibm` as a LangChain partner
package for the IBM [watsonx.ai](https://www.ibm.com/products/watsonx-ai) LLM
provider (`WatsonxLLM`)
- **Dependencies:**
[ibm-watsonx-ai](https://pypi.org/project/ibm-watsonx-ai/),
---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-02-14 12:12:19 -08:00
Shawn
f6d3a3546f
community[patch]: document_loaders: modified athena key logic to handle s3 uris without a prefix (#17526)
https://github.com/langchain-ai/langchain/issues/17525

### Example Code

```python
from langchain_community.document_loaders.athena import AthenaLoader

database_name = "database"
s3_output_path = "s3://bucket-no-prefix"
query="""SELECT 
  CAST(extract(hour FROM current_timestamp) AS INTEGER) AS current_hour,
  CAST(extract(minute FROM current_timestamp) AS INTEGER) AS current_minute,
  CAST(extract(second FROM current_timestamp) AS INTEGER) AS current_second;
"""
profile_name = "AdministratorAccess"

loader = AthenaLoader(
    query=query,
    database=database_name,
    s3_output_uri=s3_output_path,
    profile_name=profile_name,
)

documents = loader.load()
print(documents)
```



### Error Message and Stack Trace (if applicable)

NoSuchKey: An error occurred (NoSuchKey) when calling the GetObject
operation: The specified key does not exist

### Description

Athena Loader errors when the result S3 bucket URI has no prefix. Calling the
loader instance results in a "NoSuchKey: An error occurred (NoSuchKey)
when calling the GetObject operation: The specified key does not exist."
error.

If s3_output_path contains a prefix like:

```python
s3_output_path = "s3://bucket-with-prefix/prefix"
```

Execution works without an error.

## Suggested solution

Modify:

```python
key = "/".join(tokens[1:]) + "/" + query_execution_id + ".csv"
```

to

```python
key = "/".join(tokens[1:]) + ("/" if tokens[1:] else "") + query_execution_id + ".csv"
```
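
To see the effect of the suggested change on both kinds of URI, here is a small standalone sketch. The helper `build_result_key` and the way `tokens` is derived from the URI are assumptions for illustration; only the `key = ...` expression is taken from the suggestion above.

```python
from typing import Tuple


def build_result_key(s3_output_uri: str, query_execution_id: str) -> Tuple[str, str]:
    """Split an s3:// URI into (bucket, key) for the query's result CSV."""
    # Assumed tokenization: drop the scheme and any trailing slash, split on "/".
    path = s3_output_uri[len("s3://"):] if s3_output_uri.startswith("s3://") else s3_output_uri
    tokens = path.rstrip("/").split("/")
    bucket = tokens[0]
    # Only add the separator when there actually is a prefix.
    key = "/".join(tokens[1:]) + ("/" if tokens[1:] else "") + query_execution_id + ".csv"
    return bucket, key


print(build_result_key("s3://bucket-no-prefix", "abc123"))
print(build_result_key("s3://bucket-with-prefix/prefix", "abc123"))
```

With the original expression, the no-prefix case would produce a key starting with `/`, which does not match the object Athena writes and is consistent with the `NoSuchKey` error reported above.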


9e8a3fc4ff/libs/community/langchain_community/document_loaders/athena.py (L128)


### System Info


System Information
------------------
> OS:  Darwin
> OS Version: Darwin Kernel Version 22.6.0: Fri Sep 15 13:41:30 PDT
2023; root:xnu-8796.141.3.700.8~1/RELEASE_ARM64_T8103
> Python Version:  3.9.9 (main, Jan  9 2023, 11:42:03) 
[Clang 14.0.0 (clang-1400.0.29.102)]

Package Information
-------------------
> langchain_core: 0.1.23
> langchain: 0.1.7
> langchain_community: 0.0.20
> langsmith: 0.0.87
> langchain_openai: 0.0.6
> langchainhub: 0.1.14

Packages not installed (Not Necessarily a Problem)
--------------------------------------------------
The following packages were not found:

> langgraph
> langserve

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-14 11:48:31 -08:00
wulixuan
c776cfc599
community[minor]: integrate with model Yuan2.0 (#15411)
1. integrate with
[`Yuan2.0`](https://github.com/IEIT-Yuan/Yuan-2.0/blob/main/README-EN.md)
2. update `langchain.llms`
3. add a new doc for [Yuan2.0
integration](docs/docs/integrations/llms/yuan2.ipynb)

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-14 11:46:20 -08:00
Philippe PRADOS
d07db457fc
community[patch]: Fix SQLAlchemyMd5Cache race condition (#16279)
If the SQLAlchemyMd5Cache is shared among multiple processes, it is
possible to encounter a race condition during the cache update.

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-02-14 11:45:28 -08:00
Alex Peplowski
70c296ae96
community[patch]: Expose Anthropic Retry Logic (#17069)
**Description:**

Expose Anthropic's retry logic, so that `max_retries` can be configured
via langchain. Anthropic's retry logic is implemented in their Python
SDK here:
https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#retries

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-14 11:44:28 -08:00
DanisJiang
de9a6cdf16
experimental[patch]: Enhance protection against arbitrary code execution in PALChain (#17091)
- **Description:** Block some ways to trigger arbitrary code execution
bug in PALChain.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-02-14 11:44:07 -08:00
Lyndsey
8562a1e7d4
community[patch]: support query filters for NotionDBLoader (#17217)
- **Description:** Support filtering databases in the use case where
devs do not want to query ALL entries within a DB,
- **Issue:** N/A,
- **Dependencies:** N/A,
- **Twitter handle:** I don't have Twitter but feel free to tag my
Github!

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-02-14 11:43:41 -08:00
volodymyr-memsql
e36bc379f2
community[patch]: Add vector index support to SingleStoreDB VectorStore (#17308)
This pull request introduces support for various Approximate Nearest
Neighbor (ANN) vector index algorithms in the VectorStore class,
starting from version 8.5 of SingleStore DB. Leveraging this enhancement
enables users to harness the power of vector indexing, significantly
boosting search speed, particularly when handling large sets of vectors.

---------

Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-14 11:43:12 -08:00
Kate Silverstein
0bc4a9b3fc
community[minor]: Adds Llamafile as an LLM (#17431)
* **Description:** Adds a simple LLM implementation for interacting with
[llamafile](https://github.com/Mozilla-Ocho/llamafile)-based models.
* **Dependencies:** N/A
* **Issue:** N/A

**Detail**
[llamafile](https://github.com/Mozilla-Ocho/llamafile) lets you run LLMs
locally from a single file on most computers without installing any
dependencies.

To use the llamafile LLM implementation, the user needs to:

1. Download a llamafile e.g.
https://huggingface.co/jartine/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile?download=true
2. Make the file executable.
3. Run the llamafile in 'server mode'. (All llamafiles come packaged
with a lightweight server; by default, the server listens at
`http://localhost:8080`.)


```bash
wget https://url/of/model.llamafile
chmod +x model.llamafile
./model.llamafile --server --nobrowser
```

Now, the user can invoke the LLM via the LangChain client:

```python
from langchain_community.llms.llamafile import Llamafile

llm = Llamafile()

llm.invoke("Tell me a joke.")
```
2024-02-14 11:15:24 -08:00