Commit Graph

956 Commits

Author SHA1 Message Date
Jorge Piedrahita Ortiz
e65652c3e8
community: add SambaNova embeddings integration (#21227)
- **Description:**  SambaNova hosted embeddings integration
2024-05-06 13:29:59 -07:00
Jorge Piedrahita Ortiz
df1c10260c
community: minor changes sambanova integration (#21231)
- **Description:** fix: variable names in root validator not allowing
pass credentials as named parameters in llm instancing, also added
sambanova's sambaverse and sambastudio llms to __init__.py for module
import
2024-05-06 13:28:35 -07:00
Jan Soubusta
d9a61c0fa9
fix: respect table_name argument when calling from_texts (#21252)
valid for from_documents() as well

fixes #21251
2024-05-06 20:28:22 +00:00
Pedro Lima
bebf46c4a2
community: added args_schema to YahooFinanceNewsTool (#21232)
Description: this change adds args_schema (pydantic BaseModel) to
YahooFinanceNewsTool for correct schema formatting on LLM function calls

Issue: currently using YahooFinanceNewsTool with OpenAI function calling
returns the following error "TypeError("YahooFinanceNewsTool._run() got
an unexpected keyword argument '__arg1'")". This happens because the
schema sent to the LLM is "input: "{'__arg1': 'MSFT'}"" while the method
should be called with the "query" parameter.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-05-06 13:27:54 -07:00
Mark Cusack
060987d755
community[minor]: Add indexing via locality sensitive hashing to the Yellowbrick vector store (#20856)
- **Description:** Add LSH-based indexing to the Yellowbrick vector
store module
- **Twitter handle:** @markcusack

---------

Co-authored-by: markcusack <markcusack@markcusacksmac.lan>
Co-authored-by: markcusack <markcusack@Mark-Cusack-sMac.local>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-05-06 20:18:02 +00:00
Rashmi Pawar
a2fdabdad2
mark NemoEmbeddings as deprecated (#21239)
The NemoEmbeddings is deprecated, instead use
langchain-nvidia-ai-endpoints NVIDIAEmbeddings interface.

cc: @mattf

---------

Co-authored-by: Daniel Glogowski <167348611+dglogo@users.noreply.github.com>
Co-authored-by: andyjessen <62343929+andyjessen@users.noreply.github.com>
Co-authored-by: Chris Germann <88305668+TAAGECH9@users.noreply.github.com>
Co-authored-by: gere <gere@kapo.zh.ch>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-05-06 19:44:58 +00:00
Erick Friis
5c000f8d79
community: release 0.0.37 (#21332) 2024-05-06 12:17:42 -07:00
Erick Friis
7ecf9996f1
community: Revert "community: langkit dependency" (#21333)
Reverts langchain-ai/langchain#21174

Hey team - going to revert this because it doesn't seem necessary for
testing. We should only be adding optional + extended_testing
dependencies for deps that have extended tests.

otherwise it just increases probability of dependency conflicts in the
community lockfile.
2024-05-06 18:44:41 +00:00
Param Singh
fee91d43b7
baichuan[patch]:standardize chat init args (#21298)
Thank you for contributing to LangChain!

community:baichuan[patch]: standardize init args

updated `baichuan_api_key` so that aliased to `api_key`. Added test that
it continues to set the same underlying attribute. Test checks for
`SecretStr`

updated `temperature` with Pydantic Field, added unit test. 

Related to https://github.com/langchain-ai/langchain/issues/20085
2024-05-06 18:33:57 +00:00
Christophe Bornet
484a009012
community[minor]: Relax constraints on Cassandra VectorStore constructors (#21209)
If Session and/or keyspace are not provided, they are resolved from
cassio's context. So they are not required.
This change is fully backward compatible.
2024-05-06 14:32:32 -04:00
Leonid Ganeline
6feddfae88
community: langkit dependency (#21174)
Issue: the `langkit` package is not presented in the `pyproject.toml`
but it is a requirement for the `WhyLabsCallbackHandler`
Change: added `langkit`

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-05-06 18:09:31 +00:00
Rohan Aggarwal
8021d2a2ab
community[minor]: Oraclevs integration (#21123)
Thank you for contributing to LangChain!

- Oracle AI Vector Search 
Oracle AI Vector Search is designed for Artificial Intelligence (AI)
workloads that allows you to query data based on semantics, rather than
keywords. One of the biggest benefit of Oracle AI Vector Search is that
semantic search on unstructured data can be combined with relational
search on business data in one single system. This is not only powerful
but also significantly more effective because you don't need to add a
specialized vector database, eliminating the pain of data fragmentation
between multiple systems.


- Oracle AI Vector Search is designed for Artificial Intelligence (AI)
workloads that allows you to query data based on semantics, rather than
keywords. One of the biggest benefit of Oracle AI Vector Search is that
semantic search on unstructured data can be combined with relational
search on business data in one single system. This is not only powerful
but also significantly more effective because you don't need to add a
specialized vector database, eliminating the pain of data fragmentation
between multiple systems.
This Pull Requests Adds the following functionalities
Oracle AI Vector Search : Vector Store
Oracle AI Vector Search : Document Loader
Oracle AI Vector Search : Document Splitter
Oracle AI Vector Search : Summary
Oracle AI Vector Search : Oracle Embeddings


- We have added unit tests and have our own local unit test suite which
verifies all the code is correct. We have made sure to add guides for
each of the components and one end to end guide that shows how the
entire thing runs.


- We have made sure that make format and make lint run clean.

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: skmishraoracle <shailendra.mishra@oracle.com>
Co-authored-by: hroyofc <harichandan.roy@oracle.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-04 03:15:35 +00:00
Leonid Ganeline
9639457222
community[patch]: tools imports (#21156)
Issue: we have several helper functions to import third-party libraries
like tools.gmail.utils.import_google in
[community.tools](https://api.python.langchain.com/en/latest/community_api_reference.html#id37).
And we have core.utils.utils.guard_import that works exactly for this
purpose.
The import_<package> functions work inconsistently and rather be private
functions.
Change: replaced these functions with the guard_import function.

Related to #21133
2024-05-03 17:22:45 -04:00
ccurme
6da3d92b42
(all): update removal in deprecation warnings from 0.2 to 0.3 (#21265)
We are pushing out the removal of these to 0.3.

`find . -type f -name "*.py" -exec sed -i ''
's/removal="0\.2/removal="0.3/g' {} +`
2024-05-03 14:29:36 -04:00
Eugene Yurtsev
0989c48028
langchain[minor]: Re-add deleted ainetwork tool (#21254)
* Adding __init__.py to turn it into a package in community
* Adding proxy imports that assume that langchain_community is optional
2024-05-03 11:39:40 -04:00
Christophe Bornet
2fbe82f5e6
community[minor]: Relax constraints on CassandraChatMessageHistory constructor (#21241) 2024-05-03 10:20:39 -04:00
Christophe Bornet
683fb45c6b
community[patch]: Refactor CassandraDatabase wrapper (#21075)
* Introduce individual `fetch_` methods for easier typing.
* Rework some docstrings to google style
* Move some logic to the tool
* Merge the 2 cassandra utility files
2024-05-02 13:13:08 -04:00
Raghav Dixit
7d451d0041
community[patch]: Update lancedb.py (#21192)
very minor update in LanceDB integration, 'metric' argument was missing.
2024-05-02 17:06:39 +00:00
Eugene Yurtsev
3cd7fced5f
langchain[patch],community[minor]: Migrate memory implementations to community (#20845)
Migrates memory implementations to community
2024-05-02 10:46:50 -04:00
Eugene Yurtsev
c9119b0e75
langchain[patch],community[minor]: Move some unit tests from langchain to community, use core for fake models (#21190) 2024-05-02 09:57:52 -04:00
Tomaz Bratanic
9e53fa7d2e
Some more fixes to neo4j enhanced schema (#21139) 2024-05-01 13:12:43 -07:00
Eugene Yurtsev
44602bdc20
langchain[patch],community[minor]: Move load_tools to community (#21158)
Move load tools to community
2024-05-01 16:05:41 -04:00
Eugene Yurtsev
bec3eee3fa
langchain[patch]: Migrate retrievers to use optional langchain community imports (#21155) 2024-05-01 14:44:44 -04:00
Eugene Yurtsev
0e5bf16d00
langchain[patch]: Migrate document loaders to use optional langchain community imports (#21095) 2024-05-01 11:26:25 -04:00
Harrison Chase
4d1c21d97d
community[patch]: Fix alternative name in deprecation notice for sql_database (#21144) 2024-05-01 10:59:42 -04:00
East Agile
2a6f78a53f
community[minor]: Rememberizer retriever (#20052)
**Description:**
This pull request introduces a new feature for LangChain: the
integration with the Rememberizer API through a custom retriever.
This enables LangChain applications to allow users to load and sync
their data from Dropbox, Google Drive, Slack, their hard drive into a
vector database that LangChain can query. Queries involve sending text
chunks generated within LangChain and retrieving a collection of
semantically relevant user data for inclusion in LLM prompts.
User knowledge dramatically improved AI applications.
The Rememberizer integration will also allow users to access general
purpose vectorized data such as Reddit channel discussions and US
patents.

**Issue:**
N/A

**Dependencies:**
N/A

**Twitter handle:**
https://twitter.com/Rememberizer
2024-05-01 10:41:44 -04:00
Eugene Yurtsev
1ce1a10f2b
langchain[patch],community[minor]: Move graph index creator (#20795)
Move graph index creator to community
2024-05-01 10:04:30 -04:00
Noah
45ddf4d26f
community[patch]: Update comments for lazy_load method (#21063)
- [ ] **PR message**: 
- **Description:** Refactored the lazy_load method to use asynchronous
execution for improved performance. The method now initiates scraping of
all URLs simultaneously using asyncio.gather, enhancing data fetching
efficiency. Each Document object is yielded immediately once its content
becomes available, streamlining the entire process.
    - **Issue:** N/A
- **Dependencies:** Requires the asyncio library for handling
asynchronous tasks, which should already be part of standard Python
libraries in Python 3.7 and above.
    - **Email:** [r73327118@gmail.com](mailto:r73327118@gmail.com)

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-01 01:20:57 -04:00
Ismail Hossain Polas
1fdf63fa6c
community[patch]: update package name to bagelML (#19948)
**Description**
This pull request updates the Bagel Network package name from
"betabageldb" to "bagelML" to align with the latest changes made by the
Bagel Network team.

The following modifications have been made:

- Updated all references to the old package name ("betabageldb") with
the new package name ("bagelML") throughout the codebase.
- Modified the documentation, and any relevant scripts to reflect the
package name change.
- Tested the changes to ensure that the functionality remains intact and
no breaking changes were introduced.

By merging this pull request, our project will stay up to date with the
latest Bagel Network package naming convention, ensuring compatibility
and smooth integration with their updated library.

Please review the changes and provide any feedback or suggestions. Thank
you!
2024-05-01 01:17:33 -04:00
tianzedavid
5a8909440b
docs: remove repetitive words (#21058)
remove repetitive words

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-05-01 01:10:42 +00:00
Tomaz Bratanic
c9e96bb5e2
community[patch]: Fix neo4j enhanced schema bugs (#21072) 2024-04-30 20:16:26 -04:00
MacanPN
0f7f448603
community[patch]: add delete() method to AzureSearch vector store (#21127)
**Issue:**
Currently `AzureSearch` vector store does not implement `delete` method.
This PR implements it. This also makes it compatible with LangChain
indexer.

**Dependencies:**
None

**Twitter handle:**
@martintriska1

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-30 23:46:18 +00:00
Erick Friis
8a62fb0570
community: release 0.0.36 (#21118) 2024-04-30 13:18:44 -07:00
Jamsheed Mistri
3e749369ef
community[minor]: bump version of LayerupSecurity, add support for untrusted_input parameter (#19985)
**Description:** update version of LayerupSecurity package for the
Layerup Security integration. Add untrusted_input parameter.
2024-04-30 14:55:26 -04:00
fubuki8087
f1c3687aa5
community[patch]: Using the right encoding to parse the web page in RecursiveUrlLoader (#20632)
As shown in #13749 , `RecursiveUrlLoader` has encoding issue. This PR is
to solve this.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-30 18:41:36 +00:00
Jakub Pawłowski
b0b1a67771
community[patch]: Skip unexpected 404 HTTP Error in Arxiv download (#21042)
### Description:
When attempting to download PDF files from arXiv, an unexpected 404
error frequently occurs. This error halts the operation, regardless of
whether there are additional documents to process. As a solution, I
suggest implementing a mechanism to ignore and communicate this error
and continue processing the next document from the list.

Proposed Solution: To address the issue of unexpected 404 errors during
PDF downloads from arXiv, I propose implementing the following solution:

- Error Handling: Implement error handling mechanisms to catch and
handle 404 errors gracefully.
- Communication: Inform the user or logging system about the occurrence
of the 404 error.
- Continued Processing: After encountering a 404 error, continue
processing the remaining documents from the list without interruption.

This solution ensures that the application can handle unexpected errors
without terminating the entire operation. It promotes resilience and
robustness in the face of intermittent issues encountered during PDF
downloads from arXiv.

### Issue:
#20909 
### Dependencies:
none

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-30 18:29:22 +00:00
Erick Friis
b9c53e95b7
community: release 0.0.35 (#21104) 2024-04-30 17:48:56 +00:00
Eugene Yurtsev
3c064a757f
core[minor],langchain[patch],community[patch]: Move storage interfaces to core (#20750)
* Move storage interface to core
* Move in memory and file system implementation to core
2024-04-30 13:14:26 -04:00
Charlie Marsh
8f38b7a725
multiple: Remove unnecessary Ruff suppression comments (#21050)
## Summary

I ran `ruff check --extend-select RUF100 -n` to identify `# noqa`
comments that weren't having any effect in Ruff, and then `ruff check
--extend-select RUF100 -n --fix` on select files to remove all of the
unnecessary `# noqa: F401` violations. It's possible that these were
needed at some point in the past, but they're not necessary in Ruff
v0.1.15 (used by LangChain) or in the latest release.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-30 17:13:48 +00:00
Christophe Bornet
5c77f45b06
community[minor]: Add async methods to CassandraCache and CassandraSemanticCache (#20654) 2024-04-30 10:27:44 -04:00
Kuro Denjiro
fa4124b821
community[minor]: add mintbase loader to langchain (#20089)
- [x] **Add Near NFT loader**: "community: Load NFT near block chain
using mintbase graph API"

- [x] **PR message**: 
    - **Description:** a description of the change
    - **Twitter handle:**Kurodenjiro

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-30 04:11:56 +00:00
Alexander Dicke
d7e12750df
community[patch]: allows using text-generation-inference /generate route with HuggingFaceEndpoint (#20100)
- **Description:** allows to use the /generate route of
`text-generation-inference` with the `HuggingFaceEndpoint`
2024-04-29 23:09:55 -04:00
davidkgp
28b0b0d863
community[patch]: Fix for github issue #17690 (#20117)
…/17690

Thank you for contributing to LangChain!

- [x] **Fix Google Lens knowledge graph issue**: "langchain: community"
- Fix for [No "knowledge_graph" property in Google Lens API call from
SerpAPI](https://github.com/langchain-ai/langchain/issues/17690)


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** handled the existence of keys in the json response of
Google Lens
- **Issue:** [No "knowledge_graph" property in Google Lens API call from
SerpAPI](https://github.com/langchain-ai/langchain/issues/17690)



- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/


If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-30 00:10:08 +00:00
高远
a7a4630bf4
community[patch]: Modify the text field type and add new exception handling (#20116)
Co-authored-by: gaoyuan <gaoyuan.20001218@bytedance.com>
2024-04-29 20:06:00 -04:00
Rahul Triptahi
c172611647
community[patch]: Add classifier_url argument in PebbloSafeLoader and documentation update. (#21030)
Description: Add classifier_url argument in PebbloSafeLoader.
Documentation: Updated PebbloSafeLoader documentation with above change
and new links for pebblo github pages.

---------

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
2024-04-29 17:41:09 -04:00
Leonid Ganeline
85094cbb3a
docs: community docstring updates (#21040)
Added missed docstrings. Updated docstrings to consistent format.
2024-04-29 17:40:23 -04:00
Rodrigo Nogueira
90f19028e5
community[patch]: Add maritalk streaming (sync and async) (#19203)
Co-authored-by: RosevalJr <rdmalajr@gmail.com>
Co-authored-by: Roseval Donisete Malaquias Junior <roseval@maritaca.ai>
2024-04-29 21:31:14 +00:00
Cahid Arda Öz
cc6191cb90
community[minor]: Add support for Upstash Vector (#20824)
## Description

Adding `UpstashVectorStore` to utilize [Upstash
Vector](https://upstash.com/docs/vector/overall/getstarted)!

#17012 was opened to add Upstash Vector to langchain but was closed to
wait for filtering. Now filtering is added to Upstash vector and we open
a new PR. Additionally, [embedding
feature](https://upstash.com/docs/vector/features/embeddingmodels) was
added and we add this to our vectorstore aswell.

## Dependencies

[upstash-vector](https://pypi.org/project/upstash-vector/) should be
installed to use `UpstashVectorStore`. Didn't update dependencies
because of [this comment in the previous
PR](https://github.com/langchain-ai/langchain/pull/17012#pullrequestreview-1876522450).

## Tests

Tests are added and they pass. Tests are naturally network bound since
Upstash Vector is offered through an API.

There was [a discussion in the previous PR about mocking the
unittests](https://github.com/langchain-ai/langchain/pull/17012#pullrequestreview-1891820567).
We didn't make changes to this end yet. We can update the tests if you
can explain how the tests should be mocked.

---------

Co-authored-by: ytkimirti <yusuftaha9@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-29 17:25:01 -04:00
chyroc
3e241956d3
community[minor]: add coze chat model (#20770)
add coze chat model, to call coze.com apis
2024-04-29 12:26:16 -04:00
Massimiliano Pronesti
ce89b34fc0
community[patch]: support hybrid search with threshold in Azure AI Search Retriever (#20907)
Support hybrid search with a score threshold -- similar to what we do
for similarity search.
2024-04-29 12:11:44 -04:00