Commit Graph

272 Commits

Author SHA1 Message Date
Rohan Aggarwal
86e8224cf1
community[patch]: Support for old clients (Thin and Thick) Oracle Vector Store (#22766)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
Support for old clients (Thin and Thick) Oracle Vector Store


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
Support for old clients (Thin and Thick) Oracle Vector Store

- [ ] **Add tests and docs**: If you're adding a new integration, please
include
Have our own local tests

---------

Co-authored-by: rohan.aggarwal@oracle.com <rohaagga@phoenix95642.dev3sub2phx.databasede3phx.oraclevcn.com>
2024-06-11 11:36:06 -07:00
Aayush Kataria
71811e0547
community[minor]: Adds a vector store for Azure Cosmos DB for NoSQL (#21676)
This PR add supports for Azure Cosmos DB for NoSQL vector store.

Summary:

Description: added vector store integration for Azure Cosmos DB for
NoSQL Vector Store,
Dependencies: azure-cosmos dependency,
Tag maintainer: @hwchase17, @baskaryan @efriis @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-11 10:34:01 -07:00
Mohammad Mohtashim
36cad5d25c
[Community]: Added Metadata filter support for DocumentDB Vector Store (#22777)
- **Description:** As pointed out in this issue #22770, DocumentDB
`similarity_search` does not support filtering through metadata which
this PR adds by passing in the parameter `filter`. Also this PR fixes a
minor Documentation error.
- **Issue:** #22770

---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-11 16:37:53 +00:00
am-kinetica
ad101adec8
community[patch]: Kinetica Integrations handled error in querying; quotes in table names; updated gpudb API (#22724)
- [ ] **Miscellaneous updates and fixes**: 
- **Description:** Handled error in querying; quotes in table names;
updated gpudb API
- **Issue:** Threw an error with an error message difficult to
understand if a query failed or returned no records
    - **Dependencies:** Updated GPUDB API version to `7.2.0.9`


@baskaryan @hwchase17
2024-06-11 10:01:26 -04:00
Nithish Raghunandanan
f2f0e0e13d
couchbase: Add the initial version of Couchbase partner package (#22087)
Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-07 14:04:08 -07:00
Erick Friis
a24a9c6427
multiple: get rid of pyproject extras (#22581)
They cause `poetry lock` to take a ton of time, and `uv pip install` can
resolve the constraints from these toml files in trivial time
(addressing problem with #19153)

This allows us to properly upgrade lockfile dependencies moving forward,
which revealed some issues that were either fixed or type-ignored (see
file comments)
2024-06-06 15:45:22 -07:00
lucasiscovici
05bf98b2f9
community[patch]: pgvector replace nin_ by not_in (#22619)
- [ ] **community**: "pgvector: replace nin_ by not_in"

- [ ] **PR message**: nin_ do not exist in sqlalchemy orm, it's not_in
2024-06-06 12:17:22 -04:00
Bagatur
584a1e30ac
community[patch]: AzureSearch async functions (#22075) 2024-06-05 14:39:54 -07:00
Stefano Lottini
328d0c99f2
community[minor]: Add support for metadata indexing policy in Cassandra vector store (#22548)
This PR adds a constructor `metadata_indexing` parameter to the
Cassandra vector store to allow optional fine-tuning of which fields of
the metadata are to be indexed.

This is a feature supported by the underlying CassIO library. Indexing
mode of "all", "none" or deny- and allow-list based choices are
available.

The rationale is, in some cases it's advisable to programmatically
exclude some portions of the metadata from the index if one knows in
advance they won't ever be used at search-time. this keeps the index
more lightweight and performant and avoids limitations on the length of
_indexed_ strings.

I added a integration test of the feature. I also added the possibility
of running the integration test with Cassandra on an arbitrary IP
address (e.g. Dockerized), via
`CASSANDRA_CONTACT_POINTS=10.1.1.5,10.1.1.6 poetry run pytest [...]` or
similar.

While I was at it, I added a line to the `.gitignore` since the mypy
_test_ cache was not ignored yet.

My X (Twitter) handle: @rsprrs.
2024-06-05 11:23:26 -04:00
Vincent Min
59bef31997
community[minor]: Improve InMemoryVectorStore with ability to persist to disk and filter on metadata. (#22186)
- **Description:** The InMemoryVectorStore is a nice and simple vector
store implementation for quick development and debugging. The current
implementation is quite limited in its functionalities. This PR extends
the functionalities by adding utility function to persist the vector
store to a json file and to load it from a json file. We choose the json
file format because it allows inspection of the database contents in a
text editor, which is great for debugging. Furthermore, it adds a
`filter` keyword that can be used to filter out documents on their
`page_content` or `metadata`.
- **Issue:** -
- **Dependencies:** -
- **Twitter handle:** @Vincent_Min
2024-06-05 10:40:34 -04:00
Anthony Bernabeu
4e676a63b8
community[minor]: Added filter search for LanceDB (#22461)
- [ ] **community**: "vectorstore: added filtering support for LanceDB
vector store"

- [ ] **This PR adds filtering capabilities to LanceDB**:
- **Description:** In LanceDB filtering can be applied when searching
for data into the vectorstore. It is using the SQL language as mentioned
in the LanceDB documentation.
    - **Issue:** #18235 
    - **Dependencies:** No

- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-06-05 09:33:54 -04:00
Ofer Mendelevitch
ad502e8d50
community[minor]: Vectara Integration Update - Streaming, FCS, Chat, updates to documentation and example notebooks (#21334)
Thank you for contributing to LangChain!

**Description:** update to the Vectara / Langchain integration to
integrate new Vectara capabilities:
- Full RAG implemented as a Runnable with as_rag()
- Vectara chat supported with as_chat()
- Both support streaming response
- Updated documentation and example notebook to reflect all the changes
- Updated Vectara templates

**Twitter handle:** ofermend

**Add tests and docs**: no new tests or docs, but updated both existing
tests and existing docs
2024-06-04 12:57:28 -07:00
Christophe Bornet
9a8fe58ebe
community[minor]: Improve Cassandra VectorStore as_retriever (#22465)
The Vectorstore's API `as_retriever` doesn't expose explicitly the
parameters `search_type` and `search_kwargs` and so these are not well
documented.
This PR improves `as_retriever` for the Cassandra VectorStore by making
these parameters explicit.

NB: An alternative would have been to modify `as_retriever` in
`Vectorstore`. But there's probably a good reason these were not exposed
in the first place ? Is it because implementations may decide to not
support them and have fixed values when creating the
VectorStoreRetriever ?
2024-06-04 09:51:17 -04:00
Fahreddin Özcan
0061ded002
community[patch]: Upstash Vector Store Namespace Support (#22251)
This PR introduces namespace support for Upstash Vector Store, which
would allow users to partition their data in the vector index.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-03 17:30:56 -07:00
Dan
86509161b0
community: fix AzureSearch delete documents (#22315)
**Description**

Fix AzureSearch delete documents method by using FIELDS_ID variable
instead of the hard coded "id" value

**Issue:** 

This is linked to this issue:
https://github.com/langchain-ai/langchain/issues/22314

Co-authored-by: dseban <dan.seban@neoxia.com>
2024-06-03 15:55:06 +00:00
WU LIFU
86698b02a9
doc: fix wrong documentation on FAISS load_local function (#22310)
### Issue: #22299 

### descriptions
The documentation appears to be wrong. When the user actually sets this
parameter "asynchronous" to be True, it fails because the __init__
function of FAISS class doesn't allow this parameter. In fact, most of
the class/instance functions of this class have both the sync/async
version, so it looks like what we need is just to remove this parameter
from the doc.

Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: Lifu Wu <lifu@nextbillion.ai>
2024-05-30 15:15:04 +00:00
ChengZi
404d92ded0
milvus: New langchain_milvus package and new milvus features (#21077)
New features:

- New langchain_milvus package in partner
- Milvus collection hybrid search retriever
- Zilliz cloud pipeline retriever
- Milvus Local guid
- Rag-milvus template

---------

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Signed-off-by: Jael Gu <mengjia.gu@zilliz.com>
Co-authored-by: Jael Gu <mengjia.gu@zilliz.com>
Co-authored-by: Jackson <jacksonxie612@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Erick Friis <erickfriis@gmail.com>
2024-05-28 08:24:20 -07:00
Pavlo Paliychuk
342df7cf83
community[minor]: Add Zep Cloud components + docs + examples (#21671)
Thank you for contributing to LangChain!

- [x] **PR title**: community: Add Zep Cloud components + docs +
examples

- [x] **PR message**: 
We have recently released our new zep-cloud sdks that are compatible
with Zep Cloud (not Zep Open Source). We have also maintained our Cloud
version of langchain components (ChatMessageHistory, VectorStore) as
part of our sdks. This PRs goal is to port these components to langchain
community repo, and close the gap with the existing Zep Open Source
components already present in community repo (added
ZepCloudMemory,ZepCloudVectorStore,ZepCloudRetriever).
Also added a ZepCloudChatMessageHistory components together with an
expression language example ported from our repo. We have left the
original open source components intact on purpose as to not introduce
any breaking changes.
    - **Issue:** -
- **Dependencies:** Added optional dependency of our new cloud sdk
`zep-cloud`
    - **Twitter handle:** @paulpaliychuk51


- [x] **Add tests and docs**


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-05-27 12:50:13 -07:00
Jan Soubusta
cccc8fbe2f
community[patch]: DuckDB VS - expose similarity, improve performance of from_texts (#20971)
3 fixes of DuckDB vector store:
- unify defaults in constructor and from_texts (users no longer have to
specify `vector_key`).
- include search similarity into output metadata (fixes #20969)
- significantly improve performance of `from_documents`

Dependencies: added Pandas to speed up `from_documents`.
I was thinking about CSV and JSON options, but I expect trouble loading
JSON values this way and also CSV and JSON options require storing data
to disk.
Anyway, the poetry file for langchain-community already contains a
dependency on Pandas.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-05-24 15:17:52 -07:00
Ameya Shenoy
8ba492ed6a
community[minor]: clickhouse -- ability to use secure connection (#22108)
- **Description:** this PR gives clickhouse client the ability to use a
secure connection to the clickhosue server
- **Issue:** fixes #22082
- **Dependencies:** -
- **Twitter handle:** `_codingcoffee_`

Signed-off-by: Ameya Shenoy <shenoy.ameya@gmail.com>
Co-authored-by: Shresth Rana <shresth@grapevine.in>
2024-05-24 17:30:22 +00:00
Eugene Yurtsev
2d693c484e
docs: fix some spelling mistakes caught by newest version of code spell (#22090)
Going to merge this even though it doesn't pass all tests, and open a
separate PR for the remaining spelling mistakes.
2024-05-23 16:59:11 -04:00
Pavel Zloi
fe26f937e4
community[minor]: ManticoreSearch engine added to vectorstore (#19117)
**Description:** ManticoreSearch engine added to vectorstores
**Issue:** no issue, just a new feature
**Dependencies:** https://pypi.org/project/manticoresearch-dev/
**Twitter handle:** @EvilFreelancer

- Example notebook with test integration:

https://github.com/EvilFreelancer/langchain/blob/manticore-search-vectorstore/docs/docs/integrations/vectorstores/manticore_search.ipynb

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-23 13:56:18 -07:00
Sky
12d65f17ff
community[patch]: surrealdb provide functions for MMR (Maximal Marginal Relevance) (#21185)
This PR contains 4 added functions:

- max_marginal_relevance_search_by_vector
- amax_marginal_relevance_search_by_vector
- max_marginal_relevance_search
- amax_marginal_relevance_search

I'm no langchain expert, but tried do inspect other vectorstore sources
like chroma, to build these functions for SurrealDB. If someone has some
changes for me, please let me know. Otherwise I would be happy, if these
changes are added to the repository, so that I can use the orignal repo
and not my local monkey patched version.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-22 22:53:55 +00:00
Bagatur
50186da0a1
infra: rm unused # noqa violations (#22049)
Updating #21137
2024-05-22 15:21:08 -07:00
Mohammad Mohtashim
16617dd239
community[patch]: AzureSearchVectorStoreRetriever Fixed to account for search_kwargs (#21572)
- **Description:** Fixed `AzureSearchVectorStoreRetriever` to account
for search_kwargs. More explanation is in the mentioned issue.
- **Issue:** #21492

---------

Co-authored-by: MAC <mac@MACs-MacBook-Pro.local>
Co-authored-by: Massimiliano Pronesti <massimiliano.pronesti@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-22 14:46:41 -07:00
CaroFG
6b98140b38
community[patch]: update for compatibility with Meilisearch v1.8 (#21979)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Updates Meilisearch vectorstore for compatibility
with v1.8. Adds [”showRankingScore”:
true”](https://www.meilisearch.com/docs/reference/api/search#ranking-score)
in the search parameters and replaces `_semanticScore` field with `
_rankingScore`


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-05-22 13:37:01 -07:00
Oleksii Pokotylo
98c0b093bb
community[patch]: Extend AzureSearch with maximal_marginal_relevance, from_embeddings (#21065)
**Description:**
- Extend AzureSearch with `maximal_marginal_relevance` (for vector and
hybrid search)
- Add construction `from_embeddings` - if the user has already embedded
the texts
- Add `add_embeddings` 
- Refactor common parts (`_simple_search`, `_results_to_documents`,
`_reorder_results_with_maximal_marginal_relevance`)
- Add `vector_search_dimensions` as a parameter to the constructor to
avoid extra calls to `embed_query` (most of the time the user applies
the same model and knows the dimension)

**Issue:** none
**Dependencies:** none

- [x] **Add tests and docs**: The docstrings have been added to the new
functions, and unified for the existing ones. The example notebook is
great in illustrating the main usage of AzureSearch, adding the new
methods would only dilute the main content.
- [x] **Lint and test**

---------

Co-authored-by: Oleksii Pokotylo <oleksii.pokotylo@pwc.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-22 13:36:06 -07:00
SaschaStoll
709664a079
community[patch]: Performant filter columns option for Hanavector (#21971)
**Description:** Backwards compatible extension of the initialisation
interface of HanaDB to allow the user to specify
specific_metadata_columns that are used for metadata storage of selected
keys which yields increased filter performance. Any not-mentioned
metadata remains in the general metadata column as part of a JSON
string. Furthermore switched to executemany for batch inserts into
HanaDB.

**Issue:** N/A

**Dependencies:** no new dependencies added

**Twitter handle:** @sapopensource

---------

Co-authored-by: Martin Kolb <martin.kolb@sap.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-22 13:21:21 -07:00
Eugene Yurtsev
36813d2f00
community[patch]: Fix remaining __inits__ in community (#22037)
Fixes the __init__ files in community to use __all__ which is statically
defined.
2024-05-22 17:42:17 +00:00
Bagatur
72d4a8eeed
community[patch]: AzureSearch dont overwrite default async (#21989) 2024-05-21 11:01:28 -07:00
Jesse S
fc79b372cb
community[minor]: add aerospike vectorstore integration (#21735)
Please let me know if you see any possible areas of improvement. I would
very much appreciate your constructive criticism if time allows.

**Description:**
- Added a aerospike vector store integration that utilizes
[Aerospike-Vector-Search](https://aerospike.com/products/vector-database-search-llm/)
add-on.
- Added both unit tests and integration tests
- Added a docker compose file for spinning up a test environment
- Added a notebook

 **Dependencies:** any dependencies required for this change
- aerospike-vector-search

 **Twitter handle:** 
- No twitter, you can use my GitHub handle or LinkedIn if you'd like

Thanks!

---------

Co-authored-by: Jesse Schumacher <jschumacher@aerospike.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-21 01:01:47 +00:00
缨缨
bd39b2ccdf
community: enable SupabaseVectorStore to support extended table fields (#21762)
Thank you for contributing to LangChain!

- [x] **PR title**: "community: enable SupabaseVectorStore to support
extended table fields"

- [x] **PR message**: 
- Added extension fields to the function _add_vectors so that users can
add other custom fields when insert a record into the database. eg:
    

![image](https://github.com/langchain-ai/langchain/assets/10885578/e1d5ca20-936e-4cab-ba69-8fdd23b8ce8f)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-20 16:32:26 -07:00
Tomaz Bratanic
d85e46321a
community[patch]: Better error message for neo4j vector when text is null (#21861) 2024-05-20 10:25:58 -07:00
Sen Lin
eb7f07ae36
community[patch]: fix typo in ValueError message in load_local function (#21818)
**Description:**
Corrected an error in the `allow_dangerous_deserialization` message
within the `load_local` functions
2024-05-17 17:19:04 +00:00
Massimiliano Pronesti
0c0db7c5db
feat(community): support semantic hybrid score threshold in Azure AI Search (#21527)
Support semantic hybrid search with a score threshold -- similar to what
we do for similarity search and for hybrid search (#20907).
2024-05-16 15:54:32 -04:00
Anush
edd68e4ad4
qdrant: init package (#21146)
## Description

This PR introduces the new `langchain-qdrant` partner package, intending
to deprecate the community package.

## Changes

- Moved the Qdrant vector store implementation `/libs/partners/qdrant`
with integration tests.
- The conditional imports of the client library are now regular with
minor implementation improvements.
- Added a deprecation warning to
`langchain_community.vectorstores.qdrant.Qdrant`.
- Replaced references/imports from `langchain_community` with either
`langchain_core` or by moving the definitions to the `langchain_qdrant`
package itself.
- Updated the Qdrant vector store documentation to reflect the changes.

## Testing
- `QDRANT_URL` and
[`QDRANT_API_KEY`](583e36bf6b)
env values need to be set to [run integration
tests](d608c93d1f)
in the [cloud](https://cloud.qdrant.tech).
- If a Qdrant instance is running at `http://localhost:6333`, the
integration tests will use it too.
- By default, tests use an
[`in-memory`](https://github.com/qdrant/qdrant-client?tab=readme-ov-file#local-mode)
instance(Not comprehensive).

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Erick Friis <erickfriis@gmail.com>
2024-05-13 18:20:03 -07:00
Christophe Bornet
66a4da8ad0
community[patch]: Improve Cassandra VectorStore docsctrings (#21620) 2024-05-13 15:24:26 -04:00
Leonid Ganeline
500569da48
community[patch]: vectorstores import update (#21169)
Issue: we have several helper functions to import third-party libraries
like lancedb.import_lancedb in
[community.vectorstores](https://api.python.langchain.com/en/latest/vectorstores/langchain_community.vectorstores.lancedb.import_lancedb.html#langchain_community.vectorstores.lancedb.import_lancedb).
And we have core.utils.utils.guard_import that works exactly for this
purpose.
The import_<package> functions work inconsistently and rather be private
functions.
Change: replaced these functions with the guard_import function.

Related to #21133
2024-05-13 10:45:31 -04:00
Yash
cb31c3611f
Ndb enterprise (#21233)
Description: Adds NeuralDBClientVectorStore to the langchain, which is
our enterprise client.

---------

Co-authored-by: kartikTAI <129414343+kartikTAI@users.noreply.github.com>
Co-authored-by: Kartik Sarangmath <kartik@thirdai.com>
2024-05-08 16:30:58 -07:00
Andreas Motl
17e42bbd18
community[patch]: pgvector: Slight refactoring to make code a bit more reusable (#16243)
- **Description:** Improve [pgvector vector store
adapter](https://github.com/langchain-ai/langchain/blob/v0.1.1/libs/community/langchain_community/vectorstores/pgvector.py)
to make it reusable by adapters deriving from that.
  - **Issue:** NA
  - **Dependencies:** NA
  - **References:** https://github.com/crate-workbench/langchain/pull/1
  - **Addressed to:** @eyurtsev, @cbornet


Hi from the CrateDB team,

first of all, thanks a stack for conceiving and maintaining LangChain.
We are currently [preparing a
patch](https://github.com/crate-workbench/langchain/pull/1) for adding
[CrateDB](https://github.com/crate/crate) to the list of community
adapters.

Because CrateDB aims to be compatible with PostgreSQL to some degree,
the vector store subsystem in LangChain derives functionality from the
corresponding implementation for pgvector.

Therefore, in order to make the implementation more reusable, we needed
to rename the private methods `__from` and `__query_collection` to the
less private counterparts `_from` and `_query_collection`, so they can
be overwritten, in order to unlock other adapters deriving from
[pgvector](https://github.com/langchain-ai/langchain/blob/v0.1.1/libs/community/langchain_community/vectorstores/pgvector.py).

With kind regards,
Andreas.
2024-05-08 17:21:30 -04:00
Shailendra Mishra
aa966b6161
Replaced bind variable in SQL with formatted string for compatibility with sql syntax. (#21439)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-05-08 13:51:30 -07:00
Eugene Yurtsev
6a1d61dbf1
community[patch]: Fix in memory vectorstore to take into account ids when adding docs (#21384)
Should respect `ids` if passed
2024-05-07 15:05:16 -04:00
Jan Soubusta
d9a61c0fa9
fix: respect table_name argument when calling from_texts (#21252)
valid for from_documents() as well

fixes #21251
2024-05-06 20:28:22 +00:00
Mark Cusack
060987d755
community[minor]: Add indexing via locality sensitive hashing to the Yellowbrick vector store (#20856)
- **Description:** Add LSH-based indexing to the Yellowbrick vector
store module
- **Twitter handle:** @markcusack

---------

Co-authored-by: markcusack <markcusack@markcusacksmac.lan>
Co-authored-by: markcusack <markcusack@Mark-Cusack-sMac.local>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-05-06 20:18:02 +00:00
Christophe Bornet
484a009012
community[minor]: Relax constraints on Cassandra VectorStore constructors (#21209)
If Session and/or keyspace are not provided, they are resolved from
cassio's context. So they are not required.
This change is fully backward compatible.
2024-05-06 14:32:32 -04:00
Rohan Aggarwal
8021d2a2ab
community[minor]: Oraclevs integration (#21123)
Thank you for contributing to LangChain!

- Oracle AI Vector Search 
Oracle AI Vector Search is designed for Artificial Intelligence (AI)
workloads that allows you to query data based on semantics, rather than
keywords. One of the biggest benefit of Oracle AI Vector Search is that
semantic search on unstructured data can be combined with relational
search on business data in one single system. This is not only powerful
but also significantly more effective because you don't need to add a
specialized vector database, eliminating the pain of data fragmentation
between multiple systems.


- Oracle AI Vector Search is designed for Artificial Intelligence (AI)
workloads that allows you to query data based on semantics, rather than
keywords. One of the biggest benefit of Oracle AI Vector Search is that
semantic search on unstructured data can be combined with relational
search on business data in one single system. This is not only powerful
but also significantly more effective because you don't need to add a
specialized vector database, eliminating the pain of data fragmentation
between multiple systems.
This Pull Requests Adds the following functionalities
Oracle AI Vector Search : Vector Store
Oracle AI Vector Search : Document Loader
Oracle AI Vector Search : Document Splitter
Oracle AI Vector Search : Summary
Oracle AI Vector Search : Oracle Embeddings


- We have added unit tests and have our own local unit test suite which
verifies all the code is correct. We have made sure to add guides for
each of the components and one end to end guide that shows how the
entire thing runs.


- We have made sure that make format and make lint run clean.

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: skmishraoracle <shailendra.mishra@oracle.com>
Co-authored-by: hroyofc <harichandan.roy@oracle.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-04 03:15:35 +00:00
ccurme
6da3d92b42
(all): update removal in deprecation warnings from 0.2 to 0.3 (#21265)
We are pushing out the removal of these to 0.3.

`find . -type f -name "*.py" -exec sed -i ''
's/removal="0\.2/removal="0.3/g' {} +`
2024-05-03 14:29:36 -04:00
Raghav Dixit
7d451d0041
community[patch]: Update lancedb.py (#21192)
very minor update in LanceDB integration, 'metric' argument was missing.
2024-05-02 17:06:39 +00:00
Ismail Hossain Polas
1fdf63fa6c
community[patch]: update package name to bagelML (#19948)
**Description**
This pull request updates the Bagel Network package name from
"betabageldb" to "bagelML" to align with the latest changes made by the
Bagel Network team.

The following modifications have been made:

- Updated all references to the old package name ("betabageldb") with
the new package name ("bagelML") throughout the codebase.
- Modified the documentation, and any relevant scripts to reflect the
package name change.
- Tested the changes to ensure that the functionality remains intact and
no breaking changes were introduced.

By merging this pull request, our project will stay up to date with the
latest Bagel Network package naming convention, ensuring compatibility
and smooth integration with their updated library.

Please review the changes and provide any feedback or suggestions. Thank
you!
2024-05-01 01:17:33 -04:00
tianzedavid
5a8909440b
docs: remove repetitive words (#21058)
remove repetitive words

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-05-01 01:10:42 +00:00