Commit Graph

28 Commits (9ac302cb97cbf9392ab8c37309873b069e277176)

Author SHA1 Message Date
ccurme 42257b120f
partners: fix numpy dep (#22858)
Following https://github.com/langchain-ai/langchain/pull/22813, which
added python 3.12 to CI, here we update numpy accordingly in partner
packages.
3 months ago
Bagatur efcb04f84b
mongodb[patch]: Release 0.1.6 (#22501) 4 months ago
Joydeep Banik Roy 3796672c67
community, milvus, pinecone, qdrant, mongo: Broadcast operation failure while using simsimd beyond v3.7.7 (#22271)
- [ ] **Packages affected**: 
  - community: fix `cosine_similarity` to support simsimd beyond 3.7.7
- partners/milvus: fix `cosine_similarity` to support simsimd beyond
3.7.7
- partners/mongodb: fix `cosine_similarity` to support simsimd beyond
3.7.7
- partners/pinecone: fix `cosine_similarity` to support simsimd beyond
3.7.7
- partners/qdrant: fix `cosine_similarity` to support simsimd beyond
3.7.7


- [ ] **Broadcast operation failure while using simsimd beyond v3.7.7**:
- **Description:** I was using simsimd 4.3.1 and the unsupported operand
type issue popped up. When I checked out the repo and ran the tests,
they failed as well (have attached a screenshot for that). Looks like it
is a variant of https://github.com/langchain-ai/langchain/issues/18022 .
Prior to 3.7.7, simd.cdist returned an ndarray but now it returns
simsimd.DistancesTensor which is ineligible for a broadcast operation
with numpy. With this change, it also remove the need to explicitly cast
`Z` to numpy array
    - **Issue:** #19905
    - **Dependencies:** No
    - **Twitter handle:** https://x.com/GetzJoydeep

<img width="1622" alt="Screenshot 2024-05-29 at 2 50 00 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/fb27b383-a9ae-4a6f-b355-6d503b72db56">

- [ ] **Considerations**: 
1. I started with community but since similar changes were there in
Milvus, MongoDB, Pinecone, and QDrant so I modified their files as well.
If touching multiple packages in one PR is not the norm, then I can
remove them from this PR and raise separate ones
2. I have run and verified that the tests work. Since, only MongoDB had
tests, I ran theirs and verified it works as well. Screenshots attached
:
<img width="1573" alt="Screenshot 2024-05-29 at 2 52 13 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/ce87d1ea-19b6-4900-9384-61fbc1a30de9">
<img width="1614" alt="Screenshot 2024-05-29 at 3 33 51 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/6ce1d679-db4c-4291-8453-01028ab2dca5">
  

I have added a test for simsimd. I feel it may not go well with the
CI/CD setup as installing simsimd is not a dependency requirement. I
have just imported simsimd to ensure simsimd cosine similarity is
invoked. However, its not a good approach. Suggestions are welcome and I
can make the required changes on the PR. Please provide guidance on the
same as I am new to the community.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
4 months ago
Bagatur 50186da0a1
infra: rm unused # noqa violations (#22049)
Updating #21137
4 months ago
Erick Friis aca98fd150
multiple: releases with relaxed core dep (#21724) 4 months ago
Jib f369495fa0
mongodb: [performance] Increase DEFAULT_INSERT_BATCH_SIZE to 100,000 and introduce sizing constraints (#19608) 4 months ago
Erick Friis 9973547aef
mongodb: release 0.1.4 (#21678) 4 months ago
Jib a97473c846
mongodb[patch]: Make ObjectId JSON-serializable on generation (#21394) 4 months ago
Erick Friis c77d2f2b06
multiple: core 0.2 nonbreaking dep, check_diff community->langchain dep (#21646)
0.2 is not a breaking release for core (but it is for langchain and
community)

To keep the core+langchain+community packages in sync at 0.2, we will
relax deps throughout the ecosystem to tolerate `langchain-core` 0.2
4 months ago
Erick Friis 91a2ea5cd6
chroma, mongodb: fix docstrings (#21629) 4 months ago
Charlie Marsh fd94aa8366
partner[patch]: Upgrade to Ruff v0.4.2 (#21108)
## Summary

No new diagnostics (given that the set of enabled rules hasn't changed),
but gains access to our new parser (much faster) and reduced false
positives all around.
5 months ago
William FH b7b62e29fb
community[patch], mongodb[patch]: Stop spamming SIMD import warnings (#19531)
If you use an embedding dist function in an eval loop, you get warned
every time. Would prefer to just check once and forget about it.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
6 months ago
Ash Vardanian d01bad5169
core[patch]: Convert SimSIMD back to NumPy (#19473)
This patch fixes the #18022 issue, converting the SimSIMD internal
zero-copy outputs to NumPy.

I've also noticed, that oftentimes `dtype=np.float32` conversion is used
before passing to SimSIMD. Which numeric types do LangChain users
generally care about? We support `float64`, `float32`, `float16`, and
`int8` for cosine distances and `float16` seems reasonable for
practically any kind of embeddings and any modern piece of hardware, so
we can change that part as well 🤗
6 months ago
Jib f8078e41e5
mongodb[patch]: Added scoring threshold to caching (#19286)
## Description
Semantic Cache can retrieve noisy information if the score threshold for
the value is too low. Adding the ability to set a `score_threshold` on
cache construction can allow for less noisy scores to appear.


- [x] **Add tests and docs**
  1. Added tests that confirm the `score_threshold` query is valid.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
6 months ago
Jib 516cc44b3f
langchain-mongodb: [test-fix] add explicit index_name setting on test vector creation (#19245)
- **Description:** Tests fail to do value lookup because it does not
specify the index name
  - **Issue:** the issue # Failing integration test
 

- [x] **Add tests and docs**: Tests now pass


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
6 months ago
Erick Friis 6fa1438334
mongodb[patch]: release 0.1.2 (#19243) 6 months ago
Jib ec026004cb
mongodb[patch]: Remove in-memory cache from cache abstractions (#18987)
## Description
* In memory cache easily gets out of sync with the server cache, so we
will remove it entirely to reduce the issues around invalidated caches.

## Dependencies
None

- [x]  If you're adding a new integration, please include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Co-authored-by: Erick Friis <erick@langchain.dev>
6 months ago
Jib 866d6408af
mongodb[patch]: Remove embedding retrieval from mongodb payload (#19035)
## Description
Returning the embedding is not necessary in the vector search
functionality unless specified as a debugging step. This change defaults
the behavior such that the server _only_ returns the embedding key if
explicitly requested, such as in the case of
`max_marginal_relevance_search`.


- [x] **Add tests and docs**: If you're adding a new integration, please
include
* Added `test_from_documents_no_embedding_return`


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
6 months ago
Erick Friis 93ef8ead0b
mongodb[patch]: fix core dep (#18926) 6 months ago
Jib d60e93b6ae
langchain-mongodb: Standardize mongodb collection/index names in tests (#18755)
## **Description:**
MongoDB integration tests link to a provided Atlas Cluster. We have very
stringent permissions set against the cluster provided. In order to make
it easier to track and isolate the collections each test gets run
against, we've updated the collection names to map the test file name.
i.e. `langchain_{filename}` => `langchain_test_vectorstores`

Fixes integration test results

![image](https://github.com/langchain-ai/langchain/assets/2887713/41f911b9-55f7-4fe4-9134-5514b82009f9)

## **Dependencies:** 
Provided MONGODB_ATLAS_URI

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

cc: @shaneharvey, @blink1073 , @NoahStapp , @caseyclements
6 months ago
Erick Friis 2619420df1
mongodb[patch]: release 0.1.1 (#18692) 7 months ago
Jib 8bc347c5fc
mongodb[patch]: include LLM caches in toplevel library import (#18601) 7 months ago
Jib fc35262356
langchain-mongodb: add unit tests for MongoDBChatMessageHistory (#18599)
## Description
Adding in Unit Test variation for `MongoDBChatMessageHistory` package
Follow-up to #18590 

- [x] **Add tests and docs**: Unit test is what's being added
  

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
7 months ago
Jib 9da1e0cf34
mongodb[patch]: Migrate MongoDBChatMessageHistory (#18590)
## **Description** 
Migrate the `MongoDBChatMessageHistory` to the managed
`langchain-mongodb` partner-package
## **Dependencies**
None
## **Twitter handle**
@mongodb

## **tests and docs**
- [x] Migrate existing integration test
- [x ]~ Convert existing integration test to a unit test~ Creation is
out of scope for this ticket
- [x ] ~Considering delaying work until #17470 merges to leverage the
`MockCollection` object. ~
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
7 months ago
Jib f92f7d2e03
mongodb[minor]: Add MongoDB LLM Cache (#17470)
# Description

- **Description:** Adding MongoDB LLM Caching Layer abstraction
- **Issue:** N/A
- **Dependencies:** None
- **Twitter handle:** @mongodb

Checklist:

- [x] PR title: Please title your PR "package: description", where
"package" is whichever of langchain, community, core, experimental, etc.
is being modified. Use "docs: ..." for purely docs changes, "templates:
..." for template changes, "infra: ..." for CI changes.
  - Example: "community: add foobar LLM"
- [x] PR Message (above)
- [x] Pass lint and test: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified to check that you're
passing lint and testing. See contribution guidelines for more
information on how to write/run tests, lint, etc:
https://python.langchain.com/docs/contributing/
- [ ] Add tests and docs: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @efriis, @eyurtsev, @hwchase17.

---------

Co-authored-by: Jib <jib@byblack.us>
7 months ago
Jib c2b1abe91b
mongodb[patch]: Set delete_many only if count_documents is not 0 (#18402)
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Remove the assert statement on the `count_documents`
in setup_class. It should just delete if there are documents present
    - **Issue:** the issue # Crashes on class setup
    - **Dependencies:** None
    - **Twitter handle:** @mongodb


- [x] **Add tests and docs**: If you're adding a new integration, please
include
  1. N/A


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

Co-authored-by: Jib <jib@byblack.us>
7 months ago
Erick Friis 7bbff98dc7
mongodb[patch]: core 0.1.5 dep (#18348) 7 months ago
Jib 72bfc1d3db
mongodb[minor]: MongoDB Partner Package -- Porting MongoDBAtlasVectorSearch (#17652)
This PR migrates the existing MongoDBAtlasVectorSearch abstraction from
the `langchain_community` section to the partners package section of the
codebase.
- [x] Run the partner package script as advised in the partner-packages
documentation.
- [x] Add Unit Tests
- [x] Migrate Integration Tests
- [x] Refactor `MongoDBAtlasVectorStore` (autogenerated) to
`MongoDBAtlasVectorSearch`
- [x] ~Remove~ deprecate the old `langchain_community` VectorStore
references.

## Additional Callouts
- Implemented the `delete` method
- Included any missing async function implementations
  - `amax_marginal_relevance_search_by_vector`
  - `adelete` 
- Added new Unit Tests that test for functionality of
`MongoDBVectorSearch` methods
- Removed [`del
res[self._embedding_key]`](e0c81e1cb0/libs/community/langchain_community/vectorstores/mongodb_atlas.py (L218))
in `_similarity_search_with_score` function as it would make the
`maximal_marginal_relevance` function fail otherwise. The `Document`
needs to store the embedding key in metadata to work.

Checklist:

- [x] PR title: Please title your PR "package: description", where
"package" is whichever of langchain, community, core, experimental, etc.
is being modified. Use "docs: ..." for purely docs changes, "templates:
..." for template changes, "infra: ..." for CI changes.
  - Example: "community: add foobar LLM"
- [x] PR message
- [x] Pass lint and test: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified to check that you're
passing lint and testing. See contribution guidelines for more
information on how to write/run tests, lint, etc:
https://python.langchain.com/docs/contributing/
- [x] Add tests and docs: If you're adding a new integration, please
include
1. Existing tests supplied in docs/docs do not change. Updated
docstrings for new functions like `delete`
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory. (This already exists)

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Steven Silvester <steven.silvester@ieee.org>
Co-authored-by: Erick Friis <erick@langchain.dev>
7 months ago