langchain/libs/partners/pinecone
Joydeep Banik Roy 3796672c67
community, milvus, pinecone, qdrant, mongo: Broadcast operation failure while using simsimd beyond v3.7.7 (#22271)
- [ ] **Packages affected**: 
  - community: fix `cosine_similarity` to support simsimd beyond 3.7.7
- partners/milvus: fix `cosine_similarity` to support simsimd beyond
3.7.7
- partners/mongodb: fix `cosine_similarity` to support simsimd beyond
3.7.7
- partners/pinecone: fix `cosine_similarity` to support simsimd beyond
3.7.7
- partners/qdrant: fix `cosine_similarity` to support simsimd beyond
3.7.7


- [ ] **Broadcast operation failure while using simsimd beyond v3.7.7**:
- **Description:** I was using simsimd 4.3.1 and the unsupported operand
type issue popped up. When I checked out the repo and ran the tests,
they failed as well (have attached a screenshot for that). Looks like it
is a variant of https://github.com/langchain-ai/langchain/issues/18022 .
Prior to 3.7.7, simd.cdist returned an ndarray but now it returns
simsimd.DistancesTensor which is ineligible for a broadcast operation
with numpy. With this change, it also remove the need to explicitly cast
`Z` to numpy array
    - **Issue:** #19905
    - **Dependencies:** No
    - **Twitter handle:** https://x.com/GetzJoydeep

<img width="1622" alt="Screenshot 2024-05-29 at 2 50 00 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/fb27b383-a9ae-4a6f-b355-6d503b72db56">

- [ ] **Considerations**: 
1. I started with community but since similar changes were there in
Milvus, MongoDB, Pinecone, and QDrant so I modified their files as well.
If touching multiple packages in one PR is not the norm, then I can
remove them from this PR and raise separate ones
2. I have run and verified that the tests work. Since, only MongoDB had
tests, I ran theirs and verified it works as well. Screenshots attached
:
<img width="1573" alt="Screenshot 2024-05-29 at 2 52 13 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/ce87d1ea-19b6-4900-9384-61fbc1a30de9">
<img width="1614" alt="Screenshot 2024-05-29 at 3 33 51 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/6ce1d679-db4c-4291-8453-01028ab2dca5">
  

I have added a test for simsimd. I feel it may not go well with the
CI/CD setup as installing simsimd is not a dependency requirement. I
have just imported simsimd to ensure simsimd cosine similarity is
invoked. However, its not a good approach. Suggestions are welcome and I
can make the required changes on the PR. Please provide guidance on the
same as I am new to the community.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-04 17:36:31 +00:00
..
langchain_pinecone community, milvus, pinecone, qdrant, mongo: Broadcast operation failure while using simsimd beyond v3.7.7 (#22271) 2024-06-04 17:36:31 +00:00
scripts infra: add print rule to ruff (#16221) 2024-02-09 16:13:30 -08:00
tests pinecone[patch]: integration test debug (#17960) 2024-02-22 09:11:21 -08:00
.gitignore pinecone: init pkg (#16556) 2024-02-05 11:55:01 -08:00
LICENSE pinecone: init pkg (#16556) 2024-02-05 11:55:01 -08:00
Makefile pinecone[patch], docs: PineconeVectorStore, release 0.0.3 (#17896) 2024-02-22 08:24:08 -08:00
poetry.lock pinecone: bump min core version (#21742) 2024-05-15 19:31:43 -07:00
pyproject.toml pinecone: bump min core version (#21742) 2024-05-15 19:31:43 -07:00
README.md docs: update pinecone README to use PineconeVectorStore (#18170) 2024-03-01 12:12:52 -08:00

langchain-pinecone

This package contains the LangChain integration with Pinecone.

Installation

pip install -U langchain-pinecone

And you should configure credentials by setting the following environment variables:

  • PINECONE_API_KEY
  • PINECONE_INDEX_NAME

Usage

The PineconeVectorStore class exposes the connection to the Pinecone vector store.

from langchain_pinecone import PineconeVectorStore

embeddings = ... # use a LangChain Embeddings class

vectorstore = PineconeVectorStore(embeddings=embeddings)