Fix Pinecone cosine relevance score (#8920)

Fixes: #8207

Description:
Pinecone returns similarity scores (not distances) when the index uses
the cosine metric. According to the docs the values lie in [-1, 1],
although I could never reproduce negative values.

This PR ensures that the score returned from Pinecone is preserved,
rather than inverted, so the most relevant documents can be filtered
(e.g. when using similarity thresholds).
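
To illustrate the intent, here is a minimal standalone sketch of the rescaling and of a threshold filter built on it. It does not call Pinecone or LangChain; `filter_by_threshold` and the sample `(doc, score)` pairs are hypothetical, and only the `(score + 1) / 2` mapping comes from this PR.

```python
from typing import List, Tuple


def cosine_relevance_score(score: float) -> float:
    """Map a cosine similarity in [-1, 1] onto a relevance score in [0, 1].

    Same formula as the _cosine_relevance_score_fn added in this PR.
    Higher input stays higher output, so the ordering is preserved.
    """
    return (score + 1) / 2


def filter_by_threshold(
    results: List[Tuple[str, float]], threshold: float
) -> List[Tuple[str, float]]:
    """Hypothetical helper: keep documents whose relevance meets the threshold."""
    rescaled = [(doc, cosine_relevance_score(raw)) for doc, raw in results]
    return [(doc, rel) for doc, rel in rescaled if rel >= threshold]


# Made-up raw cosine scores, as a vector store might return them.
raw_results = [("doc-a", 0.75), ("doc-b", 0.0), ("doc-c", -0.5)]
kept = filter_by_threshold(raw_results, threshold=0.8)
```

With the score preserved, a higher threshold keeps only the closest matches. If the score were treated as a distance and inverted, the same threshold would keep the least similar documents instead.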

I'll leave this as a draft PR as I couldn't run the tests (my Pinecone
account might not be enough - some errors were being thrown around
namespaces), so hopefully someone who _can_ will pick this up.

Maintainers:
@rlancemartin, @eyurtsev

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Rui Ramos 2023-11-13 19:47:38 +00:00 committed by GitHub
parent 2e42ed5de6
commit ff19a62afc

@@ -250,6 +250,11 @@ class Pinecone(VectorStore):
                 "(dot product), or euclidean"
             )
 
+    @staticmethod
+    def _cosine_relevance_score_fn(score: float) -> float:
+        """Pinecone returns cosine similarity scores between [-1,1]"""
+        return (score + 1) / 2
+
     def max_marginal_relevance_search_by_vector(
         self,
         embedding: List[float],