Refactor similarity_search function in elastic_vector_search.py (#2761)

Optimization: limit search results when k < 10
Fix: when k > 10, Elasticsearch returned only 10 docs


Reference: [default-search-result](https://www.elastic.co/guide/en/elasticsearch/reference/current/paginate-search-results.html)
By default, searches return the top 10 matching hits.

Add the size parameter (size=k) to the search request to limit the
number of results returned by Elasticsearch. Remove slicing of the hits
list, since the response will already contain at most k results.
drod 2023-04-14 07:09:00 +02:00 committed by GitHub
parent 1cc7ea333c
commit 9907cb0485


@@ -200,8 +200,8 @@ class ElasticVectorSearch(VectorStore, ABC):
         """
         embedding = self.embedding.embed_query(query)
         script_query = _default_script_query(embedding)
-        response = self.client.search(index=self.index_name, query=script_query)
-        hits = [hit["_source"] for hit in response["hits"]["hits"][:k]]
+        response = self.client.search(index=self.index_name, query=script_query, size=k)
+        hits = [hit["_source"] for hit in response["hits"]["hits"]]
         documents = [
             Document(page_content=hit["text"], metadata=hit["metadata"]) for hit in hits
         ]
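
The behaviour described in the commit message can be illustrated without a running cluster. The following is a minimal sketch, not the project's code: `FakeSearchClient`, `search_with_slice`, `search_with_size`, and the index name `"demo"` are hypothetical names introduced here, and the stub only assumes the documented default of 10 hits when `size` is omitted.

```python
from typing import Any, Dict, List

DEFAULT_SIZE = 10  # assumption: mirrors the documented default of 10 hits


class FakeSearchClient:
    """Hypothetical stand-in for an Elasticsearch client (not a real API)."""

    def __init__(self, docs: List[Dict[str, Any]]) -> None:
        self._docs = docs

    def search(
        self, index: str, query: Dict[str, Any], size: int = DEFAULT_SIZE
    ) -> Dict[str, Any]:
        # Reproduce only the part of the response shape that the diff reads.
        hits = [{"_source": doc} for doc in self._docs[:size]]
        return {"hits": {"hits": hits}}


def search_with_slice(client: FakeSearchClient, k: int) -> List[Dict[str, Any]]:
    # Old approach: rely on the default size, then slice client-side.
    response = client.search(index="demo", query={})
    return [hit["_source"] for hit in response["hits"]["hits"][:k]]


def search_with_size(client: FakeSearchClient, k: int) -> List[Dict[str, Any]]:
    # New approach: ask the server for exactly k hits, no slicing needed.
    response = client.search(index="demo", query={}, size=k)
    return [hit["_source"] for hit in response["hits"]["hits"]]


client = FakeSearchClient([{"text": f"doc {i}", "metadata": {}} for i in range(25)])

print(len(search_with_slice(client, 15)))  # 10: capped by the default size
print(len(search_with_size(client, 15)))   # 15
print(len(search_with_size(client, 3)))    # 3
```

With slicing, a request for k = 15 silently yields 10 documents because the cap is applied before the slice; passing `size=k` removes that cap and also avoids transferring unused hits when k < 10.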