Es knn index search 5346 (#5569)

# Create elastic_vector_search.ElasticKnnSearch class

This extends `langchain/vectorstores/elastic_vector_search.py` by adding a new class `ElasticKnnSearch`.

Features:
- Allow creating an index with the `dense_vector` mapping compatible with kNN search
- Store embeddings in the index for use with kNN search (the correct mapping creates the HNSW data structure)
- Perform approximate kNN search
- Perform hybrid BM25 (`query{}`) + kNN (`knn{}`) search
- Perform kNN search by either providing a `query_vector` or passing a hosted `model_id` to use `query_vector_builder` to automatically generate a query vector at search time

Connection options:
- Using `cloud_id` from Elastic Cloud
- Passing an Elasticsearch client object

Search options:
- query
- k
- query_vector
- model_id
- size
- source
- knn_boost (hybrid search)
- query_boost (hybrid search)
- fields

This also adds examples to `docs/modules/indexes/vectorstores/examples/elasticsearch.ipynb`

Fixes # [5346](https://github.com/hwchase17/langchain/issues/5346)

cc: @dev2049

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
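
The two query modes named in the description (a supplied `query_vector`, or a hosted `model_id` resolved through `query_vector_builder`) and the hybrid boosts can be sketched without a running cluster. This is a minimal illustration that follows the logic of `_default_knn_query` and `knn_hybrid_search` in the diff below; the function names `build_knn_clause` and `build_hybrid_request` are hypothetical and are not part of the commit.

```python
from typing import Any, Dict, List, Optional


def build_knn_clause(
    query_vector: Optional[List[float]] = None,
    query: Optional[str] = None,
    model_id: Optional[str] = None,
    field: str = "vector",
    k: int = 10,
    num_candidates: int = 10,
) -> Dict[str, Any]:
    """Hypothetical helper: build the `knn` section of a search request."""
    knn: Dict[str, Any] = {"field": field, "k": k, "num_candidates": num_candidates}

    if query_vector and not model_id:
        # Mode 1: the caller already has an embedding for the query.
        knn["query_vector"] = query_vector
    elif query and model_id:
        # Mode 2: Elasticsearch embeds the query text with a hosted model.
        knn["query_vector_builder"] = {
            "text_embedding": {"model_id": model_id, "model_text": query}
        }
    else:
        raise ValueError("Provide `query_vector`, or `query` together with `model_id`.")
    return knn


def build_hybrid_request(
    query: str,
    knn: Dict[str, Any],
    knn_boost: float = 0.9,
    query_boost: float = 0.1,
) -> Dict[str, Any]:
    """Hypothetical helper: combine a BM25 `match` query with a boosted kNN clause."""
    boosted_knn = dict(knn, boost=knn_boost)  # copy, so the input clause is untouched
    match = {"match": {"text": {"query": query, "boost": query_boost}}}
    return {"query": match, "knn": boosted_knn}


if __name__ == "__main__":
    clause = build_knn_clause(query_vector=[0.1, 0.2], k=2)
    print(build_hybrid_request("Hello", clause))
```

The resulting dictionaries correspond to the `query` and `knn` keyword arguments that the diff passes to `client.search`.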
1 year ago · d1f65d8dc1
parent 8b3df18bcc
commit d1f65d8dc1
3 changed files with 830 additions and 236 deletions
--- a/.gitignore
+++ b/.gitignore
@ -150,3 +150,6 @@ wandb/
 # integration test artifacts
 data_map*
 \[('_type', 'fake'), ('stop', None)]
+
+# Replit files
+*replit*
--- a/docs/modules/indexes/vectorstores/examples/elasticsearch.ipynb
+++ b/docs/modules/indexes/vectorstores/examples/elasticsearch.ipynb
@ -3,7 +3,9 @@
    {
      "cell_type": "markdown",
      "id": "683953b3",
-   "metadata": {},
+      "metadata": {
+        "id": "683953b3"
+      },
      "source": [
        "# ElasticSearch\n",
        "\n",
@ -12,11 +14,22 @@
        "This notebook shows how to use functionality related to the `Elasticsearch` database."
      ]
    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "# ElasticVectorSearch class"
+      ],
+      "metadata": {
+        "id": "tKSYjyTBtSLc"
+      },
+      "id": "tKSYjyTBtSLc"
+    },
    {
      "cell_type": "markdown",
      "id": "b66c12b2-2a07-4136-ac77-ce1c9fa7a409",
      "metadata": {
-    "tags": []
+        "tags": [],
+        "id": "b66c12b2-2a07-4136-ac77-ce1c9fa7a409"
      },
      "source": [
        "## Installation"
@ -25,7 +38,9 @@
    {
      "cell_type": "markdown",
      "id": "81f43794-f002-477c-9b68-4975df30e718",
-   "metadata": {},
+      "metadata": {
+        "id": "81f43794-f002-477c-9b68-4975df30e718"
+      },
      "source": [
        "Check out [Elasticsearch installation instructions](https://www.elastic.co/guide/en/elasticsearch/reference/current/install-elasticsearch.html).\n",
        "\n",
@ -89,7 +104,8 @@
      "execution_count": null,
      "id": "d6197931-cbe5-460c-a5e6-b5eedb83887c",
      "metadata": {
-    "tags": []
+        "tags": [],
+        "id": "d6197931-cbe5-460c-a5e6-b5eedb83887c"
      },
      "outputs": [],
      "source": [
@ -98,10 +114,12 @@
    },
    {
      "cell_type": "code",
-   "execution_count": 3,
+      "execution_count": null,
      "id": "67ab8afa-f7c6-4fbf-b596-cb512da949da",
      "metadata": {
-    "tags": []
+        "tags": [],
+        "id": "67ab8afa-f7c6-4fbf-b596-cb512da949da",
+        "outputId": "fd16b37f-cb76-40a9-b83f-eab58dd0d912"
      },
      "outputs": [
        {
@ -123,7 +141,8 @@
      "cell_type": "markdown",
      "id": "f6030187-0bd7-4798-8372-a265036af5e0",
      "metadata": {
-    "tags": []
+        "tags": [],
+        "id": "f6030187-0bd7-4798-8372-a265036af5e0"
      },
      "source": [
        "## Example"
@ -131,10 +150,11 @@
    },
    {
      "cell_type": "code",
-   "execution_count": 4,
+      "execution_count": null,
      "id": "aac9563e",
      "metadata": {
-    "tags": []
+        "tags": [],
+        "id": "aac9563e"
      },
      "outputs": [],
      "source": [
@ -146,10 +166,11 @@
    },
    {
      "cell_type": "code",
-   "execution_count": 5,
+      "execution_count": null,
      "id": "a3c3999a",
      "metadata": {
-    "tags": []
+        "tags": [],
+        "id": "a3c3999a"
      },
      "outputs": [],
      "source": [
@ -167,7 +188,8 @@
      "execution_count": null,
      "id": "12eb86d8",
      "metadata": {
-    "tags": []
+        "tags": [],
+        "id": "12eb86d8"
      },
      "outputs": [],
      "source": [
@ -179,9 +201,12 @@
    },
    {
      "cell_type": "code",
-   "execution_count": 7,
+      "execution_count": null,
      "id": "4b172de8",
-   "metadata": {},
+      "metadata": {
+        "id": "4b172de8",
+        "outputId": "ca05a209-4514-4b5c-f6cb-2348f58c19a2"
+      },
      "outputs": [
        {
          "name": "stdout",
@ -205,13 +230,327 @@
        "print(docs[0].page_content)"
      ]
    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "# ElasticKnnSearch Class\n",
+        "The `ElasticKnnSearch` class stores vectors and documents in Elasticsearch for use with approximate [kNN search](https://www.elastic.co/guide/en/elasticsearch/reference/current/knn-search.html)."
+      ],
+      "metadata": {
+        "id": "FheGPztJsrRB"
+      },
+      "id": "FheGPztJsrRB"
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "!pip install langchain elasticsearch"
+      ],
+      "metadata": {
+        "id": "gRVcbh5zqCJQ"
+      },
+      "execution_count": null,
+      "outputs": [],
+      "id": "gRVcbh5zqCJQ"
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "from langchain.vectorstores.elastic_vector_search import ElasticKnnSearch\n",
+        "from langchain.embeddings import ElasticsearchEmbeddings\n",
+        "from elasticsearch import Elasticsearch"
+      ],
+      "metadata": {
+        "id": "TJtqiw5AqBp8"
+      },
+      "execution_count": null,
+      "outputs": [],
+      "id": "TJtqiw5AqBp8"
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# Initialize ElasticsearchEmbeddings\n",
+        "model_id = \"<model_id_from_es>\" \n",
+        "dims = dim_count  # placeholder: the embedding dimension of your model\n",
+        "es_cloud_id = \"ESS_CLOUD_ID\"\n",
+        "es_user = \"es_user\"\n",
+        "es_password = \"es_pass\"\n",
+        "test_index = \"<index_name>\"\n",
+        "#input_field = \"your_input_field\" # if different from 'text_field'"
+      ],
+      "metadata": {
+        "id": "XHfC0As6qN3T"
+      },
+      "execution_count": null,
+      "outputs": [],
+      "id": "XHfC0As6qN3T"
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# Generate embedding object\n",
+        "embeddings = ElasticsearchEmbeddings.from_credentials(\n",
+        "    model_id,\n",
+        "    #input_field=input_field,\n",
+        "    es_cloud_id=es_cloud_id,\n",
+        "    es_user=es_user,\n",
+        "    es_password=es_password,\n",
+        ")"
+      ],
+      "metadata": {
+        "id": "UkTipx1lqc3h"
+      },
+      "execution_count": null,
+      "outputs": [],
+      "id": "UkTipx1lqc3h"
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# Initialize ElasticKnnSearch\n",
+        "knn_search = ElasticKnnSearch(\n",
+        "\tes_cloud_id=es_cloud_id, \n",
+        "\tes_user=es_user, \n",
+        "\tes_password=es_password, \n",
+        "\tindex_name= test_index, \n",
+        "\tembedding= embeddings\n",
+        ")"
+      ],
+      "metadata": {
+        "id": "74psgD0oqjYK"
+      },
+      "execution_count": null,
+      "outputs": [],
+      "id": "74psgD0oqjYK"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "## Test adding vectors"
+      ],
+      "metadata": {
+        "id": "7AfgIKLWqnQl"
+      },
+      "id": "7AfgIKLWqnQl"
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# Test `add_texts` method\n",
+        "texts = [\"Hello, world!\", \"Machine learning is fun.\", \"I love Python.\"]\n",
+        "knn_search.add_texts(texts)\n",
+        "\n",
+        "# Test `from_texts` method\n",
+        "new_texts = [\"This is a new text.\", \"Elasticsearch is powerful.\", \"Python is great for data analysis.\"]\n",
+        "knn_search.from_texts(new_texts, dims=dims)"
+      ],
+      "metadata": {
+        "id": "yNUUIaL9qmze"
+      },
+      "execution_count": null,
+      "outputs": [],
+      "id": "yNUUIaL9qmze"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "## Test knn search using query vector builder "
+      ],
+      "metadata": {
+        "id": "0zdR-Iubquov"
+      },
+      "id": "0zdR-Iubquov"
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# Test `knn_search` method with model_id and query_text\n",
+        "query = \"Hello\"\n",
+        "knn_result = knn_search.knn_search(query = query, model_id= model_id, k=2)\n",
+        "print(f\"kNN search results for query '{query}': {knn_result}\")\n",
+        "print(f\"The 'text' field value from the top hit is: '{knn_result['hits']['hits'][0]['_source']['text']}'\")\n",
+        "\n",
+        "# Test `knn_hybrid_search` method\n",
+        "query = \"Hello\"\n",
+        "hybrid_result = knn_search.knn_hybrid_search(query = query, model_id= model_id, k=2)\n",
+        "print(f\"Hybrid search results for query '{query}': {hybrid_result}\")\n",
+        "print(f\"The 'text' field value from the top hit is: '{hybrid_result['hits']['hits'][0]['_source']['text']}'\")"
+      ],
+      "metadata": {
+        "id": "bwR4jYvqqxTo"
+      },
+      "execution_count": null,
+      "outputs": [],
+      "id": "bwR4jYvqqxTo"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "## Test knn search using pre-generated vector\n"
+      ],
+      "metadata": {
+        "id": "ltXYqp0qqz7R"
+      },
+      "id": "ltXYqp0qqz7R"
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# Generate embedding for tests\n",
+        "query_text = 'Hello'\n",
+        "query_embedding = embeddings.embed_query(query_text)\n",
+        "print(f\"Length of embedding: {len(query_embedding)}\\nFirst two items in embedding: {query_embedding[:2]}\")\n",
+        "\n",
+        "# Test knn Search\n",
+        "knn_result = knn_search.knn_search(query_vector = query_embedding, k=2)\n",
+        "print(f\"The 'text' field value from the top hit is: '{knn_result['hits']['hits'][0]['_source']['text']}'\")\n",
+        "\n",
+        "# Test hybrid search - Requires both query_text and query_vector\n",
+        "knn_result = knn_search.knn_hybrid_search(query_vector = query_embedding, query=query_text, k=2)\n",
+        "print(f\"The 'text' field value from the top hit is: '{knn_result['hits']['hits'][0]['_source']['text']}'\")"
+      ],
+      "metadata": {
+        "id": "O5COtpTqq23t"
+      },
+      "execution_count": null,
+      "outputs": [],
+      "id": "O5COtpTqq23t"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "## Test source option"
+      ],
+      "metadata": {
+        "id": "0dnmimcJq42C"
+      },
+      "id": "0dnmimcJq42C"
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# Test `knn_search` method with model_id and query_text\n",
+        "query = \"Hello\"\n",
+        "knn_result = knn_search.knn_search(query = query, model_id= model_id, k=2, source=False)\n",
+        "assert not '_source' in knn_result['hits']['hits'][0].keys()\n",
+        "\n",
+        "# Test `knn_hybrid_search` method\n",
+        "query = \"Hello\"\n",
+        "hybrid_result = knn_search.knn_hybrid_search(query = query, model_id= model_id, k=2, source=False)\n",
+        "assert not '_source' in hybrid_result['hits']['hits'][0].keys()"
+      ],
+      "metadata": {
+        "id": "v4_B72nHq7g1"
+      },
+      "execution_count": null,
+      "outputs": [],
+      "id": "v4_B72nHq7g1"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "## Test fields option "
+      ],
+      "metadata": {
+        "id": "teHgJgrlq-Jb"
+      },
+      "id": "teHgJgrlq-Jb"
+    },
    {
      "cell_type": "code",
+      "source": [
+        "# Test `knn_search` method with model_id and query_text\n",
+        "query = \"Hello\"\n",
+        "knn_result = knn_search.knn_search(query = query, model_id= model_id, k=2, fields=['text'])\n",
+        "assert 'text' in knn_result['hits']['hits'][0]['fields'].keys()\n",
+        "\n",
+        "# Test `knn_hybrid_search` method\n",
+        "query = \"Hello\"\n",
+        "hybrid_result = knn_search.knn_hybrid_search(query = query, model_id= model_id, k=2, fields=['text'])\n",
+        "assert 'text' in hybrid_result['hits']['hits'][0]['fields'].keys()"
+      ],
+      "metadata": {
+        "id": "utNBbpZYrAYW"
+      },
      "execution_count": null,
-   "id": "a359ed74",
-   "metadata": {},
      "outputs": [],
-   "source": []
+      "id": "utNBbpZYrAYW"
+    },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "### Test with es client connection rather than cloud_id "
+      ],
+      "metadata": {
+        "id": "hddsIFferBy1"
+      },
+      "id": "hddsIFferBy1"
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# Create Elasticsearch connection\n",
+        "es_connection = Elasticsearch(\n",
+        "    hosts=['https://es_cluster_url:port'], \n",
+        "    basic_auth=('user', 'password')\n",
+        ")"
+      ],
+      "metadata": {
+        "id": "bXqrUnoirFia"
+      },
+      "execution_count": null,
+      "outputs": [],
+      "id": "bXqrUnoirFia"
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# Instantiate ElasticsearchEmbeddings using es_connection\n",
+        "embeddings = ElasticsearchEmbeddings.from_es_connection(\n",
+        "    model_id,\n",
+        "    es_connection,\n",
+        ")"
+      ],
+      "metadata": {
+        "id": "TIM__Hm8rSEW"
+      },
+      "execution_count": null,
+      "outputs": [],
+      "id": "TIM__Hm8rSEW"
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# Initialize ElasticKnnSearch\n",
+        "knn_search = ElasticKnnSearch(\n",
+        "\tes_connection = es_connection,\n",
+        "\tindex_name= test_index, \n",
+        "\tembedding= embeddings\n",
+        ")"
+      ],
+      "metadata": {
+        "id": "1-CdnOrArVc_"
+      },
+      "execution_count": null,
+      "outputs": [],
+      "id": "1-CdnOrArVc_"
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# Test `knn_search` method with model_id and query_text\n",
+        "query = \"Hello\"\n",
+        "knn_result = knn_search.knn_search(query = query, model_id= model_id, k=2)\n",
+        "print(f\"kNN search results for query '{query}': {knn_result}\")\n",
+        "print(f\"The 'text' field value from the top hit is: '{knn_result['hits']['hits'][0]['_source']['text']}'\")\n"
+      ],
+      "metadata": {
+        "id": "0kgyaL6QrYVF"
+      },
+      "execution_count": null,
+      "outputs": [],
+      "id": "0kgyaL6QrYVF"
    }
  ],
  "metadata": {
@ -231,6 +570,9 @@
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.10.6"
+    },
+    "colab": {
+      "provenance": []
    }
  },
  "nbformat": 4,
--- a/langchain/vectorstores/elastic_vector_search.py
+++ b/langchain/vectorstores/elastic_vector_search.py
@ -3,13 +3,26 @@ from __future__ import annotations

 import uuid
 from abc import ABC
-from typing import Any, Dict, Iterable, List, Optional, Tuple
+from typing import (
+    TYPE_CHECKING,
+    Any,
+    Dict,
+    Iterable,
+    List,
+    Mapping,
+    Optional,
+    Tuple,
+    Union,
+)

 from langchain.docstore.document import Document
 from langchain.embeddings.base import Embeddings
 from langchain.utils import get_from_env
 from langchain.vectorstores.base import VectorStore

+if TYPE_CHECKING:
+    from elasticsearch import Elasticsearch
+

 def _default_text_mapping(dim: int) -> Dict:
    return {
@ -304,3 +317,239 @@ class ElasticVectorSearch(VectorStore, ABC):
                index=index_name, body={"query": script_query, "size": size}
            )
        return response
+
+
+class ElasticKnnSearch(ElasticVectorSearch):
+    """
+    A class for performing k-Nearest Neighbors (k-NN) search on an Elasticsearch index.
+    The class is designed for a text search scenario where documents are text strings
+    and their embeddings are vector representations of those strings.
+    """
+
+    def __init__(
+        self,
+        index_name: str,
+        embedding: Embeddings,
+        es_connection: Optional["Elasticsearch"] = None,
+        es_cloud_id: Optional[str] = None,
+        es_user: Optional[str] = None,
+        es_password: Optional[str] = None,
+    ):
+        """
+        Initializes an instance of the ElasticKnnSearch class and sets up the
+            Elasticsearch client.
+
+        Args:
+            index_name: The name of the Elasticsearch index.
+            embedding: An instance of the Embeddings class, used to generate vector
+                representations of text strings.
+            es_connection: An existing Elasticsearch connection.
+            es_cloud_id: The Cloud ID of the Elasticsearch instance. Required if
+                creating a new connection.
+            es_user: The username for the Elasticsearch instance. Required if
+                creating a new connection.
+            es_password: The password for the Elasticsearch instance. Required if
+                creating a new connection.
+        """
+
+        try:
+            import elasticsearch
+        except ImportError:
+            raise ImportError(
+                "Could not import elasticsearch python package. "
+                "Please install it with `pip install elasticsearch`."
+            )
+
+        self.embedding = embedding
+        self.index_name = index_name
+
+        # If a pre-existing Elasticsearch connection is provided, use it.
+        if es_connection is not None:
+            self.client = es_connection
+        else:
+            # If credentials for a new Elasticsearch connection are provided,
+            # create a new connection.
+            if es_cloud_id and es_user and es_password:
+                self.client = elasticsearch.Elasticsearch(
+                    cloud_id=es_cloud_id, basic_auth=(es_user, es_password)
+                )
+            else:
+                raise ValueError(
+                    """Either provide a pre-existing Elasticsearch connection, \
+                or valid credentials for creating a new connection."""
+                )
+
+    @staticmethod
+    def _default_knn_mapping(dims: int) -> Dict:
+        """Generates a default index mapping for kNN search."""
+        return {
+            "properties": {
+                "text": {"type": "text"},
+                "vector": {
+                    "type": "dense_vector",
+                    "dims": dims,
+                    "index": True,
+                    "similarity": "dot_product",
+                },
+            }
+        }
+
+    @staticmethod
+    def _default_knn_query(
+        query_vector: Optional[List[float]] = None,
+        query: Optional[str] = None,
+        model_id: Optional[str] = None,
+        field: Optional[str] = "vector",
+        k: Optional[int] = 10,
+        num_candidates: Optional[int] = 10,
+    ) -> Dict:
+        knn: Dict = {
+            "field": field,
+            "k": k,
+            "num_candidates": num_candidates,
+        }
+
+        # Case 1: `query_vector` is provided, but not `model_id` -> use query_vector
+        if query_vector and not model_id:
+            knn["query_vector"] = query_vector
+
+        # Case 2: `query` and `model_id` are provided, -> use query_vector_builder
+        elif query and model_id:
+            knn["query_vector_builder"] = {
+                "text_embedding": {
+                    "model_id": model_id,  # use 'model_id' argument
+                    "model_text": query,  # use 'query' argument
+                }
+            }
+
+        else:
+            raise ValueError(
+                "Either `query_vector` or `model_id` must be provided, but not both."
+            )
+
+        return knn
+
+    def knn_search(
+        self,
+        query: Optional[str] = None,
+        k: Optional[int] = 10,
+        query_vector: Optional[List[float]] = None,
+        model_id: Optional[str] = None,
+        size: Optional[int] = 10,
+        source: Optional[bool] = True,
+        fields: Optional[
+            Union[List[Mapping[str, Any]], Tuple[Mapping[str, Any], ...], None]
+        ] = None,
+    ) -> Dict:
+        """
+        Performs a k-nearest neighbor (k-NN) search on the Elasticsearch index.
+
+        The search can be conducted using either a raw query vector or a model ID.
+        The method first generates
+        the body of the search query, which can be interpreted by Elasticsearch.
+        It then performs the k-NN
+        search on the Elasticsearch index and returns the results.
+
+        Args:
+            query: The query or queries to be used for the search. Required if
+                `query_vector` is not provided.
+            k: The number of nearest neighbors to return. Defaults to 10.
+            query_vector: The query vector to be used for the search. Required if
+                `query` is not provided.
+            model_id: The ID of the model to use for generating the query vector, if
+                `query` is provided.
+            size: The number of search hits to return. Defaults to 10.
+            source: Whether to include the source of each hit in the results.
+            fields: The fields to include in the source of each hit. If None, all
+                fields are included.
+
+        Returns:
+            The search results.
+
+        Raises:
+            ValueError: If neither `query_vector` nor `model_id` is provided, or if
+                both are provided.
+        """
+
+        knn_query_body = self._default_knn_query(
+            query_vector=query_vector, query=query, model_id=model_id, k=k
+        )
+
+        # Perform the kNN search on the Elasticsearch index and return the results.
+        res = self.client.search(
+            index=self.index_name,
+            knn=knn_query_body,
+            size=size,
+            source=source,
+            fields=fields,
+        )
+        return dict(res)
+
+    def knn_hybrid_search(
+        self,
+        query: Optional[str] = None,
+        k: Optional[int] = 10,
+        query_vector: Optional[List[float]] = None,
+        model_id: Optional[str] = None,
+        size: Optional[int] = 10,
+        source: Optional[bool] = True,
+        knn_boost: Optional[float] = 0.9,
+        query_boost: Optional[float] = 0.1,
+        fields: Optional[
+            Union[List[Mapping[str, Any]], Tuple[Mapping[str, Any], ...], None]
+        ] = None,
+    ) -> Dict[Any, Any]:
+        """Performs a hybrid k-nearest neighbor (k-NN) and text-based search on the
+            Elasticsearch index.
+
+        The search can be conducted using either a raw query vector or a model ID.
+        The method first generates
+        the body of the k-NN search query and the text-based query, which can be
+        interpreted by Elasticsearch.
+        It then performs the hybrid search on the Elasticsearch index and returns the
+        results.
+
+        Args:
+            query: The query or queries to be used for the search. Required if
+                `query_vector` is not provided.
+            k: The number of nearest neighbors to return. Defaults to 10.
+            query_vector: The query vector to be used for the search. Required if
+                `query` is not provided.
+            model_id: The ID of the model to use for generating the query vector, if
+                `query` is provided.
+            size: The number of search hits to return. Defaults to 10.
+            source: Whether to include the source of each hit in the results.
+            knn_boost: The boost factor for the k-NN part of the search.
+            query_boost: The boost factor for the text-based part of the search.
+            fields:
+                The fields to include in the source of each hit. If None, all fields are
+                included. Defaults to None.
+
+        Returns:
+            The search results.
+
+        Raises:
+            ValueError: If neither `query_vector` nor `model_id` is provided, or if
+                both are provided.
+        """
+
+        knn_query_body = self._default_knn_query(
+            query_vector=query_vector, query=query, model_id=model_id, k=k
+        )
+
+        # Modify the knn_query_body to add a "boost" parameter
+        knn_query_body["boost"] = knn_boost
+
+        # Generate the body of the standard Elasticsearch query
+        match_query_body = {"match": {"text": {"query": query, "boost": query_boost}}}
+
+        # Perform the hybrid search on the Elasticsearch index and return the results.
+        res = self.client.search(
+            index=self.index_name,
+            query=match_query_body,
+            knn=knn_query_body,
+            fields=fields,
+            size=size,
+            source=source,
+        )
+        return dict(res)
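
A note on the default mapping above: `_default_knn_mapping` sets `"similarity": "dot_product"`, and Elasticsearch documents that `dot_product` on float vectors expects unit-length vectors. If the embedding model does not already return normalized vectors, they would need to be normalized before indexing and before querying. A minimal sketch of that step follows; `l2_normalize` is a hypothetical helper, not part of this commit, and `knn_mapping` simply reproduces the mapping shape from the diff.

```python
import math
from typing import Any, Dict, List


def knn_mapping(dims: int) -> Dict[str, Any]:
    """Same shape as the default kNN mapping in the diff."""
    return {
        "properties": {
            "text": {"type": "text"},
            "vector": {
                "type": "dense_vector",
                "dims": dims,
                "index": True,
                "similarity": "dot_product",
            },
        }
    }


def l2_normalize(vector: List[float]) -> List[float]:
    """Hypothetical helper: scale a vector to unit (L2) length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("Cannot normalize a zero vector.")
    return [x / norm for x in vector]


if __name__ == "__main__":
    unit = l2_normalize([3.0, 4.0])
    print(unit, knn_mapping(dims=len(unit)))
```

Whether this step is needed depends on the embedding model in use; some models already emit unit-length vectors.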