community[minor]: Adding Azure Cosmos Mongo vCore Vector DB Cache (#16856)

Description: This pull request introduces several enhancements for Azure Cosmos Vector DB, primarily focused on improving caching and search capabilities using Azure Cosmos MongoDB vCore Vector DB. Here's a summary of the changes:

- **AzureCosmosDBSemanticCache**: Added a new cache implementation called AzureCosmosDBSemanticCache, which utilizes Azure Cosmos MongoDB vCore Vector DB for efficient caching of semantic data. Added comprehensive test cases for AzureCosmosDBSemanticCache to ensure its correctness and robustness. These tests cover various scenarios and edge cases to validate the cache's behavior.
- **HNSW Vector Search**: Added HNSW vector search functionality in the CosmosDB Vector Search module. This enhancement enables more efficient and accurate vector searches by utilizing the HNSW (Hierarchical Navigable Small World) algorithm. Added corresponding test cases to validate the HNSW vector search functionality in both AzureCosmosDBSemanticCache and AzureCosmosDBVectorSearch. These tests ensure the correctness and performance of the HNSW search algorithm.
- **LLM Caching Notebook**: The notebook now includes a comprehensive example showcasing the usage of the AzureCosmosDBSemanticCache. This example highlights how the cache can be employed to efficiently store and retrieve semantic data. Additionally, the example provides default values for all parameters used within the AzureCosmosDBSemanticCache, ensuring clarity and ease of understanding for users who are new to the cache implementation.

@hwchase17, @baskaryan, @eyurtsev
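
For readers new to semantic caching, the lookup behaviour described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the AzureCosmosDBSemanticCache implementation: `ToySemanticCache` and `toy_embed` are hypothetical names, the store is an in-memory list rather than a Cosmos DB collection, and the letter-frequency "embedding" stands in for a real embedding model. It shows the core idea: keep one set of entries per `llm_string`, fetch the single nearest cached prompt, and return it only if its similarity score reaches `score_threshold`.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToySemanticCache:
    """Hypothetical in-memory stand-in for a vector-store-backed semantic cache."""

    def __init__(
        self, embed: Callable[[str], List[float]], score_threshold: float = 0.0
    ) -> None:
        self._embed = embed
        self._score_threshold = score_threshold
        # One list of (prompt_vector, cached_value) per llm_string.
        self._entries: Dict[str, List[Tuple[List[float], str]]] = {}

    def update(self, prompt: str, llm_string: str, return_val: str) -> None:
        """Store the value under the embedding of its prompt."""
        entry = (self._embed(prompt), return_val)
        self._entries.setdefault(llm_string, []).append(entry)

    def lookup(self, prompt: str, llm_string: str) -> Optional[str]:
        """Return the nearest cached value (k=1) if it clears the threshold."""
        query = self._embed(prompt)
        best_score, best_value = -1.0, None
        for vector, value in self._entries.get(llm_string, []):
            score = cosine_similarity(query, vector)
            if score > best_score:
                best_score, best_value = score, value
        if best_value is None or best_score < self._score_threshold:
            return None
        return best_value


def toy_embed(text: str) -> List[float]:
    """Hypothetical embedding: a 26-dimensional letter-frequency vector."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


cache = ToySemanticCache(toy_embed, score_threshold=0.9)
cache.update("Tell me a joke", "fake-llm", "Because it was two-tired!")

hit = cache.lookup("Tell me one joke", "fake-llm")  # near-duplicate prompt
miss = cache.lookup("What is HNSW?", "fake-llm")  # unrelated prompt
```

A near-duplicate prompt is served from the cache while an unrelated prompt (or the same prompt under a different `llm_string`) falls through to the model, which is the trade-off the `score_threshold` parameter controls.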
3 months ago · 7c2f3f6f95
parent db47b5deee
commit 7c2f3f6f95
6 changed files with 1503 additions and 122 deletions
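
The index-creation change in `azure_cosmos_db.py` builds a different `createIndexes` command depending on the index kind. The sketch below reproduces that shape as a standalone function so the difference between the two payloads is easy to see. It is an approximation under assumptions: `VectorSearchType` and `build_create_index_command` are hypothetical names, the default `"COS"` similarity string is assumed, and the explicit `ef_construction >= 2 * m` check and the error on unknown kinds are additions based on the docstring, not behaviour taken from this commit.

```python
from enum import Enum
from typing import Any, Dict


class VectorSearchType(str, Enum):
    """Hypothetical stand-in for CosmosDBVectorSearchType."""

    VECTOR_IVF = "vector-ivf"
    VECTOR_HNSW = "vector-hnsw"


def build_create_index_command(
    collection_name: str,
    index_name: str,
    embedding_key: str,
    kind: str,
    *,
    dimensions: int = 1536,
    similarity: str = "COS",  # assumed string value of the similarity enum
    num_lists: int = 100,
    m: int = 16,
    ef_construction: int = 64,
) -> Dict[str, Any]:
    """Build a createIndexes command for an IVF or HNSW vector index."""
    search_kind = VectorSearchType(kind)  # raises ValueError on an unknown kind
    options: Dict[str, Any] = {
        "kind": search_kind.value,
        "similarity": similarity,
        "dimensions": dimensions,
    }
    if search_kind is VectorSearchType.VECTOR_IVF:
        # IVF indexes are tuned by the number of clusters.
        options["numLists"] = num_lists
    else:
        # HNSW indexes are tuned by graph connectivity and build-time list size.
        if ef_construction < 2 * m:
            raise ValueError("ef_construction has to be at least 2 * m")
        options["m"] = m
        options["efConstruction"] = ef_construction
    return {
        "createIndexes": collection_name,
        "indexes": [
            {
                "name": index_name,
                "key": {embedding_key: "cosmosSearch"},
                "cosmosSearchOptions": options,
            }
        ],
    }


ivf_command = build_create_index_command(
    "langchain_test_collection", "langchain-test-index", "vectorContent", "vector-ivf"
)
hnsw_command = build_create_index_command(
    "langchain_test_collection", "langchain-test-index", "vectorContent", "vector-hnsw"
)
```

Rejecting an unknown kind up front avoids sending an empty command to the database, which is what an unmatched `if`/`elif` chain would otherwise do.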
--- a/docs/docs/integrations/llms/llm_caching.ipynb
+++ b/docs/docs/integrations/llms/llm_caching.ipynb
@ -12,9 +12,14 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": 15,
   "id": "10ad9224",
-   "metadata": {},
+   "metadata": {
+    "ExecuteTime": {
+     "end_time": "2024-02-02T21:34:23.461332Z",
+     "start_time": "2024-02-02T21:34:23.394461Z"
+    }
+   },
   "outputs": [],
   "source": [
    "from langchain.globals import set_llm_cache\n",
@ -1349,6 +1354,144 @@
    "print(llm(\"Is is possible that something false can be also true?\"))"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "## Azure Cosmos DB Semantic Cache"
+   ],
+   "metadata": {
+    "collapsed": false
+   },
+   "id": "40624c26e86b57a4"
+  },
+  {
+   "cell_type": "code",
+   "outputs": [],
+   "source": [
+    "from langchain.cache import AzureCosmosDBSemanticCache\n",
+    "from langchain_community.vectorstores.azure_cosmos_db import (\n",
+    "    CosmosDBSimilarityType,\n",
+    "    CosmosDBVectorSearchType,\n",
+    ")\n",
+    "from langchain_openai import OpenAIEmbeddings\n",
+    "\n",
+    "# Read more about Azure CosmosDB Mongo vCore vector search here https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/vector-search\n",
+    "\n",
+    "INDEX_NAME = \"langchain-test-index\"\n",
+    "NAMESPACE = \"langchain_test_db.langchain_test_collection\"\n",
+    "CONNECTION_STRING = (\n",
+    "    \"Please provide your azure cosmos mongo vCore vector db connection string\"\n",
+    ")\n",
+    "DB_NAME, COLLECTION_NAME = NAMESPACE.split(\".\")\n",
+    "\n",
+    "# Default value for these params\n",
+    "num_lists = 3\n",
+    "dimensions = 1536\n",
+    "similarity_algorithm = CosmosDBSimilarityType.COS\n",
+    "kind = CosmosDBVectorSearchType.VECTOR_IVF\n",
+    "m = 16\n",
+    "ef_construction = 64\n",
+    "ef_search = 40\n",
+    "score_threshold = 0.1\n",
+    "\n",
+    "set_llm_cache(\n",
+    "    AzureCosmosDBSemanticCache(\n",
+    "        cosmosdb_connection_string=CONNECTION_STRING,\n",
+    "        cosmosdb_client=None,\n",
+    "        embedding=OpenAIEmbeddings(),\n",
+    "        database_name=DB_NAME,\n",
+    "        collection_name=COLLECTION_NAME,\n",
+    "        num_lists=num_lists,\n",
+    "        similarity=similarity_algorithm,\n",
+    "        kind=kind,\n",
+    "        dimensions=dimensions,\n",
+    "        m=m,\n",
+    "        ef_construction=ef_construction,\n",
+    "        ef_search=ef_search,\n",
+    "        score_threshold=score_threshold,\n",
+    "    )\n",
+    ")"
+   ],
+   "metadata": {
+    "collapsed": false,
+    "ExecuteTime": {
+     "end_time": "2024-02-02T21:34:49.457001Z",
+     "start_time": "2024-02-02T21:34:49.411293Z"
+    }
+   },
+   "id": "4a9d592db01b11b2",
+   "execution_count": 16
+  },
+  {
+   "cell_type": "code",
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "CPU times: user 43.4 ms, sys: 7.23 ms, total: 50.7 ms\n",
+      "Wall time: 1.61 s\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": "\"\\n\\nWhy couldn't the bicycle stand up by itself?\\n\\nBecause it was two-tired!\""
+     },
+     "execution_count": 17,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "%%time\n",
+    "# The first time, it is not yet in cache, so it should take longer\n",
+    "llm(\"Tell me a joke\")"
+   ],
+   "metadata": {
+    "collapsed": false,
+    "ExecuteTime": {
+     "end_time": "2024-02-02T21:34:53.704234Z",
+     "start_time": "2024-02-02T21:34:52.091096Z"
+    }
+   },
+   "id": "8488cf9c97ec7ab",
+   "execution_count": 17
+  },
+  {
+   "cell_type": "code",
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "CPU times: user 6.89 ms, sys: 2.24 ms, total: 9.13 ms\n",
+      "Wall time: 337 ms\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": "\"\\n\\nWhy couldn't the bicycle stand up by itself?\\n\\nBecause it was two-tired!\""
+     },
+     "execution_count": 18,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "%%time\n",
+    "# The second time, it is in cache, so it should be faster\n",
+    "llm(\"Tell me a joke\")"
+   ],
+   "metadata": {
+    "collapsed": false,
+    "ExecuteTime": {
+     "end_time": "2024-02-02T21:34:56.004502Z",
+     "start_time": "2024-02-02T21:34:55.650136Z"
+    }
+   },
+   "id": "bc1570a2a77b58c8",
+   "execution_count": 18
+  },
  {
   "cell_type": "markdown",
   "id": "0c69d84d",
--- a/docs/docs/integrations/vectorstores/azure_cosmos_db.ipynb
+++ b/docs/docs/integrations/vectorstores/azure_cosmos_db.ipynb
@ -23,24 +23,34 @@
    "        "
   ]
  },
+  {
+   "cell_type": "markdown",
+   "source": [],
+   "metadata": {
+    "collapsed": false
+   },
+   "id": "8c493e205ce1dda5"
+  },
  {
   "cell_type": "code",
-   "execution_count": 2,
+   "execution_count": 1,
   "id": "ab8e45f5bd435ade",
   "metadata": {
+    "collapsed": false,
    "ExecuteTime": {
-     "end_time": "2023-10-10T17:20:00.721985Z",
-     "start_time": "2023-10-10T17:19:57.996265Z"
-    },
-    "collapsed": false
+     "end_time": "2024-02-08T18:25:05.278480Z",
+     "start_time": "2024-02-08T18:24:51.560677Z"
+    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
-      "Requirement already satisfied: pymongo in /Users/iekpo/Langchain/langchain-python/.venv/lib/python3.10/site-packages (4.5.0)\r\n",
-      "Requirement already satisfied: dnspython<3.0.0,>=1.16.0 in /Users/iekpo/Langchain/langchain-python/.venv/lib/python3.10/site-packages (from pymongo) (2.4.2)\r\n"
+      "\r\n",
+      "\u001B[1m[\u001B[0m\u001B[34;49mnotice\u001B[0m\u001B[1;39;49m]\u001B[0m\u001B[39;49m A new release of pip is available: \u001B[0m\u001B[31;49m23.2.1\u001B[0m\u001B[39;49m -> \u001B[0m\u001B[32;49m23.3.2\u001B[0m\r\n",
+      "\u001B[1m[\u001B[0m\u001B[34;49mnotice\u001B[0m\u001B[1;39;49m]\u001B[0m\u001B[39;49m To update, run: \u001B[0m\u001B[32;49mpip install --upgrade pip\u001B[0m\r\n",
+      "Note: you may need to restart the kernel to use updated packages.\n"
     ]
    }
   ],
@ -50,20 +60,20 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 24,
+   "execution_count": 2,
   "id": "9c7ce9e7b26efbb0",
   "metadata": {
+    "collapsed": false,
    "ExecuteTime": {
-     "end_time": "2023-10-10T17:50:03.615234Z",
-     "start_time": "2023-10-10T17:50:03.604289Z"
-    },
-    "collapsed": false
+     "end_time": "2024-02-08T18:25:56.926147Z",
+     "start_time": "2024-02-08T18:25:56.900087Z"
+    }
   },
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
-    "CONNECTION_STRING = \"AZURE COSMOS DB MONGO vCORE connection string\"\n",
+    "CONNECTION_STRING = \"YOUR_CONNECTION_STRING\"\n",
    "INDEX_NAME = \"izzy-test-index\"\n",
    "NAMESPACE = \"izzy_test_db.izzy_test_collection\"\n",
    "DB_NAME, COLLECTION_NAME = NAMESPACE.split(\".\")"
@ -81,14 +91,14 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 25,
+   "execution_count": 3,
   "id": "4a052d99c6b8a2a7",
   "metadata": {
+    "collapsed": false,
    "ExecuteTime": {
-     "end_time": "2023-10-10T17:50:11.712929Z",
-     "start_time": "2023-10-10T17:50:11.703871Z"
-    },
-    "collapsed": false
+     "end_time": "2024-02-08T18:26:06.558294Z",
+     "start_time": "2024-02-08T18:26:06.550008Z"
+    }
   },
   "outputs": [],
   "source": [
@ -98,7 +108,7 @@
    "os.environ[\n",
    "    \"OPENAI_API_BASE\"\n",
    "] = \"YOUR_OPEN_AI_ENDPOINT\"  # https://example.openai.azure.com/\n",
-    "os.environ[\"OPENAI_API_KEY\"] = \"YOUR_OPEN_AI_KEY\"\n",
+    "os.environ[\"OPENAI_API_KEY\"] = \"YOUR_OPENAI_API_KEY\"\n",
    "os.environ[\n",
    "    \"OPENAI_EMBEDDINGS_DEPLOYMENT\"\n",
    "] = \"smart-agent-embedding-ada\"  # the deployment name for the embedding model\n",
@ -119,14 +129,14 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 26,
+   "execution_count": 4,
   "id": "183741cf8f4c7c53",
   "metadata": {
+    "collapsed": false,
    "ExecuteTime": {
-     "end_time": "2023-10-10T17:50:16.732718Z",
-     "start_time": "2023-10-10T17:50:16.716642Z"
-    },
-    "collapsed": false
+     "end_time": "2024-02-08T18:27:00.782280Z",
+     "start_time": "2024-02-08T18:26:47.339151Z"
+    }
   },
   "outputs": [],
   "source": [
@ -134,6 +144,7 @@
    "from langchain_community.vectorstores.azure_cosmos_db import (\n",
    "    AzureCosmosDBVectorSearch,\n",
    "    CosmosDBSimilarityType,\n",
+    "    CosmosDBVectorSearchType,\n",
    ")\n",
    "from langchain_openai import OpenAIEmbeddings\n",
    "from langchain_text_splitters import CharacterTextSplitter\n",
@ -159,21 +170,21 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 28,
+   "execution_count": 5,
   "id": "39ae6058c2f7fdf1",
   "metadata": {
+    "collapsed": false,
    "ExecuteTime": {
-     "end_time": "2023-10-10T17:51:17.980698Z",
-     "start_time": "2023-10-10T17:51:11.786336Z"
-    },
-    "collapsed": false
+     "end_time": "2024-02-08T18:31:13.486173Z",
+     "start_time": "2024-02-08T18:30:54.175890Z"
+    }
   },
   "outputs": [
    {
     "data": {
-      "text/plain": "{'raw': {'defaultShard': {'numIndexesBefore': 2,\n   'numIndexesAfter': 3,\n   'createdCollectionAutomatically': False,\n   'ok': 1}},\n 'ok': 1}"
+      "text/plain": "{'raw': {'defaultShard': {'numIndexesBefore': 1,\n   'numIndexesAfter': 2,\n   'createdCollectionAutomatically': False,\n   'ok': 1}},\n 'ok': 1}"
     },
-     "execution_count": 28,
+     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
@ -181,9 +192,9 @@
   "source": [
    "from pymongo import MongoClient\n",
    "\n",
-    "INDEX_NAME = \"izzy-test-index-2\"\n",
-    "NAMESPACE = \"izzy_test_db.izzy_test_collection\"\n",
-    "DB_NAME, COLLECTION_NAME = NAMESPACE.split(\".\")\n",
+    "# INDEX_NAME = \"izzy-test-index-2\"\n",
+    "# NAMESPACE = \"izzy_test_db.izzy_test_collection\"\n",
+    "# DB_NAME, COLLECTION_NAME = NAMESPACE.split(\".\")\n",
    "\n",
    "client: MongoClient = MongoClient(CONNECTION_STRING)\n",
    "collection = client[DB_NAME][COLLECTION_NAME]\n",
@ -200,23 +211,31 @@
    "    index_name=INDEX_NAME,\n",
    ")\n",
    "\n",
+    "# Read more about these variables in detail here. https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/vector-search\n",
    "num_lists = 100\n",
    "dimensions = 1536\n",
    "similarity_algorithm = CosmosDBSimilarityType.COS\n",
+    "kind = CosmosDBVectorSearchType.VECTOR_IVF\n",
+    "m = 16\n",
+    "ef_construction = 64\n",
+    "ef_search = 40\n",
+    "score_threshold = 0.1\n",
    "\n",
-    "vectorstore.create_index(num_lists, dimensions, similarity_algorithm)"
+    "vectorstore.create_index(\n",
+    "    num_lists, dimensions, similarity_algorithm, kind, m, ef_construction\n",
+    ")"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 29,
+   "execution_count": 6,
   "id": "32c68d3246adc21f",
   "metadata": {
+    "collapsed": false,
    "ExecuteTime": {
-     "end_time": "2023-10-10T17:51:44.840121Z",
-     "start_time": "2023-10-10T17:51:44.498639Z"
-    },
-    "collapsed": false
+     "end_time": "2024-02-08T18:31:47.468902Z",
+     "start_time": "2024-02-08T18:31:46.053602Z"
+    }
   },
   "outputs": [],
   "source": [
@ -227,14 +246,14 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 31,
+   "execution_count": 7,
   "id": "8feeeb4364efb204",
   "metadata": {
+    "collapsed": false,
    "ExecuteTime": {
-     "end_time": "2023-10-10T17:52:08.049294Z",
-     "start_time": "2023-10-10T17:52:08.038511Z"
-    },
-    "collapsed": false
+     "end_time": "2024-02-08T18:31:50.982598Z",
+     "start_time": "2024-02-08T18:31:50.977605Z"
+    }
   },
   "outputs": [
    {
@ -267,14 +286,14 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 32,
+   "execution_count": 8,
   "id": "3c218ab6f59301f7",
   "metadata": {
+    "collapsed": false,
    "ExecuteTime": {
-     "end_time": "2023-10-10T17:52:14.994861Z",
-     "start_time": "2023-10-10T17:52:13.986379Z"
-    },
-    "collapsed": false
+     "end_time": "2024-02-08T18:32:14.299599Z",
+     "start_time": "2024-02-08T18:32:12.923464Z"
+    }
   },
   "outputs": [
    {
@ -305,14 +324,14 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 33,
+   "execution_count": 9,
   "id": "fd67e4d92c9ab32f",
   "metadata": {
+    "collapsed": false,
    "ExecuteTime": {
-     "end_time": "2023-10-10T17:53:21.145431Z",
-     "start_time": "2023-10-10T17:53:20.884531Z"
-    },
-    "collapsed": false
+     "end_time": "2024-02-08T18:32:24.021434Z",
+     "start_time": "2024-02-08T18:32:22.867658Z"
+    }
   },
   "outputs": [
    {
--- a/libs/community/langchain_community/cache.py
+++ b/libs/community/langchain_community/cache.py
@ -29,6 +29,7 @@ import uuid
 import warnings
 from abc import ABC
 from datetime import timedelta
+from enum import Enum
 from functools import lru_cache, wraps
 from typing import (
    TYPE_CHECKING,
@ -51,6 +52,11 @@ from sqlalchemy.engine import Row
 from sqlalchemy.engine.base import Engine
 from sqlalchemy.orm import Session

+from langchain_community.vectorstores.azure_cosmos_db import (
+    CosmosDBSimilarityType,
+    CosmosDBVectorSearchType,
+)
+
 try:
    from sqlalchemy.orm import declarative_base
 except ImportError:
@ -68,6 +74,7 @@ from langchain_community.utilities.astradb import (
    SetupMode,
    _AstraDBCollectionEnvironment,
 )
+from langchain_community.vectorstores import AzureCosmosDBVectorSearch
 from langchain_community.vectorstores.redis import Redis as RedisVectorstore

 logger = logging.getLogger(__file__)
@ -1837,3 +1844,194 @@ class AstraDBSemanticCache(BaseCache):
    async def aclear(self, **kwargs: Any) -> None:
        await self.astra_env.aensure_db_setup()
        await self.async_collection.clear()
+
+
+class AzureCosmosDBSemanticCache(BaseCache):
+    """Cache that uses Cosmos DB Mongo vCore vector-store backend"""
+
+    DEFAULT_DATABASE_NAME = "CosmosMongoVCoreCacheDB"
+    DEFAULT_COLLECTION_NAME = "CosmosMongoVCoreCacheColl"
+
+    def __init__(
+        self,
+        cosmosdb_connection_string: str,
+        database_name: str,
+        collection_name: str,
+        embedding: Embeddings,
+        *,
+        cosmosdb_client: Optional[Any] = None,
+        num_lists: int = 100,
+        similarity: CosmosDBSimilarityType = CosmosDBSimilarityType.COS,
+        kind: CosmosDBVectorSearchType = CosmosDBVectorSearchType.VECTOR_IVF,
+        dimensions: int = 1536,
+        m: int = 16,
+        ef_construction: int = 64,
+        ef_search: int = 40,
+        score_threshold: Optional[float] = None,
+    ):
+        """
+        Args:
+            cosmosdb_connection_string: Cosmos DB Mongo vCore connection string
+            cosmosdb_client: Cosmos DB Mongo vCore client
+            embedding (Embeddings): Embedding provider for semantic encoding and search.
+            database_name: Database name for the AzureCosmosDBSemanticCache
+            collection_name: Collection name for the AzureCosmosDBSemanticCache
+            num_lists: This integer is the number of clusters that the
+                inverted file (IVF) index uses to group the vector data.
+                We recommend that numLists is set to documentCount/1000
+                for up to 1 million documents and to sqrt(documentCount)
+                for more than 1 million documents.
+                Using a numLists value of 1 is akin to performing
+                brute-force search, which has limited performance
+            dimensions: Number of dimensions for vector similarity.
+                The maximum number of supported dimensions is 2000
+            similarity: Similarity metric to use with the IVF index.
+
+                Possible options are:
+                    - CosmosDBSimilarityType.COS (cosine distance),
+                    - CosmosDBSimilarityType.L2 (Euclidean distance), and
+                    - CosmosDBSimilarityType.IP (inner product).
+            kind: Type of vector index to create.
+                Possible options are:
+                    - vector-ivf
+                    - vector-hnsw: available as a preview feature only,
+                                   to enable visit https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/preview-features
+            m: The max number of connections per layer (16 by default, minimum
+               value is 2, maximum value is 100). Higher m is suitable for datasets
+               with high dimensionality and/or high accuracy requirements.
+            ef_construction: the size of the dynamic candidate list for constructing
+                            the graph (64 by default, minimum value is 4, maximum
+                            value is 1000). Higher ef_construction will result in
+                            better index quality and higher accuracy, but it will
+                            also increase the time required to build the index.
+                            ef_construction has to be at least 2 * m
+            ef_search: The size of the dynamic candidate list for search
+                       (40 by default). A higher value provides better
+                       recall at the cost of speed.
+            score_threshold: Minimum similarity score a search result must reach
+                to be returned from the cache.
+        """
+
+        self._validate_enum_value(similarity, CosmosDBSimilarityType)
+        self._validate_enum_value(kind, CosmosDBVectorSearchType)
+
+        if not cosmosdb_connection_string:
+            raise ValueError("CosmosDB connection string cannot be empty.")
+
+        self.cosmosdb_connection_string = cosmosdb_connection_string
+        self.cosmosdb_client = cosmosdb_client
+        self.embedding = embedding
+        self.database_name = database_name or self.DEFAULT_DATABASE_NAME
+        self.collection_name = collection_name or self.DEFAULT_COLLECTION_NAME
+        self.num_lists = num_lists
+        self.dimensions = dimensions
+        self.similarity = similarity
+        self.kind = kind
+        self.m = m
+        self.ef_construction = ef_construction
+        self.ef_search = ef_search
+        self.score_threshold = score_threshold
+        self._cache_dict: Dict[str, AzureCosmosDBVectorSearch] = {}
+
+    def _index_name(self, llm_string: str) -> str:
+        hashed_index = _hash(llm_string)
+        return f"cache:{hashed_index}"
+
+    def _get_llm_cache(self, llm_string: str) -> AzureCosmosDBVectorSearch:
+        index_name = self._index_name(llm_string)
+
+        namespace = self.database_name + "." + self.collection_name
+
+        # return vectorstore client for the specific llm string
+        if index_name in self._cache_dict:
+            return self._cache_dict[index_name]
+
+        # create new vectorstore client for the specific llm string
+        if self.cosmosdb_client:
+            collection = self.cosmosdb_client[self.database_name][self.collection_name]
+            self._cache_dict[index_name] = AzureCosmosDBVectorSearch(
+                collection=collection,
+                embedding=self.embedding,
+                index_name=index_name,
+            )
+        else:
+            self._cache_dict[
+                index_name
+            ] = AzureCosmosDBVectorSearch.from_connection_string(
+                connection_string=self.cosmosdb_connection_string,
+                namespace=namespace,
+                embedding=self.embedding,
+                index_name=index_name,
+            )
+
+        # create index for the vectorstore
+        vectorstore = self._cache_dict[index_name]
+        if not vectorstore.index_exists():
+            vectorstore.create_index(
+                self.num_lists,
+                self.dimensions,
+                self.similarity,
+                self.kind,
+                self.m,
+                self.ef_construction,
+            )
+
+        return vectorstore
+
+    def lookup(self, prompt: str, llm_string: str) -> Optional[RETURN_VAL_TYPE]:
+        """Look up based on prompt and llm_string."""
+        llm_cache = self._get_llm_cache(llm_string)
+        generations: List = []
+        # Find the closest cached prompt via vector similarity search
+        results = llm_cache.similarity_search(
+            query=prompt,
+            k=1,
+            kind=self.kind,
+            ef_search=self.ef_search,
+            score_threshold=self.score_threshold,
+        )
+        if results:
+            for document in results:
+                try:
+                    generations.extend(loads(document.metadata["return_val"]))
+                except Exception:
+                    logger.warning(
+                        "Retrieving a cache value that could not be deserialized "
+                        "properly. This is likely due to the cache being in an "
+                        "older format. Please recreate your cache to avoid this "
+                        "error."
+                    )
+                    # In a previous life we stored the raw text directly
+                    # in the table, so assume it's in that format.
+                    generations.extend(
+                        _load_generations_from_json(document.metadata["return_val"])
+                    )
+        return generations if generations else None
+
+    def update(self, prompt: str, llm_string: str, return_val: RETURN_VAL_TYPE) -> None:
+        """Update cache based on prompt and llm_string."""
+        for gen in return_val:
+            if not isinstance(gen, Generation):
+                raise ValueError(
+                    "AzureCosmosDBSemanticCache only supports caching of "
+                    f"normal LLM generations, got {type(gen)}"
+                )
+
+        llm_cache = self._get_llm_cache(llm_string)
+        metadata = {
+            "llm_string": llm_string,
+            "prompt": prompt,
+            "return_val": dumps([g for g in return_val]),
+        }
+        llm_cache.add_texts(texts=[prompt], metadatas=[metadata])
+
+    def clear(self, **kwargs: Any) -> None:
+        """Clear semantic cache for a given llm_string."""
+        index_name = self._index_name(kwargs["llm_string"])
+        if index_name in self._cache_dict:
+            self._cache_dict[index_name].get_collection().delete_many({})
+            # self._cache_dict[index_name].clear_collection()
+
+    @staticmethod
+    def _validate_enum_value(value: Any, enum_type: Type[Enum]) -> None:
+        if not isinstance(value, enum_type):
+            raise ValueError(f"Invalid enum value: {value}. Expected {enum_type}.")
--- a/libs/community/langchain_community/vectorstores/azure_cosmos_db.py
+++ b/libs/community/langchain_community/vectorstores/azure_cosmos_db.py
@ -38,6 +38,15 @@ class CosmosDBSimilarityType(str, Enum):
    """Euclidean distance"""


+class CosmosDBVectorSearchType(str, Enum):
+    """Cosmos DB Vector Search Type as enumerator."""
+
+    VECTOR_IVF = "vector-ivf"
+    """IVF vector index"""
+    VECTOR_HNSW = "vector-hnsw"
+    """HNSW vector index"""
+
+
 CosmosDBDocumentType = TypeVar("CosmosDBDocumentType", bound=Dict[str, Any])

 logger = logging.getLogger(__name__)
@ -166,6 +175,9 @@ class AzureCosmosDBVectorSearch(VectorStore):
        num_lists: int = 100,
        dimensions: int = 1536,
        similarity: CosmosDBSimilarityType = CosmosDBSimilarityType.COS,
+        kind: str = "vector-ivf",
+        m: int = 16,
+        ef_construction: int = 64,
    ) -> dict[str, Any]:
        """Creates an index using the index name specified at
            instance construction
@ -195,6 +207,11 @@ class AzureCosmosDBVectorSearch(VectorStore):
            the numLists parameter using the above guidance.

        Args:
+            kind: Type of vector index to create.
+                Possible options are:
+                    - vector-ivf
+                    - vector-hnsw: available as a preview feature only,
+                                   to enable visit https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/preview-features
            num_lists: This integer is the number of clusters that the
                inverted file (IVF) index uses to group the vector data.
                We recommend that numLists is set to documentCount/1000
@ -210,20 +227,52 @@ class AzureCosmosDBVectorSearch(VectorStore):
                    - CosmosDBSimilarityType.COS (cosine distance),
                    - CosmosDBSimilarityType.L2 (Euclidean distance), and
                    - CosmosDBSimilarityType.IP (inner product).
-
+            m: The max number of connections per layer (16 by default, minimum
+               value is 2, maximum value is 100). Higher m is suitable for datasets
+               with high dimensionality and/or high accuracy requirements.
+            ef_construction: the size of the dynamic candidate list for constructing
+                            the graph (64 by default, minimum value is 4, maximum
+                            value is 1000). Higher ef_construction will result in
+                            better index quality and higher accuracy, but it will
+                            also increase the time required to build the index.
+                            ef_construction has to be at least 2 * m
        Returns:
            An object describing the created index

        """
-        # prepare the command
-        create_index_commands = {
+        # check the kind of vector search to be performed
+        # prepare the command accordingly
+        create_index_commands = {}
+        if kind == CosmosDBVectorSearchType.VECTOR_IVF:
+            create_index_commands = self._get_vector_index_ivf(
+                kind, num_lists, similarity, dimensions
+            )
+        elif kind == CosmosDBVectorSearchType.VECTOR_HNSW:
+            create_index_commands = self._get_vector_index_hnsw(
+                kind, m, ef_construction, similarity, dimensions
+            )
+
+        # retrieve the database object
+        current_database = self._collection.database
+
+        # invoke the command from the database object
+        create_index_responses: dict[str, Any] = current_database.command(
+            create_index_commands
+        )
+
+        return create_index_responses
+
+    def _get_vector_index_ivf(
+        self, kind: str, num_lists: int, similarity: str, dimensions: int
+    ) -> Dict[str, Any]:
+        command = {
            "createIndexes": self._collection.name,
            "indexes": [
                {
                    "name": self._index_name,
                    "key": {self._embedding_key: "cosmosSearch"},
                    "cosmosSearchOptions": {
-                        "kind": "vector-ivf",
+                        "kind": kind,
                        "numLists": num_lists,
                        "similarity": similarity,
                        "dimensions": dimensions,
@ -231,16 +280,28 @@ class AzureCosmosDBVectorSearch(VectorStore):
                }
            ],
        }
+        return command

-        # retrieve the database object
-        current_database = self._collection.database
-
-        # invoke the command from the database object
-        create_index_responses: dict[str, Any] = current_database.command(
-            create_index_commands
-        )
-
-        return create_index_responses
+    def _get_vector_index_hnsw(
+        self, kind: str, m: int, ef_construction: int, similarity: str, dimensions: int
+    ) -> Dict[str, Any]:
+        command = {
+            "createIndexes": self._collection.name,
+            "indexes": [
+                {
+                    "name": self._index_name,
+                    "key": {self._embedding_key: "cosmosSearch"},
+                    "cosmosSearchOptions": {
+                        "kind": kind,
+                        "m": m,
+                        "efConstruction": ef_construction,
+                        "similarity": similarity,
+                        "dimensions": dimensions,
+                    },
+                }
+            ],
+        }
+        return command

    def add_texts(
        self,
@ -329,17 +390,60 @@ class AzureCosmosDBVectorSearch(VectorStore):
        self._collection.delete_one({"_id": ObjectId(document_id)})

    def _similarity_search_with_score(
-        self, embeddings: List[float], k: int = 4
+        self,
+        embeddings: List[float],
+        k: int = 4,
+        kind: CosmosDBVectorSearchType = CosmosDBVectorSearchType.VECTOR_IVF,
+        ef_search: int = 40,
+        score_threshold: float = 0.0,
    ) -> List[Tuple[Document, float]]:
        """Returns a list of documents with their scores

        Args:
            embeddings: The query vector
            k: the number of documents to return
+            kind: Type of vector index to search.
+                Possible options are:
+                    - vector-ivf
+                    - vector-hnsw: available as a preview feature only,
+                                   to enable visit https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/preview-features
+            ef_search: The size of the dynamic candidate list for search
+                       (40 by default). A higher value provides better
+                       recall at the cost of speed.
+            score_threshold: Minimum similarity score a result must reach;
+                results scoring below it are skipped. Defaults to 0.0.
+                Only vector-ivf search supports this for now.

        Returns:
            A list of documents closest to the query vector
        """
+        pipeline: List[dict[str, Any]] = []
+        if kind == CosmosDBVectorSearchType.VECTOR_IVF:
+            pipeline = self._get_pipeline_vector_ivf(embeddings, k)
+        elif kind == CosmosDBVectorSearchType.VECTOR_HNSW:
+            pipeline = self._get_pipeline_vector_hnsw(embeddings, k, ef_search)
+
+        cursor = self._collection.aggregate(pipeline)
+
+        docs = []
+        for res in cursor:
+            score = res.pop("similarityScore")
+            if score < score_threshold:
+                continue
+            document_object_field = (
+                res.pop("document")
+                if kind == CosmosDBVectorSearchType.VECTOR_IVF
+                else res
+            )
+            text = document_object_field.pop(self._text_key)
+            docs.append(
+                (Document(page_content=text, metadata=document_object_field), score)
+            )
+        return docs
+
+    def _get_pipeline_vector_ivf(
+        self, embeddings: List[float], k: int = 4
+    ) -> List[dict[str, Any]]:
        pipeline: List[dict[str, Any]] = [
            {
                "$search": {
@ -358,32 +462,65 @@ class AzureCosmosDBVectorSearch(VectorStore):
                }
            },
        ]
+        return pipeline

-        cursor = self._collection.aggregate(pipeline)
-
-        docs = []
-
-        for res in cursor:
-            score = res.pop("similarityScore")
-            document_object_field = res.pop("document")
-            text = document_object_field.pop(self._text_key)
-            docs.append(
-                (Document(page_content=text, metadata=document_object_field), score)
-            )
-
-        return docs
+    def _get_pipeline_vector_hnsw(
+        self, embeddings: List[float], k: int = 4, ef_search: int = 40
+    ) -> List[dict[str, Any]]:
+        pipeline: List[dict[str, Any]] = [
+            {
+                "$search": {
+                    "cosmosSearch": {
+                        "vector": embeddings,
+                        "path": self._embedding_key,
+                        "k": k,
+                        "efSearch": ef_search,
+                    },
+                }
+            },
+            {
+                "$project": {
+                    "similarityScore": {"$meta": "searchScore"},
+                    "document": "$$ROOT",
+                }
+            },
+        ]
+        return pipeline

    def similarity_search_with_score(
-        self, query: str, k: int = 4
+        self,
+        query: str,
+        k: int = 4,
+        kind: CosmosDBVectorSearchType = CosmosDBVectorSearchType.VECTOR_IVF,
+        ef_search: int = 40,
+        score_threshold: float = 0.0,
    ) -> List[Tuple[Document, float]]:
        embeddings = self._embedding.embed_query(query)
-        docs = self._similarity_search_with_score(embeddings=embeddings, k=k)
+        docs = self._similarity_search_with_score(
+            embeddings=embeddings,
+            k=k,
+            kind=kind,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )
        return docs

    def similarity_search(
-        self, query: str, k: int = 4, **kwargs: Any
+        self,
+        query: str,
+        k: int = 4,
+        kind: CosmosDBVectorSearchType = CosmosDBVectorSearchType.VECTOR_IVF,
+        ef_search: int = 40,
+        score_threshold: float = 0.0,
+        **kwargs: Any,
    ) -> List[Document]:
-        docs_and_scores = self.similarity_search_with_score(query, k=k)
+        docs_and_scores = self.similarity_search_with_score(
+            query,
+            k=k,
+            kind=kind,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )
        return [doc for doc, _ in docs_and_scores]

    def max_marginal_relevance_search_by_vector(
@ -392,11 +529,20 @@ class AzureCosmosDBVectorSearch(VectorStore):
        k: int = 4,
        fetch_k: int = 20,
        lambda_mult: float = 0.5,
+        kind: CosmosDBVectorSearchType = CosmosDBVectorSearchType.VECTOR_IVF,
+        ef_search: int = 40,
+        score_threshold: float = 0.0,
        **kwargs: Any,
    ) -> List[Document]:
        # Retrieves the docs with similarity scores
        # sorted by similarity scores in DESC order
-        docs = self._similarity_search_with_score(embedding, k=fetch_k)
+        docs = self._similarity_search_with_score(
+            embedding,
+            k=fetch_k,
+            kind=kind,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )

        # Re-ranks the docs using MMR
        mmr_doc_indexes = maximal_marginal_relevance(
@ -414,12 +560,24 @@ class AzureCosmosDBVectorSearch(VectorStore):
        k: int = 4,
        fetch_k: int = 20,
        lambda_mult: float = 0.5,
+        kind: CosmosDBVectorSearchType = CosmosDBVectorSearchType.VECTOR_IVF,
+        ef_search: int = 40,
+        score_threshold: float = 0.0,
        **kwargs: Any,
    ) -> List[Document]:
        # compute the embeddings vector from the query string
        embeddings = self._embedding.embed_query(query)

        docs = self.max_marginal_relevance_search_by_vector(
-            embeddings, k=k, fetch_k=fetch_k, lambda_mult=lambda_mult
+            embeddings,
+            k=k,
+            fetch_k=fetch_k,
+            lambda_mult=lambda_mult,
+            kind=kind,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
        )
        return docs
+
+    def get_collection(self) -> Collection[CosmosDBDocumentType]:
+        return self._collection
--- a/libs/community/tests/integration_tests/vectorstores/test_azure_cosmos_db.py
+++ b/libs/community/tests/integration_tests/vectorstores/test_azure_cosmos_db.py
@ -11,6 +11,7 @@ from langchain_community.embeddings import OpenAIEmbeddings
 from langchain_community.vectorstores.azure_cosmos_db import (
    AzureCosmosDBVectorSearch,
    CosmosDBSimilarityType,
+    CosmosDBVectorSearchType,
 )

 logging.basicConfig(level=logging.DEBUG)
@ -21,6 +22,7 @@ model_deployment = os.getenv(
 model_name = os.getenv("OPENAI_EMBEDDINGS_MODEL_NAME", "text-embedding-ada-002")

 INDEX_NAME = "langchain-test-index"
+INDEX_NAME_VECTOR_HNSW = "langchain-test-index-hnsw"
 NAMESPACE = "langchain_test_db.langchain_test_collection"
 CONNECTION_STRING: str = os.environ.get("MONGODB_VCORE_URI", "")
 DB_NAME, COLLECTION_NAME = NAMESPACE.split(".")
@ -28,6 +30,11 @@ DB_NAME, COLLECTION_NAME = NAMESPACE.split(".")
 num_lists = 3
 dimensions = 1536
 similarity_algorithm = CosmosDBSimilarityType.COS
+kind = CosmosDBVectorSearchType.VECTOR_IVF
+m = 16
+ef_construction = 64
+ef_search = 40
+score_threshold = 0.1


 def prepare_collection() -> Any:
@ -82,7 +89,7 @@ class TestAzureCosmosDBVectorSearch:

    @pytest.fixture(scope="class", autouse=True)
    def cosmos_db_url(self) -> Union[str, Generator[str, None, None]]:
-        """Return the elasticsearch url."""
+        """Return the cosmos db url."""
        return "805.555.1212"

    def test_from_documents_cosine_distance(
@ -105,14 +112,23 @@ class TestAzureCosmosDBVectorSearch:
        sleep(1)  # waits for Cosmos DB to save contents to the collection

        # Create the IVF index that will be leveraged later for vector search
-        vectorstore.create_index(num_lists, dimensions, similarity_algorithm)
+        vectorstore.create_index(
+            num_lists, dimensions, similarity_algorithm, kind, m, ef_construction
+        )
        sleep(2)  # waits for the index to be set up

-        output = vectorstore.similarity_search("Sandwich", k=1)
+        output = vectorstore.similarity_search(
+            "Sandwich",
+            k=1,
+            kind=kind,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )

        assert output
        assert output[0].page_content == "What is a sandwich?"
        assert output[0].metadata["c"] == 1
+
        vectorstore.delete_index()

    def test_from_documents_inner_product(
@ -135,14 +151,23 @@ class TestAzureCosmosDBVectorSearch:
        sleep(1)  # waits for Cosmos DB to save contents to the collection

        # Create the IVF index that will be leveraged later for vector search
-        vectorstore.create_index(num_lists, dimensions, CosmosDBSimilarityType.IP)
+        vectorstore.create_index(
+            num_lists, dimensions, CosmosDBSimilarityType.IP, kind, m, ef_construction
+        )
        sleep(2)  # waits for the index to be set up

-        output = vectorstore.similarity_search("Sandwich", k=1)
+        output = vectorstore.similarity_search(
+            "Sandwich",
+            k=1,
+            kind=kind,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )

        assert output
        assert output[0].page_content == "What is a sandwich?"
        assert output[0].metadata["c"] == 1
+
        vectorstore.delete_index()

    def test_from_texts_cosine_distance(
@ -162,12 +187,21 @@ class TestAzureCosmosDBVectorSearch:
        )

        # Create the IVF index that will be leveraged later for vector search
-        vectorstore.create_index(num_lists, dimensions, similarity_algorithm)
+        vectorstore.create_index(
+            num_lists, dimensions, similarity_algorithm, kind, m, ef_construction
+        )
        sleep(2)  # waits for the index to be set up

-        output = vectorstore.similarity_search("Sandwich", k=1)
+        output = vectorstore.similarity_search(
+            "Sandwich",
+            k=1,
+            kind=kind,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )

        assert output[0].page_content == "What is a sandwich?"
+
        vectorstore.delete_index()

    def test_from_texts_with_metadatas_cosine_distance(
@ -189,10 +223,18 @@ class TestAzureCosmosDBVectorSearch:
        )

        # Create the IVF index that will be leveraged later for vector search
-        vectorstore.create_index(num_lists, dimensions, similarity_algorithm)
+        vectorstore.create_index(
+            num_lists, dimensions, similarity_algorithm, kind, m, ef_construction
+        )
        sleep(2)  # waits for the index to be set up

-        output = vectorstore.similarity_search("Sandwich", k=1)
+        output = vectorstore.similarity_search(
+            "Sandwich",
+            k=1,
+            kind=kind,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )

        assert output
        assert output[0].page_content == "What is a sandwich?"
@ -219,10 +261,18 @@ class TestAzureCosmosDBVectorSearch:
        )

        # Create the IVF index that will be leveraged later for vector search
-        vectorstore.create_index(num_lists, dimensions, similarity_algorithm)
+        vectorstore.create_index(
+            num_lists, dimensions, similarity_algorithm, kind, m, ef_construction
+        )
        sleep(2)  # waits for the index to be set up

-        output = vectorstore.similarity_search("Sandwich", k=1)
+        output = vectorstore.similarity_search(
+            "Sandwich",
+            k=1,
+            kind=kind,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )

        assert output
        assert output[0].page_content == "What is a sandwich?"
@ -234,7 +284,13 @@ class TestAzureCosmosDBVectorSearch:
        vectorstore.delete_document_by_id(first_document_id)
        sleep(2)  # waits for the index to be updated

-        output2 = vectorstore.similarity_search("Sandwich", k=1)
+        output2 = vectorstore.similarity_search(
+            "Sandwich",
+            k=1,
+            kind=kind,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )
        assert output2
        assert output2[0].page_content != "What is a sandwich?"

@ -259,25 +315,36 @@ class TestAzureCosmosDBVectorSearch:
        )

        # Create the IVF index that will be leveraged later for vector search
-        vectorstore.create_index(num_lists, dimensions, similarity_algorithm)
+        vectorstore.create_index(
+            num_lists, dimensions, similarity_algorithm, kind, m, ef_construction
+        )
        sleep(2)  # waits for the index to be set up

-        output = vectorstore.similarity_search("Sandwich", k=5)
+        output = vectorstore.similarity_search(
+            "Sandwich",
+            k=5,
+            kind=kind,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )

-        first_document_id_object = output[0].metadata["_id"]
-        first_document_id = str(first_document_id_object)
+        first_document_id = str(output[0].metadata["_id"])

-        output[1].metadata["_id"]
-        second_document_id = output[1].metadata["_id"]
+        second_document_id = str(output[1].metadata["_id"])

-        output[2].metadata["_id"]
-        third_document_id = output[2].metadata["_id"]
+        third_document_id = str(output[2].metadata["_id"])

        document_ids = [first_document_id, second_document_id, third_document_id]
        vectorstore.delete(document_ids)
        sleep(2)  # waits for the index to be updated

-        output_2 = vectorstore.similarity_search("Sandwich", k=5)
+        output_2 = vectorstore.similarity_search(
+            "Sandwich",
+            k=5,
+            kind=kind,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )
        assert output
        assert output_2

@ -307,14 +374,23 @@ class TestAzureCosmosDBVectorSearch:
        )

        # Create the IVF index that will be leveraged later for vector search
-        vectorstore.create_index(num_lists, dimensions, CosmosDBSimilarityType.IP)
+        vectorstore.create_index(
+            num_lists, dimensions, CosmosDBSimilarityType.IP, kind, m, ef_construction
+        )
        sleep(2)  # waits for the index to be set up

-        output = vectorstore.similarity_search("Sandwich", k=1)
+        output = vectorstore.similarity_search(
+            "Sandwich",
+            k=1,
+            kind=kind,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )

        assert output
        assert output[0].page_content == "What is a sandwich?"
        assert output[0].metadata["c"] == 1
+
        vectorstore.delete_index()

    def test_from_texts_with_metadatas_euclidean_distance(
@ -336,14 +412,23 @@ class TestAzureCosmosDBVectorSearch:
        )

        # Create the IVF index that will be leveraged later for vector search
-        vectorstore.create_index(num_lists, dimensions, CosmosDBSimilarityType.L2)
+        vectorstore.create_index(
+            num_lists, dimensions, CosmosDBSimilarityType.L2, kind, m, ef_construction
+        )
        sleep(2)  # waits for the index to be set up

-        output = vectorstore.similarity_search("Sandwich", k=1)
+        output = vectorstore.similarity_search(
+            "Sandwich",
+            k=1,
+            kind=kind,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )

        assert output
        assert output[0].page_content == "What is a sandwich?"
        assert output[0].metadata["c"] == 1
+
        vectorstore.delete_index()

    def test_max_marginal_relevance_cosine_distance(
@ -358,15 +443,20 @@ class TestAzureCosmosDBVectorSearch:
        )

        # Create the IVF index that will be leveraged later for vector search
-        vectorstore.create_index(num_lists, dimensions, CosmosDBSimilarityType.COS)
+        vectorstore.create_index(
+            num_lists, dimensions, similarity_algorithm, kind, m, ef_construction
+        )
        sleep(2)  # waits for the index to be set up

        query = "foo"
-        output = vectorstore.max_marginal_relevance_search(query, k=10, lambda_mult=0.1)
+        output = vectorstore.max_marginal_relevance_search(
+            query, k=10, kind=kind, lambda_mult=0.1, score_threshold=score_threshold
+        )

        assert len(output) == len(texts)
        assert output[0].page_content == "foo"
        assert output[1].page_content != "foo"
+
        vectorstore.delete_index()

    def test_max_marginal_relevance_inner_product(
@ -381,19 +471,439 @@ class TestAzureCosmosDBVectorSearch:
        )

        # Create the IVF index that will be leveraged later for vector search
-        vectorstore.create_index(num_lists, dimensions, CosmosDBSimilarityType.IP)
+        vectorstore.create_index(
+            num_lists, dimensions, CosmosDBSimilarityType.IP, kind, m, ef_construction
+        )
        sleep(2)  # waits for the index to be set up

        query = "foo"
-        output = vectorstore.max_marginal_relevance_search(query, k=10, lambda_mult=0.1)
+        output = vectorstore.max_marginal_relevance_search(
+            query, k=10, kind=kind, lambda_mult=0.1, score_threshold=score_threshold
+        )

        assert len(output) == len(texts)
        assert output[0].page_content == "foo"
        assert output[1].page_content != "foo"
+
        vectorstore.delete_index()

-    def invoke_delete_with_no_args(
+    """
+        Test cases for the similarity algorithm using vector-hnsw
+    """
+
+    def test_from_documents_cosine_distance_vector_hnsw(
+        self, azure_openai_embeddings: OpenAIEmbeddings, collection: Any
+    ) -> None:
+        """Test end to end construction and search."""
+        documents = [
+            Document(page_content="Dogs are tough.", metadata={"a": 1}),
+            Document(page_content="Cats have fluff.", metadata={"b": 1}),
+            Document(page_content="What is a sandwich?", metadata={"c": 1}),
+            Document(page_content="That fence is purple.", metadata={"d": 1, "e": 2}),
+        ]
+
+        vectorstore = AzureCosmosDBVectorSearch.from_documents(
+            documents,
+            azure_openai_embeddings,
+            collection=collection,
+            index_name=INDEX_NAME_VECTOR_HNSW,
+        )
+        sleep(1)  # waits for Cosmos DB to save contents to the collection
+
+        # Create the HNSW index that will be leveraged later for vector search
+        vectorstore.create_index(
+            num_lists,
+            dimensions,
+            similarity_algorithm,
+            CosmosDBVectorSearchType.VECTOR_HNSW,
+            m,
+            ef_construction,
+        )
+        sleep(2)  # waits for the index to be set up
+
+        output = vectorstore.similarity_search(
+            "Sandwich",
+            k=1,
+            kind=CosmosDBVectorSearchType.VECTOR_HNSW,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )
+
+        assert output
+        assert output[0].page_content == "What is a sandwich?"
+        assert output[0].metadata["c"] == 1
+
+        vectorstore.delete_index()
+
+    def test_from_documents_inner_product_vector_hnsw(
+        self, azure_openai_embeddings: OpenAIEmbeddings, collection: Any
+    ) -> None:
+        """Test end to end construction and search."""
+        documents = [
+            Document(page_content="Dogs are tough.", metadata={"a": 1}),
+            Document(page_content="Cats have fluff.", metadata={"b": 1}),
+            Document(page_content="What is a sandwich?", metadata={"c": 1}),
+            Document(page_content="That fence is purple.", metadata={"d": 1, "e": 2}),
+        ]
+
+        vectorstore = AzureCosmosDBVectorSearch.from_documents(
+            documents,
+            azure_openai_embeddings,
+            collection=collection,
+            index_name=INDEX_NAME_VECTOR_HNSW,
+        )
+        sleep(1)  # waits for Cosmos DB to save contents to the collection
+
+        # Create the HNSW index that will be leveraged later for vector search
+        vectorstore.create_index(
+            num_lists,
+            dimensions,
+            CosmosDBSimilarityType.IP,
+            CosmosDBVectorSearchType.VECTOR_HNSW,
+            m,
+            ef_construction,
+        )
+        sleep(2)  # waits for the index to be set up
+
+        output = vectorstore.similarity_search(
+            "Sandwich",
+            k=1,
+            kind=CosmosDBVectorSearchType.VECTOR_HNSW,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )
+
+        assert output
+        assert output[0].page_content == "What is a sandwich?"
+        assert output[0].metadata["c"] == 1
+
+        vectorstore.delete_index()
+
+    def test_from_texts_cosine_distance_vector_hnsw(
+        self, azure_openai_embeddings: OpenAIEmbeddings, collection: Any
+    ) -> None:
+        texts = [
+            "Dogs are tough.",
+            "Cats have fluff.",
+            "What is a sandwich?",
+            "That fence is purple.",
+        ]
+        vectorstore = AzureCosmosDBVectorSearch.from_texts(
+            texts,
+            azure_openai_embeddings,
+            collection=collection,
+            index_name=INDEX_NAME_VECTOR_HNSW,
+        )
+
+        # Create the HNSW index that will be leveraged later for vector search
+        vectorstore.create_index(
+            num_lists,
+            dimensions,
+            similarity_algorithm,
+            CosmosDBVectorSearchType.VECTOR_HNSW,
+            m,
+            ef_construction,
+        )
+        sleep(2)  # waits for the index to be set up
+
+        output = vectorstore.similarity_search(
+            "Sandwich",
+            k=1,
+            kind=CosmosDBVectorSearchType.VECTOR_HNSW,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )
+
+        assert output[0].page_content == "What is a sandwich?"
+
+        vectorstore.delete_index()
+
+    def test_from_texts_with_metadatas_cosine_distance_vector_hnsw(
        self, azure_openai_embeddings: OpenAIEmbeddings, collection: Any
+    ) -> None:
+        texts = [
+            "Dogs are tough.",
+            "Cats have fluff.",
+            "What is a sandwich?",
+            "The fence is purple.",
+        ]
+        metadatas = [{"a": 1}, {"b": 1}, {"c": 1}, {"d": 1, "e": 2}]
+        vectorstore = AzureCosmosDBVectorSearch.from_texts(
+            texts,
+            azure_openai_embeddings,
+            metadatas=metadatas,
+            collection=collection,
+            index_name=INDEX_NAME_VECTOR_HNSW,
+        )
+
+        # Create the HNSW index that will be leveraged later for vector search
+        vectorstore.create_index(
+            num_lists,
+            dimensions,
+            similarity_algorithm,
+            CosmosDBVectorSearchType.VECTOR_HNSW,
+            m,
+            ef_construction,
+        )
+        sleep(2)  # waits for the index to be set up
+
+        output = vectorstore.similarity_search(
+            "Sandwich",
+            k=1,
+            kind=CosmosDBVectorSearchType.VECTOR_HNSW,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )
+
+        assert output
+        assert output[0].page_content == "What is a sandwich?"
+        assert output[0].metadata["c"] == 1
+
+        vectorstore.delete_index()
+
+    def test_from_texts_with_metadatas_delete_one_vector_hnsw(
+        self, azure_openai_embeddings: OpenAIEmbeddings, collection: Any
+    ) -> None:
+        texts = [
+            "Dogs are tough.",
+            "Cats have fluff.",
+            "What is a sandwich?",
+            "The fence is purple.",
+        ]
+        metadatas = [{"a": 1}, {"b": 1}, {"c": 1}, {"d": 1, "e": 2}]
+        vectorstore = AzureCosmosDBVectorSearch.from_texts(
+            texts,
+            azure_openai_embeddings,
+            metadatas=metadatas,
+            collection=collection,
+            index_name=INDEX_NAME_VECTOR_HNSW,
+        )
+
+        # Create the HNSW index that will be leveraged later for vector search
+        vectorstore.create_index(
+            num_lists,
+            dimensions,
+            similarity_algorithm,
+            CosmosDBVectorSearchType.VECTOR_HNSW,
+            m,
+            ef_construction,
+        )
+        sleep(2)  # waits for the index to be set up
+
+        output = vectorstore.similarity_search(
+            "Sandwich",
+            k=1,
+            kind=CosmosDBVectorSearchType.VECTOR_HNSW,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )
+
+        assert output
+        assert output[0].page_content == "What is a sandwich?"
+        assert output[0].metadata["c"] == 1
+
+        first_document_id_object = output[0].metadata["_id"]
+        first_document_id = str(first_document_id_object)
+
+        vectorstore.delete_document_by_id(first_document_id)
+        sleep(2)  # waits for the index to be updated
+
+        output2 = vectorstore.similarity_search(
+            "Sandwich",
+            k=1,
+            kind=CosmosDBVectorSearchType.VECTOR_HNSW,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )
+        assert output2
+        assert output2[0].page_content != "What is a sandwich?"
+
+        vectorstore.delete_index()
+
+    def test_from_texts_with_metadatas_delete_multiple_vector_hnsw(
+        self, azure_openai_embeddings: OpenAIEmbeddings, collection: Any
+    ) -> None:
+        texts = [
+            "Dogs are tough.",
+            "Cats have fluff.",
+            "What is a sandwich?",
+            "The fence is purple.",
+        ]
+        metadatas = [{"a": 1}, {"b": 1}, {"c": 1}, {"d": 1, "e": 2}]
+        vectorstore = AzureCosmosDBVectorSearch.from_texts(
+            texts,
+            azure_openai_embeddings,
+            metadatas=metadatas,
+            collection=collection,
+            index_name=INDEX_NAME_VECTOR_HNSW,
+        )
+
+        # Create the HNSW index that will be leveraged later for vector search
+        vectorstore.create_index(
+            num_lists,
+            dimensions,
+            similarity_algorithm,
+            CosmosDBVectorSearchType.VECTOR_HNSW,
+            m,
+            ef_construction,
+        )
+        sleep(2)  # waits for the index to be set up
+
+        output = vectorstore.similarity_search(
+            "Sandwich",
+            k=5,
+            kind=CosmosDBVectorSearchType.VECTOR_HNSW,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )
+
+        first_document_id = str(output[0].metadata["_id"])
+
+        second_document_id = str(output[1].metadata["_id"])
+
+        third_document_id = str(output[2].metadata["_id"])
+
+        document_ids = [first_document_id, second_document_id, third_document_id]
+        vectorstore.delete(document_ids)
+        sleep(2)  # waits for the index to be updated
+
+        output_2 = vectorstore.similarity_search(
+            "Sandwich",
+            k=5,
+            kind=CosmosDBVectorSearchType.VECTOR_HNSW,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )
+        assert output
+        assert output_2
+
+        assert len(output) == 4  # we should see all the four documents
+        assert (
+            len(output_2) == 1
+        )  # we should see only one document left after three have been deleted
+
+        vectorstore.delete_index()
+
+    def test_from_texts_with_metadatas_inner_product_vector_hnsw(
+        self, azure_openai_embeddings: OpenAIEmbeddings, collection: Any
+    ) -> None:
+        texts = [
+            "Dogs are tough.",
+            "Cats have fluff.",
+            "What is a sandwich?",
+            "The fence is purple.",
+        ]
+        metadatas = [{"a": 1}, {"b": 1}, {"c": 1}, {"d": 1, "e": 2}]
+        vectorstore = AzureCosmosDBVectorSearch.from_texts(
+            texts,
+            azure_openai_embeddings,
+            metadatas=metadatas,
+            collection=collection,
+            index_name=INDEX_NAME_VECTOR_HNSW,
+        )
+
+        # Create the HNSW index that will be leveraged later for vector search
+        vectorstore.create_index(
+            num_lists,
+            dimensions,
+            CosmosDBSimilarityType.IP,
+            CosmosDBVectorSearchType.VECTOR_HNSW,
+            m,
+            ef_construction,
+        )
+        sleep(2)  # waits for the index to be set up
+
+        output = vectorstore.similarity_search(
+            "Sandwich",
+            k=1,
+            kind=CosmosDBVectorSearchType.VECTOR_HNSW,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )
+
+        assert output
+        assert output[0].page_content == "What is a sandwich?"
+        assert output[0].metadata["c"] == 1
+
+        vectorstore.delete_index()
+
+    def test_max_marginal_relevance_cosine_distance_vector_hnsw(
+        self, azure_openai_embeddings: OpenAIEmbeddings, collection: Any
+    ) -> None:
+        texts = ["foo", "foo", "fou", "foy"]
+        vectorstore = AzureCosmosDBVectorSearch.from_texts(
+            texts,
+            azure_openai_embeddings,
+            collection=collection,
+            index_name=INDEX_NAME_VECTOR_HNSW,
+        )
+
+        # Create the HNSW index that will be leveraged later for vector search
+        vectorstore.create_index(
+            num_lists,
+            dimensions,
+            similarity_algorithm,
+            CosmosDBVectorSearchType.VECTOR_HNSW,
+            m,
+            ef_construction,
+        )
+        sleep(2)  # waits for the index to be set up
+
+        query = "foo"
+        output = vectorstore.max_marginal_relevance_search(
+            query,
+            k=10,
+            kind=CosmosDBVectorSearchType.VECTOR_HNSW,
+            lambda_mult=0.1,
+            score_threshold=score_threshold,
+        )
+
+        assert len(output) == len(texts)
+        assert output[0].page_content == "foo"
+        assert output[1].page_content != "foo"
+
+        vectorstore.delete_index()
+
+    def test_max_marginal_relevance_inner_product_vector_hnsw(
+        self, azure_openai_embeddings: OpenAIEmbeddings, collection: Any
+    ) -> None:
+        texts = ["foo", "foo", "fou", "foy"]
+        vectorstore = AzureCosmosDBVectorSearch.from_texts(
+            texts,
+            azure_openai_embeddings,
+            collection=collection,
+            index_name=INDEX_NAME_VECTOR_HNSW,
+        )
+
+        # Create the HNSW index that will be leveraged later for vector search
+        vectorstore.create_index(
+            num_lists,
+            dimensions,
+            CosmosDBSimilarityType.IP,
+            CosmosDBVectorSearchType.VECTOR_HNSW,
+            m,
+            ef_construction,
+        )
+        sleep(2)  # waits for the index to be set up
+
+        query = "foo"
+        output = vectorstore.max_marginal_relevance_search(
+            query,
+            k=10,
+            kind=CosmosDBVectorSearchType.VECTOR_HNSW,
+            lambda_mult=0.1,
+            score_threshold=score_threshold,
+        )
+
+        assert len(output) == len(texts)
+        assert output[0].page_content == "foo"
+        assert output[1].page_content != "foo"
+
+        vectorstore.delete_index()
+
+    @staticmethod
+    def invoke_delete_with_no_args(
+        azure_openai_embeddings: OpenAIEmbeddings, collection: Any
    ) -> Optional[bool]:
        vectorstore: AzureCosmosDBVectorSearch = (
            AzureCosmosDBVectorSearch.from_connection_string(
@ -406,8 +916,9 @@ class TestAzureCosmosDBVectorSearch:

        return vectorstore.delete()

+    @staticmethod
    def invoke_delete_by_id_with_no_args(
-        self, azure_openai_embeddings: OpenAIEmbeddings, collection: Any
+        azure_openai_embeddings: OpenAIEmbeddings, collection: Any
    ) -> None:
        vectorstore: AzureCosmosDBVectorSearch = (
            AzureCosmosDBVectorSearch.from_connection_string(
@ -431,5 +942,7 @@ class TestAzureCosmosDBVectorSearch:
        self, azure_openai_embeddings: OpenAIEmbeddings, collection: Any
    ) -> None:
        with pytest.raises(Exception) as exception_info:
-            self.invoke_delete_by_id_with_no_args(azure_openai_embeddings, collection)
+            self.invoke_delete_by_id_with_no_args(
+                azure_openai_embeddings=azure_openai_embeddings, collection=collection
+            )
        assert str(exception_info.value) == "No document id provided to delete."
--- a/libs/langchain/tests/integration_tests/cache/test_azure_cosmosdb_cache.py
+++ b/libs/langchain/tests/integration_tests/cache/test_azure_cosmosdb_cache.py
@ -0,0 +1,350 @@
+"""Test Azure CosmosDB cache functionality.
+
+Required to run this test:
+    - a recent version of the 'pymongo' Python package installed
+    - an Azure CosmosDB Mongo vCore instance
+    - one environment variable set:
+        export MONGODB_VCORE_URI="connection string for azure cosmos db mongo vCore"
+"""
+import os
+import uuid
+
+import pytest
+from langchain_community.cache import AzureCosmosDBSemanticCache
+from langchain_community.vectorstores.azure_cosmos_db import (
+    CosmosDBSimilarityType,
+    CosmosDBVectorSearchType,
+)
+from langchain_core.outputs import Generation
+
+from langchain.globals import get_llm_cache, set_llm_cache
+from tests.integration_tests.cache.fake_embeddings import (
+    FakeEmbeddings,
+)
+from tests.unit_tests.llms.fake_llm import FakeLLM
+
+INDEX_NAME = "langchain-test-index"
+NAMESPACE = "langchain_test_db.langchain_test_collection"
+CONNECTION_STRING: str = os.environ.get("MONGODB_VCORE_URI", "")
+DB_NAME, COLLECTION_NAME = NAMESPACE.split(".")
+
+num_lists = 3
+dimensions = 10
+similarity_algorithm = CosmosDBSimilarityType.COS
+kind = CosmosDBVectorSearchType.VECTOR_IVF
+m = 16
+ef_construction = 64
+ef_search = 40
+score_threshold = 0.1
+
+
+def _has_env_vars() -> bool:
+    return "MONGODB_VCORE_URI" in os.environ
+
+
+def random_string() -> str:
+    return str(uuid.uuid4())
+
+
+@pytest.mark.requires("pymongo")
+@pytest.mark.skipif(
+    not _has_env_vars(), reason="Missing Azure CosmosDB Mongo vCore env. vars"
+)
+def test_azure_cosmos_db_semantic_cache() -> None:
+    set_llm_cache(
+        AzureCosmosDBSemanticCache(
+            cosmosdb_connection_string=CONNECTION_STRING,
+            cosmosdb_client=None,
+            embedding=FakeEmbeddings(),
+            database_name=DB_NAME,
+            collection_name=COLLECTION_NAME,
+            num_lists=num_lists,
+            similarity=similarity_algorithm,
+            kind=kind,
+            dimensions=dimensions,
+            m=m,
+            ef_construction=ef_construction,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )
+    )
+
+    llm = FakeLLM()
+    params = llm.dict()
+    params["stop"] = None
+    llm_string = str(sorted([(k, v) for k, v in params.items()]))
+    get_llm_cache().update("foo", llm_string, [Generation(text="fizz")])
+
+    # foo and bar will have the same embedding produced by FakeEmbeddings
+    cache_output = get_llm_cache().lookup("bar", llm_string)
+    assert cache_output == [Generation(text="fizz")]
+
+    # clear the cache
+    get_llm_cache().clear(llm_string=llm_string)
+
+
+@pytest.mark.requires("pymongo")
+@pytest.mark.skipif(
+    not _has_env_vars(), reason="Missing Azure CosmosDB Mongo vCore env. vars"
+)
+def test_azure_cosmos_db_semantic_cache_inner_product() -> None:
+    set_llm_cache(
+        AzureCosmosDBSemanticCache(
+            cosmosdb_connection_string=CONNECTION_STRING,
+            cosmosdb_client=None,
+            embedding=FakeEmbeddings(),
+            database_name=DB_NAME,
+            collection_name=COLLECTION_NAME,
+            num_lists=num_lists,
+            similarity=CosmosDBSimilarityType.IP,
+            kind=kind,
+            dimensions=dimensions,
+            m=m,
+            ef_construction=ef_construction,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )
+    )
+
+    llm = FakeLLM()
+    params = llm.dict()
+    params["stop"] = None
+    llm_string = str(sorted([(k, v) for k, v in params.items()]))
+    get_llm_cache().update("foo", llm_string, [Generation(text="fizz")])
+
+    # foo and bar will have the same embedding produced by FakeEmbeddings
+    cache_output = get_llm_cache().lookup("bar", llm_string)
+    assert cache_output == [Generation(text="fizz")]
+
+    # clear the cache
+    get_llm_cache().clear(llm_string=llm_string)
+
+
+@pytest.mark.requires("pymongo")
+@pytest.mark.skipif(
+    not _has_env_vars(), reason="Missing Azure CosmosDB Mongo vCore env. vars"
+)
+def test_azure_cosmos_db_semantic_cache_multi() -> None:
+    set_llm_cache(
+        AzureCosmosDBSemanticCache(
+            cosmosdb_connection_string=CONNECTION_STRING,
+            cosmosdb_client=None,
+            embedding=FakeEmbeddings(),
+            database_name=DB_NAME,
+            collection_name=COLLECTION_NAME,
+            num_lists=num_lists,
+            similarity=similarity_algorithm,
+            kind=kind,
+            dimensions=dimensions,
+            m=m,
+            ef_construction=ef_construction,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )
+    )
+
+    llm = FakeLLM()
+    params = llm.dict()
+    params["stop"] = None
+    llm_string = str(sorted([(k, v) for k, v in params.items()]))
+    get_llm_cache().update(
+        "foo", llm_string, [Generation(text="fizz"), Generation(text="Buzz")]
+    )
+
+    # foo and bar will have the same embedding produced by FakeEmbeddings
+    cache_output = get_llm_cache().lookup("bar", llm_string)
+    assert cache_output == [Generation(text="fizz"), Generation(text="Buzz")]
+
+    # clear the cache
+    get_llm_cache().clear(llm_string=llm_string)
+
+
+@pytest.mark.requires("pymongo")
+@pytest.mark.skipif(
+    not _has_env_vars(), reason="Missing Azure CosmosDB Mongo vCore env. vars"
+)
+def test_azure_cosmos_db_semantic_cache_multi_inner_product() -> None:
+    set_llm_cache(
+        AzureCosmosDBSemanticCache(
+            cosmosdb_connection_string=CONNECTION_STRING,
+            cosmosdb_client=None,
+            embedding=FakeEmbeddings(),
+            database_name=DB_NAME,
+            collection_name=COLLECTION_NAME,
+            num_lists=num_lists,
+            similarity=CosmosDBSimilarityType.IP,
+            kind=kind,
+            dimensions=dimensions,
+            m=m,
+            ef_construction=ef_construction,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )
+    )
+
+    llm = FakeLLM()
+    params = llm.dict()
+    params["stop"] = None
+    llm_string = str(sorted([(k, v) for k, v in params.items()]))
+    get_llm_cache().update(
+        "foo", llm_string, [Generation(text="fizz"), Generation(text="Buzz")]
+    )
+
+    # foo and bar will have the same embedding produced by FakeEmbeddings
+    cache_output = get_llm_cache().lookup("bar", llm_string)
+    assert cache_output == [Generation(text="fizz"), Generation(text="Buzz")]
+
+    # clear the cache
+    get_llm_cache().clear(llm_string=llm_string)
+
+
+@pytest.mark.requires("pymongo")
+@pytest.mark.skipif(
+    not _has_env_vars(), reason="Missing Azure CosmosDB Mongo vCore env. vars"
+)
+def test_azure_cosmos_db_semantic_cache_hnsw() -> None:
+    set_llm_cache(
+        AzureCosmosDBSemanticCache(
+            cosmosdb_connection_string=CONNECTION_STRING,
+            cosmosdb_client=None,
+            embedding=FakeEmbeddings(),
+            database_name=DB_NAME,
+            collection_name=COLLECTION_NAME,
+            num_lists=num_lists,
+            similarity=similarity_algorithm,
+            kind=CosmosDBVectorSearchType.VECTOR_HNSW,
+            dimensions=dimensions,
+            m=m,
+            ef_construction=ef_construction,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )
+    )
+
+    llm = FakeLLM()
+    params = llm.dict()
+    params["stop"] = None
+    llm_string = str(sorted([(k, v) for k, v in params.items()]))
+    get_llm_cache().update("foo", llm_string, [Generation(text="fizz")])
+
+    # foo and bar will have the same embedding produced by FakeEmbeddings
+    cache_output = get_llm_cache().lookup("bar", llm_string)
+    assert cache_output == [Generation(text="fizz")]
+
+    # clear the cache
+    get_llm_cache().clear(llm_string=llm_string)
+
+
+@pytest.mark.requires("pymongo")
+@pytest.mark.skipif(
+    not _has_env_vars(), reason="Missing Azure CosmosDB Mongo vCore env. vars"
+)
+def test_azure_cosmos_db_semantic_cache_inner_product_hnsw() -> None:
+    set_llm_cache(
+        AzureCosmosDBSemanticCache(
+            cosmosdb_connection_string=CONNECTION_STRING,
+            cosmosdb_client=None,
+            embedding=FakeEmbeddings(),
+            database_name=DB_NAME,
+            collection_name=COLLECTION_NAME,
+            num_lists=num_lists,
+            similarity=CosmosDBSimilarityType.IP,
+            kind=CosmosDBVectorSearchType.VECTOR_HNSW,
+            dimensions=dimensions,
+            m=m,
+            ef_construction=ef_construction,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )
+    )
+
+    llm = FakeLLM()
+    params = llm.dict()
+    params["stop"] = None
+    llm_string = str(sorted([(k, v) for k, v in params.items()]))
+    get_llm_cache().update("foo", llm_string, [Generation(text="fizz")])
+
+    # foo and bar will have the same embedding produced by FakeEmbeddings
+    cache_output = get_llm_cache().lookup("bar", llm_string)
+    assert cache_output == [Generation(text="fizz")]
+
+    # clear the cache
+    get_llm_cache().clear(llm_string=llm_string)
+
+
+@pytest.mark.requires("pymongo")
+@pytest.mark.skipif(
+    not _has_env_vars(), reason="Missing Azure CosmosDB Mongo vCore env. vars"
+)
+def test_azure_cosmos_db_semantic_cache_multi_hnsw() -> None:
+    set_llm_cache(
+        AzureCosmosDBSemanticCache(
+            cosmosdb_connection_string=CONNECTION_STRING,
+            cosmosdb_client=None,
+            embedding=FakeEmbeddings(),
+            database_name=DB_NAME,
+            collection_name=COLLECTION_NAME,
+            num_lists=num_lists,
+            similarity=similarity_algorithm,
+            kind=CosmosDBVectorSearchType.VECTOR_HNSW,
+            dimensions=dimensions,
+            m=m,
+            ef_construction=ef_construction,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )
+    )
+
+    llm = FakeLLM()
+    params = llm.dict()
+    params["stop"] = None
+    llm_string = str(sorted([(k, v) for k, v in params.items()]))
+    get_llm_cache().update(
+        "foo", llm_string, [Generation(text="fizz"), Generation(text="Buzz")]
+    )
+
+    # foo and bar will have the same embedding produced by FakeEmbeddings
+    cache_output = get_llm_cache().lookup("bar", llm_string)
+    assert cache_output == [Generation(text="fizz"), Generation(text="Buzz")]
+
+    # clear the cache
+    get_llm_cache().clear(llm_string=llm_string)
+
+
+@pytest.mark.requires("pymongo")
+@pytest.mark.skipif(
+    not _has_env_vars(), reason="Missing Azure CosmosDB Mongo vCore env. vars"
+)
+def test_azure_cosmos_db_semantic_cache_multi_inner_product_hnsw() -> None:
+    set_llm_cache(
+        AzureCosmosDBSemanticCache(
+            cosmosdb_connection_string=CONNECTION_STRING,
+            cosmosdb_client=None,
+            embedding=FakeEmbeddings(),
+            database_name=DB_NAME,
+            collection_name=COLLECTION_NAME,
+            num_lists=num_lists,
+            similarity=CosmosDBSimilarityType.IP,
+            kind=CosmosDBVectorSearchType.VECTOR_HNSW,
+            dimensions=dimensions,
+            m=m,
+            ef_construction=ef_construction,
+            ef_search=ef_search,
+            score_threshold=score_threshold,
+        )
+    )
+
+    llm = FakeLLM()
+    params = llm.dict()
+    params["stop"] = None
+    llm_string = str(sorted([(k, v) for k, v in params.items()]))
+    get_llm_cache().update(
+        "foo", llm_string, [Generation(text="fizz"), Generation(text="Buzz")]
+    )
+
+    # foo and bar will have the same embedding produced by FakeEmbeddings
+    cache_output = get_llm_cache().lookup("bar", llm_string)
+    assert cache_output == [Generation(text="fizz"), Generation(text="Buzz")]
+
+    # clear the cache
+    get_llm_cache().clear(llm_string=llm_string)
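
Each test above follows the same pattern: store generations under the prompt "foo", look them up with the prompt "bar", then clear the entries for that `llm_string`. The lookup hits because FakeEmbeddings gives both prompts the same embedding. The sketch below reproduces that pattern with a purely in-memory stand-in. It is not the AzureCosmosDBSemanticCache implementation: `InMemorySemanticCache` and `constant_embedding` are hypothetical names, generations are plain strings, and treating `score_threshold` as a minimum cosine similarity is an assumption made for illustration.

```python
import math
from typing import List, Optional, Sequence, Tuple


def constant_embedding(text: str, dimensions: int = 10) -> List[float]:
    """Hypothetical stand-in for FakeEmbeddings: every text gets the same vector."""
    return [1.0] * dimensions


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemorySemanticCache:
    """Illustrative only: a list-backed cache keyed by embedding similarity."""

    def __init__(self, score_threshold: float = 0.1) -> None:
        # Assumption: entries scoring below this similarity count as misses.
        self.score_threshold = score_threshold
        self._entries: List[Tuple[List[float], str, List[str]]] = []

    def update(self, prompt: str, llm_string: str, generations: Sequence[str]) -> None:
        """Store the generations under the prompt's embedding and the LLM config."""
        self._entries.append((constant_embedding(prompt), llm_string, list(generations)))

    def lookup(self, prompt: str, llm_string: str) -> Optional[List[str]]:
        """Return the most similar entry for this LLM config, or None on a miss."""
        query = constant_embedding(prompt)
        best: Optional[List[str]] = None
        best_score = self.score_threshold
        for vector, stored_llm_string, generations in self._entries:
            if stored_llm_string != llm_string:
                continue  # entries are scoped to one LLM configuration
            score = cosine_similarity(query, vector)
            if score >= best_score:
                best, best_score = generations, score
        return best

    def clear(self, llm_string: str) -> None:
        """Drop every entry stored for the given LLM configuration."""
        self._entries = [e for e in self._entries if e[1] != llm_string]


# Same flow as the tests: update with "foo", look up with "bar", then clear.
params = {"model": "fake", "stop": None}  # hypothetical LLM parameters
llm_string = str(sorted(params.items()))
cache = InMemorySemanticCache(score_threshold=0.1)
cache.update("foo", llm_string, ["fizz", "Buzz"])
hit = cache.lookup("bar", llm_string)
cache.clear(llm_string)
miss = cache.lookup("bar", llm_string)
```

Because the stand-in embeds every prompt identically, `hit` holds the stored generations and `miss` is `None` after the clear. With a real embedding model, whether "bar" matches "foo" would depend on the similarity metric and threshold in use.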