community[minor]: Add VDMS vectorstore (#19551)

- **Description:** Add support for Intel Labs' [Visual Data Management
System (VDMS)](https://github.com/IntelLabs/vdms) as a vector store.
- **Dependencies:** the `vdms` library, which requires `protobuf = "4.24.2"`.
This conflicts with `dashvector` in the `langchain` package, but the
conflict is resolved in `community`.
- **Contribution maintainer:** [@cwlacewe](https://github.com/cwlacewe)
- **Added tests:**
libs/community/tests/integration_tests/vectorstores/test_vdms.py
- **Added docs:** docs/docs/integrations/vectorstores/vdms.ipynb
- **Added cookbook:** cookbook/multi_modal_RAG_vdms.ipynb

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>

File diff suppressed because one or more lines are too long

@@ -4,14 +4,14 @@
# ATTENTION: When adding a service below use a non-standard port
# increment by one from the preceding port.
# For credentials always use `langchain` and `langchain` for the
# username and password.
version: "3"
name: langchain-tests
services:
redis:
image: redis/redis-stack-server:latest
# We use non standard ports since
# these instances are used for testing
# and users may already have existing
# redis instances set up locally
@@ -73,6 +73,11 @@ services:
retries: 60
volumes:
- postgres_data_pgvector:/var/lib/postgresql/data
vdms:
image: intellabs/vdms:latest
container_name: vdms_container
ports:
- "6025:55555"
volumes:
postgres_data:
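For context, the `vdms` service above forwards host port 6025 to the container's default VDMS port 55555, which is why the integration tests connect on 6025. A minimal connection sketch mirroring the test fixture (the `VDMS_DBHOST`/`VDMS_DBPORT` environment variables are the ones the tests read; the defaults are assumptions for local use):

```python
import os

from langchain_community.vectorstores.vdms import VDMS_Client

# Connect to the compose-managed VDMS test service; host port 6025 is
# forwarded to the container's default port 55555.
client = VDMS_Client(
    host=os.getenv("VDMS_DBHOST", "localhost"),
    port=int(os.getenv("VDMS_DBPORT", 6025)),
)
```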

@@ -0,0 +1,62 @@
# VDMS

> [VDMS](https://github.com/IntelLabs/vdms/blob/master/README.md) is a storage solution for efficient access
> of big-"visual"-data that aims to achieve cloud scale by searching for relevant visual data via visual metadata
> stored as a graph and enabling machine friendly enhancements to visual data for faster access.

## Installation and Setup

### Install Client

```bash
pip install vdms
```

### Install Database

There are two ways to get started with VDMS:

#### Install VDMS on your local machine via docker

```bash
docker run -d -p 55555:55555 intellabs/vdms:latest
```

#### Install VDMS directly on your local machine

Please see [installation instructions](https://github.com/IntelLabs/vdms/blob/master/INSTALL.md).

## VectorStore

The vector store is a lightweight wrapper around VDMS that provides a simple interface for storing and retrieving documents.

```python
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.document_loaders import TextLoader
from langchain_community.embeddings.huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import VDMS
from langchain_community.vectorstores.vdms import VDMS_Client

# Load and chunk the source document
loader = TextLoader("./state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

# Connect to VDMS (default port 55555) and index the chunks
client = VDMS_Client("localhost", 55555)
vectorstore = VDMS.from_documents(
    docs,
    client=client,
    collection_name="langchain-demo",
    embedding_function=HuggingFaceEmbeddings(),
    engine="FaissFlat",
    distance_strategy="L2",
)

query = "What did the president say about Ketanji Brown Jackson"
results = vectorstore.similarity_search(query)
```
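Beyond plain similarity search, the wrapper added in this PR also supports scored and metadata-filtered queries; the constraint syntax `{"field": ["==", value]}` below mirrors the integration tests in `test_vdms.py`. A minimal sketch, assuming the stored documents carry a hypothetical `page` metadata field:

```python
# Scored search with a metadata constraint; the filter syntax follows
# the integration tests. The "page" key is an assumption; use whatever
# metadata your documents were stored with.
results_with_scores = vectorstore.similarity_search_with_score(
    query, k=3, filter={"page": ["==", "1"]}
)
for doc, score in results_with_scores:
    print(score, doc.page_content)
```

With the default `L2` distance strategy, lower scores indicate closer matches (the tests expect `0.0` for an exact match).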
For a more detailed walkthrough of the VDMS wrapper, see [this notebook](/docs/integrations/vectorstores/vdms).

File diff suppressed because it is too large

@@ -60,7 +60,7 @@
" * document addition by id (`add_documents` method with `ids` argument)\n",
" * delete by id (`delete` method with `ids` argument)\n",
"\n",
"Compatible Vectorstores: `AnalyticDB`, `AstraDB`, `AwaDB`, `Bagel`, `Cassandra`, `Chroma`, `CouchbaseVectorStore`, `DashVector`, `DatabricksVectorSearch`, `DeepLake`, `Dingo`, `ElasticVectorSearch`, `ElasticsearchStore`, `FAISS`, `HanaDB`, `Milvus`, `MyScale`, `OpenSearchVectorSearch`, `PGVector`, `Pinecone`, `Qdrant`, `Redis`, `Rockset`, `ScaNN`, `SupabaseVectorStore`, `SurrealDBStore`, `TimescaleVector`, `Vald`, `Vearch`, `VespaStore`, `Weaviate`, `ZepVectorStore`.\n",
"Compatible Vectorstores: `AnalyticDB`, `AstraDB`, `AwaDB`, `Bagel`, `Cassandra`, `Chroma`, `CouchbaseVectorStore`, `DashVector`, `DatabricksVectorSearch`, `DeepLake`, `Dingo`, `ElasticVectorSearch`, `ElasticsearchStore`, `FAISS`, `HanaDB`, `Milvus`, `MyScale`, `OpenSearchVectorSearch`, `PGVector`, `Pinecone`, `Qdrant`, `Redis`, `Rockset`, `ScaNN`, `SupabaseVectorStore`, `SurrealDBStore`, `TimescaleVector`, `Vald`, `VDMS`, `Vearch`, `VespaStore`, `Weaviate`, `ZepVectorStore`.\n",
" \n",
"## Caution\n",
"\n",

@@ -102,6 +102,7 @@ _module_lookup = {
"Typesense": "langchain_community.vectorstores.typesense",
"USearch": "langchain_community.vectorstores.usearch",
"Vald": "langchain_community.vectorstores.vald",
"VDMS": "langchain_community.vectorstores.vdms",
"Vearch": "langchain_community.vectorstores.vearch",
"Vectara": "langchain_community.vectorstores.vectara",
"VectorStore": "langchain_core.vectorstores",

File diff suppressed because it is too large

@@ -3725,7 +3725,7 @@ files = [
[[package]]
name = "langchain-core"
version = "0.1.34"
version = "0.1.35"
description = "Building applications with LLMs through composability"
optional = false
python-versions = ">=3.8.1,<4.0"
@@ -5467,22 +5467,24 @@ testing = ["google-api-core[grpc] (>=1.31.5)"]
[[package]]
name = "protobuf"
version = "4.25.3"
version = "4.24.2"
description = ""
optional = false
python-versions = ">=3.8"
python-versions = ">=3.7"
files = [
{file = "protobuf-4.25.3-cp310-abi3-win32.whl", hash = "sha256:d4198877797a83cbfe9bffa3803602bbe1625dc30d8a097365dbc762e5790faa"},
{file = "protobuf-4.25.3-cp310-abi3-win_amd64.whl", hash = "sha256:209ba4cc916bab46f64e56b85b090607a676f66b473e6b762e6f1d9d591eb2e8"},
{file = "protobuf-4.25.3-cp37-abi3-macosx_10_9_universal2.whl", hash = "sha256:f1279ab38ecbfae7e456a108c5c0681e4956d5b1090027c1de0f934dfdb4b35c"},
{file = "protobuf-4.25.3-cp37-abi3-manylinux2014_aarch64.whl", hash = "sha256:e7cb0ae90dd83727f0c0718634ed56837bfeeee29a5f82a7514c03ee1364c019"},
{file = "protobuf-4.25.3-cp37-abi3-manylinux2014_x86_64.whl", hash = "sha256:7c8daa26095f82482307bc717364e7c13f4f1c99659be82890dcfc215194554d"},
{file = "protobuf-4.25.3-cp38-cp38-win32.whl", hash = "sha256:f4f118245c4a087776e0a8408be33cf09f6c547442c00395fbfb116fac2f8ac2"},
{file = "protobuf-4.25.3-cp38-cp38-win_amd64.whl", hash = "sha256:c053062984e61144385022e53678fbded7aea14ebb3e0305ae3592fb219ccfa4"},
{file = "protobuf-4.25.3-cp39-cp39-win32.whl", hash = "sha256:19b270aeaa0099f16d3ca02628546b8baefe2955bbe23224aaf856134eccf1e4"},
{file = "protobuf-4.25.3-cp39-cp39-win_amd64.whl", hash = "sha256:e3c97a1555fd6388f857770ff8b9703083de6bf1f9274a002a332d65fbb56c8c"},
{file = "protobuf-4.25.3-py3-none-any.whl", hash = "sha256:f0700d54bcf45424477e46a9f0944155b46fb0639d69728739c0e47bab83f2b9"},
{file = "protobuf-4.25.3.tar.gz", hash = "sha256:25b5d0b42fd000320bd7830b349e3b696435f3b329810427a6bcce6a5492cc5c"},
{file = "protobuf-4.24.2-cp310-abi3-win32.whl", hash = "sha256:58e12d2c1aa428ece2281cef09bbaa6938b083bcda606db3da4e02e991a0d924"},
{file = "protobuf-4.24.2-cp310-abi3-win_amd64.whl", hash = "sha256:77700b55ba41144fc64828e02afb41901b42497b8217b558e4a001f18a85f2e3"},
{file = "protobuf-4.24.2-cp37-abi3-macosx_10_9_universal2.whl", hash = "sha256:237b9a50bd3b7307d0d834c1b0eb1a6cd47d3f4c2da840802cd03ea288ae8880"},
{file = "protobuf-4.24.2-cp37-abi3-manylinux2014_aarch64.whl", hash = "sha256:25ae91d21e3ce8d874211110c2f7edd6384816fb44e06b2867afe35139e1fd1c"},
{file = "protobuf-4.24.2-cp37-abi3-manylinux2014_x86_64.whl", hash = "sha256:c00c3c7eb9ad3833806e21e86dca448f46035242a680f81c3fe068ff65e79c74"},
{file = "protobuf-4.24.2-cp37-cp37m-win32.whl", hash = "sha256:4e69965e7e54de4db989289a9b971a099e626f6167a9351e9d112221fc691bc1"},
{file = "protobuf-4.24.2-cp37-cp37m-win_amd64.whl", hash = "sha256:c5cdd486af081bf752225b26809d2d0a85e575b80a84cde5172a05bbb1990099"},
{file = "protobuf-4.24.2-cp38-cp38-win32.whl", hash = "sha256:6bd26c1fa9038b26c5c044ee77e0ecb18463e957fefbaeb81a3feb419313a54e"},
{file = "protobuf-4.24.2-cp38-cp38-win_amd64.whl", hash = "sha256:bb7aa97c252279da65584af0456f802bd4b2de429eb945bbc9b3d61a42a8cd16"},
{file = "protobuf-4.24.2-cp39-cp39-win32.whl", hash = "sha256:2b23bd6e06445699b12f525f3e92a916f2dcf45ffba441026357dea7fa46f42b"},
{file = "protobuf-4.24.2-cp39-cp39-win_amd64.whl", hash = "sha256:839952e759fc40b5d46be319a265cf94920174d88de31657d5622b5d8d6be5cd"},
{file = "protobuf-4.24.2-py3-none-any.whl", hash = "sha256:3b7b170d3491ceed33f723bbf2d5a260f8a4e23843799a3906f16ef736ef251e"},
{file = "protobuf-4.24.2.tar.gz", hash = "sha256:7fda70797ddec31ddfa3576cbdcc3ddbb6b3078b737a1a87ab9136af0570cd6e"},
]
[[package]]
@@ -8700,6 +8702,20 @@ yarl = "*"
[package.extras]
tests = ["Werkzeug (==2.0.3)", "aiohttp", "boto3", "httplib2", "httpx", "pytest", "pytest-aiohttp", "pytest-asyncio", "pytest-cov", "pytest-httpbin", "requests (>=2.22.0)", "tornado", "urllib3"]
[[package]]
name = "vdms"
version = "0.0.20"
description = "VDMS Client Module"
optional = false
python-versions = ">=2.6, !=3.0.*, !=3.1.*, !=3.2.*, <4"
files = [
{file = "vdms-0.0.20-py3-none-any.whl", hash = "sha256:7b81127f2981f2dabdcc5880ad7eb4bc2c7833a25aaf79a7b1a560e86bf7b5ec"},
{file = "vdms-0.0.20.tar.gz", hash = "sha256:746c21a96e420b9b034495537b42d70f2326b020a1c6907677f7851a926e8605"},
]
[package.dependencies]
protobuf = "4.24.2"
[[package]]
name = "watchdog"
version = "4.0.0"
@@ -9247,9 +9263,9 @@ testing = ["big-O", "jaraco.functools", "jaraco.itertools", "more-itertools", "p
[extras]
cli = ["typer"]
extended-testing = ["aiosqlite", "aleph-alpha-client", "anthropic", "arxiv", "assemblyai", "atlassian-python-api", "azure-ai-documentintelligence", "beautifulsoup4", "bibtexparser", "cassio", "chardet", "cloudpickle", "cloudpickle", "cohere", "databricks-vectorsearch", "datasets", "dgml-utils", "elasticsearch", "esprima", "faiss-cpu", "feedparser", "fireworks-ai", "friendli-client", "geopandas", "gitpython", "google-cloud-documentai", "gql", "gradientai", "hdbcli", "hologres-vector", "html2text", "httpx", "javelin-sdk", "jinja2", "jq", "jsonschema", "lxml", "markdownify", "motor", "msal", "mwparserfromhell", "mwxml", "newspaper3k", "numexpr", "nvidia-riva-client", "oci", "openai", "openapi-pydantic", "oracle-ads", "pandas", "pdfminer-six", "pgvector", "praw", "premai", "psychicapi", "py-trello", "pymupdf", "pypdf", "pypdfium2", "pyspark", "rank-bm25", "rapidfuzz", "rapidocr-onnxruntime", "rdflib", "requests-toolbelt", "rspace_client", "scikit-learn", "sqlite-vss", "streamlit", "sympy", "telethon", "tidb-vector", "timescale-vector", "tqdm", "tree-sitter", "tree-sitter-languages", "upstash-redis", "xata", "xmltodict", "zhipuai"]
extended-testing = ["aiosqlite", "aleph-alpha-client", "anthropic", "arxiv", "assemblyai", "atlassian-python-api", "azure-ai-documentintelligence", "beautifulsoup4", "bibtexparser", "cassio", "chardet", "cloudpickle", "cloudpickle", "cohere", "databricks-vectorsearch", "datasets", "dgml-utils", "elasticsearch", "esprima", "faiss-cpu", "feedparser", "fireworks-ai", "friendli-client", "geopandas", "gitpython", "google-cloud-documentai", "gql", "gradientai", "hdbcli", "hologres-vector", "html2text", "httpx", "javelin-sdk", "jinja2", "jq", "jsonschema", "lxml", "markdownify", "motor", "msal", "mwparserfromhell", "mwxml", "newspaper3k", "numexpr", "nvidia-riva-client", "oci", "openai", "openapi-pydantic", "oracle-ads", "pandas", "pdfminer-six", "pgvector", "praw", "premai", "psychicapi", "py-trello", "pymupdf", "pypdf", "pypdfium2", "pyspark", "rank-bm25", "rapidfuzz", "rapidocr-onnxruntime", "rdflib", "requests-toolbelt", "rspace_client", "scikit-learn", "sqlite-vss", "streamlit", "sympy", "telethon", "tidb-vector", "timescale-vector", "tqdm", "tree-sitter", "tree-sitter-languages", "upstash-redis", "vdms", "xata", "xmltodict", "zhipuai"]
[metadata]
lock-version = "2.0"
python-versions = ">=3.8.1,<4.0"
content-hash = "c3f981923b0ba3a6b3ffa99e2ba23ebb0bb548f9f09f979c46e675eb8233cd81"
content-hash = "310c6e7bd72b09bf42f3fd3565c33072c11438d23cb160cb4666e44bce41a068"

@@ -98,6 +98,7 @@ nvidia-riva-client = {version = "^2.14.0", optional = true}
tidb-vector = {version = ">=0.0.3,<1.0.0", optional = true}
friendli-client = {version = "^1.2.4", optional = true}
premai = {version = "^0.3.25", optional = true}
vdms = {version = "^0.0.20", optional = true}
[tool.poetry.group.test]
optional = true
@@ -156,6 +157,7 @@ tiktoken = ">=0.3.2,<0.6.0"
anthropic = "^0.3.11"
langchain-core = { path = "../core", develop = true }
fireworks-ai = "^0.9.0"
vdms = "^0.0.20"
[tool.poetry.group.lint]
optional = true
@@ -269,7 +271,8 @@ extended_testing = [
"tidb-vector",
"cloudpickle",
"friendli-client",
"premai"
"premai",
"vdms"
]
[tool.ruff]

@@ -0,0 +1,365 @@
"""Test VDMS functionality."""
from __future__ import annotations
import logging
import os
from typing import TYPE_CHECKING
import pytest
from langchain_core.documents import Document
from langchain_community.vectorstores import VDMS
from langchain_community.vectorstores.vdms import VDMS_Client, embedding2bytes
from tests.integration_tests.vectorstores.fake_embeddings import (
ConsistentFakeEmbeddings,
FakeEmbeddings,
)
if TYPE_CHECKING:
import vdms
logging.basicConfig(level=logging.DEBUG)
# The connection string matches the default settings in the docker-compose file
# located in the root of the repository: [root]/docker/docker-compose.yml
# To spin up a detached VDMS server:
# cd [root]/docker
# docker compose up -d vdms
@pytest.fixture
def vdms_client() -> vdms.vdms:
return VDMS_Client(
host=os.getenv("VDMS_DBHOST", "localhost"),
port=int(os.getenv("VDMS_DBPORT", 6025)),
)
@pytest.mark.requires("vdms")
def test_init_from_client(vdms_client: vdms.vdms) -> None:
embedding_function = FakeEmbeddings()
_ = VDMS(
embedding_function=embedding_function,
client=vdms_client,
)
@pytest.mark.requires("vdms")
def test_from_texts_with_metadatas(vdms_client: vdms.vdms) -> None:
"""Test end to end construction and search."""
collection_name = "test_from_texts_with_metadatas"
embedding_function = FakeEmbeddings()
texts = ["foo", "bar", "baz"]
ids = [f"test_from_texts_with_metadatas_{i}" for i in range(len(texts))]
metadatas = [{"page": str(i)} for i in range(1, len(texts) + 1)]
docsearch = VDMS.from_texts(
texts=texts,
ids=ids,
embedding=embedding_function,
metadatas=metadatas,
collection_name=collection_name,
client=vdms_client,
)
output = docsearch.similarity_search("foo", k=1)
assert output == [
Document(page_content="foo", metadata={"page": "1", "id": ids[0]})
]
@pytest.mark.requires("vdms")
def test_from_texts_with_metadatas_with_scores(vdms_client: vdms.vdms) -> None:
"""Test end to end construction and scored search."""
collection_name = "test_from_texts_with_metadatas_with_scores"
embedding_function = FakeEmbeddings()
texts = ["foo", "bar", "baz"]
ids = [f"test_from_texts_with_metadatas_with_scores_{i}" for i in range(len(texts))]
metadatas = [{"page": str(i)} for i in range(1, len(texts) + 1)]
docsearch = VDMS.from_texts(
texts=texts,
ids=ids,
embedding=embedding_function,
metadatas=metadatas,
collection_name=collection_name,
client=vdms_client,
)
output = docsearch.similarity_search_with_score("foo", k=1)
assert output == [
(Document(page_content="foo", metadata={"page": "1", "id": ids[0]}), 0.0)
]
@pytest.mark.requires("vdms")
def test_from_texts_with_metadatas_with_scores_using_vector(
vdms_client: vdms.vdms,
) -> None:
"""Test end to end construction and scored search, using embedding vector."""
collection_name = "test_from_texts_with_metadatas_with_scores_using_vector"
embedding_function = FakeEmbeddings()
texts = ["foo", "bar", "baz"]
ids = [f"test_from_texts_with_metadatas_{i}" for i in range(len(texts))]
metadatas = [{"page": str(i)} for i in range(1, len(texts) + 1)]
docsearch = VDMS.from_texts(
texts=texts,
ids=ids,
embedding=embedding_function,
metadatas=metadatas,
collection_name=collection_name,
client=vdms_client,
)
output = docsearch._similarity_search_with_relevance_scores("foo", k=1)
assert output == [
(Document(page_content="foo", metadata={"page": "1", "id": ids[0]}), 0.0)
]
@pytest.mark.requires("vdms")
def test_search_filter(vdms_client: vdms.vdms) -> None:
"""Test end to end construction and search with metadata filtering."""
collection_name = "test_search_filter"
embedding_function = FakeEmbeddings()
texts = ["far", "bar", "baz"]
ids = [f"test_search_filter_{i}" for i in range(len(texts))]
metadatas = [{"first_letter": "{}".format(text[0])} for text in texts]
docsearch = VDMS.from_texts(
texts=texts,
ids=ids,
embedding=embedding_function,
metadatas=metadatas,
collection_name=collection_name,
client=vdms_client,
)
output = docsearch.similarity_search(
"far", k=1, filter={"first_letter": ["==", "f"]}
)
assert output == [
Document(page_content="far", metadata={"first_letter": "f", "id": ids[0]})
]
output = docsearch.similarity_search(
"far", k=2, filter={"first_letter": ["==", "b"]}
)
assert output == [
Document(page_content="bar", metadata={"first_letter": "b", "id": ids[1]}),
Document(page_content="baz", metadata={"first_letter": "b", "id": ids[2]}),
]
@pytest.mark.requires("vdms")
def test_search_filter_with_scores(vdms_client: vdms.vdms) -> None:
"""Test end to end construction and scored search with metadata filtering."""
collection_name = "test_search_filter_with_scores"
embedding_function = FakeEmbeddings()
texts = ["far", "bar", "baz"]
ids = [f"test_search_filter_with_scores_{i}" for i in range(len(texts))]
metadatas = [{"first_letter": "{}".format(text[0])} for text in texts]
docsearch = VDMS.from_texts(
texts=texts,
ids=ids,
embedding=embedding_function,
metadatas=metadatas,
collection_name=collection_name,
client=vdms_client,
)
output = docsearch.similarity_search_with_score(
"far", k=1, filter={"first_letter": ["==", "f"]}
)
assert output == [
(
Document(page_content="far", metadata={"first_letter": "f", "id": ids[0]}),
0.0,
)
]
output = docsearch.similarity_search_with_score(
"far", k=2, filter={"first_letter": ["==", "b"]}
)
assert output == [
(
Document(page_content="bar", metadata={"first_letter": "b", "id": ids[1]}),
1.0,
),
(
Document(page_content="baz", metadata={"first_letter": "b", "id": ids[2]}),
4.0,
),
]
@pytest.mark.requires("vdms")
def test_mmr(vdms_client: vdms.vdms) -> None:
"""Test end to end construction and search."""
collection_name = "test_mmr"
embedding_function = FakeEmbeddings()
texts = ["foo", "bar", "baz"]
ids = [f"test_mmr_{i}" for i in range(len(texts))]
docsearch = VDMS.from_texts(
texts=texts,
ids=ids,
embedding=embedding_function,
collection_name=collection_name,
client=vdms_client,
)
output = docsearch.max_marginal_relevance_search("foo", k=1)
assert output == [Document(page_content="foo", metadata={"id": ids[0]})]
@pytest.mark.requires("vdms")
def test_mmr_by_vector(vdms_client: vdms.vdms) -> None:
"""Test end to end construction and search."""
collection_name = "test_mmr_by_vector"
embedding_function = FakeEmbeddings()
texts = ["foo", "bar", "baz"]
ids = [f"test_mmr_by_vector_{i}" for i in range(len(texts))]
docsearch = VDMS.from_texts(
texts=texts,
ids=ids,
embedding=embedding_function,
collection_name=collection_name,
client=vdms_client,
)
embedded_query = embedding_function.embed_query("foo")
output = docsearch.max_marginal_relevance_search_by_vector(embedded_query, k=1)
assert output == [Document(page_content="foo", metadata={"id": ids[0]})]
@pytest.mark.requires("vdms")
def test_with_include_parameter(vdms_client: vdms.vdms) -> None:
"""Test end to end construction and include parameter."""
collection_name = "test_with_include_parameter"
embedding_function = FakeEmbeddings()
texts = ["foo", "bar", "baz"]
docsearch = VDMS.from_texts(
texts=texts,
embedding=embedding_function,
collection_name=collection_name,
client=vdms_client,
)
response, response_array = docsearch.get(collection_name, include=["embeddings"])
assert response_array != []
response, response_array = docsearch.get(collection_name)
assert response_array == []
@pytest.mark.requires("vdms")
def test_update_document(vdms_client: vdms.vdms) -> None:
"""Test the update_document function in the VDMS class."""
collection_name = "test_update_document"
# Make a consistent embedding
embedding_function = ConsistentFakeEmbeddings()
# Initial document content and id
initial_content = "foo"
document_id = "doc1"
# Create an instance of Document with initial content and metadata
original_doc = Document(page_content=initial_content, metadata={"page": "1"})
# Initialize a VDMS instance with the original document
docsearch = VDMS.from_documents(
client=vdms_client,
collection_name=collection_name,
documents=[original_doc],
embedding=embedding_function,
ids=[document_id],
)
response, old_embedding = docsearch.get(
collection_name,
constraints={"id": ["==", document_id]},
include=["metadata", "embeddings"],
)
# Define updated content for the document
updated_content = "updated foo"
# Create a new Document instance with the updated content and the same id
updated_doc = Document(page_content=updated_content, metadata={"page": "1"})
# Update the document in the VDMS instance
docsearch.update_document(
collection_name, document_id=document_id, document=updated_doc
)
# Perform a similarity search with the updated content
output = docsearch.similarity_search(updated_content, k=1)
# Assert that the updated document is returned by the search
assert output == [
Document(
page_content=updated_content, metadata={"page": "1", "id": document_id}
)
]
# Assert that the new embedding is correct
response, new_embedding = docsearch.get(
collection_name,
constraints={"id": ["==", document_id]},
include=["metadata", "embeddings"],
)
assert new_embedding[0] == embedding2bytes(
embedding_function.embed_documents([updated_content])[0]
)
assert new_embedding != old_embedding
@pytest.mark.requires("vdms")
def test_with_relevance_score(vdms_client: vdms.vdms) -> None:
"""Test to make sure the relevance score is scaled to 0-1."""
collection_name = "test_with_relevance_score"
embedding_function = FakeEmbeddings()
texts = ["foo", "bar", "baz"]
ids = [f"test_relevance_scores_{i}" for i in range(len(texts))]
metadatas = [{"page": str(i)} for i in range(1, len(texts) + 1)]
docsearch = VDMS.from_texts(
texts=texts,
ids=ids,
embedding=embedding_function,
metadatas=metadatas,
collection_name=collection_name,
client=vdms_client,
)
output = docsearch.similarity_search_with_relevance_scores("foo", k=3)
assert output == [
(Document(page_content="foo", metadata={"page": "1", "id": ids[0]}), 0.0),
(Document(page_content="bar", metadata={"page": "2", "id": ids[1]}), 0.25),
(Document(page_content="baz", metadata={"page": "3", "id": ids[2]}), 1.0),
]
@pytest.mark.requires("vdms")
def test_add_documents_no_metadata(vdms_client: vdms.vdms) -> None:
collection_name = "test_add_documents_no_metadata"
embedding_function = FakeEmbeddings()
db = VDMS(
collection_name=collection_name,
embedding_function=embedding_function,
client=vdms_client,
)
db.add_documents([Document(page_content="foo")])
@pytest.mark.requires("vdms")
def test_add_documents_mixed_metadata(vdms_client: vdms.vdms) -> None:
collection_name = "test_add_documents_mixed_metadata"
embedding_function = FakeEmbeddings()
db = VDMS(
collection_name=collection_name,
embedding_function=embedding_function,
client=vdms_client,
)
docs = [
Document(page_content="foo"),
Document(page_content="bar", metadata={"baz": 1}),
]
ids = ["10", "11"]
actual_ids = db.add_documents(docs, ids=ids)
assert actual_ids == ids
search = db.similarity_search("foo bar", k=2)
docs[0].metadata = {"id": ids[0]}
docs[1].metadata["id"] = ids[1]
assert sorted(search, key=lambda d: d.page_content) == sorted(
docs, key=lambda d: d.page_content
)

@@ -84,6 +84,7 @@ def test_compatible_vectorstore_documentation() -> None:
"TimescaleVector",
"EcloudESVectorStore",
"Vald",
"VDMS",
"Vearch",
"VespaStore",
"Weaviate",

@@ -77,6 +77,7 @@ _EXPECTED = [
"Typesense",
"USearch",
"Vald",
"VDMS",
"Vearch",
"Vectara",
"VespaStore",
