langchain/libs/partners/mongodb
Ash Vardanian d01bad5169
core[patch]: Convert SimSIMD back to NumPy (#19473)
This patch fixes issue #18022 by converting SimSIMD's internal
zero-copy outputs to NumPy arrays.

I've also noticed that a `dtype=np.float32` conversion is often applied
before passing data to SimSIMD. Which numeric types do LangChain users
generally care about? We support `float64`, `float32`, `float16`, and
`int8` for cosine distances, and `float16` seems reasonable for
practically any kind of embeddings and any modern piece of hardware, so
we can change that part as well 🤗
2024-03-25 16:36:26 -07:00
langchain_mongodb core[patch]: Convert SimSIMD back to NumPy (#19473) 2024-03-25 16:36:26 -07:00
scripts
tests mongodb[patch]: Added scoring threshold to caching (#19286) 2024-03-19 11:30:02 -07:00
.gitignore
LICENSE
Makefile
poetry.lock mongodb[patch]: fix core dep (#18926) 2024-03-11 10:27:29 -07:00
pyproject.toml mongodb[patch]: Added scoring threshold to caching (#19286) 2024-03-19 11:30:02 -07:00
README.md

langchain-mongodb

Installation

pip install -U langchain-mongodb

Usage

Using MongoDBAtlasVectorSearch

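import os

from langchain_mongodb import MongoDBAtlasVectorSearch
# Assumption: OpenAIEmbeddings comes from the separate langchain-openai package
from langchain_openai import OpenAIEmbeddings
from pymongo import MongoClient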

# Pull MongoDB Atlas URI from environment variables
MONGODB_ATLAS_CLUSTER_URI = os.environ.get("MONGODB_ATLAS_CLUSTER_URI")

DB_NAME = "langchain_db"
COLLECTION_NAME = "test"
ATLAS_VECTOR_SEARCH_INDEX_NAME = "index_name"

# Create the vector search via `from_connection_string`
vector_search = MongoDBAtlasVectorSearch.from_connection_string(
    MONGODB_ATLAS_CLUSTER_URI,
    DB_NAME + "." + COLLECTION_NAME,
    OpenAIEmbeddings(disallowed_special=()),
    index_name=ATLAS_VECTOR_SEARCH_INDEX_NAME,
)

# Initialize the MongoDB Python client and select the collection
client = MongoClient(MONGODB_ATLAS_CLUSTER_URI)
MONGODB_COLLECTION = client[DB_NAME][COLLECTION_NAME]
# Create the vector search via instantiation
vector_search_2 = MongoDBAtlasVectorSearch(
    collection=MONGODB_COLLECTION,
    embedding=OpenAIEmbeddings(disallowed_special=()),
    index_name=ATLAS_VECTOR_SEARCH_INDEX_NAME,
)
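
Querying the store

Once a store has been created, querying it could look like the minimal sketch below. It assumes the standard LangChain `VectorStore.similarity_search(query, k=...)` method and that an Atlas Vector Search index with the configured name already exists on the collection. `top_k_texts` is a hypothetical helper name used only for illustration; it is not part of `langchain-mongodb`.

```python
from typing import Any, List


def top_k_texts(store: Any, query: str, k: int = 3) -> List[str]:
    """Return the text of the k documents most similar to `query`.

    `store` is assumed to be any LangChain vector store, such as a
    MongoDBAtlasVectorSearch instance. This helper is hypothetical.
    """
    # similarity_search embeds the query and returns Document objects
    docs = store.similarity_search(query, k=k)
    # Each Document exposes its raw text as `page_content`
    return [doc.page_content for doc in docs]
```

A call such as `top_k_texts(vector_search, "What is MongoDB Atlas?", k=2)` would need a reachable Atlas cluster and valid embedding credentials, so treat it as a starting point, not a tested recipe.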