langchain/docs/extras/modules/data_connection
Stefano Lottini 22af93d851
Vector store support for Cassandra (#6426)
This addresses #6291 adding support for using Cassandra (and compatible
databases, such as DataStax Astra DB) as a [Vector
Store](https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-30%3A+Approximate+Nearest+Neighbor(ANN)+Vector+Search+via+Storage-Attached+Indexes).

A new class `Cassandra` is introduced, which complies with the contract
and interface for a vector store, along with the corresponding
integration test, a sample notebook and modified dependency toml.

Dependencies: the implementation relies on the library `cassio`, which
simplifies interacting with Cassandra for ML- and LLM-oriented
workloads. CassIO, in turn, uses the `cassandra-driver` low-lever
drivers to communicate with the database. The former is added as
optional dependency (+ in `extended_testing`), the latter was already in
the project.

Integration testing relies on a locally-running instance of Cassandra.
[Here](https://cassio.org/more_info/#use-a-local-vector-capable-cassandra)
a detailed description can be found on how to compile and run it (at the
time of writing the feature has not made it yet to a release).

During development of the integration tests, I added a new "fake
embedding" class for what I consider a more controlled way of testing
the MMR search method. Likewise, I had to amend what looked like a
glitch in the behaviour of `ConsistentFakeEmbeddings` whereby an
`embed_query` call would have bypassed storage of the requested text in
the class cache for use in later repeated invocations.

@dev2049 might be the right person to tag here for a review. Thank you!

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
2023-06-20 10:46:20 -07:00
..
document_loaders/integrations Harrison/unstructured page number (#6464) 2023-06-19 22:31:43 -07:00
document_transformers/text_splitters Add self query retriever example with MD header splitting (#6359) 2023-06-17 21:40:20 -07:00
retrievers docs retrievers fixes (#6299) 2023-06-19 22:04:35 -07:00
text_embedding/integrations
vectorstores/integrations Vector store support for Cassandra (#6426) 2023-06-20 10:46:20 -07:00