Davit Buniatyan b4914888a7

Deep Lake upgrade to include attribute search, distance metrics, returning scores and MMR (#2455)

### Features include

- Metadata based embedding search
- Choice of distance metric function (`L2` for Euclidean, `L1` for
Manhattan, `max` for L-infinity distance, `cos` for cosine similarity, `dot`
for dot product); defaults to `L2`
- Returning scores
- Max Marginal Relevance Search
- Deleting samples from the dataset
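
To make the metric names and Max Marginal Relevance (MMR) concrete, here is a minimal pure-Python sketch. It is not Deep Lake's implementation: the helper names (`search`, `mmr`, `METRICS`), the `lambda_mult` parameter, and the choice to rank `cos` and `dot` by highest score first are assumptions made for illustration. `L1` is treated here as the sum of absolute differences.

```python
import math

def l2(a, b):
    """Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def l1(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def linf(a, b):
    """L-infinity distance: largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))

def dot(a, b):
    """Dot product."""
    return sum(x * y for x, y in zip(a, b))

def cos(a, b):
    """Cosine similarity (assumes neither vector is all zeros)."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Hypothetical registry: metric name -> (function, True if a larger score means "closer").
METRICS = {
    "L2": (l2, False),
    "L1": (l1, False),
    "max": (linf, False),
    "cos": (cos, True),
    "dot": (dot, True),
}

def search(query, vectors, k=2, distance_metric="L2"):
    """Return the k best (index, score) pairs under the chosen metric."""
    fn, higher_is_closer = METRICS[distance_metric]
    scored = [(i, fn(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=higher_is_closer)
    return scored[:k]

def mmr(query, vectors, k=2, lambda_mult=0.5):
    """Greedy MMR: trade relevance to the query against similarity to earlier picks."""
    selected = []
    candidates = list(range(len(vectors)))
    while candidates and len(selected) < k:
        def mmr_score(i):
            relevance = cos(query, vectors[i])
            redundancy = max(
                (cos(vectors[i], vectors[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected

vectors = [[1.0, 0.0], [1.0, 0.1], [1.0, 1.0]]
query = [1.0, 0.0]
print(search(query, vectors, k=2, distance_metric="L2"))
print(mmr(query, vectors, k=2, lambda_mult=0.3))  # prints [0, 2]
```

With a low `lambda_mult` the second pick skips the near-duplicate at index 1 in favor of the more diverse vector at index 2. With `lambda_mult=1.0` the selection reduces to plain cosine ranking.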

### Notes
- Added numerous tests; let me know if you would like me to shorten them
or make them smarter

---------

Co-authored-by: Davit Buniatyan <d@activeloop.ai>

2023-04-06 12:47:33 -07:00


Deep Lake

This page covers how to use the Deep Lake ecosystem within LangChain.

Why Deep Lake?

More than just a (multi-modal) vector store: you can later use the dataset to fine-tune your own LLMs.
Stores not only embeddings but also the original data, with automatic version control.
Truly serverless: it doesn't require another service and works with major cloud providers (AWS S3, GCS, etc.).

More Resources

Ultimate Guide to LangChain & Deep Lake: Build ChatGPT to Answer Questions on Your Financial Data
Here are the whitepaper and academic paper for Deep Lake
Additional resources available for review: Deep Lake, Getting Started, and Tutorials

Installation and Setup

Install the Python package with pip install deeplake

Wrappers

VectorStore

LangChain provides a wrapper around Deep Lake, a data lake for deep learning applications, that lets you use it as a vector store (for now), whether for semantic search or example selection.

To import this vectorstore:

from langchain.vectorstores import DeepLake

For a more detailed walkthrough of the Deep Lake wrapper, see this notebook
