docs: DeepLake example (#9663)

Updated the `Deep Lake` example. Added a link to an example provided by
Activeloop.
This commit is contained in:
Leonid Ganeline 2023-09-03 20:42:52 -07:00 committed by GitHub
parent 0b6993987f
commit 056e59672b
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

View File

@ -1,19 +1,25 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Activeloop's Deep Lake\n",
"# Activeloop Deep Lake\n",
"\n",
">[Activeloop's Deep Lake](https://docs.activeloop.ai/) as a Multi-Modal Vector Store that stores embeddings and their metadata including text, jsons, images, audio, video, and more. It saves the data locally, in your cloud, or on Activeloop storage. It performs hybrid search including embeddings and their attributes.\n",
">[Activeloop Deep Lake](https://docs.activeloop.ai/) as a Multi-Modal Vector Store that stores embeddings and their metadata including text, Jsons, images, audio, video, and more. It saves the data locally, in your cloud, or on Activeloop storage. It performs hybrid search including embeddings and their attributes.\n",
"\n",
"This notebook showcases basic functionality related to `Activeloop's Deep Lake`. While `Deep Lake` can store embeddings, it is capable of storing any type of data. It is a serverless data lake with version control, query engine and streaming dataloaders to deep learning frameworks. \n",
"This notebook showcases basic functionality related to `Activeloop Deep Lake`. While `Deep Lake` can store embeddings, it is capable of storing any type of data. It is a serverless data lake with version control, query engine and streaming dataloaders to deep learning frameworks. \n",
"\n",
"For more information, please see the Deep Lake [documentation](https://docs.activeloop.ai) or [api reference](https://docs.deeplake.ai)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setting up"
]
},
{
"cell_type": "code",
"execution_count": null,
@ -23,6 +29,22 @@
"!pip install openai 'deeplake[enterprise]' tiktoken"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example provided by Activeloop\n",
"\n",
"[Integration with LangChain](https://docs.activeloop.ai/tutorials/vector-store/deep-lake-vector-store-in-langchain).\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Deep Lake locally"
]
},
{
"cell_type": "code",
"execution_count": 2,
@ -65,13 +87,35 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create a local dataset\n",
"\n",
"Create a dataset locally at `./deeplake/`, then run similarity search. The Deeplake+LangChain integration uses Deep Lake datasets under the hood, so `dataset` and `vector store` are used interchangeably. To create a dataset in your own cloud, or in the Deep Lake storage, [adjust the path accordingly](https://docs.activeloop.ai/storage-and-credentials/storage-options)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"db = DeepLake(\n",
" dataset_path=\"./my_deeplake/\", embedding=embeddings, overwrite=True\n",
")\n",
"db.add_documents(docs)\n",
"# or shorter\n",
"# db = DeepLake.from_documents(docs, dataset_path=\"./my_deeplake/\", embedding=embeddings, overwrite=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Query dataset"
]
},
{
"cell_type": "code",
"execution_count": 5,
@ -103,18 +147,11 @@
}
],
"source": [
"db = DeepLake(\n",
" dataset_path=\"./my_deeplake/\", embedding=embeddings, overwrite=True\n",
")\n",
"db.add_documents(docs)\n",
"# or shorter\n",
"# db = DeepLake.from_documents(docs, dataset_path=\"./my_deeplake/\", embedding=embeddings, overwrite=True)\n",
"query = \"What did the president say about Ketanji Brown Jackson\"\n",
"docs = db.similarity_search(query)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -145,7 +182,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -173,7 +209,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -181,7 +216,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -235,7 +269,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -243,7 +276,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -324,7 +356,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -358,7 +389,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -392,7 +422,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -415,7 +444,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -438,7 +466,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -530,7 +557,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -538,7 +564,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -604,7 +629,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -612,7 +636,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -685,7 +708,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -752,7 +774,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -796,7 +817,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -953,7 +973,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.9.6 ('langchain_venv': venv)",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@ -967,7 +987,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
"version": "3.10.12"
},
"vscode": {
"interpreter": {