[hotfix] Deep Lake fails on newer versions due to a hardcoded version check (#6383)

Hot Fixes for Deep Lake [would highly appreciate an expedited review]

* the deeplake version was hardcoded, and since deeplake was upgraded the
integration fails with a confusing error (see the version-gate sketch after
this list)
* fixed an additional integration test related to the embedding function
* additionally fixed the code understanding docs links after the docs
upgrade
* removed the `public` parameter from the notebook so that the code
understanding notebook keeps working
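
For quick review context, a minimal, self-contained sketch of the relaxed version gate (the `version_compare` import, its usage, and the `3.6.2` floor are taken from the vectorstore diff below; the wrapper function name is hypothetical):

```python
import deeplake
from deeplake.core.fast_forwarding import version_compare


def _check_deeplake_version(minimum: str = "3.6.2") -> None:
    # version_compare(a, b) returns -1 when `a` is older than `b` (as used in
    # the diff), so this raises only for installs older than the minimum
    # instead of requiring one exact pinned release.
    if version_compare(deeplake.__version__, minimum) == -1:
        raise ValueError(
            f"deeplake>={minimum} is required, but {deeplake.__version__} is"
            " installed. Consider `pip install --upgrade deeplake`."
        )


_check_deeplake_version()
```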

#### Who can review?
  @hwchase17  @dev2049

---------

Co-authored-by: Davit Buniatyan <d@activeloop.ai>
Davit Buniatyan 2023-06-18 17:33:49 -07:00 committed by GitHub
parent 6aa7b04f79
commit 1ab9dc8293
3 changed files with 37 additions and 18 deletions

View File

@ -22,5 +22,5 @@ Query Understanding: GPT-4 processes user queries, grasping the context and extr
5. Ask questions: Define a list of questions to ask about the codebase, then use the ConversationalRetrievalChain to answer them; the LLM (GPT-4) generates comprehensive, context-aware answers from the retrieved code snippets and the conversation history.
The full tutorial is available below.
- [Twitter the-algorithm codebase analysis with Deep Lake](code/twitter-the-algorithm-analysis-deeplake.html): A notebook walking through how to parse GitHub source code and run conversational queries over it.
- [LangChain codebase analysis with Deep Lake](code/code-analysis-deeplake.html): A notebook walking through how to analyze and do question answering over THIS code base.
- [Twitter the-algorithm codebase analysis with Deep Lake](./twitter-the-algorithm-analysis-deeplake.html): A notebook walking through how to parse GitHub source code and run conversational queries over it.
- [LangChain codebase analysis with Deep Lake](./code-analysis-deeplake.html): A notebook walking through how to analyze and do question answering over THIS code base.
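
To make step 5 above concrete, a minimal sketch of that question-answering loop, assuming `db` is the populated Deep Lake store from the notebook (the model name, `k` value, and sample question are illustrative only):

```python
from langchain.chains import ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI

# Turn the Deep Lake vector store into a retriever over the indexed code.
retriever = db.as_retriever()
retriever.search_kwargs["k"] = 10  # number of code snippets to retrieve

# GPT-4 answers each question from the retrieved snippets plus chat history.
model = ChatOpenAI(model_name="gpt-4")
qa = ConversationalRetrievalChain.from_llm(model, retriever=retriever)

chat_history = []
questions = ["What does favCountParams do?"]
for question in questions:
    result = qa({"question": question, "chat_history": chat_history})
    chat_history.append((question, result["answer"]))
    print(f"-> {question}\n{result['answer']}\n")
```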

View File

@ -1,6 +1,7 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -29,7 +30,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
@ -45,7 +46,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
@ -61,6 +62,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -78,6 +80,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -86,7 +89,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
@ -105,6 +108,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -124,6 +128,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -140,12 +145,12 @@
"db = DeepLake(\n",
" dataset_path=f\"hub://{username}/twitter-algorithm\",\n",
" embedding_function=embeddings,\n",
" public=True,\n",
") # dataset would be publicly available\n",
")\n",
"db.add_documents(texts)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -155,9 +160,17 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 13,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Deep Lake Dataset in hub://davitbun/twitter-algorithm already exists, loading from the storage\n"
]
}
],
"source": [
"db = DeepLake(\n",
" dataset_path=\"hub://davitbun/twitter-algorithm\",\n",
@ -168,7 +181,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
@ -180,6 +193,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@ -188,7 +202,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
@ -208,7 +222,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
@ -391,6 +405,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": []
@ -412,7 +427,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.0"
"version": "3.9.7"
}
},
"nbformat": 4,

View File

@ -8,6 +8,7 @@ import numpy as np
try:
import deeplake
from deeplake.core.fast_forwarding import version_compare
from deeplake.core.vectorstore import DeepLakeVectorStore
_DEEPLAKE_INSTALLED = True
@ -124,11 +125,11 @@ class DeepLake(VectorStore):
"Please install it with `pip install deeplake`."
)
version = deeplake.__version__
if version != "3.6.2":
if version_compare(deeplake.__version__, "3.6.2") == -1:
raise ValueError(
"deeplake version should be = 3.6.3, but you've installed"
f" {version}. Consider changing deeplake version to 3.6.3 ."
"deeplake version should be >= 3.6.3, but you've installed"
f" {deeplake.__version__}. Consider upgrading deeplake version \
pip install --upgrade deeplake."
)
self.dataset_path = dataset_path
@ -303,6 +304,9 @@ class DeepLake(VectorStore):
)
if embedding_function:
if isinstance(embedding_function, Embeddings):
_embedding_function = embedding_function.embed_query
else:
_embedding_function = embedding_function
elif self._embedding_function:
_embedding_function = self._embedding_function.embed_query
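
The final hunk normalizes whatever is passed as `embedding_function`: a LangChain `Embeddings` object is reduced to its `embed_query` method, a bare callable is used as-is, and otherwise the store's own embedding function is the fallback. A standalone sketch of that pattern (the helper name and signature are hypothetical):

```python
from typing import Callable, List, Optional, Union

from langchain.embeddings.base import Embeddings

EmbedFn = Callable[[str], List[float]]


def resolve_embedding_function(
    embedding_function: Optional[Union[Embeddings, EmbedFn]],
    fallback: Optional[Embeddings] = None,
) -> Optional[EmbedFn]:
    """Reduce an Embeddings object or raw callable to a query-embedding callable."""
    if embedding_function:
        if isinstance(embedding_function, Embeddings):
            # LangChain Embeddings expose embed_query(text) -> List[float]
            return embedding_function.embed_query
        return embedding_function
    if fallback:
        return fallback.embed_query
    return None
```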