Rename Databerry to Chaindesk (#7022)

--------- Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-11-08 07:10:35 +00:00 · 2023-07-07 23:28:04 +02:00 · 2023-07-07 23:28:04 +02:00 · ec033ae277
commit ec033ae277
parent da5b0723d2
7 changed files with 125 additions and 23 deletions
--- a/docs/api_reference/api_reference.rst
+++ b/docs/api_reference/api_reference.rst
@ -1316,7 +1316,7 @@ Classes
    retrievers.azure_cognitive_search.AzureCognitiveSearchRetriever
    retrievers.chatgpt_plugin_retriever.ChatGPTPluginRetriever
    retrievers.contextual_compression.ContextualCompressionRetriever
-    retrievers.databerry.DataberryRetriever
+    retrievers.chaindesk.ChaindeskRetriever
    retrievers.docarray.DocArrayRetriever
    retrievers.docarray.SearchType
    retrievers.document_compressors.base.BaseDocumentCompressor
--- a/docs/docs_skeleton/docs/_static/ChaindeskDashboard.png
+++ b/docs/docs_skeleton/docs/_static/ChaindeskDashboard.png
--- a/docs/docs_skeleton/vercel.json
+++ b/docs/docs_skeleton/vercel.json
@ -137,8 +137,8 @@
      "destination": "/docs/ecosystem/integrations/ctransformers"
    },
    {
-      "source": "/en/latest/integrations/databerry.html",
+      "source": "/en/latest/integrations/chaindesk.html",
-      "destination": "/docs/ecosystem/integrations/databerry"
+      "destination": "/docs/ecosystem/integrations/chaindesk"
    },
    {
      "source": "/en/latest/integrations/databricks/databricks.html",
@ -1329,8 +1329,8 @@
      "destination": "/docs/modules/data_connection/retrievers/integrations/cohere-reranker"
    },
    {
-      "source": "/en/latest/modules/indexes/retrievers/examples/databerry.html",
+      "source": "/en/latest/modules/indexes/retrievers/examples/chaindesk.html",
-      "destination": "/docs/modules/data_connection/retrievers/integrations/databerry"
+      "destination": "/docs/modules/data_connection/retrievers/integrations/chaindesk"
    },
    {
      "source": "/en/latest/modules/indexes/retrievers/examples/elastic_search_bm25.html",
--- a/docs/extras/ecosystem/integrations/chaindesk.mdx
+++ b/docs/extras/ecosystem/integrations/chaindesk.mdx
@ -1,17 +1,17 @@
-# Databerry
+# Chaindesk
->[Databerry](https://databerry.ai) is an [open source](https://github.com/gmpetrov/databerry) document retrieval platform that helps to connect your personal data with Large Language Models.
+>[Chaindesk](https://chaindesk.ai) is an [open source](https://github.com/gmpetrov/databerry) document retrieval platform that helps to connect your personal data with Large Language Models.
 ## Installation and Setup
-We need to sign up for Databerry, create a datastore, add some data and get your datastore api endpoint url. 
+We need to sign up for Chaindesk, create a datastore, add some data and get your datastore api endpoint url. 
-We need the [API Key](https://docs.databerry.ai/api-reference/authentication).
+We need the [API Key](https://docs.chaindesk.ai/api-reference/authentication).
 ## Retriever
-See a [usage example](/docs/modules/data_connection/retrievers/integrations/databerry.html).
+See a [usage example](/docs/modules/data_connection/retrievers/integrations/chaindesk.html).
 ```python
-from langchain.retrievers import DataberryRetriever
+from langchain.retrievers import ChaindeskRetriever
 ```
--- a/docs/extras/modules/data_connection/retrievers/integrations/chaindesk.ipynb
+++ b/docs/extras/modules/data_connection/retrievers/integrations/chaindesk.ipynb
@ -1,21 +1,31 @@
 {
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "9fc6205b",
   "metadata": {},
   "source": [
-    "# Databerry\n",
+    "# Chaindesk\n",
    "\n",
-    ">[Databerry platform](https://docs.databerry.ai/introduction) brings data from anywhere (Datsources: Text, PDF, Word, PowerPpoint, Excel, Notion, Airtable, Google Sheets, etc..) into Datastores (container of multiple Datasources).\n",
+    ">[Chaindesk platform](https://docs.chaindesk.ai/introduction) brings data from anywhere (Datsources: Text, PDF, Word, PowerPpoint, Excel, Notion, Airtable, Google Sheets, etc..) into Datastores (container of multiple Datasources).\n",
-    "Then your Datastores can be connected to ChatGPT via Plugins or any other Large Langue Model (LLM) via the `Databerry API`.\n",
+    "Then your Datastores can be connected to ChatGPT via Plugins or any other Large Langue Model (LLM) via the `Chaindesk API`.\n",
    "\n",
-    "This notebook shows how to use [Databerry's](https://www.databerry.ai/) retriever.\n",
+    "This notebook shows how to use [Chaindesk's](https://www.chaindesk.ai/) retriever.\n",
    "\n",
-    "First, you will need to sign up for Databerry, create a datastore, add some data and get your datastore api endpoint url. You need the [API Key](https://docs.databerry.ai/api-reference/authentication)."
+    "First, you will need to sign up for Chaindesk, create a datastore, add some data and get your datastore api endpoint url. You need the [API Key](https://docs.chaindesk.ai/api-reference/authentication)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3697b9fd",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "944e172b",
   "metadata": {},
@ -34,7 +44,7 @@
   },
   "outputs": [],
   "source": [
-    "from langchain.retrievers import DataberryRetriever"
+    "from langchain.retrievers import ChaindeskRetriever"
   ]
  },
  {
@ -46,9 +56,9 @@
   },
   "outputs": [],
   "source": [
-    "retriever = DataberryRetriever(\n",
+    "retriever = ChaindeskRetriever(\n",
-    "    datastore_url=\"https://clg1xg2h80000l708dymr0fxc.databerry.ai/query\",\n",
+    "    datastore_url=\"https://clg1xg2h80000l708dymr0fxc.chaindesk.ai/query\",\n",
-    "    # api_key=\"DATABERRY_API_KEY\", # optional if datastore is public\n",
+    "    # api_key=\"CHAINDESK_API_KEY\", # optional if datastore is public\n",
    "    # top_k=10 # optional\n",
    ")"
   ]
--- a/langchain/retrievers/init.py
+++ b/langchain/retrievers/init.py
@ -1,8 +1,8 @@
 from langchain.retrievers.arxiv import ArxivRetriever
 from langchain.retrievers.azure_cognitive_search import AzureCognitiveSearchRetriever
 from langchain.retrievers.chaindesk import ChaindeskRetriever
 from langchain.retrievers.chatgpt_plugin_retriever import ChatGPTPluginRetriever
 from langchain.retrievers.contextual_compression import ContextualCompressionRetriever
 from langchain.retrievers.databerry import DataberryRetriever
 from langchain.retrievers.docarray import DocArrayRetriever
 from langchain.retrievers.elastic_search_bm25 import ElasticSearchBM25Retriever
 from langchain.retrievers.kendra import AmazonKendraRetriever
@ -36,7 +36,7 @@ __all__ = [
    "AzureCognitiveSearchRetriever",
    "ChatGPTPluginRetriever",
    "ContextualCompressionRetriever",
-    "DataberryRetriever",
+    "ChaindeskRetriever",
    "ElasticSearchBM25Retriever",
    "KNNRetriever",
    "LlamaIndexGraphRetriever",
--- a/langchain/retrievers/chaindesk.py
+++ b/langchain/retrievers/chaindesk.py
@ -0,0 +1,92 @@
 from typing import Any, List, Optional
 import aiohttp
 import requests
 from langchain.callbacks.manager import (
    AsyncCallbackManagerForRetrieverRun,
    CallbackManagerForRetrieverRun,
 )
 from langchain.schema import BaseRetriever, Document
 class ChaindeskRetriever(BaseRetriever):
    """Retriever that uses the Chaindesk API."""
    datastore_url: str
    top_k: Optional[int]
    api_key: Optional[str]
    def __init__(
        self,
        datastore_url: str,
        top_k: Optional[int] = None,
        api_key: Optional[str] = None,
    ):
        self.datastore_url = datastore_url
        self.api_key = api_key
        self.top_k = top_k
    def _get_relevant_documents(
        self,
        query: str,
        *,
        run_manager: CallbackManagerForRetrieverRun,
        **kwargs: Any,
    ) -> List[Document]:
        response = requests.post(
            self.datastore_url,
            json={
                "query": query,
                **({"topK": self.top_k} if self.top_k is not None else {}),
            },
            headers={
                "Content-Type": "application/json",
                **(
                    {"Authorization": f"Bearer {self.api_key}"}
                    if self.api_key is not None
                    else {}
                ),
            },
        )
        data = response.json()
        return [
            Document(
                page_content=r["text"],
                metadata={"source": r["source"], "score": r["score"]},
            )
            for r in data["results"]
        ]
    async def _aget_relevant_documents(
        self,
        query: str,
        *,
        run_manager: AsyncCallbackManagerForRetrieverRun,
        **kwargs: Any,
    ) -> List[Document]:
        async with aiohttp.ClientSession() as session:
            async with session.request(
                "POST",
                self.datastore_url,
                json={
                    "query": query,
                    **({"topK": self.top_k} if self.top_k is not None else {}),
                },
                headers={
                    "Content-Type": "application/json",
                    **(
                        {"Authorization": f"Bearer {self.api_key}"}
                        if self.api_key is not None
                        else {}
                    ),
                },
            ) as response:
                data = await response.json()
        return [
            Document(
                page_content=r["text"],
                metadata={"source": r["source"], "score": r["score"]},
            )
            for r in data["results"]
        ]