From b93638ef1ef683dfbb46e8e7654e96325324a98c Mon Sep 17 00:00:00 2001
From: Liang Zhang
Date: Wed, 7 Jun 2023 20:45:47 -0700
Subject: [PATCH] Refactor and update databricks integration page (#5575)

# Your PR Title (What it does)

Fixes # (issue)

## Before submitting

## Who can review?

Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:

---
 docs/integrations/databricks.md               | 36 +++++++++++++++++++
 .../{ => databricks}/databricks.ipynb         |  0
 .../models/llms/integrations/databricks.ipynb |  5 ++-
 .../llms/integrations/huggingface_hub.ipynb   | 12 +++++--
 4 files changed, 50 insertions(+), 3 deletions(-)
 create mode 100644 docs/integrations/databricks.md
 rename docs/integrations/{ => databricks}/databricks.ipynb (100%)

diff --git a/docs/integrations/databricks.md b/docs/integrations/databricks.md
new file mode 100644
index 00000000..0a81ce6a
--- /dev/null
+++ b/docs/integrations/databricks.md
@@ -0,0 +1,36 @@
+Databricks
+==========
+
+The [Databricks](https://www.databricks.com/) Lakehouse Platform unifies data, analytics, and AI on one platform.
+
+Databricks embraces the LangChain ecosystem in various ways:
+
+1. Databricks connector for the SQLDatabase Chain: SQLDatabase.from_databricks() provides an easy way to query your data on Databricks through LangChain
+2. Databricks-managed MLflow integrates with LangChain: Tracking and serving LangChain applications with fewer steps
+3. Databricks as an LLM provider: Deploy your fine-tuned LLMs on Databricks via serving endpoints or cluster driver proxy apps, and query it as langchain.llms.Databricks
+4. Databricks Dolly: Databricks open-sourced Dolly which allows for commercial use, and can be accessed through the HuggingFace Hub
+
+Databricks connector for the SQLDatabase Chain
+----------------------------------------------
+You can connect to [Databricks runtimes](https://docs.databricks.com/runtime/index.html) and [Databricks SQL](https://www.databricks.com/product/databricks-sql) using the SQLDatabase wrapper of LangChain. See the notebook [Connect to Databricks](./databricks/databricks.html) for details.
+
+Databricks-managed MLflow integrates with LangChain
+---------------------------------------------------
+
+MLflow is an open source platform to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry. See the notebook [MLflow Callback Handler](./mlflow_tracking.ipynb) for details about MLflow's integration with LangChain.
+
+Databricks provides a fully managed and hosted version of MLflow integrated with enterprise security features, high availability, and other Databricks workspace features such as experiment and run management and notebook revision capture. MLflow on Databricks offers an integrated experience for tracking and securing machine learning model training runs and running machine learning projects. See [MLflow guide](https://docs.databricks.com/mlflow/index.html) for more details.
+
+Databricks-managed MLflow makes it more convenient to develop LangChain applications on Databricks. For MLflow tracking, you don't need to set the tracking uri. For MLflow Model Serving, you can save LangChain Chains in the MLflow langchain flavor, and then register and serve the Chain with a few clicks on Databricks, with credentials securely managed by MLflow Model Serving.
+
+Databricks as an LLM provider
+-----------------------------
+
+The notebook [Wrap Databricks endpoints as LLMs](../modules/models/llms/integrations/databricks.html) illustrates the method to wrap Databricks endpoints as LLMs in LangChain. It supports two types of endpoints: the serving endpoint, which is recommended for both production and development, and the cluster driver proxy app, which is recommended for interactive development.
+
+Databricks endpoints support Dolly, but are also great for hosting models like MPT-7B or any other models from the HuggingFace ecosystem. Databricks endpoints can also be used with proprietary models like OpenAI to provide a governance layer for enterprises.
+
+Databricks Dolly
+----------------
+
+Databricks’ Dolly is an instruction-following large language model trained on the Databricks machine learning platform that is licensed for commercial use. The model is available on Hugging Face Hub as databricks/dolly-v2-12b. See the notebook [HuggingFace Hub](../modules/models/llms/integrations/huggingface_hub.html) for instructions to access it through the HuggingFace Hub integration with LangChain.
diff --git a/docs/integrations/databricks.ipynb b/docs/integrations/databricks/databricks.ipynb
similarity index 100%
rename from docs/integrations/databricks.ipynb
rename to docs/integrations/databricks/databricks.ipynb
diff --git a/docs/modules/models/llms/integrations/databricks.ipynb b/docs/modules/models/llms/integrations/databricks.ipynb
index cc0c1a96..c74feb7f 100644
--- a/docs/modules/models/llms/integrations/databricks.ipynb
+++ b/docs/modules/models/llms/integrations/databricks.ipynb
@@ -1,6 +1,7 @@
 {
  "cells": [
   {
+   "attachments": {},
    "cell_type": "markdown",
    "metadata": {
     "application/vnd.databricks.v1+cell": {
@@ -43,6 +44,7 @@
    ]
   },
   {
+   "attachments": {},
    "cell_type": "markdown",
    "metadata": {
     "application/vnd.databricks.v1+cell": {
@@ -257,6 +259,7 @@
    ]
   },
   {
+   "attachments": {},
    "cell_type": "markdown",
    "metadata": {
     "application/vnd.databricks.v1+cell": {
@@ -273,7 +276,7 @@
     "Prerequisites:\n",
     "* An LLM loaded on a Databricks interactive cluster in \"single user\" or \"no isolation shared\" mode.\n",
     "* A local HTTP server running on the driver node to serve the model at `\"/\"` using HTTP POST with JSON input/output.\n",
-    "* It uses a port number between `[3000, 8000]` and litens to the driver IP address or simply `0.0.0.0` instead of localhost only.\n",
+    "* It uses a port number between `[3000, 8000]` and listens to the driver IP address or simply `0.0.0.0` instead of localhost only.\n",
     "* You have \"Can Attach To\" permission to the cluster.\n",
     "\n",
     "The expected server schema (using JSON schema) is:\n",
diff --git a/docs/modules/models/llms/integrations/huggingface_hub.ipynb b/docs/modules/models/llms/integrations/huggingface_hub.ipynb
index 6ed04670..b0e69eae 100644
--- a/docs/modules/models/llms/integrations/huggingface_hub.ipynb
+++ b/docs/modules/models/llms/integrations/huggingface_hub.ipynb
@@ -1,6 +1,7 @@
 {
  "cells": [
   {
+   "attachments": {},
    "cell_type": "markdown",
    "id": "959300d4",
    "metadata": {},
@@ -13,6 +14,7 @@
    ]
   },
   {
+   "attachments": {},
    "cell_type": "markdown",
    "id": "4c1b8450-5eaf-4d34-8341-2d785448a1ff",
    "metadata": {
@@ -60,6 +62,7 @@
    ]
   },
   {
+   "attachments": {},
    "cell_type": "markdown",
    "id": "84dd44c1-c428-41f3-a911-520281386c94",
    "metadata": {},
@@ -104,6 +107,7 @@
    ]
   },
   {
+   "attachments": {},
    "cell_type": "markdown",
    "id": "ddaa06cf-95ec-48ce-b0ab-d892a7909693",
    "metadata": {},
@@ -114,6 +118,7 @@
    ]
   },
   {
+   "attachments": {},
    "cell_type": "markdown",
    "id": "4fa9337e-ccb5-4c52-9b7c-1653148bc256",
    "metadata": {},
@@ -158,13 +163,14 @@
    ]
   },
   {
+   "attachments": {},
    "cell_type": "markdown",
    "id": "1a5c97af-89bc-4e59-95c1-223742a9160b",
    "metadata": {},
    "source": [
-    "### Dolly, by DataBricks\n",
+    "### Dolly, by Databricks\n",
     "\n",
-    "See [DataBricks](https://huggingface.co/databricks) organization page for a list of available models."
+    "See [Databricks](https://huggingface.co/databricks) organization page for a list of available models."
    ]
   },
   {
@@ -196,6 +202,7 @@
    ]
   },
   {
+   "attachments": {},
    "cell_type": "markdown",
    "id": "03f6ae52-b5f9-4de6-832c-551cb3fa11ae",
    "metadata": {},
@@ -233,6 +240,7 @@
    ]
   },
   {
+   "attachments": {},
    "cell_type": "markdown",
    "id": "2bf838eb-1083-402f-b099-b07c452418c8",
    "metadata": {},