<!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> - Adds integration for MLflow AI Gateway (this will be shipped in MLflow 2.5 this week). Manual testing: ```sh # Move to mlflow repo cd /path/to/mlflow # install langchain pip install git+https://github.com/harupy/langchain.git@gateway-integration # launch gateway service mlflow gateway start --config-path examples/gateway/openai/config.yaml # Then, run the examples in this PR ```
3.7 KiB
Databricks
The Databricks Lakehouse Platform unifies data, analytics, and AI on one platform.
Databricks embraces the LangChain ecosystem in various ways:
- Databricks connector for the SQLDatabase Chain: SQLDatabase.from_databricks() provides an easy way to query your data on Databricks through LangChain
- Databricks MLflow integrates with LangChain: Tracking and serving LangChain applications with fewer steps
- Databricks MLflow AI Gateway
- Databricks as an LLM provider: Deploy your fine-tuned LLMs on Databricks via serving endpoints or cluster driver proxy apps, and query it as langchain.llms.Databricks
- Databricks Dolly: Databricks open-sourced Dolly which allows for commercial use, and can be accessed through the Hugging Face Hub
Databricks connector for the SQLDatabase Chain
You can connect to Databricks runtimes and Databricks SQL using the SQLDatabase wrapper of LangChain. See the notebook Connect to Databricks for details.
Databricks MLflow integrates with LangChain
MLflow is an open source platform to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry. See the notebook MLflow Callback Handler for details about MLflow's integration with LangChain.
Databricks provides a fully managed and hosted version of MLflow integrated with enterprise security features, high availability, and other Databricks workspace features such as experiment and run management and notebook revision capture. MLflow on Databricks offers an integrated experience for tracking and securing machine learning model training runs and running machine learning projects. See MLflow guide for more details.
Databricks MLflow makes it more convenient to develop LangChain applications on Databricks. For MLflow tracking, you don't need to set the tracking uri. For MLflow Model Serving, you can save LangChain Chains in the MLflow langchain flavor, and then register and serve the Chain with a few clicks on Databricks, with credentials securely managed by MLflow Model Serving.
Databricks MLflow AI Gateway
See MLflow AI Gateway.
Databricks as an LLM provider
The notebook Wrap Databricks endpoints as LLMs illustrates the method to wrap Databricks endpoints as LLMs in LangChain. It supports two types of endpoints: the serving endpoint, which is recommended for both production and development, and the cluster driver proxy app, which is recommended for interactive development.
Databricks endpoints support Dolly, but are also great for hosting models like MPT-7B or any other models from the Hugging Face ecosystem. Databricks endpoints can also be used with proprietary models like OpenAI to provide a governance layer for enterprises.
Databricks Dolly
Databricks’ Dolly is an instruction-following large language model trained on the Databricks machine learning platform that is licensed for commercial use. The model is available on Hugging Face Hub as databricks/dolly-v2-12b. See the notebook Hugging Face Hub for instructions to access it through the Hugging Face Hub integration with LangChain.