mirror of https://github.com/hwchase17/langchain
Adds DeepSparse as an LLM (#9184)
Adds [DeepSparse](https://github.com/neuralmagic/deepsparse) as an LLM backend. DeepSparse supports running various open-source sparsified models hosted on [SparseZoo](https://sparsezoo.neuralmagic.com/) for performance gains on CPUs. Twitter handles: @mgoin_ @neuralmagic --------- Co-authored-by: Bagatur <baskaryan@gmail.com>pull/9192/head
parent
0fa69d8988
commit
621da3c164
@ -0,0 +1,78 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "15d7ce70-8879-42a0-86d9-a3d604a3ec83",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# DeepSparse\n",
|
||||
"\n",
|
||||
"This page covers how to use the [DeepSparse](https://github.com/neuralmagic/deepsparse) inference runtime within LangChain.\n",
|
||||
"It is broken into two parts: installation and setup, and then examples of DeepSparse usage.\n",
|
||||
"\n",
|
||||
"## Installation and Setup\n",
|
||||
"\n",
|
||||
"- Install the Python package with `pip install deepsparse`\n",
|
||||
    "- Choose a [SparseZoo model](https://sparsezoo.neuralmagic.com/?useCase=text_generation) or export a supported model to ONNX [using Optimum](https://github.com/neuralmagic/notebooks/blob/main/notebooks/opt-text-generation-deepsparse-quickstart/OPT_Text_Generation_DeepSparse_Quickstart.ipynb)\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"There exists a DeepSparse LLM wrapper, that provides a unified interface for all models:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "79d24d37-737a-428c-b6c5-84c1633070d7",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.llms import DeepSparse\n",
|
||||
"\n",
|
||||
"llm = DeepSparse(model='zoo:nlg/text_generation/codegen_mono-350m/pytorch/huggingface/bigpython_bigquery_thepile/base-none')\n",
|
||||
"\n",
|
||||
"print(llm('def fib():'))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "ea7ea674-d6b0-49d9-9c2b-014032973be6",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Additional parameters can be passed using the `config` parameter:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "ff61b845-41e6-4457-8625-6e21a11bfe7c",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"config = {'max_generated_tokens': 256}\n",
|
||||
"\n",
|
||||
"llm = DeepSparse(model='zoo:nlg/text_generation/codegen_mono-350m/pytorch/huggingface/bigpython_bigquery_thepile/base-none', config=config)"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.1"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
@ -0,0 +1,35 @@
|
||||
# DeepSparse
|
||||
|
||||
This page covers how to use the [DeepSparse](https://github.com/neuralmagic/deepsparse) inference runtime within LangChain.
|
||||
It is broken into two parts: installation and setup, and then examples of DeepSparse usage.
|
||||
|
||||
## Installation and Setup
|
||||
|
||||
- Install the Python package with `pip install deepsparse`
|
||||
- Choose a [SparseZoo model](https://sparsezoo.neuralmagic.com/?useCase=text_generation) or export a supported model to ONNX [using Optimum](https://github.com/neuralmagic/notebooks/blob/main/notebooks/opt-text-generation-deepsparse-quickstart/OPT_Text_Generation_DeepSparse_Quickstart.ipynb)
|
||||
|
||||
## Wrappers
|
||||
|
||||
### LLM
|
||||
|
||||
There exists a DeepSparse LLM wrapper, which you can access with:
|
||||
|
||||
```python
|
||||
from langchain.llms import DeepSparse
|
||||
```
|
||||
|
||||
It provides a unified interface for all models:
|
||||
|
||||
```python
|
||||
llm = DeepSparse(model='zoo:nlg/text_generation/codegen_mono-350m/pytorch/huggingface/bigpython_bigquery_thepile/base-none')
|
||||
|
||||
print(llm('def fib():'))
|
||||
```
|
||||
|
||||
Additional parameters can be passed using the `config` parameter:
|
||||
|
||||
```python
|
||||
config = {'max_generated_tokens': 256}
|
||||
|
||||
llm = DeepSparse(model='zoo:nlg/text_generation/codegen_mono-350m/pytorch/huggingface/bigpython_bigquery_thepile/base-none', config=config)
|
||||
```
|
@ -0,0 +1,87 @@
|
||||
# flake8: noqa
|
||||
from typing import Any, Dict, Optional, List
|
||||
|
||||
from pydantic import root_validator
|
||||
|
||||
from langchain.callbacks.manager import CallbackManagerForLLMRun
|
||||
from langchain.llms.base import LLM
|
||||
from langchain.llms.utils import enforce_stop_tokens
|
||||
|
||||
|
||||
class DeepSparse(LLM):
    """Neural Magic DeepSparse LLM interface.

    To use, you should have the ``deepsparse`` or ``deepsparse-nightly``
    python package installed. See https://github.com/neuralmagic/deepsparse

    This interface lets you deploy optimized LLMs straight from the
    [SparseZoo](https://sparsezoo.neuralmagic.com/?useCase=text_generation)

    Example:
        .. code-block:: python

            from langchain.llms import DeepSparse
            llm = DeepSparse(model="zoo:nlg/text_generation/codegen_mono-350m/pytorch/huggingface/bigpython_bigquery_thepile/base-none")
    """  # noqa: E501

    # Populated by ``validate_environment``; never set directly by callers.
    pipeline: Any  #: :meta private:

    model: str
    """The path to a model file or directory or the name of a SparseZoo model stub."""

    config: Optional[Dict[str, Any]] = None
    """Key word arguments passed to the pipeline."""

    @property
    def _identifying_params(self) -> Dict[str, Any]:
        """Get the identifying parameters."""
        return {"model": self.model, "config": self.config}

    @property
    def _llm_type(self) -> str:
        """Return type of llm."""
        return "deepsparse"

    @root_validator()
    def validate_environment(cls, values: Dict) -> Dict:
        """Validate that ``deepsparse`` package is installed."""
        # Import lazily so the class can be declared without deepsparse present.
        try:
            from deepsparse import Pipeline
        except ImportError:
            raise ImportError(
                "Could not import `deepsparse` package. "
                "Please install it with `pip install deepsparse`"
            )

        # ``config`` may be None; forward it as extra pipeline keyword args.
        pipeline_kwargs = values["config"] or {}
        values["pipeline"] = Pipeline.create(
            task="text_generation",
            model_path=values["model"],
            **pipeline_kwargs,
        )
        return values

    def _call(
        self,
        prompt: str,
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> str:
        """Generate text from a prompt.

        Args:
            prompt: The prompt to generate text from.
            stop: A list of strings to stop generation when encountered.

        Returns:
            The generated text.

        Example:
            .. code-block:: python

                response = llm("Tell me a joke.")
        """
        # The pipeline returns a batch of sequences; take the first (only) one.
        generated = self.pipeline(sequences=prompt).sequences[0]
        if stop is None:
            return "".join(generated)
        # Truncate the output at the first occurrence of any stop token.
        return "".join(enforce_stop_tokens(generated, stop))
|
@ -0,0 +1,17 @@
|
||||
"""Test DeepSparse wrapper."""
|
||||
from langchain.llms import DeepSparse
|
||||
|
||||
|
||||
def test_deepsparse_call() -> None:
    """Test valid call to DeepSparse."""
    # Keep generation short and deterministic for the assertion below.
    generation_config = {"max_generated_tokens": 5, "use_deepsparse_cache": False}
    llm = DeepSparse(
        model="zoo:nlg/text_generation/codegen_mono-350m/pytorch/huggingface/bigpython_bigquery_thepile/base-none",
        config=generation_config,
    )

    output = llm("def ")

    assert isinstance(output, str)
    assert len(output) > 1
    assert output == "ids_to_names"
|
Loading…
Reference in New Issue