From 40079d4936a49b567d00fe3f0b35f9a6a0917547 Mon Sep 17 00:00:00 2001
From: David vonThenen <12752197+dvonthenen@users.noreply.github.com>
Date: Mon, 7 Aug 2023 13:15:26 -0700
Subject: [PATCH] Introduce Nebula LLM to LangChain (#8876)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

## Description

This PR adds Nebula to the available LLMs in LangChain. Nebula is an LLM focused on conversation understanding, and it enables users to extract conversation insights from video, audio, text, and chat-based conversations. These conversations can occur between any mix of human and AI participants.

Examples of questions you could ask Nebula about a given conversation:

- What could be the customer’s pain points based on the conversation?
- What sales opportunities can be identified from this conversation?
- What best practices can be derived from this conversation for future customer interactions?

You can read more about Nebula here: https://symbl.ai/blog/extract-insights-symbl-ai-generative-ai-recall-ai-meetings/
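
For illustration, here is a minimal usage sketch of the new wrapper. This is a sketch, not the shipped docs: it assumes the `SYMBLAI_NEBULA_SERVICE_*` environment variables are set, and `transcript` is placeholder text. See the notebook below for the full example.

```python
from langchain.llms import Nebula

# Placeholder transcript; in practice, pass the conversation you want to analyze.
transcript = "Speaker 1: Thank you for calling ABC company, how can I help you today?"

llm = Nebula(conversation=transcript)

# Ask one of the insight questions above directly against the conversation.
print(llm("What could be the customer's pain points based on the conversation?"))
```
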
#### Integration Test

An integration test is added, but it requires network access. Since Nebula is fully managed like OpenAI, network access is required to exercise the integration test.

#### Linting

- [x] make lint
- [x] make test (TODO: there seems to be a failure in another, unrelated test. Need to check on this.)
- [x] make format

### Dependencies

No new dependencies were introduced.

### Twitter handle

[@symbldotai](https://twitter.com/symbldotai)
[@dvonthenen](https://twitter.com/dvonthenen)

If you have any questions, please let me know.

cc: @hwchase17, @baskaryan

---------

Co-authored-by: dvonthenen
Co-authored-by: Bagatur
---
 .../integrations/llms/symblai_nebula.ipynb    | 106 ++++++++++
 .../integrations/providers/symblai_nebula.mdx |  20 ++
 libs/langchain/langchain/llms/__init__.py     |   3 +
 .../langchain/llms/symblai_nebula.py          | 196 ++++++++++++++++++
 .../llms/test_symblai_nebula.py               |  56 +++++
 5 files changed, 381 insertions(+)
 create mode 100644 docs/extras/integrations/llms/symblai_nebula.ipynb
 create mode 100644 docs/extras/integrations/providers/symblai_nebula.mdx
 create mode 100644 libs/langchain/langchain/llms/symblai_nebula.py
 create mode 100644 libs/langchain/tests/integration_tests/llms/test_symblai_nebula.py

diff --git a/docs/extras/integrations/llms/symblai_nebula.ipynb b/docs/extras/integrations/llms/symblai_nebula.ipynb
new file mode 100644
index 0000000000..587dd4b7d9
--- /dev/null
+++ b/docs/extras/integrations/llms/symblai_nebula.ipynb
@@ -0,0 +1,106 @@
+{
+ "cells": [
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "id": "9597802c",
+   "metadata": {},
+   "source": [
+    "# Nebula\n",
+    "\n",
+    "[Nebula](https://symbl.ai/nebula/) is a fully managed conversation platform, on which you can build, deploy, and manage scalable AI applications.\n",
+    "\n",
+    "This example goes over how to use LangChain to interact with the [Nebula platform](https://docs.symbl.ai/docs/nebula-llm-overview).\n",
+    "\n",
+    "It will send requests to the Nebula Service endpoint, which is the concatenation of `SYMBLAI_NEBULA_SERVICE_URL` and `SYMBLAI_NEBULA_SERVICE_PATH`, using the token defined in `SYMBLAI_NEBULA_SERVICE_TOKEN` for authentication."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f15ebe0d",
+   "metadata": {},
+   "source": [
+    "### Integrate with an LLMChain"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5472a7cd-af26-48ca-ae9b-5f6ae73c74d2",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "\n",
+    "# Replace these placeholders with the values from your Nebula account.\n",
+    "os.environ[\"SYMBLAI_NEBULA_SERVICE_URL\"] = \"<SYMBLAI_NEBULA_SERVICE_URL>\"\n",
+    "os.environ[\"SYMBLAI_NEBULA_SERVICE_PATH\"] = \"<SYMBLAI_NEBULA_SERVICE_PATH>\"\n",
+    "os.environ[\"SYMBLAI_NEBULA_SERVICE_TOKEN\"] = \"<SYMBLAI_NEBULA_SERVICE_TOKEN>\""
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "6fb585dd",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "from langchain.llms import Nebula\n",
+    "\n",
+    "# The conversation transcript you want to analyze; must be non-empty.\n",
+    "conversation = \"<your conversation transcript here>\"\n",
+    "\n",
+    "llm = Nebula(\n",
+    "    conversation=conversation,\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "035dea0f",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "from langchain import PromptTemplate, LLMChain\n",
+    "\n",
+    "template = \"Identify the {count} main objectives or goals mentioned in this conversation. Summarize them concisely and emphasize the key intents.\"\n",
+    "\n",
+    "prompt = PromptTemplate(template=template, input_variables=[\"count\"])\n",
+    "\n",
+    "llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
+    "\n",
+    "generated = llm_chain.run(count=\"five\")\n",
+    "print(generated)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.8"
+  },
+  "vscode": {
+   "interpreter": {
+    "hash": "a0a0263b650d907a3bfe41c0f8d6a63a071b884df3cfdc1579f00cdc1aed6b03"
+   }
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/docs/extras/integrations/providers/symblai_nebula.mdx b/docs/extras/integrations/providers/symblai_nebula.mdx
new file mode 100644
index 0000000000..24ecd76a0e
--- /dev/null
+++ b/docs/extras/integrations/providers/symblai_nebula.mdx
@@ -0,0 +1,20 @@
+# Nebula
+
+This page covers how to use [Nebula](https://symbl.ai/nebula), [Symbl.ai](https://symbl.ai/)'s LLM, within LangChain.
+It is broken into two parts: installation and setup, and then references to specific Nebula wrappers.
+
+## Installation and Setup
+
+- Get a Nebula API key and set the environment variables (`SYMBLAI_NEBULA_SERVICE_URL`, `SYMBLAI_NEBULA_SERVICE_PATH`, `SYMBLAI_NEBULA_SERVICE_TOKEN`), as shown in the example below.
+  - Sign up for a FREE Symbl.ai/Nebula account: [https://nebula.symbl.ai/playground/](https://nebula.symbl.ai/playground/)
+- Please see the [Nebula documentation](https://docs.symbl.ai/docs/nebula-llm-overview) for more details.
+  - No time? Visit the [Nebula Quickstart Guide](https://docs.symbl.ai/docs/nebula-quickstart).
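+
+For example, a minimal setup sketch (the URL and path below are the wrapper's built-in defaults; the token is a placeholder for your own credentials):
+
+```python
+import os
+
+# Defaults from the Nebula wrapper; replace the token placeholder with your own.
+os.environ["SYMBLAI_NEBULA_SERVICE_URL"] = "https://api-nebula.symbl.ai"
+os.environ["SYMBLAI_NEBULA_SERVICE_PATH"] = "/v1/model/generate"
+os.environ["SYMBLAI_NEBULA_SERVICE_TOKEN"] = "<YOUR_NEBULA_SERVICE_TOKEN>"
+```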
+
+## Wrappers
+
+### LLM
+
+There exists a Nebula LLM wrapper, which you can access with
+```python
+from langchain.llms import Nebula
+```
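+
+A short usage sketch, wiring Nebula into an `LLMChain` (this assumes the environment variables above are set; the conversation text is a placeholder for a real transcript):
+
+```python
+from langchain import LLMChain, PromptTemplate
+from langchain.llms import Nebula
+
+# The wrapper requires a non-empty conversation transcript to analyze.
+llm = Nebula(conversation="Speaker 1: Hi, I have a billing question. Speaker 2: Happy to help!")
+
+prompt = PromptTemplate.from_template(
+    "Identify the {count} main objectives or goals mentioned in this conversation."
+)
+llm_chain = LLMChain(prompt=prompt, llm=llm)
+print(llm_chain.run(count="three"))
+```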
diff --git a/libs/langchain/langchain/llms/__init__.py b/libs/langchain/langchain/llms/__init__.py
index 53612278c9..5052dd75bd 100644
--- a/libs/langchain/langchain/llms/__init__.py
+++ b/libs/langchain/langchain/llms/__init__.py
@@ -73,6 +73,7 @@ from langchain.llms.sagemaker_endpoint import SagemakerEndpoint
 from langchain.llms.self_hosted import SelfHostedPipeline
 from langchain.llms.self_hosted_hugging_face import SelfHostedHuggingFaceLLM
 from langchain.llms.stochasticai import StochasticAI
+from langchain.llms.symblai_nebula import Nebula
 from langchain.llms.textgen import TextGen
 from langchain.llms.tongyi import Tongyi
 from langchain.llms.vertexai import VertexAI
@@ -121,6 +122,7 @@ __all__ = [
     "MlflowAIGateway",
     "Modal",
     "MosaicML",
+    "Nebula",
     "NLPCloud",
     "OpenAI",
     "OpenAIChat",
@@ -184,6 +186,7 @@ type_to_cls_dict: Dict[str, Type[BaseLLM]] = {
     "mlflow-ai-gateway": MlflowAIGateway,
     "modal": Modal,
     "mosaic": MosaicML,
+    "nebula": Nebula,
     "nlpcloud": NLPCloud,
     "openai": OpenAI,
     "openlm": OpenLM,
diff --git a/libs/langchain/langchain/llms/symblai_nebula.py b/libs/langchain/langchain/llms/symblai_nebula.py
new file mode 100644
index 0000000000..aedeec3179
--- /dev/null
+++ b/libs/langchain/langchain/llms/symblai_nebula.py
@@ -0,0 +1,196 @@
+import logging
+from typing import Any, Dict, List, Mapping, Optional
+
+import requests
+from pydantic import Extra, root_validator
+
+from langchain.callbacks.manager import CallbackManagerForLLMRun
+from langchain.llms.base import LLM
+from langchain.llms.utils import enforce_stop_tokens
+from langchain.utils import get_from_dict_or_env
+
+DEFAULT_SYMBLAI_NEBULA_SERVICE_URL = "https://api-nebula.symbl.ai"
+DEFAULT_SYMBLAI_NEBULA_SERVICE_PATH = "/v1/model/generate"
+
+logger = logging.getLogger(__name__)
+
+
+class Nebula(LLM):
+    """Nebula Service models.
+
+    To use, you should have the environment variables ``SYMBLAI_NEBULA_SERVICE_URL``,
+    ``SYMBLAI_NEBULA_SERVICE_PATH``, and ``SYMBLAI_NEBULA_SERVICE_TOKEN`` set with your
+    Nebula Service, or pass them as named parameters to the constructor.
+
+    Example:
+        .. code-block:: python
+
+            from langchain.llms import Nebula
+
+            nebula = Nebula(
+                nebula_service_url="SERVICE_URL",
+                nebula_service_path="SERVICE_ROUTE",
+                nebula_service_token="SERVICE_TOKEN",
+            )
+
+            # Use Ray for distributed processing
+            import ray
+
+            prompt_list = []
+
+            @ray.remote
+            def send_query(llm, prompt):
+                resp = llm(prompt)
+                return resp
+
+            futures = [send_query.remote(nebula, prompt) for prompt in prompt_list]
+            results = ray.get(futures)
+    """  # noqa: E501
+
+    model_kwargs: Optional[dict] = None
+    """Key/value arguments to pass to the model. Reserved for future use."""
+
+    # Optional service configuration; defaults are filled in by the validator below.
+    nebula_service_url: Optional[str] = None
+    nebula_service_path: Optional[str] = None
+    nebula_service_token: Optional[str] = None
+    conversation: str = ""
+    return_scores: Optional[str] = "false"
+    max_new_tokens: Optional[int] = 2048
+    top_k: Optional[float] = 2
+    penalty_alpha: Optional[float] = 0.1
+
+    class Config:
+        """Configuration for this pydantic object."""
+
+        extra = Extra.forbid
+
+    @root_validator()
+    def validate_environment(cls, values: Dict) -> Dict:
+        """Validate that the service URL, path, and token are available, either as
+        constructor parameters or in the environment."""
+        nebula_service_url = get_from_dict_or_env(
+            values, "nebula_service_url", "SYMBLAI_NEBULA_SERVICE_URL"
+        )
+        nebula_service_path = get_from_dict_or_env(
+            values, "nebula_service_path", "SYMBLAI_NEBULA_SERVICE_PATH"
+        )
+        nebula_service_token = get_from_dict_or_env(
+            values, "nebula_service_token", "SYMBLAI_NEBULA_SERVICE_TOKEN"
+        )
+
+        # Fall back to the defaults when empty values are supplied.
+        if len(nebula_service_url) == 0:
+            nebula_service_url = DEFAULT_SYMBLAI_NEBULA_SERVICE_URL
+        if len(nebula_service_path) == 0:
+            nebula_service_path = DEFAULT_SYMBLAI_NEBULA_SERVICE_PATH
+
+        # Normalize so that f"{url}{path}" forms a well-formed endpoint.
+        if nebula_service_url.endswith("/"):
+            nebula_service_url = nebula_service_url[:-1]
+        if not nebula_service_path.startswith("/"):
+            nebula_service_path = "/" + nebula_service_path
+
+        # TODO: Future login
+        # try:
+        #     nebula_service_endpoint = f"{nebula_service_url}{nebula_service_path}"
+        #     headers = {
+        #         "Content-Type": "application/json",
+        #         "ApiKey": f"Bearer {nebula_service_token}",
+        #     }
+        #     requests.get(nebula_service_endpoint, headers=headers)
+        # except requests.exceptions.RequestException as e:
+        #     raise ValueError(e)
+
+        values["nebula_service_url"] = nebula_service_url
+        values["nebula_service_path"] = nebula_service_path
+        values["nebula_service_token"] = nebula_service_token
+
+        return values
+
+    @property
+    def _identifying_params(self) -> Mapping[str, Any]:
+        """Get the identifying parameters."""
+        _model_kwargs = self.model_kwargs or {}
+        return {
+            "nebula_service_url": self.nebula_service_url,
+            "nebula_service_path": self.nebula_service_path,
+            "model_kwargs": _model_kwargs,
+            "conversation": self.conversation,
+        }
+
+    @property
+    def _llm_type(self) -> str:
+        """Return type of llm."""
+        return "nebula"
+
+    def _call(
+        self,
+        prompt: str,
+        stop: Optional[List[str]] = None,
+        run_manager: Optional[CallbackManagerForLLMRun] = None,
+        **kwargs: Any,
+    ) -> str:
+        """Call out to the Nebula Service endpoint.
+
+        Args:
+            prompt: The prompt to pass into the model.
+            stop: Optional list of stop words to use when generating.
+
+        Returns:
+            The string generated by the model.
+
+        Example:
+            .. code-block:: python
+
+                response = nebula("Tell me a joke.")
+        """
+        _model_kwargs = self.model_kwargs or {}
+
+        nebula_service_endpoint = f"{self.nebula_service_url}{self.nebula_service_path}"
+
+        headers = {
+            "Content-Type": "application/json",
+            "ApiKey": f"Bearer {self.nebula_service_token}",
+        }
+
+        # Validate the input before building the request.
+        if len(self.conversation) == 0:
+            raise ValueError("Error: conversation is empty.")
+
+        body = {
+            "prompt": {
+                "instruction": prompt,
+                "conversation": {"text": f"{self.conversation}"},
+            },
+            "return_scores": self.return_scores,
+            "max_new_tokens": self.max_new_tokens,
+            "top_k": self.top_k,
+            "penalty_alpha": self.penalty_alpha,
+        }
+
+        logger.debug(f"NEBULA _model_kwargs: {_model_kwargs}")
+        logger.debug(f"NEBULA body: {body}")
+        logger.debug(f"NEBULA kwargs: {kwargs}")
+        logger.debug(f"NEBULA conversation: {self.conversation}")
+
+        # Call the Nebula generate API.
+        try:
+            response = requests.post(
+                nebula_service_endpoint, headers=headers, json=body
+            )
+        except requests.exceptions.RequestException as e:
+            raise ValueError(f"Error raised by inference endpoint: {e}")
+
+        logger.debug(f"NEBULA response: {response}")
+
+        if response.status_code != 200:
+            raise ValueError(
+                f"Error returned by service, status code {response.status_code}"
+            )
+
+        # Get the raw result text.
+        text = response.text
+
+        if stop is not None:
+            # Enforce stop tokens here, since they
+            # are not enforced by the model parameters.
+            text = enforce_stop_tokens(text, stop)
+
+        return text
diff --git a/libs/langchain/tests/integration_tests/llms/test_symblai_nebula.py b/libs/langchain/tests/integration_tests/llms/test_symblai_nebula.py
new file mode 100644
index 0000000000..618ad4ed0d
--- /dev/null
+++ b/libs/langchain/tests/integration_tests/llms/test_symblai_nebula.py
@@ -0,0 +1,56 @@
+"""Test Nebula API wrapper."""
+
+from langchain import LLMChain, PromptTemplate
+from langchain.llms.symblai_nebula import Nebula
+
+
+def test_symblai_nebula_call() -> None:
+    """Test valid call to Nebula."""
+    conversation = """Speaker 1: Thank you for calling ABC, company.Speaker 1: My name
+is Mary.Speaker 1: How may I help you?Speaker 2: Today?Speaker 1: All right,
+Madam.Speaker 1: I really apologize for this inconvenient.Speaker 1: I will be happy
+to assist you in this matter.Speaker 1: Could you please offer me Yuri your account
+number?Speaker 1: Alright Madam, thank you very much.Speaker 1: Let me check that
+for confirmation.Speaker 1: Did you say 534 00 365?Speaker 2: 48?Speaker 1: Very good
+man.Speaker 1: Now for verification purposes, can I please get your full?Speaker
+2: Name?Speaker 1: Alright, thank you.Speaker 1: Very much.Speaker 1: Madam.Speaker
+1: Can I, please get your birthdate now?Speaker 1: I am sorry madam.Speaker 1: I
+didn't make this clear is for verification.Speaker 1: Purposes is the company
+request.Speaker 1: The system requires me, your name, your complete name and your
+date of.Speaker 2: Birth.Speaker 2: Alright, thank you very much madam.Speaker 1:
+All right.Speaker 1: Thank you very much, Madam.Speaker 1: Thank you for that
+information.Speaker 1: Let me check what happens.Speaker 2: Here.Speaker 1: So
+according to our data space them, you did pay your last bill last August the 12,
+which was two days ago in one of our Affiliated payment centers.Speaker 1: So, at the
+moment you currently, We have zero balance.Speaker 1: So however, the bill that you
+received was generated a week before you made the pavement, this is reason why you
+already make this payment, have not been reflected
+yet.Speaker 1: So what we do in
+this case, you just simply disregard the amount indicated in the field and you
+continue to enjoy our service man.Speaker 1: Sure, Madam.Speaker 1: And I am sure
+you need your cell phone for everything for life, right?Speaker 1: So I really
+apologize for this inconvenience.Speaker 1: And let me tell you that delays in the
+bill is usually caused by delays in our Courier Service.Speaker 1: That is to say
+that it's a problem, not with the company, but with a courier service, For a more
+updated, feel of your account, you can visit our website and log into your account,
+and they're in the system.Speaker 1: On the website, you are going to have the
+possibility to pay the bill.Speaker 1: That is more.Speaker 2: Updated.Speaker 2:
+Of course, Madam I can definitely assist you with that.Speaker 2: Once you have,
+you want to see your bill updated, please go to www.hsn BC campus, any.com after
+that.Speaker 2: You will see in the tale.Speaker 1: All right corner.Speaker 1: So
+you're going to see a pay now button.Speaker 1: Please click on the pay now button
+and the serve.Speaker 1: The system is going to ask you for personal
+information.Speaker 1: Such as your first name, your ID account, your the number of
+your account, your email address, and your phone number once you complete this personal
+information."""
+    llm = Nebula(
+        conversation=conversation,
+    )
+
+    template = """Identify the {count} main objectives or goals mentioned in this
+conversation. Summarize them concisely and emphasize the key intents."""
+    prompt = PromptTemplate.from_template(template)
+
+    llm_chain = LLMChain(prompt=prompt, llm=llm)
+    output = llm_chain.run(count="five")
+
+    assert isinstance(output, str)