Create ChatAnyscale (#8770)

- Description: Adds the ChatAnyscale class with llama-2 7b, llama-2 13b, and llama-2 70b on [Anyscale Endpoints](https://app.endpoints.anyscale.com/) - It inherits from ChatOpenAI and requires openai (probably unnecessary but it made for a quick and easy implementation) - Inspired by https://github.com/langchain-ai/langchain/pull/8434 (@kylehh and @baskaryan )
1 year ago · 7fc07ba5df
parent fe78aff1f2
commit 7fc07ba5df
3 changed files with 419 additions and 0 deletions
--- a/docs/extras/integrations/chat/anyscale.ipynb
+++ b/docs/extras/integrations/chat/anyscale.ipynb
@ -0,0 +1,225 @@
+{
+ "cells": [
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "id": "642fd21c-600a-47a1-be96-6e1438b421a9",
+   "metadata": {},
+   "source": [
+    "# Anyscale\n",
+    "\n",
+    "This notebook demonstrates the use of `langchain.chat_models.ChatAnyscale` for [Anyscale Endpoints](https://endpoints.anyscale.com/).\n",
+    "\n",
+    "* Set `ANYSCALE_API_KEY` environment variable\n",
+    "* or use the `anyscale_api_key` keyword argument"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "outputs": [],
+   "source": [
+    "# !pip install openai"
+   ],
+   "metadata": {
+    "collapsed": false
+   },
+   "id": "d00d850917865298"
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "72340871-ae2f-415f-b399-0777d32dc379",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdin",
+     "output_type": "stream",
+     "text": [
+      " ········\n"
+     ]
+    }
+   ],
+   "source": [
+    "import os\n",
+    "from getpass import getpass\n",
+    "\n",
+    "os.environ[\"ANYSCALE_API_KEY\"] = getpass()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5d7fc704-3ea0-4c35-96e7-89fcae6c73fa",
+   "metadata": {},
+   "source": [
+    "# Let's try out each model offered on Anyscale Endpoints"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "0dc9428d-4217-47d2-97de-f784b1764186",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "dict_keys(['meta-llama/Llama-2-70b-chat-hf', 'meta-llama/Llama-2-7b-chat-hf', 'meta-llama/Llama-2-13b-chat-hf'])\n"
+     ]
+    }
+   ],
+   "source": [
+    "from langchain.chat_models import ChatAnyscale\n",
+    "\n",
+    "chats = {\n",
+    "    model: ChatAnyscale(model_name=model, temperature=1.0)\n",
+    "    for model in ChatAnyscale.get_available_models()\n",
+    "}\n",
+    "\n",
+    "print(chats.keys())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7c4f124a-eaf7-4d78-a2c0-b0aa23fb25c4",
+   "metadata": {},
+   "source": [
+    "# We can use async methods and other stuff supported by ChatOpenAI\n",
+    "\n",
+    "This way, the three requests will only take as long as the longest individual request."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "1f94f5d2-569e-4a2c-965e-de53c2845fbb",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import asyncio\n",
+    "\n",
+    "from langchain.schema import SystemMessage, HumanMessage\n",
+    "\n",
+    "messages = [\n",
+    "    SystemMessage(\n",
+    "        content=\"You are a helpful AI that shares everything you know.\"\n",
+    "    ),\n",
+    "    HumanMessage(\n",
+    "        content=\"Tell me technical facts about yourself. Are you a transformer model? How many billions of parameters do you have?\"\n",
+    "    ),\n",
+    "]\n",
+    "\n",
+    "async def get_msgs():\n",
+    "    tasks = [\n",
+    "        chat.apredict_messages(messages)\n",
+    "        for chat in chats.values()\n",
+    "    ]\n",
+    "    responses = await asyncio.gather(*tasks)\n",
+    "    return dict(zip(chats.keys(), responses))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "b2ced871-869a-4ca6-a2ec-6bfececdf7da",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import nest_asyncio\n",
+    "\n",
+    "nest_asyncio.apply()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "bc605fa5-9501-470d-a6c9-cd868d2145ef",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\tmeta-llama/Llama-2-70b-chat-hf\n",
+      "\n",
+      "Greetings! I'm just an AI, I don't have a personal identity like humans do, but I'm here to help you with any questions you have.\n",
+      "\n",
+      "I'm a large language model, which means I'm trained on a large corpus of text data to generate language outputs that are coherent and natural-sounding. My architecture is based on a transformer model, which is a type of neural network that's particularly well-suited for natural language processing tasks.\n",
+      "\n",
+      "As for my parameters, I have a few billion parameters, but I don't have access to the exact number as it's not relevant to my functioning. My training data includes a vast amount of text from various sources, including books, articles, and websites, which I use to learn patterns and relationships in language.\n",
+      "\n",
+      "I'm designed to be a helpful tool for a variety of tasks, such as answering questions, providing information, and generating text. I'm constantly learning and improving my abilities through machine learning algorithms and feedback from users like you.\n",
+      "\n",
+      "I hope this helps! Is there anything else you'd like to know about me or my capabilities?\n",
+      "\n",
+      "---\n",
+      "\n",
+      "\tmeta-llama/Llama-2-7b-chat-hf\n",
+      "\n",
+      "Ah, a fellow tech enthusiast! *adjusts glasses* I'm glad to share some technical details about myself. 🤓\n",
+      "Indeed, I'm a transformer model, specifically a BERT-like language model trained on a large corpus of text data. My architecture is based on the transformer framework, which is a type of neural network designed for natural language processing tasks. 🏠\n",
+      "As for the number of parameters, I have approximately 340 million. *winks* That's a pretty hefty number, if I do say so myself! These parameters allow me to learn and represent complex patterns in language, such as syntax, semantics, and more. 🤔\n",
+      "But don't ask me to do math in my head – I'm a language model, not a calculating machine! 😅 My strengths lie in understanding and generating human-like text, so feel free to chat with me anytime you'd like. 💬\n",
+      "Now, do you have any more technical questions for me? Or would you like to engage in a nice chat? 😊\n",
+      "\n",
+      "---\n",
+      "\n",
+      "\tmeta-llama/Llama-2-13b-chat-hf\n",
+      "\n",
+      "Hello! As a friendly and helpful AI, I'd be happy to share some technical facts about myself.\n",
+      "\n",
+      "I am a transformer-based language model, specifically a variant of the BERT (Bidirectional Encoder Representations from Transformers) architecture. BERT was developed by Google in 2018 and has since become one of the most popular and widely-used AI language models.\n",
+      "\n",
+      "Here are some technical details about my capabilities:\n",
+      "\n",
+      "1. Parameters: I have approximately 340 million parameters, which are the numbers that I use to learn and represent language. This is a relatively large number of parameters compared to some other languages models, but it allows me to learn and understand complex language patterns and relationships.\n",
+      "2. Training: I was trained on a large corpus of text data, including books, articles, and other sources of written content. This training allows me to learn about the structure and conventions of language, as well as the relationships between words and phrases.\n",
+      "3. Architectures: My architecture is based on the transformer model, which is a type of neural network that is particularly well-suited for natural language processing tasks. The transformer model uses self-attention mechanisms to allow the model to \"attend\" to different parts of the input text, allowing it to capture long-range dependencies and contextual relationships.\n",
+      "4. Precision: I am capable of generating text with high precision and accuracy, meaning that I can produce text that is close to human-level quality in terms of grammar, syntax, and coherence.\n",
+      "5. Generative capabilities: In addition to being able to generate text based on prompts and questions, I am also capable of generating text based on a given topic or theme. This allows me to create longer, more coherent pieces of text that are organized around a specific idea or concept.\n",
+      "\n",
+      "Overall, I am a powerful and versatile language model that is capable of a wide range of natural language processing tasks. I am constantly learning and improving, and I am here to help answer any questions you may have!\n",
+      "\n",
+      "---\n",
+      "\n",
+      "CPU times: user 371 ms, sys: 15.5 ms, total: 387 ms\n",
+      "Wall time: 12 s\n"
+     ]
+    }
+   ],
+   "source": [
+    "%%time\n",
+    "\n",
+    "response_dict = asyncio.run(get_msgs())\n",
+    "\n",
+    "for model_name, response in response_dict.items():\n",
+    "    print(f'\\t{model_name}')\n",
+    "    print()\n",
+    "    print(response.content)\n",
+    "    print('\\n---\\n')"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.4"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
--- a/libs/langchain/langchain/chat_models/init.py
+++ b/libs/langchain/langchain/chat_models/init.py
@ -18,6 +18,7 @@ an interface where "chat messages" are the inputs and outputs.
 """  # noqa: E501

 from langchain.chat_models.anthropic import ChatAnthropic
+from langchain.chat_models.anyscale import ChatAnyscale
 from langchain.chat_models.azure_openai import AzureChatOpenAI
 from langchain.chat_models.fake import FakeListChatModel
 from langchain.chat_models.google_palm import ChatGooglePalm
@ -39,4 +40,5 @@ __all__ = [
    "ChatVertexAI",
    "JinaChat",
    "HumanInputChatModel",
+    "ChatAnyscale",
 ]
--- a/libs/langchain/langchain/chat_models/anyscale.py
+++ b/libs/langchain/langchain/chat_models/anyscale.py
@ -0,0 +1,192 @@
+"""Anyscale Endpoints chat wrapper. Relies heavily on ChatOpenAI."""
+from __future__ import annotations
+
+import logging
+import os
+import sys
+from typing import TYPE_CHECKING, Optional, Set
+
+import requests
+from pydantic import Field, root_validator
+
+from langchain.chat_models.openai import (
+    ChatOpenAI,
+    _convert_message_to_dict,
+    _import_tiktoken,
+)
+from langchain.schema.messages import BaseMessage
+from langchain.utils import get_from_dict_or_env
+
+if TYPE_CHECKING:
+    import tiktoken
+
+logger = logging.getLogger(__name__)
+
+
+DEFAULT_API_BASE = "https://api.endpoints.anyscale.com/v1"
+DEFAULT_MODEL = "meta-llama/Llama-2-7b-chat-hf"
+
+
+class ChatAnyscale(ChatOpenAI):
+    """Wrapper around Anyscale Chat large language models.
+
+    To use, you should have the ``openai`` python package installed, and the
+    environment variable ``ANYSCALE_API_KEY`` set with your API key.
+    Alternatively, you can use the anyscale_api_key keyword argument.
+
+    Any parameters that are valid to be passed to the `openai.create` call can be passed
+    in, even if not explicitly saved on this class.
+
+    Example:
+        .. code-block:: python
+
+            from langchain.chat_models import ChatAnyscale
+            chat = ChatAnyscale(model_name="meta-llama/Llama-2-7b-chat-hf")
+    """
+
+    @property
+    def _llm_type(self) -> str:
+        """Return type of chat model."""
+        return "anyscale-chat"
+
+    @property
+    def lc_secrets(self) -> dict[str, str]:
+        return {"anyscale_api_key": "ANYSCALE_API_KEY"}
+
+    anyscale_api_key: Optional[str] = None
+    """AnyScale Endpoints API keys."""
+    model_name: str = Field(default=DEFAULT_MODEL, alias="model")
+    """Model name to use."""
+    anyscale_api_base: str = Field(default=DEFAULT_API_BASE)
+    """Base URL path for API requests,
+    leave blank if not using a proxy or service emulator."""
+    anyscale_proxy: Optional[str] = None
+    """To support explicit proxy for Anyscale."""
+    available_models: Optional[Set[str]] = None
+    """Available models from Anyscale API."""
+
+    @staticmethod
+    def get_available_models(
+        anyscale_api_key: Optional[str] = None,
+        anyscale_api_base: str = DEFAULT_API_BASE,
+    ) -> Set[str]:
+        """Get available models from Anyscale API."""
+        try:
+            anyscale_api_key = anyscale_api_key or os.environ["ANYSCALE_API_KEY"]
+        except KeyError as e:
+            raise ValueError(
+                "Anyscale API key must be passed as keyword argument or "
+                "set in environment variable ANYSCALE_API_KEY.",
+            ) from e
+
+        models_url = f"{anyscale_api_base}/models"
+        models_response = requests.get(
+            models_url,
+            headers={
+                "Authorization": f"Bearer {anyscale_api_key}",
+            },
+        )
+
+        if models_response.status_code != 200:
+            raise ValueError(
+                f"Error getting models from {models_url}: "
+                f"{models_response.status_code}",
+            )
+
+        return {model["id"] for model in models_response.json()["data"]}
+
+    @root_validator(pre=True)
+    def validate_environment_override(cls, values: dict) -> dict:
+        """Validate that api key and python package exists in environment."""
+        values["openai_api_key"] = get_from_dict_or_env(
+            values,
+            "anyscale_api_key",
+            "ANYSCALE_API_KEY",
+        )
+        values["openai_api_base"] = get_from_dict_or_env(
+            values,
+            "anyscale_api_base",
+            "ANYSCALE_API_BASE",
+            default=DEFAULT_API_BASE,
+        )
+        values["openai_proxy"] = get_from_dict_or_env(
+            values,
+            "anyscale_proxy",
+            "ANYSCALE_PROXY",
+            default="",
+        )
+        try:
+            import openai
+
+        except ImportError as e:
+            raise ValueError(
+                "Could not import openai python package. "
+                "Please install it with `pip install openai`.",
+            ) from e
+        try:
+            values["client"] = openai.ChatCompletion
+        except AttributeError as exc:
+            raise ValueError(
+                "`openai` has no `ChatCompletion` attribute, this is likely "
+                "due to an old version of the openai package. Try upgrading it "
+                "with `pip install --upgrade openai`.",
+            ) from exc
+
+        if "model_name" not in values.keys():
+            values["model_name"] = DEFAULT_MODEL
+
+        model_name = values["model_name"]
+
+        available_models = cls.get_available_models(
+            values["openai_api_key"],
+            values["openai_api_base"],
+        )
+
+        if model_name not in available_models:
+            raise ValueError(
+                f"Model name {model_name} not found in available models: "
+                f"{available_models}.",
+            )
+
+        values["available_models"] = available_models
+
+        return values
+
+    def _get_encoding_model(self) -> tuple[str, tiktoken.Encoding]:
+        tiktoken_ = _import_tiktoken()
+        if self.tiktoken_model_name is not None:
+            model = self.tiktoken_model_name
+        else:
+            model = self.model_name
+        # Returns the number of tokens used by a list of messages.
+        try:
+            encoding = tiktoken_.encoding_for_model("gpt-3.5-turbo-0301")
+        except KeyError:
+            logger.warning("Warning: model not found. Using cl100k_base encoding.")
+            model = "cl100k_base"
+            encoding = tiktoken_.get_encoding(model)
+        return model, encoding
+
+    def get_num_tokens_from_messages(self, messages: list[BaseMessage]) -> int:
+        """Calculate num tokens with tiktoken package.
+
+        Official documentation: https://github.com/openai/openai-cookbook/blob/
+        main/examples/How_to_format_inputs_to_ChatGPT_models.ipynb"""
+        if sys.version_info[1] <= 7:
+            return super().get_num_tokens_from_messages(messages)
+        model, encoding = self._get_encoding_model()
+        tokens_per_message = 3
+        tokens_per_name = 1
+        num_tokens = 0
+        messages_dict = [_convert_message_to_dict(m) for m in messages]
+        for message in messages_dict:
+            num_tokens += tokens_per_message
+            for key, value in message.items():
+                # Cast str(value) in case the message value is not a string
+                # This occurs with function messages
+                num_tokens += len(encoding.encode(str(value)))
+                if key == "name":
+                    num_tokens += tokens_per_name
+        # every reply is primed with <im_start>assistant
+        num_tokens += 3
+        return num_tokens