Add konko chat model (#10380)

pull/10383/head
Bagatur 12 months ago committed by GitHub
commit 5d8a689d5e

@@ -0,0 +1,164 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Konko\n",
"\n",
">[Konko](https://www.konko.ai/) API is a fully managed Web API designed to help application developers:\n",
"\n",
"Konko API is a fully managed API designed to help application developers:\n",
"\n",
"1. Select the right LLM(s) for their application\n",
"2. Prototype with various open-source and proprietary LLMs\n",
"3. Move to production in-line with their security, privacy, throughput, latency SLAs without infrastructure set-up or administration using Konko AI's SOC 2 compliant infrastructure\n",
"\n",
"\n",
"This example goes over how to use LangChain to interact with `Konko` [models](https://docs.konko.ai/docs/overview)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To run this notebook, you'll need Konko API key. You can request it by messaging support@konko.ai."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.chat_models import ChatKonko\n",
"from langchain.prompts.chat import (\n",
" ChatPromptTemplate,\n",
" SystemMessagePromptTemplate,\n",
" AIMessagePromptTemplate,\n",
" HumanMessagePromptTemplate,\n",
")\n",
"from langchain.schema import AIMessage, HumanMessage, SystemMessage"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Set API Keys\n",
"\n",
"<br />\n",
"\n",
"### Option 1: Set Environment Variables\n",
"\n",
"1. You can set environment variables for \n",
" 1. KONKO_API_KEY (Required)\n",
" 2. OPENAI_API_KEY (Optional)\n",
"2. In your current shell session, use the export command:\n",
"\n",
"```shell\n",
"export KONKO_API_KEY={your_KONKO_API_KEY_here}\n",
"export OPENAI_API_KEY={your_OPENAI_API_KEY_here} #Optional\n",
"```\n",
"\n",
"Alternatively, you can add the above lines directly to your shell startup script (such as .bashrc or .bash_profile for Bash shell and .zshrc for Zsh shell) to have them set automatically every time a new shell session starts.\n",
"\n",
"### Option 2: Set API Keys Programmatically\n",
"\n",
"If you prefer to set your API keys directly within your Python script or Jupyter notebook, you can use the following commands:\n",
"\n",
"```python\n",
"konko.set_api_key('your_KONKO_API_KEY_here') \n",
"konko.set_openai_api_key('your_OPENAI_API_KEY_here') # Optional\n",
"```\n"
]
},
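{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you'd rather set the keys from inside the notebook, here is a minimal sketch that uses the standard `os` and `getpass` modules instead of the `konko` SDK helpers above:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from getpass import getpass\n",
"\n",
"# Prompt for the Konko API key if it isn't already set in the environment\n",
"if \"KONKO_API_KEY\" not in os.environ:\n",
"    os.environ[\"KONKO_API_KEY\"] = getpass(\"Konko API key: \")\n",
"\n",
"# OPENAI_API_KEY is optional (see above); set it the same way if you need it\n",
"# if \"OPENAI_API_KEY\" not in os.environ:\n",
"#     os.environ[\"OPENAI_API_KEY\"] = getpass(\"OpenAI API key: \")"
]
},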
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Calling a model\n",
"\n",
"Find a model on the [Konko overview page](https://docs.konko.ai/docs/overview)\n",
"\n",
"For example, for this [LLama 2 model](https://docs.konko.ai/docs/meta-llama-2-13b-chat). The model id would be: `\"meta-llama/Llama-2-13b-chat-hf\"`\n",
"\n",
"Another way to find the list of models running on the Konko instance is through this [endpoint](https://docs.konko.ai/reference/listmodels).\n",
"\n",
"From here, we can initialize our model:\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"chat = ChatKonko(max_tokens=400, model = 'meta-llama/Llama-2-13b-chat-hf')"
]
},
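{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can also query the model list from LangChain itself: `ChatKonko.get_available_models()` wraps that models endpoint and returns the set of model IDs your Konko instance exposes (it reads `KONKO_API_KEY`, and optionally `OPENAI_API_KEY`, from the environment if they are not passed explicitly)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# List the model IDs available on your Konko instance\n",
"ChatKonko.get_available_models()"
]
},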
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\" Sure, I'd be happy to explain the Big Bang Theory briefly!\\n\\nThe Big Bang Theory is the leading explanation for the origin and evolution of the universe, based on a vast amount of observational evidence from many fields of science. In essence, the theory posits that the universe began as an infinitely hot and dense point, known as a singularity, around 13.8 billion years ago. This singularity expanded rapidly, and as it did, it cooled and formed subatomic particles, which eventually coalesced into the first atoms, and later into the stars and galaxies we see today.\\n\\nThe theory gets its name from the idea that the universe began in a state of incredibly high energy and temperature, and has been expanding and cooling ever since. This expansion is thought to have been driven by a mysterious force known as dark energy, which is thought to be responsible for the accelerating expansion of the universe.\\n\\nOne of the key predictions of the Big Bang Theory is that the universe should be homogeneous and isotropic on large scales, meaning that it should look the same in all directions and have the same properties everywhere. This prediction has been confirmed by a wealth of observational evidence, including the cosmic microwave background radiation, which is thought to be a remnant of the early universe.\\n\\nOverall, the Big Bang Theory is a well-established and widely accepted explanation for the origins of the universe, and it has been supported by a vast amount of observational evidence from many fields of science.\", additional_kwargs={}, example=False)"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"messages = [\n",
" SystemMessage(\n",
" content=\"You are a helpful assistant.\"\n",
" ),\n",
" HumanMessage(\n",
" content=\"Explain Big Bang Theory briefly\"\n",
" ),\n",
"]\n",
"chat(messages)"
]
},
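{
"cell_type": "markdown",
"metadata": {},
"source": [
"`ChatKonko` also supports streaming. A minimal sketch, reusing the `chat` instance from above with the generic `.stream()` helper:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Stream the reply token by token; each chunk is an AIMessageChunk\n",
"for chunk in chat.stream(\"Explain Big Bang Theory briefly\"):\n",
"    print(chunk.content, end=\"\", flush=True)"
]
},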
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
},
"vscode": {
"interpreter": {
"hash": "a0a0263b650d907a3bfe41c0f8d6a63a071b884df3cfdc1579f00cdc1aed6b03"
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}

@@ -0,0 +1,80 @@
# Konko
This page covers how to run models on Konko within LangChain.
Konko API is a fully managed API designed to help application developers:

1. Select the right LLM(s) for their application
2. Prototype with various open-source and proprietary LLMs
3. Move to production in-line with their security, privacy, throughput, latency SLAs without infrastructure set-up or administration using Konko AI's SOC 2 compliant infrastructure
## Installation and Setup
### First, you'll need an API key
You can request one by messaging [support@konko.ai](mailto:support@konko.ai)
### Install Konko AI's Python SDK
#### 1. Enable a Python 3.8+ environment
#### 2. Set API Keys
##### Option 1: Set Environment Variables
1. You can set environment variables for
   1. KONKO_API_KEY (Required)
   2. OPENAI_API_KEY (Optional)
2. In your current shell session, use the export command:
```shell
export KONKO_API_KEY={your_KONKO_API_KEY_here}
export OPENAI_API_KEY={your_OPENAI_API_KEY_here} #Optional
```
Alternatively, you can add the above lines directly to your shell startup script (such as .bashrc or .bash_profile for Bash shell and .zshrc for Zsh shell) to have them set automatically every time a new shell session starts.
##### Option 2: Set API Keys Programmatically
If you prefer to set your API keys directly within your Python script or Jupyter notebook, you can use the following commands:
```python
konko.set_api_key('your_KONKO_API_KEY_here')
konko.set_openai_api_key('your_OPENAI_API_KEY_here') # Optional
```
#### 3. Install the SDK
```shell
pip install konko
```
#### 4. Verify Installation & Authentication
```python
# Confirm konko has installed successfully
import konko

# Confirm API keys from Konko and OpenAI are set properly
konko.Model.list()
```
## Calling a model
Find a model on the [Konko Introduction page](https://docs.konko.ai/docs#available-models)
For example, the model ID for this [Llama 2 model](https://docs.konko.ai/docs/meta-llama-2-13b-chat) is `"meta-llama/Llama-2-13b-chat-hf"`.
Another way to find the list of models running on the Konko instance is through this [endpoint](https://docs.konko.ai/reference/listmodels).
From here, we can initialize our model:
```python
from langchain.chat_models import ChatKonko

chat_instance = ChatKonko(max_tokens=10, model="meta-llama/Llama-2-13b-chat-hf")
```
And run it:
```python
from langchain.schema import HumanMessage

msg = HumanMessage(content="Hi")
chat_response = chat_instance([msg])
```
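Reusing `chat_instance` and `msg` from above, here is a short sketch of batched generation with the generic `generate` method; for Konko models the returned `llm_output` carries `token_usage` and `model_name`:
```python
results = chat_instance.generate([[msg], [msg]])  # two independent prompts

for gen_list in results.generations:
    print(gen_list[0].message.content)

print(results.llm_output)  # {'token_usage': {...}, 'model_name': 'meta-llama/Llama-2-13b-chat-hf'}
```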

@@ -20,12 +20,12 @@ an interface where "chat messages" are the inputs and outputs.
from langchain.chat_models.anthropic import ChatAnthropic
from langchain.chat_models.anyscale import ChatAnyscale
from langchain.chat_models.azure_openai import AzureChatOpenAI
from langchain.chat_models.bedrock import BedrockChat
from langchain.chat_models.ernie import ErnieBotChat
from langchain.chat_models.fake import FakeListChatModel
from langchain.chat_models.google_palm import ChatGooglePalm
from langchain.chat_models.human import HumanInputChatModel
from langchain.chat_models.jinachat import JinaChat
from langchain.chat_models.konko import ChatKonko
from langchain.chat_models.litellm import ChatLiteLLM
from langchain.chat_models.mlflow_ai_gateway import ChatMLflowAIGateway
from langchain.chat_models.ollama import ChatOllama
@@ -36,7 +36,6 @@ from langchain.chat_models.vertexai import ChatVertexAI
__all__ = [
"ChatOpenAI",
"AzureChatOpenAI",
"BedrockChat",
"FakeListChatModel",
"PromptLayerChatOpenAI",
"ChatAnthropic",
@@ -49,4 +48,5 @@ __all__ = [
"ChatAnyscale",
"ChatLiteLLM",
"ErnieBotChat",
"ChatKonko",
]

@@ -0,0 +1,292 @@
"""KonkoAI chat wrapper."""
from __future__ import annotations
import logging
import os
from typing import (
Any,
Dict,
Iterator,
List,
Mapping,
Optional,
Set,
Tuple,
Union,
)
import requests
from langchain.adapters.openai import convert_dict_to_message, convert_message_to_dict
from langchain.callbacks.manager import (
CallbackManagerForLLMRun,
)
from langchain.chat_models.openai import ChatOpenAI, _convert_delta_to_message_chunk
from langchain.pydantic_v1 import Field, root_validator
from langchain.schema import ChatGeneration, ChatResult
from langchain.schema.messages import AIMessageChunk, BaseMessage
from langchain.schema.output import ChatGenerationChunk
from langchain.utils import get_from_dict_or_env
DEFAULT_API_BASE = "https://api.konko.ai/v1"
DEFAULT_MODEL = "meta-llama/Llama-2-13b-chat-hf"
logger = logging.getLogger(__name__)
class ChatKonko(ChatOpenAI):
"""`ChatKonko` Chat large language models API.
To use, you should have the ``konko`` python package installed, and the
    environment variables ``KONKO_API_KEY`` and ``OPENAI_API_KEY`` set with your API keys.
Any parameters that are valid to be passed to the konko.create call can be passed
in, even if not explicitly saved on this class.
Example:
.. code-block:: python
from langchain.chat_models import ChatKonko
llm = ChatKonko(model="meta-llama/Llama-2-13b-chat-hf")
"""
@property
def lc_secrets(self) -> Dict[str, str]:
return {"konko_api_key": "KONKO_API_KEY", "openai_api_key": "OPENAI_API_KEY"}
@property
def lc_serializable(self) -> bool:
return True
client: Any = None #: :meta private:
model: str = Field(default=DEFAULT_MODEL, alias="model")
"""Model name to use."""
temperature: float = 0.7
"""What sampling temperature to use."""
model_kwargs: Dict[str, Any] = Field(default_factory=dict)
"""Holds any model parameters valid for `create` call not explicitly specified."""
openai_api_key: Optional[str] = None
konko_api_key: Optional[str] = None
request_timeout: Optional[Union[float, Tuple[float, float]]] = None
"""Timeout for requests to Konko completion API."""
max_retries: int = 6
"""Maximum number of retries to make when generating."""
streaming: bool = False
"""Whether to stream the results or not."""
n: int = 1
"""Number of chat completions to generate for each prompt."""
max_tokens: int = 20
"""Maximum number of tokens to generate."""
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and python package exists in environment."""
values["konko_api_key"] = get_from_dict_or_env(
values, "konko_api_key", "KONKO_API_KEY"
)
try:
import konko
except ImportError:
raise ValueError(
"Could not import konko python package. "
"Please install it with `pip install konko`."
)
try:
values["client"] = konko.ChatCompletion
except AttributeError:
raise ValueError(
"`konko` has no `ChatCompletion` attribute, this is likely "
"due to an old version of the konko package. Try upgrading it "
"with `pip install --upgrade konko`."
)
if values["n"] < 1:
raise ValueError("n must be at least 1.")
if values["n"] > 1 and values["streaming"]:
raise ValueError("n must be 1 when streaming.")
return values
@property
def _default_params(self) -> Dict[str, Any]:
"""Get the default parameters for calling Konko API."""
return {
"model": self.model,
"request_timeout": self.request_timeout,
"max_tokens": self.max_tokens,
"stream": self.streaming,
"n": self.n,
"temperature": self.temperature,
**self.model_kwargs,
}
@staticmethod
def get_available_models(
konko_api_key: Optional[str] = None,
openai_api_key: Optional[str] = None,
konko_api_base: str = DEFAULT_API_BASE,
) -> Set[str]:
"""Get available models from Konko API."""
# Try to retrieve the OpenAI API key if it's not passed as an argument
if not openai_api_key:
try:
openai_api_key = os.environ["OPENAI_API_KEY"]
except KeyError:
pass # It's okay if it's not set, we just won't use it
# Try to retrieve the Konko API key if it's not passed as an argument
if not konko_api_key:
try:
konko_api_key = os.environ["KONKO_API_KEY"]
except KeyError:
raise ValueError(
"Konko API key must be passed as keyword argument or "
"set in environment variable KONKO_API_KEY."
)
models_url = f"{konko_api_base}/models"
headers = {
"Authorization": f"Bearer {konko_api_key}",
}
if openai_api_key:
headers["X-OpenAI-Api-Key"] = openai_api_key
models_response = requests.get(models_url, headers=headers)
if models_response.status_code != 200:
raise ValueError(
f"Error getting models from {models_url}: "
f"{models_response.status_code}"
)
return {model["id"] for model in models_response.json()["data"]}
def completion_with_retry(
self, run_manager: Optional[CallbackManagerForLLMRun] = None, **kwargs: Any
) -> Any:
def _completion_with_retry(**kwargs: Any) -> Any:
return self.client.create(**kwargs)
return _completion_with_retry(**kwargs)
def _combine_llm_outputs(self, llm_outputs: List[Optional[dict]]) -> dict:
overall_token_usage: dict = {}
for output in llm_outputs:
if output is None:
# Happens in streaming
continue
token_usage = output["token_usage"]
for k, v in token_usage.items():
if k in overall_token_usage:
overall_token_usage[k] += v
else:
overall_token_usage[k] = v
return {"token_usage": overall_token_usage, "model_name": self.model}
def _stream(
self,
messages: List[BaseMessage],
stop: Optional[List[str]] = None,
run_manager: Optional[CallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> Iterator[ChatGenerationChunk]:
message_dicts, params = self._create_message_dicts(messages, stop)
params = {**params, **kwargs, "stream": True}
default_chunk_class = AIMessageChunk
for chunk in self.completion_with_retry(
messages=message_dicts, run_manager=run_manager, **params
):
if len(chunk["choices"]) == 0:
continue
choice = chunk["choices"][0]
chunk = _convert_delta_to_message_chunk(
choice["delta"], default_chunk_class
)
finish_reason = choice.get("finish_reason")
generation_info = (
dict(finish_reason=finish_reason) if finish_reason is not None else None
)
default_chunk_class = chunk.__class__
yield ChatGenerationChunk(message=chunk, generation_info=generation_info)
if run_manager:
run_manager.on_llm_new_token(chunk.content, chunk=chunk)
def _generate(
self,
messages: List[BaseMessage],
stop: Optional[List[str]] = None,
run_manager: Optional[CallbackManagerForLLMRun] = None,
stream: Optional[bool] = None,
**kwargs: Any,
) -> ChatResult:
        should_stream = stream if stream is not None else self.streaming
        if should_stream:
generation: Optional[ChatGenerationChunk] = None
for chunk in self._stream(
messages=messages, stop=stop, run_manager=run_manager, **kwargs
):
if generation is None:
generation = chunk
else:
generation += chunk
assert generation is not None
return ChatResult(generations=[generation])
message_dicts, params = self._create_message_dicts(messages, stop)
params = {**params, **kwargs}
response = self.completion_with_retry(
messages=message_dicts, run_manager=run_manager, **params
)
return self._create_chat_result(response)
def _create_message_dicts(
self, messages: List[BaseMessage], stop: Optional[List[str]]
) -> Tuple[List[Dict[str, Any]], Dict[str, Any]]:
params = self._client_params
if stop is not None:
if "stop" in params:
raise ValueError("`stop` found in both the input and default params.")
params["stop"] = stop
message_dicts = [convert_message_to_dict(m) for m in messages]
return message_dicts, params
def _create_chat_result(self, response: Mapping[str, Any]) -> ChatResult:
generations = []
for res in response["choices"]:
message = convert_dict_to_message(res["message"])
gen = ChatGeneration(
message=message,
generation_info=dict(finish_reason=res.get("finish_reason")),
)
generations.append(gen)
token_usage = response.get("usage", {})
llm_output = {"token_usage": token_usage, "model_name": self.model}
return ChatResult(generations=generations, llm_output=llm_output)
@property
def _identifying_params(self) -> Dict[str, Any]:
"""Get the identifying parameters."""
return {**{"model_name": self.model}, **self._default_params}
@property
def _client_params(self) -> Dict[str, Any]:
"""Get the parameters used for the konko client."""
return {**self._default_params}
def _get_invocation_params(
self, stop: Optional[List[str]] = None, **kwargs: Any
) -> Dict[str, Any]:
"""Get the parameters used to invoke the model."""
return {
"model": self.model,
**super()._get_invocation_params(stop=stop),
**self._default_params,
**kwargs,
}
@property
def _llm_type(self) -> str:
"""Return type of chat model."""
return "konko-chat"

@@ -0,0 +1,178 @@
"""Evaluate ChatKonko Interface."""
from typing import Any
import pytest
from langchain.callbacks.manager import CallbackManager
from langchain.chat_models.konko import ChatKonko
from langchain.schema import (
ChatGeneration,
ChatResult,
LLMResult,
)
from langchain.schema.messages import BaseMessage, HumanMessage, SystemMessage
from tests.unit_tests.callbacks.fake_callback_handler import FakeCallbackHandler
def test_konko_chat_test() -> None:
"""Evaluate basic ChatKonko functionality."""
chat_instance = ChatKonko(max_tokens=10)
msg = HumanMessage(content="Hi")
chat_response = chat_instance([msg])
assert isinstance(chat_response, BaseMessage)
assert isinstance(chat_response.content, str)
def test_konko_chat_test_openai() -> None:
"""Evaluate basic ChatKonko functionality."""
chat_instance = ChatKonko(max_tokens=10, model="gpt-3.5-turbo")
msg = HumanMessage(content="Hi")
chat_response = chat_instance([msg])
assert isinstance(chat_response, BaseMessage)
assert isinstance(chat_response.content, str)
def test_konko_model_test() -> None:
"""Check how ChatKonko manages model_name."""
chat_instance = ChatKonko(model="alpha")
assert chat_instance.model == "alpha"
chat_instance = ChatKonko(model="beta")
assert chat_instance.model == "beta"
def test_konko_available_model_test() -> None:
"""Check how ChatKonko manages model_name."""
chat_instance = ChatKonko(max_tokens=10, n=2)
res = chat_instance.get_available_models()
assert isinstance(res, set)
def test_konko_system_msg_test() -> None:
"""Evaluate ChatKonko's handling of system messages."""
chat_instance = ChatKonko(max_tokens=10)
sys_msg = SystemMessage(content="Initiate user chat.")
user_msg = HumanMessage(content="Hi there")
chat_response = chat_instance([sys_msg, user_msg])
assert isinstance(chat_response, BaseMessage)
assert isinstance(chat_response.content, str)
def test_konko_generation_test() -> None:
"""Check ChatKonko's generation ability."""
chat_instance = ChatKonko(max_tokens=10, n=2)
msg = HumanMessage(content="Hi")
gen_response = chat_instance.generate([[msg], [msg]])
assert isinstance(gen_response, LLMResult)
assert len(gen_response.generations) == 2
for gen_list in gen_response.generations:
assert len(gen_list) == 2
for gen in gen_list:
assert isinstance(gen, ChatGeneration)
assert isinstance(gen.text, str)
assert gen.text == gen.message.content
def test_konko_multiple_outputs_test() -> None:
"""Test multiple completions with ChatKonko."""
chat_instance = ChatKonko(max_tokens=10, n=5)
msg = HumanMessage(content="Hi")
gen_response = chat_instance._generate([msg])
assert isinstance(gen_response, ChatResult)
assert len(gen_response.generations) == 5
for gen in gen_response.generations:
assert isinstance(gen.message, BaseMessage)
assert isinstance(gen.message.content, str)
def test_konko_streaming_callback_test() -> None:
"""Evaluate streaming's token callback functionality."""
callback_instance = FakeCallbackHandler()
callback_mgr = CallbackManager([callback_instance])
chat_instance = ChatKonko(
max_tokens=10,
streaming=True,
temperature=0,
callback_manager=callback_mgr,
verbose=True,
)
msg = HumanMessage(content="Hi")
chat_response = chat_instance([msg])
assert callback_instance.llm_streams > 0
assert isinstance(chat_response, BaseMessage)
def test_konko_streaming_info_test() -> None:
"""Ensure generation details are retained during streaming."""
class TestCallback(FakeCallbackHandler):
data_store: dict = {}
def on_llm_end(self, *args: Any, **kwargs: Any) -> Any:
self.data_store["generation"] = args[0]
callback_instance = TestCallback()
callback_mgr = CallbackManager([callback_instance])
chat_instance = ChatKonko(
max_tokens=2,
temperature=0,
callback_manager=callback_mgr,
)
list(chat_instance.stream("hey"))
gen_data = callback_instance.data_store["generation"]
assert gen_data.generations[0][0].text == " Hey"
def test_konko_llm_model_name_test() -> None:
"""Check if llm_output has model info."""
chat_instance = ChatKonko(max_tokens=10)
msg = HumanMessage(content="Hi")
llm_data = chat_instance.generate([[msg]])
assert llm_data.llm_output is not None
assert llm_data.llm_output["model_name"] == chat_instance.model
def test_konko_streaming_model_name_test() -> None:
"""Check model info during streaming."""
chat_instance = ChatKonko(max_tokens=10, streaming=True)
msg = HumanMessage(content="Hi")
llm_data = chat_instance.generate([[msg]])
assert llm_data.llm_output is not None
assert llm_data.llm_output["model_name"] == chat_instance.model
def test_konko_streaming_param_validation_test() -> None:
"""Ensure correct token callback during streaming."""
with pytest.raises(ValueError):
ChatKonko(
max_tokens=10,
streaming=True,
temperature=0,
n=5,
)
def test_konko_additional_args_test() -> None:
"""Evaluate extra arguments for ChatKonko."""
chat_instance = ChatKonko(extra=3, max_tokens=10)
assert chat_instance.max_tokens == 10
assert chat_instance.model_kwargs == {"extra": 3}
chat_instance = ChatKonko(extra=3, model_kwargs={"addition": 2})
assert chat_instance.model_kwargs == {"extra": 3, "addition": 2}
with pytest.raises(ValueError):
ChatKonko(extra=3, model_kwargs={"extra": 2})
with pytest.raises(ValueError):
ChatKonko(model_kwargs={"temperature": 0.2})
with pytest.raises(ValueError):
ChatKonko(model_kwargs={"model": "text-davinci-003"})
def test_konko_token_streaming_test() -> None:
"""Check token streaming for ChatKonko."""
chat_instance = ChatKonko(max_tokens=10)
for token in chat_instance.stream("Just a test"):
assert isinstance(token.content, str)