Add konko chat model (#10380)

pull/10383/head
Bagatur 12 months ago committed by GitHub
commit 5d8a689d5e

@@ -0,0 +1,164 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Konko\n",
"\n",
">[Konko](https://www.konko.ai/) API is a fully managed Web API designed to help application developers:\n",
"\n",
"Konko API is a fully managed API designed to help application developers:\n",
"\n",
"1. Select the right LLM(s) for their application\n",
"2. Prototype with various open-source and proprietary LLMs\n",
"3. Move to production in-line with their security, privacy, throughput, latency SLAs without infrastructure set-up or administration using Konko AI's SOC 2 compliant infrastructure\n",
"\n",
"\n",
"This example goes over how to use LangChain to interact with `Konko` [models](https://docs.konko.ai/docs/overview)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To run this notebook, you'll need Konko API key. You can request it by messaging support@konko.ai."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.chat_models import ChatKonko\n",
"from langchain.prompts.chat import (\n",
" ChatPromptTemplate,\n",
" SystemMessagePromptTemplate,\n",
" AIMessagePromptTemplate,\n",
" HumanMessagePromptTemplate,\n",
")\n",
"from langchain.schema import AIMessage, HumanMessage, SystemMessage"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Set API Keys\n",
"\n",
"<br />\n",
"\n",
"### Option 1: Set Environment Variables\n",
"\n",
"1. You can set environment variables for \n",
" 1. KONKO_API_KEY (Required)\n",
" 2. OPENAI_API_KEY (Optional)\n",
"2. In your current shell session, use the export command:\n",
"\n",
"```shell\n",
"export KONKO_API_KEY={your_KONKO_API_KEY_here}\n",
"export OPENAI_API_KEY={your_OPENAI_API_KEY_here} #Optional\n",
"```\n",
"\n",
"Alternatively, you can add the above lines directly to your shell startup script (such as .bashrc or .bash_profile for Bash shell and .zshrc for Zsh shell) to have them set automatically every time a new shell session starts.\n",
"\n",
"### Option 2: Set API Keys Programmatically\n",
"\n",
"If you prefer to set your API keys directly within your Python script or Jupyter notebook, you can use the following commands:\n",
"\n",
"```python\n",
"konko.set_api_key('your_KONKO_API_KEY_here') \n",
"konko.set_openai_api_key('your_OPENAI_API_KEY_here') # Optional\n",
"```\n"
]
},
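{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you'd rather set the keys from inside the notebook, here is a minimal sketch that uses the standard `os` and `getpass` modules instead of the `konko` SDK helpers above:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from getpass import getpass\n",
"\n",
"# Prompt for the Konko API key if it isn't already set in the environment\n",
"if \"KONKO_API_KEY\" not in os.environ:\n",
"    os.environ[\"KONKO_API_KEY\"] = getpass(\"Konko API key: \")\n",
"\n",
"# OPENAI_API_KEY is optional (see above); set it the same way if you need it\n",
"# if \"OPENAI_API_KEY\" not in os.environ:\n",
"#     os.environ[\"OPENAI_API_KEY\"] = getpass(\"OpenAI API key: \")"
]
},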
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Calling a model\n",
"\n",
"Find a model on the [Konko overview page](https://docs.konko.ai/docs/overview)\n",
"\n",
"For example, for this [LLama 2 model](https://docs.konko.ai/docs/meta-llama-2-13b-chat). The model id would be: `\"meta-llama/Llama-2-13b-chat-hf\"`\n",
"\n",
"Another way to find the list of models running on the Konko instance is through this [endpoint](https://docs.konko.ai/reference/listmodels).\n",
"\n",
"From here, we can initialize our model:\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"chat = ChatKonko(max_tokens=400, model = 'meta-llama/Llama-2-13b-chat-hf')"
]
},
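{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can also query the model list from LangChain itself: `ChatKonko.get_available_models()` wraps that models endpoint and returns the set of model IDs your Konko instance exposes (it reads `KONKO_API_KEY`, and optionally `OPENAI_API_KEY`, from the environment if they are not passed explicitly)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# List the model IDs available on your Konko instance\n",
"ChatKonko.get_available_models()"
]
},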
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\" Sure, I'd be happy to explain the Big Bang Theory briefly!\\n\\nThe Big Bang Theory is the leading explanation for the origin and evolution of the universe, based on a vast amount of observational evidence from many fields of science. In essence, the theory posits that the universe began as an infinitely hot and dense point, known as a singularity, around 13.8 billion years ago. This singularity expanded rapidly, and as it did, it cooled and formed subatomic particles, which eventually coalesced into the first atoms, and later into the stars and galaxies we see today.\\n\\nThe theory gets its name from the idea that the universe began in a state of incredibly high energy and temperature, and has been expanding and cooling ever since. This expansion is thought to have been driven by a mysterious force known as dark energy, which is thought to be responsible for the accelerating expansion of the universe.\\n\\nOne of the key predictions of the Big Bang Theory is that the universe should be homogeneous and isotropic on large scales, meaning that it should look the same in all directions and have the same properties everywhere. This prediction has been confirmed by a wealth of observational evidence, including the cosmic microwave background radiation, which is thought to be a remnant of the early universe.\\n\\nOverall, the Big Bang Theory is a well-established and widely accepted explanation for the origins of the universe, and it has been supported by a vast amount of observational evidence from many fields of science.\", additional_kwargs={}, example=False)"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"messages = [\n",
" SystemMessage(\n",
" content=\"You are a helpful assistant.\"\n",
" ),\n",
" HumanMessage(\n",
" content=\"Explain Big Bang Theory briefly\"\n",
" ),\n",
"]\n",
"chat(messages)"
]
},
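{
"cell_type": "markdown",
"metadata": {},
"source": [
"`ChatKonko` also supports streaming. A minimal sketch, reusing the `chat` instance from above with the generic `.stream()` helper:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Stream the reply token by token; each chunk is an AIMessageChunk\n",
"for chunk in chat.stream(\"Explain Big Bang Theory briefly\"):\n",
"    print(chunk.content, end=\"\", flush=True)"
]
},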
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
},
"vscode": {
"interpreter": {
"hash": "a0a0263b650d907a3bfe41c0f8d6a63a071b884df3cfdc1579f00cdc1aed6b03"
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}

@@ -0,0 +1,80 @@
# Konko
This page covers how to run models on Konko within LangChain.
Konko API is a fully managed API designed to help application developers:

1. Select the right LLM(s) for their application
2. Prototype with various open-source and proprietary LLMs
3. Move to production in-line with their security, privacy, throughput, latency SLAs without infrastructure set-up or administration using Konko AI's SOC 2 compliant infrastructure
## Installation and Setup
### First, you'll need an API key
You can request one by messaging [support@konko.ai](mailto:support@konko.ai)
### Install Konko AI's Python SDK
#### 1. Enable a Python 3.8+ environment
#### 2. Set API Keys
##### Option 1: Set Environment Variables
1. You can set environment variables for
   1. KONKO_API_KEY (Required)
   2. OPENAI_API_KEY (Optional)
2. In your current shell session, use the export command:
```shell
export KONKO_API_KEY={your_KONKO_API_KEY_here}
export OPENAI_API_KEY={your_OPENAI_API_KEY_here} #Optional
```
Alternatively, you can add the above lines directly to your shell startup script (such as .bashrc or .bash_profile for Bash shell and .zshrc for Zsh shell) to have them set automatically every time a new shell session starts.
##### Option 2: Set API Keys Programmatically
If you prefer to set your API keys directly within your Python script or Jupyter notebook, you can use the following commands:
```python
konko.set_api_key('your_KONKO_API_KEY_here')
konko.set_openai_api_key('your_OPENAI_API_KEY_here') # Optional
```
#### 3. Install the SDK
```shell
pip install konko
```
#### 4. Verify Installation & Authentication
```python
# Confirm konko has installed successfully
import konko

# Confirm API keys from Konko and OpenAI are set properly
konko.Model.list()
```
## Calling a model
Find a model on the [Konko Introduction page](https://docs.konko.ai/docs#available-models)
For example, the model ID for this [Llama 2 model](https://docs.konko.ai/docs/meta-llama-2-13b-chat) is `"meta-llama/Llama-2-13b-chat-hf"`.
Another way to find the list of models running on the Konko instance is through this [endpoint](https://docs.konko.ai/reference/listmodels).
From here, we can initialize our model:
```python
from langchain.chat_models import ChatKonko

chat_instance = ChatKonko(max_tokens=10, model="meta-llama/Llama-2-13b-chat-hf")
```
And run it:
```python
from langchain.schema import HumanMessage

msg = HumanMessage(content="Hi")
chat_response = chat_instance([msg])
```
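Reusing `chat_instance` and `msg` from above, here is a short sketch of batched generation with the generic `generate` method; for Konko models the returned `llm_output` carries `token_usage` and `model_name`:
```python
results = chat_instance.generate([[msg], [msg]])  # two independent prompts

for gen_list in results.generations:
    print(gen_list[0].message.content)

print(results.llm_output)  # {'token_usage': {...}, 'model_name': 'meta-llama/Llama-2-13b-chat-hf'}
```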

@@ -20,12 +20,12 @@ an interface where "chat messages" are the inputs and outputs.
from langchain.chat_models.anthropic import ChatAnthropic
from langchain.chat_models.anyscale import ChatAnyscale
from langchain.chat_models.azure_openai import AzureChatOpenAI
from langchain.chat_models.bedrock import BedrockChat
from langchain.chat_models.ernie import ErnieBotChat
from langchain.chat_models.fake import FakeListChatModel
from langchain.chat_models.google_palm import ChatGooglePalm
from langchain.chat_models.human import HumanInputChatModel
from langchain.chat_models.jinachat import JinaChat
from langchain.chat_models.konko import ChatKonko
from langchain.chat_models.litellm import ChatLiteLLM
from langchain.chat_models.mlflow_ai_gateway import ChatMLflowAIGateway
from langchain.chat_models.ollama import ChatOllama
@@ -36,7 +36,6 @@ from langchain.chat_models.vertexai import ChatVertexAI
__all__ = [
"ChatOpenAI",
"AzureChatOpenAI",
"BedrockChat",
"FakeListChatModel",
"PromptLayerChatOpenAI",
"ChatAnthropic",
@@ -49,4 +48,5 @@ __all__ = [
"ChatAnyscale",
"ChatLiteLLM",
"ErnieBotChat",
"ChatKonko",
]

@@ -0,0 +1,292 @@
"""KonkoAI chat wrapper."""
from __future__ import annotations
import logging
import os
from typing import (
Any,
Dict,
Iterator,
List,
Mapping,
Optional,
Set,
Tuple,
Union,
)
import requests
from langchain.adapters.openai import convert_dict_to_message, convert_message_to_dict
from langchain.callbacks.manager import (
CallbackManagerForLLMRun,
)
from langchain.chat_models.openai import ChatOpenAI, _convert_delta_to_message_chunk
from langchain.pydantic_v1 import Field, root_validator
from langchain.schema import ChatGeneration, ChatResult
from langchain.schema.messages import AIMessageChunk, BaseMessage
from langchain.schema.output import ChatGenerationChunk
from langchain.utils import get_from_dict_or_env
DEFAULT_API_BASE = "https://api.konko.ai/v1"
DEFAULT_MODEL = "meta-llama/Llama-2-13b-chat-hf"
logger = logging.getLogger(__name__)
class ChatKonko(ChatOpenAI):
"""`ChatKonko` Chat large language models API.
To use, you should have the ``konko`` python package installed, and the
    environment variables ``KONKO_API_KEY`` and ``OPENAI_API_KEY`` set with your API keys.
Any parameters that are valid to be passed to the konko.create call can be passed
in, even if not explicitly saved on this class.
Example:
.. code-block:: python
from langchain.chat_models import ChatKonko
llm = ChatKonko(model="meta-llama/Llama-2-13b-chat-hf")
"""
@property
def lc_secrets(self) -> Dict[str, str]:
return {"konko_api_key": "KONKO_API_KEY", "openai_api_key": "OPENAI_API_KEY"}
@property
def lc_serializable(self) -> bool:
return True
client: Any = None #: :meta private:
model: str = Field(default=DEFAULT_MODEL, alias="model")
"""Model name to use."""
temperature: float = 0.7
"""What sampling temperature to use."""
model_kwargs: Dict[str, Any] = Field(default_factory=dict)
"""Holds any model parameters valid for `create` call not explicitly specified."""
openai_api_key: Optional[str] = None
konko_api_key: Optional[str] = None
request_timeout: Optional[Union[float, Tuple[float, float]]] = None
"""Timeout for requests to Konko completion API."""
max_retries: int = 6
"""Maximum number of retries to make when generating."""
streaming: bool = False
"""Whether to stream the results or not."""
n: int = 1
"""Number of chat completions to generate for each prompt."""
max_tokens: int = 20
"""Maximum number of tokens to generate."""
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
"""Validate that api key and python package exists in environment."""
values["konko_api_key"] = get_from_dict_or_env(
values, "konko_api_key", "KONKO_API_KEY"
)
try:
import konko
except ImportError:
raise ValueError(
"Could not import konko python package. "
"Please install it with `pip install konko`."
)
try:
values["client"] = konko.ChatCompletion
except AttributeError:
raise ValueError(
"`konko` has no `ChatCompletion` attribute, this is likely "
"due to an old version of the konko package. Try upgrading it "
"with `pip install --upgrade konko`."
)
if values["n"] < 1:
raise ValueError("n must be at least 1.")
if values["n"] > 1 and values["streaming"]:
raise ValueError("n must be 1 when streaming.")
return values
@property
def _default_params(self) -> Dict[str, Any]:
"""Get the default parameters for calling Konko API."""
return {
"model": self.model,
"request_timeout": self.request_timeout,
"max_tokens": self.max_tokens,
"stream": self.streaming,
"n": self.n,
"temperature": self.temperature,
**self.model_kwargs,
}
@staticmethod
def get_available_models(
konko_api_key: Optional[str] = None,
openai_api_key: Optional[str] = None,
konko_api_base: str = DEFAULT_API_BASE,
) -> Set[str]:
"""Get available models from Konko API."""
# Try to retrieve the OpenAI API key if it's not passed as an argument
if not openai_api_key:
try:
openai_api_key = os.environ["OPENAI_API_KEY"]
except KeyError:
pass # It's okay if it's not set, we just won't use it
# Try to retrieve the Konko API key if it's not passed as an argument
if not konko_api_key:
try:
konko_api_key = os.environ["KONKO_API_KEY"]
except KeyError:
raise ValueError(
"Konko API key must be passed as keyword argument or "
"set in environment variable KONKO_API_KEY."
)
models_url = f"{konko_api_base}/models"
headers = {
"Authorization": f"Bearer {konko_api_key}",
}
if openai_api_key:
headers["X-OpenAI-Api-Key"] = openai_api_key
models_response = requests.get(models_url, headers=headers)
if models_response.status_code != 200:
raise ValueError(
f"Error getting models from {models_url}: "
f"{models_response.status_code}"
)
return {model["id"] for model in models_response.json()["data"]}
def completion_with_retry(
self, run_manager: Optional[CallbackManagerForLLMRun] = None, **kwargs: Any
) -> Any:
def _completion_with_retry(**kwargs: Any) -> Any:
return self.client.create(**kwargs)
return _completion_with_retry(**kwargs)
def _combine_llm_outputs(self, llm_outputs: List[Optional[dict]]) -> dict:
overall_token_usage: dict = {}
for output in llm_outputs:
if output is None:
# Happens in streaming
continue
token_usage = output["token_usage"]
for k, v in token_usage.items():
if k in overall_token_usage:
overall_token_usage[k] += v
else:
overall_token_usage[k] = v
return {"token_usage": overall_token_usage, "model_name": self.model}
def _stream(
self,
messages: List[BaseMessage],
stop: Optional[List[str]] = None,
run_manager: Optional[CallbackManagerForLLMRun] = None,
**kwargs: Any,
) -> Iterator[ChatGenerationChunk]:
message_dicts, params = self._create_message_dicts(messages, stop)
params = {**params, **kwargs, "stream": True}
default_chunk_class = AIMessageChunk
for chunk in self.completion_with_retry(
messages=message_dicts, run_manager=run_manager, **params
):
if len(chunk["choices"]) == 0:
continue
choice = chunk["choices"][0]
chunk = _convert_delta_to_message_chunk(
choice["delta"], default_chunk_class
)
finish_reason = choice.get("finish_reason")
generation_info = (
dict(finish_reason=finish_reason) if finish_reason is not None else None
)
default_chunk_class = chunk.__class__
yield ChatGenerationChunk(message=chunk, generation_info=generation_info)
if run_manager:
run_manager.on_llm_new_token(chunk.content, chunk=chunk)
def _generate(
self,
messages: List[BaseMessage],
stop: Optional[List[str]] = None,
run_manager: Optional[CallbackManagerForLLMRun] = None,
stream: Optional[bool] = None,
**kwargs: Any,
) -> ChatResult:
        should_stream = stream if stream is not None else self.streaming
        if should_stream:
generation: Optional[ChatGenerationChunk] = None
for chunk in self._stream(
messages=messages, stop=stop, run_manager=run_manager, **kwargs
):
if generation is None:
generation = chunk
else:
generation += chunk
assert generation is not None
return ChatResult(generations=[generation])
message_dicts, params = self._create_message_dicts(messages, stop)
params = {**params, **kwargs}
response = self.completion_with_retry(
messages=message_dicts, run_manager=run_manager, **params
)
return self._create_chat_result(response)
def _create_message_dicts(
self, messages: List[BaseMessage], stop: Optional[List[str]]
) -> Tuple[List[Dict[str, Any]], Dict[str, Any]]:
params = self._client_params
if stop is not None:
if "stop" in params:
raise ValueError("`stop` found in both the input and default params.")
params["stop"] = stop
message_dicts = [convert_message_to_dict(m) for m in messages]
return message_dicts, params
def _create_chat_result(self, response: Mapping[str, Any]) -> ChatResult:
generations = []
for res in response["choices"]:
message = convert_dict_to_message(res["message"])
gen = ChatGeneration(
message=message,
generation_info=dict(finish_reason=res.get("finish_reason")),
)
generations.append(gen)
token_usage = response.get("usage", {})
llm_output = {"token_usage": token_usage, "model_name": self.model}
return ChatResult(generations=generations, llm_output=llm_output)
@property
def _identifying_params(self) -> Dict[str, Any]:
"""Get the identifying parameters."""
return {**{"model_name": self.model}, **self._default_params}
@property
def _client_params(self) -> Dict[str, Any]:
"""Get the parameters used for the konko client."""
return {**self._default_params}
def _get_invocation_params(
self, stop: Optional[List[str]] = None, **kwargs: Any
) -> Dict[str, Any]:
"""Get the parameters used to invoke the model."""
return {
"model": self.model,
**super()._get_invocation_params(stop=stop),
**self._default_params,
**kwargs,
}
@property
def _llm_type(self) -> str:
"""Return type of chat model."""
return "konko-chat"

@@ -0,0 +1,178 @@
"""Evaluate ChatKonko Interface."""
from typing import Any
import pytest
from langchain.callbacks.manager import CallbackManager
from langchain.chat_models.konko import ChatKonko
from langchain.schema import (
ChatGeneration,
ChatResult,
LLMResult,
)
from langchain.schema.messages import BaseMessage, HumanMessage, SystemMessage
from tests.unit_tests.callbacks.fake_callback_handler import FakeCallbackHandler
def test_konko_chat_test() -> None:
"""Evaluate basic ChatKonko functionality."""
chat_instance = ChatKonko(max_tokens=10)
msg = HumanMessage(content="Hi")
chat_response = chat_instance([msg])
assert isinstance(chat_response, BaseMessage)
assert isinstance(chat_response.content, str)
def test_konko_chat_test_openai() -> None:
"""Evaluate basic ChatKonko functionality."""
chat_instance = ChatKonko(max_tokens=10, model="gpt-3.5-turbo")
msg = HumanMessage(content="Hi")
chat_response = chat_instance([msg])
assert isinstance(chat_response, BaseMessage)
assert isinstance(chat_response.content, str)
def test_konko_model_test() -> None:
"""Check how ChatKonko manages model_name."""
chat_instance = ChatKonko(model="alpha")
assert chat_instance.model == "alpha"
chat_instance = ChatKonko(model="beta")
assert chat_instance.model == "beta"
def test_konko_available_model_test() -> None:
"""Check how ChatKonko manages model_name."""
chat_instance = ChatKonko(max_tokens=10, n=2)
res = chat_instance.get_available_models()
assert isinstance(res, set)
def test_konko_system_msg_test() -> None:
"""Evaluate ChatKonko's handling of system messages."""
chat_instance = ChatKonko(max_tokens=10)
sys_msg = SystemMessage(content="Initiate user chat.")
user_msg = HumanMessage(content="Hi there")
chat_response = chat_instance([sys_msg, user_msg])
assert isinstance(chat_response, BaseMessage)
assert isinstance(chat_response.content, str)
def test_konko_generation_test() -> None:
"""Check ChatKonko's generation ability."""
chat_instance = ChatKonko(max_tokens=10, n=2)
msg = HumanMessage(content="Hi")
gen_response = chat_instance.generate([[msg], [msg]])
assert isinstance(gen_response, LLMResult)
assert len(gen_response.generations) == 2
for gen_list in gen_response.generations:
assert len(gen_list) == 2
for gen in gen_list:
assert isinstance(gen, ChatGeneration)
assert isinstance(gen.text, str)
assert gen.text == gen.message.content
def test_konko_multiple_outputs_test() -> None:
"""Test multiple completions with ChatKonko."""
chat_instance = ChatKonko(max_tokens=10, n=5)
msg = HumanMessage(content="Hi")
gen_response = chat_instance._generate([msg])
assert isinstance(gen_response, ChatResult)
assert len(gen_response.generations) == 5
for gen in gen_response.generations:
assert isinstance(gen.message, BaseMessage)
assert isinstance(gen.message.content, str)
def test_konko_streaming_callback_test() -> None:
"""Evaluate streaming's token callback functionality."""
callback_instance = FakeCallbackHandler()
callback_mgr = CallbackManager([callback_instance])
chat_instance = ChatKonko(
max_tokens=10,
streaming=True,
temperature=0,
callback_manager=callback_mgr,
verbose=True,
)
msg = HumanMessage(content="Hi")
chat_response = chat_instance([msg])
assert callback_instance.llm_streams > 0
assert isinstance(chat_response, BaseMessage)
def test_konko_streaming_info_test() -> None:
"""Ensure generation details are retained during streaming."""
class TestCallback(FakeCallbackHandler):
data_store: dict = {}
def on_llm_end(self, *args: Any, **kwargs: Any) -> Any:
self.data_store["generation"] = args[0]
callback_instance = TestCallback()
callback_mgr = CallbackManager([callback_instance])
chat_instance = ChatKonko(
max_tokens=2,
temperature=0,
callback_manager=callback_mgr,
)
list(chat_instance.stream("hey"))
gen_data = callback_instance.data_store["generation"]
assert gen_data.generations[0][0].text == " Hey"
def test_konko_llm_model_name_test() -> None:
"""Check if llm_output has model info."""
chat_instance = ChatKonko(max_tokens=10)
msg = HumanMessage(content="Hi")
llm_data = chat_instance.generate([[msg]])
assert llm_data.llm_output is not None
assert llm_data.llm_output["model_name"] == chat_instance.model
def test_konko_streaming_model_name_test() -> None:
"""Check model info during streaming."""
chat_instance = ChatKonko(max_tokens=10, streaming=True)
msg = HumanMessage(content="Hi")
llm_data = chat_instance.generate([[msg]])
assert llm_data.llm_output is not None
assert llm_data.llm_output["model_name"] == chat_instance.model
def test_konko_streaming_param_validation_test() -> None:
"""Ensure correct token callback during streaming."""
with pytest.raises(ValueError):
ChatKonko(
max_tokens=10,
streaming=True,
temperature=0,
n=5,
)
def test_konko_additional_args_test() -> None:
"""Evaluate extra arguments for ChatKonko."""
chat_instance = ChatKonko(extra=3, max_tokens=10)
assert chat_instance.max_tokens == 10
assert chat_instance.model_kwargs == {"extra": 3}
chat_instance = ChatKonko(extra=3, model_kwargs={"addition": 2})
assert chat_instance.model_kwargs == {"extra": 3, "addition": 2}
with pytest.raises(ValueError):
ChatKonko(extra=3, model_kwargs={"extra": 2})
with pytest.raises(ValueError):
ChatKonko(model_kwargs={"temperature": 0.2})
with pytest.raises(ValueError):
ChatKonko(model_kwargs={"model": "text-davinci-003"})
def test_konko_token_streaming_test() -> None:
"""Check token streaming for ChatKonko."""
chat_instance = ChatKonko(max_tokens=10)
for token in chat_instance.stream("Just a test"):
assert isinstance(token.content, str)