mirror of
https://github.com/hwchase17/langchain
synced 2024-10-29 17:07:25 +00:00
13c62cf6b1
Co-authored-by: Max Cembalest <115359769+arthuractivemodeling@users.noreply.github.com>
465 lines
13 KiB
Plaintext
465 lines
13 KiB
Plaintext
{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "944e4194",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Arthur LangChain integration"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "b1ccdfe8",
|
|
"metadata": {},
|
|
"source": [
|
|
"[Arthur](https://www.arthur.ai/) is a model monitoring and observability platform.\n",
|
|
"\n",
|
|
"This notebook shows how to register LLMs (chat and non-chat) as models with the Arthur platform. Then we show how to set up langchain LLMs with an Arthur callback that will automatically log model inferences to Arthur.\n",
|
|
"\n",
|
|
"For more information about how to use the Arthur SDK, visit our [docs](http://docs.arthur.ai), in particular our [model onboarding guide](https://docs.arthur.ai/user-guide/walkthroughs/model-onboarding/index.html)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 21,
|
|
"id": "961c6691",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from langchain.callbacks import ArthurCallbackHandler\n",
|
|
"from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
|
|
"from langchain.chat_models import ChatOpenAI, ChatAnthropic\n",
|
|
"from langchain.schema import HumanMessage\n",
|
|
"from langchain.llms import OpenAI, Cohere, HuggingFacePipeline"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 3,
|
|
"id": "a23d1963",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from arthurai import ArthurAI\n",
|
|
"from arthurai.common.constants import InputType, OutputType, Stage, ValueType\n",
|
|
"from arthurai.core.attributes import ArthurAttribute, AttributeCategory"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "4d1b90c0",
|
|
"metadata": {},
|
|
"source": [
|
|
"# ArthurModel for chatbot with only input text and output text attributes"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "1a4a4a8a",
|
|
"metadata": {},
|
|
"source": [
|
|
"Connect to Arthur client"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 4,
|
|
"id": "f49e9b79",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"arthur_url = \"https://app.arthur.ai\"\n",
|
|
"arthur_login = \"your-username-here\"\n",
|
|
"arthur = ArthurAI(url=arthur_url, login=arthur_login)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "c6e063bf",
|
|
"metadata": {},
|
|
"source": [
|
|
"Before you can register model inferences to Arthur, you must have a registered model with an ID in the Arthur platform. We will provide this ID to the ArthurCallbackHandler.\n",
|
|
"\n",
|
|
"You can register a model with Arthur here in the notebook using this `register_chat_llm()` function. This function returns the ID of the model saved to the platform. To use the function, uncomment `arthur_model_chatbot_id = register_chat_llm()` in the cell below."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 5,
|
|
"id": "31b17b5e",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"def register_chat_llm():\n",
|
|
"\n",
|
|
" arthur_model = arthur.model(\n",
|
|
" display_name=\"LangChainChat\",\n",
|
|
" input_type=InputType.NLP,\n",
|
|
" output_type=OutputType.TokenSequence\n",
|
|
" )\n",
|
|
"\n",
|
|
" arthur_model._add_attribute_to_model(ArthurAttribute(\n",
|
|
" name=\"my_input_text\",\n",
|
|
" stage=Stage.ModelPipelineInput,\n",
|
|
" value_type=ValueType.Unstructured_Text,\n",
|
|
" categorical=True,\n",
|
|
" is_unique=True\n",
|
|
" ))\n",
|
|
" arthur_model._add_attribute_to_model(ArthurAttribute(\n",
|
|
" name=\"my_output_text\",\n",
|
|
" stage=Stage.PredictedValue,\n",
|
|
" value_type=ValueType.Unstructured_Text,\n",
|
|
" categorical=True,\n",
|
|
" is_unique=False,\n",
|
|
" ))\n",
|
|
" \n",
|
|
" return arthur_model.save()\n",
|
|
"# arthur_model_chatbot_id = register_chat_llm()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "0d1d1e60",
|
|
"metadata": {},
|
|
"source": [
|
|
"Alternatively, you can set the `arthur_model_chatbot_id` variable to be the ID of your model on your [model dashboard](https://app.arthur.ai/)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 1,
|
|
"id": "cdfa02c8",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"arthur_model_chatbot_id = \"your-model-id-here\""
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "58be5234",
|
|
"metadata": {},
|
|
"source": [
|
|
"This function creates a Langchain chat LLM with the ArthurCallbackHandler to log inferences to Arthur. We provide our `arthur_model_chatbot_id`, as well as the Arthur url and login we are using."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 7,
|
|
"id": "448a8fee",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"def make_langchain_chat_llm(chat_model=ChatOpenAI):\n",
|
|
" if chat_model not in [ChatOpenAI, ChatAnthropic]:\n",
|
|
" raise ValueError(\"For this notebook, use one of the chat models imported from langchain.chat_models\")\n",
|
|
" return chat_model(\n",
|
|
" streaming=True, \n",
|
|
" temperature=0.1,\n",
|
|
" callbacks=[\n",
|
|
" StreamingStdOutCallbackHandler(), \n",
|
|
" ArthurCallbackHandler.from_credentials(arthur_model_chatbot_id, arthur_url=arthur_url, arthur_login=arthur_login)\n",
|
|
" ])\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "17c182da",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": []
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 8,
|
|
"id": "2dfc00ed",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"chat_llm = make_langchain_chat_llm()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "139291f2",
|
|
"metadata": {},
|
|
"source": [
|
|
"Run the chatbot (it will save the chat history in the `history` list so that the conversation can reference earlier messages)\n",
|
|
"\n",
|
|
"Type `q` to quit"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 9,
|
|
"id": "7480a443",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"def run_langchain_chat_llm(llm):\n",
|
|
" history = []\n",
|
|
" while True:\n",
|
|
" user_input = input(\"\\n>>> input >>>\\n>>>: \")\n",
|
|
" if user_input == 'q': break\n",
|
|
" history.append(HumanMessage(content=user_input))\n",
|
|
" history.append(llm(history))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 10,
|
|
"id": "6868ce71",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"run_langchain_chat_llm(chat_llm)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "a0be7d01",
|
|
"metadata": {},
|
|
"source": [
|
|
"# ArthurModel with input text, output text, token likelihoods, finish reason, and amount of token usage attributes"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "1ee4b741",
|
|
"metadata": {},
|
|
"source": [
|
|
"This function registers an LLM with additional metadata attributes to log to Arthur with each inference\n",
|
|
"\n",
|
|
"As above, you can register your callback handler for an LLM using this function here in the notebook or by pasting the ID of an already-registered model from your [model dashboard](https://app.arthur.ai/)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 11,
|
|
"id": "e671836c",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"def register_llm():\n",
|
|
"\n",
|
|
" arthur_model = arthur.model(\n",
|
|
" display_name=\"LangChainLLM\",\n",
|
|
" input_type=InputType.NLP,\n",
|
|
" output_type=OutputType.TokenSequence\n",
|
|
" )\n",
|
|
" arthur_model._add_attribute_to_model(ArthurAttribute(\n",
|
|
" name=\"my_input_text\",\n",
|
|
" stage=Stage.ModelPipelineInput,\n",
|
|
" value_type=ValueType.Unstructured_Text,\n",
|
|
" categorical=True,\n",
|
|
" is_unique=True\n",
|
|
" ))\n",
|
|
" arthur_model._add_attribute_to_model(ArthurAttribute(\n",
|
|
" name=\"my_output_text\",\n",
|
|
" stage=Stage.PredictedValue,\n",
|
|
" value_type=ValueType.Unstructured_Text,\n",
|
|
" categorical=True,\n",
|
|
" is_unique=False,\n",
|
|
" token_attribute_link=\"my_output_likelihoods\"\n",
|
|
" ))\n",
|
|
" arthur_model._add_attribute_to_model(ArthurAttribute(\n",
|
|
" name=\"my_output_likelihoods\",\n",
|
|
" stage=Stage.PredictedValue,\n",
|
|
" value_type=ValueType.TokenLikelihoods,\n",
|
|
" token_attribute_link=\"my_output_text\"\n",
|
|
" ))\n",
|
|
" arthur_model._add_attribute_to_model(ArthurAttribute(\n",
|
|
" name=\"finish_reason\",\n",
|
|
" stage=Stage.NonInputData,\n",
|
|
" value_type=ValueType.String,\n",
|
|
" categorical=True,\n",
|
|
" categories=[\n",
|
|
" AttributeCategory(value='stop'),\n",
|
|
" AttributeCategory(value='length'),\n",
|
|
" AttributeCategory(value='content_filter'),\n",
|
|
" AttributeCategory(value='null')\n",
|
|
" ]\n",
|
|
" ))\n",
|
|
" arthur_model._add_attribute_to_model(ArthurAttribute(\n",
|
|
" name=\"prompt_tokens\",\n",
|
|
" stage=Stage.NonInputData,\n",
|
|
" value_type=ValueType.Integer\n",
|
|
" ))\n",
|
|
" arthur_model._add_attribute_to_model(ArthurAttribute(\n",
|
|
" name=\"completion_tokens\",\n",
|
|
" stage=Stage.NonInputData,\n",
|
|
" value_type=ValueType.Integer\n",
|
|
" ))\n",
|
|
" arthur_model._add_attribute_to_model(ArthurAttribute(\n",
|
|
" name=\"duration\",\n",
|
|
" stage=Stage.NonInputData,\n",
|
|
" value_type=ValueType.Float\n",
|
|
" ))\n",
|
|
" \n",
|
|
" return arthur_model.save()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 12,
|
|
"id": "2a6686f7",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"arthur_model_llm_id = \"your-model-id-here\""
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "2dcacb96",
|
|
"metadata": {},
|
|
"source": [
|
|
"These functions create Langchain LLMs with the ArthurCallbackHandler to log inferences to Arthur.\n",
|
|
"\n",
|
|
"There are small differences in the underlying Langchain integrations with these libraries and the available metadata for model inputs & outputs"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 23,
|
|
"id": "34cf0072",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"def make_langchain_openai_llm():\n",
|
|
" return OpenAI(\n",
|
|
" temperature=0.1,\n",
|
|
" model_kwargs = {'logprobs': 3},\n",
|
|
" callbacks=[\n",
|
|
" ArthurCallbackHandler.from_credentials(arthur_model_llm_id, arthur_url=arthur_url, arthur_login=arthur_login)\n",
|
|
" ])\n",
|
|
"\n",
|
|
"def make_langchain_cohere_llm():\n",
|
|
" return Cohere(\n",
|
|
" temperature=0.1,\n",
|
|
" callbacks=[\n",
|
|
" ArthurCallbackHandler.from_credentials(arthur_model_chatbot_id, arthur_url=arthur_url, arthur_login=arthur_login)\n",
|
|
" ])\n",
|
|
"\n",
|
|
"def make_langchain_huggingface_llm():\n",
|
|
" llm = HuggingFacePipeline.from_model_id(\n",
|
|
" model_id=\"bert-base-uncased\", \n",
|
|
" task=\"text-generation\", \n",
|
|
" model_kwargs={\"temperature\":2.5, \"max_length\":64})\n",
|
|
" llm.callbacks = [\n",
|
|
" ArthurCallbackHandler.from_credentials(arthur_model_chatbot_id, arthur_url=arthur_url, arthur_login=arthur_login)\n",
|
|
" ]\n",
|
|
" return llm"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 24,
|
|
"id": "f40c3ce0",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"openai_llm = make_langchain_openai_llm()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 25,
|
|
"id": "8476d531",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"cohere_llm = make_langchain_cohere_llm()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 2,
|
|
"id": "7483b9d3",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"huggingface_llm = make_langchain_huggingface_llm()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "c17d8e86",
|
|
"metadata": {},
|
|
"source": [
|
|
"Run the LLM (each completion is independent, no chat history is saved as we were doing above with the chat llms)\n",
|
|
"\n",
|
|
"Type `q` to quit"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 17,
|
|
"id": "72ee0790",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"def run_langchain_llm(llm):\n",
|
|
" while True:\n",
|
|
" print(\"Type your text for completion:\\n\")\n",
|
|
" user_input = input(\"\\n>>> input >>>\\n>>>: \")\n",
|
|
" if user_input == 'q': break\n",
|
|
" print(llm(user_input), \"\\n================\\n\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 18,
|
|
"id": "fb864057",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"run_langchain_llm(openai_llm)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 19,
|
|
"id": "e6673769",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"run_langchain_llm(cohere_llm)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 1,
|
|
"id": "85541f1c",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"run_langchain_llm(huggingface_llm)"
|
|
]
|
|
}
|
|
],
|
|
"metadata": {
|
|
"kernelspec": {
|
|
"display_name": "Python 3 (ipykernel)",
|
|
"language": "python",
|
|
"name": "python3"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.10.8"
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 5
|
|
}
|