You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
langchain/docs/modules/prompts/prompt_templates/examples/connecting_to_a_feature_sto...

475 lines
14 KiB
Plaintext

{
"cells": [
{
"cell_type": "markdown",
"id": "a792b119",
"metadata": {},
"source": [
"# Connecting to a Feature Store\n",
"\n",
"Feature stores are a concept from traditional machine learning that make sure data fed into models is up-to-date and relevant. For more on this, see [here](https://www.tecton.ai/blog/what-is-a-feature-store/).\n",
"\n",
"This concept is extremely relevant when considering putting LLM applications in production. In order to personalize LLM applications, you may want to combine LLMs with up-to-date information about particular users. Feature stores can be a great way to keep that data fresh, and LangChain provides an easy way to combine that data with LLMs.\n",
"\n",
"In this notebook we will show how to connect prompt templates to feature stores. The basic idea is to call a feature store from inside a prompt template to retrieve values that are then formatted into the prompt."
]
},
{
"cell_type": "markdown",
"id": "ad0b5edf",
"metadata": {
"tags": []
},
"source": [
"## Feast\n",
"\n",
"To start, we will use the popular open source feature store framework [Feast](https://github.com/feast-dev/feast).\n",
"\n",
"This assumes you have already run the steps in the README around getting started. We will build of off that example in getting started, and create and LLMChain to write a note to a specific driver regarding their up-to-date statistics."
]
},
{
"cell_type": "markdown",
"id": "7f02f6f3",
"metadata": {},
"source": [
"### Load Feast Store\n",
"\n",
"Again, this should be set up according to the instructions in the Feast README"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "fd1a452a",
"metadata": {},
"outputs": [],
"source": [
"from feast import FeatureStore\n",
"\n",
"# You may need to update the path depending on where you stored it\n",
"feast_repo_path = \"../../../../../my_feature_repo/feature_repo/\"\n",
"store = FeatureStore(repo_path=feast_repo_path)"
]
},
{
"cell_type": "markdown",
"id": "cfe8aae5",
"metadata": {},
"source": [
"### Prompts\n",
"\n",
"Here we will set up a custom FeastPromptTemplate. This prompt template will take in a driver id, look up their stats, and format those stats into a prompt.\n",
"\n",
"Note that the input to this prompt template is just `driver_id`, since that is the only user defined piece (all other variables are looked up inside the prompt template)."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "5e9cee04",
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate, StringPromptTemplate"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "594a3cf3",
"metadata": {},
"outputs": [],
"source": [
"template = \"\"\"Given the driver's up to date stats, write them note relaying those stats to them.\n",
"If they have a conversation rate above .5, give them a compliment. Otherwise, make a silly joke about chickens at the end to make them feel better\n",
"\n",
"Here are the drivers stats:\n",
"Conversation rate: {conv_rate}\n",
"Acceptance rate: {acc_rate}\n",
"Average Daily Trips: {avg_daily_trips}\n",
"\n",
"Your response:\"\"\"\n",
"prompt = PromptTemplate.from_template(template)"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "8464c731",
"metadata": {},
"outputs": [],
"source": [
"class FeastPromptTemplate(StringPromptTemplate):\n",
" \n",
" def format(self, **kwargs) -> str:\n",
" driver_id = kwargs.pop(\"driver_id\")\n",
" feature_vector = store.get_online_features(\n",
" features=[\n",
" 'driver_hourly_stats:conv_rate',\n",
" 'driver_hourly_stats:acc_rate',\n",
" 'driver_hourly_stats:avg_daily_trips'\n",
" ],\n",
" entity_rows=[{\"driver_id\": 1001}]\n",
" ).to_dict()\n",
" kwargs[\"conv_rate\"] = feature_vector[\"conv_rate\"][0]\n",
" kwargs[\"acc_rate\"] = feature_vector[\"acc_rate\"][0]\n",
" kwargs[\"avg_daily_trips\"] = feature_vector[\"avg_daily_trips\"][0]\n",
" return prompt.format(**kwargs)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "c0c7bae2",
"metadata": {},
"outputs": [],
"source": [
"prompt_template = FeastPromptTemplate(input_variables=[\"driver_id\"])"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "d8d70bb7",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Given the driver's up to date stats, write them note relaying those stats to them.\n",
"If they have a conversation rate above .5, give them a compliment. Otherwise, make a silly joke about chickens at the end to make them feel better\n",
"\n",
"Here are the drivers stats:\n",
"Conversation rate: 0.4745151400566101\n",
"Acceptance rate: 0.055561766028404236\n",
"Average Daily Trips: 936\n",
"\n",
"Your response:\n"
]
}
],
"source": [
"print(prompt_template.format(driver_id=1001))"
]
},
{
"cell_type": "markdown",
"id": "2870d070",
"metadata": {},
"source": [
"### Use in a chain\n",
"\n",
"We can now use this in a chain, successfully creating a chain that achieves personalization backed by a feature store"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "7106255c",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.chains import LLMChain"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "79543326",
"metadata": {},
"outputs": [],
"source": [
"chain = LLMChain(llm=ChatOpenAI(), prompt=prompt_template)"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "97a741a0",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"Hi there! I wanted to update you on your current stats. Your acceptance rate is 0.055561766028404236 and your average daily trips are 936. While your conversation rate is currently 0.4745151400566101, I have no doubt that with a little extra effort, you'll be able to exceed that .5 mark! Keep up the great work! And remember, even chickens can't always cross the road, but they still give it their best shot.\""
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(1001)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "12e59aaf",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "c4049990-651d-44d3-82b1-0cd122da55c1",
"metadata": {},
"source": [
"## Tecton\n",
"\n",
"Above, we showed how you could use Feast, a popular open source and self-managed feature store, with LangChain. Our examples below will show a similar integration using Tecton. Tecton is a fully managed feature platform built to orchestrate the complete ML feature lifecycle, from transformation to online serving, with enterprise-grade SLAs."
]
},
{
"cell_type": "markdown",
"id": "7bb4dba1-0678-4ea4-be0a-d353c0b13fc2",
"metadata": {
"tags": []
},
"source": [
"### Prerequisites\n",
"\n",
"* Tecton Deployment (sign up at [https://tecton.ai](https://tecton.ai))\n",
"* `TECTON_API_KEY` environment variable set to a valid Service Account key"
]
},
{
"cell_type": "markdown",
"id": "ac9eb618-8c52-4cd6-bb8e-9c99a150dfa6",
"metadata": {
"tags": []
},
"source": [
"### Define and Load Features\n",
"\n",
"We will use the user_transaction_counts Feature View from the [Tecton tutorial](https://docs.tecton.ai/docs/tutorials/tecton-fundamentals) as part of a Feature Service. For simplicity, we are only using a single Feature View; however, more sophisticated applications may require more feature views to retrieve the features needed for its prompt.\n",
"\n",
"```python\n",
"user_transaction_metrics = FeatureService(\n",
" name = \"user_transaction_metrics\",\n",
" features = [user_transaction_counts]\n",
")\n",
"```\n",
"\n",
"The above Feature Service is expected to be [applied to a live workspace](https://docs.tecton.ai/docs/applying-feature-repository-changes-to-a-workspace). For this example, we will be using the \"prod\" workspace."
]
},
{
"cell_type": "code",
"execution_count": 60,
"id": "32e9675d-a7e5-429f-906f-2260294d3e46",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import tecton\n",
"\n",
"workspace = tecton.get_workspace(\"prod\")\n",
"feature_service = workspace.get_feature_service(\"user_transaction_metrics\")"
]
},
{
"cell_type": "markdown",
"id": "29b7550c-0eb4-4bd1-a501-1c63fb77aa56",
"metadata": {},
"source": [
"### Prompts\n",
"\n",
"Here we will set up a custom TectonPromptTemplate. This prompt template will take in a user_id , look up their stats, and format those stats into a prompt.\n",
"\n",
"Note that the input to this prompt template is just `user_id`, since that is the only user defined piece (all other variables are looked up inside the prompt template)."
]
},
{
"cell_type": "code",
"execution_count": 61,
"id": "6fb77ea4-64c6-4e48-a783-bd1ece021b82",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate, StringPromptTemplate"
]
},
{
"cell_type": "code",
"execution_count": 77,
"id": "02a98fbc-8135-4b11-bf60-85d28e426667",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"template = \"\"\"Given the vendor's up to date transaction stats, write them a note based on the following rules:\n",
"\n",
"1. If they had a transaction in the last day, write a short congratulations message on their recent sales\n",
"2. If no transaction in the last day, but they had a transaction in the last 30 days, playfully encourage them to sell more.\n",
"3. Always add a silly joke about chickens at the end\n",
"\n",
"Here are the vendor's stats:\n",
"Number of Transactions Last Day: {transaction_count_1d}\n",
"Number of Transactions Last 30 Days: {transaction_count_30d}\n",
"\n",
"Your response:\"\"\"\n",
"prompt = PromptTemplate.from_template(template)"
]
},
{
"cell_type": "code",
"execution_count": 78,
"id": "a35cdfd5-6ccc-4394-acfe-60d53804be51",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"class TectonPromptTemplate(StringPromptTemplate):\n",
" \n",
" def format(self, **kwargs) -> str:\n",
" user_id = kwargs.pop(\"user_id\")\n",
" feature_vector = feature_service.get_online_features(join_keys={\"user_id\": user_id}).to_dict()\n",
" kwargs[\"transaction_count_1d\"] = feature_vector[\"user_transaction_counts.transaction_count_1d_1d\"]\n",
" kwargs[\"transaction_count_30d\"] = feature_vector[\"user_transaction_counts.transaction_count_30d_1d\"]\n",
" return prompt.format(**kwargs)"
]
},
{
"cell_type": "code",
"execution_count": 79,
"id": "d5915df0-fb16-4770-8a82-22f885b74d1a",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"prompt_template = TectonPromptTemplate(input_variables=[\"user_id\"])"
]
},
{
"cell_type": "code",
"execution_count": 80,
"id": "a36abfc8-ea60-4ae0-a36d-d7b639c7307c",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Given the vendor's up to date transaction stats, write them a note based on the following rules:\n",
"\n",
"1. If they had a transaction in the last day, write a short congratulations message on their recent sales\n",
"2. If no transaction in the last day, but they had a transaction in the last 30 days, playfully encourage them to sell more.\n",
"3. Always add a silly joke about chickens at the end\n",
"\n",
"Here are the vendor's stats:\n",
"Number of Transactions Last Day: 657\n",
"Number of Transactions Last 30 Days: 20326\n",
"\n",
"Your response:\n"
]
}
],
"source": [
"print(prompt_template.format(user_id=\"user_469998441571\"))"
]
},
{
"cell_type": "markdown",
"id": "f8d4b905-1051-4303-9c33-8eddb65c1274",
"metadata": {
"tags": []
},
"source": [
"### Use in a chain\n",
"\n",
"We can now use this in a chain, successfully creating a chain that achieves personalization backed by the Tecton Feature Platform"
]
},
{
"cell_type": "code",
"execution_count": 81,
"id": "ffb60cd0-8e3c-4c9d-b639-43d766e12c4c",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.chains import LLMChain"
]
},
{
"cell_type": "code",
"execution_count": 82,
"id": "3918abc7-00b5-466f-bdfc-ab046cd282da",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"chain = LLMChain(llm=ChatOpenAI(), prompt=prompt_template)"
]
},
{
"cell_type": "code",
"execution_count": 83,
"id": "e7d91c4b-3e99-40cc-b3e9-a004c8c9193e",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"'Wow, congratulations on your recent sales! Your business is really soaring like a chicken on a hot air balloon! Keep up the great work!'"
]
},
"execution_count": 83,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.run(\"user_469998441571\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f752b924-caf9-4f7a-b78b-cb8c8ada8c2e",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}