mirror of
https://github.com/hwchase17/langchain
synced 2024-10-31 15:20:26 +00:00
dff00ea91e
Still working out interface/notebooks + need discord data dump to test out things other than copy+paste Update: - Going to remove the 'user_id' arg in the loaders themselves and just standardize on putting the "sender" arg in the extra kwargs. Then can provide a utility function to map these to ai and human messages - Going to move the discord one into just a notebook since I don't have a good dump to test on and copy+paste maybe isn't the greatest thing to support in v0 - Need to do more testing on slack since it seems the dump only includes channels and NOT 1 on 1 convos - --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
164 lines
5.2 KiB
Plaintext
{
"cells": [
{
"cell_type": "markdown",
"id": "01fcfa2f-33a9-48f3-835a-b1956c394d6b",
"metadata": {},
"source": [
"# Slack\n",
"\n",
"This notebook shows how to use the Slack chat loader. This class helps map exported Slack conversations to LangChain chat messages.\n",
"\n",
"The process has three steps:\n",
"1. Export the desired conversation thread by following the [instructions here](https://slack.com/help/articles/1500001548241-Request-to-export-all-conversations).\n",
"2. Create the `SlackChatLoader` with the file path pointing to the JSON file or directory of JSON files.\n",
"3. Call `loader.load()` (or `loader.lazy_load()`) to perform the conversion. Optionally use `merge_chat_runs` to combine consecutive messages from the same sender, and/or `map_ai_messages` to convert messages from the specified sender to the \"AIMessage\" class.\n",
"\n",
"## 1. Create message dump\n",
"\n",
"Currently (2023/08/23) this loader best supports a zip directory of files in the format generated by exporting a direct message conversation from Slack. Follow the up-to-date instructions from Slack on how to do so.\n",
"\n",
"We have an example in the LangChain repo."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "a79d35bf-5f21-4063-84bf-a60845c1c51f",
"metadata": {},
"outputs": [],
"source": [
"import requests\n",
"\n",
"permalink = \"https://raw.githubusercontent.com/langchain-ai/langchain/342087bdfa3ac31d622385d0f2d09cf5e06c8db3/libs/langchain/tests/integration_tests/examples/slack_export.zip\"\n",
"response = requests.get(permalink)\n",
"response.raise_for_status()  # fail early if the download did not succeed\n",
"with open(\"slack_dump.zip\", \"wb\") as f:\n",
"    f.write(response.content)"
]
},
{
"cell_type": "markdown",
"id": "cf60f703-76f1-4602-a723-02c59535c1af",
"metadata": {},
"source": [
"## 2. Create the Chat Loader\n",
"\n",
"Provide the loader with the file path to the zip directory. Mapping a user ID to AI messages and merging message runs are configured separately, through the utility functions applied when loading messages in step 3."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "4b8b432a-d2bc-49e1-b35f-761730a8fd6d",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chat_loaders.slack import SlackChatLoader"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "8ec6661b-0aca-48ae-9e2b-6412856c287b",
"metadata": {},
"outputs": [],
"source": [
"loader = SlackChatLoader(\n",
"    path=\"slack_dump.zip\",\n",
")"
]
},
{
"cell_type": "markdown",
"id": "8805a7c5-84b4-49f5-8989-0022f2054ace",
"metadata": {},
"source": [
"## 3. Load messages\n",
"\n",
"The `load()` (or `lazy_load()`) method returns the loaded \"ChatSessions\" (`load()` as a list, `lazy_load()` as an iterator). Each session currently just contains the list of messages for one loaded conversation."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "fcd69b3e-020d-4a15-8a0d-61c2d34e1ee1",
"metadata": {},
"outputs": [],
"source": [
"from typing import List\n",
"from langchain.chat_loaders.base import ChatSession\n",
"from langchain.chat_loaders.utils import (\n",
"    map_ai_messages,\n",
"    merge_chat_runs,\n",
")\n",
"\n",
"raw_messages = loader.lazy_load()\n",
"# Merge consecutive messages from the same sender into a single message\n",
"merged_messages = merge_chat_runs(raw_messages)\n",
"# Convert messages from \"U0500003428\" to AI messages\n",
"messages: List[ChatSession] = list(map_ai_messages(merged_messages, sender=\"U0500003428\"))"
]
},
{
"cell_type": "markdown",
"id": "7d033f87-cd0c-4f44-a753-41b871c1e919",
"metadata": {},
"source": [
"### Next Steps\n",
"\n",
"You can then use these messages as you see fit, for example to fine-tune a model, select few-shot examples, or directly predict the next message."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "7d8a1629-5d9e-49b3-b978-3add57027d59",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Hi, \n",
"\n",
"I hope you're doing well. I wanted to reach out and ask if you'd be available to meet up for coffee sometime next week. I'd love to catch up and hear about what's been going on in your life. Let me know if you're interested and we can find a time that works for both of us. \n",
"\n",
"Looking forward to hearing from you!\n",
"\n",
"Best, [Your Name]"
]
}
],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"\n",
"llm = ChatOpenAI()\n",
"\n",
"for chunk in llm.stream(messages[1]['messages']):\n",
"    print(chunk.content, end=\"\", flush=True)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
}
},
"nbformat": 4,
"nbformat_minor": 5
}