langchain/docs/extras/use_cases/autonomous_agents/hugginggpt.ipynb

{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# HuggingGPT\n",
    "Implementation of [HuggingGPT](https://github.com/microsoft/JARVIS). HuggingGPT is a system to connect LLMs (ChatGPT) with ML community (Hugging Face).\n",
    "\n",
    "+ 🔥 Paper: https://arxiv.org/abs/2303.17580\n",
    "+ 🚀 Project: https://github.com/microsoft/JARVIS\n",
    "+ 🤗 Space: https://huggingface.co/spaces/microsoft/HuggingGPT"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Set up tools\n",
    "\n",
    "We set up the tools available from [Transformers Agent](https://huggingface.co/docs/transformers/transformers_agents#tools). It includes a library of tools supported by Transformers and some customized tools such as image generator, video generator, text downloader and other tools."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from transformers import load_tool"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "hf_tools = [\n",
    "    load_tool(tool_name)\n",
    "    for tool_name in [\n",
    "        \"document-question-answering\",\n",
    "        \"image-captioning\",\n",
    "        \"image-question-answering\",\n",
    "        \"image-segmentation\",\n",
    "        \"speech-to-text\",\n",
    "        \"summarization\",\n",
    "        \"text-classification\",\n",
    "        \"text-question-answering\",\n",
    "        \"translation\",\n",
    "        \"huggingface-tools/text-to-image\",\n",
    "        \"huggingface-tools/text-to-video\",\n",
    "        \"text-to-speech\",\n",
    "        \"huggingface-tools/text-download\",\n",
    "        \"huggingface-tools/image-transformation\",\n",
    "    ]\n",
    "]"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup model and HuggingGPT\n",
    "\n",
    "We create an instance of HuggingGPT and use ChatGPT as the controller to rule the above tools."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.llms import OpenAI\n",
    "from langchain_experimental.autonomous_agents import HuggingGPT\n",
    "# %env OPENAI_API_BASE=http://localhost:8000/v1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "llm = OpenAI(model_name=\"gpt-3.5-turbo\")\n",
    "agent = HuggingGPT(llm, hf_tools)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Run an example\n",
    "\n",
    "Given a text, show a related image and video."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "agent.run(\"please show me a video and an image of 'a boy is running'\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "langchain",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.17"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}