{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# HuggingGPT\n", "Implementation of [HuggingGPT](https://github.com/microsoft/JARVIS). HuggingGPT is a system to connect LLMs (ChatGPT) with ML community (Hugging Face).\n", "\n", "+ 🔥 Paper: https://arxiv.org/abs/2303.17580\n", "+ 🚀 Project: https://github.com/microsoft/JARVIS\n", "+ 🤗 Space: https://huggingface.co/spaces/microsoft/HuggingGPT" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Set up tools\n", "\n", "We set up the tools available from [Transformers Agent](https://huggingface.co/docs/transformers/transformers_agents#tools). It includes a library of tools supported by Transformers and some customized tools such as image generator, video generator, text downloader and other tools." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from transformers import load_tool" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "hf_tools = [\n", " load_tool(tool_name)\n", " for tool_name in [\n", " \"document-question-answering\",\n", " \"image-captioning\",\n", " \"image-question-answering\",\n", " \"image-segmentation\",\n", " \"speech-to-text\",\n", " \"summarization\",\n", " \"text-classification\",\n", " \"text-question-answering\",\n", " \"translation\",\n", " \"huggingface-tools/text-to-image\",\n", " \"huggingface-tools/text-to-video\",\n", " \"text-to-speech\",\n", " \"huggingface-tools/text-download\",\n", " \"huggingface-tools/image-transformation\",\n", " ]\n", "]" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Setup model and HuggingGPT\n", "\n", "We create an instance of HuggingGPT and use ChatGPT as the controller to rule the above tools." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from langchain_experimental.autonomous_agents import HuggingGPT\n", "from langchain_openai import OpenAI\n", "\n", "# %env OPENAI_API_BASE=http://localhost:8000/v1" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "llm = OpenAI(model_name=\"gpt-3.5-turbo\")\n", "agent = HuggingGPT(llm, hf_tools)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Run an example\n", "\n", "Given a text, show a related image and video." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "agent.run(\"please show me a video and an image of 'a boy is running'\")" ] } ], "metadata": { "kernelspec": { "display_name": "langchain", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.17" }, "orig_nbformat": 4 }, "nbformat": 4, "nbformat_minor": 2 }