langchain/docs/extras/integrations/llms/huggingface_hub.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "959300d4",
   "metadata": {},
   "source": [
    "# Hugging Face Hub\n",
    "\n",
    ">The [Hugging Face Hub](https://huggingface.co/docs/hub/index) is a platform with over 120k models, 20k datasets, and 50k demo apps (Spaces), all open source and publicly available, in an online platform where people can easily collaborate and build ML together.\n",
    "\n",
    "This example showcases how to connect to the `Hugging Face Hub` and use different models."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1ddafc6d-7d7c-48fa-838f-0e7f50895ce3",
   "metadata": {},
   "source": [
    "## Installation and Setup"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4c1b8450-5eaf-4d34-8341-2d785448a1ff",
   "metadata": {
    "tags": []
   },
   "source": [
    "To use, you should have the ``huggingface_hub`` python [package installed](https://huggingface.co/docs/huggingface_hub/installation)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d772b637-de00-4663-bd77-9bc96d798db2",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "!pip install huggingface_hub"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "d597a792-354c-4ca5-b483-5965eec5d63d",
   "metadata": {},
   "outputs": [
    {
     "name": "stdin",
     "output_type": "stream",
     "text": [
      " ········\n"
     ]
    }
   ],
   "source": [
    "# get a token: https://huggingface.co/docs/api-inference/quicktour#get-your-api-token\n",
    "\n",
    "from getpass import getpass\n",
    "\n",
    "HUGGINGFACEHUB_API_TOKEN = getpass()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "b8c5b88c-e4b8-4d0d-9a35-6e8f106452c2",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ[\"HUGGINGFACEHUB_API_TOKEN\"] = HUGGINGFACEHUB_API_TOKEN"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "84dd44c1-c428-41f3-a911-520281386c94",
   "metadata": {},
   "source": [
    "## Prepare Examples"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3fe7d1d1-241d-426a-acff-e208f1088871",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.llms import HuggingFaceHub"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "6620f39b-3d32-4840-8931-ff7d2c3e47e8",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.prompts import PromptTemplate\nfrom langchain.chains import LLMChain"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "44adc1a0-9c0a-4f1e-af5a-fe04222e78d7",
   "metadata": {},
   "outputs": [],
   "source": [
    "question = \"Who won the FIFA World Cup in the year 1994? \"\n",
    "\n",
    "template = \"\"\"Question: {question}\n",
    "\n",
    "Answer: Let's think step by step.\"\"\"\n",
    "\n",
    "prompt = PromptTemplate(template=template, input_variables=[\"question\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ddaa06cf-95ec-48ce-b0ab-d892a7909693",
   "metadata": {},
   "source": [
    "## Examples\n",
    "\n",
    "Below are some examples of models you can access through the `Hugging Face Hub` integration."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4c16fded-70d1-42af-8bfa-6ddda9f0bc63",
   "metadata": {},
   "source": [
    "### `Flan`, by `Google`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "39c7eeac-01c4-486b-9480-e828a9e73e78",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "repo_id = \"google/flan-t5-xxl\"  # See https://huggingface.co/models?pipeline_tag=text-generation&sort=downloads for some other options"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "3acf0069",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The FIFA World Cup was held in the year 1994. West Germany won the FIFA World Cup in 1994\n"
     ]
    }
   ],
   "source": [
    "llm = HuggingFaceHub(\n",
    "    repo_id=repo_id, model_kwargs={\"temperature\": 0.5, \"max_length\": 64}\n",
    ")\n",
    "llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
    "\n",
    "print(llm_chain.run(question))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1a5c97af-89bc-4e59-95c1-223742a9160b",
   "metadata": {},
   "source": [
    "### `Dolly`, by `Databricks`\n",
    "\n",
    "See [Databricks](https://huggingface.co/databricks) organization page for a list of available models."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "521fcd2b-8e38-4920-b407-5c7d330411c9",
   "metadata": {},
   "outputs": [],
   "source": [
    "repo_id = \"databricks/dolly-v2-3b\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "9907ec3a-fe0c-4543-81c4-d42f9453f16c",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      " First of all, the world cup was won by the Germany. Then the Argentina won the world cup in 2022. So, the Argentina won the world cup in 1994.\n",
      "\n",
      "\n",
      "Question: Who\n"
     ]
    }
   ],
   "source": [
    "llm = HuggingFaceHub(\n",
    "    repo_id=repo_id, model_kwargs={\"temperature\": 0.5, \"max_length\": 64}\n",
    ")\n",
    "llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
    "print(llm_chain.run(question))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "03f6ae52-b5f9-4de6-832c-551cb3fa11ae",
   "metadata": {},
   "source": [
    "### `Camel`, by `Writer`\n",
    "\n",
    "See [Writer's](https://huggingface.co/Writer) organization page for a list of available models."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "257a091d-750b-4910-ac08-fe1c7b3fd98b",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "repo_id = \"Writer/camel-5b-hf\"  # See https://huggingface.co/Writer for other options"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b06f6838-a11a-4d6a-88e3-91fa1747a2b3",
   "metadata": {},
   "outputs": [],
   "source": [
    "llm = HuggingFaceHub(\n",
    "    repo_id=repo_id, model_kwargs={\"temperature\": 0.5, \"max_length\": 64}\n",
    ")\n",
    "llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
    "print(llm_chain.run(question))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2bf838eb-1083-402f-b099-b07c452418c8",
   "metadata": {},
   "source": [
    "### `XGen`, by `Salesforce`\n",
    "\n",
    "See [more information](https://github.com/salesforce/xgen)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "18c78880-65d7-41d0-9722-18090efb60e9",
   "metadata": {},
   "outputs": [],
   "source": [
    "repo_id = \"Salesforce/xgen-7b-8k-base\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1b1150b4-ec30-4674-849e-6a41b085aa2b",
   "metadata": {},
   "outputs": [],
   "source": [
    "llm = HuggingFaceHub(\n",
    "    repo_id=repo_id, model_kwargs={\"temperature\": 0.5, \"max_length\": 64}\n",
    ")\n",
    "llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
    "print(llm_chain.run(question))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0aca9f9e-f333-449c-97b2-10d1dbf17e75",
   "metadata": {},
   "source": [
    "### `Falcon`, by `Technology Innovation Institute (TII)`\n",
    "\n",
    "See [more information](https://huggingface.co/tiiuae/falcon-40b)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "496b35ac-5ee2-4b68-a6ce-232608f56c03",
   "metadata": {},
   "outputs": [],
   "source": [
    "repo_id = \"tiiuae/falcon-40b\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ff2541ad-e394-4179-93c2-7ae9c4ca2a25",
   "metadata": {},
   "outputs": [],
   "source": [
    "llm = HuggingFaceHub(\n",
    "    repo_id=repo_id, model_kwargs={\"temperature\": 0.5, \"max_length\": 64}\n",
    ")\n",
    "llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
    "print(llm_chain.run(question))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7e15849b-5561-4bb9-86ec-6412ca10196a",
   "metadata": {},
   "source": [
    "### `InternLM-Chat`, by `Shanghai AI Laboratory`\n",
    "\n",
    "See [more information](https://huggingface.co/internlm/internlm-7b)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "3b533461-59f8-406e-907b-000841fa60a7",
   "metadata": {},
   "outputs": [],
   "source": [
    "repo_id = \"internlm/internlm-chat-7b\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c71210b9-5895-41a2-889a-f430d22fa1aa",
   "metadata": {},
   "outputs": [],
   "source": [
    "llm = HuggingFaceHub(\n",
    "    repo_id=repo_id, model_kwargs={\"max_length\": 128, \"temperature\": 0.8}\n",
    ")\n",
    "llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
    "print(llm_chain.run(question))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4f2e5132-1713-42d7-919a-8c313744ce95",
   "metadata": {},
   "source": [
    "### `Qwen`, by `Alibaba Cloud`\n",
    "\n",
    ">`Tongyi Qianwen-7B` (`Qwen-7B`) is a model with a scale of 7 billion parameters in the `Tongyi Qianwen` large model series developed by `Alibaba Cloud`. `Qwen-7B` is a large language model based on Transformer, which is trained on ultra-large-scale pre-training data.\n",
    "\n",
    "See [more information on HuggingFace](https://huggingface.co/Qwen/Qwen-7B) of on [GitHub](https://github.com/QwenLM/Qwen-7B).\n",
    "\n",
    "See here a [big example for LangChain integration and Qwen](https://github.com/QwenLM/Qwen-7B/blob/main/examples/langchain_tooluse.ipynb)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "f598b1ca-77c7-40f1-a83f-c21ea9910c88",
   "metadata": {},
   "outputs": [],
   "source": [
    "repo_id = \"Qwen/Qwen-7B\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2c97f4e2-d401-44fb-9da7-b60b2e2cc663",
   "metadata": {},
   "outputs": [],
   "source": [
    "llm = HuggingFaceHub(\n",
    "    repo_id=repo_id, model_kwargs={\"max_length\": 128, \"temperature\": 0.5}\n",
    ")\n",
    "llm_chain = LLMChain(prompt=prompt, llm=llm)\n",
    "print(llm_chain.run(question))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1dd67c1e-1efc-4def-bde4-2e5265725303",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}