{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Summarization with Controllable Detail"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The objective of this notebook is to demonstrate how to summarize large documents with a controllable level of detail.\n",
    " \n",
    "If you give a GPT model the task of summarizing a long document (e.g. 10k or more tokens), you'll tend to get back a relatively short summary that isn't proportional to the length of the document. For instance, a summary of a 20k token document will not be twice as long as a summary of a 10k token document. One way we can fix this is to split our document up into pieces, and produce a summary piecewise. After many queries to a GPT model, the full summary can be reconstructed. By controlling the number of text chunks and their sizes, we can ultimately control the level of detail in the output."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-04-10T05:19:35.305706Z",
     "start_time": "2024-04-10T05:19:35.303535Z"
    }
   },
   "outputs": [],
   "source": [
    "import os\n",
    "from typing import List, Tuple, Optional\n",
    "\n",
    "from openai import OpenAI\n",
    "import tiktoken\n",
    "from tqdm import tqdm"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-04-10T05:19:35.325026Z",
     "start_time": "2024-04-10T05:19:35.322414Z"
    }
   },
   "outputs": [],
   "source": [
    "# open dataset containing part of the text of the Wikipedia page for the United States\n",
    "with open(\"data/united_states_wikipedia.txt\", \"r\") as file:\n",
    "    united_states_wikipedia_text = file.read()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-04-10T05:19:35.364483Z",
     "start_time": "2024-04-10T05:19:35.348213Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": "15781"
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# load encoding and check the length of dataset\n",
    "encoding = tiktoken.encoding_for_model('gpt-3.5-turbo')\n",
    "len(encoding.encode(united_states_wikipedia_text))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We'll define a simple utility to wrap calls to the OpenAI API."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-04-10T05:19:35.375619Z",
     "start_time": "2024-04-10T05:19:35.365818Z"
    }
   },
   "outputs": [],
   "source": [
    "client = OpenAI(api_key=os.getenv(\"OPENAI_API_KEY\"))\n",
    "\n",
    "def get_chat_completion(messages, model='gpt-3.5-turbo'):\n",
    "    response = client.chat.completions.create(\n",
    "        model=model,\n",
    "        messages=messages,\n",
    "        temperature=0,\n",
    "    )\n",
    "    return response.choices[0].message.content"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next we'll define some utilities to chunk a large document into smaller pieces."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-04-10T05:19:35.382790Z",
     "start_time": "2024-04-10T05:19:35.376721Z"
    }
   },
   "outputs": [],
   "source": [
    "def tokenize(text: str) -> List[str]:\n",
    "    encoding = tiktoken.encoding_for_model('gpt-3.5-turbo')\n",
    "    return encoding.encode(text)\n",
    "\n",
    "\n",
    "# This function chunks a text into smaller pieces based on a maximum token count and a delimiter.\n",
    "def chunk_on_delimiter(input_string: str,\n",
    "                       max_tokens: int, delimiter: str) -> List[str]:\n",
    "    chunks = input_string.split(delimiter)\n",
    "    combined_chunks, _, dropped_chunk_count = combine_chunks_with_no_minimum(\n",
    "        chunks, max_tokens, chunk_delimiter=delimiter, add_ellipsis_for_overflow=True\n",
    "    )\n",
    "    if dropped_chunk_count > 0:\n",
    "        print(f\"warning: {dropped_chunk_count} chunks were dropped due to overflow\")\n",
    "    combined_chunks = [f\"{chunk}{delimiter}\" for chunk in combined_chunks]\n",
    "    return combined_chunks\n",
    "\n",
    "\n",
    "# This function combines text chunks into larger blocks without exceeding a specified token count. It returns the combined text blocks, their original indices, and the count of chunks dropped due to overflow.\n",
    "def combine_chunks_with_no_minimum(\n",
    "        chunks: List[str],\n",
    "        max_tokens: int,\n",
    "        chunk_delimiter=\"\\n\\n\",\n",
    "        header: Optional[str] = None,\n",
    "        add_ellipsis_for_overflow=False,\n",
    ") -> Tuple[List[str], List[List[int]], int]:\n",
    "    dropped_chunk_count = 0\n",
    "    output = []  # list to hold the final combined chunks\n",
    "    output_indices = []  # list to hold the indices of the final combined chunks\n",
    "    candidate = (\n",
    "        [] if header is None else [header]\n",
    "    )  # list to hold the current combined chunk candidate\n",
    "    candidate_indices = []\n",
    "    for chunk_i, chunk in enumerate(chunks):\n",
    "        chunk_with_header = [chunk] if header is None else [header, chunk]\n",
    "        if len(tokenize(chunk_delimiter.join(chunk_with_header))) > max_tokens:\n",
    "            print(\"warning: chunk overflow\")\n",
    "            if (\n",
    "                add_ellipsis_for_overflow\n",
    "                and len(tokenize(chunk_delimiter.join(candidate + [\"...\"]))) <= max_tokens\n",
    "            ):\n",
    "                candidate.append(\"...\")\n",
    "                dropped_chunk_count += 1\n",
    "            continue  # this case would break downstream assumptions\n",
    "        # estimate token count with the current chunk added\n",
    "        extended_candidate_token_count = len(tokenize(chunk_delimiter.join(candidate + [chunk])))\n",
    "        # If the token count exceeds max_tokens, add the current candidate to output and start a new candidate\n",
    "        if extended_candidate_token_count > max_tokens:\n",
    "            output.append(chunk_delimiter.join(candidate))\n",
    "            output_indices.append(candidate_indices)\n",
    "            candidate = chunk_with_header  # re-initialize candidate\n",
    "            candidate_indices = [chunk_i]\n",
    "        # otherwise keep extending the candidate\n",
    "        else:\n",
    "            candidate.append(chunk)\n",
    "            candidate_indices.append(chunk_i)\n",
    "    # add the remaining candidate to output if it's not empty\n",
    "    if (header is not None and len(candidate) > 1) or (header is None and len(candidate) > 0):\n",
    "        output.append(chunk_delimiter.join(candidate))\n",
    "        output_indices.append(candidate_indices)\n",
    "    return output, output_indices, dropped_chunk_count"
   ]
  },
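  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick illustration (a sketch added for this walkthrough, not part of the original pipeline), we can call `chunk_on_delimiter` on a toy string to see how sentences get packed into chunks under a small token budget. The toy text and budget here are made up purely for demonstration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# illustrative only: pack a few short sentences into chunks of at most 12 tokens each\n",
    "toy_text = \"First sentence. Second sentence. Third sentence.\"\n",
    "toy_chunks = chunk_on_delimiter(toy_text, max_tokens=12, delimiter=\".\")\n",
    "print(toy_chunks)"
   ]
  },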
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can define a utility to summarize text with a controllable level of detail (note the `detail` parameter).\n",
    "\n",
    "The function first determines the number of chunks by interpolating between a minimum and a maximum chunk count based on a controllable `detail` parameter. It then splits the text into chunks and summarizes each chunk."
   ]
  },
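  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the interpolation concrete (an illustrative aside mirroring the formula used inside `summarize`): with `min_chunks = 1` and a hypothetical `max_chunks = 33`, the chunk count scales linearly with `detail`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# illustrative only: how `detail` interpolates between min and max chunk counts\n",
    "min_chunks, max_chunks = 1, 33  # max_chunks depends on the document; 33 is hypothetical\n",
    "for detail in [0, 0.25, 0.5, 1.0]:\n",
    "    print(detail, int(min_chunks + detail * (max_chunks - min_chunks)))"
   ]
  },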
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-04-10T05:19:35.390876Z",
     "start_time": "2024-04-10T05:19:35.385076Z"
    }
   },
   "outputs": [],
   "source": [
    "def summarize(text: str,\n",
    "              detail: float = 0,\n",
    "              model: str = 'gpt-3.5-turbo',\n",
    "              additional_instructions: Optional[str] = None,\n",
    "              minimum_chunk_size: Optional[int] = 500,\n",
    "              chunk_delimiter: str = \".\",\n",
    "              summarize_recursively: bool = False,\n",
    "              verbose: bool = False):\n",
    "    \"\"\"\n",
    "    Summarizes a given text by splitting it into chunks, each of which is summarized individually. \n",
    "    The level of detail in the summary can be adjusted, and the process can optionally be made recursive.\n",
    "\n",
    "    Parameters:\n",
    "    - text (str): The text to be summarized.\n",
    "    - detail (float, optional): A value between 0 and 1 indicating the desired level of detail in the summary.\n",
    "      0 leads to a higher level summary, and 1 results in a more detailed summary. Defaults to 0.\n",
    "    - model (str, optional): The model to use for generating summaries. Defaults to 'gpt-3.5-turbo'.\n",
    "    - additional_instructions (Optional[str], optional): Additional instructions to provide to the model for customizing summaries.\n",
    "    - minimum_chunk_size (Optional[int], optional): The minimum size for text chunks. Defaults to 500.\n",
    "    - chunk_delimiter (str, optional): The delimiter used to split the text into chunks. Defaults to \".\".\n",
    "    - summarize_recursively (bool, optional): If True, summaries are generated recursively, using previous summaries for context.\n",
    "    - verbose (bool, optional): If True, prints detailed information about the chunking process.\n",
    "\n",
    "    Returns:\n",
    "    - str: The final compiled summary of the text.\n",
    "\n",
    "    The function first determines the number of chunks by interpolating between a minimum and a maximum chunk count based on the `detail` parameter. \n",
    "    It then splits the text into chunks and summarizes each chunk. If `summarize_recursively` is True, each summary is based on the previous summaries, \n",
    "    adding more context to the summarization process. The function returns a compiled summary of all chunks.\n",
    "    \"\"\"\n",
    "\n",
    "    # check detail is set correctly\n",
    "    assert 0 <= detail <= 1\n",
    "\n",
    "    # interpolate the number of chunks to get the specified level of detail\n",
    "    max_chunks = len(chunk_on_delimiter(text, minimum_chunk_size, chunk_delimiter))\n",
    "    min_chunks = 1\n",
    "    num_chunks = int(min_chunks + detail * (max_chunks - min_chunks))\n",
    "\n",
    "    # adjust chunk_size based on interpolated number of chunks\n",
    "    document_length = len(tokenize(text))\n",
    "    chunk_size = max(minimum_chunk_size, document_length // num_chunks)\n",
    "    text_chunks = chunk_on_delimiter(text, chunk_size, chunk_delimiter)\n",
    "    if verbose:\n",
    "        print(f\"Splitting the text into {len(text_chunks)} chunks to be summarized.\")\n",
    "        print(f\"Chunk lengths are {[len(tokenize(x)) for x in text_chunks]}\")\n",
    "\n",
    "    # set system message\n",
    "    system_message_content = \"Summarize the following text.\"\n",
    "    if additional_instructions is not None:\n",
    "        system_message_content += f\"\\n\\n{additional_instructions}\"\n",
    "\n",
    "    accumulated_summaries = []\n",
    "    for chunk in tqdm(text_chunks):\n",
    "        if summarize_recursively and accumulated_summaries:\n",
    "            # Creating a structured prompt for recursive summarization\n",
    "            accumulated_summaries_string = '\\n\\n'.join(accumulated_summaries)\n",
    "            user_message_content = f\"Previous summaries:\\n\\n{accumulated_summaries_string}\\n\\nText to summarize next:\\n\\n{chunk}\"\n",
    "        else:\n",
    "            # Directly passing the chunk for summarization without recursive context\n",
    "            user_message_content = chunk\n",
    "\n",
    "        # Constructing messages based on whether recursive summarization is applied\n",
    "        messages = [\n",
    "            {\"role\": \"system\", \"content\": system_message_content},\n",
    "            {\"role\": \"user\", \"content\": user_message_content}\n",
    "        ]\n",
    "\n",
    "        # Summarize this chunk\n",
    "        response = get_chat_completion(messages, model=model)\n",
    "        accumulated_summaries.append(response)\n",
    "\n",
    "    # Compile final summary from partial summaries\n",
    "    final_summary = '\\n\\n'.join(accumulated_summaries)\n",
    "\n",
    "    return final_summary"
   ]
  },
  {
   "cell_type": "markdown",
   "source": [
    "Now we can use this utility to produce summaries with varying levels of detail. By increasing `detail` from 0 to 1 we get progressively longer summaries of the underlying document. A higher value for the `detail` parameter results in a more detailed summary because the utility first splits the document into a greater number of chunks. Each chunk is then summarized, and the final summary is a concatenation of all the chunk summaries."
   ],
   "metadata": {
    "collapsed": false
   }
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-04-10T05:19:47.541096Z",
     "start_time": "2024-04-10T05:19:35.391911Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Splitting the text into 1 chunks to be summarized.\n",
      "Chunk lengths are [15781]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 1/1 [00:07<00:00, 7.31s/it]\n"
     ]
    }
   ],
   "source": [
    "summary_with_detail_0 = summarize(united_states_wikipedia_text, detail=0, verbose=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-04-10T05:19:58.724212Z",
     "start_time": "2024-04-10T05:19:47.542129Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Splitting the text into 5 chunks to be summarized.\n",
      "Chunk lengths are [3945, 3941, 3943, 3915, 37]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 5/5 [00:09<00:00, 1.97s/it]\n"
     ]
    }
   ],
   "source": [
    "summary_with_detail_pt1 = summarize(united_states_wikipedia_text, detail=0.1, verbose=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-04-10T05:20:16.216023Z",
     "start_time": "2024-04-10T05:19:58.725014Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Splitting the text into 8 chunks to be summarized.\n",
      "Chunk lengths are [2214, 2253, 2249, 2255, 2254, 2255, 2221, 84]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 8/8 [00:16<00:00, 2.08s/it]\n"
     ]
    }
   ],
   "source": [
    "summary_with_detail_pt2 = summarize(united_states_wikipedia_text, detail=0.2, verbose=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-04-10T05:20:46.941240Z",
     "start_time": "2024-04-10T05:20:16.225524Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Splitting the text into 14 chunks to be summarized.\n",
      "Chunk lengths are [1198, 1209, 1210, 1209, 1212, 1192, 1176, 1205, 1212, 1201, 1210, 1210, 1192, 154]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 14/14 [00:30<00:00, 2.15s/it]\n"
     ]
    }
   ],
   "source": [
    "summary_with_detail_pt4 = summarize(united_states_wikipedia_text, detail=0.4, verbose=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-04-10T05:21:44.913140Z",
     "start_time": "2024-04-10T05:20:46.953285Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Splitting the text into 27 chunks to be summarized.\n",
      "Chunk lengths are [602, 596, 601, 601, 604, 598, 572, 594, 592, 592, 604, 593, 578, 582, 597, 600, 596, 555, 582, 601, 582, 587, 581, 595, 598, 568, 445]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 27/27 [00:57<00:00, 2.13s/it]\n"
     ]
    }
   ],
   "source": [
    "summary_with_detail_pt8 = summarize(united_states_wikipedia_text, detail=0.8, verbose=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-04-10T05:22:57.760218Z",
     "start_time": "2024-04-10T05:21:44.921275Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Splitting the text into 33 chunks to be summarized.\n",
      "Chunk lengths are [490, 443, 475, 490, 501, 470, 472, 487, 479, 477, 447, 442, 490, 468, 488, 477, 493, 493, 472, 491, 490, 501, 493, 468, 500, 500, 474, 460, 489, 462, 490, 482, 445]\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 33/33 [01:12<00:00, 2.20s/it]\n"
     ]
    }
   ],
   "source": [
    "summary_with_detail_1 = summarize(united_states_wikipedia_text, detail=1.0, verbose=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The original document is ~15k tokens long. Notice how large the gap is between the lengths of `summary_with_detail_0` and `summary_with_detail_1`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-04-10T05:22:57.782389Z",
     "start_time": "2024-04-10T05:22:57.763041Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": "[307, 494, 839, 1662, 3552, 4128]"
     },
     "execution_count": 53,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# lengths of summaries\n",
    "[len(tokenize(x)) for x in\n",
    " [summary_with_detail_0, summary_with_detail_pt1, summary_with_detail_pt2, summary_with_detail_pt4,\n",
    "  summary_with_detail_pt8, summary_with_detail_1]]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's inspect the summaries to see how the level of detail changes with the `detail` parameter set to 0, 0.1, 0.2, 0.4, 0.8, and 1 respectively."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-04-10T05:22:57.785881Z",
     "start_time": "2024-04-10T05:22:57.783455Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The United States of America is a diverse country located in North America, with a population exceeding 334 million. It is a federation of 50 states, a federal capital district, and various territories. The country has a rich history, from the migration of Paleo-Indians over 12,000 years ago to the British colonization and the American Revolution. The U.S. has gone through significant events like the Civil War, World War II, and the Cold War, emerging as a superpower after the collapse of the Soviet Union.\n",
      "\n",
      "The U.S. government is a presidential constitutional republic with three separate branches: legislative, executive, and judicial. The country has a strong emphasis on liberty, equality under the law, individualism, and limited government. Economically, the U.S. has the largest nominal GDP in the world and is a leader in economic competitiveness, innovation, and human rights. The U.S. is also a founding member of various international organizations like the UN, World Bank, and NATO.\n",
      "\n",
      "The U.S. has a rich cultural landscape, with influences from various ethnic groups and traditions. American literature, music, cinema, and theater have made significant contributions to global culture. The country is known for its diverse cuisine, with dishes influenced by various immigrant groups. The U.S. also has a strong presence in the fashion industry, with New York City being a global fashion capital.\n",
      "\n",
      "Overall, the United States is a country with a rich history, diverse population, strong economy, and significant cultural influence on the world stage.\n"
     ]
    }
   ],
   "source": [
    "print(summary_with_detail_0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-04-10T05:22:57.788969Z",
     "start_time": "2024-04-10T05:22:57.786691Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The United States of America is a country located in North America, consisting of 50 states, a federal capital district, and various territories. It has a rich history, from the arrival of Paleo-Indians over 12,000 years ago to British colonization and the American Revolution. The U.S. expanded across North America, faced sectional divisions over slavery, and emerged as a global superpower after World War II. The country has a presidential constitutional republic with three branches of government and a strong emphasis on liberty, equality, and limited government. Economically, the U.S. is a major player with the largest nominal GDP in the world and significant influence in various international organizations. The country's name, history, and expansion are detailed, including key events like the Declaration of Independence, the Revolutionary War, and the Louisiana Purchase.\n",
      "\n",
      "The text discusses key events in American history, including the Missouri Compromise, Indian removal policies, the Civil War, Reconstruction era, post-Civil War developments, rise as a superpower, Cold War era, and contemporary history. It highlights significant events such as the Trail of Tears, California Gold Rush, Reconstruction Amendments, immigration waves, World Wars, Cold War tensions, civil rights movement, economic developments, technological advancements, and major conflicts like the Gulf War and War on Terror. The text also mentions social changes, economic challenges like the Great Depression and Great Recession, and political developments leading to increased polarization in the 2010s.\n",
      "\n",
      "The text discusses the geography, climate, biodiversity, conservation efforts, government, politics, political parties, subdivisions, and foreign relations of the United States. It highlights the country's physical features, climate diversity, environmental issues, governmental structure, political parties, state subdivisions, and diplomatic relations. The text also mentions the historical context of the country's political system, including the development of political parties and the structure of the federal government.\n",
      "\n",
      "The text discusses the United States' international relations, military capabilities, law enforcement, crime rates, economy, and science and technology advancements. It highlights the country's membership in various international organizations, its military strength, economic dominance, income inequality, and technological innovations. The United States has strong diplomatic ties with several countries, a significant military presence globally, a large economy with high GDP, and is a leader in technological advancements and scientific research.\n",
      "\n",
      "The text discusses various aspects of the United States, including its scientific and innovation rankings, energy consumption, transportation infrastructure, demographics, language diversity, immigration, religion, urbanization, and healthcare. It highlights the country's achievements in scientific research, energy usage, transportation systems, population demographics, language diversity, immigration statistics, religious affiliations, urbanization trends, and healthcare facilities.\n",
      "\n",
      "The text discusses various aspects of life in the United States, including changes in life expectancy, the healthcare system, education, culture, society, literature, and mass media. It highlights the impact of the COVID-19 pandemic on life expectancy, the disparities in healthcare outcomes, the structure of the education system, the cultural diversity and values in American society, the development of American literature, and the media landscape in the country. The text also touches on issues such as healthcare coverage, education spending, student loan debt, and the protection of free speech in the U.S.\n",
      "\n",
      "The text discusses various aspects of American culture, including alternative newspapers in major cities, popular websites, the video game market, theater, visual arts, music, fashion, cinema, and cuisine. It highlights the influence of American culture globally, such as in music, fashion, cinema, and cuisine. The text also mentions significant figures and events in American cultural history, such as the Tony Awards, Broadway theater, the Hudson River School in visual arts, and the Golden Age of Hollywood in cinema. Additionally, it touches on the development of American cuisine, including traditional dishes and the impact of immigrant groups on American food culture.\n",
      "\n",
      "The American fast-food industry, known for pioneering the drive-through format in the 1940s, is considered a symbol of U.S. marketing dominance. Major American companies like McDonald's, Burger King, Pizza Hut, Kentucky Fried Chicken, and Domino's Pizza have a significant global presence with numerous outlets worldwide.\n"
     ]
    }
   ],
   "source": [
    "print(summary_with_detail_pt2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that this utility also allows passing additional instructions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-04-10T05:33:18.789246Z",
     "start_time": "2024-04-10T05:22:57.789764Z"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 5/5 [10:19<00:00, 123.94s/it]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "- The USA is a federation of 50 states, a federal capital district, and 326 Indian reservations.\n",
      "- It has sovereignty over five major unincorporated island territories and various uninhabited islands.\n",
      "- The country has a population exceeding 334 million.\n",
      "- The USA has the world's third-largest land area and the largest maritime exclusive economic zone.\n",
      "- The USA has had the largest nominal GDP in the world since 1890.\n",
      "- In 2023, the USA accounted for over 25% of the global economy based on nominal GDP and 15% based on PPP.\n",
      "- The USA has the highest median income per capita of any non-microstate.\n",
      "- The USA ranks high in economic competitiveness, productivity, innovation, human rights, and higher education.\n",
      "- The USA is a founding member of various international organizations such as the World Bank, IMF, NATO, and the UN Security Council.\n",
      "\n",
      "- The Great Society plan of President Lyndon Johnson's administration in the early 1960s resulted in groundbreaking laws and policies to counteract institutional racism.\n",
      "- By 1985, the majority of women aged 16 and older in the U.S. were employed.\n",
      "- In the 1990s, the U.S. saw the longest economic expansion in its history, with advances in technology such as the World Wide Web and the first gene therapy trial.\n",
      "- The U.S. spent $877 billion on its military in 2022, the largest amount globally, making up 39% of global military spending and 3.5% of the country's GDP.\n",
      "- The U.S. has the third-largest combined armed forces in the world, with about 800 bases and facilities abroad and deployments in 25 foreign countries.\n",
      "- As of January 2023, the U.S. had the sixth highest per-capita incarceration rate globally, with almost 2 million people incarcerated.\n",
      "- The U.S. had a nominal GDP of $27 trillion in 2023, the largest in the world, constituting over 25% of the global economy.\n",
      "\n",
      "- Real compounded annual GDP growth in the U.S. was 3.3%, compared to 2.3% for the rest of the Group of Seven.\n",
      "- The U.S. ranks first in the world by disposable income per capita and nominal GDP, second by GDP (PPP) after China, and ninth by GDP (PPP) per capita.\n",
      "- The U.S. has 136 of the world's 500 largest companies headquartered there.\n",
      "- The U.S. dollar is the most used currency in international transactions and is the world's foremost reserve currency.\n",
      "- The U.S. ranked second in the Global Competitiveness Report in 2019, after Singapore.\n",
      "- The U.S. is the second-largest manufacturing country after China as of 2021.\n",
      "- Americans have the highest average household and employee income among OECD member states.\n",
      "- The U.S. has 735 billionaires and nearly 22 million millionaires as of 2023.\n",
      "- In 2022, there were about 582,500 sheltered and unsheltered homeless persons in the U.S.\n",
      "- The U.S. receives approximately 81% of its energy from fossil fuels.\n",
      "- The U.S. has the highest vehicle ownership per capita in the world, with 910 vehicles per 1000 people.\n",
      "- The U.S. has the third-highest number of patent applications and ranked 3rd in the Global Innovation Index in 2023.\n",
      "- The U.S. has the third-highest number of published scientific papers in 2022.\n",
      "- The U.S. has a diverse population with 37 ancestry groups having more than one million members.\n",
      "- The U.S. has the largest Christian population in the world.\n",
      "- The average American life expectancy at birth was 77.5 years in 2022.\n",
      "- The U.S. spends more on education per student than any other country in the world.\n",
      "\n",
      "- The United States has the most Nobel Prize winners in history, with 411 awards won.\n",
      "- American higher education is dominated by state university systems, with private universities enrolling about 20% of students.\n",
      "- The U.S. spends more per student on higher education than the OECD average and all other nations in combined public and private spending.\n",
      "- Student loan debt in the U.S. has increased by 102% in the last decade, exceeding 1.7 trillion dollars as of 2022.\n",
      "- Americans donated 1.44% of total GDP to charity, the highest rate in the world.\n",
      "- The U.S. has the world's largest music market with a total retail value of $15.9 billion in 2022.\n",
      "- The United States restaurant industry was projected at $899 billion in sales for 2020, employing over 15 million people.\n",
      "- The U.S. is home to over 220 Michelin Star rated restaurants, with 70 in New York City alone.\n",
      "- California alone has 444 publishers, developers, and hardware companies in the video game market.\n",
      "- The U.S. fast-food industry pioneered the drive-through format in the 1940s.\n",
      "\n",
      "- American companies mentioned: McDonald's, Burger King, Pizza Hut, Kentucky Fried Chicken, Domino's Pizza\n",
      "- These companies have numerous outlets around the world\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    }
   ],
   "source": [
    "summary_with_additional_instructions = summarize(united_states_wikipedia_text, detail=0.1,\n",
    "                                                 additional_instructions=\"Write in point form and focus on numerical data.\")\n",
    "print(summary_with_additional_instructions)"
   ]
  },
{
|
||||
"cell_type": "markdown",
|
||||
"source": [
|
||||
"Finally, note that the utility allows for recursive summarization, where each chunk's summary is conditioned on the summaries that came before it, adding more context to the summarization process. Enable this by setting the `summarize_recursively` parameter to True. It is more computationally expensive, but can increase the consistency and coherence of the combined summary."
],
"metadata": {
"collapsed": false
}
},
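The controllable-detail idea above can be sketched in isolation. Assuming a `detail` knob in [0, 1], one simple mapping (illustrative only, not necessarily the notebook's exact `summarize` implementation) interpolates linearly between a single chunk and one chunk per fixed token budget:

```python
def num_chunks_for_detail(document_tokens: int, detail: float, chunk_size: int = 500) -> int:
    """Map a detail level in [0, 1] to a chunk count.

    detail=0 -> 1 chunk (tersest summary); detail=1 -> roughly one chunk
    per `chunk_size` tokens (most detailed summary).
    """
    if not 0 <= detail <= 1:
        raise ValueError("detail must be between 0 and 1")
    max_chunks = max(1, document_tokens // chunk_size)
    # Linear interpolation between the two extremes.
    return int(round(1 + detail * (max_chunks - 1)))
```

Each chunk is then summarized with its own model call and the pieces are concatenated, so the chunk count directly controls the length and granularity of the final summary.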
{
"cell_type": "code",
"execution_count": 57,
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"100%|██████████| 5/5 [00:09<00:00, 1.99s/it]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"The text provides an overview of the United States, including its geography, history, government structure, economic status, and global influence. It covers the country's origins, colonization, independence, expansion, Civil War, post-war era, rise as a superpower, and involvement in the Cold War. The U.S. is described as a presidential constitutional republic with a strong emphasis on individual rights, liberty, and limited government. The text also highlights the country's economic prowess, cultural influence, and membership in various international organizations.\n",
"\n",
"The text discusses the United States from the early 1960s to the present day, highlighting significant events such as President Lyndon Johnson's Great Society plan, the counterculture movement, societal changes, the end of the Cold War, the economic expansion of the 1990s, the September 11 attacks, the Great Recession, and political polarization. It also covers the country's geography, climate, biodiversity, conservation efforts, government structure, political parties, foreign relations, military strength, law enforcement, crime rates, and the economy, including its status as the world's largest economy.\n",
"\n",
"The text discusses the economic status of the United States, highlighting its GDP growth, ranking in various economic indicators, dominance in global trade, and technological advancements. It also covers income distribution, poverty rates, and social issues like homelessness and food insecurity. The text further delves into the country's energy consumption, transportation infrastructure, demographics, immigration trends, religious diversity, urbanization, healthcare system, life expectancy, and education system.\n",
"\n",
"The text discusses various aspects of American culture and society, including education, literature, mass media, theater, visual arts, music, fashion, cinema, and cuisine. It highlights the country's achievements in education, with a focus on higher education and federal financial aid for students. The text also delves into American cultural values, ethnic diversity, and the country's strong protections of free speech. Additionally, it covers the development of American literature, mass media landscape, theater scene, visual arts movements, music genres, fashion industry, cinema history, and culinary traditions. The influence of American culture globally, as well as the economic impact of industries like music and restaurants, is also discussed.\n",
"\n",
"American fast-food chains like McDonald's, Burger King, Pizza Hut, Kentucky Fried Chicken, and Domino's Pizza have a widespread global presence with numerous outlets worldwide.\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n"
]
}
],
"source": [
"recursive_summary = summarize(united_states_wikipedia_text, detail=0.1, summarize_recursively=True,\n",
"    additional_instructions=\"Don't overuse repetitive phrases to introduce each section\")\n",
"print(recursive_summary)"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2024-04-10T05:33:30.123036Z",
"start_time": "2024-04-10T05:33:18.791253Z"
}
}
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.9"
}
},
"nbformat": 4,
"nbformat_minor": 1
}
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "w9w5JBaUL-lO"
},
"source": [
"# Multimodal RAG with CLIP Embeddings and GPT-4 Vision\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "3CCjcFSiMbvf"
},
"source": [
"Multimodal RAG integrates additional modalities into traditional text-based RAG, enhancing LLMs' question-answering by providing extra context and grounding textual data for improved understanding.\n",
"\n",
"Adopting the approach from the [clothing matchmaker cookbook](https://cookbook.openai.com/examples/how_to_combine_gpt4v_with_rag_outfit_assistant), we directly embed images for similarity search, bypassing the lossy process of text captioning, to boost retrieval accuracy.\n",
"\n",
"Using CLIP-based embeddings further allows fine-tuning with specific data or updating with unseen images.\n",
"\n",
"This technique is showcased through searching an enterprise knowledge base with user-provided tech images to deliver pertinent information."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "T-Mpdxit4x49"
},
"source": [
"# Installations"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "nbt3evfHUJTZ"
},
"source": [
"First, let's install the relevant packages."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "7hgrcVEl0Ma1"
},
"outputs": [],
"source": [
"# installations\n",
"%pip install torch\n",
"%pip install pillow\n",
"%pip install faiss-cpu\n",
"%pip install numpy\n",
"%pip install git+https://github.com/openai/CLIP.git\n",
"%pip install openai"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "GgrlBLTpT0si"
},
"source": [
"Then let's import all the needed packages.\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"id": "pN1cWF-iyLUg"
},
"outputs": [],
"source": [
"# model imports\n",
"import faiss\n",
"import json\n",
"import torch\n",
"from openai import OpenAI\n",
"import torch.nn as nn\n",
"from torch.utils.data import DataLoader\n",
"import clip\n",
"client = OpenAI()\n",
"\n",
"# helper imports\n",
"from tqdm import tqdm\n",
"import os\n",
"import numpy as np\n",
"import pickle\n",
"from typing import List, Union, Tuple\n",
"\n",
"# visualisation imports\n",
"from PIL import Image\n",
"import matplotlib.pyplot as plt\n",
"import base64"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "9fONcWxRqll8"
},
"source": [
"Now let's load the CLIP model."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"id": "_Ua9y98NRk70"
},
"outputs": [],
"source": [
"# load the model on a device; inference/training runs on the CPU here, or a GPU if you have one\n",
"device = \"cpu\"\n",
"model, preprocess = clip.load(\"ViT-B/32\", device=device)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Dev-zjfJ774W"
},
"source": [
"\n",
"We will now:\n",
"1. Create the image embedding database\n",
"2. Set up a query to the vision model\n",
"3. Perform the semantic search\n",
"4. Pass a user query to the image\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "5Y1v2jkS42TS"
},
"source": [
"# Create image embedding database"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "wVBAMyhesAyi"
},
"source": [
"Next we will create our image embeddings knowledge base from a directory of images. This will be the knowledge base of technology that we search through to provide information to the user for an image they upload.\n",
"\n",
"We pass in the directory in which we store our images (as JPEGs) and loop through each to create our embeddings.\n",
"\n",
"We also have a description.json file with an entry for every image in our knowledge base. Each entry has two keys, 'image_path' and 'description', mapping each image to a useful description that helps answer the user's question."
]
},
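Because the loading code later reads description.json one line at a time with `json.loads`, each record is expected to be a standalone JSON object on its own line (JSON Lines). A hypothetical entry might look like this (the path and description text here are illustrative, not from the actual dataset):

```python
import json

# A hypothetical description.json record in JSON Lines form: one object per
# line, carrying the two keys the notebook relies on.
entry = {
    "image_path": "image_database/train1.jpeg",  # illustrative path
    "description": "DELTA Pro Ultra whole-house battery generator.",  # illustrative text
}
line = json.dumps(entry)   # one line of description.json
parsed = json.loads(line)  # what the loading loop recovers per line
```

Writing one `json.dumps` result per line keeps each record independently parseable, which is exactly what the line-by-line loader below expects.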
{
"cell_type": "markdown",
"metadata": {
"id": "fDCz76gr8yAu"
},
"source": [
"First let's write a function to get all the image paths in a given directory. We will then collect all the JPEGs from a directory called 'image_database'."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"id": "vE9i3zLuRk5c"
},
"outputs": [],
"source": [
"def get_image_paths(directory: str, number: int = None) -> List[str]:\n",
"    image_paths = []\n",
"    count = 0\n",
"    for filename in os.listdir(directory):\n",
"        if filename.endswith('.jpeg'):\n",
"            image_paths.append(os.path.join(directory, filename))\n",
"            # when `number` is given, return only the single image at that index\n",
"            if number is not None and count == number:\n",
"                return [image_paths[-1]]\n",
"            count += 1\n",
"    return image_paths\n",
"\n",
"direc = 'image_database/'\n",
"image_paths = get_image_paths(direc)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "hMldfjn189vC"
},
"source": [
"Next we will write a function to get the image embeddings from the CLIP model given a series of paths.\n",
"\n",
"We first preprocess each image using the preprocess function we got earlier. This performs a few operations to ensure the input to the CLIP model has the right format and dimensionality, including resizing, normalization, and colour channel adjustment.\n",
"\n",
"We then stack these preprocessed images together so we can pass them into the model at once rather than in a loop, and finally return the model output, which is an array of embeddings."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"id": "fd3I_fPh8qvi"
},
"outputs": [],
"source": [
"def get_features_from_image_path(image_paths):\n",
"    images = [preprocess(Image.open(image_path).convert(\"RGB\")) for image_path in image_paths]\n",
"    image_input = torch.tensor(np.stack(images))\n",
"    with torch.no_grad():\n",
"        image_features = model.encode_image(image_input).float()\n",
"    return image_features\n",
"\n",
"image_features = get_features_from_image_path(image_paths)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "UH_kyZAE-kHe"
},
"source": [
"We can now create our vector database."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"id": "TIeqpndF8tZk"
},
"outputs": [],
"source": [
"index = faiss.IndexFlatIP(image_features.shape[1])\n",
"index.add(image_features)\n"
]
},
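One caveat worth noting: `IndexFlatIP` ranks by raw inner product, so embedding magnitude affects the ranking. If you want cosine similarity instead, L2-normalize the embeddings before adding them to the index, and normalize queries the same way (faiss also provides a `normalize_L2` helper that does this in place). A minimal numpy sketch of the normalization, independent of faiss:

```python
import numpy as np

def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Row-wise L2 normalization; inner products of the results are cosine similarities."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)  # clip guards against zero vectors

vecs = np.array([[3.0, 4.0], [1.0, 0.0]])
unit = l2_normalize(vecs)
```

With normalized vectors, the inner-product scores returned by the index coincide with cosine similarity, which is usually the intended metric for CLIP embeddings.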
{
"cell_type": "markdown",
"metadata": {
"id": "swDe1c4v-mbz"
},
"source": [
"We also ingest our JSON for the image-description mapping, creating a list of JSON objects, along with a helper function to search this list for a given image so we can obtain its description."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"id": "tdjlXQqC8uNE"
},
"outputs": [],
"source": [
"data = []\n",
"image_path = 'train1.jpeg'\n",
"with open('description.json', 'r') as file:\n",
"    for line in file:\n",
"        data.append(json.loads(line))\n",
"\n",
"def find_entry(data, key, value):\n",
"    for entry in data:\n",
"        if entry.get(key) == value:\n",
"            return entry\n",
"    return None"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "fJXfCtPD5_63"
},
"source": [
"Let's display an example image; this will be the user-uploaded image. It is a piece of tech that was unveiled at the 2024 CES: the DELTA Pro Ultra Whole House Battery Generator."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "RtkZ7W3g5sED"
},
"outputs": [],
"source": [
"im = Image.open(image_path)\n",
"plt.imshow(im)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "5ivECCKSdbBy"
},
"source": [
"![Delta Pro](../images/train1.jpeg)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Sidjylki7Kye"
},
"source": [
"# Querying the vision model"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "H8O7X6ml7t38"
},
"source": [
"Now let's have a look at what GPT-4 Vision (which wouldn't have seen this technology before) will label it as.\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "r4uDjS-gQAqm"
},
"source": [
"First we will need to write a function to encode our image in base64, as this is the format we will pass into the vision model. Then we will create a generic image_query function that lets us query the LLM with an image input."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 35
},
"id": "87gf6_xO8Y4i",
"outputId": "99be865f-12e8-4ef0-c2f5-5fd6e5c787f3"
},
"outputs": [
{
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "string"
},
"text/plain": [
"'Autonomous Delivery Robot'"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def encode_image(image_path):\n",
"    with open(image_path, 'rb') as image_file:\n",
"        encoded_image = base64.b64encode(image_file.read())\n",
"    return encoded_image.decode('utf-8')\n",
"\n",
"def image_query(query, image_path):\n",
"    response = client.chat.completions.create(\n",
"        model='gpt-4-vision-preview',\n",
"        messages=[\n",
"            {\n",
"                \"role\": \"user\",\n",
"                \"content\": [\n",
"                    {\n",
"                        \"type\": \"text\",\n",
"                        \"text\": query,\n",
"                    },\n",
"                    {\n",
"                        \"type\": \"image_url\",\n",
"                        \"image_url\": {\n",
"                            \"url\": f\"data:image/jpeg;base64,{encode_image(image_path)}\",\n",
"                        },\n",
"                    }\n",
"                ],\n",
"            }\n",
"        ],\n",
"        max_tokens=300,\n",
"    )\n",
"    # Extract the text content from the response\n",
"    return response.choices[0].message.content\n",
"\n",
"image_query('Write a short label of what is shown in this image?', image_path)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "yfG_7c-jQAqm"
},
"source": [
"As we can see, the model tries its best with what it was trained on, but it makes a mistake because it has not seen anything similar in its training data: the image is ambiguous, which makes it difficult to extrapolate and deduce what the object is."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "szWZqTqf7SrA"
},
"source": [
"# Performing semantic search"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "eV8LaOncGH3j"
},
"source": [
"Now let's perform a similarity search to find the two most similar images in our knowledge base. We do this by embedding the user-provided image_path and retrieving the indices and similarity scores of the closest images in our database. Since the index scores by inner product, a larger score means more similar, so we sort the results in descending order."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"id": "GzNEhKJ04D-F"
},
"outputs": [],
"source": [
"image_search_embedding = get_features_from_image_path([image_path])\n",
"distances, indices = index.search(image_search_embedding.reshape(1, -1), 2)  # 2 is the number of most similar images to bring back\n",
"distances = distances[0]\n",
"indices = indices[0]\n",
"indices_distances = list(zip(indices, distances))\n",
"indices_distances.sort(key=lambda x: x[1], reverse=True)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "0O-GYQ-1QAqm"
},
"source": [
"We need the indices because we will use them to search through our image directory, selecting the image at the location of each index to feed into the vision model for RAG."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "9-6SVzwSJVuT"
},
"source": [
"And let's see what it brought back (we display these in order of similarity):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Lt1ZYuKDFeww"
},
"outputs": [],
"source": [
"# display similar images\n",
"for idx, distance in indices_distances:\n",
"    print(idx)\n",
"    path = get_image_paths(direc, idx)[0]\n",
"    im = Image.open(path)\n",
"    plt.imshow(im)\n",
"    plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "GPTvKIUJ2tgz"
},
"source": [
"![Delta Pro2](../images/train2.jpeg)\n",
"\n",
"![Delta Pro3](../images/train17.jpeg)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "x4kF2-MJQAqm"
},
"source": [
"We can see that the search brought back two images containing the DELTA Pro Ultra Whole House Battery Generator. One of them includes background clutter that could be distracting, yet the search still finds the right image."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Qc2sOKzY7yv3"
},
"source": [
"# User querying the most similar image"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8Sio6OR4MDjI"
},
"source": [
"Now we pass the most similar image, along with its description, to GPT-4 Vision together with the user's query so they can inquire about the technology they may have bought. This is where the power of the vision model comes in: you can ask general questions the model hasn't been explicitly trained on, and it responds with high accuracy."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "uPzsRk66QAqn"
},
"source": [
"In our example below, we will inquire as to the capacity of the item in question."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 87
},
"id": "-_5W_xwitbr3",
"outputId": "99a40617-0153-492a-d8b0-6782b8421e40"
},
"outputs": [
{
"data": {
"application/vnd.google.colaboratory.intrinsic+json": {
"type": "string"
},
"text/plain": [
"'The portable home battery DELTA Pro has a base capacity of 3.6kWh. This capacity can be expanded up to 25kWh with additional batteries. The image showcases the DELTA Pro, which has an impressive 3600W power capacity for AC output as well.'"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"similar_path = get_image_paths(direc, indices_distances[0][0])[0]\n",
"element = find_entry(data, 'image_path', similar_path)\n",
"\n",
"user_query = 'What is the capacity of this item?'\n",
"prompt = f\"\"\"\n",
"Below is a user query, I want you to answer the query using the description and image provided.\n",
"\n",
"user query:\n",
"{user_query}\n",
"\n",
"description:\n",
"{element['description']}\n",
"\"\"\"\n",
"image_query(prompt, similar_path)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "VIInamGaAG9L"
},
"source": [
"And we see it is able to answer the question. This was only possible by matching images directly and, from there, gathering the relevant description as context."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Ljrf0VKR_2q9"
},
"source": [
"# Conclusion"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "PexvxTF5_7ay"
},
"source": [
"In this notebook, we have gone through how to use the CLIP model, created an image embedding database with it, performed semantic search, and finally answered a user query with the retrieved context."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gOgRBeh6eMiq"
},
"source": [
"This usage pattern applies across many application domains, and the technique is easy to extend. For example, you could fine-tune CLIP, improve the retrieval process just as in text-based RAG, or prompt engineer GPT-4 Vision.\n"
]
}
],
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.10.14"
}
},
"nbformat": 4,
"nbformat_minor": 0
}