{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# How to use the DALL·E API\n", "\n", "This notebook shows how to use OpenAI's DALL·E image API endpoints.\n", "\n", "There are three API endpoints:\n", "- **Generations:** generates an image or images based on an input caption\n", "- **Edits:** edits or extends an existing image\n", "- **Variations:** generates variations of an input image" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup\n", "\n", "- Import the packages you'll need\n", "- Import your OpenAI API key: You can do this by running \\``export OPENAI_API_KEY=\"your API key\"`\\` in your terminal.\n", "- Set a directory to save images to" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "# imports\n", "from openai import OpenAI # OpenAI Python library to make API calls\n", "import requests # used to download images\n", "import os # used to access filepaths\n", "from PIL import Image # used to print and edit images\n", "\n", "# initialize OpenAI client\n", "client = OpenAI(api_key=os.environ.get(\"OPENAI_API_KEY\", \"\"))\n" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "image_dir='./images'\n" ] } ], "source": [ "# set a directory to save DALL·E images to\n", "image_dir_name = \"images\"\n", "image_dir = os.path.join(os.curdir, image_dir_name)\n", "\n", "# create the directory if it doesn't yet exist\n", "if not os.path.isdir(image_dir):\n", " os.mkdir(image_dir)\n", "\n", "# print the directory to save to\n", "print(f\"{image_dir=}\")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Generations\n", "\n", "The generation API endpoint creates an image based on a text prompt. [API Reference](https://platform.openai.com/docs/api-reference/images/create)\n", "\n", "**Required inputs:**\n", "- `prompt` (str): A text description of the desired image(s). The maximum length is 1000 characters for dall-e-2 and 4000 characters for dall-e-3.\n", "\n", "**Optional inputs:**\n", "- `model` (str): The model to use for image generation. Defaults to dall-e-2\n", "- `n` (int): The number of images to generate. Must be between 1 and 10. Defaults to 1.\n", "- `quality` (str): The quality of the image that will be generated. hd creates images with finer details and greater consistency across the image. This param is only supported for dall-e-3.\n", "- `response_format` (str): The format in which the generated images are returned. Must be one of \"url\" or \"b64_json\". Defaults to \"url\".\n", "- `size` (str): The size of the generated images. Must be one of 256x256, 512x512, or 1024x1024 for dall-e-2. Must be one of 1024x1024, 1792x1024, or 1024x1792 for dall-e-3 models. Defaults to \"1024x1024\".\n", "- `style`(str | null): The style of the generated images. Must be one of vivid or natural. Vivid causes the model to lean towards generating hyper-real and dramatic images. Natural causes the model to produce more natural, less hyper-real looking images. This param is only supported for dall-e-3.\n", "- `user` (str): A unique identifier representing your end-user, which will help OpenAI to monitor and detect abuse. [Learn more.](https://beta.openai.com/docs/usage-policies/end-user-ids)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ImagesResponse(created=1701994117, data=[Image(b64_json=None, revised_prompt=None, url='https://oaidalleapiprodscus.blob.core.windows.net/private/org-9HXYFy8ux4r6aboFyec2OLRf/user-8OA8IvMYkfdAcUZXgzAXHS7d/img-ced13hkOk3lXkccQgW1fAQjm.png?st=2023-12-07T23%3A08%3A37Z&se=2023-12-08T01%3A08%3A37Z&sp=r&sv=2021-08-06&sr=b&rscd=inline&rsct=image/png&skoid=6aaadede-4fb3-4698-a8f6-684d7786b067&sktid=a48cca56-e6da-484e-a814-9c849652bcb3&skt=2023-12-07T16%3A41%3A48Z&ske=2023-12-08T16%3A41%3A48Z&sks=b&skv=2021-08-06&sig=tcD0iyU0ABOvWAKsY89gp5hLVIYkoSXQnrcmH%2Brkric%3D')])\n" ] } ], "source": [ "# create an image\n", "\n", "# set the prompt\n", "prompt = \"A cyberpunk monkey hacker dreaming of a beautiful bunch of bananas, digital art\"\n", "\n", "# call the OpenAI API\n", "generation_response = client.images.generate(\n", " model = \"dall-e-3\",\n", " prompt=prompt,\n", " n=1,\n", " size=\"1024x1024\",\n", " response_format=\"url\",\n", ")\n", "\n", "# print response\n", "print(generation_response)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "# save the image\n", "generated_image_name = \"generated_image.png\" # any name you like; the filetype should be .png\n", "generated_image_filepath = os.path.join(image_dir, generated_image_name)\n", "generated_image_url = generation_response.data[0].url # extract image URL from response\n", "generated_image = requests.get(generated_image_url).content # download the image\n", "\n", "with open(generated_image_filepath, \"wb\") as image_file:\n", " image_file.write(generated_image) # write the image to the file" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# print the image\n", "print(generated_image_filepath)\n", "display(Image.open(generated_image_filepath))\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Variations\n", "\n", "The variations endpoint generates new images (variations) similar to an input image. [API Reference](https://platform.openai.com/docs/api-reference/images/createVariation)\n", "\n", "Here we'll generate variations of the image generated above.\n", "\n", "**Required inputs:**\n", "- `image` (str): The image to use as the basis for the variation(s). Must be a valid PNG file, less than 4MB, and square.\n", "\n", "**Optional inputs:**\n", "- `model` (str): The model to use for image variations. Only dall-e-2 is supported at this time.\n", "- `n` (int): The number of images to generate. Must be between 1 and 10. Defaults to 1.\n", "- `size` (str): The size of the generated images. Must be one of \"256x256\", \"512x512\", or \"1024x1024\". Smaller images are faster. Defaults to \"1024x1024\".\n", "- `response_format` (str): The format in which the generated images are returned. Must be one of \"url\" or \"b64_json\". Defaults to \"url\".\n", "- `user` (str): A unique identifier representing your end-user, which will help OpenAI to monitor and detect abuse. [Learn more.](https://beta.openai.com/docs/usage-policies/end-user-ids)\n" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ImagesResponse(created=1701994139, data=[Image(b64_json=None, revised_prompt=None, url='https://oaidalleapiprodscus.blob.core.windows.net/private/org-9HXYFy8ux4r6aboFyec2OLRf/user-8OA8IvMYkfdAcUZXgzAXHS7d/img-noNRGgwaaotRGIe6Y2GVeSpr.png?st=2023-12-07T23%3A08%3A59Z&se=2023-12-08T01%3A08%3A59Z&sp=r&sv=2021-08-06&sr=b&rscd=inline&rsct=image/png&skoid=6aaadede-4fb3-4698-a8f6-684d7786b067&sktid=a48cca56-e6da-484e-a814-9c849652bcb3&skt=2023-12-07T16%3A39%3A11Z&ske=2023-12-08T16%3A39%3A11Z&sks=b&skv=2021-08-06&sig=ER5RUglhtIk9LWJXw1DsolorT4bnEmFostfnUjY21ns%3D'), Image(b64_json=None, revised_prompt=None, url='https://oaidalleapiprodscus.blob.core.windows.net/private/org-9HXYFy8ux4r6aboFyec2OLRf/user-8OA8IvMYkfdAcUZXgzAXHS7d/img-oz952tL11FFhf9iXXJVIRUZX.png?st=2023-12-07T23%3A08%3A59Z&se=2023-12-08T01%3A08%3A59Z&sp=r&sv=2021-08-06&sr=b&rscd=inline&rsct=image/png&skoid=6aaadede-4fb3-4698-a8f6-684d7786b067&sktid=a48cca56-e6da-484e-a814-9c849652bcb3&skt=2023-12-07T16%3A39%3A11Z&ske=2023-12-08T16%3A39%3A11Z&sks=b&skv=2021-08-06&sig=99rJOQwDKsfIeerlMXMHholhAhrHfYaQRFJBF8FKv74%3D')])\n" ] } ], "source": [ "# create variations\n", "\n", "# call the OpenAI API, using `create_variation` rather than `create`\n", "variation_response = client.images.create_variation(\n", " image=generated_image, # generated_image is the image generated above\n", " n=2,\n", " size=\"1024x1024\",\n", " response_format=\"url\",\n", ")\n", "\n", "# print response\n", "print(variation_response)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "# save the images\n", "variation_urls = [datum.url for datum in variation_response.data] # extract URLs\n", "variation_images = [requests.get(url).content for url in variation_urls] # download images\n", "variation_image_names = [f\"variation_image_{i}.png\" for i in range(len(variation_images))] # create names\n", "variation_image_filepaths = [os.path.join(image_dir, name) for name in variation_image_names] # create filepaths\n", "for image, filepath in zip(variation_images, variation_image_filepaths): # loop through the variations\n", " with open(filepath, \"wb\") as image_file: # open the file\n", " image_file.write(image) # write the image to the file" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# print the original image\n", "print(generated_image_filepath)\n", "display(Image.open(generated_image_filepath))\n", "\n", "# print the new variations\n", "for variation_image_filepaths in variation_image_filepaths:\n", " print(variation_image_filepaths)\n", " display(Image.open(variation_image_filepaths))\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Edits\n", "\n", "The edit endpoint uses DALL·E to generate a specified portion of an existing image. Three inputs are needed: the image to edit, a mask specifying the portion to be regenerated, and a prompt describing the desired image. [API Reference](https://platform.openai.com/docs/api-reference/images/createEdit)\n", "\n", "**Required inputs:** \n", "- `image` (str): The image to edit. Must be a valid PNG file, less than 4MB, and square. If mask is not provided, image must have transparency, which will be used as the mask.\n", "- `prompt` (str): A text description of the desired image(s). The maximum length is 1000 characters.\n", "\n", "**Optional inputs:**\n", "- `mask` (file): An additional image whose fully transparent areas (e.g. where alpha is zero) indicate where image should be edited. Must be a valid PNG file, less than 4MB, and have the same dimensions as image.\n", "- `model` (str): The model to use for edit image. Only dall-e-2 is supported at this time.\n", "- `n` (int): The number of images to generate. Must be between 1 and 10. Defaults to 1.\n", "- `size` (str): The size of the generated images. Must be one of \"256x256\", \"512x512\", or \"1024x1024\". Smaller images are faster. Defaults to \"1024x1024\".\n", "- `response_format` (str): The format in which the generated images are returned. Must be one of \"url\" or \"b64_json\". Defaults to \"url\".\n", "- `user` (str): A unique identifier representing your end-user, which will help OpenAI to monitor and detect abuse. [Learn more.](https://beta.openai.com/docs/usage-policies/end-user-ids)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Set Edit Area\n", "\n", "An edit requires a \"mask\" to specify which portion of the image to regenerate. Any pixel with an alpha of 0 (transparent) will be regenerated. The code below creates a 1024x1024 mask where the bottom half is transparent." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [], "source": [ "# create a mask\n", "width = 1024\n", "height = 1024\n", "mask = Image.new(\"RGBA\", (width, height), (0, 0, 0, 1)) # create an opaque image mask\n", "\n", "# set the bottom half to be transparent\n", "for x in range(width):\n", " for y in range(height // 2, height): # only loop over the bottom half of the mask\n", " # set alpha (A) to zero to turn pixel transparent\n", " alpha = 0\n", " mask.putpixel((x, y), (0, 0, 0, alpha))\n", "\n", "# save the mask\n", "mask_name = \"bottom_half_mask.png\"\n", "mask_filepath = os.path.join(image_dir, mask_name)\n", "mask.save(mask_filepath)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Perform Edit\n", "\n", "Now we supply our image, caption and mask to the API to get 5 examples of edits to our image" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ImagesResponse(created=1701994167, data=[Image(b64_json=None, revised_prompt=None, url='https://oaidalleapiprodscus.blob.core.windows.net/private/org-9HXYFy8ux4r6aboFyec2OLRf/user-8OA8IvMYkfdAcUZXgzAXHS7d/img-9UOVGC7wB8MS2Q7Rwgj0fFBq.png?st=2023-12-07T23%3A09%3A27Z&se=2023-12-08T01%3A09%3A27Z&sp=r&sv=2021-08-06&sr=b&rscd=inline&rsct=image/png&skoid=6aaadede-4fb3-4698-a8f6-684d7786b067&sktid=a48cca56-e6da-484e-a814-9c849652bcb3&skt=2023-12-07T16%3A40%3A37Z&ske=2023-12-08T16%3A40%3A37Z&sks=b&skv=2021-08-06&sig=MsRMZ1rN434bVdWr%2B9kIoqu9CIrvZypZBfkQPTOhCl4%3D')])\n" ] } ], "source": [ "# edit an image\n", "\n", "# call the OpenAI API\n", "edit_response = client.images.edit(\n", " image=open(generated_image_filepath, \"rb\"), # from the generation section\n", " mask=open(mask_filepath, \"rb\"), # from right above\n", " prompt=prompt, # from the generation section\n", " n=1,\n", " size=\"1024x1024\",\n", " response_format=\"url\",\n", ")\n", "\n", "# print response\n", "print(edit_response)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "# save the image\n", "edited_image_name = \"edited_image.png\" # any name you like; the filetype should be .png\n", "edited_image_filepath = os.path.join(image_dir, edited_image_name)\n", "edited_image_url = edit_response.data[0].url # extract image URL from response\n", "edited_image = requests.get(edited_image_url).content # download the image\n", "\n", "with open(edited_image_filepath, \"wb\") as image_file:\n", " image_file.write(edited_image) # write the image to the file" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# print the original image\n", "print(generated_image_filepath)\n", "display(Image.open(generated_image_filepath))\n", "\n", "# print edited image\n", "print(edited_image_filepath)\n", "display(Image.open(edited_image_filepath))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.9.9 ('openai')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.3" }, "vscode": { "interpreter": { "hash": "365536dcbde60510dc9073d6b991cd35db2d9bac356a11f5b64279a5e6708b97" } } }, "nbformat": 4, "nbformat_minor": 4 }