Ted Sanders 782359f9c4 fixes bad paths in embeddings playground readme 1 year ago

Embeddings Playground

embeddings_playground.py is a single-page Streamlit app for experimenting with OpenAI embeddings.

Installation

Before running, install required dependencies with:

pip install -r apps/embeddings-playground/requirements.txt

(You may need to change the path to match your local path.)

To verify that Streamlit installed correctly, run streamlit hello.

Usage

Run the script with:

streamlit run apps/embeddings-playground/embeddings_playground.py

(Again, you may need to change the path to match your local path.)

In the app, first select your choice of:

  • distance metric (we recommend cosine)
  • embedding model (we recommend text-embedding-ada-002 for most use cases, as of May 2023)
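
As a rough illustration of what a distance metric computes, here is a minimal sketch of cosine distance next to Euclidean distance in plain Python. The function names and the toy vectors are assumptions made for this example; they are not taken from embeddings_playground.py, and real embeddings have many more dimensions.

```python
import math


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Return the straight-line (L2) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy 3-dimensional vectors standing in for real embeddings (hypothetical values).
u = [1.0, 0.0, 0.0]
v = [0.0, 1.0, 0.0]
w = [2.0, 0.0, 0.0]

print(cosine_distance(u, v))     # orthogonal vectors: 1.0
print(cosine_distance(u, w))     # same direction, different length: 0.0
print(euclidean_distance(u, w))  # 1.0
```

Cosine distance ignores vector length and compares direction only, which is why u and w are at distance 0 under cosine but not under Euclidean distance. For vectors of unit length, the two metrics produce the same ranking.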

Then enter the strings to compare (the number of strings is adjustable). Click rank to see:

  • the ranked list of strings, sorted by distance from the first string
  • a heatmap showing the distance between each pair of strings
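
The two outputs above can be sketched as follows, assuming the embeddings have already been fetched. This is an illustration only, not the app's actual code: rank_by_distance, distance_matrix, the placeholder strings, and the 2-dimensional vectors are all hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def rank_by_distance(strings, embeddings):
    """Return (string, distance) pairs sorted by distance from the first string."""
    query = embeddings[0]
    scored = [(s, cosine_distance(query, e)) for s, e in zip(strings, embeddings)]
    return sorted(scored, key=lambda pair: pair[1])


def distance_matrix(embeddings):
    """Return the pairwise distances as a list of rows (the data behind a heatmap)."""
    return [[cosine_distance(a, b) for b in embeddings] for a in embeddings]


# Placeholder strings and made-up 2-dimensional vectors, not real embeddings.
strings = ["alpha", "beta", "gamma"]
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

ranked = rank_by_distance(strings, embeddings)
matrix = distance_matrix(embeddings)

for text, distance in ranked:
    print(f"{distance:.3f}  {text}")
```

The first string always ranks first, because its distance from itself is 0. The matrix is symmetric with a zero diagonal, so a heatmap of it mirrors across the diagonal.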

Example

Here's an example distance matrix for 8 strings related to "The sky is blue":

[Figure: example distance matrix (example_distance_matrix.png)]

From these distance pairs, you can see:

  • embeddings measure topical similarity more than logical similarity (e.g., "The sky is blue" is very close to "The sky is not blue")
  • punctuation affects embeddings (e.g., "THE. SKY. IS. BLUE!" is only the third closest to "The sky is blue")
  • within-language pairs are closer than across-language pairs (e.g., "El cielo es azul" is closer to "El cielo es rojo" than to "The sky is blue")

Experiment with your own strings to see what you can learn.