imaginAIry/scripts/prep_vocab_lists.py
Bryce Drennan 31c2160e21
feature: prompt expansion (#51)
Use `{}` to randomly pull values from lists. Values separated by `|` and enclosed in `{ }` are drawn at random without repetition. A value surrounded by `_ _` pulls from the phrase list of the same name. Folders containing .txt phraselist files may be specified via
`--prompt_library_path`; the option may be given multiple times. Built-in categories:

      3d-term, adj-architecture, adj-beauty, adj-detailed, adj-emotion, adj-general, adj-horror, animal, art-movement,
      art-site, artist, artist-botanical, artist-surreal, aspect-ratio, bird, body-of-water, body-pose, camera-brand,
      camera-model, color, cosmic-galaxy, cosmic-nebula, cosmic-star, cosmic-term, dinosaur, eyecolor, f-stop,
      fantasy-creature, fantasy-setting, fish, flower, focal-length, food, fruit, games, gen-modifier, hair, hd,
      iso-stop, landscape-type, national-park, nationality, neg-weight, noun-beauty, noun-fantasy, noun-general,
      noun-horror, occupation, photo-term, pop-culture, pop-location, punk-style, quantity, rpg-item, scenario-desc,
      skin-color, spaceship, style, tree-species, trippy, world-heritage-site

   Examples:

   `imagine "a {red|black} dog" -r 2 --seed 0` will generate both "a red dog" and "a black dog"

   `imagine "a {_color_} dog" -r 4 --seed 0` will generate four differently colored dogs. The colors will be pulled from an included
   phraselist of colors.

   `imagine "a {_spaceship_|_fruit_|hot air balloon}. low-poly" -r 4 --seed 0` will generate images of spaceships or fruits or a hot air balloon
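
The expansion rules above can be illustrated with a minimal sketch. This is not imaginAIry's implementation: `PHRASE_LISTS`, `slot_values`, and `expand_prompts` are hypothetical names, the phrase lists are tiny stand-ins for the bundled ones, and "non-repeating" is assumed to mean each slot walks through a shuffled pool before any value repeats.

```python
import random
import re

# Hypothetical stand-ins for the bundled phrase lists (assumption, not real data).
PHRASE_LISTS = {
    "color": ["red", "green", "blue", "black"],
    "spaceship": ["shuttle", "star cruiser"],
    "fruit": ["apple", "pear"],
}

SLOT_RE = re.compile(r"\{([^{}]*)\}")  # matches one {...} slot
REF_RE = re.compile(r"_([\w-]+)_")     # matches a _listname_ reference


def slot_values(body, phrase_lists):
    """Flatten a slot body such as "_fruit_|hot air balloon" into candidate values."""
    values = []
    for option in body.split("|"):
        ref = REF_RE.fullmatch(option)
        if ref:
            # _name_ pulls every phrase from the list with that name
            values.extend(phrase_lists[ref.group(1)])
        else:
            values.append(option)
    return values


def expand_prompts(template, n, seed=0, phrase_lists=None):
    """Return n prompts; each slot cycles through its own shuffled pool."""
    phrase_lists = PHRASE_LISTS if phrase_lists is None else phrase_lists
    rng = random.Random(seed)

    # Build one shuffled pool per slot, in the order the slots appear.
    pools = []
    for body in SLOT_RE.findall(template):
        values = slot_values(body, phrase_lists)
        rng.shuffle(values)
        pools.append(values)

    prompts = []
    for i in range(n):
        # The i-th prompt takes the i-th value of every pool, wrapping around
        # only once a pool has been used up.
        picks = iter([pool[i % len(pool)] for pool in pools])
        prompts.append(SLOT_RE.sub(lambda m: next(picks), template))
    return prompts


if __name__ == "__main__":
    for prompt in expand_prompts("a {_color_} dog", 4, seed=0):
        print(prompt)
```

Shuffling once per slot and indexing into the pool keeps the draw random while guaranteeing that no value repeats until the pool is exhausted, which matches the `-r 2` example producing both "a red dog" and "a black dog".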

   Credit to [noodle-soup-prompts](https://github.com/WASasquatch/noodle-soup-prompts/) where most, but not all, of the wordlists originate.
2022-10-08 18:34:35 -07:00

73 lines
1.8 KiB
Python

import gzip
import json
import os.path
import time
from contextlib import contextmanager

CURDIR = os.path.dirname(__file__)

# Source categories whose names start with any of these prefixes are skipped.
excluded_prefixes = ["identity", "gender", "body", "celeb", "color"]

# Phrases that are dropped from every category.
excluded_words = {
    "sex",
    "sexy",
    "sex appeal",
    "sex symbol",
    "young",
    "youth",
    "youthful",
    "child",
    "baby",
}

# Maps source category names to the names used for the output files.
category_renames = {
    "3d-terms": "3d-term",
    "animals": "animal",
    "camera": "camera-model",
    "camera-manu": "camera-brand",
    "cosmic-terms": "cosmic-term",
    "details": "adj-detailed",
    "foods": "food",
    "games": "video-game",
    "movement": "art-movement",
    "noun-emote": "adj-emotion",
    "natl-park": "national-park",
    "portrait-type": "body-pose",
    "punk": "punk-style",
    "site": "art-site",
    "tree": "tree-species",
    "water": "body-of-water",
    "wh-site": "world-heritage-site",
}


@contextmanager
def timed(description):
    """Print how long the wrapped block took to run."""
    start = time.perf_counter()
    yield
    end = time.perf_counter()
    duration = end - start
    print(f"{description} {duration:2f}")


def make_txts():
    """Convert the noodle-soup-prompts JSON into one gzipped phrase list per category."""
    src_json = f"{CURDIR}/../downloads/noodle-soup-prompts/nsp_pantry.json"
    dst_folder = f"{CURDIR}/../imaginairy/vendored/noodle_soup_prompts"
    with open(src_json, "r", encoding="utf-8") as f:
        prompts = json.load(f)

    # Keep only the categories that do not match an excluded prefix.
    categories = []
    for c in prompts.keys():
        if any(c.startswith(p) for p in excluded_prefixes):
            continue
        categories.append(c)
    categories.sort()

    for c in categories:
        print((c, len(prompts[c])))
        # Drop excluded phrases, then lowercase what remains.
        filtered_phrases = [p.lower() for p in prompts[c] if p not in excluded_words]
        renamed_c = category_renames.get(c, c)
        # Write one phrase per line, UTF-8 encoded, gzip-compressed.
        with gzip.open(f"{dst_folder}/{renamed_c}.txt.gz", "wb") as f:
            for p in filtered_phrases:
                f.write(f"{p}\n".encode("utf-8"))


if __name__ == "__main__":
    make_txts()