piper/notebooks/piper_model_exporter.ipynb

{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "provenance": [],
      "gpuType": "T4",
      "authorship_tag": "ABX9TyPKhrhJQuxhFJG2C1A+aMsQ",
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/rmcpantoja/piper/blob/master/notebooks/piper_model_exporter.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "# [Piper](https://github.com/rhasspy/piper) model exporter\n",
        "## ![Piper logo](https://contribute.rhasspy.org/img/logo.png)\n",
        "\n",
        "Notebook created by [rmcpantoja](http://github.com/rmcpantoja)"
      ],
      "metadata": {
        "id": "EOL-kjplZYEU"
      }
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "cellView": "form",
        "id": "FfMKr8v2RVOm"
      },
      "outputs": [],
      "source": [
        "#@title Install software\n",
        "\n",
        "print(\"Installing...\")\n",
        "!git clone -q https://github.com/rhasspy/piper\n",
        "%cd /content/piper/src/python\n",
        "!pip install -q cython>=0.29.0 espeak-phonemizer>=1.1.0 librosa>=0.9.2 numpy>=1.19.0 pytorch-lightning~=1.7.0 torch~=1.11.0\n",
        "!pip install -q onnx onnxruntime-gpu\n",
        "!bash build_monotonic_align.sh\n",
        "!apt-get install espeak-ng\n",
        "!pip install -q torchtext==0.12.0\n",
        "# fixing recent compativility isswes:\n",
        "!pip install -q torchaudio==0.11.0 torchmetrics==0.11.4\n",
        "print(\"Done!\")"
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "#@title Voice package generation section\n",
        "%cd /content/piper/src/python\n",
        "import os\n",
        "import ipywidgets as widgets\n",
        "from IPython.display import display\n",
        "import json\n",
        "from google.colab import output\n",
        "guideurl = \"https://github.com/rmcpantoja/piper/blob/master/notebooks/wav/en\"\n",
        "#@markdown #### Download:\n",
        "#@markdown **Drive ID or direct download link of the model in another cloud:**\n",
        "model_id = \"\" #@param {type:\"string\"}\n",
        "#@markdown **Drive ID or direct download link of the config.json file:**\n",
        "config_id = \"\" #@param {type:\"string\"}\n",
        "#@markdown ---\n",
        "\n",
        "#@markdown #### Creation process:\n",
        "#@markdown **Choose the language code (iso639-1 format):**\n",
        "\n",
        "#@markdown You can see a list of language codes and names [here](https://www.loc.gov/standards/iso639-2/php/English_list.php)\n",
        "\n",
        "language = \"en-us\" #@param [\"ca\", \"da\", \"de\", \"en\", \"en-us\", \"es\", \"fi\", \"fr\", \"grc\", \"is\", \"it\", \"k\", \"nb\", \"ne\", \"nl\", \"pl\", \"pt-br\", \"ru\", \"sv\", \"uk\", \"vi-vn-x-central\", \"yue\"]\n",
        "voice_name = \"Myvoice\" #@param {type:\"string\"}\n",
        "voice_name = voice_name.lower()\n",
        "quality = \"medium\" #@param [\"high\", \"low\", \"medium\", \"x-low\"]\n",
        "def start_process():\n",
        "    if not os.path.exists(\"/content/project/model.ckpt\"):\n",
        "        raise Exception(\"Could not download model! make sure the file is shareable to everyone\")\n",
        "    output.eval_js(f'new Audio(\"{guideurl}/starting.wav?raw=true\").play()')\n",
        "    !python -m piper_train.export_onnx \"/content/project/model.ckpt\" \"{export_voice_path}/{export_voice_name}.onnx\"\n",
        "    print(\"compressing...\")\n",
        "    !tar -czvf \"{packages_path}/voice-{export_voice_name}.tar.gz\" -C \"{export_voice_path}\" .\n",
        "    output.eval_js(f'new Audio(\"{guideurl}/success.wav?raw=true\").play()')\n",
        "    print(\"Done!\")\n",
        "\n",
        "export_voice_name = f\"{language}-{voice_name}-{quality}\"\n",
        "export_voice_path = \"/content/project/voice-\"+export_voice_name\n",
        "packages_path = \"/content/project/packages\"\n",
        "if not os.path.exists(export_voice_path):\n",
        "    os.makedirs(export_voice_path)\n",
        "if not os.path.exists(packages_path):\n",
        "    os.makedirs(packages_path)\n",
        "print(\"Downloading model and his config...\")\n",
        "if model_id.startswith(\"1\"):\n",
        "    !gdown -q \"{model_id}\" -O /content/project/model.ckpt\n",
        "elif model_id.startswith(\"https://drive.google.com/file/d/\"):\n",
        "    !gdown -q \"{model_id}\" -O \"/content/project/model.ckpt\" --fuzzy\n",
        "else:\n",
        "    !wget \"{model_id}\" -O \"/content/project/model.ckpt\"\n",
        "if config_id.startswith(\"1\"):\n",
        "    !gdown -q \"{config_id}\" -O \"{export_voice_path}/{export_voice_name}.onnx.json\"\n",
        "elif config_id.startswith(\"https://drive.google.com/file/d/\"):\n",
        "    !gdown -q \"{config_id}\" -O \"{export_voice_path}/{export_voice_name}.onnx.json\" --fuzzy\n",
        "else:\n",
        "    !wget \"{config_id}\" -O \"{export_voice_path}/{export_voice_name}.onnx.json\"\n",
        "#@markdown **Do you want to write a model card?**\n",
        "write_model_card = False #@param {type:\"boolean\"}\n",
        "if write_model_card:\n",
        "    with open(f\"{export_voice_path}/{export_voice_name}.onnx.json\", \"r\") as file:\n",
        "        config = json.load(file)\n",
        "    sample_rate = config[\"audio\"][\"sample_rate\"]\n",
        "    num_speakers = config[\"num_speakers\"]\n",
        "    output.eval_js(f'new Audio(\"{guideurl}/waiting.wav?raw=true\").play()')\n",
        "    text_area = widgets.Textarea(\n",
        "        description = \"fill in this following template and press start to generate the voice package\",\n",
        "        value=f'# Model card for {voice_name} ({quality})\\n\\n* Language: {language} (normaliced)\\n* Speakers: {num_speakers}\\n* Quality: {quality}\\n* Samplerate: {sample_rate}Hz\\n\\n## Dataset\\n\\n* URL: \\n* License: \\n\\n## Training\\n\\nTrained from scratch.\\nOr finetuned from: ',\n",
        "        layout=widgets.Layout(width='500px', height='200px')\n",
        "    )\n",
        "    button = widgets.Button(description='Start')\n",
        "\n",
        "    def create_model_card(button):\n",
        "        model_card_text = text_area.value.strip()\n",
        "        with open(f'{export_voice_path}/MODEL_CARD', 'w') as file:\n",
        "            file.write(model_card_text)\n",
        "        text_area.close()\n",
        "        button.close()\n",
        "        output.clear()\n",
        "        start_process()\n",
        "\n",
        "    button.on_click(create_model_card)\n",
        "\n",
        "    display(text_area, button)\n",
        "else:\n",
        "    start_process()"
      ],
      "metadata": {
        "cellView": "form",
        "id": "PqcoBb26V5xA"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "source": [
        "#@title Download/export your generated voice package\n",
        "\n",
        "#@markdown #### How do you want to export your model?\n",
        "export_mode = \"upload it to my Google Drive\" #@param [\"Download the voice package on my device (may take some time)\", \"upload it to my Google Drive\"]\n",
        "print(\"Exporting package...\")\n",
        "if export_mode == \"Download the voice package on my device (may take some time)\":\n",
        "    from google.colab import files\n",
        "    files.download(f\"{packages_path}/voice-{export_voice_name}.tar.gz\")\n",
        "    msg = \"Please wait a moment while the package is being downloaded.\"\n",
        "else:\n",
        "    voicepacks_folder = \"/content/drive/MyDrive/piper voice packages\"\n",
        "    from google.colab import drive\n",
        "    drive.mount('/content/drive')\n",
        "    if not os.path.exists(voicepacks_folder):\n",
        "        os.makedirs(voicepacks_folder)\n",
        "    !cp \"{packages_path}/voice-{export_voice_name}.tar.gz\" \"{voicepacks_folder}\"\n",
        "    msg = f\"You can find the generated voice package at: {voicepacks_folder}.\"\n",
        "print(f\"Done! {msg}\")"
      ],
      "metadata": {
        "cellView": "form",
        "id": "Hu3V9CJeWc4Y"
      },
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "source": [
        "# I want to test this model! I don't need anything else anymore?\n",
        "\n",
        "No, this is almost the end! Now you can share your generated package to your friends, upload to a cloud storage and/or test it on:\n",
        "* [The inference notebook](https://colab.research.google.com/github/rmcpantoja/piper/blob/master/notebooks/piper_inference_(ONNX).ipynb)\n",
        "  * run the cells in order for it to work correctly, as well as all the notebooks. Also, the inference notebook will guide you through the process using the enhanced accessibility feature if you wish. It's easy to use. Test it!\n",
        "* Or through the NVDA screen reader!\n",
        "  * Download and install the latest version of the [add-on](https://github.com/mush42/piper-nvda/releases).\n",
        "  * Once the plugin is installed, go to NVDA menu/preferences/settings... and look for the `Piper Voice Manager` category.\n",
        "  * Tab until you find the `Install from local file` button, press enter and select the generated package in your downloads.\n",
        "  * Once the package is selected and installed, apply the changes and restart NVDA to update the voice list.\n",
        "* Enjoy your creation!"
      ],
      "metadata": {
        "id": "IRiNBHkeoDbC"
      }
    }
  ]
}