Update to upstream changes

- AutoGPTQ manual installation removed (it is included in requirements)
- Softprompt config removed
- Build date print-out added to `docker-entrypoint.sh`
- `README.md` updated
pull/10/head
Atinoda 12 months ago
parent 524dad64c9
commit 7caaaa4a7c

Dockerfile
@@ -33,10 +33,6 @@ RUN git clone https://github.com/oobabooga/GPTQ-for-LLaMa.git -b cuda /app/repos
# Build and install default GPTQ ('quant_cuda')
ARG TORCH_CUDA_ARCH_LIST="6.1;7.0;7.5;8.0;8.6+PTX"
RUN cd /app/repositories/GPTQ-for-LLaMa/ && python3 setup_cuda.py install
-# Install auto-gptq
-RUN cd /app/repositories/ && git clone https://github.com/PanQiWei/AutoGPTQ.git && \
-cd AutoGPTQ && pip3 install .
FROM nvidia/cuda:11.8.0-devel-ubuntu22.04 AS base
# Runtime pre-reqs
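With the manual clone-and-build step removed, AutoGPTQ should now arrive through the application's own requirements install. A quick way to confirm that inside a running container — assuming the PyPI package name `auto-gptq` (as published from `PanQiWei/AutoGPTQ`) and a hypothetical container name:

```sh
# Check that AutoGPTQ is present without the removed manual build step.
# 'pip3 show' exits non-zero when the package is not installed.
docker exec text-generation-webui pip3 show auto-gptq \
  && echo "auto-gptq installed via requirements" \
  || echo "auto-gptq missing - check the image's requirements install step"
```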

README.md
@@ -1,5 +1,5 @@
# Introduction
-This project dockerises the deployment of [oobabooga/text-generation-webui](https://github.com/oobabooga/text-generation-webui) and its variants. It provides a default configuration (corresponding to a vanilla deployment of the application) as well as pre-configured support for other set-ups (e.g., latest `llama-cpp-python` with GPU offloading, the more recent `triton` and `cuda` branches of GPTQ).
+This project dockerises the deployment of [oobabooga/text-generation-webui](https://github.com/oobabooga/text-generation-webui) and its variants. It provides a default configuration (corresponding to a vanilla deployment of the application) as well as pre-configured support for other set-ups (e.g., latest `llama-cpp-python` with GPU offloading, the more recent `triton` and `cuda` branches of GPTQ). The images are available on Docker Hub: [https://hub.docker.com/r/atinoda/text-generation-webui](https://hub.docker.com/r/atinoda/text-generation-webui)
*The goal of this project is to be to [oobabooga/text-generation-webui](https://github.com/oobabooga/text-generation-webui) what [AbdBarho/stable-diffusion-webui-docker](https://github.com/AbdBarho/stable-diffusion-webui-docker) is to [AUTOMATIC1111/stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui).*
@@ -20,10 +20,10 @@ Choose the desired variant by setting the image `:tag` in `docker-compose.yml` t
| Variant | Description |
|---|---|
| `default` | Implementation of the vanilla deployment from source. Also includes pre-installed `AutoGPTQ` library from `PanQiWei/AutoGPTQ`. |
-| `triton` | Updated GPTQ using the latest `triton` branch from `qwopqwop200/GPTQ-for-LLaMa`. Suitable for Linux only. |
-| `cuda` | Updated GPTQ using the latest `cuda` branch from `qwopqwop200/GPTQ-for-LLaMa`. |
-| `monkey-patch` | Use LoRAs in 4-Bit GPTQ mode. |
-| `llama-cublas` | CUDA GPU offloading enabled for llama-cpp. Use by setting option `n-gpu-layers` > 0. |
+| `triton` | Updated `GPTQ-for-llama` using the latest `triton` branch from `qwopqwop200/GPTQ-for-LLaMa`. Suitable for Linux only. |
+| `cuda` | Updated `GPTQ-for-llama` using the latest `cuda` branch from `qwopqwop200/GPTQ-for-LLaMa`. |
+| `monkey-patch` | Use LoRAs in 4-Bit `GPTQ-for-llama` mode. |
+| `llama-cublas` | CUDA GPU offloading enabled for `llama-cpp`. Use by setting option `n-gpu-layers` > 0. |
*See: [oobabooga/text-generation-webui/blob/main/docs/GPTQ-models-(4-bit-mode).md](https://github.com/oobabooga/text-generation-webui/blob/main/docs/GPTQ-models-(4-bit-mode).md) and [oobabooga/text-generation-webui/blob/main/docs/llama.cpp-models.md](https://github.com/oobabooga/text-generation-webui/blob/main/docs/llama.cpp-models.md) for more information on variants.*
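Picking a variant amounts to selecting the image tag. A minimal sketch using `docker run` instead of the provided compose file — the repository name comes from the Docker Hub link above, the `llama-cublas` tag from the table, and `EXTRA_LAUNCH_ARGS` from `docker-entrypoint.sh` further down; the host port mapping and the `--n-gpu-layers` value are assumptions:

```sh
# Run the llama-cublas variant with GPU offloading enabled for llama-cpp.
# EXTRA_LAUNCH_ARGS is expanded by docker-entrypoint.sh into extra CLI flags.
docker run --rm -it --gpus all \
  -e EXTRA_LAUNCH_ARGS="--n-gpu-layers 32" \
  -p 7860:7860 \
  atinoda/text-generation-webui:llama-cublas
```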
@@ -52,7 +52,7 @@ Three commonly used ports are exposed:
*Extensions may use additional ports - check the application documentation for more details.*
### Volumes
-The provided example docker compose maps several volumes from the local `config` directory into the container: `loras, models, presets, prompts, softprompts, training`. If these folders are empty, they will be initialised when the container is run.
+The provided example docker compose maps several volumes from the local `config` directory into the container: `loras, models, presets, prompts, training`. If these folders are empty, they will be initialised when the container is run.
*If you are getting an error about missing files, try clearing these folders and letting the service re-populate them.*
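The same mapping can be expressed on the command line. A sketch assuming the container-side paths live under `/app/` (consistent with the `ls /app/"$config_dir"` check in `docker-entrypoint.sh` below):

```sh
# Bind-mount each config folder; empty ones are re-populated by the entrypoint.
volume_args=()
for dir in loras models presets prompts training; do
  volume_args+=(-v "$(pwd)/config/$dir:/app/$dir")
done
docker run --rm -it "${volume_args[@]}" atinoda/text-generation-webui:default
```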

docker-entrypoint.sh
@@ -10,7 +10,7 @@ function ctrl_c {
trap ctrl_c SIGTERM SIGINT SIGQUIT SIGHUP
# Generate default configs if empty
CONFIG_DIRECTORIES=("loras" "models" "presets" "prompts" "softprompts" "training/datasets" "training/formats")
CONFIG_DIRECTORIES=("loras" "models" "presets" "prompts" "training/datasets" "training/formats")
for config_dir in "${CONFIG_DIRECTORIES[@]}"; do
if [ -z "$(ls /app/"$config_dir")" ]; then
echo "*** Initialising config for: '$config_dir' ***"
@@ -38,6 +38,10 @@ fi
echo "=== (This version is $COMMITS_BEHIND commits behind origin) ==="
cd $cur_dir
+# Print build date
+BUILD_DATE=$(cat /build_date.txt)
+echo "=== Image build date: $BUILD_DATE ==="
# Assemble CMD and extra launch args
eval "extra_launch_args=($EXTRA_LAUNCH_ARGS)"
LAUNCHER=($@ $extra_launch_args)
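For the printed date to exist, `/build_date.txt` has to be written when the image is built; that step is not part of this diff. A minimal sketch of the shell command a Dockerfile `RUN` step might execute, with the timestamp format an assumption:

```sh
# Assumed build-time counterpart: record the UTC build time so that
# docker-entrypoint.sh can read and print it at container startup.
date -u +"%Y-%m-%dT%H:%M:%SZ" > /build_date.txt
```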
