Commit Graph

194 Commits

Author SHA1 Message Date
dbaranchuk
b3a3fb1547 reduce a number of default handlers 2022-08-04 05:52:31 +03:00
dbaranchuk
6165aca545 support various dtype for inference 2022-08-04 05:50:20 +03:00
dbaranchuk
b8a78b8254 add assert on dtype for load_in_8bit 2022-08-03 15:48:45 +03:00
dbaranchuk
8fb7c6fead update installation instructions 2022-08-03 15:17:18 +03:00
dbaranchuk
07218b0f1c Merge remote-tracking branch 'origin/main' into 8bit-model 2022-08-03 15:14:45 +03:00
dbaranchuk
c4b5342623 update deploy scripts 2022-08-03 14:35:50 +03:00
dbaranchuk
4801d01cc9 build cpuonly bitsandbytes 2022-08-03 14:10:02 +03:00
dbaranchuk
0874803dc6 rm demo tmp scripts 2022-08-03 13:45:02 +03:00
dbaranchuk
6e1696b839 update hivemind and bitsandbytes 2022-08-03 13:44:18 +03:00
dbaranchuk
e297ae606f black 2022-08-03 12:47:36 +03:00
dbaranchuk
5200dc7029 add revision support 2022-08-03 12:42:39 +03:00
Alexander Borzunov
7d39d46966
Use "PETALS" as the readme title (#40)
Since we've chosen the system name, let's use it in the repo name and the readme title.
2022-08-02 18:48:54 +04:00
dbaranchuk
564f6edb07 set throughput=1.0 to fix zero throughput problem 2022-07-31 06:17:48 +03:00
dbaranchuk
8b3d66167f set throughput=1 to fix 0 throughput problem 2022-07-31 06:13:59 +03:00
dbaranchuk
123e57a5fc update initial peer 2022-07-31 04:29:39 +03:00
dbaranchuk
2aa236a88e fix typos 2022-07-31 04:28:38 +03:00
dbaranchuk
1ea397a88b black 2022-07-31 04:00:24 +03:00
dbaranchuk
fced9a8f30 fix typo 2022-07-31 03:54:19 +03:00
dbaranchuk
805e71a35b update demo deploy scripts 2022-07-31 03:53:07 +03:00
dbaranchuk
66ce6d9669 make it compatible with low_cpu_mem_usage=True 2022-07-31 03:52:02 +03:00
dbaranchuk
afc1de2627 add script for remote benchmarks 2022-07-31 01:02:43 +03:00
dbaranchuk
ffeceebd4b Fix bug with model duplication in RAM 2022-07-30 02:50:07 +03:00
dbaranchuk
02619d307d black & isort 2022-07-29 01:04:03 +03:00
dbaranchuk
a621549c06 integrate mixed-8bit model 2022-07-29 01:01:05 +03:00
Dmitry Baranchuk
04a2b6f5e3
Support various backend dtypes & async serialization (#38) 2022-07-28 18:33:58 +03:00
Artem Chumachenko
d989b94614
Pack of Inference Changes (#37)
* Return multibatch mode

* Add tests

* fixes
2022-07-27 10:19:45 +04:00
Dmitry Baranchuk
6573076883
Sequential and parallel forward / backward (#36) 2022-07-23 14:32:39 +03:00
justheuristic
f0cffbf67e
Miscellaneous fixes to automatic tests (#35)
1. __Reduce memory usage in test_full_model__
    - previously, loading the full model would consistently fail if GitHub enforced the memory limit [example](https://github.com/bigscience-workshop/distributed-bloom/runs/7473920049?check_suite_focus=true)
    - the new version uses accelerate to save 2GB of peak memory that was previously spent on holding both the reference model AND its state dict at the same time, only to load that state dict :)
2. __Safer delays when creating servers__
    - run-tests will now wait for a few seconds after creating the first server and before creating the second one, to make sure that the first server creates a DHT instance that subsequent servers can connect to.
    - also increased the wait time after creating servers by 30 seconds to make sure we load the model in time even when bumping into slow remotes on HF side
3. __Fix environment variables in CI to avoid build conflicts__
    - the previous code was using a wrong environment variable that was always "main". The current one will correctly resolve branch name, both in main and on pull request.
    - For reference, below you can find sample environments when running CI in both cases: on pull request and on push to main.

<details>
<summary> Environment variables when building this branch (on pull request) </summary>

SELENIUM_JAR_PATH=/usr/share/java/selenium-server.jar GOROOT_1_17_X64=/opt/hostedtoolcache/go/1.17.12/x64 CONDA=/usr/share/miniconda GITHUB_WORKSPACE=/home/runner/work/distributed-bloom/distributed-bloom JAVA_HOME_11_X64=/usr/lib/jvm/temurin-11-jdk-amd64 GITHUB_PATH=/home/runner/work/_temp/_runner_file_commands/add_path_0aba811a-a04b-40a2-ba42-79efb2723e9e GITHUB_ACTION=__run_2 JAVA_HOME=/usr/lib/jvm/temurin-11-jdk-amd64 GITHUB_RUN_NUMBER=98 RUNNER_NAME=GitHub Actions 3 GRADLE_HOME=/usr/share/gradle-7.5 XDG_CONFIG_HOME=/home/runner/.config DOTNET_SKIP_FIRST_TIME_EXPERIENCE=1 ANT_HOME=/usr/share/ant JAVA_HOME_8_X64=/usr/lib/jvm/temurin-8-jdk-amd64 HOMEBREW_PREFIX=/home/linuxbrew/.linuxbrew pythonLocation=/opt/hostedtoolcache/Python/3.9.13/x64 GITHUB_REF_TYPE=branch HOMEBREW_CLEANUP_PERIODIC_FULL_DAYS=3650 BOOTSTRAP_HASKELL_NONINTERACTIVE=1 *** PIPX_BIN_DIR=/opt/pipx_bin DEPLOYMENT_BASEPATH=/opt/runner GITHUB_ACTIONS=true ANDROID_NDK_LATEST_HOME=/usr/local/lib/android/sdk/ndk/24.0.8215888 GITHUB_SHA=3b457e8a14e5ecb0d65d6e4c0e9161f7756a8861 POWERSHELL_DISTRIBUTION_CHANNEL=GitHub-Actions-ubuntu20 DOTNET_MULTILEVEL_LOOKUP=0 GITHUB_REF=refs/pull/35/merge RUNNER_OS=Linux GITHUB_REF_PROTECTED=false HOME=/home/runner GITHUB_API_URL=https://api.github.com/ LANG=C.UTF-8 BLOOM_TESTING_WRITE_TOKEN=*** RUNNER_TRACKING_ID=github_cc9b46e4-56a1-40c5-ba08-5a91e21f0f95 STATS_KEEPALIVE=false RUNNER_ARCH=X64 RUNNER_TEMP=/home/runner/work/_temp EDGEWEBDRIVER=/usr/local/share/edge_driver GITHUB_ENV=/home/runner/work/_temp/_runner_file_commands/set_env_0aba811a-a04b-40a2-ba42-79efb2723e9e GITHUB_EVENT_PATH=/home/runner/work/_temp/_github_workflow/event.json INVOCATION_ID=8f0072e74f2847c0851e7ff9b5e4af7c GITHUB_EVENT_NAME=pull_request GITHUB_RUN_ID=2720198689 JAVA_HOME_17_X64=/usr/lib/jvm/temurin-17-jdk-amd64 ANDROID_NDK_HOME=/usr/local/lib/android/sdk/ndk-bundle GITHUB_STEP_SUMMARY=/home/runner/work/_temp/_runner_file_commands/step_summary_0aba811a-a04b-40a2-ba42-79efb2723e9e 
HOMEBREW_NO_AUTO_UPDATE=1 GITHUB_ACTOR=justheuristic NVM_DIR=/home/runner/.nvm SGX_AESM_ADDR=1 GITHUB_RUN_ATTEMPT=1 ANDROID_HOME=/usr/local/lib/android/sdk GITHUB_GRAPHQL_URL=https://api.github.com/graphql ACCEPT_EULA=Y RUNNER_USER=runner USER=runner GITHUB_SERVER_URL=https://github.com/ HOMEBREW_CELLAR=/home/linuxbrew/.linuxbrew/Cellar PIPX_HOME=/opt/pipx GECKOWEBDRIVER=/usr/local/share/gecko_driver CHROMEWEBDRIVER=/usr/local/share/chrome_driver SHLVL=0 ANDROID_SDK_ROOT=/usr/local/lib/android/sdk VCPKG_INSTALLATION_ROOT=/usr/local/share/vcpkg HOMEBREW_REPOSITORY=/home/linuxbrew/.linuxbrew/Homebrew RUNNER_TOOL_CACHE=/opt/hostedtoolcache ImageVersion=20220717.1 DOTNET_NOLOGO=1 GITHUB_REF_NAME=35/merge STATS_PFS=true GRAALVM_11_ROOT=/usr/local/graalvm/graalvm-ce-java11-22.1.0 GITHUB_JOB=convert-model LD_LIBRARY_PATH=/opt/hostedtoolcache/Python/3.9.13/x64/lib XDG_RUNTIME_DIR=/run/user/1001 AZURE_EXTENSION_DIR=/opt/az/azcliextensions PERFLOG_LOCATION_SETTING=RUNNER_PERFLOG GITHUB_REPOSITORY=bigscience-workshop/distributed-bloom ANDROID_NDK_ROOT=/usr/local/lib/android/sdk/ndk-bundle CHROME_BIN=/usr/bin/google-chrome GOROOT_1_18_X64=/opt/hostedtoolcache/go/1.18.4/x64 GITHUB_RETENTION_DAYS=90 JOURNAL_STREAM=8:23653 RUNNER_WORKSPACE=/home/runner/work/distributed-bloom LEIN_HOME=/usr/local/lib/lein LEIN_JAR=/usr/local/lib/lein/self-installs/leiningen-2.9.8-standalone.jar GITHUB_ACTION_REPOSITORY= PATH=/opt/hostedtoolcache/Python/3.9.13/x64/bin:/opt/hostedtoolcache/Python/3.9.13/x64:/home/linuxbrew/.linuxbrew/bin:/home/linuxbrew/.linuxbrew/sbin:/home/runner/.local/bin:/opt/pipx_bin:/home/runner/.cargo/bin:/home/runner/.config/composer/vendor/bin:/usr/local/.ghcup/bin:/home/runner/.dotnet/tools:/snap/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin RUNNER_PERFLOG=/home/runner/perflog GITHUB_BASE_REF=main CI=true SWIFT_PATH=/usr/share/swift/usr/bin ImageOS=ubuntu20 GITHUB_REPOSITORY_OWNER=bigscience-workshop 
GITHUB_HEAD_REF=fix-branch-name GITHUB_ACTION_REF= GITHUB_WORKFLOW=Tests DEBIAN_FRONTEND=noninteractive AGENT_TOOLSDIRECTORY=/opt/hostedtoolcache GOROOT_1_16_X64=/opt/hostedtoolcache/go/1.16.15/x64 _=/usr/bin/env
</details>
<details>
<summary> Environment variables when building in main (on push) </summary>

SELENIUM_JAR_PATH=/usr/share/java/selenium-server.jar GOROOT_1_17_X64=/opt/hostedtoolcache/go/1.17.11/x64 CONDA=/usr/share/miniconda GITHUB_WORKSPACE=/home/runner/work/distributed-bloom/distributed-bloom JAVA_HOME_11_X64=/usr/lib/jvm/temurin-11-jdk-amd64 GITHUB_PATH=/home/runner/work/_temp/_runner_file_commands/add_path_cd6c1ed2-0d0f-496d-b7a6-ffa476dcc144 GITHUB_ACTION=__run_2 JAVA_HOME=/usr/lib/jvm/temurin-11-jdk-amd64 GITHUB_RUN_NUMBER=53 RUNNER_NAME=GitHub Actions 3 GRADLE_HOME=/usr/share/gradle-7.4.2 XDG_CONFIG_HOME=/home/runner/.config DOTNET_SKIP_FIRST_TIME_EXPERIENCE=1 ANT_HOME=/usr/share/ant JAVA_HOME_8_X64=/usr/lib/jvm/temurin-8-jdk-amd64 HOMEBREW_PREFIX=/home/linuxbrew/.linuxbrew pythonLocation=/opt/hostedtoolcache/Python/3.9.13/x64 GITHUB_REF_TYPE=branch HOMEBREW_CLEANUP_PERIODIC_FULL_DAYS=3650 BOOTSTRAP_HASKELL_NONINTERACTIVE=1 *** PIPX_BIN_DIR=/opt/pipx_bin DEPLOYMENT_BASEPATH=/opt/runner GITHUB_ACTIONS=true ANDROID_NDK_LATEST_HOME=/usr/local/lib/android/sdk/ndk/24.0.8215888 GITHUB_SHA=49242d81006454d687ff3293c49f6bf234793627 POWERSHELL_DISTRIBUTION_CHANNEL=GitHub-Actions-ubuntu20 DOTNET_MULTILEVEL_LOOKUP=0 GITHUB_REF=refs/heads/main RUNNER_OS=Linux GITHUB_REF_PROTECTED=true HOME=/home/runner GITHUB_API_URL=https://api.github.com/ LANG=C.UTF-8 BLOOM_TESTING_WRITE_TOKEN=*** RUNNER_TRACKING_ID=github_7668f06a-99e1-4ed1-81e9-46d75fab3f33 STATS_KEEPALIVE=false RUNNER_ARCH=X64 RUNNER_TEMP=/home/runner/work/_temp EDGEWEBDRIVER=/usr/local/share/edge_driver GITHUB_ENV=/home/runner/work/_temp/_runner_file_commands/set_env_cd6c1ed2-0d0f-496d-b7a6-ffa476dcc144 GITHUB_EVENT_PATH=/home/runner/work/_temp/_github_workflow/event.json INVOCATION_ID=3dadac48981b4a679a33224db89be1ed GITHUB_EVENT_NAME=push GITHUB_RUN_ID=2680158280 JAVA_HOME_17_X64=/usr/lib/jvm/temurin-17-jdk-amd64 ANDROID_NDK_HOME=/usr/local/lib/android/sdk/ndk-bundle GITHUB_STEP_SUMMARY=/home/runner/work/_temp/_runner_file_commands/step_summary_cd6c1ed2-0d0f-496d-b7a6-ffa476dcc144 
HOMEBREW_NO_AUTO_UPDATE=1 GITHUB_ACTOR=justheuristic NVM_DIR=/home/runner/.nvm SGX_AESM_ADDR=1 GITHUB_RUN_ATTEMPT=1 ANDROID_HOME=/usr/local/lib/android/sdk GITHUB_GRAPHQL_URL=https://api.github.com/graphql ACCEPT_EULA=Y RUNNER_USER=runner USER=runner GITHUB_SERVER_URL=https://github.com/ HOMEBREW_CELLAR=/home/linuxbrew/.linuxbrew/Cellar PIPX_HOME=/opt/pipx GECKOWEBDRIVER=/usr/local/share/gecko_driver CHROMEWEBDRIVER=/usr/local/share/chrome_driver SHLVL=0 ANDROID_SDK_ROOT=/usr/local/lib/android/sdk VCPKG_INSTALLATION_ROOT=/usr/local/share/vcpkg HOMEBREW_REPOSITORY=/home/linuxbrew/.linuxbrew/Homebrew RUNNER_TOOL_CACHE=/opt/hostedtoolcache ImageVersion=20220710.1 DOTNET_NOLOGO=1 GITHUB_REF_NAME=main STATS_PFS=true GRAALVM_11_ROOT=/usr/local/graalvm/graalvm-ce-java11-22.1.0 GITHUB_JOB=convert-model LD_LIBRARY_PATH=/opt/hostedtoolcache/Python/3.9.13/x64/lib XDG_RUNTIME_DIR=/run/user/1001 AZURE_EXTENSION_DIR=/opt/az/azcliextensions PERFLOG_LOCATION_SETTING=RUNNER_PERFLOG GITHUB_REPOSITORY=bigscience-workshop/distributed-bloom CHROME_BIN=/usr/bin/google-chrome ANDROID_NDK_ROOT=/usr/local/lib/android/sdk/ndk-bundle GOROOT_1_18_X64=/opt/hostedtoolcache/go/1.18.3/x64 GITHUB_RETENTION_DAYS=90 JOURNAL_STREAM=8:22000 RUNNER_WORKSPACE=/home/runner/work/distributed-bloom LEIN_HOME=/usr/local/lib/lein LEIN_JAR=/usr/local/lib/lein/self-installs/leiningen-2.9.8-standalone.jar GITHUB_ACTION_REPOSITORY= PATH=/opt/hostedtoolcache/Python/3.9.13/x64/bin:/opt/hostedtoolcache/Python/3.9.13/x64:/home/linuxbrew/.linuxbrew/bin:/home/linuxbrew/.linuxbrew/sbin:/home/runner/.local/bin:/opt/pipx_bin:/home/runner/.cargo/bin:/home/runner/.config/composer/vendor/bin:/usr/local/.ghcup/bin:/home/runner/.dotnet/tools:/snap/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin RUNNER_PERFLOG=/home/runner/perflog GITHUB_BASE_REF= CI=true SWIFT_PATH=/usr/share/swift/usr/bin ImageOS=ubuntu20 GITHUB_REPOSITORY_OWNER=bigscience-workshop GITHUB_HEAD_REF= 
GITHUB_ACTION_REF= GITHUB_WORKFLOW=Tests DEBIAN_FRONTEND=noninteractive AGENT_TOOLSDIRECTORY=/opt/hostedtoolcache GOROOT_1_16_X64=/opt/hostedtoolcache/go/1.16.15/x64 _=/usr/bin/env
</details>
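
A minimal sketch of the branch-name resolution described in item 3, based only on the two environment dumps above: `GITHUB_HEAD_REF` is the source branch on a pull request (`fix-branch-name`) and is empty on push, while `GITHUB_REF_NAME` is `35/merge` on the pull request and `main` on push. The helper name `resolve_branch_name` is hypothetical and is not taken from this repository's CI configuration.

```python
import os


def resolve_branch_name(env=None):
    """Return the branch a CI run was triggered for (hypothetical helper)."""
    env = os.environ if env is None else env
    # On pull requests GITHUB_HEAD_REF holds the source branch; on push it is empty.
    head_ref = env.get("GITHUB_HEAD_REF", "")
    if head_ref:
        return head_ref
    # On push GITHUB_REF_NAME is the branch name itself, e.g. "main".
    return env.get("GITHUB_REF_NAME", "")
```

With the values from the dumps, the pull-request environment would resolve to `fix-branch-name` and the push environment to `main`.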



Co-authored-by: Dmitry Baranchuk <dmitrybaranchuk@gmail.com>
2022-07-22 22:38:40 +03:00
Dmitry Baranchuk
7de3acf909
Fix is_subsequence (#32) 2022-07-19 22:37:19 +03:00
justheuristic
f0c7383181
Implement RemoteSequential slicing and extra repr, add tests (#30)
- finish renaming RemoteSequenceInfo -> RemoteSequenceManager (why: if it was an *Info, a user would expect it to behave like a dataclass, whereas in actuality the class is doing heavy network interactions on its own)
- implement RemoteSequenceManager.make_sequence (from https://pastebin.com/uXgy2U8B )
- make RemoteSequentialInferenceSession use RemoteSequenceManager.make_sequence
- make tests pass again
- make it possible to create inference session without RemoteTransformerBlock
- make a standalone test for RemoteSequential
- rollback convert-model

Co-authored-by: Tim Dettmers <tim.dettmers@gmail.com>
2022-07-19 04:28:04 +03:00
Artem Chumachenko
6ee942e915
Add GenerationMixin class (#29)
Add a generation abstraction that uses inference_session.
Added modes:
- Greedy, top-k/top-p sampling
- Multibatch generation
- Constraint abstraction
In the future, we will add prefix-tuned generation, beam search and more hf-like stuff.
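
For readers unfamiliar with the listed modes, here is a rough sketch of greedy selection and top-k/top-p filtering over a single step's logits. It is a generic illustration in plain Python and does not reflect the GenerationMixin code in this commit; all function names are hypothetical.

```python
import math
import random


def softmax(logits):
    """Convert raw logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_next_token(logits):
    """Greedy mode: pick the index of the largest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])


def filter_top_k_top_p(probs, top_k=None, top_p=None):
    """Keep the top_k most likely tokens, then the smallest prefix whose
    cumulative probability reaches top_p; renormalize what is left."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]
    if top_p is not None:
        kept, cumulative = [], 0.0
        for i in ranked:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        ranked = kept
    total = sum(probs[i] for i in ranked)
    return {i: probs[i] / total for i in ranked}


def sample_next_token(logits, top_k=None, top_p=None, rng=random):
    """Sampling mode: draw one token id from the filtered distribution."""
    kept = filter_top_k_top_p(softmax(logits), top_k, top_p)
    tokens = list(kept)
    return rng.choices(tokens, weights=[kept[t] for t in tokens], k=1)[0]
```

Setting `top_k=1` makes sampling coincide with the greedy choice, which is a convenient sanity check.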
2022-07-19 00:44:16 +03:00
justheuristic
177d81bea6
CI: use GIT_REF_NAME instead of GIT_HEAD_REF (#28)
use GIT_REF_NAME instead of GIT_HEAD_REF
2022-07-16 11:43:24 +03:00
justheuristic
49242d8100
[WIP] Fix CI runs in master (#27)
* set default head ref

* check for head ref

* check branch
2022-07-16 04:06:37 +03:00
justheuristic
01ed4db750
Fix default branch in CI (#26)
Set default head ref if the environment variable is not set
2022-07-16 03:11:27 +03:00
justheuristic
ccdcefe405
Add instructions to test the full model (#25)
add instructions to test the full model
2022-07-16 02:45:10 +03:00
justheuristic
eb0a6be716
Clean up readme (#24)
Remove some deprecated sections of the README and turn on CI for the main branch
2022-07-16 02:11:17 +03:00
justheuristic
e2711a033b
Add automated tests (#23)
This PR will run basic tests automatically on each subsequent PR

- convert a small model on every PR
- run existing tests on every PR
- enforce black / isort
- require checks on merge
- make sure tests are not flappy

Co-authored-by: Alexander Borzunov <hxrussia@gmail.com>
Co-authored-by: Dmitry Baranchuk <dmitrybaranchuk@gmail.com>
2022-07-16 01:59:23 +03:00
Dmitry Baranchuk
f5463812ad
Shallow prompt tuning (#22) 2022-07-15 17:37:26 +03:00
Alexander Borzunov
7e9f337a63
Remove excess line from readme 2022-07-14 22:27:17 +04:00
Dmitry Baranchuk
db966a76dd
rm heuristic for num_handlers 2022-07-14 02:35:23 +03:00
Dmitry Baranchuk
f114a6d417
set default num_handlers=16 2022-07-14 02:32:58 +03:00
Aleksandr Borzunov
f3cf5f4d8d Fix choose_best_blocks() 2022-07-13 22:59:01 +00:00
Alexander Borzunov
75856e4769
Measure and cache network & compute throughput (#21) 2022-07-13 05:46:26 +04:00
Dmitry Baranchuk
ac7df18dfa
Merge pull request #19 from learning-at-home/lm_head
add a modified LM head
2022-07-12 14:25:24 +03:00
Dmitry Baranchuk
fd0bf064f3
minor refactoring 2022-07-12 14:22:11 +03:00
Alexander Borzunov
aba43f1308
Implement block selection on servers (#20) 2022-07-12 14:42:30 +04:00
dbaranchuk
21e1f42f04 mv set_requires_grad to remote_model 2022-07-10 23:41:05 +03:00
dbaranchuk
5168a3405a fix comments 2022-07-10 20:35:13 +03:00
dbaranchuk
79280c4371 refactoring 2022-07-10 20:27:38 +03:00
dbaranchuk
6bffeff0a1 fix 2022-07-10 20:17:11 +03:00