mirror of
https://github.com/bigscience-workshop/petals
synced 2024-10-31 09:20:41 +00:00
8a73b41a42
- Before this PR, `ServerState.JOINING` was announced only once. This announcement quickly expires in case of the full-size BLOOM, since loading blocks takes several minutes. This PR fixes it, so `ServerState.JOINING` is announced periodically in a thread until blocks are loaded.
- This PR also makes the `Server` class a non-thread, so it runs in the main thread and can catch `KeyboardInterrupt`. This is important, since if we are downloading blocks right now, we need to stop it and send the `ServerState.OFFLINE` message. Note that `ModuleContainer` is still a thread.
- (minor) For the sake of readability, I moved the `ModuleContainer.create()` definition, so it is now defined before `Server.__init__()` (this is because `.create()` is invoked first).
__init__.py
config.json
convert_model.py
deploy_server.sh
inference_one_block.py
local_server_config_example.cfg
remote_server_config_example.cfg
run_local_servers.sh
run_remote_servers.sh
run_server.py
speed_test.py
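
The pattern described in the commit message (re-announce `ServerState.JOINING` from a background thread while blocks load, and keep the server itself in the main thread so it can catch `KeyboardInterrupt` and announce `ServerState.OFFLINE`) can be illustrated with a minimal standard-library sketch. This is not the Petals implementation: `PeriodicAnnouncer`, `run_server`, the `announce` and `load_blocks` callables, the enum values, and the `ONLINE` state are all hypothetical names assumed here for illustration.

```python
import threading
import time
from enum import Enum


class ServerState(Enum):
    # JOINING and OFFLINE are named in the commit message;
    # ONLINE and the numeric values are assumptions for this sketch.
    OFFLINE = 0
    JOINING = 1
    ONLINE = 2


class PeriodicAnnouncer(threading.Thread):
    """Hypothetical helper: re-announce `state` every `period` seconds until stopped."""

    def __init__(self, announce, state, period):
        super().__init__(daemon=True)
        self.announce = announce
        self.state = state
        self.period = period
        self.stop_event = threading.Event()

    def run(self):
        while True:
            self.announce(self.state)
            # Event.wait() returns True as soon as stop() is called,
            # so shutdown does not have to wait out a full period.
            if self.stop_event.wait(self.period):
                break

    def stop(self):
        self.stop_event.set()
        self.join()


def run_server(load_blocks, announce, period=1.0):
    """Meant to be called from the main thread, where KeyboardInterrupt is delivered."""
    announcer = PeriodicAnnouncer(announce, ServerState.JOINING, period)
    announcer.start()
    interrupted = False
    try:
        load_blocks()  # stand-in for the slow block download
    except KeyboardInterrupt:
        interrupted = True
    finally:
        announcer.stop()  # stop re-announcing JOINING in every case
    announce(ServerState.OFFLINE if interrupted else ServerState.ONLINE)
    return not interrupted
```

For example, `run_server(lambda: time.sleep(5), print, period=1.0)` would print `ServerState.JOINING` roughly once per second and then the final state. Waiting on a `threading.Event` rather than calling `time.sleep()` in the loop lets the announcer exit promptly once loading finishes or is interrupted.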