petals/src/petals/server
Alexander Borzunov 21c3526ec1
Start SequenceManager's thread only after first .make_sequence() (#301)
**Why?**

- We'd like to avoid excess threads for the original sequence manager if we only use its slices (e.g. when we add adapters or need only a subset of model blocks).

- If we create a sequence manager just before a fork (e.g. in a web app backend or a multi-thread benchmark), we'd like to avoid excess threads in the original process and only use this thread in child processes where we actually call `.make_sequence()`.
1 year ago
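The lazy-start idea described in the commit message can be sketched as follows. This is a minimal illustration, not the actual Petals code: the class `LazySequenceManager`, its `thread_started` property, and the `shutdown()` helper are hypothetical names invented for this example, and the background loop is a placeholder for whatever periodic work the real thread does.

```python
import threading


class LazySequenceManager:
    """Hypothetical stand-in: the background thread starts on first make_sequence()."""

    def __init__(self, block_ids):
        self.block_ids = list(block_ids)
        self._thread = None              # no thread is created in the constructor
        self._lock = threading.Lock()    # guards the one-time thread start
        self._stop = threading.Event()

    @property
    def thread_started(self):
        return self._thread is not None

    def _update_loop(self):
        # Placeholder for periodic background work (assumption for this sketch).
        while not self._stop.wait(timeout=0.1):
            pass

    def _ensure_thread(self):
        # Start the thread at most once, even if called from several threads.
        with self._lock:
            if self._thread is None:
                self._thread = threading.Thread(target=self._update_loop, daemon=True)
                self._thread.start()

    def __getitem__(self, index):
        # Slicing returns a new manager and starts no thread in either object.
        if not isinstance(index, slice):
            raise TypeError("only slices are supported in this sketch")
        return LazySequenceManager(self.block_ids[index])

    def make_sequence(self):
        # First real use: only now is the background thread needed.
        self._ensure_thread()
        return list(self.block_ids)

    def shutdown(self):
        self._stop.set()
        if self._thread is not None:
            self._thread.join()


manager = LazySequenceManager(range(8))
part = manager[2:5]              # the original manager stays thread-free
sequence = part.make_sequence()  # starts a thread for the slice only
part.shutdown()
```

With this structure, an object that is only sliced, or that is created before a fork and used only in child processes, never pays for a thread of its own.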
| File | Last commit | Updated |
| --- | --- | --- |
| `__init__.py` | Make Petals a pip-installable package (attempt 2) (#102) | 2 years ago |
| `backend.py` | Use inference mode in `_MergedInferenceStep` (#275) | 1 year ago |
| `block_selection.py` | Use `get_logger(__name__)` instead of `get_logger(__file__)` (#265) | 1 year ago |
| `block_utils.py` | Speed up loading blocks using init with meta weights (#285) | 1 year ago |
| `handler.py` | Use `get_logger(__name__)` instead of `get_logger(__file__)` (#265) | 1 year ago |
| `memory_cache.py` | Use `get_logger(__name__)` instead of `get_logger(__file__)` (#265) | 1 year ago |
| `reachability.py` | Improve reachability logs (#253) | 1 year ago |
| `server.py` | Start SequenceManager's thread only after first `.make_sequence()` (#301) | 1 year ago |
| `task_pool.py` | Use `get_logger(__name__)` instead of `get_logger(__file__)` (#265) | 1 year ago |
| `task_prioritizer.py` | Merge inference pools into one to increase inference speed (#225) | 1 year ago |
| `throughput.py` | Use `get_logger(__name__)` instead of `get_logger(__file__)` (#265) | 1 year ago |