petals



Alexander Borzunov 21c3526ec1 Start SequenceManager's thread only after first .make_sequence() (#301)	1 year ago

Why?
- We'd like to avoid excess threads for the original sequence manager when we only use its slices (e.g. when we add adapters or need only a subset of model blocks).
- If we create a sequence manager just before a fork (e.g. in a web app backend or a multi-thread benchmark), we'd like to avoid excess threads in the original process and only use this thread in child processes where we actually call `.make_sequence()`.
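
The idea in this commit message, deferring thread creation until the first `.make_sequence()` call, can be illustrated with a minimal sketch. This is not the actual Petals `SequenceManager`: the class `LazySequenceManager`, its `thread_started` property, the `shutdown()` method, and the placeholder return value are all hypothetical names introduced here for illustration only.

```python
import threading


class LazySequenceManager:
    """Hypothetical stand-in, NOT the real Petals SequenceManager."""

    def __init__(self):
        # No thread is created here, so building the object (or forking
        # right after building it) does not spawn any background thread.
        self._thread = None
        self._lock = threading.Lock()
        self._stop = threading.Event()

    def _run(self):
        # Placeholder for background work; here it only waits for shutdown.
        self._stop.wait()

    def _ensure_thread_started(self):
        # The lock guards against two callers starting the thread at once.
        with self._lock:
            if self._thread is None:
                self._thread = threading.Thread(target=self._run, daemon=True)
                self._thread.start()

    @property
    def thread_started(self):
        return self._thread is not None

    def make_sequence(self, start, end):
        # The first call starts the thread; later calls reuse it.
        self._ensure_thread_started()
        # Placeholder result (assumption): just the requested block indices.
        return list(range(start, end))

    def shutdown(self):
        self._stop.set()
        if self._thread is not None:
            self._thread.join()


manager = LazySequenceManager()
started_before = manager.thread_started  # no thread yet
blocks = manager.make_sequence(0, 3)     # thread starts on first use
started_after = manager.thread_started
manager.shutdown()
```

With this pattern, an object that is only sliced or copied never pays for a thread, and a process that forks before calling `make_sequence()` leaves thread creation to whichever child actually uses it.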
__init__.py	Make Petals a pip-installable package (attempt 2) (#102)	2 years ago
backend.py	Use inference mode in _MergedInferenceStep (#275)	1 year ago
block_selection.py	Use get_logger(__name__) instead of get_logger(__file__) (#265)	1 year ago
block_utils.py	Speed up loading blocks using init with meta weights (#285)	1 year ago
handler.py	Use get_logger(__name__) instead of get_logger(__file__) (#265)	1 year ago
memory_cache.py	Use get_logger(__name__) instead of get_logger(__file__) (#265)	1 year ago
reachability.py	Improve reachability logs (#253)	1 year ago
server.py	Start SequenceManager's thread only after first .make_sequence() (#301)	1 year ago
task_pool.py	Use get_logger(__name__) instead of get_logger(__file__) (#265)	1 year ago
task_prioritizer.py	Merge inference pools into one to increase inference speed (#225)	1 year ago
throughput.py	Use get_logger(__name__) instead of get_logger(__file__) (#265)	1 year ago