mirror of https://github.com/bigscience-workshop/petals, synced 2024-10-31 09:20:41 +00:00
012f840f7e
This pull request implements a simple (1) greedy, (2) latency-agnostic routing optimization that should speed up both our use cases. Why this exists: our effort to merge full routing (ping-aware, throughput-aware, Dijkstra) is in a sorry state, split across several branches; merging it into main would take many days. Co-authored-by: Aleksandr Borzunov <borzunov.alexander@gmail.com> |
||
---|---|---|
scripts | ||
conftest.py | ||
test_aux_functions.py | ||
test_block_exact_match.py | ||
test_chained_calls.py | ||
test_full_model.py | ||
test_linear8bitlt.py | ||
test_priority_pool.py | ||
test_remote_sequential.py | ||
test_sequence_manager.py | ||
test_tensor_parallel.py | ||
test_utils.py | ||
test.id |