petals/tests
justheuristic 012f840f7e
Use length-weighted sampling in routing for inference (#204)
This pull request implements a simple routing optimization that is (1) greedy and (2) latency-agnostic; it should speed up both of our use cases.

Why this exists: our effort to merge full routing (ping-aware, throughput-aware, Dijkstra) is in a sorry state, spread across several branches; merging it into main would take many days.

Co-authored-by: Aleksandr Borzunov <borzunov.alexander@gmail.com>
2023-01-11 23:26:09 +03:00
scripts Fix arguments in remove_old_models.py (#153) 2022-12-13 19:01:12 +03:00
conftest.py Fix logging: do not duplicate lines, enable colors in Colab (#156) 2022-12-15 09:12:18 +04:00
test_aux_functions.py Add local tensor-parallel fwd/bwd (#143) 2023-01-03 18:35:51 +03:00
test_block_exact_match.py Add local tensor-parallel fwd/bwd (#143) 2023-01-03 18:35:51 +03:00
test_chained_calls.py Make Petals a pip-installable package (attempt 2) (#102) 2022-11-30 10:41:13 +04:00
test_full_model.py Fix logging: do not duplicate lines, enable colors in Colab (#156) 2022-12-15 09:12:18 +04:00
test_linear8bitlt.py Support --load_in_8bit on pre-Turing GPUs (#113) 2022-12-02 15:10:24 +03:00
test_priority_pool.py Fix issues related to petals as a module (#159) 2022-12-16 09:09:06 +04:00
test_remote_sequential.py Add local tensor-parallel fwd/bwd (#143) 2023-01-03 18:35:51 +03:00
test_sequence_manager.py Use length-weighted sampling in routing for inference (#204) 2023-01-11 23:26:09 +03:00
test_tensor_parallel.py Increase tolerances in test_tp_block (#196) 2023-01-11 17:54:24 +03:00
test_utils.py Implement RemoteSequential slicing and extra repr, add tests (#30) 2022-07-19 04:28:04 +03:00
test.id Add automated tests (#23) 2022-07-16 01:59:23 +03:00
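
The latest commit (#204) describes greedy, latency-agnostic, length-weighted sampling for routing. The listing does not show the implementation, so the following is only a minimal sketch of one way such sampling could work, not the code from the pull request. It assumes each server advertises a contiguous half-open range of blocks. The names `Span` and `make_route` are hypothetical and are not taken from Petals.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical record: one server holding blocks [start, end)."""
    server: str
    start: int
    end: int


def make_route(spans, num_blocks, rng=None):
    """Greedily build a chain of servers covering blocks [0, num_blocks).

    At each step, pick one of the spans that contains the current block,
    with probability proportional to how many remaining blocks it covers.
    Latency and throughput are deliberately ignored (assumption).
    """
    rng = rng or random.Random()
    route = []
    current = 0
    while current < num_blocks:
        candidates = [s for s in spans if s.start <= current < s.end]
        if not candidates:
            raise ValueError(f"no server holds block {current}")
        # Weight = number of blocks this span would serve starting from `current`.
        weights = [min(s.end, num_blocks) - current for s in candidates]
        chosen = rng.choices(candidates, weights=weights, k=1)[0]
        stop = min(chosen.end, num_blocks)
        route.append((chosen.server, current, stop))
        current = stop
    return route


if __name__ == "__main__":
    demo_spans = [Span("a", 0, 4), Span("b", 0, 2), Span("c", 2, 4)]
    print(make_route(demo_spans, num_blocks=4, rng=random.Random(0)))
```

Under these assumptions, longer spans are chosen more often, so a route tends to use fewer servers and fewer hops, while shorter spans still receive some traffic.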