petals/tests
justheuristic 5af04524dd
Split long sequences into chunks (#403)
This PR avoids the OOMs that occur when processing long sequences, which are caused by the huge attention logits matrices.

Co-authored-by: Alexander Borzunov <borzunov.alexander@gmail.com>
10 months ago
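
The idea in the commit message above can be illustrated with a minimal sketch. This is not the code from PR #403: the function names (`attention`, `chunked_attention`) and the `chunk_size` parameter are hypothetical, and plain Python lists stand in for tensors. It assumes causal self-attention and shows that processing queries in chunks gives the same result while each step only needs logits for `chunk_size` query rows instead of the full sequence.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values, query_offset=0):
    """Causal attention for a block of queries (hypothetical helper).

    `query_offset` is the position of the first query in the full sequence.
    In a tensor implementation this call would hold a logits block of
    len(queries) x len(keys) entries.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    dim = len(values[0])
    outputs = []
    for i, q in enumerate(queries):
        last_visible = query_offset + i  # causal mask: no attention to later positions
        logits = [
            scale * sum(a * b for a, b in zip(q, k))
            for k in keys[: last_visible + 1]
        ]
        weights = softmax(logits)
        # zip stops at len(weights), so masked-out values are never read
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs


def chunked_attention(queries, keys, values, chunk_size):
    """Process queries chunk by chunk (hypothetical helper).

    Peak logits size drops from seq_len x seq_len to chunk_size x seq_len.
    """
    outputs = []
    for start in range(0, len(queries), chunk_size):
        chunk = queries[start : start + chunk_size]
        outputs.extend(attention(chunk, keys, values, query_offset=start))
    return outputs


def make_sequence(seq_len, dim, phase):
    """Deterministic toy data, used only for the demonstration."""
    return [[math.sin(phase + i + 0.5 * d) for d in range(dim)] for i in range(seq_len)]
```

A short usage example, comparing the chunked result with the single-pass one on toy data:

```python
q = make_sequence(7, 3, 0.0)
k = make_sequence(7, 3, 1.0)
v = make_sequence(7, 3, 2.0)
full = attention(q, k, v)
chunked = chunked_attention(q, k, v, chunk_size=3)
print(full == chunked)
```

Chunking along the query dimension is only one possible design. It works here because each output row depends on its own query and the keys and values, not on other queries.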
conftest.py Fix logging: do not duplicate lines, enable colors in Colab (#156) 1 year ago
test.id Add automated tests (#23) 2 years ago
test_aux_functions.py Report inference, forward, and network RPS separately (#358) 10 months ago
test_block_exact_match.py Add LLaMA support (#323) 11 months ago
test_chained_calls.py Add LLaMA support (#323) 11 months ago
test_dtype.py Add LLaMA support (#323) 11 months ago
test_full_model.py Split long sequences into chunks (#403) 10 months ago
test_peft.py Support peft LoRA adapters (#335) 10 months ago
test_priority_pool.py Fix issues related to `petals` as a module (#159) 1 year ago
test_remote_sequential.py Support loading blocks in 4-bit (QLoRA NF4 format, disabled by default) (#333) 11 months ago
test_sequence_manager.py Test that bitsandbytes is not imported when it's not used (#351) 10 months ago
test_server_stats.py Add LLaMA support (#323) 11 months ago
test_tensor_parallel.py Add LLaMA support (#323) 11 months ago
test_utils.py Support peft LoRA adapters (#335) 10 months ago