petals

Artem Chumachenko d6f4f80f3f Fix Mixtral-related issues (#570)	2 weeks ago

This PR fixes problems related to #569:
- block initialization
- throughput calculation and cache usage
- Mixtral in tests

Beam search is removed for Mixtral and Llama for now. Those models use DynamicCache, which requires a special function to change (see https://github.com/huggingface/transformers/blob/main/src/transformers/cache_utils.py#L161).

Co-authored-by: Max Ryabinin <mryabinin0@gmail.com>

bootstrap.id	Test Llama, rebalancing, throughput eval, and all CLI scripts (#452)	9 months ago
conftest.py	Fix logging: do not duplicate lines, enable colors in Colab (#156)	1 year ago
server2.id	Test Llama, rebalancing, throughput eval, and all CLI scripts (#452)	9 months ago
test_aux_functions.py	Add customizable input tensors (#445)	9 months ago
test_block_exact_match.py	Prioritize short inference, unmerge pools for long inference (#458)	9 months ago
test_cache.py	Support macOS (#477)	8 months ago
test_chained_calls.py	Fix Mixtral-related issues (#570)	2 weeks ago
test_dtype.py	Add LLaMA support (#323)	10 months ago
test_full_model.py	Fix Mixtral-related issues (#570)	2 weeks ago
test_optimized_layers.py	Fix Mixtral-related issues (#570)	2 weeks ago
test_peft.py	Support peft LoRA adapters (#335)	10 months ago
test_priority_pool.py	Support macOS (#477)	8 months ago
test_remote_sequential.py	Fix `.generate(input_ids=...)` (#485)	8 months ago
test_sequence_manager.py	Test Llama, rebalancing, throughput eval, and all CLI scripts (#452)	9 months ago
test_server_stats.py	Test Llama, rebalancing, throughput eval, and all CLI scripts (#452)	9 months ago
test_tensor_parallel.py	Test Llama, rebalancing, throughput eval, and all CLI scripts (#452)	9 months ago
test_utils.py	Support peft LoRA adapters (#335)	10 months ago