petals/src/petals/client
Artem Chumachenko d6f4f80f3f
Fix Mixtral-related issues (#570)
This PR fixes problems related to #569:
- block initialization
- throughput calculation and cache usage
- mixtral in tests

Beam search is removed for Mixtral and Llama for now. Those models use DynamicCache, which requires a special function to modify the cache (see https://github.com/huggingface/transformers/blob/main/src/transformers/cache_utils.py#L161)
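To illustrate why this matters, here is a minimal sketch (not using transformers, with a hypothetical `reorder_cache` helper) of the cache reordering that beam search requires: after each decoding step, the surviving beams are a permutation of the previous ones, so every cached key/value entry must be re-indexed along the beam axis. With legacy tuple-of-tuples caches this is plain indexing; DynamicCache wraps the same operation in a dedicated method.

```python
def reorder_cache(cache, beam_idx):
    """Re-index a tuple-of-tuples cache along the beam axis.

    `cache` is a list of (key, value) pairs, one per layer; each entry
    is represented here as a plain list indexed by beam (a stand-in for
    a tensor's batch dimension).
    """
    return [
        ([key[i] for i in beam_idx], [value[i] for i in beam_idx])
        for key, value in cache
    ]

# Two layers, three beams; cache entries are tagged with their beam id.
cache = [
    (["k0", "k1", "k2"], ["v0", "v1", "v2"]),  # layer 0
    (["k0", "k1", "k2"], ["v0", "v1", "v2"]),  # layer 1
]
beam_idx = [2, 2, 0]  # beam 2 survived twice, beam 0 once, beam 1 was dropped
reordered = reorder_cache(cache, beam_idx)
print(reordered[0][0])  # layer-0 keys after reordering -> ['k2', 'k2', 'k0']
```

Until an equivalent hook exists for DynamicCache in the Petals client, generation for these models is limited to strategies that keep the beam order fixed (greedy and sampling).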

---------

Co-authored-by: Max Ryabinin <mryabinin0@gmail.com>
3 weeks ago
routing Store (start_block, end_block) in each DHT record for reliability (#510) 8 months ago
__init__.py Move SequenceManagerConfig -> ClientConfig, petals.dht_utils -> petals.utils.dht (#463) 9 months ago
config.py Improve default arguments for clients and servers (#530) 6 months ago
from_pretrained.py Improve default arguments for clients and servers (#530) 6 months ago
inference_session.py Bump transformers and accelerate versions (#554) 2 months ago
lm_head.py Improve default arguments for clients and servers (#530) 6 months ago
ptune.py Make client compatible with transformers' GenerationMixin (#464) 8 months ago
remote_forward_backward.py Move SequenceManagerConfig -> ClientConfig, petals.dht_utils -> petals.utils.dht (#463) 9 months ago
remote_generation.py Fix Mixtral-related issues (#570) 3 weeks ago
remote_sequential.py Make client compatible with transformers' GenerationMixin (#464) 8 months ago
sequential_autograd.py Make client compatible with transformers' GenerationMixin (#464) 8 months ago