petals



Alexander Borzunov	351e96bc46	10 months ago
Penalize servers that use relays during rebalancing (#428)

Servers accessible only via relays may cause issues if they are the only type of server holding certain blocks. Specifically, a connection to such a server may be unstable or may open only after a delay. This PR lowers their self-reported throughput so that the rebalancing algorithm prefers directly reachable servers for hosting each block.
routing	Penalize servers that use relays during rebalancing (#428)	10 months ago
__init__.py	Add LLaMA support (#323)	11 months ago
from_pretrained.py	Add LLaMA support (#323)	11 months ago
inference_session.py	Add connect_timeout (#423)	10 months ago
lm_head.py	Fix llama's lm_head.weight.requires_grad (#330)	11 months ago
ptune.py	Fix llama's lm_head.weight.requires_grad (#330)	11 months ago
remote_forward_backward.py	Add connect_timeout (#423)	10 months ago
remote_generation.py	Raise error for unexpected .generate() kwargs (#315)	1 year ago
remote_sequential.py	Support peft LoRA adapters (#335)	11 months ago
sequential_autograd.py	Add connect_timeout (#423)	10 months ago
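
The relay penalty described in commit #428 can be illustrated with a minimal sketch. This is not the code from the PR: the names `ServerInfo`, `reported_throughput`, `rank_for_block`, and the penalty factor `RELAY_PENALTY = 0.2` are all hypothetical, chosen only to show how scaling down a relayed server's self-reported throughput makes a throughput-based ranking prefer directly reachable servers.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical penalty factor; the value actually used by the PR is not stated here.
RELAY_PENALTY = 0.2


@dataclass
class ServerInfo:
    """Illustrative stand-in for a server's self-reported state."""
    name: str
    measured_throughput: float  # throughput before any penalty is applied
    uses_relay: bool            # True if the server is reachable only via a relay


def reported_throughput(server: ServerInfo, relay_penalty: float = RELAY_PENALTY) -> float:
    """Return the throughput a server would report, scaled down if it relies on a relay."""
    if server.uses_relay:
        return server.measured_throughput * relay_penalty
    return server.measured_throughput


def rank_for_block(servers: List[ServerInfo]) -> List[ServerInfo]:
    """Order candidate servers for a block by reported throughput, highest first."""
    return sorted(servers, key=reported_throughput, reverse=True)


if __name__ == "__main__":
    candidates = [
        ServerInfo("relayed", measured_throughput=120.0, uses_relay=True),
        ServerInfo("direct", measured_throughput=100.0, uses_relay=False),
    ]
    for s in rank_for_block(candidates):
        print(s.name, reported_throughput(s))
```

In this sketch the relayed server has the higher raw throughput, yet the directly reachable server ranks first once the penalty is applied, which is the preference the commit message describes.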