petals

History

Alexander Borzunov 8666653cf5 Fix routing through relay, default network RPS, --token, logging, readme (#399)	11 months ago
* Hide GeneratorExit in _iterate_inference_steps()
* Update README.md about `--public_name`
* Use .from_pretrained(..., use_auth_token=token) instead of token=token until it's fully supported across HF libs
* Use default network speed 25 Mbit/s
* Apply relay penalty in max-throughput routing
* Replace RPS with "tokens/sec per block" in logs
* Increase default expiration
__init__.py	Add AutoDistributed{Model, ModelForCausalLM, ModelForSequenceClassification} (#329)	12 months ago
asyncio.py	Shield alloc & free from cancellation (#163)	1 year ago
auto_config.py	Fix routing through relay, default network RPS, --token, logging, readme (#399)	11 months ago
convert_block.py	Support Llama 2 (#379)	11 months ago
disk_cache.py	Allow free_disk_space_for() remove arbitrary files from Petals cache (#339)	11 months ago
generation_algorithms.py	Fix typo in generation_algorithms.py (#364)	11 months ago
generation_constraints.py	Fix typos with codespell (#126)	2 years ago
hf_auth.py	Support Llama 2 (#379)	11 months ago
logging.py	Remove unused imports and attributes (#324)	12 months ago
misc.py	Support Llama 2 (#379)	11 months ago
peft.py	Support Llama 2 (#379)	11 months ago
ping.py	Implement shortest-path routing for inference (#362)	11 months ago
random.py	Implement shortest-path routing for inference (#362)	11 months ago
version.py	Add LLaMA support (#323)	12 months ago