43ac6016ac
Currently, the schemas use `torch.float32`, so all inputs and outputs are converted to float32 before sending and after receiving, on both servers and clients. This creates a huge slowdown for the system. This PR adds:

* Schemas that use the server's `--torch_dtype` argument (default: `torch.bfloat16` for BLOOM-176B)
* An option for the client to request a specific output compression. Use case 1: the client sends quantized inputs and expects quantized outputs in return. Use case 2: the client uses quantization for gradients w.r.t. activations, but keeps gradients w.r.t. __prompts__ as-is for greater precision.
* A comment explaining the purpose of NoSpendingPolicy, since we likely won't have it for the workshop
* A test with custom compression (janky implementation, for testing purposes only)

Co-authored-by: justheuristic <justheuristic@gmail.com>
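The core idea above can be sketched as follows. This is a minimal illustration, not the project's real API: the helper names (`resolve_dtype`, `cast_for_rpc`) and the dtype map are assumptions; only the behavior described in the PR (casting to the server's serving dtype instead of forcing everything through float32) is taken from the source.

```python
import torch

# Hypothetical mapping from a --torch_dtype CLI string to a torch dtype
# (names and defaults here are assumptions for illustration).
DTYPE_MAP = {
    "float32": torch.float32,
    "float16": torch.float16,
    "bfloat16": torch.bfloat16,
}

def resolve_dtype(arg: str = "bfloat16") -> torch.dtype:
    """Resolve the server's --torch_dtype argument into a torch dtype."""
    return DTYPE_MAP[arg]

def cast_for_rpc(tensor: torch.Tensor, dtype: torch.dtype) -> torch.Tensor:
    """Cast a tensor to the serving dtype only when needed, avoiding the
    unconditional round-trip through float32 that the PR removes."""
    return tensor if tensor.dtype == dtype else tensor.to(dtype)

x = torch.randn(2, 3)  # float32 activations on the client side
y = cast_for_rpc(x, resolve_dtype("bfloat16"))
# y is bfloat16: each element now takes 2 bytes instead of 4,
# halving the payload sent over the wire.
```

A usage note: with this scheme, a client that already holds tensors in the serving dtype pays no conversion cost at all, since `cast_for_rpc` is a no-op when the dtypes match.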