petals

Alexander Borzunov ab41223b17 Fix dtype- and device-related client issues (#98)	1 year ago

This PR:

1. Makes inference/forward/backward calls on client remember the dtype and device of source tensors, then move/cast the outputs to the same dtype/device. This way:
   - Users don't need to make changes in the code launching `RemoteSequential` to make it run on a different device.
   - `model.generate()` also starts to support both CPU and GPU.
2. Sets default `low_cpu_mem_usage=True`, client's request timeout to 20 sec.
3. Removes excess casts to float32 left in Dmitry's code.
4. (minor) Improves error messages.
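
The first item of the commit message above (remember the source dtype and device, then move/cast the outputs back) can be illustrated with a minimal sketch. This is not the code from the PR: `preserve_dtype_and_device` and `remote_call` are hypothetical names, and the wrapper assumes only that its first argument is tensor-like, meaning it exposes `.dtype`, `.device`, and `.to(device=..., dtype=...)` as `torch.Tensor` does.

```python
import functools


def preserve_dtype_and_device(remote_call):
    """Wrap a call so its output matches the dtype/device of its first input.

    Hypothetical helper for illustration; not part of the Petals API.
    Works with any tensor-like object that has `.dtype`, `.device`, and
    `.to(device=..., dtype=...)`, such as `torch.Tensor`.
    """

    @functools.wraps(remote_call)
    def wrapper(inputs, *args, **kwargs):
        # Remember how the caller's tensor is typed and where it lives.
        source_dtype, source_device = inputs.dtype, inputs.device
        # The wrapped call may return a result with a different dtype/device.
        outputs = remote_call(inputs, *args, **kwargs)
        # Move/cast the result back so the caller sees matching tensors.
        return outputs.to(device=source_device, dtype=source_dtype)

    return wrapper
```

With a wrapper of this shape, a caller that passes a float16 GPU tensor would get a float16 GPU tensor back even if the wrapped call produces float32 CPU tensors, which is why the launching code would not need to change when switching devices.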
bloom	Fix dtype- and device-related client issues (#98)	1 year ago
client	Fix dtype- and device-related client issues (#98)	1 year ago
server	Add various server timeouts, lower --max_batch_size and --inference_max_length defaults (#97)	1 year ago
utils	Add various server timeouts, lower --max_batch_size and --inference_max_length defaults (#97)	1 year ago
__init__.py	Measure and cache network & compute throughput (#21)	2 years ago
constants.py	Use public swarm by default (#92)	1 year ago
data_structures.py	Implement RemoteSequential slicing and extra repr, add tests (#30)	2 years ago
dht_utils.py	remove transformer block, implement as sequential of size 1 (#54)	2 years ago