petals/src
Alexander Borzunov ab41223b17
Fix dtype- and device-related client issues (#98)
This PR:

1. Makes the client's inference/forward/backward calls remember the dtype and device of the source tensors, then cast/move the outputs to the same dtype and device. This way:
    - Users don't need to change the code that launches `RemoteSequential` to run it on a different device.
    - `model.generate()` now supports both CPU and GPU.

2. Sets `low_cpu_mem_usage=True` by default and the client's request timeout to 20 sec.

3. Removes excess casts to float32 left in Dmitry's code.

4. (minor) Improves error messages.
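
The behavior described in item 1 can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `restore_dtype_and_device`, `FakeTensor`, and `remote_forward` are hypothetical names, and `FakeTensor` is only a stand-in so the sketch runs without PyTorch. The decorator is duck-typed, so it should also work with `torch.Tensor`, which exposes `.dtype`, `.device`, and `.to(device=..., dtype=...)`.

```python
import functools
from dataclasses import dataclass, replace


def restore_dtype_and_device(fn):
    """Hypothetical helper: return outputs with the dtype/device of the first input."""

    @functools.wraps(fn)
    def wrapper(first, *args, **kwargs):
        # Remember how the caller's tensor is typed and where it lives.
        src_dtype, src_device = first.dtype, first.device
        outputs = fn(first, *args, **kwargs)
        # Cast/move the result back; handles a single tensor or a tuple of them.
        if isinstance(outputs, tuple):
            return tuple(out.to(device=src_device, dtype=src_dtype) for out in outputs)
        return outputs.to(device=src_device, dtype=src_dtype)

    return wrapper


@dataclass(frozen=True)
class FakeTensor:
    """Stand-in for torch.Tensor so the sketch runs without PyTorch installed."""

    dtype: str
    device: str

    def to(self, device=None, dtype=None):
        # Mimics the keyword form of Tensor.to(): unspecified fields stay unchanged.
        return replace(self, device=device or self.device, dtype=dtype or self.dtype)


@restore_dtype_and_device
def remote_forward(hidden_states):
    # Pretend the remote call always answers with float32 tensors on the CPU.
    return FakeTensor(dtype="float32", device="cpu")


result = remote_forward(FakeTensor(dtype="bfloat16", device="cuda:0"))
print(result)
```

With a wrapper like this, the calling code stays the same regardless of whether the inputs are on CPU or GPU, which is the effect the first bullet describes.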
1 year ago

| Name | Last commit | Updated |
| --- | --- | --- |
| bloom | Fix dtype- and device-related client issues (#98) | 1 year ago |
| client | Fix dtype- and device-related client issues (#98) | 1 year ago |
| server | Add various server timeouts, lower --max_batch_size and --inference_max_length defaults (#97) | 1 year ago |
| utils | Add various server timeouts, lower --max_batch_size and --inference_max_length defaults (#97) | 1 year ago |
| __init__.py | Measure and cache network & compute throughput (#21) | 2 years ago |
| constants.py | Use public swarm by default (#92) | 1 year ago |
| data_structures.py | Implement RemoteSequential slicing and extra repr, add tests (#30) | 2 years ago |
| dht_utils.py | remove transformer block, implement as sequential of size 1 (#54) | 2 years ago |