petals/src/bloom
Alexander Borzunov ab41223b17
Fix dtype- and device-related client issues (#98)
This PR:

1. Makes inference/forward/backward calls on the client remember the dtype and device of the source tensors, then move/cast the outputs to the same dtype and device. This way:
    - Users don't need to change the code that launches `RemoteSequential` to run it on a different device.
    - `model.generate()` now supports both CPU and GPU.

2. Sets `low_cpu_mem_usage=True` by default and the client's request timeout to 20 sec.

3. Removes excess casts to float32 left in Dmitry's code.

4. (minor) Improves error messages.
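
The dtype/device handling from item 1 can be illustrated with a minimal sketch. It is not the code from this PR: the helper `run_preserving_dtype_and_device`, the stand-in `fake_remote_block`, and the choice of float32 on CPU as the transport format are all assumptions made for illustration.

```python
import torch


def run_preserving_dtype_and_device(remote_call, inputs, compute_dtype=torch.float32):
    """Call remote_call on a CPU copy of inputs, then restore the caller's dtype/device.

    Hypothetical helper: it only illustrates the "remember, then cast back" idea.
    """
    # Remember where the source tensor lives and how it is stored.
    source_dtype, source_device = inputs.dtype, inputs.device
    # Assumed transport format: a detached CPU tensor in compute_dtype.
    outputs = remote_call(inputs.detach().to(device="cpu", dtype=compute_dtype))
    # Move/cast the result back so the calling code needs no device-specific changes.
    return outputs.to(device=source_device, dtype=source_dtype)


def fake_remote_block(hidden_states):
    # Stand-in for a remote forward pass (assumption): just doubles the values.
    return hidden_states * 2


x = torch.ones(2, 3, dtype=torch.bfloat16)
y = run_preserving_dtype_and_device(fake_remote_block, x)
```

With this pattern the caller gets back a tensor with the same dtype and device as `x`, whether `x` sits on the CPU or on a GPU.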
1 year ago
| File | Last commit | Updated |
| --- | --- | --- |
| `__init__.py` | black-isort | 2 years ago |
| `block.py` | Add automated tests (#23) | 2 years ago |
| `from_pretrained.py` | integrate mixed-8bit model (#39) | 2 years ago |
| `model.py` | Fix dtype- and device-related client issues (#98) | 1 year ago |
| `ops.py` | Miscellaneous fixes to automatic tests (#35) | 2 years ago |