petals

Alexander Borzunov ab41223b17 Fix dtype- and device-related client issues (#98)	1 year ago

This PR:

1. Makes inference/forward/backward calls on client remember the dtype and device of source tensors, then move/cast the outputs to the same dtype/device. This way:
   - Users don't need to make changes in the code launching `RemoteSequential` to make it run on a different device.
   - `model.generate()` also starts to support both CPU and GPU.
2. Sets default `low_cpu_mem_usage=True`, client's request timeout to 20 sec.
3. Removes excess casts to float32 left in Dmitry's code.
4. (minor) Improves error messages.
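
The dtype/device handling described in point 1 of the commit message can be illustrated with a minimal PyTorch sketch. This is not the code from the PR: `call_preserving_dtype_and_device` and `fake_remote_block` are hypothetical names, and the choice of float32 on CPU as the "compute" side is an assumption made only for the example.

```python
import torch


def call_preserving_dtype_and_device(fn, inputs, compute_dtype=torch.float32, compute_device="cpu"):
    """Run `fn` on a converted copy of `inputs`, then return the result
    in the dtype and on the device of the original tensor.

    Hypothetical helper: the name and defaults are illustrative assumptions.
    """
    # Remember where the caller's tensor lives and how it is stored.
    source_dtype, source_device = inputs.dtype, inputs.device
    # Convert to whatever the callee expects (assumed: float32 on CPU).
    outputs = fn(inputs.to(device=compute_device, dtype=compute_dtype))
    # Move/cast the result back so the caller sees a matching tensor.
    return outputs.to(device=source_device, dtype=source_dtype)


def fake_remote_block(hidden_states):
    # Stand-in for a remote forward call; simply doubles the input.
    return hidden_states * 2.0


x = torch.ones(2, 3, dtype=torch.bfloat16)
y = call_preserving_dtype_and_device(fake_remote_block, x)
```

With a wrapper of this kind, the calling code stays the same whether `x` is a CPU or GPU tensor and whichever floating-point dtype it uses, which is the benefit the commit message names for code launching `RemoteSequential`.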
__init__.py	black-isort	2 years ago
block.py	Add automated tests (#23)	2 years ago
from_pretrained.py	integrate mixed-8bit model (#39)	2 years ago
model.py	Fix dtype- and device-related client issues (#98)	1 year ago
ops.py	Miscellaneous fixes to automatic tests (#35)	2 years ago