ab41223b17
This PR:

1. Makes inference/forward/backward calls on the client remember the dtype and device of the source tensors, then move/cast the outputs to the same dtype and device. This way:
   - Users don't need to change the code launching `RemoteSequential` to make it run on a different device.
   - `model.generate()` also starts to support both CPU and GPU.
2. Sets `low_cpu_mem_usage=True` by default and sets the client's request timeout to 20 sec.
3. Removes excess casts to float32 left in Dmitry's code.
4. (minor) Improves error messages.

Committed 1 year ago.

File | Last updated | |
---|---|---|
__init__.py | 1 year ago | |
inference_session.py | 1 year ago | |
remote_forward_backward.py | 2 years ago | |
remote_generation.py | 1 year ago | |
remote_model.py | 1 year ago | |
remote_sequential.py | 1 year ago | |
sequence_manager.py | 1 year ago | |
sequential_autograd.py | 1 year ago | |
spending_policy.py | 2 years ago |
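
The dtype/device handling described in the commit message can be illustrated with a minimal sketch. This is not the code from this PR: `match_input_dtype_and_device` is a hypothetical helper name, and the sketch assumes that the first positional argument determines the target dtype and device. It is duck-typed, so it works with any object that exposes `.dtype`, `.device` and `.to(device=..., dtype=...)`, which includes `torch.Tensor`.

```python
import functools


def match_input_dtype_and_device(remote_call):
    """Wrap a call so its outputs come back with the first input's dtype and device.

    Hypothetical helper for illustration only (not the actual client code).
    """

    @functools.wraps(remote_call)
    def wrapper(*tensors, **kwargs):
        # Remember where the caller's data lives before anything is moved or cast.
        source_dtype = tensors[0].dtype
        source_device = tensors[0].device

        outputs = remote_call(*tensors, **kwargs)

        # Move/cast every output back, whatever dtype/device the call produced.
        if isinstance(outputs, (tuple, list)):
            return tuple(
                out.to(device=source_device, dtype=source_dtype) for out in outputs
            )
        return outputs.to(device=source_device, dtype=source_dtype)

    return wrapper
```

With PyTorch, decorating a function that always answers in float32 on CPU and calling it with a bfloat16 tensor on a GPU would return a bfloat16 tensor on that same GPU, so the calling code does not need to know where the intermediate computation ran.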