You cannot select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
- rpc_inference: server will now accept allocation timeout from user, defaults to no timeout - bugfix: inference timeout is now measured from the moment the request is received - previously, you would have to wait for your timeout plus the time it takes to sort through the queue (other users' timeout) - now, you get AllocationFailed if you had to wait for over (timeout) seconds - regardless of other users - a request for inference with no timeout will now fail instantly if there is not enough memory available - dtype number of bytes is now correctly determined for int, bool & other types --------- Co-authored-by: Your Name <you@example.com> Co-authored-by: Alexander Borzunov <borzunov.alexander@gmail.com> Co-authored-by: Aleksandr Borzunov <hxrussia@gmail.com> |
10 months ago | |
---|---|---|
.. | ||
petals | 10 months ago |