petals/src/petals/server
Alexander Borzunov 6ba63c6cc8
Fix output shape when resuming generation (#211)
Before this PR, `model.generate()` returned one excess token when resuming generation with an existing session (the last token of the previous session, `session.last_token_id`). This behavior is unexpected and inconvenient for downstream apps, so this PR changes it before it's too late.
1 year ago
__init__.py Make Petals a pip-installable package (attempt 2) (#102) 1 year ago
backend.py Return available cache size in rpc_info() (#191) 1 year ago
block_selection.py Don't switch blocks if it makes swarm disjoint (#210) 1 year ago
block_utils.py Bump transformers to 4.25.1 (#151) 1 year ago
handler.py Report server version and dht.client_mode in rpc_info(), check for updates on startup (#209) 1 year ago
memory_cache.py Add local tensor-parallel fwd/bwd (#143) 1 year ago
reachability.py Add service checking direct reachability from peers (#195) 1 year ago
server.py Report server version and dht.client_mode in rpc_info(), check for updates on startup (#209) 1 year ago
task_pool.py Add local tensor-parallel fwd/bwd (#143) 1 year ago
task_prioritizer.py Fix typos with codespell (#126) 1 year ago
throughput.py Fix output shape when resuming generation (#211) 1 year ago