petals/src/petals/server
justheuristic c4938bc23e
Merge inference pools into one to increase inference speed (#225)
It turns out using a separate pool for each block has led to significant slowdown, see #224 for details.
1 year ago
__init__.py Make Petals a pip-installable package (attempt 2) (#102) 1 year ago
backend.py Merge inference pools into one to increase inference speed (#225) 1 year ago
block_selection.py Don't switch blocks if it makes swarm disjoint (#210) 1 year ago
block_utils.py Bump transformers to 4.25.1 (#151) 1 year ago
handler.py Merge inference pools into one to increase inference speed (#225) 1 year ago
memory_cache.py Add local tensor-parallel fwd/bwd (#143) 1 year ago
reachability.py Add service checking direct reachability from peers (#195) 1 year ago
server.py Merge inference pools into one to increase inference speed (#225) 1 year ago
task_pool.py Add local tensor-parallel fwd/bwd (#143) 1 year ago
task_prioritizer.py Merge inference pools into one to increase inference speed (#225) 1 year ago
throughput.py Fix output shape when resuming generation (#211) 1 year ago