petals/src/petals/server
Alexander Borzunov 8c546d988a
Test Llama, rebalancing, throughput eval, and all CLI scripts (#452)
This PR extends CI to:

1. Test Llama code using [TinyLlama-v0](https://huggingface.co/Maykeye/TinyLLama-v0).
2. Test rebalancing (sets up a situation where the 1st server needs to change its original position).
3. Check that the benchmark scripts run (in case someone breaks them). Note that the benchmark results are meaningless here, since they're measured on a tiny swarm of CPU servers with a low `--n_steps`.
4. Test `petals.cli.run_dht`.
5. Increase swap space and watch free RAM (a common issue is that actions are cancelled without explanation if there's not enough RAM - so it's a useful reminder + debug tool).
6. Fix flapping tests for bloom-560m by increasing tolerance.

Other minor changes: fix `--help` messages to show defaults, fix docs, tune rebalancing constants.
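
To illustrate item 6, here is a minimal sketch of how loosening an absolute tolerance stops a numeric comparison from flapping. It is not the code from this PR: the `allclose` helper, the sample values, and both tolerance values are hypothetical and chosen only for illustration.

```python
import math

def allclose(actual, expected, atol=1e-3, rtol=0.0):
    """Return True if every pair of values agrees within the given tolerances.

    Hypothetical helper for illustration; it is not part of Petals.
    """
    if len(actual) != len(expected):
        return False
    return all(
        math.isclose(a, e, rel_tol=rtol, abs_tol=atol)
        for a, e in zip(actual, expected)
    )

# Made-up outputs that differ only by small numerical noise.
reference = [0.1000, -0.2500, 0.7300]
observed = [0.1004, -0.2497, 0.7296]

# A tight tolerance rejects the noise, so the check fails.
strict_ok = allclose(observed, reference, atol=1e-4)
# A looser tolerance accepts the same outputs, so the check passes.
relaxed_ok = allclose(observed, reference, atol=1e-3)
```

The trade-off is that a looser tolerance also hides small genuine regressions, so it should stay close to the noise level actually observed.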
10 months ago
__init__.py Make Petals a pip-installable package (attempt 2) (#102) 2 years ago
backend.py Split long sequences into chunks (#403) 11 months ago
block_functions.py [Refactor] extract block forward, backward and inference into a separate file (#435) 10 months ago
block_selection.py Use get_logger(__name__) instead of get_logger(__file__) (#265) 1 year ago
block_utils.py Override float32 in config to bfloat16 (#431) 10 months ago
from_pretrained.py Fix routing through relay, default network RPS, --token, logging, readme (#399) 11 months ago
handler.py [Refactor] extract block forward, backward and inference into a separate file (#435) 10 months ago
memory_cache.py Fix deadlocks in MemoryCache (#396) 11 months ago
reachability.py Update to petals.dev (#390) 11 months ago
server.py Test Llama, rebalancing, throughput eval, and all CLI scripts (#452) 10 months ago
task_pool.py Use get_logger(__name__) instead of get_logger(__file__) (#265) 1 year ago
task_prioritizer.py Merge inference pools into one to increase inference speed (#225) 1 year ago
throughput.py Penalize servers that use relays during rebalancing (#428) 10 months ago