petals/src/petals/server
Latest commit: Alexander Borzunov, 057a2fb5de, Support Llama 2 (#379), 11 months ago
__init__.py: Make Petals a pip-installable package (attempt 2) (#102), 2 years ago
backend.py: Support Llama 2 (#379), 11 months ago
block_selection.py: Use get_logger(__name__) instead of get_logger(__file__) (#265), 1 year ago
block_utils.py: Support loading blocks in 4-bit (QLoRA NF4 format, disabled by default) (#333), 11 months ago
from_pretrained.py: Support Llama 2 (#379), 11 months ago
handler.py: Fix handler memory leak, get rid of mp.Manager (#373), 11 months ago
memory_cache.py: Share more info about a server in DHT (#355), 11 months ago
reachability.py: Support Llama 2 (#379), 11 months ago
server.py: Support Llama 2 (#379), 11 months ago
task_pool.py: Use get_logger(__name__) instead of get_logger(__file__) (#265), 1 year ago
task_prioritizer.py: Merge inference pools into one to increase inference speed (#225), 1 year ago
throughput.py: Report inference, forward, and network RPS separately (#358), 11 months ago