petals

Alexander Borzunov de2475f31c Make client compatible with transformers' GenerationMixin (#464)	10 months ago

This PR drops the custom generation code and introduces compatibility with `transformers.GenerationMixin` instead. This includes support for more sampling options (`top_p`, `top_k`, `repetition_penalty` requested in #460) and beam search. All of that is now identical to running the model with transformers locally.

Most features (excluding beam search and other rarely used options) are also compatible with resuming existing sessions.

### Breaking changes

If `.generate()` or forward passes are run inside an `.inference_session()` context, they now use the opened session by default. So, these snippets are now equivalent:

```python
# Using default session
with model.inference_session(max_length=100):
    output_ids = model.generate(input_ids, max_new_tokens=3)

# Explicitly specifying a session
with model.inference_session(max_length=100) as sess:
    output_ids = model.generate(input_ids, max_new_tokens=3, session=sess)
```

Earlier, the first snippet created a new session, which is not what most people expected (such code was likely to introduce a bug, which is now fixed).
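The "opened session is used by default" behavior described in this commit message can be illustrated with a small self-contained toy. This is only a sketch of the semantics, not the Petals implementation: `ToyModel`, its `_active_session` attribute, the dict-based session, and the `sessions_created` counter are all hypothetical names made up for the illustration.

```python
from contextlib import contextmanager


class ToyModel:
    """Hypothetical stand-in for a model; NOT how Petals implements sessions."""

    def __init__(self):
        self._active_session = None  # session opened by an enclosing `with`, if any
        self.sessions_created = 0    # counter used only to observe the behavior

    @contextmanager
    def inference_session(self, max_length):
        # A toy "session" is just a dict that tracks how many tokens it has seen.
        session = {"max_length": max_length, "position": 0}
        self.sessions_created += 1
        previous, self._active_session = self._active_session, session
        try:
            yield session
        finally:
            # Restore whatever was active before, so nested contexts unwind correctly.
            self._active_session = previous

    def generate(self, input_ids, max_new_tokens, session=None):
        if session is None:
            # Default to the session opened by the enclosing context, if there is one.
            session = self._active_session
        if session is None:
            # No session anywhere: open a temporary one just for this call.
            total = len(input_ids) + max_new_tokens
            with self.inference_session(max_length=total) as temp:
                return self.generate(input_ids, max_new_tokens, session=temp)
        session["position"] += len(input_ids) + max_new_tokens
        # Placeholder "generation": append zeros instead of sampling real tokens.
        return list(input_ids) + [0] * max_new_tokens


model = ToyModel()
with model.inference_session(max_length=100) as sess:
    output_ids = model.generate([1, 2], max_new_tokens=3)  # reuses `sess` implicitly
```

In this toy, the call inside the `with` block advances `sess` rather than creating a second session, which mirrors the equivalence of the two snippets in the commit message.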
__init__.py	Move SequenceManagerConfig -> ClientConfig, petals.dht_utils -> petals.utils.dht (#463)	10 months ago
asyncio.py	Shield alloc & free from cancellation (#163)	1 year ago
auto_config.py	Fix routing through relay, default network RPS, --token, logging, readme (#399)	11 months ago
convert_block.py	Support Llama 2 (#379)	11 months ago
dht.py	Move SequenceManagerConfig -> ClientConfig, petals.dht_utils -> petals.utils.dht (#463)	10 months ago
disk_cache.py	Allow free_disk_space_for() remove arbitrary files from Petals cache (#339)	11 months ago
hf_auth.py	Support Llama 2 (#379)	11 months ago
logging.py	Remove unused imports and attributes (#324)	12 months ago
misc.py	Make client compatible with transformers' GenerationMixin (#464)	10 months ago
packaging.py	Add customizable input tensors (#445)	10 months ago
peft.py	Support Llama 2 (#379)	11 months ago
ping.py	Fix petals.utils.ping for servers with client-mode DHT (#430)	10 months ago
random.py	Implement shortest-path routing for inference (#362)	11 months ago
version.py	Add LLaMA support (#323)	12 months ago