Latest commit: de2475f31c (10 months ago)

This PR drops the custom generation code and introduces compatibility with `transformers.GenerationMixin` instead. This includes support for more sampling options (`top_p`, `top_k`, `repetition_penalty`, requested in #460) and beam search. All of this is now identical to running the model with transformers locally.

Most features (excluding beam search and other rarely used options) are also compatible with resuming existing sessions.

### Breaking changes

If `.generate()` or forward passes are run inside an `.inference_session()` context, they now use the opened session by default. So, these snippets are now equivalent:

```python
# Using default session
with model.inference_session(max_length=100):
    output_ids = model.generate(input_ids, max_new_tokens=3)

# Explicitly specifying a session
with model.inference_session(max_length=100) as sess:
    output_ids = model.generate(input_ids, max_new_tokens=3, session=sess)
```

Earlier, the first snippet created a new session, which is not what most people expected (such code was likely to introduce a bug, which is now fixed).

Name | Last updated | |
---|---|---|
__init__.py | 10 months ago | |
asyncio.py | 1 year ago | |
auto_config.py | 11 months ago | |
convert_block.py | 11 months ago | |
dht.py | 10 months ago | |
disk_cache.py | 11 months ago | |
hf_auth.py | 11 months ago | |
logging.py | 12 months ago | |
misc.py | 10 months ago | |
packaging.py | 10 months ago | |
peft.py | 11 months ago | |
ping.py | 10 months ago | |
random.py | 11 months ago | |
version.py | 12 months ago |