Commit Graph

6 Commits (main)

Author SHA1 Message Date
Alexander Borzunov de2475f31c
Make client compatible with transformers' GenerationMixin (#464)
This PR drops custom generation codes and introduces compatibility with `transformers.GenerationMixin` instead. This includes support for more sampling options (`top_p`, `top_k`, `repetition_penalty` requested in #460) and beam search - all that is now identical to running model with transformers locally.

Most features (excluding beam search and other rarely used stuff) are also compatible with resuming existing sessions.

### Breaking changes

If `.generate()` or forward passes are being run inside an `.inference_session()` context, they now use the opened session by default. So, these snippets are now equivalent:

```python
# Using default session
with model.inference_session(max_length=100):
    output_ids = model.generate(input_ids, max_new_tokens=3)

# Explicitly specifying a session
with model.inference_session(max_length=100) as sess:
    output_ids = model.generate(input_ids, max_new_tokens=3, session=sess)
```

Earlier, the 1st snippet was creating a new session, which is not what most people expected (= such code was most likely to introduce a bug, which is now fixed).
9 months ago
Alexander Borzunov 056f22515a
Prioritize short inference, unmerge pools for long inference (#458)
Right now, long inference requests may occupy Runtime for a few seconds without giving it away to process short (most latency-sensitive requests). This PR fixes it by disallowing the merged pool for long requests and prioritizing the short ones.
10 months ago
justheuristic c4938bc23e
Merge inference pools into one to increase inference speed (#225)
It turns out using a separate pool for each block has led to significant slowdown, see #224 for details.
1 year ago
Max Ryabinin 3ca8b4f082
Fix typos with codespell (#126) 1 year ago
Max Ryabinin 9faf08b898
Remove unused imports, add missing arguments to docstrings (#108)
* Remove unused imports, add missing arguments to docstrings
1 year ago
Alexander Borzunov 7bd5916744
Make Petals a pip-installable package (attempt 2) (#102)
1. Petals can be now installed using `pip install git+https://github.com/bigscience-workshop/petals`
    - In case if you already cloned the repo, you can do `pip install .` or `pip install .[dev]`
2. Moved `src` => `src/petals`
    - Replaced `from src.smth import smth` with `from petals.smth import smth`
3. Moved `cli` => `src/petals/cli`
    - Replaced `python -m cli.run_smth` with `python -m petals.cli.run_smth` (all utilities are now available right after pip installation)
4. Moved the `requirements*.txt` contents to `setup.cfg` (`requirements.txt` for packages is not supported well by modern packaging utils)
5. Increased the package version from `0.2` to `1.0alpha1`
2 years ago