- latest accelerate, transformers, huggingface_hub
- rearrange attention caches to support https://github.com/huggingface/transformers/pull/18344
- remove unused code
- fix edge case where session crashes when receiving seq length 0
- assert transformer version when importing WrappedBloomBlock
Co-authored-by: Alexander Borzunov <borzunov.alexander@gmail.com>
Co-authored-by: Max Ryabinin <mryabinin0@gmail.com>
Summary:
```python
parser.add_argument('--attn_cache_size', type=str, default=None,
help='The size of GPU memory allocated for storing past attention keys/values between inference steps. '
'Examples: 500MB, 1.2GB, 1073741824 (bytes). Note that 1KB != 1KiB here. '
'Default: 0.5GiB * num_blocks * hidden_size / 14336. '
'The latter is the hidden size of the bigscience/bloom-petals model.')
parser.add_argument('--request_timeout', type=float, required=False, default=3 * 60,
help='Timeout (in seconds) for the whole rpc_forward/rpc_backward/rpc_forward_stream/rpc_backward_stream request')
parser.add_argument('--session_timeout', type=float, required=False, default=30 * 60,
help='Timeout (in seconds) for the whole inference session')
parser.add_argument('--step_timeout', type=float, required=False, default=60,
help="Timeout (in seconds) for waiting the next step's inputs inside an inference session")
parser.add_argument('--load_in_8bit', type=bool, default=None,
help="Convert the loaded model into mixed-8bit quantized model. Default: True if GPU is available")
```
Co-authored-by: justheuristic <justheuristic@gmail.com>
1. Petals can be now installed using `pip install git+https://github.com/bigscience-workshop/petals`
- In case if you already cloned the repo, you can do `pip install .` or `pip install .[dev]`
2. Moved `src` => `src/petals`
- Replaced `from src.smth import smth` with `from petals.smth import smth`
3. Moved `cli` => `src/petals/cli`
- Replaced `python -m cli.run_smth` with `python -m petals.cli.run_smth` (all utilities are now available right after pip installation)
4. Moved the `requirements*.txt` contents to `setup.cfg` (`requirements.txt` for packages is not supported well by modern packaging utils)
5. Increased the package version from `0.2` to `1.0alpha1`