petals/src/petals
Alexander Borzunov 643a054170
Make server use smart defaults (#115)
Summary:

```python
parser.add_argument('--attn_cache_size', type=str, default=None,
                    help='The size of GPU memory allocated for storing past attention keys/values between inference steps. '
                         'Examples: 500MB, 1.2GB, 1073741824 (bytes). Note that 1KB != 1KiB here. '
                         'Default: 0.5GiB * num_blocks * hidden_size / 14336. '
                         'The latter is the hidden size of the bigscience/bloom-petals model.')

parser.add_argument('--request_timeout', type=float, required=False, default=3 * 60,
                    help='Timeout (in seconds) for the whole rpc_forward/rpc_backward/rpc_forward_stream/rpc_backward_stream request')
parser.add_argument('--session_timeout', type=float, required=False, default=30 * 60,
                    help='Timeout (in seconds) for the whole inference session')
parser.add_argument('--step_timeout', type=float, required=False, default=60,
                    help="Timeout (in seconds) for waiting for the next step's inputs inside an inference session")

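# Caution: argparse's type=bool calls bool() on the raw string, so any non-empty value (even "False") parses as True.
parser.add_argument('--load_in_8bit', type=bool, default=None,
                    help="Convert the loaded model into a mixed-8bit quantized model. Default: True if GPU is available")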
```
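
The `--attn_cache_size` help text combines two rules: sizes accept decimal and binary units (so 1KB != 1KiB), and the default scales with the number of blocks and the hidden size. A minimal sketch of both rules follows. The helper names `parse_size` and `default_attn_cache_size` are hypothetical and are not taken from the Petals code base; how the server actually parses the string is not shown in this commit message.

```python
import re

# Decimal (SI) and binary (IEC) multipliers, kept separate so that 1KB != 1KiB.
UNITS = {
    "": 1, "B": 1,
    "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12,
    "KIB": 2**10, "MIB": 2**20, "GIB": 2**30, "TIB": 2**40,
}

# Hidden size of bigscience/bloom-petals, as stated in the help text.
REFERENCE_HIDDEN_SIZE = 14336


def parse_size(text: str) -> int:
    """Convert a string such as '500MB', '1.2GB' or '1073741824' to bytes (hypothetical helper)."""
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*([A-Za-z]*)\s*", text)
    if match is None:
        raise ValueError(f"Cannot parse size: {text!r}")
    number, unit = match.groups()
    unit = unit.upper()
    if unit not in UNITS:
        raise ValueError(f"Unknown unit in size: {text!r}")
    return round(float(number) * UNITS[unit])


def default_attn_cache_size(num_blocks: int, hidden_size: int) -> int:
    """Default from the help text: 0.5GiB * num_blocks * hidden_size / 14336 (hypothetical helper)."""
    return int(0.5 * 2**30 * num_blocks * hidden_size / REFERENCE_HIDDEN_SIZE)
```

Under these assumptions, a server hosting 4 blocks of a model with hidden size 14336 would default to 2 GiB of attention cache, while passing `--attn_cache_size 500MB` would request 500,000,000 bytes.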

Co-authored-by: justheuristic <justheuristic@gmail.com>
2 years ago

| Name | Last commit | Updated |
|---|---|---|
| bloom | Remove unused imports, add missing arguments to docstrings (#108) | 2 years ago |
| cli | Make server use smart defaults (#115) | 2 years ago |
| client | Hotfix span selection (#110) | 2 years ago |
| server | Make server use smart defaults (#115) | 2 years ago |
| utils | Fix tile size on ampere (#116) | 2 years ago |
| `__init__.py` | Make Petals a pip-installable package (attempt 2) (#102) | 2 years ago |
| `constants.py` | Make Petals a pip-installable package (attempt 2) (#102) | 2 years ago |
| `data_structures.py` | Make Petals a pip-installable package (attempt 2) (#102) | 2 years ago |
| `dht_utils.py` | Optimize RemoteSequenceManager (#106) | 2 years ago |