petals

mirror of https://github.com/bigscience-workshop/petals synced 2024-11-13 19:11:21 +00:00

Author	SHA1	Message	Date
justheuristic	d271b75dd4	Let users specify sequence length instead of assuming 2048 (#52) - Maximum length is now provided in `.inference_session(max_length=100)` - previously, we would always assume max length = 2048 - added a generic way to forward **kwargs to inference session - for compatibility with #47 - Note to @borzunov: it does *not* pass them arbitrarily, but instead checks for kwarg names at the bottom level - run_server can be started with a custom max_length for inference - renamed --cache_size_bytes to --attention_cache_bytes (to avoid collision with --cache_dir) - --attn_cache_bytes can now support humane file sizes (e.g. 300MB instead of 314572800) - made some server-side errors more human-readable to user (e.g. when max length is exceeded) Co-authored-by: Aleksandr Borzunov <borzunov.alexander@gmail.com> Co-authored-by: Alexander Borzunov <hxrussia@gmail.com>	2022-08-29 21:04:37 +03:00
Dmitry Baranchuk	11a424837f	integrate mixed-8bit model (#39) * integrate mixed-8bit model * Fix bug with model duplication in RAM * set throughput=1.0 to fix zero throughput problem * add revision support * update hivemind and bitsandbytes * update deploy scripts * update installation instructions	2022-08-04 09:57:37 +03:00
Dmitry Baranchuk	f114a6d417	set default num_handlers=16	2022-07-14 02:32:58 +03:00
Alexander Borzunov	75856e4769	Measure and cache network & compute throughput (#21)	2022-07-13 05:46:26 +04:00
Alexander Borzunov	aba43f1308	Implement block selection on servers (#20)	2022-07-12 14:42:30 +04:00
justheuristic	9c492bbe8c	Infer prefix by default	2022-06-29 11:28:13 +03:00
justheuristic	6e3db6bed6	black-isort	2022-06-28 13:16:03 +03:00
justheuristic	d5c410bb1f	fix auth token	2022-06-27 15:51:58 +03:00
justheuristic	1ab5fb1630	fetch a specific bloom block without downloading the entire model	2022-06-20 15:33:17 +03:00
justheuristic	3b9351de1c	isort	2022-06-19 17:22:57 +03:00
justheuristic	20497f81d1	switch to hivemind-master	2022-06-17 11:34:50 +03:00
Pavel Samygin	57f4e0a899	add identity path	2022-06-17 10:15:15 +03:00
justheuristic	a798ea04a6	add minimalistic benchmarks	2022-06-14 15:18:11 +03:00
justheuristic	14e316b52a	black-isort	2022-06-14 09:46:14 +03:00
justheuristic	7ce7cd7a97	basic backend	2022-06-14 08:49:42 +03:00
justheuristic	1c49bcb741	basic backend	2022-06-14 08:26:05 +03:00

16 Commits
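
Commit d271b75dd4 mentions that the attention cache flag accepts human-readable sizes (e.g. 300MB instead of 314572800). A minimal sketch of how such a value could be converted to bytes is shown below. The function name `parse_size_bytes` and the unit table are hypothetical and are not taken from the petals code base; the only fact drawn from the commit message is that 300MB corresponds to 314572800 bytes, which implies 1024-based units.

```python
import re

# Hypothetical unit table: 1024-based, since 300MB == 314572800 bytes.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size_bytes(text: str) -> int:
    """Convert a string such as '300MB' or '314572800' to a byte count."""
    match = re.fullmatch(
        r"\s*(\d+(?:\.\d+)?)\s*([KMGT]?B)?\s*", text, flags=re.IGNORECASE
    )
    if match is None:
        raise ValueError(f"cannot parse size: {text!r}")
    number, unit = match.groups()
    # A bare number with no suffix is interpreted as plain bytes.
    return int(float(number) * _UNITS[(unit or "B").upper()])


if __name__ == "__main__":
    print(parse_size_bytes("300MB"))      # 314572800
    print(parse_size_bytes("314572800"))  # 314572800
```

A function like this could be passed as the `type=` argument of an `argparse` option, so that both the old numeric form and the suffixed form are accepted.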