petals

mirror of https://github.com/bigscience-workshop/petals synced 2024-11-13 19:11:21 +00:00

Author	SHA1	Message	Date
Alexander Borzunov	ee4e69c254	Enable rebalancing by default (#84)	2022-11-02 00:50:01 +04:00
Alexander Borzunov	f64eb3a665	Update hivemind to 1.1.2, mark `model` argument as required (#81)	2022-10-26 03:23:18 +04:00
Alexander Borzunov	149f433763	Rebalance swarm when necessary (#34)	2022-10-12 14:28:27 +04:00
justheuristic	8caf1145a8	Quality of life changes: update readme, simplify run_server interface (#75) - run_server now accepts model name as both positional and keyword argument - changed names in README to account for interface updates - moved model conversion from README to a separate wiki page - updated requirements.txt	2022-09-20 03:51:57 +03:00
justheuristic	e92487e5d2	Update dependency versions (#71) * update dependency versions * install bitsandbytes cpuonly from pip * remove deprecated API from task pool * clearer startup logs Co-authored-by: Tim Dettmers <dettmers@cs.washington.edu>	2022-09-13 03:51:15 +03:00
Pavel Samygin	50535a8435	Priority tasks (#47) * priority in handlers and backend pools * simple points system on server side * prioritize task in handler before submitting task * fix tests * s/expert/block/g Co-authored-by: justheuristic <justheuristic@gmail.com>	2022-09-10 22:24:42 +03:00
justheuristic	d271b75dd4	Let users specify sequence length instead of assuming 2048 (#52) - Maximum length is now provided in `.inference_session(max_length=100)` - previously, we would always assume max length = 2048 - added a generic way to forward **kwargs to inference session - for compatibility with #47 - Note to @borzunov: it does *not* pass them arbitrarily, but instead checks for kwarg names at the bottom level - run_server can be started with a custom max_length for inference - renamed --cache_size_bytes to --attention_cache_bytes (to avoid collision with --cache_dir) - --attn_cache_bytes can now support humane file sizes (e.g. 300MB instead of 314572800) - made some server-side errors more human-readable to user (e.g. when max length is exceeded) Co-authored-by: Aleksandr Borzunov <borzunov.alexander@gmail.com> Co-authored-by: Alexander Borzunov <hxrussia@gmail.com>	2022-08-29 21:04:37 +03:00
justheuristic	a2634001e9	Reduce vocabulary size in test model, fix bug in routing when overlapped (#45) This PR reduces the vocabulary size to save memory during conversion, keeping only the first 50k tokens. As a result, * tests that load client-side embeddings need significantly less RAM * we can now run CI tests with 4 servers instead of 2 - needed to test routing - see bugs uncovered * some of the servers now use load balancing * CI convert_model now takes 4-5 minutes (was 6-7)	2022-08-17 18:50:52 +03:00
Dmitry Baranchuk	11a424837f	integrate mixed-8bit model (#39) * integrate mixed-8bit model * Fix bug with model duplication in RAM * set throughput=1.0 to fix zero throughput problem * add revision support * update hivemind and bitsandbytes * update deploy scripts * update installation instructions	2022-08-04 09:57:37 +03:00
Dmitry Baranchuk	6573076883	Sequential and parallel forward / backward (#36)	2022-07-23 14:32:39 +03:00
justheuristic	e2711a033b	Add automated tests (#23) This PR will run basic tests automatically on each subsequent PR - convert a small model on every PR - run existing tests on every PR - enforce black / isort - require checks on merge - make sure tests are not flappy Co-authored-by: Alexander Borzunov <hxrussia@gmail.com> Co-authored-by: Dmitry Baranchuk <dmitrybaranchuk@gmail.com>	2022-07-16 01:59:23 +03:00
Dmitry Baranchuk	f114a6d417	set default num_handlers=16	2022-07-14 02:32:58 +03:00
Alexander Borzunov	75856e4769	Measure and cache network & compute throughput (#21)	2022-07-13 05:46:26 +04:00
Alexander Borzunov	aba43f1308	Implement block selection on servers (#20)	2022-07-12 14:42:30 +04:00
Dmitry Baranchuk	f055135b08	rm prefix	2022-07-08 21:59:25 +03:00
justheuristic	5695897620	fix imports	2022-07-07 04:13:20 +03:00
justheuristic	de556c99be	straighten import order	2022-07-07 03:49:04 +03:00
justheuristic	90d65e58aa	set default DHT prefix	2022-07-07 03:34:58 +03:00
justheuristic	41e5a95e8e	set client branch to main by default; remove the concept of base branch (redundant)	2022-07-07 03:18:10 +03:00
justheuristic	899cefe588	set client branch to main by default; remove the concept of base branch (redundant)	2022-07-07 03:16:47 +03:00
justheuristic	4695071ad2	WIP: make DistributedBloom compliant with HF interface	2022-07-07 03:11:28 +03:00
justheuristic	4ad845bce3	black-isort	2022-07-07 01:04:47 +03:00
Dmitry Baranchuk	be83e6d0cb	refactoring	2022-07-04 22:43:51 +03:00
Dmitry Baranchuk	d969172208	set requires_grad=False, lm_layer -> h @ word_embeddings, rm lm_layer from converted_model	2022-07-04 21:18:29 +03:00
justheuristic	d42e8abd38	Merge branch 'client' into main	2022-07-01 03:53:54 +03:00
Dmitry Baranchuk	d7baa9997d	rm deprecated files	2022-06-29 13:56:44 +03:00
Dmitry Baranchuk	f60a7dd183	deploy swarm on local & remote machines	2022-06-29 13:52:43 +03:00
justheuristic	9c492bbe8c	Infer prefix by default	2022-06-29 11:28:13 +03:00
justheuristic	6e3db6bed6	black-isort	2022-06-28 13:16:03 +03:00
justheuristic	894cd5d586	Merge branch 'fix-auth-token' into main	2022-06-28 13:15:34 +03:00
Dmitry Baranchuk	4d27080ccc	add scripts to deploy a swarm	2022-06-27 17:26:08 +03:00
justheuristic	d5c410bb1f	fix auth token	2022-06-27 15:51:58 +03:00
justheuristic	ce9556dcb0	deprecation warning	2022-06-23 16:26:54 +03:00
justheuristic	83cd4412a1	black-isort	2022-06-20 16:50:22 +03:00
justheuristic	1ab5fb1630	fetch a specific bloom block without downloading the entire model	2022-06-20 15:33:17 +03:00
justheuristic	6047a2ffe0	push config and tokenizer separately	2022-06-20 14:28:31 +03:00
justheuristic	b6f3bbfd97	black	2022-06-19 19:18:46 +03:00
justheuristic	84de19fb1a	better status logs	2022-06-19 19:17:44 +03:00
justheuristic	1555d98f66	push converted model to hub	2022-06-19 19:13:48 +03:00
justheuristic	736f1d1085	push converted model to hub	2022-06-19 19:06:35 +03:00
justheuristic	e8241d2915	black everything	2022-06-19 17:23:08 +03:00
justheuristic	3b9351de1c	isort	2022-06-19 17:22:57 +03:00
justheuristic	20497f81d1	switch to hivemind-master	2022-06-17 11:34:50 +03:00
Pavel Samygin	57f4e0a899	add identity path	2022-06-17 10:15:15 +03:00
justheuristic	a798ea04a6	add minimalistic benchmarks	2022-06-14 15:18:11 +03:00
justheuristic	14e316b52a	black-isort	2022-06-14 09:46:14 +03:00
justheuristic	7ce7cd7a97	basic backend	2022-06-14 08:49:42 +03:00
justheuristic	1c49bcb741	basic backend	2022-06-14 08:26:05 +03:00
justheuristic	3215945882	use logger	2022-06-14 08:25:06 +03:00
justheuristic	ce5dedd2c7	rename	2022-06-14 05:09:13 +03:00

52 Commits