petals

mirror of https://github.com/bigscience-workshop/petals synced 2024-10-31 09:20:41 +00:00


Commit a2634001e9 by justheuristic, 2022-08-17 18:50:52 +03:00

Reduce vocabulary size in test model, fix bug in routing when overlapped (#45)

This PR reduces the vocabulary size of the test model to save memory during conversion, keeping only the first 50k tokens. As a result:

* tests that load client-side embeddings need significantly less RAM
* CI tests can now run with 4 servers instead of 2, which is needed to test routing (see the bugs uncovered)
* some of the servers now use load balancing
* CI convert_model now takes 4-5 minutes (was 6-7)
check-style.yaml	Clean up readme (#24)	2022-07-16 02:11:17 +03:00
run-tests.yaml	Reduce vocabulary size in test model, fix bug in routing when overlapped (#45)	2022-08-17 18:50:52 +03:00
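
The vocabulary reduction described in the commit message ("keeping only the first 50k tokens") can be pictured as slicing the leading rows off an embedding table. The following is a minimal sketch under stated assumptions, not the conversion code from this repository: the name `truncate_vocabulary` and the list-of-lists table are hypothetical, chosen so the example runs with the standard library alone.

```python
from typing import List

Embedding = List[List[float]]  # one row of floats per token id


def truncate_vocabulary(embeddings: Embedding, keep: int) -> Embedding:
    """Return a copy holding only the rows for token ids 0 .. keep-1.

    Hypothetical helper for illustration; not part of the petals code base.
    """
    if keep <= 0:
        raise ValueError("keep must be positive")
    return [row[:] for row in embeddings[:keep]]


# Toy table: 8 tokens with hidden size 4 (the commit keeps the first 50k tokens).
full_table = [[float(token_id)] * 4 for token_id in range(8)]
small_table = truncate_vocabulary(full_table, keep=5)

# The parameter count scales linearly with the number of rows kept.
full_params = sum(len(row) for row in full_table)
small_params = sum(len(row) for row in small_table)
print(f"kept {len(small_table)} of {len(full_table)} rows, "
      f"{small_params} of {full_params} parameters")
```

Because the embedding table grows linearly with the vocabulary size, dropping rows shrinks it proportionally, which is consistent with the lower RAM use reported for tests that load client-side embeddings. Any token id at or above `keep` no longer has a row, so test inputs would have to stay within the reduced range.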

