Commit a2634001e9 (2 years ago):

This PR reduces the vocabulary size to save memory during conversion, keeping only the first 50k tokens. As a result:

* tests that load client-side embeddings need significantly less RAM
* we can now run CI tests with 4 servers instead of 2 (needed to test routing, see bugs uncovered)
* some of the servers now use load balancing
* CI convert_model now takes 4-5 minutes (was 6-7)

File | Last updated | |
---|---|---|
check-style.yaml | 2 years ago | |
run-tests.yaml | 2 years ago |
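
The commit description above mentions keeping only the first 50k tokens of the vocabulary to save memory. A minimal sketch of that idea, assuming the embedding table is a plain list of rows indexed by token id, might look like the following. The function names `truncate_vocab` and `embedding_bytes`, and the sizes used in the estimate, are hypothetical illustrations and are not taken from the repository.

```python
from typing import List, Sequence


def truncate_vocab(embeddings: Sequence[Sequence[float]], keep: int) -> List[List[float]]:
    """Return a copy of the embedding table with only token ids 0..keep-1.

    Hypothetical helper: the real conversion script may work differently.
    """
    if keep <= 0:
        raise ValueError("keep must be positive")
    # Slicing keeps the first `keep` rows; each row is copied into a new list.
    return [list(row) for row in embeddings[:keep]]


def embedding_bytes(num_tokens: int, hidden_size: int, bytes_per_value: int = 4) -> int:
    """Rough size of a dense embedding table (defaults to 4 bytes per value)."""
    return num_tokens * hidden_size * bytes_per_value


# Toy table: 10 tokens with 2-dimensional embeddings (illustrative data only).
full = [[float(i), float(i) * 2.0] for i in range(10)]
small = truncate_vocab(full, keep=4)

# Illustrative sizes (assumed, not from the source): 250k vs. 50k tokens, hidden size 1024.
before = embedding_bytes(250_000, 1024)
after = embedding_bytes(50_000, 1024)
```

Under these assumed sizes the table shrinks by a factor of five, which is the kind of saving that would let a memory-limited CI runner start more servers. Token ids at or above the cutoff no longer have a row, so test inputs would need to stay within the reduced range.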