Aleksandr Borzunov
8c50f65cf2
black
2 years ago
Aleksandr Borzunov
0ef1d15c45
Require hivemind with MPFuture in inference mode fixed
2 years ago
Aleksandr Borzunov
90654da4e1
Improve InferenceSession typing
2 years ago
Aleksandr Borzunov
292e359731
Fix InferenceSession edge cases
2 years ago
Aleksandr Borzunov
b278a8d5f1
Make the first retry delay be zero
2 years ago
Aleksandr Borzunov
226fe91f6f
InferenceSession: Fix the case when failure happens while recovering from another failure
2 years ago
Aleksandr Borzunov
01cffeba5d
Fix max_length
2 years ago
Aleksandr Borzunov
2fafbaa119
Fix timeout on next token
2 years ago
Aleksandr Borzunov
3bc06f0002
InferenceSession: Replace only a segment of spans instead of everything until the end
2 years ago
Aleksandr Borzunov
fb47655482
Fix bug with make_sequence() returning longer sequences
2 years ago
Aleksandr Borzunov
a59facc0bf
Fix sequential_backward()
2 years ago
Aleksandr Borzunov
756e27707f
Fix sequential_forward()
2 years ago
Aleksandr Borzunov
a58a8b95d0
Make backward more fault-tolerant
2 years ago
Aleksandr Borzunov
87fd00ead9
Make forward more fault-tolerant
2 years ago
Aleksandr Borzunov
b1b1947e8f
Log disconnect errors with DEBUG level
2 years ago
Aleksandr Borzunov
3a7b8a4389
black
2 years ago
Aleksandr Borzunov
8d47e38251
Rename RemoteSequentialInferenceSession => InferenceSession
2 years ago
Aleksandr Borzunov
a232f13869
Rename RemoteTransformerBlockInferenceSession => _ServerInferenceSession
2 years ago
Aleksandr Borzunov
b6316a5603
Make inference session fields private
2 years ago
Aleksandr Borzunov
55bea823c0
Regenerate attn caches when necessary
2 years ago
Aleksandr Borzunov
f6622bcff7
Implement fault-tolerant inference
2 years ago
Aleksandr Borzunov
bd10d15e6e
Rename Remote{TransformerBlock => Server}InferenceSession
2 years ago
Artem Chumachenko
695df826c2
Force reinstall for hivemind in example notebooks ( #88 )
2 years ago
Alexander Borzunov
dc6ecccac5
Implement timeouts in forward/backward ( #90 )
2 years ago
Aleksandr Borzunov
4518d65fdd
Add MIT license
2 years ago
Alexander Borzunov
898f614515
Fix floating point issues in block_selection.py ( #89 )
2 years ago
Alexander Borzunov
c07a7e0812
Add "Terms of Use"
2 years ago
Artem Chumachenko
0d9c7de0bd
Add sst-2 ipynb example ( #86 )
...
- Add sst-2 example of a prompt-based training
- Make some enhancements to the persona-chat example
2 years ago
Alexander Borzunov
57e8d2e721
Implement exponential backoff for forward & backward ( #85 )
2 years ago
Alexander Borzunov
ee4e69c254
Enable rebalancing by default ( #84 )
2 years ago
Artem Chumachenko
2cb82dd648
Add colab-related changes ( #80 )
...
Add changes that make working in Colab more comfortable.
Co-authored-by: Alexander Borzunov <hxrussia@gmail.com>
2 years ago
Alexander Borzunov
87fd6a4f08
Fix "Too many open files" during rebalancing ( #83 )
...
Now, the number of open files stays the same after every rebalancing.
2 years ago
Alexander Borzunov
f64eb3a665
Update hivemind to 1.1.2, mark `model` argument as required ( #81 )
2 years ago
Alexander Borzunov
149f433763
Rebalance swarm when necessary ( #34 )
2 years ago
Alexander Borzunov
640bbc38a9
Make even smaller readability changes
2 years ago
Alexander Borzunov
d1b012b479
Make small readability & style changes to the instructions ( #77 )
2 years ago
justheuristic
fef48d7d99
Use bitsandbytes==0.34.0, update readme ( #76 )
...
* unlock bnb backward
* Fix bnb version in README
* Update requirements.txt
2 years ago
justheuristic
8caf1145a8
Quality of life changes: update readme, simplify run_server interface ( #75 )
...
- run_server now accepts model name as both positional and keyword argument
- changed names in README to account for interface updates
- moved model conversion from README to a separate wiki page
- updated requirements.txt
2 years ago
Artem Chumachenko
1046911dea
Add prompt tuning example on Personachat dataset ( #69 )
2 years ago
justheuristic
3fdcc55a56
fix protobuf version ( #74 )
...
* fix protobuf version
2 years ago
justheuristic
e92487e5d2
Update dependency versions ( #71 )
...
* update dependency versions
* install bitsandbytes cpuonly from pip
* remove deprecated API from task pool
* clearer startup logs
Co-authored-by: Tim Dettmers <dettmers@cs.washington.edu>
2 years ago
Pavel Samygin
50535a8435
Priority tasks ( #47 )
...
* priority in handlers and backend pools
* simple points system on server side
* prioritize tasks in the handler before submitting them
* fix tests
* s/expert/block/g
Co-authored-by: justheuristic <justheuristic@gmail.com>
2 years ago
justheuristic
892d18fea7
Build cpuonly from bitsandbytes main ( #70 )
...
Build cpuonly from main
2 years ago
justheuristic
f3984b192a
Make attention cache wait until memory is freed ( #53 )
...
Previously, attempting to allocate from a MemoryCache that did not have enough space would throw AllocationFailed.
This PR changes that behavior to the following:
- by default, wait until memory is freed by other tenants (FIFO)
- if memory could not be allocated within the timeout, throw AllocationFailed
- if the requested size is too big to fit even in an empty cache, throw AllocationFailed
- [x] passes existing tests
- [x] passes manual load tests
p.s. if anyone wondered: using mp.Condition would not make the code simpler, since its lock behavior is slightly different from what we need here
Co-authored-by: Alexander Borzunov <hxrussia@gmail.com>
Co-authored-by: Aleksandr Borzunov <borzunov.alexander@gmail.com>
2 years ago
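The allocation behavior described in the "Make attention cache wait until memory is freed" entry above can be illustrated with a minimal sketch. This is not the project's MemoryCache: the class `WaitingCache`, its fields, and the `allocate`/`free` signatures are hypothetical, and it uses a single-process `threading.Condition` for brevity, whereas the commit message notes that the real code deliberately avoids `mp.Condition`. Only the exception name `AllocationFailed` and the three rules (FIFO waiting, timeout, oversized request) come from the commit text.

```python
import threading
import time


class AllocationFailed(Exception):
    """Raised when a request cannot be satisfied (name taken from the commit message)."""


class WaitingCache:
    """Toy byte budget; a hypothetical stand-in, not the project's MemoryCache."""

    def __init__(self, max_size: int):
        self.max_size = max_size
        self.used = 0
        self._cond = threading.Condition()
        self._queue = []  # tickets of waiting requests, oldest first (FIFO)

    def allocate(self, size: int, timeout: float) -> None:
        if size > self.max_size:
            # Would not fit even in an empty cache: fail immediately.
            raise AllocationFailed(f"{size} exceeds capacity {self.max_size}")
        ticket = object()
        deadline = time.monotonic() + timeout
        with self._cond:
            self._queue.append(ticket)
            try:
                # Wait until this request is the oldest one AND there is room.
                while self._queue[0] is not ticket or self.used + size > self.max_size:
                    remaining = deadline - time.monotonic()
                    if remaining <= 0:
                        raise AllocationFailed(f"could not allocate {size} within {timeout}s")
                    self._cond.wait(remaining)
                self.used += size
            finally:
                # Leave the queue on success or failure and wake the next waiter.
                self._queue.remove(ticket)
                self._cond.notify_all()

    def free(self, size: int) -> None:
        with self._cond:
            self.used -= size
            self._cond.notify_all()
```

In this sketch, a caller that cannot be served right away blocks until an earlier tenant calls `free()`, and gets `AllocationFailed` only if the timeout expires or the request could never fit.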
justheuristic
8a0c056929
Fix calling rpc_info multiple times ( #60 )
...
call info once
2 years ago
Artem Chumachenko
ada98a1b37
Add deep prompt inference ( #66 )
...
Add deep prompt in inference_step. Small refactoring in deep prompt code.
2 years ago
Alexander Borzunov
54ad745bed
Warn that current instructions involve 6B model but we will replace them soon ( #63 )
2 years ago
Alexander Borzunov
5f0c5329d4
Update readme with arxiv link and more discussions ( #62 )
...
Co-authored-by: justheuristic <justheuristic@gmail.com>
2 years ago
Alexander Borzunov
9bea7b9ea8
Update bullet points with feedback from Tim and other people ( #61 )
...
Co-authored-by: Tim Dettmers <tim.dettmers@gmail.com>
2 years ago
Alexander Borzunov
7653562aa1
Use latest version of Petals scheme, shrink Petals logo ( #59 )
2 years ago