2 Commits (b873d92ffa09b4177b6e00a7d7a407927b276cd7)
Author | SHA1 | Message | Date |
---|---|---|---|
justheuristic | 8dc0f513ba | Hotfix span selection (#110): fix an issue in span selection that was introduced in #106 | 2 years ago |
justheuristic | a2066a4096 | Optimize RemoteSequenceManager (#106) | 2 years ago |

Changes in a2066a4096, Optimize RemoteSequenceManager (#106):

- [x] made RemoteSequenceManager into a background thread that pre-fetches information instead of running just in time
- [x] moved routing-related stuff to petals.client.routing
- [x] extracted remote peer routing information to RemoteSequenceInfo
- [x] made sure that the code survives continued use (e.g. one hour)
- [x] updated every spot where update_ is called manually
- [x] modified get_sequence to check that the thread is alive, warn if not
- [x] removed max_retries, switched rpc_info to exponential backoff
- [x] fixed a bug that causes RemoteSeq* to lose user-defined hyperparameters (e.g. timeout) upon subsequencing (sequential[3:5])
- [x] moved client-side points strategy to client.routing
- [x] ensured that the RemoteSequenceManager thread created in get_remote_module properly shuts down when the module is destroyed
- [x] resolved minor affected todos
- [x] modified tests to no longer use PYTHONPATH
- [x] worked around protocol error in rpc_info

Co-authored-by: Aleksandr Borzunov <borzunov.alexander@gmail.com>
Co-authored-by: Artem Chumachenko <artek.chumak@gmail.com>
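The commit message for #106 mentions three related ideas: pre-fetching in a background thread, retrying with exponential backoff instead of a fixed `max_retries`, and warning when the thread is no longer alive. The sketch below illustrates that general pattern using only the Python standard library. It is not the Petals implementation: `BackgroundRefresher`, `backoff_delay`, and all parameter names are hypothetical and chosen for illustration.

```python
import threading
import warnings
from typing import Any, Callable, Optional


def backoff_delay(attempt: int, min_backoff: float = 1.0, max_backoff: float = 60.0) -> float:
    """Delay before retry number `attempt` (0-based): doubles each time, capped at max_backoff."""
    # The exponent is clamped so very long outages cannot overflow a float.
    return min(min_backoff * 2 ** min(attempt, 32), max_backoff)


class BackgroundRefresher:
    """Hypothetical helper: keeps a cached value fresh from a daemon thread."""

    def __init__(self, fetch: Callable[[], Any], update_period: float = 30.0,
                 min_backoff: float = 1.0, max_backoff: float = 60.0):
        self._fetch = fetch
        self._update_period = update_period
        self._min_backoff = min_backoff
        self._max_backoff = max_backoff
        self._lock = threading.Lock()
        self._value: Any = None
        self._has_value = threading.Event()
        self._shutdown = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def _run(self) -> None:
        attempt = 0
        while not self._shutdown.is_set():
            try:
                value = self._fetch()
            except Exception:
                # No retry limit: keep trying, but wait longer after each failure.
                delay = backoff_delay(attempt, self._min_backoff, self._max_backoff)
                attempt += 1
            else:
                with self._lock:
                    self._value = value
                self._has_value.set()
                attempt = 0
                delay = self._update_period
            # Sleeps for `delay` seconds, but wakes up immediately on shutdown().
            self._shutdown.wait(delay)

    def get(self, timeout: Optional[float] = None) -> Any:
        """Return the latest pre-fetched value, warning if the thread is not running."""
        if not self._thread.is_alive():
            warnings.warn("refresher thread is not alive; the value may be stale or missing")
        if not self._has_value.wait(timeout):
            raise TimeoutError("no value was fetched within the timeout")
        with self._lock:
            return self._value

    def shutdown(self) -> None:
        self._shutdown.set()
        if self._thread.is_alive():
            self._thread.join()
```

A caller would construct the refresher with its own fetch function, call `start()`, read with `get()`, and call `shutdown()` when the owning object is destroyed. Waiting on an event rather than sleeping lets shutdown interrupt a long backoff delay.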