Commit Graph

  • c7255ef547 Update README.md Alexander Borzunov 2023-09-22 03:20:18 +0400
  • a2484b3053 Fix file locks in NFS-mounted directories (#517) FYY 2023-09-19 20:01:23 -0400
  • d57917b1de #515 update file open mode to avoid acquire file lock fail on NFS shared mount tonywang16 2023-09-19 18:51:01 -0400
  • a14c0f3bc9 Fix #515 unable to acquire shared lock on NFS mounted directory tonywang16 2023-09-19 16:01:22 -0400
  • 5c8bb55d2a Merge branch 'main' of https://github.com/tonywang16/petals j879@huggingface.co 2023-09-19 15:39:24 -0400
  • 172f593084 #515 fix file lock failed to acquire on NFS mounted directory j879@huggingface.co 2023-09-19 15:38:59 -0400
  • 5ce4f1a159 Store (start_block, end_block) in each DHT record for reliability (#510) Alexander Borzunov 2023-09-15 23:53:57 +0400
  • 533f9c8a34 Remove excess line Aleksandr Borzunov 2023-09-15 18:32:11 +0000
  • 4a302d0bd0 Refactor compute_spans() in petals.client.routing.sequence_info and petals.server.block_selection Aleksandr Borzunov 2023-09-15 17:10:23 +0000
  • dcd4641b82 List[Optional[RemoteModuleInfo]] -> List[RemoteModuleInfo] everywhere Aleksandr Borzunov 2023-09-15 16:28:24 +0000
  • 2608b5cf8d Fix (2) Aleksandr Borzunov 2023-09-15 15:34:55 +0000
  • 22da72aeeb Remove unused imports, methods Aleksandr Borzunov 2023-09-15 15:12:18 +0000
  • f20d75ca69 Fix block offset in SequenceManager slices Aleksandr Borzunov 2023-09-15 14:58:49 +0000
  • 58634df833 Use stricter DHT validation Aleksandr Borzunov 2023-09-15 14:32:02 +0000
  • 7e584448f1 Fix mixed JOINING/ONLINE records Aleksandr Borzunov 2023-09-15 14:22:02 +0000
  • 0f527a0788 Use start_block, end_block if present Aleksandr Borzunov 2023-09-15 13:58:49 +0000
  • 145377c4cc Announce start_block and end_block Aleksandr Borzunov 2023-09-15 13:49:50 +0000
  • 51af621c51 Merge 1b21dd3217 into 158621677b Artem Chumachenko 2023-09-06 20:20:47 +0300
  • 158621677b Bump version to 2.2.0 (#502) Alexander Borzunov 2023-09-06 19:43:30 +0400
  • 53a9222f0c Add Falcon to readme [v2.2.0] Aleksandr Borzunov 2023-09-06 15:26:29 +0000
  • 4883923a66 Show license link for Falcon-180B Aleksandr Borzunov 2023-09-06 15:23:48 +0000
  • 1b21dd3217 Add memory cache usage [lora_from_hub] Artem Chumachenko 2023-09-06 12:10:32 +0400
  • 01c3cf8d15 Add first version Artem Chumachenko 2023-09-06 10:46:10 +0400
  • c665c42cf2 reduce diff Your Name 2023-09-06 06:24:59 +0300
  • 3f06b53b1d temporary rollback: allow kwargs only at first inference step Your Name 2023-09-06 05:53:07 +0300
  • 3048c3b3ad rollback Your Name 2023-09-06 04:29:28 +0300
  • 721f7d2db3 unbreak everything Your Name 2023-09-06 03:52:40 +0300
  • 3bffcde0fe black+isort Your Name 2023-09-06 03:37:59 +0300
  • 8eb1722f1e standardize: s/backend_kwargs/block_kwargs/g everywhere Your Name 2023-09-06 03:37:45 +0300
  • 68b8cea246 note Your Name 2023-09-06 03:35:39 +0300
  • a23bd73f3b probably break everyting Your Name 2023-09-06 03:30:26 +0300
  • 056cd77f11 standardize checking block_kwargs Your Name 2023-09-06 03:16:15 +0300
  • aacd8b2f9d pass args/kwargs via forward Your Name 2023-09-06 03:16:03 +0300
  • 62e780c054 check num block kwargs Your Name 2023-09-06 03:13:08 +0300
  • 17d278e88a black-isort-clarify Your Name 2023-09-06 02:38:14 +0300
  • 9e29140bb0 mention reference issue Your Name 2023-09-06 01:37:48 +0300
  • b7bd4770d7 black-isort Your Name 2023-09-06 01:28:12 +0300
  • f2049658b6 make it work for fwd, bwd Your Name 2023-09-06 01:28:01 +0300
  • 465fd93147 more WIP Your Name 2023-09-05 23:29:16 +0300
  • 4393d99e78 1isort Your Name 2023-09-05 22:14:38 +0300
  • 49474e5477 wip some more Your Name 2023-09-05 22:07:48 +0300
  • e5c2d8eca4 WIP BEFORE MEETING NEED BACKWARD UPDATE Your Name 2023-09-05 15:32:47 +0300
  • b945e388e5 Remove smaller limit for legacy bfloat16 serialization payload-size Alexander Borzunov 2023-09-05 16:23:26 +0400
  • 4ac91cf5dd delete duplicate tests Your Name 2023-09-05 14:33:16 +0300
  • 2e760319ab add docstr Your Name 2023-09-05 14:28:48 +0300
  • cc4fe17a99 minimize diff [partial_rollback] Your Name 2023-09-05 14:05:41 +0300
  • 6c7f762379 rollback: only generic kwarg Your Name 2023-09-05 05:21:12 +0300
  • 6256995bb1 Merge remote-tracking branch 'origin/main' into forward_kwargs Your Name 2023-09-05 05:12:19 +0300
  • 39d0b31dcf Bump version to 2.2.0 Alexander Borzunov 2023-09-04 19:01:51 +0400
  • 1ebd88ae7b Optimize the Falcon block for inference (#500) Max Ryabinin 2023-09-04 14:38:32 +0200
  • 52baffb056 Update test_optimized_layers Max Ryabinin 2023-09-04 14:43:37 +0300
  • 2c27c19df4 Do not fuse split_heads with qkv. This is most likely due to bitsandbytes performing work not captured in the graph Max Ryabinin 2023-09-04 14:42:13 +0300
  • ea6c037c8b Enable CUDA graphs only on CUDA Max Ryabinin 2023-09-04 12:01:10 +0300
  • 91f6248535 Run tests on CUDA and CPU, Max Ryabinin 2023-09-04 12:00:59 +0300
  • 177669e97f Rollback CUDA graphs Max Ryabinin 2023-09-04 11:25:49 +0300
  • b941df5d2f Fix formatting Max Ryabinin 2023-09-04 04:38:00 +0300
  • 1f2ef79da3 WIP disable graphs Max Ryabinin 2023-09-04 04:30:31 +0300
  • cfaf6c1975 Fix rotary embeddings Max Ryabinin 2023-09-04 02:51:49 +0300
  • d56f57acd2 Fix rotary embeddings Max Ryabinin 2023-09-04 02:47:58 +0300
  • 841a0d5262 Improve test and compatibility Max Ryabinin 2023-09-04 01:56:29 +0300
  • ae30427276 Make the block compatible with other architectures Max Ryabinin 2023-09-04 01:20:33 +0300
  • 2c1452de5c Fix buffer registration Max Ryabinin 2023-09-04 01:05:27 +0300
  • ce401f1163 Make cos_cached/sin_cached buffers Max Ryabinin 2023-09-04 00:47:27 +0300
  • 111bf7e125 Fix the test Max Ryabinin 2023-09-04 00:47:11 +0300
  • 67764fea9e Fix formatting, reduce diff Max Ryabinin 2023-09-04 00:16:10 +0300
  • 1fc22bd69f Post-rebase changes Max Ryabinin 2023-09-04 00:09:22 +0300
  • 1f006c59a1 Fix class names Max Ryabinin 2023-09-03 23:58:13 +0300
  • ca4d091a3f Optimize Falcon block for inference Max Ryabinin 2023-09-03 23:50:55 +0300
  • d40eb6c701 Fix prompt tuning after #464 (#501) Alexander Borzunov 2023-09-04 12:25:29 +0400
  • db46bf3ac1 Account for pre_seq_len in new session's max_length Aleksandr Borzunov 2023-09-04 08:05:13 +0000
  • 8cb6b37e2f Fix prompt tuning after #464 Aleksandr Borzunov 2023-09-04 07:37:35 +0000
  • dd4a3230bc Add Falcon support (#499) Alexander Borzunov 2023-09-04 01:45:37 +0400
  • 1d865d1704 Fix dim order Aleksandr Borzunov 2023-09-03 20:39:13 +0000
  • f6553ad4cb Fix cache reordering with seq_len = 0 Aleksandr Borzunov 2023-09-03 19:57:24 +0000
  • 4537c77004 Expand/collapse KV caches when config.new_decoder_architecture is True Aleksandr Borzunov 2023-09-03 18:50:29 +0000
  • cda4fe8cef Fix num_key_value_groups Aleksandr Borzunov 2023-09-03 16:42:06 +0000
  • 4159e557bf Create dummy data when materializing qkv_proj [qkv_merge] Max Ryabinin 2023-09-03 19:20:07 +0300
  • cac654a16f Set default revision for tiiuae/* repos to models in the in-library format Aleksandr Borzunov 2023-09-03 14:29:09 +0000
  • 97dd3764d9 Set default pad_token_id Aleksandr Borzunov 2023-09-03 13:37:12 +0000
  • 8d78cf2357 Use safetensors float32 model Aleksandr Borzunov 2023-09-03 13:03:23 +0000
  • 012ae0e56a Use NF4 for all models on CUDA Aleksandr Borzunov 2023-09-03 13:02:32 +0000
  • ff83e7c8ad Fix DistrtibutedFalconModel.word_embeddings_layernorm Aleksandr Borzunov 2023-09-03 06:16:40 +0000
  • cc234c57ad Fix comment Aleksandr Borzunov 2023-09-02 23:33:00 +0000
  • bd6aa2399c Support --throughput dry_run Aleksandr Borzunov 2023-09-02 23:28:59 +0000
  • 15fed1e88b Draft Falcon support Aleksandr Borzunov 2023-09-02 23:24:42 +0000
  • 9cb4c721e7 Fix checking for nonexistent keys Max Ryabinin 2023-09-03 01:55:50 +0300
  • 16fb547960 Fix removal of nonexistent keys Max Ryabinin 2023-09-03 01:35:00 +0300
  • f100915641 Reformat code with black Max Ryabinin 2023-09-03 01:09:56 +0300
  • 4644131086 Ignore missing qkv_proj.weight when loading a checkpoint Max Ryabinin 2023-09-03 01:08:55 +0300
  • a7f87b636b Disable the optimization [no_qkv_merge] Max Ryabinin 2023-09-03 00:49:23 +0300
  • b2ab84cc33 Add dry_run option to --throughput Max Ryabinin 2023-09-03 00:41:03 +0300
  • c666a975d0 Remove unused import in throughput.py Max Ryabinin 2023-09-03 00:38:35 +0300
  • 57119bb201 Merge query/key/value projection layers Max Ryabinin 2023-09-03 00:11:03 +0300
  • fa464dfc99 WIP Triton+QKV merge [wip_triton] Max Ryabinin 2023-09-03 00:11:03 +0300
  • b4d822afb2 Force use_cache=True in config only (#497) Alexander Borzunov 2023-09-03 01:16:00 +0400
  • 9cd25d95e6 Force use_cache=True in config only Aleksandr Borzunov 2023-09-02 21:03:56 +0000
  • abd547735f Force use_cache=True (#496) Alexander Borzunov 2023-09-02 22:57:18 +0400
  • 4d5a21ca31 Update setup.cfg Alexander Borzunov 2023-09-02 21:36:34 +0400
  • 13f66c4e5e Force use_cache=True Aleksandr Borzunov 2023-09-02 17:32:39 +0000
  • ce89b649b5 Merge branch 'main' into forward_kwargs justheuristic 2023-09-01 16:42:30 +0300