diff --git a/README.md b/README.md
index 0e5547b..3f70ccf 100644
--- a/README.md
+++ b/README.md
@@ -1,13 +1,59 @@
-# PETALS: Collaborative Inference of Large Models
+
+Decentralized platform for running 100B+ language models
+
+[Read paper] | [View website]
+
-### Installation
+## How does it work?
+
+
+### 🚧 This project is in active development
+
+Be careful: some features may not work, interfaces may change, and we have no detailed docs yet (see the [roadmap](https://github.com/bigscience-workshop/petals/issues/12)).
+
+A stable version of the code and a public swarm open to everyone will be released in November 2022. You can [subscribe](https://petals.ml/) to be emailed when it happens or fill in [this form](https://forms.gle/TV3wtRPeHewjZ1vH9) to help the public launch by donating GPU time. In the meantime, you can launch and use your own private swarm.
+
+## Code examples
+
+Solving a sequence classification task via soft prompt tuning of BLOOM-176B:
+
+```python
+import torch
+from torch.nn.functional import cross_entropy
+
+# Initialize distributed BLOOM with soft prompts
+model = AutoModelForPromptTuning.from_pretrained(
+    "bigscience/distributed-bloom")
+# Define an optimizer for the prompts and the linear head
+optimizer = torch.optim.AdamW(model.parameters())
+
+for input_ids, labels in data_loader:
+    # Forward pass with local and remote layers
+    outputs = model.forward(input_ids)
+    loss = cross_entropy(outputs.logits, labels)
+
+    # Distributed backward w.r.t. local params
+    loss.backward()   # Compute model.prompts.grad
+    optimizer.step()  # Update local params only
+    optimizer.zero_grad()
+```
+
+## Installation
 
 ```bash
 conda install -y -c conda-forge cudatoolkit-dev==11.3.1 cudatoolkit==11.3.1 cudnn==8.2.1.32
@@ -16,7 +62,6 @@
 pip install -r requirements.txt
 pip install -i https://test.pypi.org/simple/ bitsandbytes-cuda113
 ```
-
 ### Basic functionality
 
 All tests are run on localhost
@@ -89,3 +134,12 @@
 # test the full model
 pytest tests/test_full_model.py
 ```
+
+--------------------------------------------------------------------------------
+
+This project is part of the BigScience research workshop.
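The training loop in the soft-prompt-tuning snippet above updates only the locally stored prompt parameters while the remote transformer weights stay frozen. As a hedged, dependency-free illustration of that split (plain Python with hand-computed gradients; none of these names are Petals or PyTorch APIs):

```python
# Toy sketch of the "update local params only" pattern: the remote part of
# the model is frozen, and gradient descent touches only the locally held
# prompt parameter. All names here are illustrative, not real APIs.

def frozen_remote_forward(x):
    # Stand-in for the frozen remote transformer layers.
    return 3.0 * x

prompt = 0.0                      # the only trainable (local) parameter
lr = 0.05
data = [(1.0, 6.0), (2.0, 9.0)]   # (input, target) pairs

for _ in range(200):
    for x, y in data:
        # "Prepend" the prompt locally, then run the frozen remote part.
        pred = frozen_remote_forward(x + prompt)
        # Analytic gradient of the squared error (3*(x+prompt) - y)**2
        # w.r.t. prompt; only this local parameter gets updated.
        grad = 2.0 * (pred - y) * 3.0
        prompt -= lr * grad

print(round(prompt, 4))  # prompt converges to ≈ 1.0 for this toy data
```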
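The "How does it work?" section added by this diff describes splitting one huge model across many servers, each holding a slice of the layers, with a client routing activations through them in order. That idea can be sketched as a toy pure-Python pipeline (all class and function names here are hypothetical, not Petals APIs; real Petals servers discover each other over a DHT):

```python
# Toy sketch: a "swarm" where each server holds a contiguous slice of the
# model's layers, and a client pipes hidden states through them in order.

def make_layer(scale):
    # Stand-in for one transformer block: here, just elementwise scaling.
    return lambda hidden: [x * scale for x in hidden]

class ToyServer:
    """Holds a slice of the model's layers and runs them in sequence."""
    def __init__(self, layers):
        self.layers = layers

    def forward(self, hidden):
        for layer in self.layers:
            hidden = layer(hidden)
        return hidden

class ToyClient:
    """Routes activations through the servers, forming one pipeline."""
    def __init__(self, servers):
        self.servers = servers

    def run(self, hidden):
        for server in self.servers:
            hidden = server.forward(hidden)
        return hidden

# A 4-layer "model" split across two servers, two layers each.
layers = [make_layer(2.0) for _ in range(4)]
client = ToyClient([ToyServer(layers[:2]), ToyServer(layers[2:])])

print(client.run([1.0, 0.5]))  # same result as running all 4 layers locally
```

The key property the sketch shows is that the client never holds the full model: it only sees activations, while each server materializes just its own slice of the weights.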