From 7653562aa12e76595467d39a7caff64c0408d64a Mon Sep 17 00:00:00 2001
From: Alexander Borzunov
Date: Fri, 2 Sep 2022 15:38:04 +0400
Subject: [PATCH 1/2] Use latest version of Petals scheme, shrink Petals logo (#59)

---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 3f70ccf..913d223 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,5 @@

-
+
Decentralized platform for running 100B+ language models

@@ -22,7 +22,7 @@ ## How it works?

-
+

### 🚧 This project is in active development

From 9bea7b9ea86614657adc871fd97074ddae74191f Mon Sep 17 00:00:00 2001
From: Alexander Borzunov
Date: Sat, 3 Sep 2022 06:38:18 +0400
Subject: [PATCH 2/2] Update bullet points with feedback from Tim and other people (#61)

Co-authored-by: Tim Dettmers
---
 README.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 913d223..6f0644f 100644
--- a/README.md
+++ b/README.md
@@ -11,9 +11,10 @@ ## Key features

-- Run inference or fine-tune [BLOOM-176B](https://huggingface.co/bigscience/bloom) by joining compute resources with people all over the Internet. No need to have high-end GPUs.
-- One inference step takes ≈ 1 sec — much faster than possible with offloading. Enough for chatbots and other interactive apps.
-- Employ any fine-tuning and sampling methods by accessing model's hidden states and changing its control flow — something you can't do in proprietary APIs.
+- Run inference or fine-tune large language models like [BLOOM-176B](https://huggingface.co/bigscience/bloom) by joining compute resources with people all over the Internet. No need to have high-end GPUs.
+- It's difficult to fit the whole BLOOM-176B into GPU memory [unless](https://twitter.com/Tim_Dettmers/status/1559892918395031552) you have multiple high-end GPUs. Instead, **Petals** allows you to load and serve a small part of the model, then team up with people serving all the other parts to run inference or fine-tuning.
+- This way, one inference step takes ≈ 1 sec — much faster than possible with offloading. Enough for chatbots and other interactive apps.
+- Beyond traditional language model APIs — you can employ any fine-tuning and sampling methods by executing custom paths through the model or accessing its hidden states. This allows for the comforts of an API with the flexibility of PyTorch.

[Read paper] | [View website]
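
The updated bullet points describe a client-side workflow: connect to a swarm that collectively hosts the model, generate text step by step, and reach into hidden states for custom fine-tuning. As a rough illustration of what that workflow might look like in Python, here is a minimal sketch; the class name `DistributedBloomForCausalLM`, the model ID `bigscience/bloom-petals`, and the import paths are assumptions modeled on the project's public examples, not content of these patches.

```python
# Hypothetical usage sketch for the workflow described in the bullet points
# above. Class and model names (DistributedBloomForCausalLM,
# "bigscience/bloom-petals") are assumptions, not taken from this patch.
import torch
from transformers import BloomTokenizerFast
from petals import DistributedBloomForCausalLM  # assumed import path

MODEL_NAME = "bigscience/bloom-petals"  # assumed swarm model ID

# Only a small part of the model lives locally; the transformer blocks are
# served by other peers, as the second bullet describes.
tokenizer = BloomTokenizerFast.from_pretrained(MODEL_NAME)
model = DistributedBloomForCausalLM.from_pretrained(MODEL_NAME)

# Interactive generation: each new token is one pass through the remote
# blocks (the "one inference step takes ~1 sec" claim in the bullets).
inputs = tokenizer("A cat sat on", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0]))

# Beyond a plain text-in/text-out API: hidden states come back as ordinary
# PyTorch tensors. The .transformer attribute follows the Hugging Face BLOOM
# convention and is likewise an assumption here.
with torch.no_grad():
    hidden = model.transformer(inputs).last_hidden_state
```

The last lines are the point of the final bullet: because hidden states are plain PyTorch tensors on the client, any local head or fine-tuning method can be attached to them without the server's cooperation.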