From da320988b029970f713eb141663e044db2038b40 Mon Sep 17 00:00:00 2001
From: Shawn Presser
Date: Sun, 5 Mar 2023 09:55:57 -0600
Subject: [PATCH] Update README.md

---
 README.md | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/README.md b/README.md
index d08fb8d..29a5258 100644
--- a/README.md
+++ b/README.md
@@ -6,6 +6,18 @@
 
 **UPDATE (3:58 AM CST)**: I've mirrored everything to R2, and updated the script to point to it. Note that the download command has changed (it uses a new version of the bash script), so you'll need to re-copy it from this README. The safety guarantees are the same for you in the end, though, and the bandwidth is still around 36MB/s, which isn't too bad. I'm honestly too tired to update the rest of the README to reflect this slowdown; I'll just leave it the way it was for tonight. Please tweet on the [announcement thread](https://twitter.com/theshawwn/status/1632238214529400832) if anything breaks again, and I'll fix it again.
 
+**UPDATE (9:51 AM CST)**: HN user MacsHeadroom left a [valuable comment](https://news.ycombinator.com/item?id=35029766):
+
+> I'm running LLaMA-65B on a single A100 80GB with 8bit quantization. $1.5/hr on vast.ai
+>
+> The output is at least as good as davinci.
+>
+> I think some early results are using bad repetition penalty and/or temperature settings. I had to set both fairly high to get the best results. (Some people are also incorrectly comparing it to chatGPT/ChatGPT API, which is not a good comparison. But that's a different problem.)
+>
+> I've had it translate, write poems, tell jokes, banter, write executable code. It does it all -- and all on a single card.
+
+---
+
 This repository contains a high-speed download of LLaMA, Facebook's 65B parameter model that was recently made available via torrent. (Discussion: [Facebook LLAMA is being openly distributed via torrents](https://news.ycombinator.com/item?id=35007978))
 
 It downloads all model weights (7B, 13B, 30B, 65B) at around 200 MB/s:
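The sampling settings MacsHeadroom mentions can be made concrete. Below is a minimal, self-contained sketch of the CTRL-style repetition penalty plus temperature scaling used by most LLaMA sampling loops; the function name and the default values (`temperature=0.9`, `repetition_penalty=1.2`) are illustrative assumptions, not settings taken from the comment.

```python
import math

def softmax(xs):
    """Convert raw logits into a probability distribution."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]

def adjust_logits(logits, prev_tokens, temperature=0.9, repetition_penalty=1.2):
    """CTRL-style repetition penalty, then temperature scaling.

    Tokens already generated get their logit divided by the penalty
    (or multiplied, if negative), making repeats less likely; dividing
    everything by the temperature then flattens or sharpens the
    distribution. The default values here are illustrative guesses,
    not the commenter's actual settings.
    """
    out = list(logits)
    for t in set(prev_tokens):
        out[t] = out[t] / repetition_penalty if out[t] > 0 else out[t] * repetition_penalty
    return [x / temperature for x in out]

# Toy vocabulary of 4 tokens; token 0 was already generated.
logits = [2.0, 1.0, 0.5, -1.0]
probs = softmax(adjust_logits(logits, prev_tokens=[0]))
print(probs)  # token 0's probability drops relative to a plain softmax
```

A penalty above 1.0 shrinks the logits of already-seen tokens, and a higher temperature flattens the whole distribution; the comment's point is that LLaMA apparently needs both set fairly high before its output compares well to davinci.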