From da320988b029970f713eb141663e044db2038b40 Mon Sep 17 00:00:00 2001
From: Shawn Presser
Date: Sun, 5 Mar 2023 09:55:57 -0600
Subject: [PATCH] Update README.md

---
 README.md | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/README.md b/README.md
index d08fb8d..29a5258 100644
--- a/README.md
+++ b/README.md
@@ -6,6 +6,18 @@
 
 **UPDATE (3:58 AM CST)**: I've mirrored everything to R2, and updated the script to point to it. Note that the download command has changed (it uses a new version of the bash script), so you'll need to re-copy it from this README. The safety guarantees are the same for you in the end, though, and the bandwidth is still around 36MB/s, which isn't too bad. I'm honestly too tired to update the rest of the README to reflect this slowdown; I'll just leave it the way it was for tonight. Please tweet on the [announcement thread](https://twitter.com/theshawwn/status/1632238214529400832) if anything breaks again, and I'll fix it again.
 
+**UPDATE (9:51 AM CST)**: HN user MacsHeadroom left a [valuable comment](https://news.ycombinator.com/item?id=35029766):
+
+> I'm running LLaMA-65B on a single A100 80GB with 8bit quantization. $1.5/hr on vast.ai
+>
+> The output is at least as good as davinci.
+>
+> I think some early results are using bad repetition penalty and/or temperature settings. I had to set both fairly high to get the best results. (Some people are also incorrectly comparing it to chatGPT/ChatGPT API, which is not a good comparison. But that's a different problem.)
+>
+> I've had it translate, write poems, tell jokes, banter, write executable code. It does it all -- and all on a single card.
+
+---
+
 This repository contains a high-speed download of LLaMA, Facebook's 65B parameter model that was recently made available via torrent. (Discussion: [Facebook LLAMA is being openly distributed via torrents](https://news.ycombinator.com/item?id=35007978))
 
 It downloads all model weights (7B, 13B, 30B, 65B) at around 200 MB/s:
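The sampling settings MacsHeadroom mentions can be made concrete. Below is a minimal, self-contained sketch of the CTRL-style repetition penalty plus temperature scaling used by most LLaMA sampling loops; the function name and the default values (`temperature=0.9`, `repetition_penalty=1.2`) are illustrative assumptions, not settings taken from the comment.

```python
import math

def softmax(xs):
    """Convert raw logits into a probability distribution."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]

def adjust_logits(logits, prev_tokens, temperature=0.9, repetition_penalty=1.2):
    """CTRL-style repetition penalty, then temperature scaling.

    Tokens already generated get their logit divided by the penalty
    (or multiplied, if negative), making repeats less likely; dividing
    everything by the temperature then flattens or sharpens the
    distribution. The default values here are illustrative guesses,
    not the commenter's actual settings.
    """
    out = list(logits)
    for t in set(prev_tokens):
        out[t] = out[t] / repetition_penalty if out[t] > 0 else out[t] * repetition_penalty
    return [x / temperature for x in out]

# Toy vocabulary of 4 tokens; token 0 was already generated.
logits = [2.0, 1.0, 0.5, -1.0]
probs = softmax(adjust_logits(logits, prev_tokens=[0]))
print(probs)  # token 0's probability drops relative to a plain softmax
```

A penalty above 1.0 shrinks the logits of already-seen tokens, and a higher temperature flattens the whole distribution; the comment's point is that LLaMA apparently needs both set fairly high before its output compares well to davinci.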