From 5964e59b9d598a996fbe24079a3300a2111e2315 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jannis=20Sch=C3=B6nleber?=
Date: Fri, 5 Jan 2024 09:44:15 +0100
Subject: [PATCH] add `tensorli`

https://github.com/joennlae/tensorli
---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 2c9810e..5996117 100644
--- a/README.md
+++ b/README.md
@@ -137,6 +137,7 @@ While an in-depth knowledge about the Transformer architecture is not required,
 * [nanoGPT](https://www.youtube.com/watch?v=kCc8FmEb1nY) by Andrej Karpathy: A 2h-long YouTube video to reimplement GPT from scratch (for programmers).
 * [Attention? Attention!](https://lilianweng.github.io/posts/2018-06-24-attention/) by Lilian Weng: Introduce the need for attention in a more formal way.
 * [Decoding Strategies in LLMs](https://mlabonne.github.io/blog/posts/2023-06-07-Decoding_strategies.html): Provide code and a visual introduction to the different decoding strategies to generate text.
+* [Tensorli](https://github.com/joennlae/tensorli): A minimalistic implementation of a trainable GPT-like transformer using only numpy (<650 lines)
 
 ---
 ### 2. Building an instruction dataset
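As context for the linked resource: below is a minimal sketch, not taken from Tensorli itself, of the core operation a numpy-only GPT-like transformer has to implement, namely single-head causal self-attention. All names, shapes, and weights here are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def causal_self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model); w_q, w_k, w_v: (d_model, d_head)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Scaled dot-product attention scores: (seq_len, seq_len).
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Causal mask: each position attends only to itself and earlier positions.
    mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    return softmax(scores) @ v  # (seq_len, d_head)

# Toy usage with random weights: 8 tokens, d_model = d_head = 16.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16))
w_q, w_k, w_v = (rng.standard_normal((16, 16)) * 0.1 for _ in range(3))
out = causal_self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (8, 16)
```

A full implementation in the spirit of the linked repository would additionally need trainable parameters with backpropagation, multi-head attention, feed-forward blocks, layer normalization, and embeddings; this sketch only shows the attention step.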