Jannis Schönleber 6 months ago
parent f409eea77a
commit 5964e59b9d

@@ -137,6 +137,7 @@ While an in-depth knowledge about the Transformer architecture is not required,
* [nanoGPT](https://www.youtube.com/watch?v=kCc8FmEb1nY) by Andrej Karpathy: A 2-hour YouTube video on reimplementing GPT from scratch (for programmers).
* [Attention? Attention!](https://lilianweng.github.io/posts/2018-06-24-attention/) by Lilian Weng: Introduces the need for attention in a more formal way.
* [Decoding Strategies in LLMs](https://mlabonne.github.io/blog/posts/2023-06-07-Decoding_strategies.html): Provides code and a visual introduction to the different decoding strategies used to generate text.
* [Tensorli](https://github.com/joennlae/tensorli): A minimalistic implementation of a trainable GPT-like transformer using only NumPy (<650 lines).
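
To give a taste of what these resources cover, here is a minimal sketch of scaled dot-product attention, the core operation behind the Transformer, written with plain NumPy in the spirit of Tensorli. The function names and shapes are illustrative, not taken from any of the linked projects.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(q, k, v):
    # q, k, v: arrays of shape (batch, seq_len, d_k).
    d_k = q.shape[-1]
    # Similarity of each query with each key, scaled by sqrt(d_k).
    scores = q @ k.swapaxes(-2, -1) / np.sqrt(d_k)
    # Attention weights: each row is a distribution over the keys.
    weights = softmax(scores, axis=-1)
    # Output: weighted average of the values.
    return weights @ v, weights

rng = np.random.default_rng(0)
q = rng.normal(size=(2, 4, 8))
k = rng.normal(size=(2, 4, 8))
v = rng.normal(size=(2, 4, 8))
out, weights = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (2, 4, 8)
```

Each output row is a convex combination of the value vectors, with the mixing weights determined by query-key similarity; this is the mechanism the attention resources above motivate and derive.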
---
### 2. Building an instruction dataset
