Ask HN: Build Your Own LLM?
retube | 8 points
Since you're posting here, you're looking for the shortcut.
The shortcut is Karpathy's "Let's Build GPT: from scratch, in code, spelled out" video:
https://www.youtube.com/watch?v=kCc8FmEb1nY
Then there is a good video that dives into LLMs and how they work that is quite approachable:
https://www.youtube.com/watch?v=7xTGNNLPyMI
From there, flesh out your knowledge with his other videos, where he goes both extremely light and extremely deep:
https://www.youtube.com/@AndrejKarpathy/videos
Anyway, I really like Karpathy's videos because he's very good at explaining LLMs at every level.
runjake | a day ago
Andrej Karpathy: Let's build GPT: from scratch, in code, spelled out. https://www.youtube.com/watch?v=kCc8FmEb1nY
sfmz | 2 days ago
How about this?
2ro | 2 days ago
Sorry to self-promote but I did exactly that a few months back: https://khamidou.com/gpt2/
Generally, I think the Karpathy tutorials are a good starting point, but they're very mathy. People will tell you that you only need high school math to understand LLMs, but a lot of the abstractions and concepts he uses are a bit foreign to programmers.
I found that rebuilding inference for a known model taught me a lot more than passively sitting through the videos and maybe retyping his code. You should try it with something simple, like a model from a few years back!
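To give a rough idea of what "rebuilding inference" means in practice, here is a minimal sketch of the outer generation loop in plain Python. Everything in it is an illustrative assumption: `VOCAB` is a made-up four-token vocabulary, and `toy_logits` is a hypothetical stand-in for a real forward pass, not code from GPT-2 or any library. In an actual rebuild you would replace `toy_logits` with the embeddings, attention blocks, and output projection computed from the published weights; the loop around it stays about the same.

```python
import math

# Made-up vocabulary for illustration; a real model has tens of thousands of tokens.
VOCAB = ["<eos>", "a", "b", "c"]


def toy_logits(tokens):
    """Hypothetical stand-in for a model forward pass.

    Returns one score per vocabulary entry. It simply favors the token
    whose id follows the last token's id, wrapping around to <eos>.
    """
    next_id = (tokens[-1] + 1) % len(VOCAB)
    return [5.0 if i == next_id else 0.0 for i in range(len(VOCAB))]


def softmax(logits):
    """Turn raw scores into probabilities (max subtracted for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def generate(prompt_tokens, max_new_tokens=8, eos_id=0):
    """Greedy autoregressive decoding: score, pick the most likely token, append, repeat."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        probs = softmax(toy_logits(tokens))
        next_id = max(range(len(probs)), key=probs.__getitem__)
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens


if __name__ == "__main__":
    out = generate([1])
    print(" ".join(VOCAB[t] for t in out))
```

Greedy decoding is used here only because it is the simplest choice; sampling from `probs` with a temperature is the usual next step once the forward pass matches the reference model.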