Show HN: Can your GPU run this LLM?

ALTRC0320 | 15 points

Hey HN, I made this simple GitHub tool to check how much VRAM you need to train or run inference on any LLM. It supports multiple inference frameworks (HuggingFace/vLLM/llama.cpp) and quantization formats (GGML/bnb). I made this after getting frustrated when I couldn't get a 4-bit 7B LLaMA to work on my RTX 4090 (24 GB), even though the model itself is only 7GB.
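For context, a checkpoint's file size isn't the whole VRAM story: the KV cache and framework overhead come on top of the weights. Here's a rough Python sketch of that kind of accounting; the layer/hidden-size numbers below are LLaMA-7B-ish defaults and the 1 GB overhead figure is a guess on my part, not the tool's actual formula.

```python
def estimate_inference_vram_gb(
    n_params_b: float,         # parameter count, in billions
    bits_per_param: int,       # 16 for fp16, 8 or 4 for quantized weights
    n_layers: int,
    hidden_size: int,
    context_len: int,
    batch_size: int = 1,
    overhead_gb: float = 1.0,  # CUDA context + framework buffers (rough guess)
) -> float:
    # Weights: params * bits / 8 bytes each
    weights_gb = n_params_b * 1e9 * bits_per_param / 8 / 1e9
    # KV cache: 2 tensors (K and V) per layer, hidden_size wide,
    # one entry per token, usually kept in fp16 (2 bytes)
    kv_cache_gb = 2 * n_layers * hidden_size * context_len * batch_size * 2 / 1e9
    return weights_gb + kv_cache_gb + overhead_gb

# 7B model at 4-bit with a 2048-token context: ~3.5 GB weights
# + ~1.1 GB KV cache + overhead, so noticeably more than the file size
print(f"{estimate_inference_vram_gb(7, 4, 32, 4096, 2048):.1f} GB")
```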

ALTRC0320 | 2 years ago

It's so over, I can't even run the lowest LLaMA model; it never began for iGPUcels https://ibb.co/jy6B0sf

I was actually looking for something that could test what LLM I could run, thanks a lot haha

IndigoIncognito | 2 years ago