Measuring Acceleration Structures
ibobev | 79 points
Smaller data is where it’s at when optimizing nowadays. Less bandwidth required and higher cache hit rate.
You can compute a ton per bit transferred from DRAM. On both CPUs and GPUs.
vardump | 3 months ago
I wrote a non-RTX on-GPU raytracer a while back (naive compared to this), and it's super interesting to read about the advances in compressing BVH structures.
But the changes also highlight a shift in focus, from just implementing this naively (RDNA3 is technically not too far removed from the naive raytracer I wrote) to something carefully engineered and optimized for memory bandwidth (with bandwidth-saving circuits even built into the silicon?).
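To make the compression idea concrete, here is a rough sketch of one common trick: storing a child bounding box as 8-bit offsets on a grid spanning the parent box instead of as full floats. This is only an illustration and not the layout of any particular GPU. The function names `quantize_box` and `dequantize_box` and the 8-bit width are assumptions made for the example.

```python
import math

def quantize_box(parent_lo, parent_hi, child_lo, child_hi, bits=8):
    """Snap a child AABB onto an integer grid spanning the parent AABB.

    Mins round down and maxes round up, so the stored box never shrinks.
    Hypothetical helper for illustration; a production encoder would also
    have to guard against floating-point rounding at the grid boundaries.
    """
    levels = (1 << bits) - 1
    q_lo, q_hi = [], []
    for p0, p1, c0, c1 in zip(parent_lo, parent_hi, child_lo, child_hi):
        extent = p1 - p0
        scale = levels / extent if extent > 0 else 0.0
        q_lo.append(max(0, min(levels, math.floor((c0 - p0) * scale))))
        q_hi.append(max(0, min(levels, math.ceil((c1 - p0) * scale))))
    return q_lo, q_hi

def dequantize_box(parent_lo, parent_hi, q_lo, q_hi, bits=8):
    """Rebuild an approximate (slightly enlarged) child AABB from the grid."""
    levels = (1 << bits) - 1
    lo = [p0 + (p1 - p0) * q / levels
          for p0, p1, q in zip(parent_lo, parent_hi, q_lo)]
    hi = [p0 + (p1 - p0) * q / levels
          for p0, p1, q in zip(parent_lo, parent_hi, q_hi)]
    return lo, hi

# Storage per child box: six 32-bit floats versus six 8-bit grid offsets.
FULL_BYTES = 6 * 4
QUANTIZED_BYTES = 6 * 1
```

The trade-off is that the decoded box is a bit looser than the original, so a ray may visit a few extra nodes, but each node fetch moves a quarter of the bounds data (24 bytes down to 6 per child in this sketch).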