Zen, CUDA, and Tensor Cores, Part I: The Silicon

throwaway71271 | 175 points

The answer to the leading question "What’s the difference between a Zen core, a CUDA core, and a Tensor core?" is not covered in Part 1, so you may want to wait if this interests you more than chip layouts.

fulafel | 4 months ago

You can calculate the area of the tensor and raytracing units by measuring and comparing die sizes between the nearest 20-series and 16-series chips. Contrary to the assumptions a lot of people made from the cartoon diagrams, it's actually relatively small: together they make up approximately 18% of the cluster area and less than 10% of the chip as a whole. The area is roughly 2/3 tensor units and 1/3 raytracing units, so RT is around 3% of total chip area and tensor is around 6%.

https://old.reddit.com/r/hardware/comments/baajes/rtx_adds_1...

This could have changed somewhat in newer releases, but probably not too drastically, since NVIDIA has never really increased raw ray performance since the 20-series launch. And while there have been a few raytracing features added around the edges, raster and cache have been bumped significantly too (notably, Ampere got dual-issue fp32 pipelines... which didn't really work out for NVIDIA that well either!), so honestly there's a reasonable chance the tensor and RT share of the die is slightly smaller in subsequent architectures.
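
To make the area split above concrete, here is a minimal back-of-the-envelope sketch. It only reuses the approximate figures quoted in this comment; the 9% combined share is an assumption standing in for "less than 10%", not a measured value.

```python
# Rough arithmetic only; every input is an approximate figure from the comment.
combined_share_of_chip = 0.09  # ASSUMPTION: "less than 10%" taken as ~9%
tensor_fraction = 2 / 3        # roughly 2/3 of the combined area is tensor units
rt_fraction = 1 / 3            # roughly 1/3 of the combined area is RT units

# Share of the whole chip taken by each kind of unit
tensor_share = combined_share_of_chip * tensor_fraction
rt_share = combined_share_of_chip * rt_fraction

print(f"tensor ~{tensor_share:.0%}, RT ~{rt_share:.0%} of total chip area")
```

With those inputs the split lands at roughly 6% tensor and 3% RT, matching the figures quoted above.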

paulmd | 4 months ago

> Each of the tiles on the CPU side is actually a Zen 4 core, complete with its dedicated L2 cache.

Perhaps it would be more interesting to compare without the L2 cache.

kvemkon | 4 months ago

[flagged]

Darulquran-123 | 4 months ago

It was a good read. I wonder what hot takes he'll have in the second part, if any.

diabllicseagull | 4 months ago

I refused to buy the chips binned as defective, even when they represented better value. If the intent were truly to maximize yield, then for Ryzen, for example, there should be good 7-core versions where only 1 core was found to be defective. Since no 7-core Zens exist, at least some of the CPUs with 6-core CCDs have intentionally had 1 of the cores destroyed for reasons unknown, possibly to meet volume targets. If this is because Ryzen cores can only be disabled in pairs, then it boggles my mind that, given the price difference of tens to hundreds of dollars between the 6- and 8-core versions, it would not be economical to add the circuits that let each core be fused off individually and allow further product differentiation, especially considering how much effort and how many SKUs have gone into frequency binning on AM4 (5700X, 5800, 5800X, 5800XT, etc.) rather than bigger market segmentation jumps.
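
A toy sketch of the yield argument, for what it's worth. It assumes each of the 8 cores on a CCD fails independently with the same probability, which is a simplification, and the 5% per-core defect rate is a made-up illustrative number, not AMD data.

```python
from math import comb

CORES_PER_CCD = 8
P_CORE_DEFECT = 0.05  # HYPOTHETICAL per-core defect probability, illustration only


def p_defective(k: int, n: int = CORES_PER_CCD, p: float = P_CORE_DEFECT) -> float:
    """Binomial probability that exactly k of n cores are defective."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


for k in range(3):
    print(f"{k} defective core(s): {p_defective(k):.1%} of CCDs")
```

Under these assumptions, CCDs with exactly 1 bad core are several times more common than CCDs with exactly 2, which is why the absence of a 7-core SKU suggests some 6-core parts lose a working core.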

downvotetruth | 4 months ago