U-Net CNN in APL: Exploring Zero-Framework, Zero-Library Machine Learning

tosh | 94 points

It's neat to see ongoing Co-dfns work from Aaron and others! For anyone interested in very cool, esoteric, yet serious programming, there are a number of YouTube videos online: https://www.youtube.com/playlist?list=PLDU0iEj6f8duXzmgnlGX4....

sctb | 10 months ago

Impressively concise implementation, and a really interesting paper! The benchmark looks quite questionable, though. They use fp64, while any sane person would use at least fp32, if not fp16. They use batch size 1, while one would normally use the largest batch that fits in memory, dropping to 1 only for much bigger models or inputs. And they measure time including transfers to/from the GPU, while those would normally be interleaved with GPU operations. Not sure what the results would look like in a more realistic setup, but getting within 2x of PyTorch even under those conditions still looks impressive!
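
For comparison, a more conventional timing setup in PyTorch would look something like this (a minimal sketch; the layer and shapes here are illustrative, not the paper's actual benchmark):

```python
# Minimal sketch of a more conventional GPU benchmark in PyTorch.
# Assumes a CUDA device; the layer and shapes are illustrative only.
import torch
import torch.nn as nn

device = torch.device("cuda")
model = nn.Conv2d(3, 64, kernel_size=3, padding=1).to(device).half()  # fp16, not fp64
x = torch.randn(16, 3, 256, 256, device=device, dtype=torch.half)     # batch 16, not 1

with torch.no_grad():
    # Warm-up so one-time CUDA initialization isn't counted.
    for _ in range(10):
        model(x)

    # Time GPU work with CUDA events, excluding host<->device transfers.
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(100):
        model(x)
    end.record()
    torch.cuda.synchronize()
    print(f"{start.elapsed_time(end) / 100:.3f} ms per forward pass")
```

Since the CUDA events are recorded on the GPU stream, this measures kernel time without the one-time host-to-device transfer, and the warm-up keeps lazy CUDA initialization out of the numbers.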

lopuhin | 10 months ago

Interesting Futhark mention as well in the Related work section:

> Another approach to GPU-based array programming with an APL focus is the TAIL/Futhark system [8], which is a compiler chain taking APL to the TAIL (Typed Array Intermediate Language) and then compiling TAIL code using the Futhark GPU compiler backend.

fulafel | 10 months ago

I wonder what it would look like in kdb+?

natas | 10 months ago

Nobody should write a backward pass by hand.
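
To illustrate the point, here is a minimal PyTorch sketch (not code from the paper) of what autodiff buys you:

```python
# Minimal sketch: letting autograd derive the backward pass.
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()  # forward pass: y = sum of x_i squared

y.backward()         # backward pass generated automatically
print(x.grad)        # tensor([2., 4., 6.])

# Hand-written equivalent of the same gradient, dy/dx_i = 2*x_i:
print(2 * x.detach())  # tensor([2., 4., 6.])
```

With a framework, the gradient code is generated from the forward pass; without one, every layer's backward pass has to be derived and implemented manually.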

mlajtos | 10 months ago