ZML - High performance AI inference stack
msoad | 35 points
Hi ya! Want to say this looks awesome :) really interested in the sharded inference demo!!! You said it was experimental, is it in the examples folder at all?? (On phone atm, so apologies for not investigating further)
hsjdhdvsk | 2 days ago
First of all, great job! I think inference will become more and more important.
That being said, I have a question regarding ease of use. How difficult is it for someone with a Python/C++ background to get used to Zig and (re)write a model to use with ZML?
onurcel | 3 days ago
Given that the focus is performance, do you have any benchmarks to compare against the likes of TensorRT-LLM?
Palmik | 2 days ago
my dreams have come true. hardware-agnostic ml primitives in a typed, compiled language.
my only question is: is zig stable enough to base such a project on?
montyanderson | 2 days ago
What would be the benefit of using ZML instead of relying on StableHLO/PJRT? Because the cost of porting models is for sure high.