ZML - High performance AI inference stack

msoad | 35 points

What would be the benefit of using ZML instead of relying on StableHLO/PJRT? The cost of porting models is surely high.

ismailmaj | 2 days ago

Hi ya! Want to say this looks awesome :) really interested in the sharded inference demo!!! You said it was experimental, is it in the examples folder at all?? (On phone atm, so apologies for not investigating further)

hsjdhdvsk | 2 days ago

First of all, great job! I think inference will become more and more important.

That being said, I have a question regarding ease of use. How difficult is it for someone with a Python/C++ background to get used to Zig and (re)write a model to use with ZML?

onurcel | 3 days ago

Given that the focus is performance, do you have any benchmarks to compare against the likes of TensorRT-LLM?

Palmik | 2 days ago

my dreams have come true. hardware-agnostic ml primitives in a typed, compiled language.

my only question is: is zig stable enough to base such a project on?

montyanderson | 2 days ago