So this means that if I want to use an ML model I built in Python, but don't want to write the rest of the application in Python, I can do that?
This looks interesting. I use ONNX to call my PyTorch models from .NET, but so far that has meant I can't try out JAX-based libraries (they don't have ONNX export), and it has also meant writing C# boilerplate to preprocess my input data into the form the model expects.
Potentially this could be a lot better, but I'd be curious what speed overhead the IPC layer adds. At least with ONNX you get a nice speed bump :)
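For context, the preprocessing boilerplate being described usually amounts to turning raw input into the flat, normalized tensor layout the model expects. A minimal sketch (in Python for brevity; the shape conventions and ImageNet-style normalization constants are common assumptions, not tied to any particular model):

```python
# Sketch (illustrative): convert nested [H][W][RGB] uint8 pixels into a flat,
# normalized channel-major (NCHW) float list, the layout many exported
# vision models expect. Mean/std are the usual ImageNet constants.

MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def to_nchw(pixels):
    """pixels: list of H rows, each a list of W [r, g, b] uint8 triples."""
    h, w = len(pixels), len(pixels[0])
    out = []
    for c in range(3):                       # channel-major order
        for y in range(h):
            for x in range(w):
                v = pixels[y][x][c] / 255.0  # scale to [0, 1]
                out.append((v - MEAN[c]) / STD[c])
    return out  # length 3*h*w, ready to feed as a (1, 3, H, W) tensor

# Tiny 1x2 "image": one black pixel, one white pixel
tensor = to_nchw([[[0, 0, 0], [255, 255, 255]]])
```

The point of a tool like Carton would be that this glue lives next to the model once, instead of being reimplemented in every host language.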
Any plans to support Windows? That would make Carton the ultimate library for embedding LLMs in desktop applications.
Is this ancillary to what [these guys](https://github.com/unifyai/ivy) are trying to do?
This HN post looks really weird on mobile (no, not the website, HN itself)
Is this the same as Nvidia's Triton?
When will you release a Java client?
I'd love to see this for golang (even without GPU support).
Maybe I'm missing something here, but isn't this largely achieved by ONNX already?
[0] https://onnx.ai
> From any [*] programming language.
[*] If "any programming language" is Python or JavaScript.
Make it for Go, and I am sold. Running ML models in Go services is still an unsolved problem.
Slightly related dumb question, I saw on GitHub that TensorFlow has Java support. Does anyone actually use TensorFlow with Java?
> Carton wraps your model with some metadata and puts it in a zip file
Why a zip file?
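One plausible reason (an assumption, not from the Carton docs): a zip is a single self-contained artifact that almost every language can read with its standard library, so metadata and weights travel together without a custom loader. A sketch with Python's stdlib, with made-up file names for illustration:

```python
import io
import json
import zipfile

# Pack: metadata plus placeholder model weights into one in-memory zip.
# "carton.json" and "model/weights.bin" are illustrative names, not
# Carton's actual internal layout.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("carton.json", json.dumps({"runner": "torchscript",
                                           "model_name": "example"}))
    zf.writestr("model/weights.bin", b"\x00\x01\x02")  # stand-in bytes

# Unpack: any consumer can read the metadata first, then decide how
# to load the weights.
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    meta = json.loads(zf.read("carton.json"))
```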
"...run any machine learning model from any programming language*."
*As long as that language is Python or Rust.
What I think is that this is nothing more than a resume-bolstering effort that doesn't really need to exist and probably won't once OP lands a role at whatever FAANG company they're trying to impress.
Just some random brain dump: Why limit to ML models?
Perhaps we can (should?) have some universal package hub, where you package and push a "thing" from any language, then pull and use it from any other language, with metadata describing the input/output schema. The underlying engine could use WASM or containers or something like that.
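A hedged sketch of what that cross-language metadata might look like: a manifest describing the packaged "thing" and its input/output schema, serialized to JSON so any language can consume it. All field names here are invented for illustration.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class Port:
    """One named input or output of the packaged 'thing'."""
    name: str
    dtype: str   # e.g. "float32", "string"
    shape: list  # symbolic dims allowed, e.g. ["batch", 224, 224, 3]

@dataclass
class Manifest:
    name: str
    runtime: str  # e.g. "wasm", "container"
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)

m = Manifest(
    name="sentiment-classifier",
    runtime="wasm",
    inputs=[Port("text", "string", ["batch"])],
    outputs=[Port("score", "float32", ["batch"])],
)
blob = json.dumps(asdict(m))  # language-neutral wire format
```

A consumer in any language would only need a JSON parser to discover what the package accepts and returns before invoking it.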