I built mlx-onnx (as part of mlx-ruby), a standalone exporter that converts MLX graphs to ONNX.
Web Demo: https://skryl.github.io/mlx-ruby/demo
Repo: https://github.com/skryl/mlx-onnx
It provides:
- MLX callable -> ONNX export
- Python API + native C++ API
The goal is to make it easier to move MLX models into ONNX tooling (onnxruntime, validation, downstream deployment), while keeping export behavior testable and explicit.
Quick example:
import mlx.core as mx
import mlx_onnx as mxonnx
def f(x):
    return mx.exp(x + 1.0)
x = mx.array([[1.0, 2.0]], dtype=mx.float32)
mxonnx.export_onnx("model.onnx", f, x, model_name="demo", opset=18)
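To check the exported file on the ONNX side, something like the following might work. This is a minimal sketch, not part of mlx-onnx: the helper names reference and check_export are hypothetical, and it assumes the exported graph has exactly one input and one output and that the onnx, onnxruntime and numpy packages are installed.

```python
import numpy as np


def reference(x):
    # NumPy version of the exported function f(x) = exp(x + 1.0).
    return np.exp(x + 1.0)


def check_export(path, x, rtol=1e-5, atol=1e-6):
    # Hypothetical helper: validate the ONNX file, run it with onnxruntime,
    # and compare the result against the NumPy reference.
    # Imports are local so the reference function works without these packages.
    import onnx
    import onnxruntime as ort

    # Structural validation of the model file.
    onnx.checker.check_model(onnx.load(path))

    # Assumption: the graph has a single input and a single output.
    sess = ort.InferenceSession(path, providers=["CPUExecutionProvider"])
    input_name = sess.get_inputs()[0].name
    out = sess.run(None, {input_name: x})[0]

    np.testing.assert_allclose(out, reference(x), rtol=rtol, atol=atol)
    return out
```

After running the export above, calling check_export("model.onnx", np.array([[1.0, 2.0]], dtype=np.float32)) should raise if the ONNX output drifts from the NumPy reference.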
Performance is competitive with Python MLX. On small models, Ruby is within 0.55-1.54x of Python depending on model type and device. The heavy lifting happens in the same C++ / Metal runtime either way.
Ruby deserves better ML tooling. The language is expressive enough that model definitions can actually be more readable than their Python equivalents. Run gem install mlx to try it out.
One caveat is that this paper only covers training, which can be done on a single CS-3 using external memory (swapping weights in and out of SRAM). There is no way that a single CS-3 will hit this record inference performance with external memory, so this was likely done with 10-20 CS-3 chips and the full model in SRAM. You definitely can’t compare tokens/$ for that kind of setup against a DGX.
Thanks for the correction. They are currently using FP16 for inference according to OpenRouter. I had assumed they would be on FP8, since relying solely on SRAM puts them under pressure to use as little memory as possible. I wonder why they opted for FP16 instead.
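For a rough sense of that memory pressure, here is a back-of-the-envelope sketch. The numbers are assumptions rather than anything from the paper: 44 GB of on-chip SRAM per CS-3 (the figure I recall from Cerebras's published specs), a hypothetical 405B-parameter model, and weights only (no KV cache, activations, or overhead).

```python
SRAM_PER_CHIP_BYTES = 44_000_000_000  # assumed: 44 GB of on-chip SRAM per CS-3
PARAMS = 405_000_000_000              # hypothetical model size, not from the paper


def chips_needed(params, bytes_per_param, sram_per_chip=SRAM_PER_CHIP_BYTES):
    # Ceiling division: minimum chips needed to hold the weights alone in SRAM.
    total_bytes = params * bytes_per_param
    return -(-total_bytes // sram_per_chip)


fp16_chips = chips_needed(PARAMS, 2)  # 2 bytes per weight
fp8_chips = chips_needed(PARAMS, 1)   # 1 byte per weight
print(f"FP16: {fp16_chips} chips, FP8: {fp8_chips} chips")
```

Under those assumptions FP16 needs roughly twice as many chips as FP8, which is why the choice of FP16 is surprising.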
Trusted (http://usetrusted.com) | San Francisco | Onsite, Fulltime | $100-$150k, 0.5-1.0% equity
Contact: alex@usetrusted.com
Trusted alleviates the pain parents face in discovering, scheduling and paying for high-quality, vetted child care.
We are a small team working on transforming the child care industry and helping countless parents in the process. We care deeply about the quality of the service we provide, but we also pride ourselves on the wellbeing and happiness of our team. Our day-to-day usually involves a standup around 10am and a few 10-minute exercise breaks throughout the day, and we normally wrap things up between 6pm and 7pm.
We're looking for an experienced front-end engineer to lead client-side JavaScript development and grow both our internal and customer-facing web clients. Because of the small size of our team, we love engineers who feel comfortable across the whole stack but specialize in something they love!
Skills We Are Looking For:
* 5+ years of client-side JavaScript development
* Deep knowledge of React, Angular, Backbone, or another client-side framework
* Experience with UI/UX testing
Bonus:
* Design chops
* A portfolio which showcases your previous work
* A GitHub account with cool projects in it
* Experience with server-side technologies (Ruby, Python, PHP, etc)
* Mobile development experience
Thanks for writing this. If anyone else is ever in a similar situation, please do your best to get out of the room. Even if you think your attacker might be hurt and is no longer restraining you, just get out. Get out and THEN call someone. Knock on doors, whatever... if you don't have your phone. Staying put and waiting for the attacker to leave is a BAD idea, even if you get a chance to use a phone.
It's like calling private methods on a class ;) Not easily accessible but once you figure out how to do it you can accomplish certain feats that may have seemed impossible beforehand. Alas, with great power comes great responsibility. Ever seen Limitless?