"Scalar replacement" explodes the object into its class member variables and does not construct a class object at all. That does result in the exact same `sub %esp` (that Go would do for any struct), but it is restricted to only working if every single usage of that class type is fully inlined and the class is never passed anywhere that needs it in its object form.
It's worse than what Go has. Go can stack-allocate any struct and still pass pointers to it to non-inlined functions.
I don't know the exact reason here, but from my experience JVMs don't seem to perform optimisations as deep as static compilers. You can see the compiler not only missed scalar replacement here, but also didn't use a conditional move to avoid branching neither it performed any loop unrolling. Maybe it is because JITs are constrained by time and resources a lot more.
It'd be worth doing a deep dive into this - you can look at the compiler's intermediate representation to understand why it didn't make an optimisation. There may be something trivial getting in the way.
It's worse than what Go has. Go can stack-allocate any struct and still pass pointers to it to non-inlined functions.