This is a problem with NVIDIA too. I just made all my infra easy to reprovision.

jacquesm · on Nov 26, 2021

Interesting, can you describe this in a bit more detail? It runs completely counter to my experience: so far NVIDIA has just been a long string of 'boring' for me, in that it just works. Even applications written for older cards and older versions of CUDA have continued to work just fine.

supernovae · on Nov 27, 2021

It was so bad that we just moved to immutable GPU infrastructure, whether the machines are physical or virtual. When a new release of the NVIDIA stack comes out, we re-image the machine and install it fresh.

CUDA on Linux with ML/GPU workloads is still kind of a hot mess, and I'd say we're far from having found a winner, despite what some suggest here.

It's gotten better, but it's still far easier to treat it like a mess and start fresh with every install.
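
For anyone wondering what "re-image on every release" could look like in practice, here is a rough sketch, not our actual tooling. It assumes one driver version is pinned per machine image; the `PINNED_DRIVER` value and all function names are made up for illustration. The only real external call is `nvidia-smi --query-gpu=driver_version --format=csv,noheader`. The idea is that any drift from the pinned version means rebuilding the box, never patching it in place.

```python
import subprocess

# Hypothetical pinned driver version baked into the image (assumption, not from the thread).
PINNED_DRIVER = "470.82.01"


def parse_driver_versions(smi_output: str) -> set[str]:
    """Collect the distinct driver versions from nvidia-smi CSV output (one line per GPU)."""
    return {line.strip() for line in smi_output.splitlines() if line.strip()}


def needs_reimage(installed: set[str], pinned: str) -> bool:
    """Immutable policy: anything other than exactly the pinned version means rebuild."""
    return installed != {pinned}


def installed_driver_versions() -> set[str]:
    """Ask nvidia-smi for the driver version reported by each GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_driver_versions(result.stdout)
```

A provisioning job would call `needs_reimage(installed_driver_versions(), PINNED_DRIVER)` and, if it returns `True`, kick off whatever re-imaging mechanism the fleet uses. A machine that reports no GPUs at all also counts as drift, which fits the "treat it like a mess and start fresh" approach.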