Disclaimer: it's my project, but I run an open source project called deeplearning4j, whose algorithms have a hardware abstraction layer built into them called nd4j. You get NumPy on the JVM, with hardware support delivered as a jar file. Deeplearning4j itself is built on top of that. I'd love to help spread deep learning to different runtimes.
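To illustrate the idea of a hardware abstraction layer (this is a hypothetical sketch, not nd4j's actual API — the `NDBackend`/`CpuBackend` names below are made up): you code against one array interface, and the backend underneath is swappable.

```java
// Hypothetical sketch of a hardware abstraction layer, NOT nd4j's real API.
// nd4j selects its backend from whichever jar is on the classpath; here we
// fake that with a plain interface and a CPU implementation.
interface NDBackend {
    // c = a * b for row-major square matrices of size n x n
    double[] gemm(double[] a, double[] b, int n);
}

class CpuBackend implements NDBackend {
    public double[] gemm(double[] a, double[] b, int n) {
        double[] c = new double[n * n];
        for (int i = 0; i < n; i++)
            for (int k = 0; k < n; k++) {
                double aik = a[i * n + k];
                for (int j = 0; j < n; j++)
                    c[i * n + j] += aik * b[k * n + j];
            }
        return c;
    }
}

public class HalDemo {
    public static void main(String[] args) {
        // A CUDA or OpenCL backend would slot in here without changing callers.
        NDBackend backend = new CpuBackend();
        double[] identity = {1, 0, 0, 1};
        double[] m = {1, 2, 3, 4};
        System.out.println(java.util.Arrays.toString(backend.gemm(identity, m, 2)));
        // → [1.0, 2.0, 3.0, 4.0]
    }
}
```

The point is that user code never mentions CUDA or OpenCL directly; only the backend implementation changes.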
Need to run empirical benchmarks. CUDA is usually faster, but I'd like to run my own benchmarks with nd4j. We have our own benchmark setup that works for every backend, which lets us do some interesting things. CUDA itself usually wins on data transfer latency, though[1].
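For anyone rolling their own numbers on the JVM, the classic pitfall is skipping JIT warm-up. A minimal backend-agnostic timing sketch (plain Java, no nd4j — the workload and iteration counts are arbitrary placeholders):

```java
public class BenchSketch {
    // Toy workload standing in for a real op (gemm, conv, etc.)
    static double workload(int n) {
        double acc = 0;
        for (int i = 1; i <= n; i++) acc += Math.sqrt(i);
        return acc;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        // Warm-up: let the JIT compile the hot path before measuring.
        for (int i = 0; i < 10; i++) workload(n);

        int runs = 20;
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long t0 = System.nanoTime();
            workload(n);
            long elapsed = System.nanoTime() - t0;
            if (elapsed < best) best = elapsed;
        }
        System.out.printf("best of %d runs: %.3f ms%n", runs, best / 1e6);
    }
}
```

Reporting the best of several timed runs is one simple way to reduce noise; for serious JVM benchmarking a harness like JMH handles this properly.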
Looking forward to running these ourselves once our OpenCL support lands (only the kernels are written so far =/)
I plan on basing the OpenCL work on our CUDA work, which is fairly well established at this point (mainly doing optimizations now; not much change in architecture).
I wouldn't be surprised if this turned out not to be accidental. I mean, it certainly doesn't hurt Nvidia for OpenCL to continue to be seen as the "slower option". So I'm sure efforts to improve their OpenCL implementation aren't considered a business priority.
True, though if you compare roughly equivalent Nvidia and AMD GPUs, my impression is that CUDA on Nvidia still outperforms OpenCL on AMD for deep learning. Is this right?
Here's the current work being done on opencl: https://github.com/deeplearning4j/nd4j/tree/master/nd4j-jocl...
We'd love to get this finished. There's a bit more to do yet, though... we're definitely looking for contributors here. You'll get OpenCL neural nets for free.