Hacker News

If the JVM and/or the Scala runtime does user-space buffering and Go forwards straight to the read/write syscalls, that alone would probably be sufficient to explain the difference when reading/writing buffers this small.

If you don't do buffering in user space, the context switches in and out of the kernel will kill you when you do many small reads or writes.

No idea if that's it, but that's the first place I tend to look when troubleshooting networking app performance, as so many people just blindly use syscalls as if they were free.



The better performance of the Scala client+server if anything suggests less buffering, not more, since the next ping can't be written until the previous pong is received.

p.s. to parent: "the JVM TCP stack?" really?


I admit I haven't checked the example thoroughly - if it goes in lockstep then buffering won't be the culprit.

But you're wrong that better performance implies less buffering. A typical way to write such applications is to call select() or poll() (or equivalent), follow it with a single large non-blocking read, and then pick your data out of that buffer.

As pointed out above, if this "benchmark" does ping/pongs in lockstep across a single connection, then buffering vs. no buffering will make exactly no difference, as there's no additional data available to read. But in scenarios where the amount of data is larger, the time saved from fewer context switches quickly adds up and gives you far more time to actually process the data. Usually your throughput will increase, but your latency will also tend to drop despite the buffering, as long as the buffers are of a reasonable size.

Buffering is a problem when the buffers grow to large multiples of the typical transaction size, but for typical application-level protocols that takes really huge buffers.


My comment was specific to this benchmark, which has 4 byte messages that cannot be buffered due to the ping (wait for pong) ping ... repetition in each client. Of course buffering matters for full-pipeline throughput.



