A side note for those who use Snort[1]: this article describes the same reason that stream processing in Snort is single-threaded. See their attempt to make it multi-threaded:
The problem was poor cache utilization on a traditional shared-memory architecture when passing data (packets) from core to core. I wonder whether, while going down this road, they made sure key structures were cache-aligned.
http://securitysauce.blogspot.com/2009/04/snort-30-beta-3-re...
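To make the cache-alignment question concrete, here is a minimal C11 sketch of what I mean. It is not taken from Snort's source: `worker_stats`, `count_packet`, `MAX_WORKERS`, and the 64-byte line size are all my own assumptions for illustration. The idea is that each thread's hot data gets its own cache line, so one core writing its counters does not invalidate the line another core is using.

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignas, alignof */
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, typical on x86-64 but not universal. */
#define CACHE_LINE_SIZE 64
#define MAX_WORKERS 4

/* Hypothetical per-worker counters (not Snort's actual structures).
 * Aligning the first member raises the alignment of the whole struct,
 * and the compiler pads sizeof() up to a multiple of that alignment,
 * so neighbouring array elements never share a cache line. */
struct worker_stats {
    alignas(CACHE_LINE_SIZE) uint64_t packets;
    uint64_t bytes;
};

static_assert(alignof(struct worker_stats) == CACHE_LINE_SIZE,
              "struct should be aligned to a cache line");
static_assert(sizeof(struct worker_stats) % CACHE_LINE_SIZE == 0,
              "struct should occupy whole cache lines");

/* One slot per worker thread; each worker writes only its own slot. */
static struct worker_stats stats[MAX_WORKERS];

/* Called by worker `worker` for every packet it handles. */
static void count_packet(unsigned worker, size_t len)
{
    stats[worker].packets += 1;
    stats[worker].bytes += len;
}
```

This only addresses false sharing between per-thread structures. It does nothing for the case described above, where the packet data itself moves from core to core.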
Edit: to clarify, Snort can run across multiple threads, but a single stream is handled by a single thread. When they tried to process the same data in multiple threads at once, cache synchronization killed performance.
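For anyone wondering what "a single stream is handled by a single thread" can look like in practice, one common approach is to hash the connection's addresses and ports and use the result to pick a worker. The sketch below is my own illustration under that assumption, not Snort's actual dispatch code: `flow_key`, `mix32`, and `pick_worker` are hypothetical names, and the mixing function is a generic 32-bit finalizer rather than anything Snort uses.

```c
#include <stdint.h>

/* Hypothetical connection key (not a Snort structure). */
struct flow_key {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;
};

/* Generic 32-bit bit mixer so nearby inputs spread across workers. */
static uint32_t mix32(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x85ebca6bu;
    x ^= x >> 13;
    x *= 0xc2b2ae35u;
    x ^= x >> 16;
    return x;
}

/* Map a flow to a worker index in [0, num_workers).
 * XOR-ing source with destination makes the hash symmetric, so both
 * directions of a connection land on the same worker and that worker's
 * cache keeps the stream state. num_workers must be nonzero. */
static unsigned pick_worker(const struct flow_key *k, unsigned num_workers)
{
    uint32_t ips   = k->src_ip ^ k->dst_ip;
    uint32_t ports = (uint32_t)(k->src_port ^ k->dst_port);
    uint32_t h     = mix32(ips ^ (ports << 8) ^ k->proto);
    return h % num_workers;
}
```

The trade-off is the one this comment is about: a single heavy stream cannot be spread over several cores, but its state never has to bounce between their caches.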
[1] http://snort.org/