To the author of this article: Could you run your node test with "node --trace-gc <script>"? That will output when, and for how long, node's GC is doing its thing.
Anyway, this could be a legit complaint at this point.
It explains why JSON parsing doesn't perform well, but doesn't explain the original benchmark results. Would be great to get a comment from @mraleph or the other V8 guys on that.
1) The one with Buffers also causes mark-sweep/compact pauses (7-15 ms each), because the Buffer constructor calls AdjustAmountOfExternalAllocatedMemory, which triggers a full GC cycle if V8 thinks too much memory is held outside the VM (see the sketch below).
2) GCs in the string-based benchmark are mainly scavenges taking ~0 ms, plus fewer than 10 full collections taking <6 ms each on my desktop.
That is all I can say. V8 GC is performing well here from my point of view.
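If anyone wants to reproduce the Buffer effect from 1), a throwaway script along these lines (my own sketch, not the benchmark code) makes it visible:

    // Buffer allocations count as external memory; enough of them pushes
    // V8 past its external-allocation limit and forces full GC cycles.
    // Run with: node --trace-gc this-script.js
    var live = [];
    for (var i = 0; i < 50000; i++) {
        live.push(new Buffer(25 * 1024));    // ~25KB per allocation
        if (live.length > 100) live.shift(); // keep a bounded working set
    }
    console.log('done');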
Yes, --trace-gc shows about 10 mark-sweeps per second, each taking around 13 ms (no compacts, as far as I could see), which works out to roughly 130 ms out of every second. But is that ~15% spent in GC enough to explain the performance gap?
You were saying that V8's GC is failing here, so I just explained why JSON.parse is especially hard on V8's GC.
Strictly speaking, I am not even convinced that GC is the bottleneck here. Only profiling can reveal the real bottleneck.
[I tried a small experiment: I used a third-party pure-JS JSON parser instead of V8's JSON.parse. That changed the GC profile, but did not affect response times.]
V8 has a very impressive garbage collector (stop-the-world, generational, accurate), and the GC is probably a part the Google team has spent a lot of time tuning and working on, as it's one of the hardest and most important parts of building a VM...
My guess is that node's GC configuration isn't finely tuned for 25KB structures or maybe the GC is called prematurely.
Some suggestions: try turning off the GC and re-running the benchmark, try smaller JSON data structures, try different versions of node. Each of these would give more evidence about where the problem is; a sketch of the second experiment follows below.
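For the smaller-structures experiment, an untested sketch along these lines would do (the payload shape and all names here are my own guesses, not the benchmark's):

    // Time JSON.parse over payloads of growing size to see whether the
    // slowdown scales with the size of the parsed structure.
    function makePayload(props) {
        var obj = {};
        for (var i = 0; i < props; i++) {
            obj['key' + i] = 'short value'; // ~10-character strings
        }
        return JSON.stringify(obj);
    }
    [10, 100, 1000].forEach(function (props) {
        var json = makePayload(props);
        var start = Date.now();
        for (var i = 0; i < 10000; i++) {
            JSON.parse(json);
        }
        console.log(props + ' props (' + json.length + ' bytes): ' +
                    (Date.now() - start) + ' ms');
    });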
Btw. in that benchmark: which versions of RingoJS and node.js are used? How much memory does each server use in the end?
Edit:
What type of garbage collector does RingoJS/Rhino use - how is the GC configured for RingoJS?
The JSON I'm parsing is just objects with short string properties (around 10 characters each). There's just one longer 25kb JSON string, but that one is never collected. As for Node configuration, can you suggest specific options to use? I've been asking about this on #node.js (and asked ryan), and I'm open to any suggestions.
Ringo is running with the server hotspot JVM without any further options.
By default, Java 6 uses a generational collector: multi-threaded stop-the-world copying for the young generation, and single-threaded stop-the-world mark-sweep-compact for the tenured generation.
I think the editorializing of the headline is unnecessarily negative (read: biased). There is obviously an issue Ryan is working on addressing, but Node is clearly already providing performance suitable for many server workloads.
I'm the author of both the original article and this HN posting - and yes, I am biased, since I'm the main developer of RingoJS (the other platform in that benchmark). I've made that quite clear and provided additional background in the original benchmark to which this is just a short update: http://hns.github.com/2010/09/21/benchmark.html
I think my benchmark and the conclusions I draw from it (after a lot of thinking) are fair. My intention is just to make people see there's no magic bullet with performance or scalability, and that there are alternatives for server-side JavaScript.
I think your conclusions in the article are fair. I think the title on HN is misleading because it's a quantitative issue.
V8's GC is a well-known concern in the Node community, but it's still performing well enough that Node is considerably faster than traditional servers (like Apache). The fact that Ringo is also faster doesn't make V8 "not ready"; it just means it could be improved.
If I wanted to be contentious, I could suggest that "Hacker News comments confirm that RingoJS may not be ready for developers", because the author likes taking pot shots at other frameworks. But that would be petty, wouldn't it?
You are right about the title. That "not ready for the server" is a foolish phrase. I'd change it to "not tuned for the server" if I could, but it looks like it's impossible to change that now.
First, the response time variation is an important observation, thanks for that. To make sure it is caused by V8's GC, we need a GC log from the benchmark.
Second, to make this a fair comparison, you need to use similarly sized heaps. It could be that the JVM heap was large enough to run the whole test without a major GC. We need a GC log for this part of the test as well (e.g. via -verbose:gc).
Sun was aware that server and client use need different GC strategies and lets us choose between them. V8, however, seems to be optimized for the client side.
I'm not yet convinced this is due to GC. I sent a pull request[1] to Hannes to use the faster Buffer technique, to at least rule out interference from V8's crazy slow string juggling under load.
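For reference, the Buffer technique is roughly this (my paraphrase of the idea, not the actual contents of the pull request):

    // Encode the response body to a Buffer once, instead of having node
    // convert the string to bytes on every write.
    var http = require('http');
    http.createServer(function (req, res) {
        var body = JSON.stringify({hello: 'world'});
        var buf = new Buffer(body); // one up-front UTF-8 encode
        res.writeHead(200, {
            'Content-Type': 'application/json',
            'Content-Length': buf.length
        });
        res.end(buf);
    }).listen(8000);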
Has anyone done any experiments with node.js and Jägermonkey? They’re getting pretty close to V8 in speed ( http://arewefastyet.com/ ) and might prove better for server use (utter conjecture on my part).
But none of these seem to be low level enough to make a reasonable comparison.
Flusspferd ( http://flusspferd.org/ ) seems most like node, but uses SpiderMonkey (circa Firefox 3.5, it seems, though it's actively maintained, so there's hope for future enhancements). Unfortunately, it doesn't yet seem to have all the handy web-serving stuff that node has, so it's probably still not useful for a competitive benchmark.
I talked with Brendan Eich in New York after JSConf.eu and he was suggesting they might be looking at porting the V8 API to Jägermonkey after they release it so that it can run Node.
Just an educated guess: if you're allocating tons of objects and strings and your app gets slow, it's very likely the GC. But I don't know V8 well enough to say for sure.
You're interpreting the graph incorrectly. There's no "bottom of the curve"; the graph shows a distribution, not some property over time. A correct reading tells you, for example, that out of 50'000 requests, a total of ~30'000 completed in 100 ms or less. Similarly, for what you call the bottom of the curve: 5'000 out of 50'000 requests completed in ~25 ms or less. This does _not_ say which 5'000 requests those were, and in particular it does _not_ imply that the _first_ 5'000 requests were faster. For that you'd have to plot response time over time.
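A quick way to convince yourself (illustrative snippet, the numbers are made up):

    // Two runs with identical response-time distributions but opposite
    // orderings produce exactly the same cumulative plot.
    function distribution(times) {
        return times.slice().sort(function (a, b) { return a - b; });
    }
    var fastFirst = [10, 10, 10, 200, 200];
    var slowFirst = [200, 200, 10, 10, 10];
    console.log(distribution(fastFirst).join() === distribution(slowFirst).join()); // true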
Not only that, but one of Node's big selling points has been the supposedly flat response time distribution it achieves with a simple 'hello world' example.
Clearly that's not the case once Node/V8 needs to do a real-world amount of work per request.