
Could you share the benchmark source code of the first example?


Here's the one that showed a lot more speedup than the article:

https://pastebin.com/v9tczpus

Looks like the LLM invented a somewhat different test for it than the article had. I tried again and got this one, with the same data structure as in the article:

https://pastebin.com/SDdcchZG

That gave similar results to the article.

All the other tests still give little-to-no speedup on my machine.


Many thanks for providing the source. It also works on my machine.

TIL.


I tried the others on my x86 machine and they all do something for me - not nearly as much as the article, but something.


The "_ [0]byte" trick has no basis as far as I know. For the author's specified example, [1024]float64 will always be allocated on one whole page, i.e., always 64-byte aligned.

For "Array of Structs vs Struct of Arrays", using slices as fields is a good idea. If the purpose is to have each field allocated in its own memory block, just use pointers instead.


> The "_ [0]byte" trick has no basis as far as I know. For the author's specified example, [1024]float64 will always be allocated on one whole page, i.e., always 64-byte aligned.

You're right - I misread my results on that one. It is slower, not faster, on both my M2 and my x86 machine.


My last comment contained an imprecision and a misunderstanding.

> ... [1024]float64 will always be allocated on one whole page, i.e., always 64-byte aligned.

That holds only if it is allocated on the heap and sits at the start of the allocated memory block.
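For anyone who wants to check this on their own machine, here is a minimal sketch that prints whether a heap-allocated [1024]float64 starts on a 64-byte boundary. The names alignedAddr and sink are mine, not from the article, and the result depends on the Go version and allocator, so treat it as a probe rather than a guarantee.

```go
package main

import (
	"fmt"
	"unsafe"
)

// alignedAddr reports whether addr is a multiple of align.
// align must be a power of two. (Hypothetical helper, not from the article.)
func alignedAddr(addr, align uintptr) bool {
	return addr&(align-1) == 0
}

// sink is a package-level pointer so the array escapes to the heap.
var sink *[1024]float64

func main() {
	sink = new([1024]float64)

	// Address of the array itself, i.e. the start of its memory block.
	base := uintptr(unsafe.Pointer(sink))
	fmt.Println("array start 64-byte aligned:", alignedAddr(base, 64))

	// The second element sits 8 bytes past the start, so an array embedded
	// at a similar offset inside a larger object has no such guarantee.
	second := uintptr(unsafe.Pointer(&sink[1]))
	fmt.Println("second element 64-byte aligned:", alignedAddr(second, 64))
}
```

The addresses are only used for printing here; they are never converted back into pointers.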

> For "Array of Structs vs Struct of Arrays", using slices as fields is a good idea. If the purpose is to have each field allocated in its own memory block, just use pointers instead.

I misunderstood it.

It is like row-based database vs. column-based database. Both ways have their respective advantages and disadvantages.
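To make the row vs. column analogy concrete, here is a rough sketch of the two layouts in Go. The Particle and Particles types and their fields are made up for illustration and are not the article's data structure. Summing one field touches every record in the row layout, but only one contiguous slice in the column layout.

```go
package main

import "fmt"

// Particle is the "row" layout: all fields of one record sit together.
// (Hypothetical type, for illustration only.)
type Particle struct {
	X, Y, Z float64
	Mass    float64
}

// Particles is the "column" layout: each field is its own contiguous slice.
type Particles struct {
	X, Y, Z []float64
	Mass    []float64
}

// sumMassAoS walks whole records even though it only needs Mass,
// so the unused X, Y, Z values are pulled through the cache as well.
func sumMassAoS(ps []Particle) float64 {
	total := 0.0
	for i := range ps {
		total += ps[i].Mass
	}
	return total
}

// sumMassSoA reads only the Mass column.
func sumMassSoA(ps Particles) float64 {
	total := 0.0
	for _, m := range ps.Mass {
		total += m
	}
	return total
}

func main() {
	const n = 4
	aos := make([]Particle, n)
	soa := Particles{
		X:    make([]float64, n),
		Y:    make([]float64, n),
		Z:    make([]float64, n),
		Mass: make([]float64, n),
	}
	for i := 0; i < n; i++ {
		aos[i].Mass = float64(i + 1)
		soa.Mass[i] = float64(i + 1)
	}
	fmt.Println(sumMassAoS(aos), sumMassSoA(soa)) // prints "10 10"
}
```

The column layout tends to win when a loop uses one or two fields across many records; the row layout tends to win when each access needs most fields of a single record.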



