Hacker News
fuber2018
on July 7, 2023
on:
{n} times faster than C
If I unroll the main while loop in the SWAR version so that each iteration handles 4x as much data, the runtime drops to 0.0562s (average of 10 runs).
That's an overall 57.5x speedup.
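For anyone who hasn't seen the trick, here is a rough sketch of what a 4x-unrolled 64-bit SWAR loop can look like. The code being timed is not posted in this thread, so everything below is an assumption for illustration: the task (counting the bytes in a buffer that equal one target byte) and the names `eq_lanes` and `count_byte_swar4` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES 0x0101010101010101ULL  /* 0x01 in every byte lane */
#define LOW7 0x7F7F7F7F7F7F7F7FULL  /* low 7 bits of every byte lane */

/* Hypothetical helper: returns a word holding 0x01 in each byte lane
   where the corresponding byte of w equals c, and 0x00 elsewhere. */
static inline uint64_t eq_lanes(uint64_t w, uint8_t c) {
    uint64_t x = w ^ (ONES * c);     /* matching bytes become 0x00 */
    uint64_t t = (x & LOW7) + LOW7;  /* bit 7 set if any low 7 bits are set */
    t = ~(t | x | LOW7);             /* bit 7 set only where the byte was 0x00 */
    return t >> 7;                   /* move the flag down to bit 0 of each lane */
}

/* Hypothetical 4x-unrolled SWAR loop: 4 x 8 = 32 bytes per iteration. */
size_t count_byte_swar4(const unsigned char *p, size_t n, uint8_t c) {
    size_t count = 0;
    while (n >= 32) {
        uint64_t w[4];
        memcpy(w, p, sizeof w);      /* avoids unaligned-load undefined behaviour */
        /* Each lane of the sum is at most 4, so the lanes cannot overflow. */
        uint64_t lanes = eq_lanes(w[0], c) + eq_lanes(w[1], c)
                       + eq_lanes(w[2], c) + eq_lanes(w[3], c);
        /* Multiplying by ONES sums all 8 lanes into the top byte (at most 32). */
        count += (size_t)((lanes * ONES) >> 56);
        p += 32;
        n -= 32;
    }
    while (n--)                      /* scalar tail for the last 0..31 bytes */
        count += (*p++ == c);
    return count;
}
```

The point of unrolling here is that the four `eq_lanes` results are independent, and the horizontal sum (the multiply and shift) is paid once per 32 bytes instead of once per 8.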
fuber2018
on July 7, 2023
If I convert the unrolled 64-bit SWAR function to use 32-bit chunks instead, the average runtime almost doubles, to approx. 0.1s.
Need sleep now.
fuber2018
on July 7, 2023
If I unroll the 64-bit SWAR version by 8x instead of 4x, the runtime is reduced by another 10% over the 4x-unrolled SWAR version. Diminishing returns...