
Kernel optimization with BOLT (binary optimization and layout tool)

Here is another interesting BOLT article, this one on PostgreSQL optimization:

https://vondra.me/posts/playing-with-bolt-and-postgres/

"results are unexpectedly good, in some cases up to 40%"

Instruction cache and TLB thrashing is an often overlooked consequence of code bloat, and sometimes of overly aggressive micro-benchmark-driven optimization.

Reorganizing the binary is an interesting way to minimize that cost, but I think any performance-oriented developer should keep in mind that most projects rarely depend on a single hot loop. They depend on many systems working together and competing for space in the cache(s).

For that reason, I generally use -Os instead of -O2 or -O3 in my projects, and try to keep code bloat to a minimum.
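To make the layout point concrete, here is a toy sketch of why where hot code sits in the binary can matter. It is not how BOLT works internally, just a model under stated assumptions: a direct-mapped instruction cache with 64-byte lines and 512 sets (32 KiB), eight hypothetical 1 KiB hot functions, and a deliberately worst-case "scattered" layout where each function starts exactly one cache size apart, so they all collide in the same sets.

```python
# Toy model of a direct-mapped instruction cache (all sizes are assumptions).
LINE = 64    # bytes per cache line
SETS = 512   # number of lines in the cache: 512 * 64 B = 32 KiB


def count_misses(layout, trace):
    """Count cache misses for a call trace.

    layout: function name -> (start_address, size_in_bytes)
    trace:  sequence of function names, in call order
    Each call is modelled as fetching every line the function occupies.
    """
    cache = [None] * SETS  # cache[set_index] = line number currently held
    misses = 0
    for name in trace:
        start, size = layout[name]
        first_line = start // LINE
        last_line = (start + size - 1) // LINE
        for line in range(first_line, last_line + 1):
            index = line % SETS
            if cache[index] != line:
                cache[index] = line  # evict whatever was there
                misses += 1
    return misses


# Hypothetical hot functions, 1 KiB each.
HOT = [f"hot{i}" for i in range(8)]
FUNC_SIZE = 1024

# Worst case: hot functions spaced exactly one cache size apart,
# so every one of them maps onto the same 16 sets.
scattered = {name: (i * LINE * SETS, FUNC_SIZE) for i, name in enumerate(HOT)}

# Hot functions laid out back to back, as a layout optimizer would aim for.
packed = {name: (i * FUNC_SIZE, FUNC_SIZE) for i, name in enumerate(HOT)}

# A loop that calls every hot function 100 times.
trace = HOT * 100

scattered_misses = count_misses(scattered, trace)
packed_misses = count_misses(packed, trace)

if __name__ == "__main__":
    print("scattered layout misses:", scattered_misses)
    print("packed layout misses:   ", packed_misses)
```

In this model the scattered layout misses on every line of every call, while the packed layout only misses on the first pass through the loop. Real caches are set-associative and real layouts are rarely this pathological, so treat the gap as an illustration of the mechanism, not as a prediction of real speedups.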

stephc_int13 | a year ago

One can try it out with CachyOS/Arch:

https://cachyos.org/blog/2411-kernel-autofdo/

BSDobelix | a year ago

Back in the day on the Mac, the order of source files in your project would determine locality in the binary.

If memory serves, this was with MPW C or maybe CodeWarrior.

You could see the jump (jmp) instructions use short jumps rather than long ones.

OnlyMortal | a year ago

Does it work with Intel Fortran-compiled code?

kardos | a year ago

So am I blind or does it not mention the results? Was the result a faster kernel? How big was the difference?

yxhuvud | a year ago

Anyone know of a Windows equivalent to BOLT?

vsskanth | a year ago