JasonLion wrote: Back when I was working on my adaptation of zhouyundong's code, I found significant speed improvements (20%) when compiled for 32-bit rather than 64-bit. This was very much not what I was expecting, as nearly all applications speed up noticeably by simply compiling with the 64-bit compiler flag, with no other changes. This might not still be true for champagne's version, but I suspect it probably is.
Good topic to keep my brain working on the beach for the next 10 weeks.
Starting from my last version of the "zhou" process, I can see what to do to optimize the code for a 32-bit build, but it will remain C++ and will keep using some native instructions such as popcount and bitscanforward.
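For readers who have not used these two instructions, here is a minimal sketch of how they are commonly wrapped in C++ on 32-bit words. The helper names popcount32 and bitscanforward32 are hypothetical and are not taken from the "zhou" code; the sketch assumes an x86 target with MSVC, GCC or Clang and falls back to plain loops elsewhere.

```cpp
#include <cassert>
#include <cstdint>
#if defined(_MSC_VER)
#include <intrin.h>
#endif

// Hypothetical wrapper: number of bits set in a 32-bit word.
inline int popcount32(std::uint32_t x) {
#if defined(_MSC_VER)
    return static_cast<int>(__popcnt(x));      // MSVC intrinsic (x86/x64)
#elif defined(__GNUC__) || defined(__clang__)
    return __builtin_popcount(x);              // GCC/Clang builtin
#else
    int n = 0;                                 // portable fallback:
    while (x) { x &= x - 1; ++n; }             // clear the lowest set bit each pass
    return n;
#endif
}

// Hypothetical wrapper: index (0..31) of the lowest set bit.
// The result is undefined for x == 0, so the caller must check first.
inline int bitscanforward32(std::uint32_t x) {
    assert(x != 0);
#if defined(_MSC_VER)
    unsigned long idx;
    _BitScanForward(&idx, x);                  // MSVC intrinsic
    return static_cast<int>(idx);
#elif defined(__GNUC__) || defined(__clang__)
    return __builtin_ctz(x);                   // count trailing zeros
#else
    int i = 0;                                 // portable fallback:
    while (!(x & 1u)) { x >>= 1; ++i; }        // shift until bit 0 is set
    return i;
#endif
}
```

A typical use is walking the set bits of a mask: take bitscanforward32(m), process that index, then clear it with m &= m - 1 until m is zero. Whether the compiler emits a real POPCNT instruction depends on the target flags (for example -mpopcnt with GCC or Clang), so the effect on a 32-bit build would have to be measured.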