Faster Morton codes with compiler intrinsics

Today I learned that newer Intel processors have an instruction which is tailor-made for generating Morton codes: PDEP (parallel bit deposit). There's an instruction for the inverse as well, PEXT (parallel bit extract).

These exist in 32- and 64-bit versions and you can use them directly from C or C++ code via compiler intrinsics: _pdep_u32/_pdep_u64 and _pext_u32/_pext_u64. Miraculously, both the Visual C++ and GCC versions of the intrinsics have the same names. Both instructions are part of the BMI2 extension, so you'll need an Intel Haswell processor or newer to take advantage of them (and with GCC, the -mbmi2 flag).
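To make that concrete, here is a minimal sketch of 2D Morton encoding and decoding built on the 64-bit intrinsics. The helper names (pdep64, pext64, morton_encode_2d, morton_decode_2d) and the mask constants are my own for illustration, not from any library. The bit-by-bit fallback is only an assumption so the snippet still compiles when BMI2 isn't enabled at build time (for example, GCC without -mbmi2); it is much slower than the real instruction.

```cpp
#include <cstdint>
#if defined(__BMI2__) && defined(__x86_64__)
#include <immintrin.h>
#endif

// Deposit the low bits of src into the positions of the set bits in mask.
// Uses the PDEP intrinsic when BMI2 is enabled, otherwise a slow software loop.
inline std::uint64_t pdep64(std::uint64_t src, std::uint64_t mask) {
#if defined(__BMI2__) && defined(__x86_64__)
    return _pdep_u64(src, mask);
#else
    std::uint64_t result = 0;
    for (std::uint64_t bit = 1; mask != 0; bit <<= 1) {
        std::uint64_t lowest = mask & (~mask + 1);  // lowest set bit of mask
        if (src & bit) result |= lowest;
        mask &= mask - 1;                           // clear that bit
    }
    return result;
#endif
}

// Gather the bits of src selected by mask into the low bits of the result.
// Uses the PEXT intrinsic when BMI2 is enabled, otherwise a slow software loop.
inline std::uint64_t pext64(std::uint64_t src, std::uint64_t mask) {
#if defined(__BMI2__) && defined(__x86_64__)
    return _pext_u64(src, mask);
#else
    std::uint64_t result = 0;
    for (std::uint64_t bit = 1; mask != 0; bit <<= 1) {
        std::uint64_t lowest = mask & (~mask + 1);  // lowest set bit of mask
        if (src & lowest) result |= bit;
        mask &= mask - 1;                           // clear that bit
    }
    return result;
#endif
}

// Hypothetical mask names: x goes to the even bit positions, y to the odd ones.
constexpr std::uint64_t kEvenBits = 0x5555555555555555ULL;
constexpr std::uint64_t kOddBits  = 0xAAAAAAAAAAAAAAAAULL;

// Interleave two 32-bit coordinates into one 64-bit Morton code.
inline std::uint64_t morton_encode_2d(std::uint32_t x, std::uint32_t y) {
    return pdep64(x, kEvenBits) | pdep64(y, kOddBits);
}

// Split a 64-bit Morton code back into its two 32-bit coordinates.
inline void morton_decode_2d(std::uint64_t code, std::uint32_t& x, std::uint32_t& y) {
    x = static_cast<std::uint32_t>(pext64(code, kEvenBits));
    y = static_cast<std::uint32_t>(pext64(code, kOddBits));
}
```

For example, morton_encode_2d(3, 5) interleaves x = 0b011 and y = 0b101 into 0b100111, which is 39, and decoding 39 gives back (3, 5). A 3D version would presumably work the same way with three masks that each select every third bit.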

Docs for the instructions:

Intel's docs
GCC docs
Visual C++ docs

This page has a great write-up of older techniques for generating Morton codes:

Jeroen Baert's blog

...but the real gold is hidden at the bottom of that page in a comment from Julien Bilalte, which is what clued me in to the existence of these instructions.

Update: there's some useful info on Wikipedia about these instructions too.

Coder Vil
