Skip to main content

Common bit prefix length for two integers

Here's a neat trick I discovered a couple of months back: given two signed or unsigned integers of the same bit width, you can calculate the length of their common prefix very efficiently:


  int common_prefix_length(int a, int b)
  {
    return __builtin_clz(a ^ b);
  }

What's this doing? Let's break it down:

a ^ b is the bitwise-xor of a and b. The boolean xor operation is true when one of it's inputs is true and the other is false; or false if both have the same value. Another way to put this is that xor returns true when its inputs are different and false if they're the same. The bitwise-xor operation then, returns a value which has zeros for every bit that is the same in both a and b; and ones for every bit that's different.

__builtin_clz is a GCC intrisinc function which counts the number of leading zero bits of its argument. It compiles down to a single machine code instruction on hardware that supports it (which includes every Intel chip made in this decade). The Visual C++ equivalent is the _BitScanReverse intrinsic, which has a slightly more complicated API; implementing the above with it is left as an exercise for the reader. :-)

Passing the result of a ^ b to __builtin_clz means we're counting the leading zero bits in a number where a zero bit indicates that the corresponding bits in a and b had the same value; which is exactly how we get the length of the common prefix.

You can get the common suffix in the same way. The only difference is that you use the __builtin_ctz intrinsic (Visual C++: _BitScanForward) instead, to count the trailing zero bits:


  int common_suffix_length(int a, int b)
  {
    return __builtin_ctz(a ^ b);
  }

Neat huh?

Comments

Popular posts from this blog

Assert no lock required

This is a technique I learnt about from Jason Gregory's excellent book, Game Engine Architecture (3rd Edition) . If you have a shared resource accessed by multiple threads, where you're fairly certain that it's only ever accessed by one thread at a time, you can use an assert() to check for this at debug time without having to pay the runtime cost of locking a mutex. The implementation is fairly straightforward: class UnnecessaryMutex { public: void lock() { assert(!_locked); _locked = true; } void unlock() { assert(_locked); _locked = false; } private: volatile bool _locked = false; }; #ifdef ENABLE_LOCK_ASSERTS #define BEGIN_ASSERT_LOCK_NOT_REQUIRED(mutex) (mutex).lock() #define END_ASSERT_LOCK_NOT_REQUIRED(mutex) (mutex).unlock() #else #define BEGIN_ASSERT_LOCK_NOT_REQUIRED(mutex) #define END_ASSERT_LOCK_NOT_REQUIRED(mutex) #endif Usage is equally straightforward: UnnecessaryMutex gMutex; void PossiblyOverlappingFunction

Triangle bounding boxes in a single byte

Just thought of a way to store the bounding box for a single triangle in only one byte. It's not really practical or something you'd ever really want to use, but what the hell. Assume we have some kind of indexed mesh structure with a list of vertex positions and a list of triangle indices:   struct Mesh {     std::vector<vec3> verts;     std::vector<uvec3> triangles;   }; We can find the bounding box of a triangle by taking the min and max of all three vertices:   vec3 Mesh::lowerBound(uint32_t tri) const {     vec3 v0 = verts[triangles[tri].x];     vec3 v1 = verts[triangles[tri].y];     vec3 v2 = verts[triangles[tri].z];     return min(min(v0, v1), v2);   }   vec3 Mesh::upperBound(uint32_t tri) const {     vec3 v0 = verts[triangles[tri].x];     vec3 v1 = verts[triangles[tri].y];     vec3 v2 = verts[triangles[tri].z];     return max(max(v0, v1), v2);   } This is nice and simple and probably way better than what I'm about to suggest. W

LD_DEBUG

Posting this mainly as a reminder to myself... If you ever find yourself needing to figure out a dynamic library loading problem on Linux, LD_DEBUG can be a massive help. This is an environment variable you can set to make the dynamic linker print out a ton of useful diagnostic info. There are a number of different values which control the amount and type of diagnostics printed. One of the values is help; if you set LD_DEBUG to this and run executable it will print out a list of all the available options along with brief descriptions. For example, on my Linux workstation at the office: > LD_DEBUG=help cat Valid options for the LD_DEBUG environment variable are: libs display library search paths reloc display relocation processing files display progress for input file symbols display symbol table processing bindings display information about symbol binding versions display version dependencies all all previous options combi