
Common bit prefix length for two integers

Here's a neat trick I discovered a couple of months back: given two signed or unsigned integers of the same bit width, you can calculate the length of their common prefix very efficiently:

  int common_prefix_length(int a, int b) {
    if (a == b) return 8 * (int)sizeof(int);  // __builtin_clz(0) is undefined
    return __builtin_clz((unsigned)(a ^ b));
  }

What's this doing? Let's break it down:

a ^ b is the bitwise xor of a and b. The boolean xor operation is true when one of its inputs is true and the other is false, and false if both have the same value. Put another way, xor returns true when its inputs are different and false when they're the same. The bitwise xor operation, then, returns a value that has a zero for every bit that is the same in both a and b, and a one for every bit that's different.

__builtin_clz is a GCC intrinsic function which counts the number of leading zero bits in its argument. It compiles down to a single machine code instruction on hardware that supports it (which includes every Intel chip made in this decade). Note that its result is undefined when the argument is zero, which is what a ^ b evaluates to when a and b are equal. The Visual C++ equivalent is the _BitScanReverse intrinsic, which has a slightly more complicated API; implementing the above with it is left as an exercise for the reader. :-)
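
For readers who'd rather skip the exercise, here's a rough sketch of one way to wrap the two intrinsics behind a single function. The name clz32 is made up for this example, and it assumes a 32-bit unsigned int. _BitScanReverse reports the index of the highest set bit (counting from 0 at the least significant end), so the leading zero count is 31 minus that index:

```c
#if defined(_MSC_VER)
#include <intrin.h>
#endif

/* Hypothetical wrapper (not from the original post): counts the leading
   zero bits of a non-zero 32-bit value. Passing 0 is not supported. */
static int clz32(unsigned int x) {
#if defined(_MSC_VER)
    unsigned long index;
    _BitScanReverse(&index, x);  /* index of the highest set bit, 0 = LSB */
    return 31 - (int)index;
#else
    return __builtin_clz(x);     /* GCC and Clang */
#endif
}
```

With this, the prefix function becomes clz32(a ^ b), as long as a and b are known to be different.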

Passing the result of a ^ b to __builtin_clz means we're counting the leading zero bits in a number where a zero bit indicates that the corresponding bits in a and b had the same value, which is exactly how we get the length of the common prefix.
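
To make that concrete, here's a small sketch with a worked example. The helper name prefix_len32 is hypothetical, and it assumes unsigned int is 32 bits wide (true on the usual GCC targets). It also returns 32 for equal inputs rather than calling __builtin_clz on zero:

```c
#include <stdint.h>

/* Hypothetical helper: length of the common bit prefix of two 32-bit
   values. Returns 32 when a == b, because __builtin_clz(0) is undefined. */
static int prefix_len32(uint32_t a, uint32_t b) {
    uint32_t diff = a ^ b;  /* 1-bits mark the positions where a and b differ */
    return diff ? __builtin_clz(diff) : 32;
}
```

Take a = 0xABCD1234 and b = 0xABCD5678. The top 16 bits match, so a ^ b is 0x0000444C. In binary 0x444C is 0100 0100 0100 1100, which contributes one more leading zero, so prefix_len32(a, b) gives 17.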

You can get the common suffix in the same way. The only difference is that you use the __builtin_ctz intrinsic (Visual C++: _BitScanForward) instead, to count the trailing zero bits:

  int common_suffix_length(int a, int b) {
    if (a == b) return 8 * (int)sizeof(int);  // __builtin_ctz(0) is undefined
    return __builtin_ctz((unsigned)(a ^ b));
  }
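
Here's a sketch of the suffix version that should build with either compiler. The name suffix_len32 is invented for this example and a 32-bit unsigned int is assumed. _BitScanForward reports the index of the lowest set bit, which is already the trailing zero count, so no subtraction is needed:

```c
#if defined(_MSC_VER)
#include <intrin.h>
#endif

/* Hypothetical helper: length of the common bit suffix of two 32-bit
   values. Returns 32 when a == b, since both intrinsics need a non-zero
   argument. */
static int suffix_len32(unsigned int a, unsigned int b) {
    unsigned int diff = a ^ b;  /* 1-bits mark the positions where a and b differ */
    if (diff == 0) return 32;
#if defined(_MSC_VER)
    unsigned long index;
    _BitScanForward(&index, diff);  /* index of the lowest set bit */
    return (int)index;
#else
    return __builtin_ctz(diff);     /* GCC and Clang */
#endif
}
```

For example, 0xB4 (1011 0100) and 0x54 (0101 0100) xor to 0xE0 (1110 0000), so suffix_len32(0xB4, 0x54) gives 5: the low five bits are shared.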

Neat, huh?

