Assert no lock required

This is a technique I learnt about from Jason Gregory's excellent book, Game Engine Architecture (3rd Edition).

If you have a shared resource accessed by multiple threads, where you're fairly certain that it's only ever accessed by one thread at a time, you can use an assert() to check for this at debug time without having to pay the runtime cost of locking a mutex.

The implementation is fairly straightforward:

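#include <cassert>

class UnnecessaryMutex {
public:
  // Marks the critical section as entered; fires if it is already occupied.
  void lock() {
    assert(!_locked);
    _locked = true;
  }

  // Marks the critical section as left; fires if it was never entered.
  void unlock() {
    assert(_locked);
    _locked = false;
  }

private:
  // volatile only stops the compiler optimising the accesses away;
  // it does not make them atomic.
  volatile bool _locked = false;
};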

#ifdef ENABLE_LOCK_ASSERTS
  #define BEGIN_ASSERT_LOCK_NOT_REQUIRED(mutex) (mutex).lock()
  #define END_ASSERT_LOCK_NOT_REQUIRED(mutex)   (mutex).unlock()
#else
  #define BEGIN_ASSERT_LOCK_NOT_REQUIRED(mutex)
  #define END_ASSERT_LOCK_NOT_REQUIRED(mutex)
#endif

Usage is equally straightforward:

UnnecessaryMutex gMutex;

void PossiblyOverlappingFunction()
{
  BEGIN_ASSERT_LOCK_NOT_REQUIRED(gMutex);
  // ... do critical section operations ...
  END_ASSERT_LOCK_NOT_REQUIRED(gMutex);
}

There are a few caveats with this though.

First is that it's not 100% reliable, because it favors minimal runtime cost over perfect accuracy. It should catch most cases where two critical sections overlap, but the check itself is vulnerable to race conditions: two threads can both read _locked as false before either of them sets it to true, in which case neither assert fires. Declaring the _locked variable volatile doesn't prevent this; it just means accesses to the variable can't be optimised away. The book makes the point that this is probably sufficient if combined with good enough testing.

If you need better accuracy, you could use a std::atomic<bool> instead, with appropriate memory orderings. This will increase the runtime overhead a bit, but unless the mutex is locked and unlocked very frequently that may still be OK for your use case. It may be useful to have a #define controlling which implementation is used, so that if the fast version detects a problem you can switch to the slower but more accurate version to help track it down.
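As a rough sketch of what the atomic variant might look like (this is my own guess at it, not code from the book): the class name AtomicUnnecessaryMutex and the tryAcquire()/tryRelease() helpers are made up for illustration. The key difference from the volatile version is that the test and the set happen as one atomic exchange, so two threads can't both see the flag as clear.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical atomic variant of UnnecessaryMutex (names are illustrative).
class AtomicUnnecessaryMutex {
public:
  // Atomically sets the flag; returns true if it was previously clear.
  bool tryAcquire() {
    return !_locked.exchange(true, std::memory_order_acquire);
  }

  // Atomically clears the flag; returns true if it was previously set.
  bool tryRelease() {
    return _locked.exchange(false, std::memory_order_release);
  }

  void lock() {
    const bool wasFree = tryAcquire();
    assert(wasFree);   // fires if another critical section is active
    (void)wasFree;     // avoids an unused-variable warning under NDEBUG
  }

  void unlock() {
    const bool wasHeld = tryRelease();
    assert(wasHeld);   // fires on an unmatched unlock
    (void)wasHeld;
  }

private:
  std::atomic<bool> _locked{false};
};
```

The exchange is kept outside the assert() so that the state change still happens if asserts are compiled out. It has the same lock()/unlock() interface as the original, so it should work with the BEGIN_/END_ASSERT_LOCK_NOT_REQUIRED macros unchanged.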

If you want 100% accuracy you could use a real mutex and assert on whether try_lock() succeeds.
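A minimal sketch of the try_lock() approach, assuming std::mutex as the real mutex; the CheckedMutex name and the tryAcquire() helper are hypothetical. Two things to be aware of: the standard allows std::mutex::try_lock() to fail spuriously, so in principle a failure isn't strict proof of an overlap, and calling try_lock() from the thread that already owns the mutex is undefined behaviour, so this version can't be relied on to report re-entry from the same thread.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical wrapper that asserts a real mutex is never contended.
class CheckedMutex {
public:
  // Returns true if the underlying mutex was acquired without waiting.
  bool tryAcquire() {
    return _mutex.try_lock();
  }

  void lock() {
    const bool acquired = tryAcquire();
    assert(acquired);  // fires if another thread is in its critical section
    (void)acquired;    // avoids an unused-variable warning under NDEBUG
  }

  // Only valid after a successful lock()/tryAcquire() on this thread.
  void unlock() {
    _mutex.unlock();
  }

private:
  std::mutex _mutex;
};
```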

The second caveat is that it's not a recursive mutex. If you try to obtain the lock a second time from the thread that's already holding it, that will still trigger the assert. The general idea does still apply for recursive mutexes, but the implementation of the UnnecessaryMutex class gets a little more complicated: it would need to keep track of which thread it's locked by and a count of how many times it's been locked, instead of just a boolean.
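Here's one way the recursive version could be sketched, tracking the owning thread and a lock count as described above. This is my own illustration rather than anything from the book: the class name, tryAcquire(), tryRelease() and depth() are all hypothetical, and it uses std::atomic for the owner, so it's closer in cost to the atomic variant than to the volatile one.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical recursive variant (names are illustrative).
class RecursiveUnnecessaryMutex {
public:
  // Returns true if the calling thread now "holds" the lock.
  bool tryAcquire() {
    const std::thread::id self = std::this_thread::get_id();
    if (_owner.load() == self) {
      ++_count;                    // re-entry by the current owner
      return true;
    }
    std::thread::id none;          // default id means "no thread"
    if (!_owner.compare_exchange_strong(none, self)) {
      return false;                // some other thread is inside
    }
    _count = 1;
    return true;
  }

  // Returns true if the calling thread was the owner.
  bool tryRelease() {
    if (_owner.load() != std::this_thread::get_id()) {
      return false;                // unmatched or foreign unlock
    }
    if (--_count == 0) {
      _owner.store(std::thread::id());
    }
    return true;
  }

  void lock()   { const bool ok = tryAcquire(); assert(ok); (void)ok; }
  void unlock() { const bool ok = tryRelease(); assert(ok); (void)ok; }

  // Current nesting depth; only meaningful when called by the owner.
  int depth() const { return _count; }

private:
  std::atomic<std::thread::id> _owner{std::thread::id()};
  int _count = 0;                  // only touched by the owning thread
};
```

The count doesn't need to be atomic because only the thread that currently owns the lock ever reads or writes it.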

An RAII-style wrapper for UnnecessaryMutex, which locks the mutex on construction and unlocks it on destruction, can be a useful addition to this.
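A minimal sketch of such a wrapper is below. The UnnecessaryLockGuard name is made up, and the copy of UnnecessaryMutex has an extra isLocked() accessor that isn't in the original, added only so the example can be checked.

```cpp
#include <cassert>

// Copy of the class from above, plus a hypothetical isLocked() accessor.
class UnnecessaryMutex {
public:
  void lock()   { assert(!_locked); _locked = true; }
  void unlock() { assert(_locked);  _locked = false; }
  bool isLocked() const { return _locked; }  // not in the original

private:
  volatile bool _locked = false;
};

// Hypothetical RAII guard: locks on construction, unlocks on destruction.
class UnnecessaryLockGuard {
public:
  explicit UnnecessaryLockGuard(UnnecessaryMutex& mutex) : _mutex(mutex) {
    _mutex.lock();
  }

  ~UnnecessaryLockGuard() {
    _mutex.unlock();
  }

  // Non-copyable, so the unlock happens exactly once.
  UnnecessaryLockGuard(const UnnecessaryLockGuard&) = delete;
  UnnecessaryLockGuard& operator=(const UnnecessaryLockGuard&) = delete;

private:
  UnnecessaryMutex& _mutex;
};
```

Since UnnecessaryMutex already provides lock() and unlock(), std::lock_guard<UnnecessaryMutex> from <mutex> should also work in place of a hand-written guard. Either way, you'd probably want to hide the guard behind a macro so that it compiles away when ENABLE_LOCK_ASSERTS isn't defined.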

What I've said here is mostly just a rephrasing of what can be found in the book, which has lots of other useful stuff besides this.
