
Debugging a Dynamic Library that Wouldn't Unload
June 6, 2021
I recently ran into an unexpected bug that sent me on a bit of an adventure and had a very surprising conclusion. This is my tale.
Background
Once upon a time I wrote a blog post titled "How to Reload Native Plugins in Unity". This is important because my workflow is mildly uncommon. It looks roughly like this:
- Launch Unity Editor
- Enter 'Play Mode'
- Load .dll plugins via LoadLibrary and GetProcAddress
- Exit 'Play Mode'
- Unload .dll plugins via FreeLibrary
- Modify C++ code, recompile code, replace .dll plugins, go back to step 2 (Enter 'Play Mode')
The important thing here is that dynamic libraries are being loaded, unloaded, modified, and loaded again. Everything happens within the Unity.exe process, which is never closed. Restarting the Unity editor is slow and cumbersome. This workflow allows me to rapidly iterate on new C++ plugins without having to restart the Unity editor between runs.
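A minimal sketch of one load, call, unload cycle might look like the following. RunPluginOnce and PluginEntryFn are hypothetical names made up for illustration, not part of the actual reloader, and the non-Windows branch is only a stub so the snippet compiles anywhere.

```cpp
#include <cassert>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical signature for a plugin export that takes and returns nothing.
using PluginEntryFn = void (*)();

// Loads the plugin, calls one exported function, then unloads it again.
// Returns false if the load or the symbol lookup fails.
// Assumption: on non-Windows builds this is a stub that always returns false.
bool RunPluginOnce(const char* dllPath, const char* exportName) {
    assert(dllPath != nullptr && exportName != nullptr);
#ifdef _WIN32
    HMODULE module = LoadLibraryA(dllPath);  // refcount +1 (loads if needed)
    if (module == nullptr) {
        return false;
    }
    FARPROC raw = GetProcAddress(module, exportName);
    const bool found = (raw != nullptr);
    if (found) {
        reinterpret_cast<PluginEntryFn>(raw)();
    }
    FreeLibrary(module);  // refcount -1 (unloads only if it reaches zero)
    return found;
#else
    (void)dllPath;
    (void)exportName;
    return false;
#endif
}
```

Something like RunPluginOnce("Foo.dll", "SomeExport") would be one iteration of the loop above, where "SomeExport" stands in for whatever the plugin actually exports.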
I've been using this workflow for almost two years. My Unity native plugin reloader has proven to be effective and reliable.
A Wild Bug Appears
One of my users started to observe weird program behavior. Impossible behavior even. They seemed to have some stale state sticking around between runs. That shouldn't be possible!
My workflow builds the world "On Enter Play" and tears the world down "On Exit Play". This is a trade-off. One really nice benefit of this architecture is no stale state. By cleaning everything and fully unloading the .dll I can guarantee there is zero stale state between runs. Every time I click "Play" in the editor I'm guaranteed to start clean.
Or so I thought.
Down the Rabbit Hole
After a little debugging it became very clear that there was indeed stale state. But how? My code avoids globals like the plague. (All globals are evil.) So even if some global or static snuck into a naughty .dll it shouldn't matter since the whole thing gets unloaded.
It turns out the .dll was not being unloaded! I attached the Visual Studio debugger and the modules window made it plain as day that Foo.dll was staying in memory between runs.
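The same check can be made from code instead of the debugger's modules window. Here is a small sketch; IsModuleLoaded is a hypothetical helper name, and the non-Windows branch is a stub that exists only so the snippet compiles elsewhere.

```cpp
#include <cassert>
#ifdef _WIN32
#include <windows.h>
#endif

// Returns true if a module with this name is currently loaded in the process.
// Assumption: on non-Windows builds this is a stub that always returns false.
bool IsModuleLoaded(const char* moduleName) {
    assert(moduleName != nullptr);
#ifdef _WIN32
    // GetModuleHandleA only looks the module up; it does not change its refcount.
    return GetModuleHandleA(moduleName) != nullptr;
#else
    (void)moduleName;
    return false;
#endif
}
```

Logging IsModuleLoaded("Foo.dll") right after the final FreeLibrary call would have flagged the stale module without attaching a debugger.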

What's weird is that my project loads multiple custom native plugins, and only Foo.dll was staying loaded between runs. My other plugins (Bar.dll, Baz.dll, etc.) were all successfully unloaded. Out of several plugins one, and only one, was failing to unload. Something was uniquely wrong with Foo.dll.
LoadLibrary is a reference-counted operation. Calling LoadLibrary will load the library if necessary; otherwise it increments the existing refcount. Similarly, FreeLibrary decrements the refcount and unloads on zero. I added a few logs and confirmed that every call to LoadLibrary had a matching FreeLibrary.
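To make the refcount rules concrete, here is a toy model of them. ToyLoader is a made-up class that only mimics the counting described above; it is not how the Windows loader is implemented.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model of per-process module refcounts. NOT the real Windows loader.
class ToyLoader {
public:
    // Mirrors LoadLibrary: "loads" on the first call, otherwise bumps the refcount.
    void Load(const std::string& name) { ++refcounts_[name]; }

    // Mirrors FreeLibrary: drops one reference and "unloads" when it hits zero.
    void Free(const std::string& name) {
        auto it = refcounts_.find(name);
        assert(it != refcounts_.end() && "Free without a matching Load");
        if (--it->second == 0) {
            refcounts_.erase(it);
        }
    }

    bool IsLoaded(const std::string& name) const {
        return refcounts_.count(name) != 0;
    }

private:
    std::map<std::string, int> refcounts_;
};
```

In this model, a module that was loaded twice and freed once is still loaded. That is the situation Foo.dll ends up in if anything takes an extra reference that my logging doesn't see.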
Next I altered my API in Foo.dll to do nothing and return immediately. Lo and behold, Foo.dll unloads! This implies that something inside Foo.dll is bumping its refcount, causing it to not unload. But what?
WinDbg to the Rescue
Thanks to a friend I learned a new trick. Using WinDbg you can run the command bm *GetModuleHandle* to set a breakpoint on every function whose symbol matches the pattern.
Hitting this breakpoint reveals the mystery.

Let's break this down.
- The std::thread constructor causes GetModuleHandleExW to be invoked.
- The rcx register contains 0x04. This corresponds to GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS.
- The rdx register contains 0x00007ffc9e1d1145. This corresponds to some function inside Foo.dll.
The first line of documentation for GetModuleHandleExW reads:
Retrieves a module handle for the specified module and increments the module's reference count
🎉 Tada! 🎉
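The std::thread constructor calls _beginthreadex, which calls create_thread_parameter, which calls GetModuleHandleExW. The call passes the flag GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS, so the module is looked up by an address inside it, and because GET_MODULE_HANDLE_EX_FLAG_UNCHANGED_REFCOUNT is not set, the module's refcount is incremented. The corresponding release is a call to FreeLibraryAndExitThread, which happens when the thread exits through the CRT's _endthreadex path. In other words, the module stays pinned for as long as the thread is still running.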
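The effect of that pin can be modeled in a few lines of portable C++. This is a toy model, not the CRT's code: ToyModule and SpawnPinned are hypothetical names, and the model assumes the pin is taken when the thread is created and dropped when the thread's body finishes.

```cpp
#include <atomic>
#include <cassert>
#include <future>
#include <thread>

// Toy stand-in for a loaded module. NOT the Windows loader.
struct ToyModule {
    std::atomic<int> refcount{0};

    void Acquire() { refcount.fetch_add(1); }  // models LoadLibrary / GetModuleHandleExW
    void Release() {                           // models FreeLibrary / FreeLibraryAndExitThread
        const int previous = refcount.fetch_sub(1);
        assert(previous > 0 && "Release without a matching Acquire");
        (void)previous;
    }
    bool IsLoaded() const { return refcount.load() > 0; }
};

// Models a thread created by code inside the module: the module is pinned
// before the thread starts and unpinned when the thread body is done.
std::thread SpawnPinned(ToyModule& module, std::shared_future<void> mayFinish) {
    module.Acquire();
    return std::thread([&module, mayFinish]() {
        mayFinish.wait();   // stand-in for long-running background work
        module.Release();
    });
}
```

With this model, a host that calls Acquire once, spawns a pinned thread, and then calls Release once still sees IsLoaded() return true until the thread finishes. That matches what I saw with Foo.dll.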
The root cause of my "module won't unload" bug was a failure to properly clean up all background threads. Fixing my sloppy shutdown code allowed Foo.dll to actually unload. Success!
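The general shape of a tidy shutdown looks something like the sketch below. It is not the actual code from Foo.dll; BackgroundWorker and its methods are hypothetical names. The idea is to signal every background thread to stop and join it before the host calls FreeLibrary.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical plugin-side worker that owns one background thread.
class BackgroundWorker {
public:
    void Start() {
        assert(!thread_.joinable() && "Start called while already running");
        stop_.store(false);
        thread_ = std::thread([this]() {
            while (!stop_.load()) {
                std::this_thread::yield();  // stand-in for real background work
            }
        });
    }

    // Call from the plugin's shutdown path, before the host calls FreeLibrary.
    // Safe to call more than once.
    void Shutdown() {
        stop_.store(true);
        if (thread_.joinable()) {
            thread_.join();
        }
    }

    bool IsRunning() const { return thread_.joinable(); }

    ~BackgroundWorker() { Shutdown(); }

private:
    std::atomic<bool> stop_{false};
    std::thread thread_;
};
```

Once every such worker has been shut down, no thread created inside the module is left to hold a reference, so the final FreeLibrary can actually drop the refcount to zero.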
To Crash or Not to Crash
At this point my mystery is solved. However, you should be asking yourself some questions.
Imagine for a second that you load some .dll and call a module function that spins up a background thread to perform some expensive operation. Then you call FreeLibrary. What happens?
- Program explodes catastrophically.
- Program continues to function.
I expected #1, program explodes. The thread is executing instructions, those instructions are unloaded from memory, kaboom. 💥
I observed #2. std::thread increments the module refcount, which prevents the module from unloading. This surprised me and everybody I talked to. Maybe this behavior is obvious to you. It certainly wasn't to me!
But wait, there's more
Let's consider another scenario. Imagine that a std::thread initially runs a "safe" function, which later calls a function in a module that gets unloaded. What happens?
Here's a minimal example to find out.
    // Foo.cpp compiled into Foo.dll
    #include <chrono>
    #include <iostream>
    #include <thread>

    extern "C" {
        __declspec(dllexport) void ExpensiveFunc() {
            std::cout << "begin expensive operation" << std::endl;
            std::this_thread::sleep_for(std::chrono::milliseconds(500));
            std::cout << "end expensive operation" << std::endl;
        }

        // Not actually C. Simplified for blog.
        __declspec(dllexport) std::thread ExpensiveFuncAsync() {
            // bumps refcount of Foo.dll
            return std::thread([]() { ExpensiveFunc(); });
        }
    }
    // main.cpp compiled into main.exe
    #include <windows.h>
    #include <chrono>
    #include <thread>

    int main() {
        using VoidFn = void(*)();
        using ThreadFn = std::thread(*)();

        // Works
        {
            auto module = LoadLibraryA("Foo.dll"); // Foo.dll refcount = 1
            ThreadFn expensiveFuncAsyncFn = (ThreadFn)GetProcAddress(module, "ExpensiveFuncAsync");
            std::thread works = expensiveFuncAsyncFn(); // Foo.dll refcount = 2
            FreeLibrary(module); // Foo.dll refcount = 1
            works.join(); // Foo.dll refcount = 0; unloads
        }

        // Crashes
        {
            auto module = LoadLibraryA("Foo.dll"); // Foo.dll refcount = 1
            VoidFn expensiveFuncFn = (VoidFn)GetProcAddress(module, "ExpensiveFunc");

            // std::thread calls lambda, which does NOT bump refcount of Foo.dll
            std::thread crashes = std::thread([expensiveFuncFn]() { expensiveFuncFn(); });
            std::this_thread::sleep_for(std::chrono::milliseconds(100));
            FreeLibrary(module); // Foo.dll refcount = 0; unloads
            crashes.join(); // kaboom! Access violation executing location
        }
    }
There are two functions in Foo.dll: void ExpensiveFunc() and std::thread ExpensiveFuncAsync(). The first performs some expensive operation. The second creates a new thread which performs some expensive operation.
There are two blocks of code inside main.cpp. The first loads Foo.dll, calls std::thread ExpensiveFuncAsync(), frees Foo.dll, and joins the thread. This block of code does NOT crash because Foo.dll's refcount gets bumped when ExpensiveFuncAsync constructs a new std::thread.
The second block constructs a std::thread inside main.exe, which then calls void ExpensiveFunc() in Foo.dll. This version explodes catastrophically.
What this means is that if you are naughty and "leak" a std::thread then your program MIGHT crash, or it might not. And it doesn't depend on what code is executing. It depends on what code created the std::thread.
I personally think MSVC's STL behavior here is highly questionable. Bumping the module refcount from std::thread is extremely non-obvious. No one on my team expected this behavior. It didn't even protect me from a bug. It merely swept my sloppy bug under the rug and made it difficult to discover. I would rather my program explode the moment I call FreeLibrary. That would have been both obvious and trivial to fix.
Windows vs Linux
This entire blog post was written in the context of Windows compiling with Visual Studio 2019. I do not know if other operating systems with other STL implementations have the same behavior. If any reader would like to test and let me know then I'll happily update this post.
Conclusion
Debugging this particular issue was a bit of an adventure. My blog post title gave away the fact that a module wasn't unloading. It actually took a bit of time to make that discovery. I was so surprised by the fact that std::thread bumps the module refcount that I felt it worthy of a blog post.
Thanks for reading.