Using Zig to Commit Toolchains to VCS

January 6th, 2024

I recently wrote a blog post titled Dependencies Belong in Version Control. In this post I argue that VCS should contain code, binary assets, third-party libraries, and even compiler toolchains.

In that post I built a proof-of-life project that compiles and runs on Windows without any system dependencies. Getting this to work on Linux was a bit of an adventure. This post is about that adventure.

Recap: The Windows Setup

I built a small C++ project that compiles on Windows without needing any external packages, libraries, programs, or compilers to be installed. It should just work.

Here's what the directory setup looked like for Windows:

\root
    \sample_cpp_app
        - main.cpp
    \thirdparty
        \fmt (3 MB)
    \toolchains
        \win
            \cmake (106 MB)
            \LLVM (2.5 GB)
            \mingw64 (577 MB)
            \ninja (570 KB)
            \Python311 (20.5 MB)
    - CMakeLists.txt
    - build.bat
    - build.py

When run it produces:

Hello world from C++ 👋
Goodbye cruel world from C++ ☠️

This is a very simple project. It's "hello world" with the small twist that it uses the popular fmt library.

Building this project is as simple as running build.bat. Under the hood the process is:

  1. run build.bat
  2. which runs ./toolchains/win/Python311/python.exe build.py
  3. which invokes ./toolchains/win/cmake/cmake.exe
  4. which builds via ./toolchains/win/ninja/ninja.exe
  5. and produces ./bin/main.exe

The build.bat script uses SETLOCAL to temporarily nuke all environment variables and reset PATH to C:\Windows\System32;. This ensures that only repo toolchains and content are used. Clone and run the script. No external dependencies required. Success!
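
The same scrubbing idea can be carried into the Python step. Here is a minimal sketch, not the project's actual build.py: the function names, the choice of directories on PATH, and the CMake arguments are all my assumptions for illustration.

```python
from pathlib import PureWindowsPath


def hermetic_env(repo_root):
    """Return an environment whose PATH holds only repo toolchains plus System32.

    Assumption: PATH alone is enough. Some Windows programs also expect
    variables such as SystemRoot, which this sketch deliberately leaves out.
    """
    win = PureWindowsPath(repo_root) / "toolchains" / "win"
    tool_dirs = [win / "Python311", win / "cmake", win / "ninja"]
    entries = [str(d) for d in tool_dirs] + [r"C:\Windows\System32"]
    return {"PATH": ";".join(entries)}


def configure_command(repo_root):
    """Build a CMake configure command that points only at vendored tools."""
    root = PureWindowsPath(repo_root)
    win = root / "toolchains" / "win"
    return [
        str(win / "cmake" / "cmake.exe"),
        "-S", str(root),                 # source directory
        "-B", str(root / "build"),       # hypothetical build directory
        "-G", "Ninja",
        "-DCMAKE_MAKE_PROGRAM=" + str(win / "ninja" / "ninja.exe"),
    ]
```

A build script could then call subprocess.run(configure_command(root), env=hermetic_env(root), check=True), so the child process never sees whatever happens to be installed on the machine.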

The Linux Dilemma

Mirroring this approach almost works on Linux. Almost!

The key problem is that LLVM does not come bundled with libc headers or libs. LLVM expects those to be provided elsewhere. On Windows this means mingw64, which is easy to embed. However, Linux expects libc to be magically available in a system folder.

On Ubuntu I went down the path of trying to download and vendor the necessary packages. The exact minimum set of necessary packages was quite confusing to a Linux noob such as myself. It was something like libc6-dev and libgcc-9-dev but I'm really not sure.

After a dozen hours of hair pulling I got something that linked and ran. Unfortunately it segfaulted anytime malloc was called. As best I could tell glibc was linked but not correctly initialized. (More on this later!)

Obviously I was in dragon territory and not doing things "the Linux way". This is a bad place to be.

Compiling C++ Programs

Let's take a step back. What does it actually take to compile a C++ program? Why is this even a hard problem?

C++ is infamous for NOT having a standard build or package system. Even so, I swear compiling a C++ program isn't actually that hard.

  1. Compile each .cpp file into an object file
    • Input: foo.cpp, compiler flags, include directory paths
    • Output: foo.o
  2. Static Library
    • Input: foo.o bar.o
    • Output: baz.a
  3. Dynamic/Shared library
    • Input: ham.o eggs.o
    • Output: libbacon.so
  4. Executable
    • Input: foo.a baz.a libbacon.so
    • Output: breakfast
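
To make the four steps concrete, here is a rough sketch that only assembles the command lines. It assumes a clang++ and ar on PATH and uses the file names from the list above; the helper names are mine.

```python
import shlex


def compile_cmd(src, obj, flags=(), include_dirs=()):
    """Step 1: one .cpp file in, one object file out."""
    includes = ["-I" + d for d in include_dirs]
    return ["clang++", "-c", src, "-o", obj, *flags, *includes]


def static_lib_cmd(objs, lib):
    """Step 2: bundle object files into a static archive."""
    return ["ar", "rcs", lib, *objs]


def shared_lib_cmd(objs, lib):
    """Step 3: link object files into a shared library.

    The objects generally need to have been compiled with -fPIC.
    """
    return ["clang++", "-shared", *objs, "-o", lib]


def link_cmd(inputs, exe):
    """Step 4: link archives and shared libraries into an executable."""
    return ["clang++", *inputs, "-o", exe]


def show(cmd):
    """Render a command list the way you would type it in a shell."""
    return shlex.join(cmd)
```

For example, show(compile_cmd("foo.cpp", "foo.o", flags=["-O2"], include_dirs=["thirdparty/fmt/include"])) gives the single line a build system would eventually run for step 1. The include path is a guess at where fmt's headers live.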

Now obviously there's a lot of bookkeeping. If you want to link baz.a into a program then you need all its header directories. If you don't want to build the whole thing from scratch every time you need to track dependencies and changes.
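
The "track dependencies and changes" part can be sketched with a timestamp check, which is roughly what make-style tools do. This is a simplified illustration under my own assumptions; real build systems also track flags, discovered header dependencies, and more.

```python
import os


def file_mtime(path):
    """Return the file's modification time, or None if it does not exist."""
    try:
        return os.path.getmtime(path)
    except OSError:
        return None


def needs_rebuild(output, inputs, mtime=file_mtime):
    """Return True if output is missing or older than any of its inputs.

    inputs should include headers as well as the .cpp file. The mtime
    parameter is injectable so the logic can be exercised without real files.
    """
    out_time = mtime(output)
    if out_time is None:
        return True
    for path in inputs:
        in_time = mtime(path)
        # A missing input also forces a rebuild so the compiler reports it.
        if in_time is None or in_time > out_time:
            return True
    return False
```

With this, a script would only run the compile step for foo.o when needs_rebuild("foo.o", ["foo.cpp", "foo.h"]) is true.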

Build systems are genuinely very hard. I deal with them a lot in my day job. I'll make the bold claim that knowing what command to run is relatively easy. The complex, and frustrating, part is inducing the Rube Goldberg machine to run the commands I know I want.

CMake and Clang

Build systems generally allow projects to be defined declaratively. Then they execute a bunch of logic that injects various steps, flags, and paths to produce an output. It kinda sorta works mostly. Tools like CMake add another layer of indirection because they run a bunch of complex, opaque logic to generate build files that are processed by another tool.

Build tools are highly configurable. There's a flag for everything! Here are some of the compiler flags I played with through CMake:
-nostdlib
--no-standard-includes
-nodefaultlibs
-stdlib=
--gcc-install-dir=
--sysroot=

It's exceedingly difficult to figure out both what build tools do and why. Debug and verbose flags aren't always enough. I find myself increasingly annoyed by build systems that can't be trivially used in a step debugger, but I digress.

Bottom line is that I never got vendored Ubuntu packages to work. I spent a bunch of time trying to get things to work and failed. I'm not sure what bMakeItWork option I missed. But even if it did work this feels like the wrong approach. Surely there's a better way?

Zig to the Rescue

I have nearly zero experience with Zig. What little experience I do have wasn't particularly positive. It's not what I'm looking for in a programming language.

However Zig has a really, really cool property. The Zig compiler can be used as a drop-in replacement for GCC/Clang. In theory you can just replace clang++ with zig c++ and voila.
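
In script form, the swap can be as small as rewriting the first word of the command. A minimal sketch, assuming commands are held as argument lists; the function name and mapping table are mine, and the vendored path in the usage note is a guess.

```python
import os

# Which zig subcommand stands in for which traditional driver.
_ZIG_SUBCOMMAND = {
    "clang": "cc", "gcc": "cc", "cc": "cc",
    "clang++": "c++", "g++": "c++", "c++": "c++",
}


def use_zig(cmd, zig="zig"):
    """Rewrite a clang/gcc style command so it runs through zig cc / zig c++.

    Commands whose first word is not a known C or C++ driver are returned
    unchanged (as a new list).
    """
    driver = os.path.basename(cmd[0])
    sub = _ZIG_SUBCOMMAND.get(driver)
    if sub is None:
        return list(cmd)
    return [zig, sub, *cmd[1:]]
```

For a vendored toolchain the zig argument would point into the repo, for example use_zig(cmd, zig="toolchains/linux/zig/zig"). Note that the result is two words ("zig c++"), so tools that expect the compiler to be a single executable path may need a tiny wrapper script around it.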

It took me less than an hour to replace LLVM with Zig. It kinda sorta just worked! With Zig the new project structure looks like this:

\root
    \sample_cpp_app
        - main.cpp
    \thirdparty
        \fmt (3 MB)
    \toolchains
        \linux
            \cmake (143 MB)
            \ninja (260 KB)
            \scripts (1 KB)
            \zig (332 MB)
        \mac (356 MB)
        \win (470 MB)
    - CMakeLists.txt
    - build_and_run_linux.sh
    - build_and_run_mac.sh
    - build_and_run_win.bat

LLVM is deleted and replaced with Zig. There are three copies of Zig (Linux, Mac, Windows) which contain ~150 MB of redundant data. My previous post explains how a next-gen VCS can make this duplication literally free.
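
To see why duplicated files can be nearly free, consider content-addressed storage: identical bytes hash to the same key and only need to be stored once. The sketch below measures whole-file redundancy only. It is my own illustration and not how any particular VCS works; real systems often deduplicate at the chunk level.

```python
import hashlib
import os


def read_tree(root):
    """Yield the contents of every file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            with open(os.path.join(dirpath, name), "rb") as f:
                yield f.read()


def redundant_bytes(blobs):
    """Return how many bytes repeat content that was already seen."""
    seen = set()
    wasted = 0
    for data in blobs:
        digest = hashlib.sha256(data).digest()
        if digest in seen:
            wasted += len(data)   # an identical file is already stored
        else:
            seen.add(digest)
    return wasted
```

Feeding it the three Zig directories chained together, for instance with itertools.chain over read_tree calls, would give an estimate of how much a deduplicating store could save.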

Linking libc

Linking libc isn't actually hard. All you need are the headers and a link target.

Interestingly, this is a place where Windows and Linux are different. On Windows if you want to use bacon.dll you need headers and bacon.dll.imp.lib. This import library is very small and contains only a list of exported symbols.

Meanwhile Linux compilers don't offer a thin import library. Instead you need the headers and the full libbacon.so shared library. This will be used to compile the breakfast executable which will then expect to dynamically load libbacon.so from the system.

Personally, I think it's kinda weird that to dynamically load libbacon.so you need a full copy when compiling. That's excessive and demonstrably unnecessary imho. More on this in a moment!

How Zig Does It

How does Zig solve the glibc conundrum? Andrew Kelley wrote a lengthy blog post that goes into nitty gritty details. I'll attempt to summarize it here.

First, Zig is built on LLVM so clang++ is part of the Zig compiler binary already.

Second, we need to link glibc. On Linux this, unfortunately, means we need a full copy of libc.so.6. Zig does this by... simply compiling glibc! Well, kind of. What Zig actually needs to do is create a dummy .so that very carefully contains all the symbols for a particular version of glibc. This dummy .so allows the program to link. When the program runs it dynamically loads the system glibc which contains the real compiled code. Zig is effectively creating an equivalent to the Windows .dll.imp.lib file!
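
The dummy .so idea can be illustrated with a toy generator. This is only a sketch of the concept and not Zig's implementation: it emits GNU-style assembly with an empty label per symbol and omits the symbol versioning that real glibc relies on. The function names and the cc driver are assumptions.

```python
def stub_assembly(symbols):
    """Return assembly text that defines each symbol as an empty function."""
    lines = [".text"]
    for name in symbols:
        lines.append(".globl " + name)              # export the symbol
        lines.append(".type " + name + ", %function")
        lines.append(name + ":")                    # label with no code behind it
    return "\n".join(lines) + "\n"


def stub_link_cmd(asm_path, soname="libc.so.6"):
    """Command that would turn the stub assembly into a thin shared object.

    Assumes a generic cc driver that accepts the usual GCC/Clang flags.
    """
    return ["cc", "-shared", "-nostdlib",
            "-Wl,-soname," + soname, asm_path, "-o", soname]
```

Linking an executable against such a stub only records "I need malloc from libc.so.6". The real code comes from whatever libc.so.6 the dynamic loader finds at run time.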

Zig's ability to create a thin .so requires an expensive pre-process step. The Zig team compiles glibc for 46+ targets and uses the result to deploy all the necessary headers.

Third, Zig compiles and links Scrt1.o, crti.o, and crtn.o. It turns out these are the C runtime start files. Oh hey, this is exactly what I was missing when my vendored approach wasn't initializing the C runtime! Neat.

There's a few more bits to handle, but that covers the issues I was running into. I suggest reading Andrew Kelley's detailed blog post if you're interested in knowing more.

First Class Cross-Compile

You may be wondering why Zig jumps through all these hoops. The reason is to provide first class support for cross compiling.

Zig can compile for any platform from any platform. For example you can compile an x86_64-linux-gnu program from Windows. Or you can compile for macOS from Linux. It's beautiful.
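
As a rough sketch, building one source file for several platforms from a single machine is just a loop over target triples passed to zig c++ with -target. The target list, output layout, and helper name here are my assumptions.

```python
# Target triples in the form Zig accepts: arch-os[-abi].
TARGETS = ["x86_64-linux-gnu", "x86_64-windows-gnu", "aarch64-macos"]


def cross_compile_cmd(source, target, out_dir="bin"):
    """Build the zig c++ command line for one target."""
    suffix = ".exe" if "windows" in target else ""
    output = out_dir + "/" + target + "/main" + suffix
    return ["zig", "c++", "-target", target, source, "-o", output]


def all_cross_compile_cmds(source, targets=TARGETS):
    """One command per target; the host platform never enters into it."""
    return [cross_compile_cmd(source, t) for t in targets]
```

Each resulting command could be handed to subprocess.run on any host. The output directories would need to exist first.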

A consequence of supporting cross-compile is that you cannot depend on system libraries and toolchains! Thus the only choice is for the Zig compiler to include everything necessary to compile for all Linux, macOS, and Windows variants.

A Small Rant

If you don't like spicy rants I suggest skipping this section. This rant is, ahem, my truthiness. Take it with a grain of salt.

I think that compiler toolchains that rely on system installed libraries are wrong and broken. The Linux Way is fundamentally incorrect and harmful. glibc was the hardest target for Zig to support because glibc's design is based on bad practices from the 70s.

Cross-compile support is de facto good. Programming languages should support cross-compile by default and it should be trivial. If cross-compile is trivially supported then vendored toolchains are implicitly supported as well. It's win/win!

The tragedy is that it doesn't have to be hard! Build systems and compilers must not assume the execution platform and target platform are the same. Libraries should only have a single set of headers. Use #ifdef within files to differentiate between platforms. configure scripts that generate headers based on the local environment are pure evil. Don't do that. Zig's support via mingw-w64 is trivial and musl is simple. It's only glibc that is a bad citizen.

It's very frustrating when bad choices make simple things hard. Zig jumps through a bunch of hoops to make things easy. Kudos to Andrew Kelley and the Zig community! It's also clear evidence that making things not suck can be done by one person with grit. It's not insurmountable and doesn't require an army.

Vendored Toolchains - Proof of Life

Let's get back on topic. In my previous post I argued that compiler toolchains belong in version control. In this post I've created a C++ project that can be compiled on Linux, macOS, and Windows without installing a single external dependency. It uses Zig as a clang++ replacement because Zig "just works" and clang++ doesn't.

My sample project is 185 MB compressed and 1.23 GB uncompressed. This could be shrunk with some de-duplication work.

Example Project: Dropbox

This project doesn't support every environment. I've only provided the Zig toolchain for x86_64 Linux, x86_64 Windows, and aarch64 macOS. Adding support for new platforms is trivial. If you think this sounds like too much bloat then you need to read my previous post.

Conclusion

I strongly believe that all dependencies - including compiler toolchains - belong in version control. It's radically more usable, reliable, reproducible, and sustainable. A new and improved VCS tool can make this space and bandwidth efficient.

My sample project demonstrates that vendoring toolchains absolutely works. It can be done. Even on Linux.

This exercise has also convinced me that first class support for cross-compilation is important and doesn't have to be hard. glibc makes things far harder and more complex than they actually need to be. Zig's ergonomic improvements could and should be provided by glibc out of the box.

Thanks for reading.

Epilogue

XetHub

To work on this little project I used XetHub instead of GitHub. It's not the Next Gen Version Control system of my dreams. But maybe it could be?

Currently I would describe XetHub as "Git LFS done right". Once you install their client you run git xet clone https://xethub.com/forrestthewoods/vcs_toolchains. After that you only run normal git commands. Large files are automagically handled. It's somewhat slick.

Unfortunately they don't have the virtual filesystem of my dreams. Normal clones still download all files. Sparse clones aren't automagic. Their read-only version is mounted as a network drive so you can't run build scripts.

Maybe someday. More importantly, people are working on new and improved VCS tools. Git isn't the end of the road.

Cosmopolitan C

cosmopolitan libc is a project to make C a "build once, run anywhere" language. It produces polyglot executables that magically run on Linux + Mac + Windows + more. It's pretty mind blowing.

I asked if I could mix cosmopolitan's libc with clang++. The answer was "yes, but the compiler flags must match and cosmoc++ must link". I opted not to explore further.

I'm not super interested in replacing my normal compilers with cosmo. But I think it would be interesting to replace the compiler executable with a cosmo-built variant. What if one Zig binary could be used by any platform to compile for any platform? That'd be neat!

That said, with a proper VCS tool the cost of having N binaries in version control is negligible. This is a neat idea, but it's not high on my list.

Zig glibc

My project replaces clang++ with Zig. This isn't necessarily desirable.

It should be possible to leverage Zig's work to extract libc headers and source and create makefiles that can be used normally. The Zig compiler does a little bit of orchestration work to tie it all together. But Zig's hard work to gather headers and .c files could be leveraged without Zig itself.