@viega
Created March 20, 2025 01:58
A C implementation of defer using `goto`
// defer.h
// [email protected]
// © 2025 Crash Override, Inc.
// Licensed under the BSD 3-Clause license
#pragma once
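#include <assert.h>  // assert(), used by n00b_defer_func_end()
#include <setjmp.h>  // longjmp(), used by n00b_defer_longjmp()
#include <stdbool.h> // false, used by n00b_defer()
#include <stddef.h>  // NULL
#include <stdint.h>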
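typedef struct n00b_defer_ll_t n00b_defer_ll_t;

struct n00b_defer_ll_t {
    void   *next_target; // Label address of the next deferred block to run.
    int64_t guard;       // N00B_DEFER_INIT while this block is registered.
};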
#define N00B_DEFER_INIT ((int64_t)0xdefe11defe11defeLL)

#define n00b_enable_defer() \
    n00b_defer_ll_t __n00b_defer_list = { \
        NULL, \
        N00B_DEFER_INIT, \
    }; \
    void *__n00b_defer_return_label = NULL
#define n00b_token_paste(x, y) x##y
#define n00b_defer_label(x) n00b_token_paste(__defer_block_, x)
#define n00b_defer_node(x) n00b_token_paste(__n00b_defer_node_, x)
// The unnecessary extra block after the label is to prevent
// clang-format from wrapping oddly.
#define n00b_defer(defer_block) \
    n00b_defer_ll_t n00b_defer_node(__LINE__); \
    if (n00b_defer_node(__LINE__).guard != N00B_DEFER_INIT) { \
        n00b_defer_node(__LINE__).guard = N00B_DEFER_INIT; \
        n00b_defer_node(__LINE__).next_target = __n00b_defer_list.next_target; \
        __n00b_defer_list.next_target = &&n00b_defer_label(__LINE__); \
    } \
    if (false) { \
        n00b_defer_label(__LINE__) : \
        { \
            n00b_defer_node(__LINE__).guard = 0ULL; \
            defer_block; \
            if (!n00b_defer_node(__LINE__).next_target) { \
                if (!__n00b_defer_return_label) { \
                    goto n00b_defer_bottom_exit; \
                } \
                goto *(__n00b_defer_return_label); \
            } \
        } \
        goto *(n00b_defer_node(__LINE__).next_target); \
    }
#define n00b_defer_func_exit() \
    if (__n00b_defer_list.next_target) { \
        goto *(__n00b_defer_list.next_target); \
    }

#define n00b_defer_return \
    __n00b_defer_return_label = &&n00b_defer_label(__LINE__); \
    n00b_defer_func_exit(); \
    n00b_defer_label(__LINE__) : return

#define n00b_defer_longjmp(jump_env, jump_passed_state) \
    __n00b_defer_return_label = &&n00b_defer_label(__LINE__); \
    n00b_defer_func_exit(); \
    n00b_defer_label(__LINE__) : longjmp(jump_env, jump_passed_state)

#define n00b_defer_func_end() \
    n00b_defer_bottom_exit : assert("You forgot to return on some branch." \
                                    == NULL)
#if defined(N00B_USE_INTERNAL_API)
#define Return n00b_defer_return
#define Longjmp(x, y) n00b_defer_longjmp(x, y)
#define defer(x) n00b_defer(x)
#define defer_on() n00b_enable_defer()
#define defer_func_end() n00b_defer_func_end()
#endif

viega commented Mar 20, 2025

Overview

I just saw that C is getting a 'defer' keyword (not part of the ANSI standard, but a tech note everyone is expected to implement; see this blog post). That was exciting to me, and serendipitously, I needed to go fix an issue where I missed unlocking a lock.

I took a quick look around and found a couple of implementations, but they use longjmp(), which has more overhead than necessary.

So, given it will take a while before a proper defer is commonly available, I decided to build myself a passable implementation to use until the real thing is available.

API Overview

This API is namespaced to n00b (a bigger project), but if you #define N00B_USE_INTERNAL_API before including the header, you'll get the preferred API:

  • defer_on() which must be placed at the beginning of any function that wants to use this.

  • defer(deferred_code), which can appear wherever you need it to in the body.

  • Return, which wraps the return keyword, invoking the defer cleanup process before executing the return statement.

  • defer_func_end(), which ensures you cannot escape the function without a proper return statement (but does not keep you from using the wrong kind of return).

  • Longjmp(env, val) if you do need to exit via a longjmp().

Important notes

  1. The parameter to defer can be arbitrary in-scope code (it is fine to open up a block as long as the preprocessor takes it.)

  2. Defer blocks only run if they are dynamically reached.

  3. Any such block is only ever run a single time.

  4. Defer blocks are run in the reverse order in which they were reached.

  5. You MUST NOT jump out of whatever code you put inside your defer statement. There's no guard against that.

  6. The return statement is run AFTER the defer blocks run. If you do computation in your return statement, and need it to run BEFORE the defer blocks get called, then compute the result, stick it in a variable, and return the variable instead.
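
To make note 6 concrete, here is a minimal hand-written sketch of the control flow that Return sets up. It does not use the macros themselves, and it assumes GCC or clang with the labels-as-values extension. The names lock_held, value_after_cleanup, and value_before_cleanup are made up for this illustration.

```c
#include <stddef.h>

static int lock_held = 0; // stands in for real state that cleanup changes

// The return expression is evaluated only after the cleanup code has run,
// so it observes lock_held == 0.
int value_after_cleanup(void)
{
    void *return_label = NULL;

    lock_held    = 1;
    return_label = &&do_return; // roughly what `Return lock_held;` records
    goto cleanup;
do_return:
    return lock_held;
cleanup:
    lock_held = 0; // the deferred code
    goto *return_label;
}

// Computing the result into a variable first captures the value from
// before the cleanup ran.
int value_before_cleanup(void)
{
    void *return_label = NULL;
    int   result;

    lock_held    = 1;
    result       = lock_held; // evaluated before any cleanup
    return_label = &&do_return;
    goto cleanup;
do_return:
    return result;
cleanup:
    lock_held = 0;
    goto *return_label;
}
```

In this sketch the first function returns 0 and the second returns 1, which is the difference the note warns about.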

See below for a more detailed explanation.

Obligatory Contrived Example

#define N00B_USE_INTERNAL_API
#include <stdio.h>
#include "defer.h"

int
n00b_defer_test(int x)
{
    defer_on();
    
    defer(printf("First defer block runs last.\n"));

    for (int i = 0; i < 100; i++) {
        if (x + i > 200) {
            defer(printf("Deferred in for loop!\n"));
        }
    }

    if (x & 1) {
        Return x & ~1;
    }

    defer(printf("If I make it this far, this goes first.\n"));

    Return x;

    n00b_defer_func_end();
}

void
n00b_run_defer_tests(void)
{
    printf("Test 1: \n");
    n00b_defer_test(151);
    printf("Test 2: \n");
    n00b_defer_test(150);
    printf("Test 3: \n");
    n00b_defer_test(0);
    printf("Test 4: \n");
    n00b_defer_test(1);
}

int main() {
  n00b_run_defer_tests();
}

The above prints:

viega@Mac n00b % n00b
Test 1:
Deferred in for loop!
First defer block runs last.
Test 2:
If I make it this far, this goes first.
Deferred in for loop!
First defer block runs last.
Test 3:
If I make it this far, this goes first.
First defer block runs last.
Test 4:
First defer block runs last.

Explanation

The most interesting bit of the implementation is the actual 'defer' block (it's not really a block, but easier to talk about that way). Each block statically declares its own stack-allocated linked list record.

At the highest level:

  • When a 'defer' block is entered, it first checks to see if it's been run yet (done via a sentinel value set in the stack entry). If it hasn't been entered, then it pushes the location of the code to run during cleanup onto the stack.

  • The rest of the defer block is skipped.

  • The Return (and Longjmp) statements ensure the cleanup happens before actually executing.
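
The three steps above can be sketched by hand without any macros. This is only an approximation of what the expansion does, assuming GCC or clang (it relies on the && and goto * extensions). The names trace, note, and run_two_defers are hypothetical, and the guard check is left out to keep it short.

```c
#include <stddef.h>
#include <string.h>

static char trace[16]; // records the order in which code ran

static void note(char c)
{
    size_t n = strlen(trace);
    if (n + 1 < sizeof trace) {
        trace[n]     = c;
        trace[n + 1] = '\0';
    }
}

const char *run_two_defers(int take_second)
{
    void *head       = NULL; // most recently registered deferred block
    void *next_a     = NULL; // what block A should jump to when it finishes
    void *next_b     = NULL; // same for block B
    void *exit_label = NULL; // where to go once every block has run

    trace[0] = '\0';

    // "defer(note('A'))": push the label, then skip over the body.
    next_a = head;
    head   = &&block_a;
    if (0) {
block_a:
        note('A');
        if (!next_a) goto *exit_label;
        goto *next_a;
    }

    // A second defer that is only registered if it is dynamically reached.
    if (take_second) {
        next_b = head;
        head   = &&block_b;
    }
    if (0) {
block_b:
        note('B');
        if (!next_b) goto *exit_label;
        goto *next_b;
    }

    note('x'); // the ordinary function body

    // "Return": remember the real return site, then unwind the list.
    exit_label = &&do_return;
    if (head) goto *head;
do_return:
    return trace;
}
```

With take_second set, the trace comes out as "xBA": the body first, then the deferred blocks in reverse order of registration. Without it, block B is never registered and the trace is "xA".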

While the implementation is conceptually simple and doesn't take much code, it does require some less common C wizardry to understand:

  1. First, C's token pasting operator (##) allows us to create variables and jump labels on the fly.

  2. We can make labels and variables unique per-statement by pasting together a prefix with the value of the __LINE__ macro (done with an indirect macro here).

  3. Wrapping the defer code in an if (false) { } block allows us to skip it when the defer statement is first reached. But if we define a label inside the block, then push that label onto the stack, it can later be popped off the stack and jumped to.

  4. After the user's defer code (but still inside the if (false) {} block), we need to add code to look to see if there's another deferred block we reached, that we need to run. We always look inside our own private linked-list record to see the successor, if any.

  5. When there are no more defer blocks to run, we need to jump to the actual return statement (or longjmp). Of course, a function can have many. To address that, there's a single variable (added via defer_on()) that gets the jump target to complete the exit. So when the stack empties, we jump wherever that is.

    We use the same token-pasting approach to create the label, and a computed goto to jump to it.

  6. The linked-list guard, which records whether the 'defer' block has been added to the linked list, helps prevent blocks from being run multiple times on exit. Since the compiler does not generate code to zero out stack-allocated memory, we have to get a bit creative.

    Here, we assume the stack memory is random garbage, and use an arbitrary guard value to indicate that a block has been registered. When doing this, we need to be sure to remove the guard (by zeroing it out) once the block runs, so that we don't miss a deferred block in subsequent stack frames.

    We could take an alternate approach, and statically initialize a boolean. We still would need to clear it after running our deferred block. That keeps it off the stack, but leaves us hosed if we somehow do jump out of a defer block.
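
Points 1 and 2 are easy to get subtly wrong, so here is a small sketch of why the indirect macro is needed. All macro and function names below (PASTE, UNIQUE, NAIVE, and so on) are hypothetical and exist only for this illustration; the names are stringized so the result can be inspected at run time.

```c
#include <ctype.h>
#include <string.h>

#define PASTE(x, y) x##y
// The extra level lets `line` expand to a number before ## sees it.
#define UNIQUE(prefix, line) PASTE(prefix, line)
// Pasting directly keeps the literal token __LINE__ instead of its value.
#define NAIVE(prefix) prefix##__LINE__

#define STRINGIZE_RAW(x) #x
#define STRINGIZE(x) STRINGIZE_RAW(x)

// Produces something like "slot_17", depending on the source line.
const char *unique_name(void)
{
    return STRINGIZE(UNIQUE(slot_, __LINE__));
}

// Produces "slot___LINE__" on every line, so two uses would collide.
const char *naive_name(void)
{
    return STRINGIZE(NAIVE(slot_));
}

// Returns nonzero if `name` is "slot_" followed by at least one digit.
int ends_with_line_number(const char *name)
{
    if (strncmp(name, "slot_", 5) != 0) {
        return 0;
    }
    return isdigit((unsigned char)name[5]) != 0;
}
```

Only the indirect form yields a different identifier per line, which is what lets each defer statement own its own label and list node.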

The Computed goto

Many C developers don't know about the existence of computed goto in the language. And, to be fair, it's not technically part of the standard, even though it's worked for a long time in both gcc and clang.

Generally, when you use goto in C, the target is a fixed, static label:

...

if (error) {
    goto cleanup;
}

cleanup:
   close(fd);

...

That will translate to a jump instruction to a fixed offset known at compile time, which will be incredibly cheap.
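
For reference, a complete version of that classic pattern might look like the sketch below. The function sum_digits and its scratch buffer are invented for the example; the point is only that every goto names a fixed label.

```c
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: sums the decimal digits in `text` using a scratch copy.
// Returns 0 on success, -1 on allocation failure or a non-digit character.
int sum_digits(const char *text, int *out)
{
    int   rc    = -1;
    int   total = 0;
    char *buf   = malloc(strlen(text) + 1);

    if (!buf) {
        goto done; // nothing to free yet
    }
    strcpy(buf, text);

    for (const char *p = buf; *p; p++) {
        if (*p < '0' || *p > '9') {
            goto cleanup; // error path: fixed, compile-time jump target
        }
        total += *p - '0';
    }
    *out = total;
    rc   = 0;

cleanup:
    free(buf); // runs on both the success and the error path
done:
    return rc;
}
```

This works well when there is one cleanup sequence in a known order, which is exactly the assumption that dynamically registered defer blocks break.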

Here though, while we create jump labels associated with each defer statement statically, we only want to jump to things if they're triggered, so which labels we want to jump to (and in what order) is dynamic.

The computed goto allows us to handle that by giving us two primitives:

  1. The ability to get the address of any in-scope label that we want to use as a jump target. This is done with the unary && operator. The resulting value has type void * (though you could typedef it for clarity).
  2. The ability to give goto a memory location from which to dynamically read the jump target. This is done by using the C dereference operator (*), effectively telegraphing the argument is not the jump target, but where to look for the jump target.
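
A minimal sketch of the two primitives together, assuming GCC or clang (this is a GNU C extension and will not compile everywhere). The function name classify is made up for the example.

```c
// Returns -1 for negative input and 1 otherwise, choosing the jump
// target at run time instead of with an if/else around the returns.
int classify(int n)
{
    // Primitive 1: && takes the address of a label as a void *.
    void *target = (n < 0) ? &&negative : &&non_negative;

    // Primitive 2: goto * jumps to whatever address the pointer holds.
    goto *target;

negative:
    return -1;
non_negative:
    return 1;
}
```

Here classify(-5) yields -1 and classify(3) yields 1.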

Obviously, this involves a couple of extra memory references.

The computed goto may seem foreign to many, but the idea is old: Fortran had a computed GO TO statement long before C existed. The basic idea is to put these labels in an array, compute some value that provides an index into that array, and jump as appropriate. This has mostly been subsumed by switch or case statements, which often use a computed goto under the hood, but provide much more structure.
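
The array-of-labels idea can be sketched as a tiny dispatcher, again assuming GCC or clang. The function apply and its operation codes are hypothetical.

```c
// op: 0 = add, 1 = subtract, 2 = multiply. Returns 0 for an unknown op.
int apply(int op, int a, int b)
{
    // A table of label addresses, indexed by the operation code.
    static void *const table[] = { &&op_add, &&op_sub, &&op_mul };

    if (op < 0 || op >= (int)(sizeof table / sizeof table[0])) {
        return 0;
    }
    goto *table[op];

op_add:
    return a + b;
op_sub:
    return a - b;
op_mul:
    return a * b;
}
```

A switch statement over op would express the same thing with more structure, and a compiler may well lower it to a similar jump table.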

However, it doesn't make sense to use a C switch statement here, exactly due to that structure. If our macros were to generate a big switch() statement, code would need to ensure defer statements all had the same parent block. In practice, that would lead to a lot of obtuse errors and unnecessary gymnastics.

The computed goto approach is more flexible, and no less efficient.

Comparison

vs. a longjmp() approach

The longjmp call allows for 'goto'-like functionality that can cross function boundaries. It essentially works by saving register state at the point of a setjmp and then restoring it all with the longjmp. That is significant work compared to a few gotos and a small, single-digit number of stack accesses per frame.

This approach is far more lightweight.

Of course, we do wrap longjmp, since it's commonly used as the basis of more heavy-weight exception handling mechanisms. The biggest challenge with such mechanisms is often doing the cleanup while unwinding the stack. Wrapping longjmp doesn't directly help a lot there, as it only would run defer blocks when raising an exception.

However, it's easy to trigger 'defer' calls at the point where setjmp gets back execution after a jump completes. The only remaining challenge is the expense and kludge of having multiple layers of setjmp if needed.
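
As a rough sketch of that last idea, cleanup can be placed at the point where setjmp regains control, so it runs on both the normal path and the longjmp path. This uses plain setjmp/longjmp rather than the macros in this gist, and every name in it (landing, lock_held, risky_step, guarded_work) is an assumption made for the illustration.

```c
#include <setjmp.h>

static jmp_buf landing;          // a single, hypothetical handler frame
static int     lock_held    = 0; // stands in for a real lock
static int     failure_code = 0; // set by the code that raises

static void risky_step(int should_fail)
{
    if (should_fail) {
        failure_code = 42;
        longjmp(landing, 1); // unwinds straight back to the setjmp below
    }
}

// Returns 0 on success or the failure code; the "lock" is released either way.
int guarded_work(int should_fail)
{
    failure_code = 0;
    lock_held    = 1;

    if (setjmp(landing) == 0) {
        risky_step(should_fail); // may not return normally
    }

    // Both the normal path and the longjmp path arrive here.
    lock_held = 0;
    return failure_code;
}

int lock_is_held(void)
{
    return lock_held;
}
```

Nesting this requires one jmp_buf per protected region, which is the expense and kludge mentioned above.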

Forthcoming compiler changes

The forthcoming language feature will definitely be better than this. For example:

  1. You won't have to add macros at the top and bottom of functions.
  2. It will be a statement, where you can use braces.
  3. The proper implementation will allow defer clean-up to happen on block-level (cleaning up after exiting any block).
  4. There'll be no need for the user to wrap things like return and longjmp to ensure defer blocks are evaluated.

But, this is still better than other common exception handling mechanisms, including ones based on longjmp.
