Created
March 20, 2025 01:58
A C implementation of defer using `goto`
// defer.h
// [email protected]
// © 2025 Crash Override, Inc.
// Licensed under the BSD 3-Clause license
#pragma once
#include <assert.h>  // assert(), used by n00b_defer_func_end()
#include <setjmp.h>  // longjmp(), used by n00b_defer_longjmp()
#include <stdbool.h> // false
#include <stddef.h>  // NULL
#include <stdint.h>
typedef struct n00b_defer_ll_t n00b_defer_ll_t;
struct n00b_defer_ll_t {
    void   *next_target;
    int64_t guard;
};
#define N00B_DEFER_INIT ((int64_t)0xdefe11defe11defeLL)
// Declares the per-function list head and the slot for the return label.
#define n00b_enable_defer() \
    n00b_defer_ll_t __n00b_defer_list = { \
        NULL, \
        N00B_DEFER_INIT, \
    }; \
    void *__n00b_defer_return_label = NULL
#define n00b_token_paste(x, y) x##y
#define n00b_defer_label(x) n00b_token_paste(__defer_block_, x)
#define n00b_defer_node(x) n00b_token_paste(__n00b_defer_node_, x)
// The unnecessary extra block after the label is to prevent
// clang-format from wrapping oddly.
#define n00b_defer(defer_block) \
    n00b_defer_ll_t n00b_defer_node(__LINE__); \
    if (n00b_defer_node(__LINE__).guard != N00B_DEFER_INIT) { \
        n00b_defer_node(__LINE__).guard = N00B_DEFER_INIT; \
        n00b_defer_node(__LINE__).next_target = __n00b_defer_list.next_target; \
        __n00b_defer_list.next_target = &&n00b_defer_label(__LINE__); \
    } \
    if (false) { \
        n00b_defer_label(__LINE__) : \
        { \
            n00b_defer_node(__LINE__).guard = 0ULL; \
            defer_block; \
            if (!n00b_defer_node(__LINE__).next_target) { \
                if (!__n00b_defer_return_label) { \
                    goto n00b_defer_bottom_exit; \
                } \
                goto *(__n00b_defer_return_label); \
            } \
        } \
        goto *(n00b_defer_node(__LINE__).next_target); \
    }
#define n00b_defer_func_exit() \
    if (__n00b_defer_list.next_target) { \
        goto *(__n00b_defer_list.next_target); \
    }
// Expands to several statements, so use it inside braces
// (not as the unbraced body of an if or loop).
#define n00b_defer_return \
    __n00b_defer_return_label = &&n00b_defer_label(__LINE__); \
    n00b_defer_func_exit(); \
    n00b_defer_label(__LINE__) : return
#define n00b_defer_longjmp(jump_env, jump_passed_state) \
    __n00b_defer_return_label = &&n00b_defer_label(__LINE__); \
    n00b_defer_func_exit(); \
    n00b_defer_label(__LINE__) : longjmp(jump_env, jump_passed_state)
// Defines the label that n00b_defer() falls back to when no return label is set.
#define n00b_defer_func_end() \
    n00b_defer_bottom_exit : assert("You forgot to return on some branch." \
                                    == NULL)
#if defined(N00B_USE_INTERNAL_API)
#define Return n00b_defer_return
#define Longjmp(x, y) n00b_defer_longjmp(x, y)
#define defer(x) n00b_defer(x)
#define defer_on() n00b_enable_defer()
#define defer_func_end() n00b_defer_func_end()
#endif
Overview
I just saw that C is getting a 'defer' keyword (not part of the ANSI standard, but a tech note everyone is expected to implement; see this blog post). That was exciting to me, and serendipitously, I needed to go fix an issue where I missed unlocking a lock.
I took a quick look around and found a couple of implementations, but they use `longjmp()`, which has more overhead than necessary. So, given it will take a while before a proper defer is commonly available, I decided to build myself a passable implementation to use until the real thing is available.
API Overview
This API is namespaced to n00b (a bigger project), but if you `#define N00B_USE_INTERNAL_API` before including, you'll get the preferred API:
- `defer_on()`, which must be placed at the beginning of any function that wants to use this.
- `defer(deferred_code)`, which can appear wherever you need it to in the body.
- `Return`, which wraps the `return` keyword, invoking the defer cleanup process before executing the return statement.
- `defer_func_end()`, which goes at the end of the function and ensures you cannot escape the function without a proper return statement (but does not keep you from using the wrong kind of `return`).
- `Longjmp(env, val)`, if you do need to exit via a `longjmp()`.
Important notes
- The parameter to `defer` can be arbitrary in-scope code (it is fine to open up a block as long as the preprocessor takes it; a comma outside parentheses, for instance, would be read as a macro argument separator).
- Defer blocks only run if they are dynamically reached.
- Any such block is only ever run a single time.
- Defer blocks are run in the reverse order in which they were reached.
- You MUST NOT jump out of whatever code you put inside your defer statement. There's no guard against that.
- The return statement is run AFTER the defer blocks run. If you do computation in your return statement and need it to run BEFORE the defer blocks get called, then compute the result, stick it in a variable, and return the variable instead.
See below for a more detailed explanation.
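
The note about `return` ordering is easy to trip over. As a minimal sketch in plain C, with a hand-written cleanup step standing in for the defer blocks (the names `lock_depth` and `locked_depth_snapshot` are made up for illustration and are not part of the gist), the value is captured in a variable before the cleanup runs:

```c
#include <assert.h>

/* Hypothetical state that the "deferred" cleanup resets. */
static int lock_depth = 0;

/* Returns the lock depth as seen while the lock was held. */
int locked_depth_snapshot(void)
{
    assert(lock_depth == 0);   /* the lock must not already be held */
    lock_depth = 1;            /* acquire */

    /* Compute the result BEFORE cleanup and keep it in a variable. */
    int result = lock_depth;

    /* Stand-in for the deferred code, which runs before the return. */
    lock_depth = 0;            /* release */

    /* Writing `return lock_depth;` here would see the released state. */
    return result;
}
```

With the real macros, the same idea would mean assigning to `result` first and then writing `Return result;`.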
Obligatory Contrived Example
The above prints:
Explanation
The most interesting bit of the implementation is the actual 'defer' block (it's not really a block, but easier to talk about that way). Each block statically declares its own stack-allocated linked list record.
At the highest level:
- When a 'defer' block is entered, it first checks to see if it has already been registered (done via a sentinel value set in the stack entry). If it hasn't, then it pushes the location of the code to run during cleanup onto the stack.
- The rest of the defer block is skipped.
- The Return (and Longjmp) statements ensure the cleanup happens before actually executing.
While the implementation is conceptually simple and doesn't take much code, it does require some less common C wizardry to understand:
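First, C's token pasting operator (`##`) allows us to create variables and jump labels on the fly.
We can make labels and variables unique per-statement by pasting together a prefix with the value of the `__LINE__` macro. This is done with an indirect macro here, because `##` pastes its operands without expanding them, so `__LINE__` has to be expanded to a number one macro level earlier.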
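
As a small sketch of that indirection (the macro names `PASTE` and `UNIQUE_NAME` and the function `sum_two_unique` are invented for this example; they mirror the shape of `n00b_token_paste` and `n00b_defer_node` but are not taken from the gist):

```c
#include <assert.h>

/* `##` glues its two operands together without expanding them first. */
#define PASTE(x, y)    x##y

/* Here `n` is not an operand of `##`, so an argument such as __LINE__ is
 * expanded to the current line number before PASTE ever sees it. */
#define UNIQUE_NAME(n) PASTE(tmp_, n)

int sum_two_unique(void)
{
    int total = 0;
    /* Each line below declares a differently named variable (tmp_<line>),
     * so the two declarations do not collide. Both uses on one line share
     * the same __LINE__ and therefore the same name. */
    int UNIQUE_NAME(__LINE__) = 1; total += UNIQUE_NAME(__LINE__);
    int UNIQUE_NAME(__LINE__) = 2; total += UNIQUE_NAME(__LINE__);
    return total;
}
```

Calling `PASTE(tmp_, __LINE__)` directly would instead produce the single identifier `tmp___LINE__` on every line, and the second declaration would be a redefinition.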
Wrapping the defer code in an `if (false) { }` block allows us to skip it on its first run. But if we define a label inside the block, then push that label onto the stack, it can later be popped off the stack and jumped to.
After the user's defer code (but still inside the `if (false) {}` block), we need to add code to check whether there's another deferred block we reached that we need to run. We always look inside our own private linked-list record to see the successor, if any.
When there are no more defer blocks to run, we need to jump to the actual return statement (or longjmp). Of course, a function can have many. To address that, there's a single variable (added via `defer_on()`) that gets the jump target to complete the exit. So when the stack empties, we jump wherever that is.
We use the same token-pasting approach to create the label, and a computed goto to jump to it.
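
Stripped of the macros, the control flow described above can be sketched by hand. This is an illustration rather than the gist's actual expansion: the names `run_deferred_chain`, `head`, `next_a`, `next_b`, and `exit_label` are made up, the guard is left out, and the code relies on the gcc/clang labels-as-values extension.

```c
#include <assert.h>
#include <stddef.h>

int run_deferred_chain(void)
{
    int   order      = 0;     /* digits record the order the blocks ran in */
    void *head       = NULL;  /* most recently registered deferred block */
    void *next_a     = NULL;  /* successor recorded by block A */
    void *next_b     = NULL;  /* successor recorded by block B */
    void *exit_label = NULL;  /* where to go once the chain is empty */

    /* "defer A": push the label, then skip the body for now. */
    next_a = head;
    head   = &&block_a;
    if (0) {
block_a:
        order = order * 10 + 1;
        if (next_a) goto *next_a;   /* run the block registered before us */
        goto *exit_label;           /* chain is empty: finish the exit */
    }

    /* "defer B": same pattern, registered second. */
    next_b = head;
    head   = &&block_b;
    if (0) {
block_b:
        order = order * 10 + 2;
        if (next_b) goto *next_b;
        goto *exit_label;
    }

    /* "Return": record where the real return lives, then unwind. */
    exit_label = &&do_return;
    goto *head;
do_return:
    return order;   /* B ran first, then A */
}
```

The function returns 21: block B (registered last) contributes the first digit and block A the second, which is the reverse-order behavior listed in the notes.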
The linked list guard, which records whether the 'defer' block has been added to the linked list or not, helps prevent blocks from being run multiple times on exit. Since the compiler does not generate code to zero out stack-allocated memory, we have to get a bit creative.
Here, we assume the stack memory is random garbage, and use an arbitrary guard value to indicate that a block has already been registered. When doing this, we need to be sure to remove the guard (by zeroing it out) so that we don't miss a deferred block in subsequent stack frames.
We could take an alternate approach, and statically initialize a boolean. We still would need to clear it after running our deferred block. That keeps it off the stack, but leaves us hosed if we somehow do jump out of a defer block.
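
The guard's job can be sketched in isolation with a defer-like statement inside a loop. Note the deliberate difference from the gist: this sketch initializes the record explicitly instead of relying on whatever the stack happens to contain, and the names `node_t`, `GUARD_SET`, and `register_once_in_loop` are hypothetical. It needs gcc or clang for the computed `goto`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GUARD_SET ((int64_t)0xdefe11defe11defeLL)

typedef struct {
    void   *next_target;
    int64_t guard;
} node_t;

/* Reaches the same defer-like statement `iterations` times and returns
 * how many times its cleanup body actually ran. */
int register_once_in_loop(int iterations)
{
    node_t node       = { NULL, 0 };  /* explicitly cleared, unlike the gist */
    void  *head       = NULL;
    void  *exit_label = NULL;
    int    runs       = 0;

    for (int i = 0; i < iterations; i++) {
        /* Reached on every pass, but registered only on the first one.
         * Without the guard, the second pass would make next_target point
         * at &&cleanup itself and the exit path would never terminate. */
        if (node.guard != GUARD_SET) {
            node.guard       = GUARD_SET;
            node.next_target = head;
            head             = &&cleanup;
        }
        if (0) {
cleanup:
            /* The gist clears the guard here so that a stale value left in
             * reused stack memory is not mistaken for a registration. */
            node.guard = 0;
            runs++;
            if (node.next_target) goto *node.next_target;
            goto *exit_label;
        }
    }

    exit_label = &&done;
    if (head) goto *head;
done:
    return runs;
}
```

Three passes through the loop still yield a single cleanup run, and zero passes yield none.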
The Computed `goto`
Many C developers don't know about the existence of computed `goto` in the language. And, to be fair, it's not technically part of the standard, even though it's worked for a long time in both `gcc` and `clang`.
Generally, when you use `goto` in C, the target is a fixed, static label. That will translate to a jump instruction to a fixed offset known at compile time, which will be incredibly cheap.
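
For reference, a fixed-label `goto` of the kind described here typically looks like the following cleanup pattern. It is a generic sketch in standard C; the function name `with_two_buffers` is invented for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 0 on success and -1 on allocation failure.
 * Both buffers are released on every path. */
int with_two_buffers(size_t n)
{
    int   rc = -1;
    char *a  = NULL;
    char *b  = NULL;

    a = malloc(n);
    if (!a) goto out;   /* fixed target, resolved at compile time */
    b = malloc(n);
    if (!b) goto out;

    rc = 0;             /* real work with a and b would go here */
out:
    free(b);            /* free(NULL) is a no-op, so this is always safe */
    free(a);
    return rc;
}
```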
Here though, while we create jump labels associated with each defer statement statically, we only want to jump to things if they're triggered, so which labels we want to jump to (and in what order) is dynamic.
The computed `goto` allows us to handle that by giving us two primitives:
- The `&&` operator, which takes the address of a label. The type of the label is simply just `void *` (though you could `typedef` it for clarity).
- The ability to `goto` a memory location from which to dynamically read the jump target. This is done by using the C dereference operator (`*`), effectively telegraphing that the argument is not the jump target, but where to look for the jump target.
Obviously, this involves a couple of extra memory references.
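
A minimal sketch of those two primitives together, assuming gcc or clang (the function `classify` and its labels are made up for illustration):

```c
#include <assert.h>

/* Picks a jump target at run time and returns -1 or 1 accordingly. */
int classify(int n)
{
    void *target;   /* a label address is just a void * */

    /* Primitive 1: && takes the address of a label. */
    target = (n < 0) ? &&negative : &&non_negative;

    /* Primitive 2: jump to whatever address the variable holds. */
    goto *target;

negative:
    return -1;
non_negative:
    return 1;
}
```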
The computed `goto` may seem foreign to many, but it actually dates all the way back to Algol 68. The basic idea is to put these labels in an array, compute some value that provides an index into that array, and jump as appropriate. This has mostly been subsumed by `switch` or `case` statements, which often use a computed `goto` under the hood, but provide much more structure.
However, it doesn't make sense to use a C `switch` statement here, exactly due to that structure. If our macros were to generate a big `switch()` statement, code would need to ensure `defer` statements all had the same parent block. In practice, that would lead to a lot of obscure errors and unnecessary gymnastics.
The computed `goto` approach is more flexible, and no less efficient.
Comparison
vs. a `longjmp()` approach
The `longjmp` call allows for 'goto'-like functionality that can cross function boundaries, and essentially works by saving register state (at the point of a `setjmp`), and then restoring it all with the `longjmp`, which is significant work compared to a few `goto`s and a small single-digit number of stack accesses per frame.
This approach is far more lightweight.
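
For comparison, here is a bare-bones sketch of the `setjmp`/`longjmp` mechanism itself, in standard C. It is not how any particular defer library uses it; `recover_point`, `inner`, `outer`, and `run_and_recover` are names invented for the example.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf recover_point;

/* Restores the state saved by setjmp(), abandoning this frame and outer()'s. */
static void inner(void)
{
    longjmp(recover_point, 1);
}

static void outer(void)
{
    inner();
}

/* Returns 1 if control came back through longjmp(), 0 otherwise. */
int run_and_recover(void)
{
    /* setjmp() saves the register state and yields 0 on the direct call. */
    if (setjmp(recover_point) == 0) {
        outer();
        return 0;   /* not reached: outer() never returns normally */
    }
    /* After longjmp(), execution resumes as if setjmp() had returned 1. */
    return 1;
}
```

Nothing in `inner` or `outer` gets a chance to clean up on the way out, which is the unwinding problem discussed next.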
Of course, we do wrap `longjmp`, since it's commonly used as the basis of more heavy-weight exception handling mechanisms. The biggest challenge with such mechanisms is often doing the cleanup while unwinding the stack. Wrapping `longjmp` doesn't directly help a lot there, as it only runs the `defer` blocks of the function that raises the exception.
However, it's easy to trigger 'defer' calls at the point where `setjmp` gets back execution after a jump completes. The only remaining challenge is the expense and kludge of having multiple layers of `setjmp` if needed.
Forthcoming compiler changes
The forthcoming language feature will definitely be better than this. For example, it should not require wrapping `return` and `longjmp` to ensure defer blocks are evaluated.
But this is still better than other common exception handling mechanisms, including ones based on `longjmp`.