@GoldenStack
Last active August 24, 2025 23:52
Language explanation and rationale for some key features of my language (https://github.com/GoldenStack/overscore)
  • No place expressions/lvalues. This means the = operator takes two arguments, of type *T and T, and sets the left side to the right side. There are several other implications of this, but I'm not including them here.
  • No booleans. The type system is powerful enough to allow easily swapping them out for more comprehensible alternatives. For example, consider fn evaluate(expr: Expr, deep: bool). When calling, it looks like evaluate(expr, true), which doesn't really communicate what true means. Instead, with fn evaluate(expr: Expr, depth: .shallow or .deep), the call looks like evaluate(expr, .deep), which communicates the meaning much better: deep evaluation. Such a type doesn't come with any math operators, but if what you need is a 2-value integer, you can just create an integer type with two elements (syntax undecided as of now) and use normal mathematical operations on it. If you think booleans are really needed, you can simply write const Bool = .true or .false.
  • Declarations (x: int) and definitions (x = 5) are expressions. Since definitions are essentially equal to struct values with one type, they can be passed around as expressions during runtime (declarations can't because they're types).
  • No operator precedence. Instead of complicated precedence rules for infix operators, the compiler forces you to add parentheses to remove ambiguity. References: Zig, Pony, a blog post.
  • Merged namespaces and struct literals. This means that having a namespace of values (in Zig, struct { const x = 5; const y = 6; }) and struct literals (again, in Zig, .{ .x = 5, .y = 6 }) are the same thing. This makes the language simpler.
  • Names are references. If I have int x = 2;, the type of the expression x (e.g. @TypeOf(x)) is a pointer to an integer - this is because using the name as a reference to a variable ought to actually be a reference, instead of introducing complicated language semantics. BLISS (1970) turns out to have this same feature.
  • No tuples (structs with unnamed fields). Only structs (named tuples). Tuples honestly aren't that useful, especially when the language has the right features in their place, and they complicate parsing (especially with infix type construction, mentioned later), and only having named product/sum types means you have to give everything a name.
  • No hidden copies. Since names are references and pointers are "transparent," copies are never made automatically (e.g. int x = 5; int y = x; implicitly copies in most languages). If you don't write x.* anywhere, it's never copied, and if you do, it's always copied. The only exception is that pointers can be implicitly copied, but pointers don't have identity and are always extremely cheap (one mov), so you can't have implicit memcpys killing you (e.g. for TigerBeetle).
  • No address-of (&) operator. This makes code clearer, since it's often unclear if using the address-of operator is making a short-lived reference to a local variable (&a) or simply finding the reference of an operation applied to another variable (e.g. &a.b). Instead, you'll need to add another declaration to get a reference, making it more obvious what's happening: e.g. int x = 5; *int y = x;
  • Infix type construction. This means, instead of struct X { int a; int b; }, you write const X = (.a: int and .b: int). This has several benefits, including completely disallowing the creation of single-field unions or structs (since you can't call an infix operator with one argument), which, if you think about it, doesn't make sense because single-field unions and structs don't have any meaningful difference, but have different semantics anyway.
  • Main file is an expression. Instead of files being some magic construct, or guaranteed to be a container, files are just expressions. If you have a file that contains literally the text 5, you can @import it and it will evaluate to 5.
  • Circular imports are a non-issue. Since files are expressions, the concept of "circular imports" is equivalent to mutually dependent tuples/containers/structs/product types, something like .main = (.a = 5; .b = other); .other = main.a. This is not an issue because we have lazy type checking for product types due to them being negative types (which behave well with respect to lazy evaluation strategies), which means that dependency loop errors are non-eager and only occur if there is actually a dependency loop. This also means that mutually dependent files that aren't product types are always dependency loops, which is the correct behaviour, and yet doesn't even need a special case in the compiler.
  • Transparent pointers. It's possible to do most operations through pointers without having to dereference them, allowing behaviour that makes more sense. For example, instead of having an implicit dereference when doing (&x).y, a similar expression x->y results in a pointer to y, instead of also dereferencing. This makes more sense, as it allows the same behaviour, but with fewer operators and less hidden behaviour. It also removes cases where you can't tell whether a dereference copies, an uncertainty that makes performance harder to reason about.
  • Syntax that reflects theory: functions and switch statements are essentially functions over product types (structs) and sum types (unions) respectively, so why not have syntax that reflects this? Loops also have a simpler syntax that reflects what's going on, with loop if x == 0 then {} else break; representing a for loop, although loop if isn't a special syntax at all. Another small and somewhat unrelated syntax change: if statements are if x then y else z instead of if (x) y else z.
  • Arrays of a given length (e.g. [3]u8) are just syntax sugar for (.0: u8 and .1: u8 and .2: u8), and slices are a language feature instead: Slice(T: type) = (.len: usize and .ptr: [*]T) (unknown length array syntax pending; stolen from Zig right now).
  • Const pointers by default. Zig had a rare regression in this aspect; this is righted here with *u8 (or []u8) being const by default, with the user having to specify *var u8/[]var u8 to make it mutable.
  • The language also has most of the great features Zig introduces, such as comptime, @builtin syntax, types as values, etc. Also, simplicity/minimalism.
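The circular-imports bullet above can be made concrete outside the language. What follows is a minimal Python sketch, not how the compiler works: the names `LazyStruct` and `DependencyLoop`, and the choice to represent each field as a zero-argument function, are assumptions made purely for illustration. It shows that when product-type fields are only evaluated on access, mutually dependent containers resolve fine, and an error appears only when a field really depends on itself.

```python
class DependencyLoop(Exception):
    """Raised when evaluating a field requires that same field."""


class LazyStruct:
    """Hypothetical product type whose fields are evaluated on first access."""

    def __init__(self, **thunks):
        self._thunks = thunks      # field name -> zero-argument function
        self._values = {}          # cache of fields already evaluated
        self._in_progress = set()  # fields currently being evaluated

    def get(self, name):
        if name in self._values:
            return self._values[name]
        if name in self._in_progress:
            # We re-entered a field while computing it: a genuine loop.
            raise DependencyLoop(name)
        self._in_progress.add(name)
        try:
            value = self._thunks[name]()
        finally:
            self._in_progress.discard(name)
        self._values[name] = value
        return value


# Mirrors `.main = (.a = 5; .b = other); .other = main.a` from the bullet.
root = LazyStruct(
    main=lambda: LazyStruct(a=lambda: 5, b=lambda: root.get("other")),
    other=lambda: root.get("main").get("a"),
)

# A genuine loop: x needs y, and y needs x.
looping = LazyStruct(
    x=lambda: looping.get("y"),
    y=lambda: looping.get("x"),
)
```

Here `root.get("other")` forces `main` only far enough to build its container, then reads `a`, so the mutual reference never becomes a loop. `looping.get("x")` raises `DependencyLoop` because no field can be produced without first producing itself.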
@GoldenStack
Author

Considering one-based indexing btw. :)

@GameBuilder202

GameBuilder202 commented Jun 19, 2025

Considering one-based indexing btw. :)

:(((

@GoldenStack
Author

Unsure if I'll have operator overloading. It seems mandatory given I don't have some types like booleans (so I can't hardcode the implementations in the compiler), but I can also rely on the (probably) built-in integer types for integer operations, which are useful if you need booleans for the operations you can perform on them. The additional compiler complexity and potential readability issues do hurt as well.
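For a rough picture of the trade-off, here is a small Python sketch (Python only because this language's syntax is still undecided). It is an assumption-laden illustration, not a proposal: the `Bool` type and its members are hypothetical, and the operators are defined by the type itself by delegating to the built-in integer operations, rather than being hardcoded in a compiler.

```python
from enum import Enum


class Bool(Enum):
    """Hypothetical user-defined two-valued type, like `.true or .false`."""

    FALSE = 0
    TRUE = 1

    # Each operator delegates to the built-in integer operation on the
    # underlying value, then wraps the result back into the type.
    def __and__(self, other):
        return Bool(self.value & other.value)

    def __or__(self, other):
        return Bool(self.value | other.value)

    def __invert__(self):
        return Bool(1 - self.value)
```

With this, `Bool.TRUE & Bool.FALSE` evaluates to `Bool.FALSE` and `~Bool.FALSE` to `Bool.TRUE`. The cost is visible too: a reader has to know that `&` on this type is user code rather than a primitive.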

Basic operators for large integers can be relatively expensive, which are things we accept as having operators. If we move forward with an io parameter passed into main, it'll be impossible for operators to perform side effects, unless something with an operator contains it as a field. Regardless, I think these two work together well, since I think what Zig is trying to get at with operators is that operators can't have side effects.

@GameBuilder202

Basic operators for large integers can be relatively expensive, which are things we accept as having operators.

While this is true, I don't think it is really a problem. If you're using bignum types, you are kinda signing up for more expensive operations in the long run, implicit or not.

an io parameter passed into main

I assume you mean an IO-monady return type and not parameter? Unless you completely move to a pure FP solution of "side effects iff monads", I don't think you can realistically limit operators from doing side effects. In fact I sometimes do it (print debugging).

@GoldenStack
Author

GoldenStack commented Jun 19, 2025

While this is true, I don't think it is really a problem. If you're using bignum types, you are kinda signing up for more expensive operations in the long run, implicit or not.

Yes, I'm not saying it's a problem; I'm actually justifying the opposite (i.e., why it's fine: we expect lots of builtins to have a cost anyway).

I assume you mean an IO-monady return type and not parameter? Unless you completely move to a pure FP solution of "side effects iff monads", I don't think you can realistically limit operators from doing side effects. In fact I sometimes do it (print debugging).

Yeah, I mean a parameter. In fact, the way I'm envisioning it, it'll probably be only a parameter (basically a vtable), and won't store any results, so you're never supposed to return it. This means you can pass it around as much as you want, but if something doesn't receive it as a parameter, it should not be performing side effects. It's not absolute (it'll probably work the same way Zig allocators work, where you could theoretically make your own, or do print debugging) but it'll just be a very strong guideline.

@GameBuilder202

Yeah, I mean a parameter. In fact, the way I'm envisioning it, it'll probably be only a parameter (of basically a vtable), and won't store any results, so you're never supposed to return it.

So what I understand: instead of IO functions in a module, you have IO functions on an object passed to the main function?

I think that could work. It might make print-debugging a bit annoying, but I think it's a good compromise. One problem I see is that I don't see how this would support organization, like grouping together stdout/stdin io, file io, whatever other io.

@GoldenStack
Author

So what I understand: instead of IO functions in a module, you have IO functions on an object passed to the main function?

Yeah, it's IO functions on an object passed into main. If IO is a unit type then it can be optimized away, so I'm not concerned about efficiency.

It might make print-debugging a bit annoying, but I think it's a good compromise.

It might make print debugging more annoying, but keep in mind that, for example, Haskell has an "escape hatch" via trace, so we can just do a similar thing. It doesn't need to be an absolute rule.

One problem I see is that I don't see how this would support organization, like grouping together stdout/stdin io, file io, whatever other io.

I hadn't thought about this before, but it'll probably work in a similar way to how IO works already in Zig. In fact, since in Zig you're just importing a struct and accessing members/calling functions on it, in this language the exact same thing can be true, but the variable name comes from a parameter and not from @import, so the usage has the potential to be the exact same after converting from const io = @import("std").io; to the parameter (io: std.Io).

An improvement, though, is that if something only needs to make changes to files, you can just pass io.file instead of the entire io. (I made this up while writing this, but I hope it serves the point.) You can already access fields on it in Zig, but if it's being passed around like this you have finer-grained permissions.
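As a rough sketch of that idea, here it is in Python, with everything invented for illustration: `Io`, `FileIo`, `ConsoleIo`, `save_report`, and the in-memory stand-ins for files and stdout are all hypothetical, and none of this is the real `std.Io`. Python also can't enforce the restriction (the built-in `print` is always reachable), so this only shows the calling convention, in the "strong guideline" spirit.

```python
class FileIo:
    """Hypothetical file capability: the only handle for touching files."""

    def __init__(self):
        self.files = {}  # in-memory stand-in for a file system

    def write(self, path, text):
        self.files[path] = text


class ConsoleIo:
    """Hypothetical console capability."""

    def __init__(self):
        self.lines = []  # in-memory stand-in for stdout

    def print(self, text):
        self.lines.append(text)


class Io:
    """Hypothetical root capability handed to main."""

    def __init__(self):
        self.file = FileIo()
        self.console = ConsoleIo()


def save_report(file, lines):
    # Receives only the file capability, so it holds no handle for printing.
    file.write("report.txt", "\n".join(lines))


def main(io):
    save_report(io.file, ["a", "b"])  # narrow grant: files only
    io.console.print("saved")         # main still holds the whole io


root_io = Io()
main(root_io)
```

The signature of `save_report` documents what it may do: it can write files, and nothing it was given lets it print.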

@GameBuilder202

If IO is a unit type then it can be optimized away, so I'm not concerned about efficiency.

I don't see where/how this would be the case, considering the object contains stuff you need to do impure stuff.

On the theory side of things, I think what this creates is essentially just do notation without the do keyword. The parallel I'm seeing here is:

main :: IO ()
main = do
  str <- getLine
  putStrLn str
  return ()

-->

fn main(io: std.IO) {
    const str = io.getLine();
    io.putStrLn(str);
}

the parallel becomes even clearer if we explicitly annotate the usage of the functions under the IO namespace (example in pseudo-Lean4):

def main : IO () := do
  let str := IO.getLine
  IO.putStrLn str
  return ()

I think there's some opportunity to incorporate some monad theory due to this connection.

@GameBuilder202

If my point wasn't clear, what I meant was:
You can only use the functions of a monad (here IO) when it is your return type, and do notation is just imperative-ified monad code.

In the case of this language, I see it roughly translating to: You can only use the functions of a (insert word here) (here IO) when it is passed as an object to the function.

@GoldenStack
Author

Yeah, that's basically right. It's just discount monads. I don't think I'll make stuff too focused on monads, but I guess I'll just see how the design process goes.

I don't see where/how this would be the case, considering the object contains stuff you need to do impure stuff.

It would basically be a vtable with fields that happen to be comptime-known, or it could just be a unit type that just calls static functions. Either way I think it's manageable.
