Prism Documentation

A dialect of C with defer, orelse, and automatic zero-initialization. Drop into any codebase with CC=prism make.

Install

Linux / macOS

cc prism.c -flto -s -O3 -o prism && ./prism install

Compiles the single prism.c file with your system compiler and installs the binary to /usr/local/bin/prism. The command assumes prism.c is already in the current directory.

Windows (MSVC)

Open a Developer Command Prompt (or run vcvars64.bat) and build:

cl /Fe:prism.exe prism.c /O2 /D_CRT_SECURE_NO_WARNINGS /nologo

Requires Visual Studio Build Tools with the Desktop development with C++ workload.

Quick start

# Compile a file
prism foo.c -o foo

# Compile and run immediately
prism run foo.c

# See the transpiled output
prism transpile foo.c

Drop-in overlay

Prism uses a GCC-compatible interface. Any flag you'd pass to gcc or clang passes through automatically.

# Instead of:
CC=gcc make

# Use:
CC=prism make

All standard flags (-O2, -Wall, -I, -L, -l) pass through to the backend compiler unchanged. Prism consumes only its own flags.

defer

The problem: C requires manual cleanup at every exit point. Each new resource adds cleanup to every error path. Miss one and you leak.

Standard C
int compile(const char *path) {
    FILE *f = fopen(path, "r");
    if (!f) return -1;

    char *src = read_file(f);
    if (!src) {
        fclose(f);
        return -1;
    }

    Token *tok = tokenize(src);
    if (!tok) {
        free(src);
        fclose(f);
        return -1;
    }

    Node *ast = parse(tok);
    if (!ast) {
        token_free(tok);
        free(src);
        fclose(f);
        return -1;
    }

    int result = emit(ast);
    node_free(ast);   // remember all
    token_free(tok);  // in the right order
    free(src);
    fclose(f);
    return result;
}
With Prism
int compile(const char *path) {
    FILE *f = fopen(path, "r");
    if (!f) return -1;
    defer fclose(f);

    char *src = read_file(f);
    if (!src) return -1;
    defer free(src);

    Token *tok = tokenize(src);
    if (!tok) return -1;
    defer token_free(tok);

    Node *ast = parse(tok);
    if (!ast) return -1;
    defer node_free(ast);

    return emit(ast);
}

Write cleanup once. It runs on every exit — return, break, continue, goto, or reaching }. Defers execute in LIFO order (last defer runs first).
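
One way to picture the LIFO guarantee is the goto chain that C programmers write by hand. The sketch below is plain C, not Prism's actual emitted output. The names open_all, release, and cleanup_log are hypothetical and exist only to make the cleanup order observable.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical log that records the order in which cleanups run. */
char cleanup_log[8];

static void release(char id) {
    size_t n = strlen(cleanup_log);
    if (n + 1 < sizeof cleanup_log) {
        cleanup_log[n] = id;
        cleanup_log[n + 1] = '\0';
    }
}

/* Acquires resources A, B, C in order; fail_at selects which acquisition
 * fails (0 = none). Cleanups run in reverse order of acquisition, the
 * same order three defer statements would produce. */
int open_all(int fail_at) {
    int rc = -1;
    cleanup_log[0] = '\0';

    if (fail_at == 1) return rc;      /* nothing acquired yet */
    /* A acquired */
    if (fail_at == 2) goto release_a;
    /* B acquired */
    if (fail_at == 3) goto release_b;
    /* C acquired */
    rc = 0;

    release('C');
release_b:
    release('B');
release_a:
    release('A');
    return rc;
}
```

Each early exit has to jump to the right label by hand; with defer, the cleanup sits next to the acquisition and the ordering follows from it.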

Edge cases

Forbidden contexts

Defer is rejected in functions that use setjmp, vfork, or asm.

Inside a defer body, these are always errors regardless of context: return, goto, break, continue, and nested statement expressions.

Opt-out: prism -fno-defer src.c — disables defer entirely for a file.

Zero-init

The problem: Uninitialized reads are a common source of C vulnerabilities. -Wall catches only the obvious cases.

Standard C — UB
int sum_positive(int *arr, int n) {
    int total; // could be anything
    for (int i = 0; i < n; i++)
        if (arr[i] > 0)
            total += arr[i];
    return total; // UB if no positives
}
With Prism — safe
// All locals start at zero:
int x;              // 0
char *ptr;          // NULL
int arr[10];        // {0, 0, ...}
struct {
    int a; float b;
} s;                // {0, 0.0}

Before code generation, Pass 1 walks the entire preprocessed token stream at all depths to build a complete symbol table of every typedef, enum constant, parameter shadow, and VLA tag. This distinguishes size_t x; (declaration → initialize) from size_t * x; (expression → don't touch).
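
The ambiguity that the symbol table resolves can be reproduced in standard C. The following is a small illustration, not Prism code: the same tokens, a name followed by * x, mean different things depending on whether the name currently refers to a type. The typedef T and both function names are hypothetical.

```c
typedef int T;

/* Here T names a type, so "T *x" declares a pointer. This is the case a
 * zero-initializer applies to; the initializer is written by hand here. */
int declares_pointer(void) {
    T *x = 0;
    return x == 0;
}

/* Here the parameter T shadows the typedef, so "T * x" is a
 * multiplication and must be left untouched. */
int multiplies(int T, int x) {
    return T * x;
}
```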

What gets initialized

| Declaration | Transformation |
|---|---|
| int x; | int x = 0; |
| char *p; | char *p = 0; |
| struct S s; | struct S s = {0}; |
| int arr[10]; | int arr[10] = {0}; |
| int arr[n]; (VLA) | int arr[n]; memset(arr, 0, sizeof(arr)); |
| T x; where T is a VLA typedef | T x; memset(x, 0, sizeof(x)); |

Zero-initialization only applies inside function bodies. File-scope declarations and struct/union/enum definitions are never touched.
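
As a rough picture of the result, here is hand-written standard C that mirrors two rows of the table: the scalar form and the VLA form. It is a sketch of the equivalent code, not captured transpiler output, and count_nonzero_after_clear is a hypothetical helper used only to observe the zeroing.

```c
#include <string.h>

/* Scalar case: "int total;" becomes "int total = 0;". */
int sum_positive(const int *arr, int n) {
    int total = 0;
    for (int i = 0; i < n; i++)
        if (arr[i] > 0)
            total += arr[i];
    return total;
}

/* VLA case: C99 and C11 do not allow an initializer on a VLA, so the
 * memset form is used instead. Assumes n > 0. */
int count_nonzero_after_clear(int n) {
    int arr[n];
    memset(arr, 0, sizeof(arr));
    int nonzero = 0;
    for (int i = 0; i < n; i++)
        if (arr[i] != 0)
            nonzero++;
    return nonzero;
}
```

With the initializer in place, sum_positive returns 0 for an array with no positive entries instead of reading an indeterminate value.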

Opt-out: prism -fno-zeroinit src.c — or per-variable with the raw keyword.

raw

Opt out of zero-initialization for a specific variable without disabling it globally.

raw int x;              // uninitialized
raw char buf[65536];    // no memset overhead
raw struct Large data;  // skip zeroing

When to use

Multi-declarator edge case: raw applies only to the first declarator. In raw int x, y;, only x opts out — y is still zero-initialized.

raw variables can be jumped over by goto safely — no initialization to bypass. Exception: raw on a VLA does not exempt it from the goto check, because jumping past a VLA bypasses implicit stack allocation regardless of initialization.

orelse

Handle failure inline — check a value and bail in one expression.

defer solved the cleanup problem. orelse solves the check-and-bail boilerplate:

Without orelse
FILE *f = fopen(path, "r");
if (!f) return -1;
defer fclose(f);

char *src = read_file(f);
if (!src) return -1;
defer free(src);
With orelse
FILE *f =
    fopen(path, "r") orelse return -1;
defer fclose(f);

char *src =
    read_file(f) orelse return -1;
defer free(src);

orelse checks if the value is falsy (null pointer, zero). If so, the action fires and all active defers run — identical to a normal return.

Forms

| Form | Example | Emitted as |
|---|---|---|
| Control flow | x = f() orelse return -1; | x = f(); if (!x) return -1; |
| Block | x = f() orelse { log(); return -1; } | x = f(); if (!x) { log(); return -1; } |
| Fallback value | x = f() orelse "default" | x = f(); if (!x) x = "default"; |
| Bare expression | do_init() orelse return -1; | if (!(do_init())) return -1; |
| Array dimension | int buf[n orelse 1] | Temp variable hoisted before declaration |
| Decl initializer | int x = f() orelse 0; | Expanded with temp and null check |
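
The control-flow and fallback rows can be written out by hand in standard C to see what the expansion amounts to. This is a sketch under the assumption that the emitted code follows the table literally; lookup, name_length, and name_or_default are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup: returns NULL when the key is unknown. */
const char *lookup(const char *key) {
    return strcmp(key, "user") == 0 ? "alice" : NULL;
}

/* Control-flow form, as in: name = lookup(key) orelse return -1; */
int name_length(const char *key) {
    const char *name = lookup(key);
    if (!name) return -1;
    return (int)strlen(name);
}

/* Fallback form, as in: name = lookup(key) orelse "unknown"; */
const char *name_or_default(const char *key) {
    const char *name = lookup(key);
    if (!name) name = "unknown";
    return name;
}
```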

Control flow examples

int *p = get_ptr() orelse return -1;
int *q = next()    orelse break;
int *r = try_it()  orelse continue;
int *s = find()    orelse goto cleanup;

Block — arbitrary code on failure

FILE *f = fopen(path, "r") orelse {
    log_error("failed to open %s", path);
    return -1;
}

Fallback value

char *name = get_name() orelse "unknown";

Works with any falsy scalar

int count = item_count(list) orelse return -1;   // integer: 0 is falsy
size_t n = read_data(fd, buf) orelse break;      // 0 bytes = done

Limitations

orelse does not support struct or union values — it is a compile error:

struct Vec2 v = make_vec2() orelse return -1;  // Error

Struct pointers work fine:

struct Vec2 *p = get_vec2() orelse return -1;  // OK

typedef caveat: Prism has no type checker — only a symbol table. If a struct is hidden behind a typedef, Prism cannot detect it. The code will pass Prism and fail at the backend compiler with a less helpful error. Use struct Foo * explicitly when possible.

Side-effect protection: orelse inside VLA dimension brackets (int buf[n orelse 1]) rejects expressions with side effects — ++, --, =, volatile reads, and function calls — to prevent double evaluation.
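
A hand-written equivalent of the array-dimension form shows why the dimension must be free of side effects: the value is read once into a temporary and then tested, so an expression such as n++ would otherwise risk being evaluated twice. The temporary name dim and the function buffer_elems are assumptions made for this sketch.

```c
#include <stddef.h>

/* Hand-written equivalent of "int buf[n orelse 1]": the dimension is
 * evaluated once into a temporary, then replaced if it is zero. */
size_t buffer_elems(size_t n) {
    size_t dim = n;          /* hypothetical temporary name */
    if (!dim) dim = 1;
    int buf[dim];
    return sizeof(buf) / sizeof(buf[0]);
}
```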

Invalid contexts (compile errors): inside struct bodies, inside typeof, in ternary expressions, in for-init control parens, at file scope.

Opt-out: prism -fno-orelse src.c

Safety enforcement

Prism's CFG verifier runs as Phase 2A — after all semantic analysis, before a single byte of output is emitted. All violations are hard errors by default.

CFG violation table

| Violation | Default | With -fno-safety |
|---|---|---|
| Forward goto skips over defer | Error | Warning |
| Forward goto skips over uninitialized declaration | Error | Warning |
| Forward goto skips over VLA declaration | Always error | Always error |
| Backward goto enters scope containing defer | Error | Warning |
| Backward goto enters scope with uninitialized declaration | Error | Warning |
| Backward goto enters scope containing VLA declaration | Always error | Always error |
| switch/case skips defer via fallthrough | Error | Warning |
| switch/case bypasses zero-initialized declaration | Error | Warning |

VLA skips are always hard errors because jumping past a VLA bypasses implicit stack allocation — this is undefined behavior regardless of whether the variable is initialized. raw does not exempt a VLA from this check.

// Error: goto 'skip' would skip over variable declaration 'x'
void unsafe() {
    goto skip;
    int x;
skip:
    printf("%d", x);
}
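
One possible way to restructure such a function so that no declaration is skipped is to hoist the declaration above the jump. The version below is standard C; value_or_zero and its parameter are hypothetical names.

```c
/* The declaration is moved above the jump, so the goto no longer skips
 * it and x has a defined value on both paths. */
int value_or_zero(int skip_assignment) {
    int x = 0;
    if (skip_assignment) goto skip;
    x = 42;
skip:
    return x;
}
```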

Defer in forbidden contexts

void bad() {
    jmp_buf buf;
    defer cleanup();  // Error: defer forbidden with setjmp
    if (setjmp(buf)) return;
}

_Generic compatibility

case and default inside _Generic association lists are not treated as switch cases — Prism tracks _Generic scope separately so they don't trigger spurious CFG violations.
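
For reference, this is the kind of C11 code where the distinction matters: a _Generic selection with its own default association, used inside a switch. TYPE_CODE and classify are hypothetical names for this illustration.

```c
/* C11: the "default" in this macro belongs to the _Generic association
 * list, not to any enclosing switch. */
#define TYPE_CODE(x) _Generic((x), int: 1, double: 2, default: 0)

int classify(int selector) {
    switch (selector) {
    case 0:  return TYPE_CODE(42);
    case 1:  return TYPE_CODE(1.5);
    default: return TYPE_CODE("text");  /* only this default is a switch label */
    }
}
```

After preprocessing, the switch body contains a default: token pair inside the _Generic parentheses, which is why a tool that scans tokens has to track that scope.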

Downgrade to warnings: prism -fno-safety src.c — turns CFG errors into warnings for gradual adoption on existing codebases. VLA violations remain hard errors.

CLI reference

Commands

| Command | Description |
|---|---|
| prism src.c -o out | Compile (GCC-compatible) |
| prism run src.c | Transpile, compile, and run immediately |
| prism transpile src.c | Output transpiled C to stdout |
| prism install | Install to /usr/local/bin/prism |
| prism --help | Print usage |
| prism --version | Print Prism version + backend CC version |
| prism -v (no source files) | Pass all args through to backend CC |

Prism flags

These are consumed by Prism and not passed to the backend compiler.

| Flag | Effect |
|---|---|
| -fno-defer | Disable defer |
| -fno-zeroinit | Disable zero-initialization |
| -fno-orelse | Disable orelse |
| -fno-safety | Downgrade CFG errors to warnings (VLA violations remain errors) |
| -fno-line-directives | Disable #line directives in output |
| -fflatten-headers | Flatten headers into single output file |
| -fno-flatten-headers | Disable header flattening |
| --prism-cc=<compiler> | Use a specific backend compiler |
| --prism-verbose | Show commands being run |

All other flags are passed through to the backend compiler unchanged. User-supplied -x <lang> is respected and used as the pipe language instead of the default c.

Compiler detection

When the backend compiler basename doesn't make it obvious (e.g. cc on Termux, FreeBSD, or some Linux distros), Prism probes <CC> --version to detect clang vs GCC. This is used to avoid passing unsupported flags; for example, -fpreprocessed is GCC-only.
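
A probe of this kind could classify the version banner with a few substring checks. The sketch below is an assumption about how such a classifier might look, not Prism's actual detection logic. It covers only the string classification and leaves out running the compiler. CompilerFamily and classify_version_banner are hypothetical names.

```c
#include <string.h>

typedef enum { CC_UNKNOWN, CC_GCC, CC_CLANG } CompilerFamily;

/* Classifies the text printed by "<CC> --version". clang is checked
 * first so that a banner mentioning both names is treated as clang. */
CompilerFamily classify_version_banner(const char *banner) {
    if (strstr(banner, "clang")) return CC_CLANG;
    if (strstr(banner, "gcc") || strstr(banner, "GCC") ||
        strstr(banner, "Free Software Foundation"))
        return CC_GCC;
    return CC_UNKNOWN;
}
```

A caller would then include GCC-only flags only when the result is CC_GCC, and fall back to the most conservative flag set for CC_UNKNOWN.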

Multi-file & passthrough

# Multiple source files
prism main.c utils.c -o app

# Mix with assembly
prism main.c boot.s -o kernel

# C++ files pass through untouched
prism main.c helper.cpp -o mixed

Passthrough extensions (not transpiled): .s, .S (assembly), .cc, .cpp, .cxx, .mm (C++), .m (Objective-C).
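
The extension list above amounts to a simple suffix check. A minimal sketch, assuming the match is made on the final extension and is case-sensitive (which is why .s and .S are listed separately); is_passthrough is a hypothetical name.

```c
#include <string.h>

/* Returns 1 if the file's final extension is in the passthrough list,
 * 0 otherwise. The comparison is case-sensitive. */
int is_passthrough(const char *path) {
    static const char *const exts[] = {
        ".s", ".S", ".cc", ".cpp", ".cxx", ".mm", ".m"
    };
    const char *dot = strrchr(path, '.');
    if (!dot) return 0;
    for (size_t i = 0; i < sizeof exts / sizeof exts[0]; i++)
        if (strcmp(dot, exts[i]) == 0) return 1;
    return 0;
}
```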

Error reporting

Prism emits #line directives so compiler errors point to your original source, not the transpiled output:

# You see:
main.c:42:5: error: use of undeclared identifier 'foo'

# Not:
/tmp/prism_xyz.c:1847:5: error: use of undeclared identifier 'foo'

Disable with prism -fno-line-directives src.c — useful when debugging the transpiler output directly.
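
The mechanism behind this is the standard C #line directive, which resets the line number and file name the compiler reports for the lines that follow. A small standalone illustration (the function names are hypothetical):

```c
/* After the directive, the compiler treats the next source line as
 * line 42 of a file named main.c. */
#line 42 "main.c"
int reported_line(void) { return __LINE__; }
const char *reported_file(void) { return __FILE__; }
```

Here reported_line() returns 42 and reported_file() returns "main.c", regardless of where the code physically sits in the generated file.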

Library mode

Prism can be compiled as a library for embedding in other tools. This excludes the CLI main().

cc -DPRISM_LIB_MODE -c prism.c -o prism.o

API

PrismFeatures prism_defaults(void);
PrismResult   prism_transpile_file(const char *path, PrismFeatures features);
void          prism_free(PrismResult *r);

PrismFeatures holds compiler path, include paths, defines, compiler flags, force-includes, and boolean feature toggles (defer, zeroinit, orelse, line_directives, warn_safety, flatten_headers).

PrismResult returns a status code and the transpiled source string:

| Status | Meaning |
|---|---|
| PRISM_OK | Success |
| PRISM_ERR_SYNTAX | Tokenizer or parse error |
| PRISM_ERR_SEMANTIC | CFG or safety violation |
| PRISM_ERR_IO | File not found or read error |

Error recovery: In library mode, error_tok triggers longjmp instead of exit(1). All arena-allocated Pass 1 structures are reclaimed on the jump — no leaks, no dangling pointers. The context is safe to reuse for the next call.
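
The recovery pattern described here can be sketched in standard C with setjmp and longjmp. This is a generic illustration under stated assumptions, not Prism's implementation: the bump arena, fail (standing in for error_tok), and process are all hypothetical.

```c
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical bump arena standing in for Pass 1 allocations. */
static char    arena_buf[1024];
static size_t  arena_used;
static jmp_buf recover_point;

static void *arena_alloc(size_t n) {
    if (n > sizeof arena_buf - arena_used) longjmp(recover_point, 1);
    void *p = arena_buf + arena_used;
    arena_used += n;
    return p;
}

/* Stand-in for error_tok in library mode: jump back instead of exit(1). */
static void fail(void) { longjmp(recover_point, 1); }

/* Returns 0 on success, -1 on error. Either way the arena is reset
 * before returning, so the next call starts from a clean state. */
int process(const char *input) {
    int status = 0;
    if (setjmp(recover_point) != 0) {
        status = -1;                  /* arrived here via longjmp */
    } else {
        char *scratch = arena_alloc(64);
        scratch[0] = input[0];
        if (input[0] == '!') fail();  /* hypothetical error condition */
    }
    arena_used = 0;                   /* reclaim everything */
    return status;
}

size_t arena_bytes_in_use(void) { return arena_used; }
```

Because every allocation comes from the arena, a single reset after the jump is enough to release everything the failed call had built up.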

Architecture

Prism tokenizes the source (Pass 0) and then processes it in two passes. Pass 1 performs full semantic analysis; it and the Phase 2A CFG verifier catch all errors before any output is emitted. Pass 2 is a near-pure code generator that reads Pass 1's immutable artifacts: no type table mutations, no speculative token walking, no mid-emit errors.

Invariants

| # | Invariant |
|---|---|
| 1 | Immutable symbol table. After Phase 1B completes, the typedef table is frozen. Pass 2 performs zero mutations to the typedef table, scope tree, annotation array, or function metadata. |
| 2 | All errors before emission. Every error_tok call from semantic analysis fires during Pass 1 or Phase 2A. Pass 2 contains no safety-check errors. |
| 3 | O(N) CFG verification. p1_verify_cfg is guaranteed linear in the number of per-function entry items. No O(N²) pairwise scans. |
| 4 | Delimiter matching completeness. Every (, [, { has a matching index pointing to its closing pair, computed during tokenization, used pervasively in both passes. |
| 5 | Self-host fixed point. Stage 1 and Stage 2 transpiled C output is identical (verified by CI). |
| 6 | Arena safety. All arena-allocated Pass 1 structures are reclaimed on longjmp error recovery. No dangling pointers after reset. |
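
Invariant 4 can be illustrated with the usual stack-based matcher, which pairs delimiters in a single linear pass. The sketch below works on characters rather than tokens, ignores string literals, and uses -1 for "no match", all of which are simplifications for illustration. match_delimiters and MAX_DEPTH are hypothetical names.

```c
#include <stddef.h>

#define MAX_DEPTH 64

/* Fills match[i] with the index of the partner of the delimiter at
 * text[i], or -1 for non-delimiters. Returns 0 if everything pairs up,
 * -1 on a mismatch, an unclosed delimiter, or nesting deeper than
 * MAX_DEPTH. */
int match_delimiters(const char *text, int *match, size_t len) {
    size_t stack[MAX_DEPTH];
    size_t depth = 0;

    for (size_t i = 0; i < len; i++) {
        char c = text[i];
        match[i] = -1;
        if (c == '(' || c == '[' || c == '{') {
            if (depth == MAX_DEPTH) return -1;
            stack[depth++] = i;
        } else if (c == ')' || c == ']' || c == '}') {
            if (depth == 0) return -1;
            size_t open = stack[--depth];
            char want = text[open] == '(' ? ')'
                      : text[open] == '[' ? ']' : '}';
            if (c != want) return -1;
            match[open] = (int)i;   /* opener points at its closer */
            match[i] = (int)open;   /* closer points back at its opener */
        }
    }
    return depth == 0 ? 0 : -1;
}
```

Once the indices exist, later passes can jump from an opener to its closer in constant time instead of rescanning.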

Pass overview

| Phase | What it does |
|---|---|
| Pass 0 | Tokenizer: flat pool of 20-byte tokens. Delimiter matching, keyword tagging (32 bitmask flags), per-function setjmp/vfork/asm taint graph. ~70–80% of tokens take a fast path that never touches the annotation array. |
| Pass 1A | Scope Tree: walk all tokens, assign scope IDs, build parent chain, classify each { (loop / switch / conditional / function body / struct / statement expression) |
| Pass 1B | Type Registration: full-depth typedef, enum, VLA tag registration at all scopes. Symbol table is frozen after this point; no mutations in Pass 2. |
| Pass 1C | Shadow Table: record every variable that shadows a typedef, with scope ID and token index for temporally-correct range-based lookup |
| Pass 1D | CFG Collection: per-function arrays of labels, gotos, defers, declarations, switch/case entries |
| Pass 1E | Return Type Capture: record each function's return type token range and void/setjmp/vfork/asm flags |
| Pass 1F | Defer Validation: reject forbidden patterns inside defer bodies: return, goto, break, continue, nested statement expressions |
| Pass 1G | Braceless Tagging: mark control keywords whose braceless body contains Prism keywords needing brace injection |
| Pass 1H | Orelse Classification: classify orelse in array dimension brackets and declaration initializers; reject at file scope |
| Pass 2A | CFG Verification: O(N) snapshot-and-sweep: verify every goto→label and switch→case pair against defers and declarations. VLA skips are always hard errors regardless of -fno-safety. |
| Pass 2 | Code Generation: emit transformed C. Reads immutable scope tree, typedef table, shadow table. No type mutations, no safety checks, no token speculation. |

Token structure

Tokens are 20 bytes on the hot path. Source location data (line_no, file_idx, loc_offset) is stored in a separate cold array to keep the hot pool cache-friendly.

| Field | Type | Description |
|---|---|---|
| tag | uint32_t | Bitmask of 32 TT_* keyword/semantic flags |
| next_idx | uint32_t | Pool index of next token |
| match_idx | uint32_t | Pool index of matching delimiter (0 = none) |
| len | uint32_t | Byte length of token text |
| kind | uint8_t | TK_IDENT, TK_KEYWORD, TK_PUNCT, TK_STR, TK_NUM, TK_PREP_DIR, TK_EOF |
| flags | uint8_t | TF_AT_BOL, TF_HAS_SPACE, TF_IS_FLOAT, TF_OPEN, TF_CLOSE, TF_C23_ATTR, TF_RAW, TF_SIZEOF |
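
Laid out as a C struct, the fields in the table add up to 18 bytes, which 4-byte alignment pads to the stated 20. The sketch below takes field names and types from the table. The struct name, the field order in memory, and the bit assigned to each TF_* flag are assumptions. The size check holds on common ABIs where uint32_t is 4-byte aligned.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions are assumed for this sketch; eight flags fill one byte. */
enum {
    TF_AT_BOL   = 1 << 0, TF_HAS_SPACE = 1 << 1,
    TF_IS_FLOAT = 1 << 2, TF_OPEN      = 1 << 3,
    TF_CLOSE    = 1 << 4, TF_C23_ATTR  = 1 << 5,
    TF_RAW      = 1 << 6, TF_SIZEOF    = 1 << 7
};

typedef struct {
    uint32_t tag;        /* bitmask of TT_* flags */
    uint32_t next_idx;   /* pool index of next token */
    uint32_t match_idx;  /* pool index of matching delimiter, 0 = none */
    uint32_t len;        /* byte length of token text */
    uint8_t  kind;       /* TK_* */
    uint8_t  flags;      /* TF_* */
} Token;

/* 4 * 4 + 2 = 18 bytes of fields; alignment pads the struct to 20. */
static_assert(sizeof(Token) == 20, "hot token should be 20 bytes");
```

Keeping line and file information out of this struct means more tokens fit in each cache line during the passes that only inspect tags and indices.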

Self-hosting

Prism compiles itself. The transpiled C output of stage 1 and stage 2 is identical — verified by the CI pipeline on every commit.

# Stage 0 — host compiler builds Prism from source
cc -O2 -o prism_stage0 prism.c

# Stage 1 — Prism compiles itself
./prism_stage0 prism.c -o prism_stage1

# Stage 2 — self-built Prism compiles itself again
./prism_stage1 prism.c -o prism_stage2

# Outputs are identical
diff <(./prism_stage1 transpile prism.c) <(./prism_stage2 transpile prism.c)

CI runs on: Linux x86_64, macOS x86_64/arm64, Windows (build-only), Linux arm64, Linux riscv64.

Binary differences between stage1 and stage2 on macOS are due to Mach-O LC_UUID metadata, not code differences. The transpiled C output is the canonical comparison.

Full specification: .github/SPEC.md