Prism Documentation
A dialect of C with defer, orelse, and automatic zero-initialization. Drop into any codebase with CC=prism make.
Install
Linux / macOS
cc prism.c -flto -s -O3 -o prism && ./prism install
Prism is a single file, prism.c. The command above compiles it with your system compiler and installs the resulting binary to /usr/local/bin/prism.
Windows (MSVC)
Open a Developer Command Prompt (or run vcvars64.bat) and build:
cl /Fe:prism.exe prism.c /O2 /D_CRT_SECURE_NO_WARNINGS /nologo
Requires Visual Studio Build Tools with the Desktop development with C++ workload.
Quick start
# Compile a file
prism foo.c -o foo
# Compile and run immediately
prism run foo.c
# See the transpiled output
prism transpile foo.c
Drop-in overlay
Prism uses a GCC-compatible interface. Any flag you'd pass to gcc or clang passes through automatically.
# Instead of:
CC=gcc make
# Use:
CC=prism make
All standard flags (-O2, -Wall, -I, -L, -l) pass through to the backend compiler unchanged. Prism consumes only its own flags.
defer
The problem: C requires manual cleanup at every exit point. Each new resource adds cleanup to every error path. Miss one and you leak.
int compile(const char *path) {
FILE *f = fopen(path, "r");
if (!f) return -1;
char *src = read_file(f);
if (!src) {
fclose(f);
return -1;
}
Token *tok = tokenize(src);
if (!tok) {
free(src);
fclose(f);
return -1;
}
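Node *ast = parse(tok);
if (!ast) {
token_free(tok);
free(src);
fclose(f);
return -1;
}
int result = emit(ast);
node_free(ast); // remember all
token_free(tok);
free(src);
fclose(f); // in right order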
return result;
}
With defer:
int compile(const char *path) {
FILE *f = fopen(path, "r");
if (!f) return -1;
defer fclose(f);
char *src = read_file(f);
if (!src) return -1;
defer free(src);
Token *tok = tokenize(src);
if (!tok) return -1;
defer token_free(tok);
Node *ast = parse(tok);
if (!ast) return -1;
defer node_free(ast);
return emit(ast);
}
Write cleanup once. It runs on every exit — return, break, continue, goto, or reaching }. Defers execute in LIFO order (last defer runs first).
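To make the LIFO ordering concrete, here is a hand-written plain-C sketch of what a function with three defers is roughly equivalent to. It is not actual Prism output; `run_with_cleanups` and `log_step` are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: appends one character to a log so the order of
 * cleanup steps can be observed. */
static void log_step(char *log, size_t cap, char step) {
    size_t n = strlen(log);
    if (n + 1 < cap) {
        log[n] = step;
        log[n + 1] = '\0';
    }
}

/* Hand-written equivalent of:
 *     defer log_step(log, cap, 'A');
 *     if (fail_at == 1) return -1;
 *     defer log_step(log, cap, 'B');
 *     if (fail_at == 2) return -1;
 *     defer log_step(log, cap, 'C');
 *     return 0;
 * Each exit runs only the defers registered so far, newest first. */
int run_with_cleanups(int fail_at, char *log, size_t cap) {
    if (fail_at == 1) {
        log_step(log, cap, 'A');
        return -1;
    }
    if (fail_at == 2) {
        log_step(log, cap, 'B');
        log_step(log, cap, 'A');
        return -1;
    }
    log_step(log, cap, 'C');
    log_step(log, cap, 'B');
    log_step(log, cap, 'A');
    return 0;
}
```

Calling `run_with_cleanups(0, log, sizeof log)` with an empty `log` leaves the cleanups recorded newest first, while an early exit records only the defers that were already active at that point.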
Edge cases
- Statement expressions ({ ... }): defers fire at the inner scope, not the outer function scope
- switch fallthrough: defers don't double-fire between cases (verified by the CFG pass)
- Nested loops: break/continue unwind the correct scope's defers only
- Computed goto: goto *ptr with active defers is a hard error; defer cleanup cannot be determined for an indirect jump target
Forbidden contexts
Defer is rejected in functions that use:
- setjmp/longjmp: non-local jumps bypass cleanup
- vfork: including the (vfork)() paren-wrapped pattern
- Inline asm: assembly may jump out of the function unpredictably
Inside a defer body, these are always errors regardless of context:
- return, goto, break, continue
- GNU statement expressions ({…}) containing any of the above
Opt-out: prism -fno-defer src.c — disables defer entirely for a file.
Zero-init
The problem: Uninitialized reads are a common source of bugs and vulnerabilities in C. -Wall only catches the obvious cases.
int sum_positive(int *arr, int n) {
int total; // could be anything
for (int i = 0; i < n; i++)
if (arr[i] > 0)
total += arr[i];
return total; // UB: total is read without ever being initialized
}
// All locals start at zero:
int x; // 0
char *ptr; // NULL
int arr[10]; // {0, 0, ...}
struct {
int a; float b;
} s; // {0, 0.0}
Before code generation, Pass 1 walks the entire preprocessed token stream at all depths to build a complete symbol table of every typedef, enum constant, parameter shadow, and VLA tag. This distinguishes size_t x; (declaration → initialize) from size_t * x; (expression → don't touch).
What gets initialized
| Declaration | Transformation |
|---|---|
| int x; | int x = 0; |
| char *p; | char *p = 0; |
| struct S s; | struct S s = {0}; |
| int arr[10]; | int arr[10] = {0}; |
| int arr[n]; (VLA) | int arr[n]; memset(arr, 0, sizeof(arr)); |
| T x; where T is a VLA typedef | T x; memset(x, 0, sizeof(x)); |
Zero-initialization only applies inside function bodies. File-scope declarations and struct/union/enum definitions are never touched.
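As an illustration, the transformations in the table can be applied by hand in plain C. This is a sketch rather than real transpiler output; `struct Pair` and `sum_zeroed_locals` are hypothetical names, and `n` is assumed to be greater than zero.

```c
#include <string.h>

struct Pair { int a; float b; };

/* Hand-applied versions of the transformations in the table.
 * Returns the sum of every value read, which is 0 when all locals
 * start zeroed. Assumes n > 0, since a zero-length VLA is undefined. */
int sum_zeroed_locals(int n) {
    int x = 0;                     /* was: int x;          */
    char *p = 0;                   /* was: char *p;        */
    struct Pair s = {0};           /* was: struct Pair s;  */
    int fixed[10] = {0};           /* was: int fixed[10];  */
    int vla[n];                    /* was: int vla[n];     */
    memset(vla, 0, sizeof(vla));   /* before C23 a VLA cannot take an initializer */

    int total = x + (p != 0) + s.a + (s.b != 0.0f);
    for (int i = 0; i < 10; i++) total += fixed[i];
    for (int i = 0; i < n; i++) total += vla[i];
    return total;
}
```

The VLA case is the only one that needs a runtime call, which is why the table lists memset for it instead of an initializer.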
Opt-out: prism -fno-zeroinit src.c — or per-variable with the raw keyword.
raw
Opt out of zero-initialization for a specific variable without disabling it globally.
raw int x; // uninitialized
raw char buf[65536]; // no memset overhead
raw struct Large data; // skip zeroing
When to use
- Large buffers immediately overwritten (read(), recv())
- Performance-critical inner loops where zeroing is measurable
- Interfacing with APIs that fully initialize the data themselves
Multi-declarator edge case: raw applies only to the first declarator. In raw int x, y;, only x opts out — y is still zero-initialized.
raw variables can be jumped over by goto safely — no initialization to bypass. Exception: raw on a VLA does not exempt it from the goto check, because jumping past a VLA bypasses implicit stack allocation regardless of initialization.
orelse
Handle failure inline — check a value and bail in one expression.
defer solved the cleanup problem. orelse solves the check-and-bail boilerplate:
FILE *f = fopen(path, "r");
if (!f) return -1;
defer fclose(f);
char *src = read_file(f);
if (!src) return -1;
defer free(src);
With orelse:
FILE *f =
fopen(path, "r") orelse return -1;
defer fclose(f);
char *src =
read_file(f) orelse return -1;
defer free(src);
orelse checks if the value is falsy (null pointer, zero). If so, the action fires and all active defers run — identical to a normal return.
Forms
| Form | Example | Emitted as |
|---|---|---|
| Control flow | x = f() orelse return -1; | x = f(); if (!x) return -1; |
| Block | x = f() orelse { log(); return -1; } | x = f(); if (!x) { log(); return -1; } |
| Fallback value | x = f() orelse "default" | x = f(); if (!x) x = "default"; |
| Bare expression | do_init() orelse return -1; | if (!(do_init())) return -1; |
| Array dimension | int buf[n orelse 1] | Temp variable hoisted before declaration |
| Decl initializer | int x = f() orelse 0; | Expanded with temp and null check |
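The "Emitted as" column can be read as ordinary C. The sketch below hand-lowers the fallback and control-flow forms following the shapes in the table; `lookup`, `name_or_default`, and `name_length` are hypothetical, and real Prism output for declaration initializers may differ (the table mentions a temporary).

```c
#include <stddef.h>

/* Hypothetical lookup that returns NULL for an unknown key. */
static const char *lookup(int key) {
    return key == 1 ? "one" : NULL;
}

/* Fallback form, hand-lowered from:
 *     const char *name = lookup(key) orelse "unknown"; */
const char *name_or_default(int key) {
    const char *name = lookup(key);
    if (!name) name = "unknown";
    return name;
}

/* Control-flow form, hand-lowered from:
 *     const char *name = lookup(key) orelse return -1; */
int name_length(int key) {
    const char *name = lookup(key);
    if (!name) return -1;
    int len = 0;
    while (name[len]) len++;
    return len;
}
```

In both cases the value is evaluated once and then tested for falsiness, which is why the check only fits functions that report failure as NULL or zero.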
Control flow examples
int *p = get_ptr() orelse return -1;
int *q = next() orelse break;
int *r = try_it() orelse continue;
int *s = find() orelse goto cleanup;
Block — arbitrary code on failure
FILE *f = fopen(path, "r") orelse {
log_error("failed to open %s", path);
return -1;
}
Fallback value
char *name = get_name() orelse "unknown";
Works with any falsy scalar
// orelse tests for zero, so it does not fit calls like open() that report failure as -1.
size_t len = strlen(name) orelse return -1; // integer: 0 is falsy
size_t n = read_data(fd, buf) orelse break; // 0 bytes = done
Limitations
orelse does not support struct or union values — it is a compile error:
struct Vec2 v = make_vec2() orelse return -1; // Error
Struct pointers work fine:
struct Vec2 *p = get_vec2() orelse return -1; // OK
typedef caveat: Prism has no type checker — only a symbol table. If a struct is hidden behind a typedef, Prism cannot detect it. The code will pass Prism and fail at the backend compiler with a less helpful error. Use struct Foo * explicitly when possible.
Side-effect protection: orelse inside VLA dimension brackets (int buf[n orelse 1]) rejects expressions with side effects — ++, --, =, volatile reads, and function calls — to prevent double evaluation.
Invalid contexts (compile errors): inside struct bodies, inside typeof, in ternary expressions, in for-init control parens, at file scope.
Opt-out: prism -fno-orelse src.c
Safety enforcement
Prism's CFG verifier runs as Phase 2A — after all semantic analysis, before a single byte of output is emitted. All violations are hard errors by default.
CFG violation table
| Violation | Default | With -fno-safety |
|---|---|---|
| Forward goto skips over defer | Error | Warning |
| Forward goto skips over uninitialized declaration | Error | Warning |
| Forward goto skips over VLA declaration | Always error | Always error |
| Backward goto enters scope containing defer | Error | Warning |
| Backward goto enters scope with uninitialized declaration | Error | Warning |
| Backward goto enters scope containing VLA declaration | Always error | Always error |
| switch/case skips defer via fallthrough | Error | Warning |
| switch/case bypasses zero-initialized declaration | Error | Warning |
VLA skips are always hard errors because jumping past a VLA bypasses implicit stack allocation — this is undefined behavior regardless of whether the variable is initialized. raw does not exempt a VLA from this check.
// Error: goto 'skip' would skip over variable declaration 'x'
void unsafe() {
goto skip;
int x;
skip:
printf("%d", x);
}
Defer in forbidden contexts
void bad() {
jmp_buf buf;
defer cleanup(); // Error: defer forbidden with setjmp
if (setjmp(buf)) return;
}
_Generic compatibility
case and default inside _Generic association lists are not treated as switch cases — Prism tracks _Generic scope separately so they don't trigger spurious CFG violations.
Downgrade to warnings: prism -fno-safety src.c — turns CFG errors into warnings for gradual adoption on existing codebases. VLA violations remain hard errors.
CLI reference
Commands
| Command | Description |
|---|---|
| prism src.c -o out | Compile (GCC-compatible) |
| prism run src.c | Transpile, compile, and run immediately |
| prism transpile src.c | Output transpiled C to stdout |
| prism install | Install to /usr/local/bin/prism |
| prism --help | Print usage |
| prism --version | Print Prism version + backend CC version |
| prism -v (no source files) | Pass all args through to backend CC |
Prism flags
These are consumed by Prism and not passed to the backend compiler.
| Flag | Effect |
|---|---|
| -fno-defer | Disable defer |
| -fno-zeroinit | Disable zero-initialization |
| -fno-orelse | Disable orelse |
| -fno-safety | Downgrade CFG errors to warnings (VLA violations remain errors) |
| -fno-line-directives | Disable #line directives in output |
| -fflatten-headers | Flatten headers into single output file |
| -fno-flatten-headers | Disable header flattening |
| --prism-cc=<compiler> | Use a specific backend compiler |
| --prism-verbose | Show commands being run |
All other flags are passed through to the backend compiler unchanged. User-supplied -x <lang> is respected and used as the pipe language instead of the default c.
Compiler detection
When the backend compiler's basename doesn't make the compiler family obvious (e.g. cc on Termux, FreeBSD, or some Linux distros), Prism probes <CC> --version to tell clang from GCC. The result is used to avoid passing unsupported flags; for example, -fpreprocessed is GCC-only.
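A probe like this comes down to classifying the text that `--version` prints. The following is a minimal sketch of such a classifier, not Prism's actual detection logic; the `Backend` enum, `classify_backend`, and the marker strings it searches for are assumptions.

```c
#include <string.h>

typedef enum { BACKEND_UNKNOWN, BACKEND_GCC, BACKEND_CLANG } Backend;

/* Classifies the captured output of `<CC> --version`.
 * The marker strings are assumptions based on typical banners. */
Backend classify_backend(const char *version_output) {
    if (!version_output) return BACKEND_UNKNOWN;
    /* Test for clang first so that a clang banner is never mistaken
     * for GCC by one of the looser GCC markers below. */
    if (strstr(version_output, "clang")) return BACKEND_CLANG;
    if (strstr(version_output, "gcc") ||
        strstr(version_output, "GCC") ||
        strstr(version_output, "Free Software Foundation"))
        return BACKEND_GCC;
    return BACKEND_UNKNOWN;
}
```

Returning a distinct unknown value lets the caller fall back to the most conservative flag set when neither family is recognized.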
Multi-file & passthrough
# Multiple source files
prism main.c utils.c -o app
# Mix with assembly
prism main.c boot.s -o kernel
# C++ files pass through untouched
prism main.c helper.cpp -o mixed
Passthrough extensions (not transpiled): .s, .S (assembly), .cc, .cpp, .cxx (C++), .mm (Objective-C++), .m (Objective-C).
Error reporting
Prism emits #line directives so compiler errors point to your original source, not the transpiled output:
# You see:
main.c:42:5: error: use of undeclared identifier 'foo'
# Not:
/tmp/prism_xyz.c:1847:5: error: use of undeclared identifier 'foo'
Disable with prism -fno-line-directives src.c — useful when debugging the transpiler output directly.
Library mode
Prism can be compiled as a library for embedding in other tools. This excludes the CLI main().
cc -DPRISM_LIB_MODE -c prism.c -o prism.o
API
PrismFeatures prism_defaults(void);
PrismResult prism_transpile_file(const char *path, PrismFeatures features);
void prism_free(PrismResult *r);
PrismFeatures holds compiler path, include paths, defines, compiler flags, force-includes, and boolean feature toggles (defer, zeroinit, orelse, line_directives, warn_safety, flatten_headers).
PrismResult returns a status code and the transpiled source string:
| Status | Meaning |
|---|---|
| PRISM_OK | Success |
| PRISM_ERR_SYNTAX | Tokenizer or parse error |
| PRISM_ERR_SEMANTIC | CFG or safety violation |
| PRISM_ERR_IO | File not found or read error |
Error recovery: In library mode, error_tok triggers longjmp instead of exit(1). All arena-allocated Pass 1 structures are reclaimed on the jump — no leaks, no dangling pointers. The context is safe to reuse for the next call.
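The general pattern described here, jumping back to a recovery point and rewinding an arena, can be sketched in self-contained C. This is not Prism's code: `Arena`, `Context`, `fail`, `analyze`, and `process` are hypothetical stand-ins chosen to show the control flow.

```c
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical bump arena: allocation advances `used`; reset rewinds it. */
typedef struct {
    unsigned char buf[1024];
    size_t used;
} Arena;

typedef struct {
    Arena arena;
    jmp_buf on_error;   /* recovery point armed by process() */
} Context;

static void *arena_alloc(Arena *a, size_t n) {
    if (n > sizeof(a->buf) - a->used) return NULL;
    void *p = a->buf + a->used;
    a->used += n;
    return p;
}

/* Stand-in for an error reporter: jumps back instead of calling exit(1). */
static void fail(Context *ctx) {
    longjmp(ctx->on_error, 1);
}

/* Hypothetical worker: allocates scratch memory, then fails on bad input. */
static void analyze(Context *ctx, int input) {
    (void)arena_alloc(&ctx->arena, 64);
    if (input < 0) fail(ctx);
}

/* Returns 0 on success, -1 on error. Either way the arena is rewound,
 * so the same context can be used for the next call. */
int process(Context *ctx, int input) {
    if (setjmp(ctx->on_error) != 0) {
        ctx->arena.used = 0;   /* reclaim everything allocated before the jump */
        return -1;
    }
    analyze(ctx, input);
    ctx->arena.used = 0;
    return 0;
}
```

Because every allocation lives in the arena, a single rewind after the jump is enough to release whatever the failed call had built up.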
Architecture
Prism processes C in two passes. Pass 1 performs full semantic analysis and catches all errors. Pass 2 is a near-pure code generator that reads Pass 1's immutable artifacts — no type table mutations, no speculative token walking, no mid-emit errors.
Invariants
| # | Invariant |
|---|---|
| 1 | Immutable symbol table. After Phase 1B completes, the typedef table is frozen. Pass 2 performs zero mutations to the typedef table, scope tree, annotation array, or function metadata. |
| 2 | All errors before emission. Every error_tok call from semantic analysis fires during Pass 1 or Phase 2A. Pass 2 contains no safety-check errors. |
| 3 | O(N) CFG verification. p1_verify_cfg is guaranteed linear in the number of per-function entry items. No O(N²) pairwise scans. |
| 4 | Delimiter matching completeness. Every (, [, { has a matching index pointing to its closing pair, computed during tokenization, used pervasively in both passes. |
| 5 | Self-host fixed point. Stage 1 and Stage 2 transpiled C output is identical (verified by CI). |
| 6 | Arena safety. All arena-allocated Pass 1 structures are reclaimed on longjmp error recovery. No dangling pointers after reset. |
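Invariant 4 relies on a standard single-pass, stack-based matching technique. The sketch below shows that technique on raw characters rather than on Prism's token pool; `match_delimiters` is a hypothetical function, and it uses -1 for "no match" where the token table described later uses 0.

```c
#include <stddef.h>

/* Fills match[i] with the index of the partner of the delimiter at text[i],
 * or -1 for non-delimiters and unmatched delimiters. Returns 0 when every
 * delimiter is balanced, -1 otherwise. One pass with an explicit stack. */
int match_delimiters(const char *text, int *match, size_t len) {
    int stack[256];
    int depth = 0;
    int status = 0;
    for (size_t i = 0; i < len; i++) match[i] = -1;
    for (size_t i = 0; i < len; i++) {
        char c = text[i];
        if (c == '(' || c == '[' || c == '{') {
            if (depth == 256) return -1;   /* nesting too deep for this sketch */
            stack[depth++] = (int)i;
        } else if (c == ')' || c == ']' || c == '}') {
            char want = c == ')' ? '(' : c == ']' ? '[' : '{';
            if (depth == 0 || text[stack[depth - 1]] != want) {
                status = -1;               /* stray or mismatched closer */
                continue;
            }
            int open = stack[--depth];
            match[open] = (int)i;          /* opener points at closer */
            match[i] = open;               /* closer points back at opener */
        }
    }
    return depth == 0 ? status : -1;
}
```

Storing the partner index on both ends lets later passes skip over a whole bracketed region in constant time instead of rescanning it.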
Pass overview
| Phase | What it does |
|---|---|
| Pass 0 | Tokenizer — flat pool of 20-byte tokens. Delimiter matching, keyword tagging (32 bitmask flags), per-function setjmp/vfork/asm taint graph. ~70–80% of tokens take a fast path that never touches the annotation array. |
| Pass 1A | Scope Tree — walk all tokens, assign scope IDs, build parent chain, classify each { (loop / switch / conditional / function body / struct / statement expression) |
| Pass 1B | Type Registration — full-depth typedef, enum, VLA tag registration at all scopes. Symbol table is frozen after this point — no mutations in Pass 2. |
| Pass 1C | Shadow Table — record every variable that shadows a typedef, with scope ID and token index for temporally-correct range-based lookup |
| Pass 1D | CFG Collection — per-function arrays of labels, gotos, defers, declarations, switch/case entries |
| Pass 1E | Return Type Capture — record each function's return type token range and void/setjmp/vfork/asm flags |
| Pass 1F | Defer Validation — reject forbidden patterns inside defer bodies: return, goto, break, continue, nested statement expressions |
| Pass 1G | Braceless Tagging — mark control keywords whose braceless body contains Prism keywords needing brace injection |
| Pass 1H | Orelse Classification — classify orelse in array dimension brackets and declaration initializers; reject at file scope |
| Pass 2A | CFG Verification — O(N) snapshot-and-sweep: verify every goto→label and switch→case pair against defers and declarations. VLA skips are always hard errors regardless of -fno-safety. |
| Pass 2 | Code Generation — emit transformed C. Reads immutable scope tree, typedef table, shadow table. No type mutations, no safety checks, no token speculation. |
Token structure
Tokens are 20 bytes on the hot path. Source location data (line_no, file_idx, loc_offset) is stored in a separate cold array to keep the hot pool cache-friendly.
| Field | Type | Description |
|---|---|---|
| tag | uint32_t | Bitmask of 32 TT_* keyword/semantic flags |
| next_idx | uint32_t | Pool index of next token |
| match_idx | uint32_t | Pool index of matching delimiter (0 = none) |
| len | uint32_t | Byte length of token text |
| kind | uint8_t | TK_IDENT, TK_KEYWORD, TK_PUNCT, TK_STR, TK_NUM, TK_PREP_DIR, TK_EOF |
| flags | uint8_t | TF_AT_BOL, TF_HAS_SPACE, TF_IS_FLOAT, TF_OPEN, TF_CLOSE, TF_C23_ATTR, TF_RAW, TF_SIZEOF |
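For reference, the fields in the table add up to the stated 20 bytes once alignment is taken into account. The struct below is a sketch that assumes the fields are laid out in table order on a platform where uint32_t has 4-byte alignment; the type name `Token` is also an assumption.

```c
#include <stdint.h>

/* Field names and types follow the table; the struct name and field order
 * are assumptions of this sketch. */
typedef struct {
    uint32_t tag;        /* bitmask of keyword/semantic flags */
    uint32_t next_idx;   /* pool index of next token */
    uint32_t match_idx;  /* pool index of matching delimiter, 0 = none */
    uint32_t len;        /* byte length of token text */
    uint8_t  kind;       /* token kind */
    uint8_t  flags;      /* per-token flag bits */
} Token;

/* Four uint32_t fields (16 bytes) plus two uint8_t fields (2 bytes) is 18,
 * which 4-byte alignment rounds up to 20. */
_Static_assert(sizeof(Token) == 20, "Token should occupy 20 bytes");
```

The two trailing padding bytes are what make a flat array of these tokens stride evenly at 20 bytes per element.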
Self-hosting
Prism compiles itself. The transpiled C output of stage 1 and stage 2 is identical — verified by the CI pipeline on every commit.
# Stage 0 — host compiler builds Prism from source
cc -O2 -o prism_stage0 prism.c
# Stage 1 — Prism compiles itself
./prism_stage0 prism.c -o prism_stage1
# Stage 2 — self-built Prism compiles itself again
./prism_stage1 prism.c -o prism_stage2
# Outputs are identical
diff <(./prism_stage1 transpile prism.c) <(./prism_stage2 transpile prism.c)
CI runs on: Linux x86_64, macOS x86_64/arm64, Windows (build-only), Linux arm64, Linux riscv64.
Binary differences between stage1 and stage2 on macOS are due to Mach-O LC_UUID metadata, not code differences. The transpiled C output is the canonical comparison.
Full specification: .github/SPEC.md