KernRift is a self-hosted bare-metal systems programming language and compiler designed for kernel-first development. It compiles source code into native x86_64 and ARM64 binaries for Linux, Windows, macOS, and Android, producing ELF, PE, Mach-O, and fat binaries as output.
Key Features:
Self-hosting: The compiler is written entirely in KernRift and compiles itself to a fixed point without relying on Rust, C, or LLVM.
SSA IR Backend: Target-independent intermediate representation with liveness analysis, graph-coloring register allocation, and optimization passes for efficient code generation.
Cross-platform Support: Builds executables for x86_64 and ARM64 across Linux, Windows, macOS, and Android from a single source tree.
Fat Binaries: Default output includes BCJ+LZ-Rift-compressed fat binaries with 8 platform slices, enabling seamless execution on any supported target.
Zero Runtime Dependencies: Static executables that require no external libraries or dynamic linkers.
Audience & Benefit:
Ideal for systems programmers and kernel developers seeking a self-contained toolchain to build efficient, cross-platform applications. KernRift provides a modern alternative for low-level development with built-in support for fat binaries and direct hardware access.
README
KernRift
KernRift is a bare-metal systems programming language and compiler created by Pantelis Christou.
A self-hosted systems language compiler for kernel-first development. KernRift compiles itself — no Rust, no C, no LLVM, no external toolchain. It produces native executables for x86_64 and AArch64 on Linux, Windows, macOS, and Android, with BCJ+LZ-Rift-compressed fat binaries as the default output (8 platform slices per .krbo). The kr runner executes .krbo fat binaries on any supported platform. The compiler self-hosts on all 8 targets and is verified via CI on every push. The compiler ships with an SSA-based IR backend with liveness analysis, graph-coloring register allocation, an AST-level function inliner, Briggs/George copy coalescing, LICM, constant folding, DCE, and CSE — producing native machine code for all targets directly from the IR, no assembler in the loop.
v2.8.26 highlights (full details in CHANGELOG.md):
Language ergonomics. Ternary cond ? then : else, let type inference (let n = a + b), match as an expression with bare-statement arms, continue inside for, loop { }, inclusive ranges 0..=n, and defer { }.
Diagnostics. Parser error recovery (many syntax errors per run, not just the first), file:line:col headers with a source line and ^~~~ caret, and "did you mean?" suggestions on undeclared names. The type checker is now default-on and fatal.
Codegen. Power-of-two / and % strength-reduce to shr/and; every AArch64 branch displacement is range-checked instead of silently masked.
A large correctness batch across the IR and legacy backends, plus stdlib fixes (map growth, read_file, exp of negatives) — see the changelog.
Briggs/George copy coalescing, on by default. The graph-colouring register allocator collapses vN = copy vM pairs whose live ranges don't interfere, so the redundant mov rN, rN is dropped at emit time. Briggs is the conservative gate (refuses if ≥ K neighbours of the merged class would have degree ≥ K); George is a less-conservative fallback gated to K ≥ 8. Effect on krc.kr self-compile size: x86_64 −72 B, arm64 −1592 B. --no-coalesce disables it.
AST-level function inliner. Pure single-expression callees (fn add(a, b) -> u64 { return a + b }) are folded into their call sites; DCE then drops the unused originals. --emit=obj / --emit=asm / --emit=ir keep every top-level fn live so symbols still appear in the linker table / asm listing / IR dump.
--help rewritten to cover every flag the parser handles, grouped by output / code-gen / living-compiler / info. Previously --legacy, --coalesce, --O0, and the entire lc proposal surface were undocumented.
IR ARM64 compile_fat fixed (R1). The v2.8.7-era miscompile that forced a --legacy --arch=arm64 shipping recipe is gone; ARM64 slices in fat binaries now go through IR by default. --legacy remains as an explicit opt-out, not a silent fallback.
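The power-of-two strength reduction mentioned in the Codegen highlight rests on a simple identity for unsigned values. The sketch below is written in Python only to illustrate the arithmetic; the function names are hypothetical and are not taken from the KernRift sources, and signed division would need an extra rounding adjustment that is not shown here.

```python
def is_power_of_two(n: int) -> bool:
    """True when n has exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


def reduce_div(x: int, d: int) -> int:
    """Unsigned x / d, using a right shift when d is a power of two."""
    if is_power_of_two(d):
        shift = d.bit_length() - 1  # log2(d) for a power of two
        return x >> shift
    return x // d  # general case: a real divide is still needed


def reduce_mod(x: int, d: int) -> int:
    """Unsigned x % d, using a bit mask when d is a power of two."""
    if is_power_of_two(d):
        return x & (d - 1)
    return x % d
```

For example, `reduce_div(x, 8)` becomes `x >> 3` and `reduce_mod(x, 8)` becomes `x & 7`, which is the shr/and pair the highlight refers to.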
Features
Self-hosting — the compiler compiles itself to a fixed point. No Rust, no C, no LLVM in the build.
SSA IR backend — target-independent intermediate representation with liveness analysis, graph-coloring register allocation with Briggs/George copy coalescing, an AST-level function inliner, LICM, constant folding, DCE, and CSE. Emits x86_64 and AArch64 machine code directly — no assembler, no linker in the loop. --legacy falls back to the original direct codegen.
Cross-platform — Linux, Windows, macOS, Android on x86_64 and ARM64 from a single source tree.
Floating-point — f32 and f64 types with full arithmetic, comparisons, conversions, and a math library (sin, cos, exp, log, pow, sqrt, fmt_f64). f16 for storage. Hardware sqrt, software trig/exp/log.
Multi-return — return (a, b) and (u64 x, u64 y) = call() for 2-tuple destructuring.
Fat binaries — default output is a .krbo with 8 platform slices (BCJ+LZ-Rift compressed). The kr runner extracts and executes the right slice at startup.
Zero dependencies at runtime — static executables, no libc, no dynamic linker.
Kernel-first primitives — device blocks for typed MMIO, load/store/vload/vstore builtins for clean pointer access, inline assembly with a large instruction table, signed comparisons, bitfield ops, atomic operations, --freestanding mode.
Clean pointer syntax — store32(addr, val) and load64(addr) instead of the verbose unsafe { *(addr as uint32) = val } form.
Slice parameters — fn foo([u8] data) with data.len for buffer-processing functions.
Fixed arrays — u8[256] buf locally, static u8[4096] page at module level, and Point[10] pts with pts[i].field syntax for struct arrays.
Volatile blocks — mfence on x86_64, DSB SY on ARM64 — completion barrier, not just ordering.
ARM64 system registers — MSR/MRS access in inline asm (20+ registers including SCTLR_EL1, VBAR_EL1, MPIDR_EL1).
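The slice selection that the kr runner performs for a fat binary can be pictured roughly as a lookup keyed by operating system and architecture. The Python sketch below is an illustration under stated assumptions: the slice ordering, the alias tables, and the `pick_slice` name are all hypothetical, and the real .krbo layout is not described in this document.

```python
# Hypothetical slice table: (os, arch) -> slice index.
# The actual ordering inside a .krbo file is an assumption here.
SLICES = {
    ("linux", "x86_64"): 0,
    ("linux", "arm64"): 1,
    ("windows", "x86_64"): 2,
    ("windows", "arm64"): 3,
    ("macos", "x86_64"): 4,
    ("macos", "arm64"): 5,
    ("android", "x86_64"): 6,
    ("android", "arm64"): 7,
}

# Common spellings reported by operating systems, mapped to one name each.
OS_ALIASES = {"linux": "linux", "windows": "windows", "darwin": "macos"}
ARCH_ALIASES = {
    "x86_64": "x86_64",
    "amd64": "x86_64",
    "arm64": "arm64",
    "aarch64": "arm64",
}


def pick_slice(system: str, machine: str, is_android: bool = False) -> int:
    """Return the index of the slice matching the host, or raise LookupError."""
    os_name = "android" if is_android else OS_ALIASES.get(system.lower())
    arch = ARCH_ALIASES.get(machine.lower())
    key = (os_name, arch)
    if key not in SLICES:
        raise LookupError(f"no slice for {system}/{machine}")
    return SLICES[key]
```

On a host, the inputs could come from `platform.system()` and `platform.machine()`; Android is passed as an explicit flag because it often reports itself as Linux.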
All 8 targets self-compile. CI verifies bootstrap fixed point (krc3 == krc4) and runs 587 tests on every push. Numbers below are on an AMD Ryzen 9 7900X — see benchmarks/BENCHMARKS.md for the complete run including gcc / rustc comparisons.
| Target | Legacy codegen | IR codegen (default) | IR vs legacy |
|---|---|---|---|
| linux x86_64 ELF | ~290 ms / 1.20 MB | ~1 135 ms / 1.15 MB | −4 % size |
| linux arm64 ELF | ~290 ms / 1.04 MB | ~1 130 ms / 0.83 MB | −20 % size |
| Fat binary (all 8) | n/a | ~9.2 s / 3.84 MB | (IR all 8 slices) |
The IR path now produces smaller binaries than legacy on both architectures. Two things landed since v2.8.8 to flip the size story: a partial used-callee-save prologue + cross-register spill-reload peephole (v2.8.21 RA work), and v2.8.24's Briggs/George copy coalescer. The function inliner (v2.8.24) also folds pure single-expression callees so DCE can drop the originals.
--legacy is now an explicit opt-out, not a fallback. --ir forces IR (the default). --no-coalesce turns off the copy coalescer.
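The conservative Briggs gate used by the copy coalescer can be sketched as a check on the interference graph: two copy-related values may be merged only if they do not interfere and the merged node would have fewer than K neighbours of degree ≥ K. The Python below is a minimal sketch of that textbook test, assuming a graph stored as a dict of neighbour sets; the `briggs_ok` name and the data layout are illustrative, not KernRift's own.

```python
def briggs_ok(graph: dict, a: str, b: str, k: int) -> bool:
    """Conservative coalescing test for merging nodes a and b with k registers.

    graph maps each node to the set of nodes it interferes with.
    """
    if b in graph[a]:
        return False  # live ranges interfere, so the copy cannot be removed

    # Neighbours the merged node would have.
    merged_neighbours = (graph[a] | graph[b]) - {a, b}

    significant = 0
    for n in merged_neighbours:
        degree = len(graph[n])
        # A neighbour of both a and b loses one edge once they are merged.
        if a in graph[n] and b in graph[n]:
            degree -= 1
        if degree >= k:
            significant += 1

    # Merging is safe when fewer than k neighbours have significant degree.
    return significant < k
```

With `g = {"a": {"c"}, "b": {"d"}, "c": {"a", "d"}, "d": {"b", "c"}}`, `briggs_ok(g, "a", "b", 2)` refuses the merge because both neighbours have degree 2, while the same call with `k = 3` accepts it.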
Install
Linux / macOS / Android (Termux) — install script:
curl -sSf https://raw.githubusercontent.com/Pantelis23/KernRift/main/install.sh | sh
cargo install --git https://github.com/Pantelis23/KernRift-bootstrap kernriftc
make build && make install
This installs krc and kr to ~/.local/bin/ and the standard library to ~/.local/share/kernrift/. On Windows, the installer puts krc.exe and kr.exe into %LOCALAPPDATA%\KernRift\.
Language
import "std/string.kr"
import "std/io.kr"

struct Point {
    u64 x
    u64 y
}

fn Point.sum(Point self) -> u64 {
    return self.x + self.y
}

fn fib(u64 n) -> u64 {
    if n <= 1 { return n }
    return fib(n - 1) + fib(n - 2)
}

fn main() {
    Point p
    p.x = fib(10)
    p.y = 42
    // int_to_str returns a pointer: print it with print_str/println_str, not println
    u64 s = int_to_str(p.sum())
    print_str("sum = ")
    println_str(s)
    exit(0)
}

import "std/math_float.kr"

fn main() {
    f64 x = int_to_f64(2)
    println_str(fmt_f64(sqrt(x), 6)) // "1.414213"
    (u64 q, u64 r) = divmod(17, 5)
    println(q) // 3
    exit(0)
}

fn divmod(u64 a, u64 b) -> u64 {
    return (a / b, a % b)
}
Types: u8/u16/u32/u64, i8/i16/i32/i64, f16/f32/f64 (long forms uint8..int64 also work), structs, enums, fixed-size arrays, device blocks. Control: if/else, while, for..in, break/continue, match, recursion. Functions with method syntax (fn Struct.method), slice parameters (fn foo([u8] data) { u64 n = data.len; ... }), imports with recursive resolution.
Import standard-library modules with import "std/string.kr" and the like. The compiler searches ~/.local/share/kernrift/ automatically.
Editor Support
A VS Code extension (v0.2.3) is available on the VS Code Marketplace:
Syntax highlighting (TextMate grammar)
LSP server with diagnostics (krc check), completions, hover docs, and go-to-definition
Examples
See the examples/ directory for runnable programs covering every feature — pointers, slices, struct arrays, device blocks, recursion, stdin input, and more.
Architecture
~45 700 lines of KernRift across 19 source files + 18 stdlib modules (227 K tokens, 142 K AST nodes on self-compile). Self-compiles to a 1.15 MB x86_64 native binary in ~1.1 s (IR, default), a 0.83 MB ARM64 binary, or an 8-slice fat binary (BCJ + LZ-Rift compression) in ~9.2 s on an AMD Ryzen 9 7900X. 448 tests pass, bootstrap fixed point verified on all 8 targets — Linux, macOS, Windows, and Android on both x86_64 and ARM64. See benchmarks/BENCHMARKS.md for micro-benchmarks vs gcc / rustc and peak-memory numbers.
| File | Purpose |
|---|---|
| lexer.kr | Tokenizer (90+ kinds) |
| parser.kr | Recursive descent + Pratt precedence |
| ir.kr | SSA IR + x86_64 emitter (Linux / macOS / Windows / Android), liveness, graph-colour RA, Briggs/George coalescer, LICM, CF/DCE/CSE |
| ir_aarch64.kr | AArch64 emitter fed from the same IR |
| inliner.kr | AST-level pass that folds pure single-expression callees into call sites |
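The idea behind folding pure single-expression callees into their call sites can be shown on a toy expression tree. The Python sketch below is not the inliner.kr implementation: the tuple-based AST, the `funcs` table, and all function names are assumptions made for illustration, and it only inlines callees whose body contains no further calls. It also assumes arguments are side-effect free, since an argument is duplicated if its parameter appears more than once.

```python
# Toy AST: ("num", 3), ("var", "x"), ("add", lhs, rhs), ("call", name, [args])

def substitute(expr, env):
    """Replace parameter names in a callee body with the argument expressions."""
    kind = expr[0]
    if kind == "var":
        return env.get(expr[1], expr)
    if kind == "num":
        return expr
    if kind == "call":
        return ("call", expr[1], [substitute(a, env) for a in expr[2]])
    return (kind, substitute(expr[1], env), substitute(expr[2], env))


def contains_call(expr):
    """True if the expression contains any call node."""
    kind = expr[0]
    if kind == "call":
        return True
    if kind in ("var", "num"):
        return False
    return contains_call(expr[1]) or contains_call(expr[2])


def inline(expr, funcs):
    """Fold calls to leaf single-expression functions into the caller."""
    kind = expr[0]
    if kind in ("var", "num"):
        return expr
    if kind == "call":
        args = [inline(a, funcs) for a in expr[2]]
        callee = funcs.get(expr[1])
        if callee is not None and not contains_call(callee[1]):
            params, body = callee
            return substitute(body, dict(zip(params, args)))
        return ("call", expr[1], args)  # unknown or non-leaf callee: keep the call
    return (kind, inline(expr[1], funcs), inline(expr[2], funcs))
```

Given `funcs = {"add": (["a", "b"], ("add", ("var", "a"), ("var", "b")))}`, the call `add(1, add(x, 2))` is rewritten to the plain expression `1 + (x + 2)`, after which a dead-code pass could drop the now-unused `add`.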
A released krc binary compiles the current source into the next krc. No Rust, no C, no LLVM involved. CI verifies the fixed point on every push across all 8 platform targets.
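A bootstrap fixed-point check amounts to confirming that two successive compiler stages are byte-identical. The Python sketch below shows one way such a comparison could be scripted; the helper names and the stage file paths are hypothetical, and the actual CI scripts are not shown in this document.

```python
import hashlib
from pathlib import Path


def sha256_of(path) -> str:
    """Hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def is_fixed_point(stage_a, stage_b) -> bool:
    """True when two compiler stages are byte-for-byte identical."""
    return sha256_of(stage_a) == sha256_of(stage_b)
```

A CI step could then call `is_fixed_point("krc3", "krc4")` (paths assumed) and fail the build when it returns `False`.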