Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Linux ABI Support

This document describes how Ziren supports the Linux ABI inside its MIPS zkVM, covering execution, proving, and cross-shard verification.

Overview

Ziren runs MIPS guest programs compiled against a Linux userspace ABI. The guest issues SYSCALL instructions just like a real MIPS/Linux process. The zkVM intercepts these and either:

  1. Executes the syscall in the host executor (producing a concrete result), then
  2. Proves that the result is correct via AIR constraints in the SysLinuxChip.

The guest never touches real kernel code. The zkVM emulates a minimal Linux kernel that supports memory management, basic I/O, and process lifecycle — enough to run programs compiled with standard C/Go/Rust toolchains targeting MIPS.

Architecture

                        Guest Program (MIPS binary)
                                  |
                          SYSCALL (V0 = syscall number)
                                  |
                                  v
                  +-------------------------------+
                  |           Executor            |
                  |                               |
                  |  execute_operation()          |
                  |    -> SyscallCode::from_u32() |
                  |    -> get_syscall()           |
                  |    -> handler.execute()       |
                  |    -> emit_syscall_event()    |
                  +-------------------------------+
                          |               |
                    LinuxEvent       LinuxEvent
                          |               |
                          v               v
           +--------------------+     +------------------------+
           |    Core Shard      |     |   Precompile Shard     |
           |                    |     |                        |
           | SyscallInstrsChip  |     | SyscallChip(Precompile)|
           |   send_syscall()   |     |   receive_syscall()    |
           |        |           |     |         |              |
           | SyscallChip(Core)  |     |    SysLinuxChip        |
           |  receive_syscall() |     |    (81 columns)        |
           +--------|-----------+     +---------|------------- +
                    |                           |
                    |   global lookup message:  |
                    |   [shard, clk,            |
                    |    syscall_id,            |
                    |    arg1, arg2,            |
                    |    result_lo, result_hi]  |
                    |                           |
                    v                           v
                +--------------------------------------+
                |            GlobalChip                |
                |  Verify send/receive multiplicities  |
                |  Ensure result consistency           |
                +--------------------------------------+

Register Convention (MIPS ABI)

All registers are 32-bit (u32). In the AIR, each is represented as Word<T> = 4 x u8 (little-endian, each byte range-checked to [0, 255]).

RegisterWidthRole
V032-bitSyscall number (input) / return value (output)
A032-bitFirst argument
A132-bitSecond argument
A232-bitThird argument (read via memory when needed)
A332-bitError code output (0 = success, 9 = EBADF)

Supported Linux Syscalls

SYS_MMAP (4210) / SYS_MMAP2 (4090) — Memory Mapping

Used by the guest allocator to request memory pages.

ArgWidthSemantics
a032-bitRequested address. 0 = allocate from heap.
a132-bitSize in bytes. Rounded up to page boundary.
return v032-bitAllocated address (heap pointer when a0 == 0, or a0 itself).
output A332-bitAlways 0x00000000.

Execution logic:

  • a0 == 0: returns current heap pointer, increments heap by page-aligned size.
  • a0 != 0: returns a0 (reuse existing mapping).

Page alignment in the AIR:

The 32-bit size a1 is decomposed at the byte level to separate the 12-bit page offset from the 20-bit page-aligned upper address:

a1 = [byte0 : 8-bit] [byte1 : 8-bit] [byte2 : 8-bit] [byte3 : 8-bit]
                       ├── lo nibble: 4-bit (boolean bit decomposition)
                       └── hi nibble: 4-bit (boolean bit decomposition)

page_offset   = byte0 + lo_nibble * 256          (12-bit, range [0, 4095])
upper_address = hi_nibble * 4096 + byte2 * 65536 + byte3 * 16777216

The mmap_size is constrained byte-by-byte (not via field reduce()) to avoid KoalaBear prime collisions:

mmap_size[0] = 0
mmap_size[1] = hi_nibble * 16 + 16 * not_aligned - carry[0] * 256
mmap_size[2] = a1[2] + carry[0] - carry[1] * 256
mmap_size[3] = a1[3] + carry[1]

Where not_aligned = 1 when page_offset != 0 (round up to next page). The carry bits handle byte overflow from the +0x1000 addition. Page alignment (low 12 bits = 0) is structural: byte0 = 0 and every term in byte1 is a multiple of 16.

Heap update uses bytewise AddOperation (4 x u8 + 3 carry bits): new_heap = old_heap + mmap_size.

SYS_BRK (4045) — Program Break

ArgWidthSemantics
a032-bitNew break address.
a132-bitUnused.
return v032-bitmax(a0, current_brk).
output A332-bitAlways 0x00000000.

AIR uses GtColsBytes (bytewise greater-than with complementary LTU lookups) to compare a0 against the BRK register.

SYS_CLONE (4120) — Process Clone (Simulated)

ArgWidthSemantics
a032-bitClone flags (ignored).
a132-bitChild stack (ignored).
return v032-bitAlways 0x00000001.
output A332-bitAlways 0x00000000.

Threading is not implemented. The syscall always returns 1 (simulated parent PID).

SYS_EXIT_GROUP (4246) — Terminate Execution

ArgWidthSemantics
a032-bitExit code.
a132-bitUnused.
return v032-bitAlways 0x00000000.
output A332-bitAlways 0x00000000.

Sets next_pc = 0 and records the exit code. Equivalent to HALT.

SYS_READ (4003) — Read from File Descriptor

ArgWidthSemantics
a032-bitFile descriptor. Only 0 (stdin) is valid.
a132-bitBuffer address.
return v032-bitBytes read, or 0xFFFFFFFF on error.
output A332-bit0x00000000 on success, 0x00000009 (EBADF) on invalid fd.

Only stdin (fd 0) is supported. All other fds return -1 with EBADF.

SYS_WRITE (4004) — Write to File Descriptor

ArgWidthSemantics
a032-bitFile descriptor.
a132-bitBuffer address.
A2 (implicit)32-bitByte count (read from A2 register via memory).
return v032-bitBytes written (= A2 value).
output A332-bitAlways 0x00000000.

AIR constrains inorout.value == inorout.prev_value (read-only guard on A2 memory access).

SYS_FCNTL (4055) — File Control

ArgWidthSemantics
a032-bitFile descriptor (0/1/2 valid).
a132-bitCommand: 1 = F_GETFD, 3 = F_GETFL.
return v032-bitFlags/fd value, or 0xFFFFFFFF on error.
output A332-bit0x00000000 on success, 0x00000009 on error.

Full case matrix:

cmd (a1)fd (a0)result (v0)error (A3)
1 (F_GETFD)00x000000000x00000000
1 (F_GETFD)10x000000010x00000000
1 (F_GETFD)20x000000020x00000000
1 (F_GETFD)other0xFFFFFFFF0x00000009
3 (F_GETFL)00x00000000 (O_RDONLY)0x00000000
3 (F_GETFL)10x00000001 (O_WRONLY)0x00000000
3 (F_GETFL)20x00000001 (O_WRONLY)0x00000000
3 (F_GETFL)other0xFFFFFFFF0x00000009
otherany0xFFFFFFFF0x00000009

AIR uses bidirectional IsZeroOperation decoders on a0 (3 decoders) and a1 (2 decoders) with exhaustive branch constraints.

NOP Syscalls — No Operation

ArgWidthSemantics
a032-bitIgnored.
a132-bitIgnored.
return v032-bitAlways 0x00000000.
output A332-bitAlways 0x00000000.
SyscallNumber
SYS_OPEN4005
SYS_CLOSE4006
SYS_MUNMAP4091
SYS_NANOSLEEP4166
SYS_RT_SIGACTION4194
SYS_RT_SIGPROCMASK4195
SYS_SIGALTSTACK4206
SYS_FSTAT644215
SYS_MADVISE4218
SYS_GETTID4222
SYS_SCHED_GETAFFINITY4240
SYS_CLOCK_GETTIME4263
SYS_OPENAT4288
SYS_PRLIMIT644338

Any unrecognized Linux syscall ID also falls into the NOP path.

Cross-Shard Verification

Linux syscalls execute across two shards:

  • Core shard: CPU decodes SYSCALL, writes (shard, clk, syscall_id, arg1, arg2) to SyscallChip(Core).
  • Precompile shard: SysLinuxChip computes the result, receives the same tuple from SyscallChip(Precompile).

Both SyscallChip instances send two global lookup messages:

Global #1 (Syscall kind):
  [shard, clk, syscall_id, arg1_lo, arg1_hi, arg2_lo, arg2_hi]

Global #2 (SyscallResult kind):
  [shard, clk, syscall_id, result_lo, result_hi, 0, 0]

The GlobalChip verifies that send/receive multiplicities match for both messages. This ensures:

  • Argument integrity: The Core and Precompile shards process the same 32-bit arguments (half-word packed to prevent reduce() collisions modulo the KoalaBear prime).
  • Result consistency: Both shards agree on the syscall return value.

Arguments and results are packed as half-words (lo = byte0 + byte1 * 256, hi = byte2 + byte3 * 256), each U16Range-checked to [0, 65535]. This decomposition is injective for 32-bit values, unlike reduce() which can collide.

Linux vs Non-Linux Syscalls

Ziren supports two categories of syscalls, distinguished by the encoding of the syscall code in register V0:

PropertyLinux SyscallsNon-Linux (Precompile) Syscalls
Syscall IDMIPS ABI numbers (4003, 4045, 4210, ...)Ziren-defined codes (0x00000005, 0x01010006, ...)
ID encodingByte 1 of V0 is non-zeroByte 1 of V0 is zero
Proving chipSysLinuxChipDedicated chip per precompile (SHA256, Poseidon2, etc.)
Argument handlingFull byte-level: a0/a1 as Word<T> (4 x u8)Reduced field element: arg1 = reduce(op_b)
Result handlingresult as Word<T>, A3 error codeReturn value in V0 (precompile-specific)
Global linkageHalf-word packed args + result via two global lookupsHalf-word packed args via one global lookup
Local linkagesend_syscall_result_packed with is_linux multiplicitysend_syscall / receive_syscall (reduced args only)

How detection works

The SyscallInstrsChip examines byte 1 of the syscall code (prev_a_value[1]):

  • byte[1] != 0Linux syscall — routes to SysLinuxChip via lookup
  • byte[1] == 0Precompile syscall — routes to the precompile's dedicated chip

This is enforced bidirectionally via an IsZeroOperation, preventing a malicious prover from misrouting a precompile call into the Linux path or vice versa.

Argument representation across the pipeline

SyscallInstrsChip (Core shard):
  send_syscall(reduce(op_b), reduce(op_c))          -- reduced, for both types
  send_syscall_result(op_a_word, op_b_word, op_c_word) -- byte-level, linux only

SyscallChip (bridge):
  arg1 = arg1_lo + arg1_hi * 65536                   -- derived inline (not stored)
  arg1_lo, arg1_hi: U16Range-checked                 -- for ALL syscalls
  Global #1: [shard, clk, id, a1_lo, a1_hi, a2_lo, a2_hi]  -- collision-resistant
  Global #2: [shard, clk, id, result_lo, result_hi, 0, 0]   -- result linkage

SysLinuxChip (Precompile shard, linux only):
  receive_syscall_result(result_word, a0_word, a1_word)  -- byte-level constraints
  Constrains result based on a0/a1 byte values

Other precompile chips (Precompile shard, non-linux):
  receive_syscall(reduce(arg1), reduce(arg2))            -- reduced args only
  Constrains output based on memory at arg addresses

The key distinction: linux syscalls need byte-level argument constraints (e.g., MMAP checks page alignment of a1 bytes, FCNTL checks a0 == 0/1/2 as integers). Non-linux precompiles typically use arguments as memory pointers and operate on the data at those addresses, so reduced field elements suffice.

AIR Soundness Properties

The SysLinuxChip enforces these key properties (all proven bidirectionally):

  1. Syscall routing: Each Linux syscall ID maps to exactly one handler branch. Bidirectional IsZeroOperation decoders prevent misrouting (e.g., CLONE cannot be routed to NOP).

  2. Argument decoding: a0 == 0/1/2 and a1 == 1/3 flags are bidirectional — when the argument matches a known value, the flag MUST be set.

  3. Result correctness: Every branch constrains both result (V0) and output (A3) to specific values matching the executor semantics.

  4. Memory consistency: Read-only memory accesses (BRK read, A2 read) enforce value == prev_value. Write accesses (HEAP update) use bytewise AddOperation.

  5. Page alignment: MMAP size is constrained byte-by-byte. Low 12 bits of mmap_size are structurally zero (byte0 = 0, byte1 is always a multiple of 16). No field reduce() is used.

  6. Cross-shard linkage: Two global lookups per syscall — one for collision-resistant argument matching (half-word packed), one for result consistency. U16Range checks on all half-words ensure canonical decomposition for both linux and non-linux syscalls.