Interaction Trees in Zig and simple benchmarks

2026-05-15 21:41:19 -05:00
parent e3dcf5edd7
commit 8d5e76db1c
17 changed files with 2179 additions and 81 deletions

docs/zig-io.md Normal file

@@ -0,0 +1,193 @@
# Zig Interaction-Tree IO Runtime Plan
## Goal
Port the Haskell `IODriver` interaction-tree system into the Zig host so that:
1. The Zig CLI (`tricu-zig`) can execute tricu programs with effects (`putStr`, `readFile`, `fork`, etc.).
2. The C FFI (`libarboricx`) exposes a single `arb_run_io` call, giving every language host (C, Python, PHP, Node) turnkey IO without reimplementing the protocol.
3. The fast native reduction path (currently ~0.005s for `id "hello"`) is used for pure computation; IO syscalls happen only at effect boundaries.
## Current State
| Host | Reduction Speed | IO Support |
|------|----------------|------------|
| Haskell interpreter | ~1.7s for `runArboricxTyped` demo | Full `IODriver.hs` with scheduler, async, permissions |
| Zig native | ~0.005s for `append` | None — pure reduction only |
| Zig kernel | ~0.235s for `id.arboricx` | None — runs self-hosted parser, no effects |
| C / Python / PHP FFI | Native Zig speed | None — can construct and reduce, cannot interpret interaction trees |

The Haskell `IODriver` is ~500 lines of stateful code (scheduler, frame stack, permission checks, async lifecycle). Replicating it in every host language is a maintenance hazard. We will implement it **once** in Zig and share it through the C ABI.
## Architecture
### Layer 1 — Tree Inspection Primitives (C FFI)
Minimal functions that let C (or other FFIs) inspect raw tree shape. Used internally by the driver, and exposed for non-POSIX hosts that need custom effect handlers.
```c
int arb_is_leaf(arb_ctx_t* ctx, uint32_t root);
int arb_is_stem(arb_ctx_t* ctx, uint32_t root);
int arb_is_fork(arb_ctx_t* ctx, uint32_t root);
int arb_get_stem_child(arb_ctx_t* ctx, uint32_t root, uint32_t* out);
int arb_get_fork_children(arb_ctx_t* ctx, uint32_t root,
                          uint32_t* out_left, uint32_t* out_right);
```
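As a rough illustration of the Stage 1 acceptance test, the sketch below walks a tree using only the five inspection calls. The real arena is not available here, so `arb_ctx_t`, the node layout, the `mk` builder, and the return conventions (nonzero for a true predicate, `0` for a successful getter) are all stand-in assumptions, not the actual `libarboricx` behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in arena so the sketch compiles on its own.
   Tag values and the 0-means-success convention are assumptions. */
enum { TAG_LEAF, TAG_STEM, TAG_FORK };
typedef struct { int tag; uint32_t a, b; } node_t;
typedef struct { node_t nodes[64]; uint32_t count; } arb_ctx_t;

/* Hypothetical builder; the real API uses arb_leaf/arb_stem/arb_fork. */
uint32_t mk(arb_ctx_t* ctx, int tag, uint32_t a, uint32_t b) {
    assert(ctx->count < 64);
    ctx->nodes[ctx->count] = (node_t){ tag, a, b };
    return ctx->count++;
}

int arb_is_leaf(arb_ctx_t* ctx, uint32_t root) { return ctx->nodes[root].tag == TAG_LEAF; }
int arb_is_stem(arb_ctx_t* ctx, uint32_t root) { return ctx->nodes[root].tag == TAG_STEM; }
int arb_is_fork(arb_ctx_t* ctx, uint32_t root) { return ctx->nodes[root].tag == TAG_FORK; }

int arb_get_stem_child(arb_ctx_t* ctx, uint32_t root, uint32_t* out) {
    if (!arb_is_stem(ctx, root)) return -1;
    *out = ctx->nodes[root].a;
    return 0;
}

int arb_get_fork_children(arb_ctx_t* ctx, uint32_t root,
                          uint32_t* out_left, uint32_t* out_right) {
    if (!arb_is_fork(ctx, root)) return -1;
    *out_left = ctx->nodes[root].a;
    *out_right = ctx->nodes[root].b;
    return 0;
}

/* Count nodes using only the inspection calls. Shared subtrees are
   counted once per occurrence, because this walks the tree shape. */
size_t count_nodes(arb_ctx_t* ctx, uint32_t root) {
    uint32_t l, r;
    if (arb_is_leaf(ctx, root)) return 1;
    if (arb_get_stem_child(ctx, root, &l) == 0) return 1 + count_nodes(ctx, l);
    if (arb_get_fork_children(ctx, root, &l, &r) == 0)
        return 1 + count_nodes(ctx, l) + count_nodes(ctx, r);
    return 0; /* unknown tag */
}
```

The walker never touches `node_t` directly, which is the property the acceptance criterion asks for.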
### Layer 2 — POSIX IO Driver (C FFI)
A single high-level call that runs the full interaction-tree loop:
```c
typedef struct {
int allow_read_all;
int allow_write_all;
const char** allowed_read_paths;
size_t allowed_read_count;
const char** allowed_write_paths;
size_t allowed_write_count;
} arb_io_perms_t;
// Reduce → decode action → perform syscall → feed result → repeat until pure.
// Returns the final pure tree value.
uint32_t arb_run_io(arb_ctx_t* ctx, uint32_t program,
                    const arb_io_perms_t* perms);
```
This is the only call most hosts should need. It implements the same logic as `IODriver.hs`:
- **Frame stack** — `BindFrame` (sequencing) and `LocalFrame` (environment scoping)
- **Runtime** — permissions, environment tree, mutable state tree
- **Action dispatch** — decode the tag (pure, bind, putStr, getLine, readFile, writeFile, ask, local, get, put, fork, await, yield, sleep)
- **Scheduler** — runnable queue, blocked tasks, sleeping tasks, wake-on-completion, deadlock detection
- **Error protocol** — ok/err pairs with numeric codes
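To make the permissions struct concrete, here is a minimal sketch of how a host might fill in `arb_io_perms_t` before calling the driver. The struct layout is copied from the declaration above. The helper names, the "zeroed struct denies everything" reading, and the exact-match allowlist rule are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same layout as the arb_io_perms_t declared in the plan. */
typedef struct {
    int allow_read_all;
    int allow_write_all;
    const char** allowed_read_paths;
    size_t allowed_read_count;
    const char** allowed_write_paths;
    size_t allowed_write_count;
} arb_io_perms_t;

/* Assumption: an all-zero struct means "no filesystem access". */
arb_io_perms_t perms_deny_all(void) {
    arb_io_perms_t p = {0};
    return p;
}

/* Assumption: this is what --unsafe-io would pass to the driver. */
arb_io_perms_t perms_unsafe(void) {
    arb_io_perms_t p = perms_deny_all();
    p.allow_read_all = 1;
    p.allow_write_all = 1;
    return p;
}

/* Hypothetical helper: exact-match lookup in the read allowlist. */
int perms_allows_read(const arb_io_perms_t* p, const char* path) {
    assert(p != NULL && path != NULL);
    if (p->allow_read_all) return 1;
    for (size_t i = 0; i < p->allowed_read_count; i++)
        if (strcmp(p->allowed_read_paths[i], path) == 0) return 1;
    return 0;
}
```

A host would then pass `&perms` as the third argument of `arb_run_io`. The struct only stores pointers, so the path arrays have to outlive the call.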
### Zig CLI Integration
Add `--io` and `--unsafe-io` flags to `tricu-zig`:
```bash
# Safe mode — no filesystem access (default when --io is used)
tricu-zig --io greet.arboricx
# Unsafe mode — full POSIX access (development / local scripts)
tricu-zig --io --unsafe-io writeThenRead.arboricx
# Specific paths
# (future: --allow-read ./foo --allow-write ./bar)
```
Under `--io`, the CLI loads the bundle, reduces it once to WHNF, then passes the root to `arb_run_io` instead of eagerly decoding the final value.
## Implementation Stages
### Stage 1 — Tree Inspection Primitives
Add the five inspection functions to `ext/zig/src/c_abi.zig` and `ext/zig/include/arboricx.h`. No logic changes to reduction; these just read arena node tags.
**Acceptance:** A C test program can walk an arbitrary tree built with `arb_fork`/`arb_stem`/`arb_leaf` without knowing the arena internals.
### Stage 2 — IO Protocol Decoder
Write `ext/zig/src/io_driver.zig` containing:
- `decodeAction` — inspect a reduced tree and identify the action tag (pure=0, bind=1, putStr=10, …)
- `isIOSentinel` — verify `"tricuIO"` sentinel and version
- `makePure`, `makeOkResult`, `makeErrResult` — construct standard response trees
These are pure Zig functions with no syscalls. They mirror `IODriver.hs` logic but operate on arena indices.
**Acceptance:** Unit tests decode each action type correctly from trees built via codecs.
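The tag-to-action mapping can be sketched in C as follows. Only the three tag values stated above (pure=0, bind=1, putStr=10) are used. The remaining tags are elided in this plan, so the sketch maps them to an "unknown" result rather than guessing. It also assumes the numeric tag has already been extracted from the tree, which the real `decodeAction` would have to do itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Only the tags named in the plan; all other values are left undecoded. */
typedef enum {
    ACT_UNKNOWN = -1,
    ACT_PURE = 0,
    ACT_BIND = 1,
    ACT_PUT_STR = 10
} action_kind_t;

/* Map an already-extracted numeric tag to an action kind. */
action_kind_t decode_action_tag(uint32_t tag) {
    switch (tag) {
        case 0:  return ACT_PURE;
        case 1:  return ACT_BIND;
        case 10: return ACT_PUT_STR;
        default: return ACT_UNKNOWN;
    }
}

/* Human-readable name, handy in unit-test failure messages. */
const char* action_kind_name(action_kind_t kind) {
    switch (kind) {
        case ACT_PURE:    return "pure";
        case ACT_BIND:    return "bind";
        case ACT_PUT_STR: return "putStr";
        default:          return "unknown";
    }
}
```

Returning an explicit unknown value lets the driver turn a malformed action into an err pair instead of crashing.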
### Stage 3 — Synchronous IO Loop
Implement the core driver loop with a frame stack:
```zig
while (true) {
    current = reduce.reduce(current, scratch_arena, fuel);
    if (isIOSentinel(current)) |action| {
        switch (decodeAction(action)) {
            .pure => { /* pop frame or return */ },
            .bind => { /* push BindFrame, recurse into left */ },
            .putStr => { /* write stdout, continue with Leaf */ },
            .getLine => { /* read stdin, continue with string */ },
            // Remaining synchronous actions: readFile, writeFile, ask, local, get, put.
            else => { /* handled the same way: perform effect, feed result */ },
        }
    } else {
        return current; // pure result
    }
}
```
Support synchronous actions only: `pure`, `bind`, `putStr`, `getLine`, `readFile`, `writeFile`, `ask`, `local`, `get`, `put`.
**Acceptance:** `greet.tri` and `writeThenRead.tri` run correctly through `tricu-zig --io`.
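The frame-stack idea behind this loop can be shown with a toy model in C. This is a sketch under heavy simplifications, not the planned driver: actions are plain structs instead of reduced trees, results are `int`s, continuations are C function pointers, only `pure`, `bind`, and `putStr` exist, and output goes to a caller-supplied buffer rather than stdout so the behavior is easy to check. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { A_PURE, A_BIND, A_PUT_STR } tag_t;
typedef struct action action_t;
/* A continuation receives the previous result and returns the next action. */
typedef const action_t* (*cont_fn)(int result);

struct action {
    tag_t tag;
    int value;              /* A_PURE: the result */
    const char* text;       /* A_PUT_STR: text to emit */
    const action_t* first;  /* A_BIND: action to run first */
    cont_fn then;           /* A_BIND: what to run with its result */
};

#define MAX_FRAMES 64

/* Run until a pure value arrives with an empty frame stack. */
int run_io(const action_t* current, char* out, size_t out_cap) {
    cont_fn frames[MAX_FRAMES];   /* stack of pending BindFrames */
    size_t depth = 0;
    assert(out_cap > 0 && strlen(out) < out_cap);
    for (;;) {
        int result;
        switch (current->tag) {
        case A_BIND:                       /* push frame, descend into left */
            assert(depth < MAX_FRAMES);
            frames[depth++] = current->then;
            current = current->first;
            continue;
        case A_PUT_STR:                    /* perform effect, yield unit (0) */
            strncat(out, current->text, out_cap - strlen(out) - 1);
            result = 0;
            break;
        default:                           /* A_PURE */
            result = current->value;
            break;
        }
        if (depth == 0) return result;     /* nothing left to resume */
        current = frames[--depth](result); /* pop frame, feed the result */
    }
}

/* Example program: putStr "hi" >>= \_ -> pure 42 */
const action_t done = { .tag = A_PURE, .value = 42 };
const action_t say_hi = { .tag = A_PUT_STR, .text = "hi" };
const action_t* after_hi(int unused) { (void)unused; return &done; }
const action_t program = { .tag = A_BIND, .first = &say_hi, .then = after_hi };
```

Because `bind` pushes a frame and continues instead of recursing, nesting depth is bounded by the explicit stack rather than the C call stack. An explicit stack is presumably also what lets a scheduler suspend a task between steps.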
### Stage 4 — Scheduler and Async Actions
Add the task scheduler for `fork`, `await`, `yield`, `sleep`:
- `Runnable` queue (FIFO)
- `BlockedOn` map (task → ID of the task it is waiting on)
- `Sleeping` map (task → wake time)
- Round-robin scheduling with `yield` and `sleep` support
- Deadlock detection when no runnable or sleeping tasks remain while tasks are still blocked
This mirrors `IODriver.hs` exactly, including task handle encoding (`Fork("task", n)`).
**Acceptance:** `demos/interactionTrees/forkAwait.tri` and `yield.tri` pass.
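The scheduler bookkeeping listed above can be sketched with a toy round-robin scheduler in C. This is an illustration under stated simplifications, not a port of `IODriver.hs`: tasks are fixed scripts of `yield`/`await`/`done` steps, `sleep` is left out, task IDs are array indices, and every name here is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS 8
#define MAX_STEPS 8

typedef enum { OP_YIELD, OP_AWAIT, OP_DONE } op_t;
typedef struct { op_t op; int target; } step_t;   /* target: awaited task */
typedef enum { T_RUNNABLE, T_BLOCKED, T_FINISHED } status_t;

typedef struct {
    step_t steps[MAX_STEPS];   /* script; must end with OP_DONE */
    size_t pc;
    status_t status;
    int blocked_on;            /* meaningful only while T_BLOCKED */
} task_t;

typedef struct {
    task_t tasks[MAX_TASKS];
    int count;
    int queue[MAX_TASKS];      /* FIFO ring buffer of runnable task IDs */
    int head, len;
} sched_t;

void enqueue(sched_t* s, int id) {
    assert(s->len < MAX_TASKS);
    s->queue[(s->head + s->len) % MAX_TASKS] = id;
    s->len++;
}

int dequeue(sched_t* s) {
    assert(s->len > 0);
    int id = s->queue[s->head];
    s->head = (s->head + 1) % MAX_TASKS;
    s->len--;
    return id;
}

int spawn(sched_t* s, const step_t* steps, size_t n) {
    assert(s->count < MAX_TASKS && n > 0 && n <= MAX_STEPS);
    task_t* t = &s->tasks[s->count];
    for (size_t i = 0; i < n; i++) t->steps[i] = steps[i];
    t->pc = 0;
    t->status = T_RUNNABLE;
    t->blocked_on = -1;
    enqueue(s, s->count);
    return s->count++;
}

/* Returns 0 when every task finished, -1 when blocked tasks remain
   with an empty runnable queue (deadlock). */
int sched_run(sched_t* s) {
    while (s->len > 0) {
        int id = dequeue(s);
        task_t* t = &s->tasks[id];
        step_t st = t->steps[t->pc];
        if (st.op == OP_YIELD) {
            t->pc++;
            enqueue(s, id);                    /* back of the queue */
        } else if (st.op == OP_AWAIT) {
            assert(st.target >= 0 && st.target < s->count);
            if (s->tasks[st.target].status == T_FINISHED) {
                t->pc++;
                enqueue(s, id);
            } else {
                t->status = T_BLOCKED;         /* parked until target ends */
                t->blocked_on = st.target;
            }
        } else {                               /* OP_DONE */
            t->status = T_FINISHED;
            for (int i = 0; i < s->count; i++) {   /* wake-on-completion */
                task_t* w = &s->tasks[i];
                if (w->status == T_BLOCKED && w->blocked_on == id) {
                    w->status = T_RUNNABLE;
                    w->pc++;
                    enqueue(s, i);
                }
            }
        }
    }
    for (int i = 0; i < s->count; i++)
        if (s->tasks[i].status != T_FINISHED) return -1;
    return 0;
}
```

Adding `sleep` would mean a second collection keyed by wake time, and the deadlock check would then also have to confirm that no sleepers remain.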
### Stage 5 — Permission System
Port path canonicalization and permission checks from Haskell:
- Syntactic normalization (resolve `.`, reject `..`)
- `--unsafe-io` bypass (allow all)
- `--allow-read PATH` / `--allow-write PATH` allowlists
- Error code 20 (`errPolicyDeny`) on violation
**Acceptance:** File operations outside allowed paths return err pairs, not crashes.
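A minimal C sketch of the syntactic normalization step follows. It drops `.` and empty segments, rejects `..` with code 20, and never consults the filesystem. The function names, the `-1` "buffer too small" code, and the "equal to or beneath an allowed entry" rule in `path_within` are assumptions. The Haskell implementation may normalize or match differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ERR_POLICY_DENY 20   /* errPolicyDeny, as named in the plan */

/* Purely syntactic normalization. Returns 0 on success, ERR_POLICY_DENY
   if a ".." segment appears, and -1 (an assumed code) if `out` is too small. */
int normalize_path(const char* in, char* out, size_t cap) {
    assert(in != NULL && out != NULL && cap > 0);
    size_t len = 0;
    if (in[0] == '/') {                 /* keep the root of absolute paths */
        if (cap < 2) return -1;
        out[len++] = '/';
    }
    const char* p = in;
    while (*p != '\0') {
        while (*p == '/') p++;          /* skip repeated separators */
        const char* seg = p;
        while (*p != '\0' && *p != '/') p++;
        size_t n = (size_t)(p - seg);
        if (n == 0 || (n == 1 && seg[0] == '.')) continue;   /* drop "." */
        if (n == 2 && seg[0] == '.' && seg[1] == '.') return ERR_POLICY_DENY;
        size_t need_sep = (len > 0 && out[len - 1] != '/') ? 1 : 0;
        if (len + need_sep + n + 1 > cap) return -1;
        if (need_sep) out[len++] = '/';
        memcpy(out + len, seg, n);
        len += n;
    }
    out[len] = '\0';
    return 0;
}

/* Hypothetical allowlist rule: `path` equals `allowed` or lies beneath it.
   Both arguments are expected to be normalized already. */
int path_within(const char* allowed, const char* path) {
    size_t n = strlen(allowed);
    if (strncmp(allowed, path, n) != 0) return 0;
    return path[n] == '\0' || path[n] == '/' || (n > 0 && allowed[n - 1] == '/');
}
```

The boundary check in `path_within` is what stops an allowlist entry such as `data` from also admitting `database/...`.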
### Stage 6 — FFI Integration and Host Rollout
- Expose `arb_run_io` in the C header
- Update Python FFI test to verify IO round-trip
- Update PHP wrapper to support `--io`
- Document the two-layer model for future hosts (use `arb_run_io` for POSIX, Layer 1 primitives for custom runtimes)
**Acceptance:** Every existing FFI test still passes; new IO test passes in Python.
## Design Decisions
### Why baked-in POSIX effects?
- Most hosts (C, Python, PHP, native CLI) want real stdout/stdin/files.
- One canonical implementation avoids divergence.
- The Haskell `IODriver.hs` remains the reference spec; the Zig driver is the production runtime.
### Why not callback-based by default?
Callbacks add complexity for the common case. If a non-POSIX host (e.g., browser JS) needs custom effects, it can use the Layer 1 inspection primitives to build a ~50-line shim without reimplementing the scheduler. We can add `arb_run_io_with_callbacks` later if demand exists.
### Why not implement in every host language?
The Haskell `IODriver` is subtle: frame stack unwinding, async lifecycle, deadlock detection, path canonicalization, error code protocol. Bugs in any reimplementation would fracture the language ecosystem. A shared native driver is the only maintainable answer.
## Risks and Open Questions
1. **Fuel exhaustion during IO loops:** `arb_run_io` internally calls `reduce.reduce` with a fuel parameter. Should it accept a total fuel budget, or reset fuel per reduction step? The Haskell side has no fuel limit; we may want `arb_run_io_unlimited` and `arb_run_io_fueled` variants.
2. **State threading:** The Haskell driver threads an environment and mutable state tree through the runtime. These are opaque `T` values manipulated by tricu code. The Zig driver must preserve them exactly across scheduler switches.
3. **Binary vs text I/O:** `readFile` currently returns bytes (via `ofBytes` / `toString` in Haskell). The Zig driver must match the encoding exactly so that tricu code sees the same values in both hosts.
4. **Error parity:** Every error code (1-99) and its corresponding tree shape must match Haskell exactly. Divergence here breaks cross-host compatibility.
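For the fuel question in item 1, one option is a single budget object shared by every reduction step, with an "unlimited" flag covering the Haskell-like behavior. The C sketch below is only one possible shape. `fuel_budget_t` and `fuel_charge` are hypothetical names, and how the cost of a reduction step would be measured is left open.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int unlimited;        /* nonzero: never runs out */
    uint64_t remaining;   /* shared across all reduction steps of one run */
} fuel_budget_t;

/* Charge `cost` units against the shared budget.
   Returns 1 if the step may proceed, 0 if the budget is exhausted. */
int fuel_charge(fuel_budget_t* f, uint64_t cost) {
    assert(f != NULL);
    if (f->unlimited) return 1;
    if (cost > f->remaining) {
        f->remaining = 0;     /* clamp so later charges also fail */
        return 0;
    }
    f->remaining -= cost;
    return 1;
}
```

With this shape, a fueled and an unlimited entry point could share one loop and differ only in how the budget is initialized. Resetting fuel per step would instead bound each reduction but not the run as a whole.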
## Success Criteria
- `tricu-zig --io demos/interactionTrees/greet.tri` prints `Hello, tricu` in <10ms.
- `tricu-zig --io --unsafe-io demos/interactionTrees/writeThenRead.tri` writes and reads back a temp file correctly.
- `tricu-zig --io --unsafe-io demos/interactionTrees/forkAwait.tri` completes with correct async results.
- Python FFI can call `arb_run_io` and observe stdout from a tricu program.
- No regression in pure-reduction benchmarks (native path still ~0.005s for `id`).