# Zig Interaction-Tree IO Runtime Plan
## Goal
Port the Haskell `IODriver` interaction-tree system into the Zig host so that:
1. The Zig CLI (`tricu-zig`) can execute tricu programs with effects (`putStr`, `readFile`, `fork`, etc.).
2. The C FFI (`libarboricx`) exposes a single `arb_run_io` call, giving every language host (C, Python, PHP, Node) turnkey IO without reimplementing the protocol.
3. The fast native reduction path (currently ~0.005s for `id "hello"`) is used for pure computation; IO syscalls happen only at effect boundaries.
## Current State
| Host | Reduction Speed | IO Support |
|------|----------------|------------|
| Haskell interpreter | ~1.7s for `runArboricxTyped` demo | Full `IODriver.hs` with scheduler, async, permissions |
| Zig native | ~0.005s for `append` | None — pure reduction only |
| Zig kernel | ~0.235s for `id.arboricx` | None — runs self-hosted parser, no effects |
| C / Python / PHP FFI | Native Zig speed | None — can construct and reduce, cannot interpret interaction trees |

The Haskell `IODriver` is ~500 lines of stateful code (scheduler, frame stack, permission checks, async lifecycle). Replicating it in every host language is a maintenance hazard. We will implement it **once** in Zig and share it through the C ABI.
## Architecture
### Layer 1 — Tree Inspection Primitives (C FFI)
Minimal functions that let C (or other FFIs) inspect raw tree shape. Used internally by the driver, and exposed for non-POSIX hosts that need custom effect handlers.
```c
int arb_is_leaf(arb_ctx_t* ctx, uint32_t root);
int arb_is_stem(arb_ctx_t* ctx, uint32_t root);
int arb_is_fork(arb_ctx_t* ctx, uint32_t root);
int arb_get_stem_child(arb_ctx_t* ctx, uint32_t root, uint32_t* out);
int arb_get_fork_children(arb_ctx_t* ctx, uint32_t root,
                          uint32_t* out_left, uint32_t* out_right);
```
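
As a rough illustration of how a host might use these primitives, the sketch below walks a tree and counts its nodes. The plan does not state return conventions, so this assumes the predicates return nonzero for true and the getters return 0 on success. The `arb_ctx_t` layout, the constructor signatures, and `count_nodes` are stand-ins invented so the example compiles on its own; the real context is opaque.

```c
#include <stdint.h>
#include <stddef.h>

/* Stand-in arena (assumption): the real arb_ctx_t is opaque. */
typedef enum { NODE_LEAF, NODE_STEM, NODE_FORK } node_tag_t;
typedef struct { node_tag_t tag; uint32_t a, b; } node_t;
typedef struct { node_t nodes[64]; uint32_t len; } arb_ctx_t;

static uint32_t push(arb_ctx_t* ctx, node_tag_t tag, uint32_t a, uint32_t b) {
    ctx->nodes[ctx->len] = (node_t){ tag, a, b };
    return ctx->len++;
}

/* Constructor signatures are assumed; the plan only names them. */
uint32_t arb_leaf(arb_ctx_t* ctx) { return push(ctx, NODE_LEAF, 0, 0); }
uint32_t arb_stem(arb_ctx_t* ctx, uint32_t c) { return push(ctx, NODE_STEM, c, 0); }
uint32_t arb_fork(arb_ctx_t* ctx, uint32_t l, uint32_t r) { return push(ctx, NODE_FORK, l, r); }

int arb_is_leaf(arb_ctx_t* ctx, uint32_t root) { return ctx->nodes[root].tag == NODE_LEAF; }
int arb_is_stem(arb_ctx_t* ctx, uint32_t root) { return ctx->nodes[root].tag == NODE_STEM; }
int arb_is_fork(arb_ctx_t* ctx, uint32_t root) { return ctx->nodes[root].tag == NODE_FORK; }

int arb_get_stem_child(arb_ctx_t* ctx, uint32_t root, uint32_t* out) {
    if (ctx->nodes[root].tag != NODE_STEM) return -1;
    *out = ctx->nodes[root].a;
    return 0;
}

int arb_get_fork_children(arb_ctx_t* ctx, uint32_t root,
                          uint32_t* out_left, uint32_t* out_right) {
    if (ctx->nodes[root].tag != NODE_FORK) return -1;
    *out_left = ctx->nodes[root].a;
    *out_right = ctx->nodes[root].b;
    return 0;
}

/* Hypothetical walker: counts nodes of the unfolded tree using only the
   inspection primitives, never the arena layout. Shared subtrees are
   counted once per occurrence. */
size_t count_nodes(arb_ctx_t* ctx, uint32_t root) {
    uint32_t child, left, right;
    if (arb_is_leaf(ctx, root)) return 1;
    if (arb_is_stem(ctx, root) && arb_get_stem_child(ctx, root, &child) == 0)
        return 1 + count_nodes(ctx, child);
    if (arb_is_fork(ctx, root) &&
        arb_get_fork_children(ctx, root, &left, &right) == 0)
        return 1 + count_nodes(ctx, left) + count_nodes(ctx, right);
    return 0; /* unknown shape or failed getter */
}
```

A custom effect handler for a non-POSIX host would follow the same pattern: test the shape, fetch the children, and branch on what it finds.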
### Layer 2 — POSIX IO Driver (C FFI)
A single high-level call that runs the full interaction-tree loop:
```c
typedef struct {
    int allow_read_all;
    int allow_write_all;
    const char** allowed_read_paths;
    size_t allowed_read_count;
    const char** allowed_write_paths;
    size_t allowed_write_count;
} arb_io_perms_t;

// Reduce → decode action → perform syscall → feed result → repeat until pure.
// Returns the final pure tree value.
uint32_t arb_run_io(arb_ctx_t* ctx, uint32_t program,
                    const arb_io_perms_t* perms);
```
This is the only call most hosts need. It contains the same logic as `IODriver.hs`:
- **Frame stack** — `BindFrame` (sequencing) and `LocalFrame` (environment scoping)
- **Runtime** — permissions, environment tree, mutable state tree
- **Action dispatch** — decode the tag (pure, bind, putStr, getLine, readFile, writeFile, ask, local, get, put, fork, await, yield, sleep)
- **Scheduler** — runnable queue, blocked tasks, sleeping tasks, wake-on-completion, deadlock detection
- **Error protocol** — ok/err pairs with numeric codes
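
To make the permissions struct concrete, here is a minimal sketch of how a host might fill it in before calling `arb_run_io`. The three helper functions are hypothetical and not part of the planned API. How the `allow_*_all` flags interact with the path lists is also an assumption: this treats the flags as overriding the lists.

```c
#include <stddef.h>

/* Same layout as the struct in the plan, repeated so the sketch compiles alone. */
typedef struct {
    int allow_read_all;
    int allow_write_all;
    const char** allowed_read_paths;
    size_t allowed_read_count;
    const char** allowed_write_paths;
    size_t allowed_write_count;
} arb_io_perms_t;

/* Hypothetical helper: no filesystem access, the documented default for --io. */
arb_io_perms_t perms_sandboxed(void) {
    arb_io_perms_t p = { 0, 0, NULL, 0, NULL, 0 };
    return p;
}

/* Hypothetical helper: full access, as --unsafe-io would grant. */
arb_io_perms_t perms_unsafe(void) {
    arb_io_perms_t p = perms_sandboxed();
    p.allow_read_all = 1;
    p.allow_write_all = 1;
    return p;
}

/* Hypothetical helper: read-only access to a caller-owned list of paths.
   The caller must keep the array alive for as long as the struct is used. */
arb_io_perms_t perms_read_only(const char** paths, size_t count) {
    arb_io_perms_t p = perms_sandboxed();
    p.allowed_read_paths = paths;
    p.allowed_read_count = count;
    return p;
}
```

A host would then pass the address of the resulting struct as the `perms` argument of `arb_run_io`.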
### Zig CLI Integration
Add `--io` and `--unsafe-io` flags to `tricu-zig`:
```bash
# Safe mode — no filesystem access (default when --io is used)
tricu-zig --io greet.arboricx
# Unsafe mode — full POSIX access (development / local scripts)
tricu-zig --io --unsafe-io writeThenRead.arboricx
# Specific paths
# (future: --allow-read ./foo --allow-write ./bar)
```
Under `--io`, the CLI loads the bundle, reduces it once to WHNF, then passes the root to `arb_run_io` instead of eagerly decoding the final value.
## Implementation Stages
### Stage 1 — Tree Inspection Primitives
Add the five inspection functions to `ext/zig/src/c_abi.zig` and `ext/zig/include/arboricx.h`. No logic changes to reduction; these just read arena node tags.

**Acceptance:** A C test program can walk an arbitrary tree built with `arb_fork`/`arb_stem`/`arb_leaf` without knowing the arena internals.
### Stage 2 — IO Protocol Decoder
Write `ext/zig/src/io_driver.zig` containing:
- `decodeAction` — inspect a reduced tree and identify the action tag (pure=0, bind=1, putStr=10, …)
- `isIOSentinel` — verify `"tricuIO"` sentinel and version
- `makePure`, `makeOkResult`, `makeErrResult` — construct standard response trees

These are pure Zig functions with no syscalls. They mirror `IODriver.hs` logic but operate on arena indices.

**Acceptance:** Unit tests decode each action type correctly from trees built via codecs.
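
The decoding step can be pictured with a small C sketch (the actual decoder is planned in Zig and works on trees). It maps only the three tag values the plan states (pure=0, bind=1, putStr=10); the remaining tags are elided in this document, so they are deliberately left unmapped. The names `action_kind_t`, `action_from_tag`, `io_result_t`, `make_ok`, and `make_err` are illustrative, and the flat result struct stands in for the ok/err tree pairs.

```c
#include <stdint.h>

typedef enum { ACT_PURE, ACT_BIND, ACT_PUT_STR, ACT_UNKNOWN } action_kind_t;

/* Map a numeric tag to an action kind. Only tags named in the plan are
   handled; anything else is reported as unknown rather than guessed. */
action_kind_t action_from_tag(uint32_t tag) {
    switch (tag) {
        case 0:  return ACT_PURE;
        case 1:  return ACT_BIND;
        case 10: return ACT_PUT_STR;
        default: return ACT_UNKNOWN;
    }
}

/* Flat stand-in for an ok/err response. In the real driver these would be
   trees in the arena; payload is an arena index (ok) or a numeric code (err). */
typedef struct { int is_ok; uint32_t payload; } io_result_t;

io_result_t make_ok(uint32_t value) {
    io_result_t r = { 1, value };
    return r;
}

io_result_t make_err(uint32_t code) {
    io_result_t r = { 0, code };
    return r;
}
```

Keeping the decoder free of syscalls, as the stage requires, means it can be unit-tested with plain assertions on tag values.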
### Stage 3 — Synchronous IO Loop
Implement the core driver loop with a frame stack:
```zig
while (true) {
    current = reduce.reduce(current, scratch_arena, fuel);
    if (isIOSentinel(current)) |action| {
        switch (decodeAction(action)) {
            .pure => {}, // pop frame or return
            .bind => {}, // push BindFrame, recurse into left
            .putStr => {}, // write stdout, continue with Leaf
            .getLine => {}, // read stdin, continue with string
            // ... etc
        }
    } else {
        return current; // pure result
    }
}
```
Support synchronous actions only: `pure`, `bind`, `putStr`, `getLine`, `readFile`, `writeFile`, `ask`, `local`, `get`, `put`.

**Acceptance:** `greet.tri` and `writeThenRead.tri` run correctly through `tricu-zig --io`.
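
The role of the frame stack in this loop can be shown with a toy C model. It is a sketch under heavy simplification, not the planned driver: actions are plain C structs rather than reduced trees, results are `int` values, a bind continuation is a C function pointer, and `putStr` appends to a buffer instead of writing to stdout. All names here are hypothetical.

```c
#include <stddef.h>
#include <string.h>

typedef enum { ACT_PURE, ACT_BIND, ACT_PUT_STR } kind_t;

typedef struct action action_t;
typedef const action_t* (*cont_fn)(int value); /* continuation of a bind */

struct action {
    kind_t kind;
    int value;             /* ACT_PURE: the result */
    const char* text;      /* ACT_PUT_STR: text to emit */
    const action_t* first; /* ACT_BIND: action to run first */
    cont_fn next;          /* ACT_BIND: receives the result of first */
};

#define MAX_FRAMES 32

/* Run an action to completion. out must hold a NUL-terminated string with
   room for cap bytes in total. Returns the final pure value, or -1 if the
   frame stack overflows. */
int run_io(const action_t* cur, char* out, size_t cap) {
    cont_fn frames[MAX_FRAMES];
    size_t depth = 0;
    int value = 0;
    for (;;) {
        switch (cur->kind) {
        case ACT_BIND:
            if (depth == MAX_FRAMES) return -1;
            frames[depth++] = cur->next; /* push a bind frame */
            cur = cur->first;            /* descend into the left action */
            continue;
        case ACT_PUT_STR:
            strncat(out, cur->text, cap - strlen(out) - 1);
            value = 0;                   /* stands in for Leaf */
            break;
        case ACT_PURE:
            value = cur->value;
            break;
        }
        if (depth == 0) return value;    /* no frames left: final result */
        cur = frames[--depth](value);    /* pop a frame, feed it the result */
    }
}

/* Demo program: putStr "Hello, tricu\n", then finish with the value 42. */
const action_t demo_done = { ACT_PURE, 42, NULL, NULL, NULL };
const action_t* demo_after_hello(int v) { (void)v; return &demo_done; }
const action_t demo_hello = { ACT_PUT_STR, 0, "Hello, tricu\n", NULL, NULL };
const action_t demo_program = { ACT_BIND, 0, NULL, &demo_hello, demo_after_hello };
```

Running `run_io(&demo_program, buf, sizeof buf)` on an empty buffer pushes one frame for the bind, performs the `putStr`, pops the frame to obtain `demo_done`, and returns once the stack is empty.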
### Stage 4 — Scheduler and Async Actions
Add the task scheduler for `fork`, `await`, `yield`, `sleep`:
- `Runnable` queue (FIFO)
- `BlockedOn` map (task → ID of the task it is waiting on)
- `Sleeping` map (task → wake time)
- Round-robin scheduling with `yield` and `sleep` support
- Deadlock detection when no runnable tasks remain and no sleepers

This mirrors `IODriver.hs` exactly, including task handle encoding (`Fork("task", n)`).

**Acceptance:** `demos/interactionTrees/forkAwait.tri` and `yield.tri` pass.
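
The scheduling rules listed above can be sketched as a toy C scheduler. This is an illustration under simplifying assumptions rather than a port of `IODriver.hs`: each task is a fixed script of operations instead of an interaction tree, `sleep` is left out, and every name is hypothetical. It shows the FIFO runnable queue, the blocked-on bookkeeping, wake-on-completion, and deadlock detection.

```c
#include <stddef.h>

typedef enum { OP_YIELD, OP_AWAIT, OP_DONE } op_kind_t;
typedef struct { op_kind_t kind; int target; } op_t; /* target: awaited task id */

#define MAX_TASKS 8
#define NOT_BLOCKED (-1)

typedef struct {
    const op_t* script[MAX_TASKS]; /* per-task program */
    size_t pc[MAX_TASKS];          /* next op to run */
    int done[MAX_TASKS];
    int blocked_on[MAX_TASKS];     /* id of the awaited task, or NOT_BLOCKED */
    int queue[MAX_TASKS];          /* FIFO ring of runnable task ids */
    size_t head, count, ntasks;
} sched_t;

void sched_init(sched_t* s) { s->head = 0; s->count = 0; s->ntasks = 0; }

static void enqueue(sched_t* s, int id) {
    s->queue[(s->head + s->count) % MAX_TASKS] = id;
    s->count++;
}

static int dequeue(sched_t* s) {
    int id = s->queue[s->head];
    s->head = (s->head + 1) % MAX_TASKS;
    s->count--;
    return id;
}

/* Register a task and make it runnable. Returns its id, or -1 if full. */
int sched_spawn(sched_t* s, const op_t* script) {
    if (s->ntasks == MAX_TASKS) return -1;
    int id = (int)s->ntasks++;
    s->script[id] = script;
    s->pc[id] = 0;
    s->done[id] = 0;
    s->blocked_on[id] = NOT_BLOCKED;
    enqueue(s, id);
    return id;
}

/* Run until the queue drains. Returns 0 if every task finished, -1 if
   some task is still blocked with nothing left to run (deadlock). */
int sched_run(sched_t* s) {
    while (s->count > 0) {
        int id = dequeue(s);
        op_t op = s->script[id][s->pc[id]++];
        switch (op.kind) {
        case OP_YIELD:
            enqueue(s, id); /* back of the queue: round-robin */
            break;
        case OP_AWAIT:
            if (s->done[op.target]) enqueue(s, id);
            else s->blocked_on[id] = op.target;
            break;
        case OP_DONE:
            s->done[id] = 1;
            for (size_t t = 0; t < s->ntasks; t++) { /* wake waiters */
                if (!s->done[t] && s->blocked_on[t] == id) {
                    s->blocked_on[t] = NOT_BLOCKED;
                    enqueue(s, (int)t);
                }
            }
            break;
        }
    }
    for (size_t t = 0; t < s->ntasks; t++)
        if (!s->done[t]) return -1;
    return 0;
}
```

Two tasks that await each other never re-enter the queue, so `sched_run` reports the deadlock instead of spinning. A real driver would also have to consult the sleeping set before declaring deadlock, as the list above notes.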
### Stage 5 — Permission System
Port path canonicalization and permission checks from Haskell:
- Syntactic normalization (resolve `.`, reject `..`)
- `--unsafe-io` bypass (allow all)
- `--allow-read PATH` / `--allow-write PATH` allowlists
- Error code 20 (`errPolicyDeny`) on violation

**Acceptance:** File operations outside allowed paths return err pairs, not crashes.
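
A minimal C sketch of the normalization and allowlist check described in this stage follows. It is an assumption about one reasonable reading of the rules, not the Haskell behavior: `.` and empty segments are dropped, any `..` segment is rejected, and a path is allowed when it equals an allowlisted entry or sits beneath it. Allowlist entries are assumed to be already normalized with no trailing slash. The function names and the 256-byte buffer are hypothetical; only error code 20 comes from the plan.

```c
#include <stddef.h>
#include <string.h>

#define ERR_POLICY_DENY 20 /* errPolicyDeny, as named in the plan */

/* Write a normalized copy of path into out (cap bytes).
   Returns 0 on success, ERR_POLICY_DENY on ".." or if out is too small. */
int normalize_path(const char* path, char* out, size_t cap) {
    size_t len = 0;
    const char* p = path;
    if (cap == 0) return ERR_POLICY_DENY;
    if (*p == '/') {
        if (cap < 2) return ERR_POLICY_DENY;
        out[len++] = '/'; /* keep the root of an absolute path */
    }
    while (*p) {
        while (*p == '/') p++; /* skip separators */
        const char* seg = p;
        while (*p && *p != '/') p++;
        size_t n = (size_t)(p - seg);
        if (n == 0 || (n == 1 && seg[0] == '.')) continue; /* drop "." */
        if (n == 2 && seg[0] == '.' && seg[1] == '.') return ERR_POLICY_DENY;
        size_t need_sep = (len > 0 && out[len - 1] != '/') ? 1 : 0;
        if (len + need_sep + n + 1 > cap) return ERR_POLICY_DENY;
        if (need_sep) out[len++] = '/';
        memcpy(out + len, seg, n);
        len += n;
    }
    out[len] = '\0';
    return 0;
}

/* Returns 0 if access is permitted, ERR_POLICY_DENY otherwise.
   allow_all models the --unsafe-io bypass. */
int check_path(const char* path, const char** allowed, size_t count,
               int allow_all) {
    char norm[256];
    if (allow_all) return 0;
    if (normalize_path(path, norm, sizeof norm) != 0) return ERR_POLICY_DENY;
    for (size_t i = 0; i < count; i++) {
        size_t n = strlen(allowed[i]);
        /* Match the whole entry, then require a segment boundary so that
           "/tmp/work" does not also admit "/tmp/workspace". */
        if (strncmp(norm, allowed[i], n) == 0 &&
            (norm[n] == '\0' || norm[n] == '/'))
            return 0;
    }
    return ERR_POLICY_DENY;
}
```

Because the check is purely syntactic, it does not resolve symlinks; whether the driver should do so is outside what this document specifies.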
### Stage 6 — FFI Integration and Host Rollout
- Expose `arb_run_io` in the C header
- Update Python FFI test to verify IO round-trip
- Update PHP wrapper to support `--io`
- Document the two-layer model for future hosts (use `arb_run_io` for POSIX, Layer 1 primitives for custom runtimes)

**Acceptance:** Every existing FFI test still passes; new IO test passes in Python.
## Design Decisions
### Why baked-in POSIX effects?
- Most hosts (C, Python, PHP, native CLI) want real stdout/stdin/files.
- One canonical implementation avoids divergence.
- The Haskell `IODriver.hs` remains the reference spec; the Zig driver is the production runtime.
### Why not callback-based by default?
Callbacks add complexity for the common case. If a non-POSIX host (e.g., browser JS) needs custom effects, it can use the Layer 1 inspection primitives to build a ~50-line shim without reimplementing the scheduler. We can add `arb_run_io_with_callbacks` later if demand exists.
### Why not implement in every host language?
The Haskell `IODriver` is subtle: frame stack unwinding, async lifecycle, deadlock detection, path canonicalization, error code protocol. Bugs in any reimplementation would fracture the language ecosystem. A shared native driver is the only maintainable answer.
## Risks and Open Questions
1. **Fuel exhaustion during IO loops** — `arb_run_io` internally calls `reduce.reduce` with a fuel parameter. Should it accept a total fuel budget, or reset fuel per reduction step? The Haskell side has no fuel limit; we may want `arb_run_io_unlimited` and `arb_run_io_fueled` variants.
2. **State threading** — The Haskell driver threads an environment and mutable state tree through the runtime. These are opaque `T` values manipulated by tricu code. The Zig driver must preserve them exactly across scheduler switches.
3. **Binary vs text I/O** — `readFile` currently returns bytes (via `ofBytes` / `toString` in Haskell). The Zig driver must match the encoding exactly so that tricu code sees the same values in both hosts.
4. **Error parity** — Every error code (1–99) and its corresponding tree shape must match Haskell exactly. Divergence here breaks cross-host compatibility.
## Success Criteria
- `tricu-zig --io demos/interactionTrees/greet.tri` prints `Hello, tricu` in <10ms.
- `tricu-zig --io --unsafe-io demos/interactionTrees/writeThenRead.tri` writes and reads back a temp file correctly.
- `tricu-zig --io --unsafe-io demos/interactionTrees/forkAwait.tri` completes with correct async results.
- Python FFI can call `arb_run_io` and observe stdout from a tricu program.
- No regression in pure-reduction benchmarks (native path still ~0.005s for `id`).