feat(zig): native Arboricx bundle parser and C ABI

2026-05-10 21:21:58 -05:00
parent 8a673e282d
commit d7a7a8134c
27 changed files with 5365 additions and 18 deletions

172
AGENTS.md

@@ -4,13 +4,20 @@
## 0. Test Driven Development
Write and discuss tests with the user before implementing any implementation code.
Write and discuss tests with the user before working on implementation code. Do not modify existing tests without explicit permission.
## 1. Build & Test
```bash
# Tests
# Haskell tests (default check)
nix flake check
# Zig build
nix build .#tricu-zig
# Zig tests (separate target — not part of nix flake check)
nix build .#tricu-zig-tests
# Full build
nix build .#
```
@@ -32,10 +39,10 @@ nix build .#
| `LToken` | Lexer tokens |
| `Node` / `MerkleHash` | Content-addressed Merkle DAG nodes |
### Source modules
### Source modules (Haskell)
| Module | Purpose |
|--------|---------|
|--------|---------|
| `Main.hs` | CLI entry point (`cmdargs`), three modes: `repl`, `eval`, `decode` |
| `Eval.hs` | Interpreter: `evalTricu`, `result`, `evalSingle` |
| `Parser.hs` | Megaparsec parser → `TricuAST` |
@@ -46,13 +53,31 @@ nix build .#
| `ContentStore.hs` | SQLite-backed term persistence |
| `Wire.hs` | Arboricx portable wire format — encode/decode/import/export of Merkle-DAG bundle blobs |
### Multi-language Arboricx ecosystem
Arboricx is the portable executable-object format used by tricu. The project now includes native parsing and execution in multiple languages:
| Language | Location | Capabilities |
|----------|----------|--------------|
| **Haskell** | `src/Wire.hs`, `src/Research.hs` | Reference implementation — bundle encode/decode, content store, full Tree Calculus reduction |
| **tricu (self-hosted)** | `kernel_run_arboricx_typed.dag` | A self-hosting Arboricx parser/executor written in tricu itself. Used as a kernel inside the Zig host for maximum portability ("cool but useless" — ~3s for `append`) |
| **Zig** | `ext/zig/` | **Production host** — native bundle parser, WHNF reducer, C ABI (`libarboricx.so` / `.a`), CLI (`tricu-zig`), Python FFI support |
| **JavaScript (Node)** | `ext/js/` | Native bundle parser, manifest decoder, Merkle DAG verifier, Tree Calculus reducer, CLI runner |
| **PHP** | `ext/php/` | Tree Calculus reducer, codecs, kernel loader, CLI runner |
All hosts share the same bundle format and Merkle hashing scheme.
### File extensions
- `.hs` - Haskell source
- `.tri` - tricu language source (used in `lib/`, `test/`, `demos/`)
- `.arboricx` - Portable executable bundle
- `.dag` - Serialized kernel DAG (used by `gen_kernel.zig` at build time)
## 3. Test Suite
### Haskell tests
Tests live in `test/Spec.hs` and use **Tasty** + **HUnit**.
```bash
@@ -75,11 +100,24 @@ nix flake check
| `elimLambdaSingle` | Lambda elimination: eta reduction, SDef binding, semantics preservation |
| `stressElimLambda` | Lambda elimination stress test: 200 vars, 800-body curried lambda |
### Suggesting tests
### Zig tests
You do not write or modify tests. The user writes tests to constrain your outputs. You must adhere your code to tests or suggest modifications to tests.
Run separately via:
If the user gives you explicit permission to implement a test you may proceed.
```bash
nix build .#tricu-zig-tests
```
These are **not** included in `nix flake check`. The test derivation compiles and runs:
| Test | What it covers |
|------|----------------|
| `c_abi_test.c` | Smoke tests — leaf, stem, fork, app, reduce, number/string roundtrip, kernel root |
| `c_abi_append_test.c` | Kernel path — `append.arboricx` with string arguments via Tricu kernel |
| `native_bundle_append_test.c` | Native fast path — `append.arboricx` loaded natively, applied, reduced |
| `native_bundle_id_test.c` | Native fast path — `id.arboricx` |
| `native_bundle_bools_test.c` | Native fast path — `true.arboricx` / `false.arboricx` |
| `python_ffi_test.py` | Python ctypes FFI — tests both kernel and native paths for `id` and `append` |
## 4. tricu Language Quick Reference
@@ -145,12 +183,75 @@ Portable executable bundles are generated via `Wire.hs`. See `docs/arboricx-bund
TRICU_DB_PATH=/tmp/tricu.db ./result/bin/tricu export -o list_ops.arboricx append
```
## 8. Directory Layout
## 8. Zig Arboricx Host (`ext/zig/`)
The Zig host is a fast implementation for running Arboricx bundles. It provides a native bundle parser and arena-based evaluator.
### Modules
| File | Role |
|------|------|
| `src/main.zig` | CLI entrypoint — default native path, `--kernel` fallback |
| `src/bundle.zig` | Native Arboricx bundle parser — verifies digests, hashes, loads DAG into arena |
| `src/c_abi.zig` | C FFI exports — `arboricx_init`, tree constructors, codecs, reduction, bundle loading |
| `src/reduce.zig` | WHNF reducer (Tree Calculus `apply` rules) |
| `src/arena.zig` | Node arena (`ArrayListUnmanaged`) |
| `src/tree.zig` | `Node` union + iterative `copyTree` |
| `src/codecs.zig` | Number/string/list/bytes encoding + result unwrapping |
| `src/kernel.zig` | Embeds DAG kernel into arena (fallback path only) |
| `src/ternary.zig` | Ternary string parser for Tree Calculus terms |
| `tools/gen_kernel.zig` | Build-time tool: converts `.dag` → `kernel_embed.zig` |
| `include/arboricx.h` | C header for `libarboricx` |
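To make the "Tree Calculus `apply` rules" mentioned for `src/reduce.zig` concrete, here is a minimal Python sketch of one common presentation of those rules. It is an assumption that this rule set matches the project's reducer (variants differ in the argument order of the S-like rule, so check against `src/Research.hs`). It is a plain recursive evaluator, not the arena-based WHNF design described above, and the tuple encoding of trees is purely illustrative.

```python
# Illustrative encoding (an assumption, not the project's representation):
# Leaf = (), Stem a = (a,), Fork a b = (a, b)
LEAF = ()


def stem(a):
    return (a,)


def fork(a, b):
    return (a, b)


def apply(f, x):
    """Apply tree f to tree x using triage-style Tree Calculus rules."""
    if len(f) == 0:                      # Leaf x          -> Stem x
        return stem(x)
    if len(f) == 1:                      # (Stem a) x      -> Fork a x
        return fork(f[0], x)
    left, right = f
    if len(left) == 0:                   # Fork Leaf a, _  -> a
        return right
    if len(left) == 1:                   # Fork (Stem a) b, c -> (a c) (b c)
        return apply(apply(left[0], x), apply(right, x))
    a, b = left                          # Fork (Fork a b) c: triage on x
    if len(x) == 0:                      # x is Leaf       -> a
        return a
    if len(x) == 1:                      # x is Stem u     -> b u
        return apply(b, x[0])
    return apply(apply(right, x[0]), x[1])  # x is Fork u v -> (c u) v


# K and I combinators built from the encoding above
K = stem(LEAF)
I = fork(stem(stem(LEAF)), LEAF)
```

Under these rules, `apply(apply(K, x), y)` returns `x` and `apply(I, x)` returns `x` for any trees `x` and `y`.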
### C ABI
Key functions:
```c
arb_ctx_t* arboricx_init(void);
uint32_t arb_load_bundle(arb_ctx_t*, const uint8_t* bytes, size_t len, const char* name);
uint32_t arb_load_bundle_default(arb_ctx_t*, const uint8_t* bytes, size_t len);
uint32_t arb_reduce(arb_ctx_t*, uint32_t root, uint64_t fuel);
```
`arb_reduce` evaluates in a **fresh scratch arena** so garbage never accumulates.
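Since the test suite exercises a Python ctypes FFI, the four signatures above could be declared from Python roughly as follows. This is a sketch under stated assumptions: `arb_ctx_t*` is treated as an opaque `void*`, the helper names `bind_arboricx` and `as_uint8_buffer` are hypothetical, and the meaning of the returned `uint32_t` values is not specified here.

```python
import ctypes


def bind_arboricx(lib):
    """Attach argument/return types for the four functions listed above.

    `lib` is normally a ctypes.CDLL; any object exposing these attributes works.
    """
    ctx_p = ctypes.c_void_p                    # assumption: opaque context pointer
    bytes_p = ctypes.POINTER(ctypes.c_uint8)   # const uint8_t*

    lib.arboricx_init.argtypes = []
    lib.arboricx_init.restype = ctx_p

    lib.arb_load_bundle.argtypes = [ctx_p, bytes_p, ctypes.c_size_t, ctypes.c_char_p]
    lib.arb_load_bundle.restype = ctypes.c_uint32

    lib.arb_load_bundle_default.argtypes = [ctx_p, bytes_p, ctypes.c_size_t]
    lib.arb_load_bundle_default.restype = ctypes.c_uint32

    lib.arb_reduce.argtypes = [ctx_p, ctypes.c_uint32, ctypes.c_uint64]
    lib.arb_reduce.restype = ctypes.c_uint32
    return lib


def as_uint8_buffer(data):
    """Copy `data` (bytes) into a ctypes array usable as a `const uint8_t*` argument."""
    return (ctypes.c_uint8 * len(data)).from_buffer_copy(data)


# Hypothetical usage; the library path, bundle path, and fuel value are assumptions:
#   lib = bind_arboricx(ctypes.CDLL("./result/lib/libarboricx.so"))
#   ctx = lib.arboricx_init()
#   blob = open("append.arboricx", "rb").read()
#   root = lib.arb_load_bundle_default(ctx, as_uint8_buffer(blob), len(blob))
#   out = lib.arb_reduce(ctx, root, 1_000_000)
```

Declaring `argtypes` explicitly matters for the `uint64_t fuel` parameter, since ctypes otherwise passes Python integers as C `int`.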
### Stack size requirement
Tree Calculus reduction is deeply recursive, so the default stack can overflow. Treat a segfault as stack exhaustion until proven otherwise, and raise the limit before running:
```bash
ulimit -s 32768 # 32 MB
```
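When the library is driven in-process from Python rather than from a shell, `ulimit -s` set by the parent shell governs the main thread only. One alternative is to run the call in a thread created with a larger stack. The helper below is a minimal sketch: `run_with_stack` is a hypothetical name, and the 32 MiB default simply mirrors the `ulimit` value above rather than a measured requirement.

```python
import threading


def run_with_stack(fn, *args, stack_bytes=32 * 1024 * 1024):
    """Run fn(*args) in a new thread whose stack is `stack_bytes` large."""
    outcome = {}

    def target():
        try:
            outcome["value"] = fn(*args)
        except BaseException as exc:  # re-raised in the caller below
            outcome["error"] = exc

    # stack_size() affects threads created after the call and returns the old value
    previous = threading.stack_size(stack_bytes)
    try:
        worker = threading.Thread(target=target)
        worker.start()
    finally:
        threading.stack_size(previous)  # restore the previous setting

    worker.join()
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]
```

A deeply recursive FFI call would then be wrapped as `run_with_stack(some_callable, arg1, arg2)`, where `some_callable` stands for whatever function performs the reduction.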
### Performance comparison
| Fixture | Native path | Kernel path (`--kernel`) |
|---------|-------------|--------------------------|
| `append "hello " "world"` | **~0.007 s** | ~3.4 s |
| `id "hello"` | **~0.005 s** | ~0.38 s |
The kernel path is kept as a "cool but useless" fallback — the DAG is tiny (~30 KB) so the cost is negligible.
## 9. Nix Flake Outputs
| Output | Description |
|--------|-------------|
| `packages.default` / `packages.tricu` | Haskell tricu package |
| `packages.tricu-zig` | Zig CLI + `libarboricx.a` + `libarboricx.so` + `arboricx.h` |
| `packages.tricu-zig-tests` | **Separate test target** — C ABI + native bundle + Python FFI tests |
| `packages.tricu-container` | Docker image |
| `checks.default` / `checks.tricu` | Haskell test suite via Tasty/HUnit |
`tricu-zig-tests` is deliberately **not** in `checks` so `nix flake check` remains fast.
## 10. Directory Layout
```
tricu/
├── flake.nix # Nix flake: packages, tests, devShell
├── tricu.cabal # Cabal package (used via callCabal2nix)
├── AGENTS.md # This file
├── src/ # Haskell modules
│ ├── Main.hs
│ ├── Eval.hs
@@ -160,10 +261,11 @@ tricu/
│ ├── REPL.hs
│ ├── Research.hs
│ ├── ContentStore.hs
│ └── Wire.hs # Arboricx portable wire format
│ └── Wire.hs
├── test/
│ ├── Spec.hs # Tasty + HUnit tests
│ ├── *.tri # tricu test programs
│ ├── *.arboricx # Arboricx bundle fixtures
│ └── local-ns/ # Module namespace test files
├── lib/
│ ├── base.tri
@@ -175,10 +277,52 @@ tricu/
│ ├── toSource.tri
│ ├── levelOrderTraversal.tri
│ └── patternMatching.tri
└── AGENTS.md # This file
├── ext/ # Multi-language Arboricx hosts
│ ├── js/ # Node.js bundle parser + reducer
│ │ ├── src/
│ │ │ ├── bundle.js
│ │ │ ├── manifest.js
│ │ │ ├── merkle.js
│ │ │ ├── tree.js
│ │ │ ├── codecs.js
│ │ │ └── cli.js
│ │ └── test/
│ ├── php/ # PHP bundle loader + reducer
│ │ ├── src/
│ │ │ ├── functions.php
│ │ │ ├── codecs.php
│ │ │ ├── kernel.php
│ │ │ └── Tree/
│ │ └── run.php
│ └── zig/ # Zig production host
│ ├── build.zig
│ ├── build.zig.zon
│ ├── kernel_run_arboricx_typed.dag
│ ├── include/arboricx.h
│ ├── src/
│ │ ├── main.zig
│ │ ├── bundle.zig
│ │ ├── c_abi.zig
│ │ ├── codecs.zig
│ │ ├── kernel.zig
│ │ ├── reduce.zig
│ │ ├── arena.zig
│ │ ├── tree.zig
│ │ └── ternary.zig
│ ├── tests/
│ │ ├── c_abi_test.c
│ │ ├── c_abi_append_test.c
│ │ ├── native_bundle_append_test.c
│ │ ├── native_bundle_id_test.c
│ │ ├── native_bundle_bools_test.c
│ │ └── python_ffi_test.py
│ └── tools/
│ └── gen_kernel.zig
└── docs/
└── arboricx-bundle-format.md
```
## 9. Content Store Workflow (Custom DB)
## 11. Content Store Workflow (Custom DB)
The content store location is controlled by the `TRICU_DB_PATH` environment variable. When set, `eval` mode automatically loads all stored terms into the initial environment, so you can call any previously imported/evaluated term by name.
@@ -206,14 +350,16 @@ t> !definitions
Without `TRICU_DB_PATH` set, `eval` uses only the terms defined in the input file(s).
## 10. Development Tips
## 12. Development Tips
- **REPL:** `nix run .#` starts the interactive tricu REPL.
- **Evaluate files:** `nix run .# -- eval -f demos/equality.tri`
- **Zig host:** `nix build .#tricu-zig` then `./result/bin/tricu-zig <bundle> [args...]`
- **Zig tests:** `nix build .#tricu-zig-tests`
- **GHC options:** `-threaded -rtsopts -with-rtsopts=-N` for parallel runtime. Use `-N` RTS flag for multi-core.
- **UPX** is in the devShell for binary compression if needed.
## 11. Viewing Haskell Dependency Docs from Nix
## 13. Viewing Haskell Dependency Docs from Nix
When you need Haddock documentation for a Haskell dependency available in Nixpkgs, build the package's `doc` output directly with `^doc`.