diff --git a/AGENTS.md b/AGENTS.md index ed3aed8..cf47efe 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -2,66 +2,25 @@ > For AI agents and contributors working in this repository. -## 1. Build & Test +## Build & Test ```bash -# Haskell tests (default check) +# Tests nix flake check -# Zig build -nix build .#tricu-zig - -# Zig tests (separate target — not part of nix flake check) -nix build .#tricu-zig-tests - -# Full build +# Build tricu executable nix build .# ``` -### ⚠️ Never call `cabal` directly +### Never call `cabal` directly > **Rule of thumb:** if it builds, links, or tests, it goes through `nix`. -## 2. Project Overview +## Project Overview -**tricu** (pronounced "tree-shoe") is a programming-language experiment written in Haskell. It implements [Triage Calculus](https://olydis.medium.com/a-visual-introduction-to-tree-calculus-2f4a34ceffc2), an extension of Barry Jay's Tree Calculus, with lambda-abstraction sugar that gets eliminated back to pure tree calculus terms. +**tricu** (pronounced "tree-shoe") is a programming-language experiment written primarily in Haskell. -### Core types (in `src/Research.hs`) - -| Type | Description | -|------|-------------| -| `T = Leaf \| Stem T \| Fork T T` | Tree Calculus term (the runtime value) | -| `TricuAST` | Parsed AST with `SDef`, `SApp`, `SLambda`, etc. | -| `LToken` | Lexer tokens | -| `Node` / `MerkleHash` | Content-addressed Merkle DAG nodes | - -### Source modules (Haskell) - -| Module | Purpose | -|--------|---------| -| `Main.hs` | CLI entry point (`cmdargs`), three modes: `repl`, `eval`, `decode` | -| `Eval.hs` | Interpreter: `evalTricu`, `result`, `evalSingle` | -| `Parser.hs` | Megaparsec parser → `TricuAST` | -| `Lexer.hs` | Megaparsec lexer → `LToken` | -| `FileEval.hs` | File loading, module imports, `!import` | -| `REPL.hs` | Interactive Read-Eval-Print Loop (haskeline) | -| `Research.hs` | Core types, `apply` reduction, booleans, marshalling (`ofString`, `ofNumber`), output formatters (`toAscii`, `toTernaryString`, `decodeResult`) | -| `ContentStore.hs` | SQLite-backed term persistence | -| `Wire.hs` | Arboricx portable wire format — encode/decode/import/export of Merkle-DAG bundle blobs | - -### Multi-language Arboricx ecosystem - -Arboricx is the portable executable-object format used by tricu. The project now includes native parsing and execution in multiple languages: - -| Language | Location | Capabilities | -|----------|----------|--------------| -| **Haskell** | `src/Wire.hs`, `src/Research.hs` | Reference implementation — bundle encode/decode, content store, full Tree Calculus reduction | -| **tricu (self-hosted)** | `kernel_run_arboricx_typed.dag` | A self-hosting Arboricx parser/executor written in tricu itself. Used as a kernel inside the Zig host for maximum portability ("cool but useless" — ~3s for `append`) | -| **Zig** | `ext/zig/` | **Production host** — native bundle parser, WHNF reducer, C ABI (`libarboricx.so` / `.a`), CLI (`tricu-zig`), Python FFI support | -| **JavaScript (Node)** | `ext/js/` | Native bundle parser, manifest decoder, Merkle DAG verifier, Tree Calculus reducer, CLI runner | -| **PHP** | `ext/php/` | FFI wrapper around `libarboricx.so`, CLI runner | - -All hosts share the same bundle format and Merkle hashing scheme. +Core types are in `src/Research.hs`. ### File extensions @@ -70,8 +29,6 @@ All hosts share the same bundle format and Merkle hashing scheme. - `.arboricx` - Portable executable bundle - `.dag` - Serialized kernel DAG (used by `gen_kernel.zig` at build time) -## 3. Test Suite - ### Haskell tests Tests live in `test/Spec.hs` and use **Tasty** + **HUnit**. @@ -80,42 +37,7 @@ Tests live in `test/Spec.hs` and use **Tasty** + **HUnit**. nix flake check ``` -### Test groups - -| Group | What it covers | -|-------|----------------| -| `lexer` | Megaparsec lexer - identifiers, keywords, strings, escapes, invalid tokens | -| `parser` | Parser - defs, lambda, applications, lists, comments, parentheses | -| `simpleEvaluation` | Core `apply` reduction rules, variable substitution, immutability | -| `lambdas` | Lambda elimination, SKI calculus, higher-order functions, currying, shadowing, free vars | -| `providedLibraries` | `lib/list.tri` - triage, booleans, list ops (`head`, `tail`, `map`, `emptyList?`, `append`, `equal?`) | -| `fileEval` | Loading `.tri` files, multi-file context, decode | -| `modules` | `!import`, cyclic deps, namespacing, multi-level imports, unresolved vars, local namespaces | -| `demos` | `demos/*.tri` - structural equality, `toSource`, `size`, level-order traversal | -| `decoding` | `decodeResult` - Leaf, numbers, strings, lists, mixed | -| `elimLambdaSingle` | Lambda elimination: eta reduction, SDef binding, semantics preservation | -| `stressElimLambda` | Lambda elimination stress test: 200 vars, 800-body curried lambda | - -### Zig tests - -Run separately via: - -```bash -nix build .#tricu-zig-tests -``` - -These are **not** included in `nix flake check`. The test derivation compiles and runs: - -| Test | What it covers | -|------|----------------| -| `c_abi_test.c` | Smoke tests — leaf, stem, fork, app, reduce, number/string roundtrip, kernel root | -| `c_abi_append_test.c` | Kernel path — `append.arboricx` with string arguments via Tricu kernel | -| `native_bundle_append_test.c` | Native fast path — `append.arboricx` loaded natively, applied, reduced | -| `native_bundle_id_test.c` | Native fast path — `id.arboricx` | -| `native_bundle_bools_test.c` | Native fast path — `true.arboricx` / `false.arboricx` | -| `python_ffi_test.py` | Python ctypes FFI — tests both kernel and native paths for `id` and `append` | - -## 4. tricu Language Quick Reference +## tricu Language Quick Reference ``` t → Leaf (the base term) @@ -132,242 +54,4 @@ head (map f xs) → From lib/list.tri ``` CRITICAL: - -When working with recursion in `tricu` files: - -1. Put consumed data first in recursive workers. -2. Let data shape drive recursion. -3. Do not let counters unroll over abstract input. - -## 5. Output Formats - -The `eval` command accepts `--form` (shorthand `-t`): - -| Format | Value | Description | -|--------|-------|-------------| -| `tree` | `TreeCalculus` | Simple `t` form (default) | -| `fsl` | `FSL` | Full show representation | -| `ast` | `AST` | Parsed AST representation | -| `ternary` | `Ternary` | Ternary string encoding | -| `ascii` | `Ascii` | ASCII-art tree diagram | -| `decode` | `Decode` | Human-readable (strings, numbers, lists) | - -## 6. Content Addressing - -Each `T` term is content-addressed via a Merkle DAG: - -``` -NLeaf → 0x00 -NStem(h) → 0x01 || h (32 bytes) -NFork(l,r) → 0x02 || l (32 bytes) || r (32 bytes) - -hash = SHA256("arboricx.merkle.node.v1" <> 0x00 <> serialized_node) -``` - -This is stored in SQLite via `ContentStore.hs`. Hash suffixes on identifiers (e.g., `foo_abc123...`) are validated: 16–64 hex characters (SHA256). - -## 7. Arboricx Portable Bundles (`.arboricx`) - -Portable executable bundles are generated via `Wire.hs`. See `docs/arboricx-bundle-format.md` for the full binary format spec. - -```bash -# Export a bundle from the content store -./result/bin/tricu export -o myterm.arboricx myterm - -# Run a bundle (requires TRICU_DB_PATH) -./result/bin/tricu import -f lib/list.tri -TRICU_DB_PATH=/tmp/tricu.db ./result/bin/tricu export -o list_ops.arboricx append -``` - -## 8. Zig Arboricx Host (`ext/zig/`) - -The Zig host is a fast implementation for running Arboricx bundles. It provides a native bundle parser and arena-based evaluator. - -### Modules - -| File | Role | -|------|------| -| `src/main.zig` | CLI entrypoint — default native path, `--kernel` fallback | -| `src/bundle.zig` | Native Arboricx bundle parser — verifies digests, hashes, loads DAG into arena | -| `src/c_abi.zig` | C FFI exports — `arboricx_init`, tree constructors, codecs, reduction, bundle loading | -| `src/reduce.zig` | WHNF reducer (Tree Calculus `apply` rules) | -| `src/arena.zig` | Node arena (`ArrayListUnmanaged`) | -| `src/tree.zig` | `Node` union + iterative `copyTree` | -| `src/codecs.zig` | Number/string/list/bytes encoding + result unwrapping | -| `src/kernel.zig` | Embeds DAG kernel into arena (fallback path only) | -| `src/ternary.zig` | Ternary string parser for Tree Calculus terms | -| `tools/gen_kernel.zig` | Build-time tool: converts `.dag` → `kernel_embed.zig` | -| `include/arboricx.h` | C header for `libarboricx` | - -### C ABI - -Key functions: - -```c -arb_ctx_t* arboricx_init(void); -uint32_t arb_load_bundle(arb_ctx_t*, const uint8_t* bytes, size_t len, const char* name); -uint32_t arb_load_bundle_default(arb_ctx_t*, const uint8_t* bytes, size_t len); -uint32_t arb_reduce(arb_ctx_t*, uint32_t root, uint64_t fuel); -``` - -`arb_reduce` evaluates in a **fresh scratch arena** so garbage never accumulates. - -### Stack size requirement - -Tree Calculus reduction is deeply recursive. Assume a segfault is a memory limitation until proven otherwise. - -```bash -ulimit -s 32768 # 32 MB -``` - -### Performance comparison - -| Fixture | Native path | Kernel path (`--kernel`) | -|---------|-------------|--------------------------| -| `append "hello " "world"` | **~0.007 s** | ~3.4 s | -| `id "hello"` | **~0.005 s** | ~0.38 s | - -The kernel path is kept as a "cool but useless" fallback — the DAG is tiny (~30 KB) so the cost is negligible. - -## 9. Nix Flake Outputs - -| Output | Description | -|--------|-------------| -| `packages.default` / `packages.tricu` | Haskell tricu package | -| `packages.tricu-zig` | Zig CLI + `libarboricx.a` + `libarboricx.so` + `arboricx.h` | -| `packages.tricu-zig-tests` | **Separate test target** — C ABI + native bundle + Python FFI tests | -| `packages.tricu-php` | PHP source + `libarboricx.so` + `tricu-php` wrapper script | -| `packages.tricu-php-tests` | **Separate test target** — PHP FFI tests against fixture bundles | -| `packages.tricu-container` | Docker image | -| `checks.default` / `checks.tricu` | Haskell test suite via Tasty/HUnit | - -`tricu-zig-tests` is deliberately **not** in `checks` so `nix flake check` remains fast. - -## 10. Directory Layout - -``` -tricu/ -├── flake.nix # Nix flake: packages, tests, devShell -├── tricu.cabal # Cabal package (used via callCabal2nix) -├── AGENTS.md # This file -├── src/ # Haskell modules -│ ├── Main.hs -│ ├── Eval.hs -│ ├── Parser.hs -│ ├── Lexer.hs -│ ├── FileEval.hs -│ ├── REPL.hs -│ ├── Research.hs -│ ├── ContentStore.hs -│ └── Wire.hs -├── test/ -│ ├── Spec.hs # Tasty + HUnit tests -│ ├── *.tri # tricu test programs -│ ├── *.arboricx # Arboricx bundle fixtures -│ └── local-ns/ # Module namespace test files -├── lib/ -│ ├── base.tri -│ ├── list.tri -│ └── patterns.tri -├── demos/ -│ ├── equality.tri -│ ├── size.tri -│ ├── toSource.tri -│ ├── levelOrderTraversal.tri -│ └── patternMatching.tri -├── ext/ # Multi-language Arboricx hosts -│ ├── js/ # Node.js bundle parser + reducer -│ │ ├── src/ -│ │ │ ├── bundle.js -│ │ │ ├── manifest.js -│ │ │ ├── merkle.js -│ │ │ ├── tree.js -│ │ │ ├── codecs.js -│ │ │ └── cli.js -│ │ └── test/ -│ ├── php/ # PHP FFI host for libarboricx.so -│ │ ├── src/ -│ │ │ └── ffi.php -│ │ └── run.php -│ └── zig/ # Zig production host -│ ├── build.zig -│ ├── build.zig.zon -│ ├── kernel_run_arboricx_typed.dag -│ ├── include/arboricx.h -│ ├── src/ -│ │ ├── main.zig -│ │ ├── bundle.zig -│ │ ├── c_abi.zig -│ │ ├── codecs.zig -│ │ ├── kernel.zig -│ │ ├── reduce.zig -│ │ ├── arena.zig -│ │ ├── tree.zig -│ │ └── ternary.zig -│ ├── tests/ -│ │ ├── c_abi_test.c -│ │ ├── c_abi_append_test.c -│ │ ├── native_bundle_append_test.c -│ │ ├── native_bundle_id_test.c -│ │ ├── native_bundle_bools_test.c -│ │ └── python_ffi_test.py -│ └── tools/ -│ └── gen_kernel.zig -└── docs/ - └── arboricx-bundle-format.md -``` - -## 11. Content Store Workflow (Custom DB) - -The content store location is controlled by the `TRICU_DB_PATH` environment variable. When set, `eval` mode automatically loads all stored terms into the initial environment, so you can call any previously imported/evaluated term by name. - -```bash -# Use a local DB -export TRICU_DB_PATH=/tmp/tricu-local.db - -# Import terms from the standard library -./result/bin/tricu import -f lib/list.tri - -# Now use them in eval mode -echo "not? (t t)" | ./result/bin/tricu eval -t decode -# Output: t - -echo "not? (t t t)" | ./result/bin/tricu eval -t decode -# Output: Stem Leaf - -echo "equal? (t t) (t t t)" | ./result/bin/tricu eval -t decode -# Output: t - -# Check what's in the store -./result/bin/tricu -t> !definitions -``` - -Without `TRICU_DB_PATH` set, `eval` uses only the terms defined in the input file(s). - -## 12. Development Tips - -- **REPL:** `nix run .#` starts the interactive tricu REPL. -- **Evaluate files:** `nix run .# -- eval -f demos/equality.tri` -- **Zig host:** `nix build .#tricu-zig` then `./result/bin/tricu-zig [args...]` -- **Zig tests:** `nix build .#tricu-zig-tests` -- **GHC options:** `-threaded -rtsopts -with-rtsopts=-N` for parallel runtime. Use `-N` RTS flag for multi-core. -- **Upx** is in the devShell for binary compression if needed. - -## 13. Viewing Haskell Dependency Docs from Nix - -When you need Haddock documentation for a Haskell dependency available in Nixpkgs, build the package's `doc` output directly with `^doc`. - -Example: - -Replace `megaparsec` with the dependency name you need: - -```sh -nix build "nixpkgs#haskellPackages.${pkg}^doc" -``` - -View the available documentation files: - -```sh -find ./result-doc -type f \( -name '*.html' -o -name '*.haddock' \) | sort -``` +When working with `tricu` `.tri` files ***YOU MUST REVIEW notes/tricu-normalization-rules.md*** diff --git a/demos/interactionTrees/arboricx-server.tri b/demos/interactionTrees/arboricxServer.tri similarity index 100% rename from demos/interactionTrees/arboricx-server.tri rename to demos/interactionTrees/arboricxServer.tri diff --git a/lib/arboricx/common.tri b/lib/arboricx/common.tri index 5164559..51c9a45 100644 --- a/lib/arboricx/common.tri +++ b/lib/arboricx/common.tri @@ -1,7 +1,6 @@ -!import "../base.tri" !Local -!import "../list.tri" !Local -!import "../bytes.tri" !Local -!import "../binary.tri" !Local +!import "../prelude.tri" !Local +!import "../bytes.tri" !Local +!import "../binary.tri" !Local arboricxMagic = [(65) (82) (66) (79) (82) (73) (67) (88)] arboricxMajorVersion = [(0) (1)] diff --git a/lib/arboricx/nodes.tri b/lib/arboricx/nodes.tri index 46e9443..e27a66d 100644 --- a/lib/arboricx/nodes.tri +++ b/lib/arboricx/nodes.tri @@ -46,14 +46,72 @@ nodePayloadValid? = (nodePayload : (nodePayloadStem? nodePayload) (nodePayloadFork? nodePayload))) +nodeU32FromBytes4 = (b0 b1 b2 b3 : + u32BEBytesToNat + (pair b0 + (pair b1 + (pair b2 + (pair b3 t))))) + +withNodePayloadStemIndex = (nodePayload shortK indexK : + matchList + (shortK t) + (tag r0 : + matchList + (shortK t) + (b0 r1 : + matchList + (shortK t) + (b1 r2 : + matchList + (shortK t) + (b2 r3 : + matchList + (shortK t) + (b3 _ : + indexK (nodeU32FromBytes4 b0 b1 b2 b3)) + r3) r2) r1) r0) nodePayload) + +withNodePayloadForkIndices = (nodePayload shortK indicesK : + matchList + (shortK t) + (tag r0 : + matchList + (shortK t) + (l0 r1 : + matchList + (shortK t) + (l1 r2 : + matchList + (shortK t) + (l2 r3 : + matchList + (shortK t) + (l3 r4 : + matchList + (shortK t) + (r0b r5 : + matchList + (shortK t) + (r1b r6 : + matchList + (shortK t) + (r2b r7 : + matchList + (shortK t) + (r3b _ : + indicesK + (nodeU32FromBytes4 l0 l1 l2 l3) + (nodeU32FromBytes4 r0b r1b r2b r3b)) r7) r6) r5) r4) r3) r2) r1) r0) nodePayload) + nodePayloadStemChildIndex = (nodePayload : - u32BEBytesToNat (bytesTake 4 (bytesDrop 1 nodePayload))) + withNodePayloadStemIndex nodePayload (_ : 0) (index : index)) nodePayloadForkLeftIndex = (nodePayload : - u32BEBytesToNat (bytesTake 4 (bytesDrop 1 nodePayload))) + withNodePayloadForkIndices nodePayload (_ : 0) (left right : left)) nodePayloadForkRightIndex = (nodePayload : - u32BEBytesToNat (bytesTake 4 (bytesDrop 5 nodePayload))) + withNodePayloadForkIndices nodePayload (_ : 0) (left right : right)) nodeRecordsHaveInvalidPayload? = y (self nodeRecords : matchList @@ -65,31 +123,44 @@ nodeRecordsHaveInvalidPayload? = y (self nodeRecords : nodeRecords) nodePayloadChildIndices = (nodePayload : - matchBool + matchList t - (matchBool - (pair (nodePayloadStemChildIndex nodePayload) t) - (pair (nodePayloadForkLeftIndex nodePayload) - (pair (nodePayloadForkRightIndex nodePayload) t)) - (nodePayloadStem? nodePayload)) - (nodePayloadLeaf? nodePayload)) + (tag rest : + lazyBool + (_ : + withNodePayloadStemIndex + nodePayload + (_ : t) + (childIndex : pair childIndex t)) + (_ : + lazyBool + (_ : + withNodePayloadForkIndices + nodePayload + (_ : t) + (leftIndex rightIndex : + pair leftIndex (pair rightIndex t))) + (_ : t) + (equal? tag nodePayloadForkTag)) + (equal? tag nodePayloadStemTag)) + nodePayload) -- True iff index n names an element before limit in records. -- For topologically sorted indexed bundles, every child of record i must -- satisfy childIndex < i, so searching only the prefix [0, i) validates both -- bounds and acyclicity. -nodeIndexInPrefix? = y (self n records i limit : - matchBool +nodeIndexInPrefix? = y (self records n i limit : + matchList false - (matchList - false - (_ rest : - matchBool + (_ rest : + matchBool + false + (matchBool true - (self n rest (succ i) limit) + (self rest n (succ i) limit) (equal? i n)) - records) - (equal? i limit)) + (equal? i limit)) + records) nodeChildIndicesInPrefix? = y (self childIndices records limit : matchList @@ -98,7 +169,7 @@ nodeChildIndicesInPrefix? = y (self childIndices records limit : matchBool (self rest records limit) false - (nodeIndexInPrefix? childIndex records 0 limit)) + (nodeIndexInPrefix? records childIndex 0 limit)) childIndices) nodePayloadIndicesValid? = (nodePayload i records : @@ -178,31 +249,124 @@ nodesSectionRecords = (nodesSection : (_ nodeRecords : nodeRecords) nodesSection) -nodePayloadToTreeWith = (self nodeRecords nodePayload : - matchBool - (ok t t) - (matchBool - (bindResult (self (nodePayloadStemChildIndex nodePayload) nodeRecords) - (child _ : ok (t child) t)) - (bindResult (self (nodePayloadForkLeftIndex nodePayload) nodeRecords) - (left _ : - bindResult (self (nodePayloadForkRightIndex nodePayload) nodeRecords) - (right _ : ok (pair left right) t))) - (nodePayloadStem? nodePayload)) - (nodePayloadLeaf? nodePayload)) +nodeBuiltTreeIndex = (entry : + matchPair + (index _ : index) + entry) -nodeIndexToTree = y (self nodeIndex nodeRecords : - (nodePayload : - matchBool - (nodePayloadToTreeWith self nodeRecords nodePayload) - (err errMissingNode t) - (not? (equal? nodePayload t))) - (nth nodeIndex nodeRecords)) +nodeBuiltTreeValue = (entry : + matchPair + (_ tree : tree) + entry) + +nodeTreeByIndex_ = (self builtTrees targetIndex : + lazyList + (_ : err errMissingNode t) + (entry rest : + lazyBool + (_ : ok (nodeBuiltTreeValue entry) t) + (_ : self rest targetIndex) + (equal? (nodeBuiltTreeIndex entry) targetIndex)) + builtTrees) + +nodeTreeByIndex = (builtTrees targetIndex : + y nodeTreeByIndex_ builtTrees targetIndex) + +nodePayloadToTreeFromBuilt = (builtTrees nodePayload : + matchList + (err errInvalidNodePayload t) + (tag rest : + lazyBool + (_ : ok t t) + (_ : + lazyBool + (_ : + withNodePayloadStemIndex + nodePayload + (_ : err errInvalidNodePayload t) + (childIndex : + lazyResult + (code after : err code after) + (child _ : ok (t child) t) + (nodeTreeByIndex builtTrees childIndex))) + (_ : + lazyBool + (_ : + withNodePayloadForkIndices + nodePayload + (_ : err errInvalidNodePayload t) + (leftIndex rightIndex : + lazyResult + (code after : err code after) + (left _ : + lazyResult + (code after : err code after) + (right _ : ok (pair left right) t) + (nodeTreeByIndex builtTrees rightIndex)) + (nodeTreeByIndex builtTrees leftIndex))) + (_ : err errInvalidNodePayload t) + (equal? tag nodePayloadForkTag)) + (equal? tag nodePayloadStemTag)) + (equal? tag 0)) + nodePayload) + +nodeBuildState = (targetIndex i builtTrees : + pair targetIndex (pair i builtTrees)) + +nodeBuildStateTargetIndex = (state : + matchPair + (targetIndex _ : targetIndex) + state) + +nodeBuildStateI = (state : + matchPair + (_ rest : + matchPair + (i _ : i) + rest) + state) + +nodeBuildStateBuiltTrees = (state : + matchPair + (_ rest : + matchPair + (_ builtTrees : builtTrees) + rest) + state) + +nodeIndexToTree_ = (self remainingRecords state : + ((nodeIndex : + ((i : + ((builtTrees : + lazyList + (_ : err errMissingNode t) + (nodePayload rest : + lazyResult + (code after : err code after) + (tree _ : + lazyBool + (_ : ok tree t) + (_ : + self + rest + (nodeBuildState + nodeIndex + (succ i) + (pair (pair i tree) builtTrees))) + (equal? i nodeIndex)) + (nodePayloadToTreeFromBuilt builtTrees nodePayload)) + remainingRecords) + (nodeBuildStateBuiltTrees state))) + (nodeBuildStateI state))) + (nodeBuildStateTargetIndex state))) + +nodeIndexToTree = (nodeRecords nodeIndex : + y nodeIndexToTree_ nodeRecords (nodeBuildState nodeIndex 0 t)) readArboricxTreeFromIndex = (rootIndexBytes bs : bindResult (readArboricxNodesSection bs) (nodesSection afterContainer : - bindResult (nodeIndexToTree (u32BEBytesToNat rootIndexBytes) (nodesSectionRecords nodesSection)) + bindResult (nodeIndexToTree (nodesSectionRecords nodesSection) (u32BEBytesToNat rootIndexBytes)) (tree _ : ok tree afterContainer))) readArboricxExecutableFromIndex = readArboricxTreeFromIndex diff --git a/lib/arboricx/server.tri b/lib/arboricx/server.tri index e1023b6..7b31756 100644 --- a/lib/arboricx/server.tri +++ b/lib/arboricx/server.tri @@ -9,23 +9,40 @@ pathJoin = a b : append a (append "/" b) -objectDir = root shard : pathJoin (pathJoin root "objects") shard +objectDir = root shard : + pathJoin (pathJoin root "objects") shard + +hashShard = (hash : + matchList + t + (h0 r0 : + matchList + (pair h0 t) + (h1 r1 : + matchList + (pair h0 (pair h1 t)) + (h2 _ : + pair h0 (pair h1 (pair h2 t))) + r1) + r0) + hash) bundleObjectPath = (root hash : - ((shard : pathJoin (objectDir root shard) (append hash ".arboricx")) - (take 3 hash))) - ---bundleTmpPath = (root hash time : --- pathJoin (pathJoin root "tmp") (append hash (append "." (append (showNumber time) ".tmp")))) + pathJoin + (objectDir root (hashShard hash)) + (append hash ".arboricx")) bundleTmpPath = (root hash time : - pathJoin (pathJoin root "tmp") (append hash ".tmp")) + pathJoin + (pathJoin root "tmp") + (append hash ".tmp")) -- --------------------------------------------------------------------------- -- Store initialization -- --------------------------------------------------------------------------- -ensureDir = path : void (createDirectory path) +ensureDir = path : + void (createDirectory path) ensureStore = (root : foldl @@ -59,7 +76,7 @@ putBundleWithHash = (root bundleBytes time hash : root bundleBytes hash - (take 3 hash) + (hashShard hash) (bundleTmpPath root hash time) (bundleObjectPath root hash)) @@ -85,56 +102,106 @@ getBundleByHash = (root hash : (errMsg : pure (err errMsg t)) (bytes : pure (ok bytes t))) +-- --------------------------------------------------------------------------- +-- Route prefix helper +-- --------------------------------------------------------------------------- + +stripPrefix_ = (self input prefix : + lazyList + (_ : + lazyList + (_ : just t) + (_ _ : nothing) + prefix) + (ih ir : + lazyList + (_ : just input) + (ph pr : + lazyBool + (_ : self ir pr) + (_ : nothing) + (equal? ih ph)) + prefix) + input) + +stripPrefix = (prefix input : + y stripPrefix_ input prefix) + +bundleHashPrefix = "/_arboricx/bundle/hash/" + -- --------------------------------------------------------------------------- -- Registry routes -- --------------------------------------------------------------------------- healthRoute = (method target : - matchBool - (pure (okResponse "OK\n")) - (pure notFoundResponse) - (and? (equal? method "GET") (equal? target "/_arboricx/health"))) + lazyBool + (_ : + lazyBool + (_ : pure (okResponse "OK\n")) + (_ : pure notFoundResponse) + (equal? target "/_arboricx/health")) + (_ : pure notFoundResponse) + (equal? method "GET")) putBundleRoute = (root method target body : - matchBool - (bind (putBundle root body) (result : - matchResult - (err _ : pure (badRequestResponse (append "Upload failed: " err))) - (hash _ : pure (createdResponse hash)) - result)) - (pure notFoundResponse) - (and? (equal? method "POST") (equal? target "/_arboricx/bundles"))) + lazyBool + (_ : + lazyBool + (_ : + bind (putBundle root body) (result : + matchResult + (err _ : pure (badRequestResponse (append "Upload failed: " err))) + (hash _ : pure (createdResponse hash)) + result)) + (_ : pure notFoundResponse) + (equal? target "/_arboricx/bundles")) + (_ : pure notFoundResponse) + (equal? method "POST")) getBundleRoute = (root method target : - matchBool - ((hash : - bind (getBundleByHash root hash) (result : - matchResult - (errMsg _ : pure (errorResponse 404 errMsg)) - (bytes _ : pure (response 200 "application/vnd.arboricx.bundle" bytes)) - result)) - (drop 23 target)) - (pure notFoundResponse) - (and? (equal? method "GET") (startsWith? "/_arboricx/bundle/hash/" target))) + lazyBool + (_ : + lazyMaybe + (_ : pure notFoundResponse) + (hash : + bind (getBundleByHash root hash) (result : + matchResult + (errMsg _ : pure (errorResponse 404 errMsg)) + (bytes _ : pure (response 200 "application/vnd.arboricx.bundle" bytes)) + result)) + (stripPrefix bundleHashPrefix target)) + (_ : pure notFoundResponse) + (equal? method "GET")) arboricxRouter = (root method target headers body : - matchBool - (getBundleRoute root method target) - (matchBool - (putBundleRoute root method target body) - (matchBool - (healthRoute method target) - (pure notFoundResponse) - (and? (equal? method "GET") (equal? target "/_arboricx/health"))) - (and? (equal? method "POST") (equal? target "/_arboricx/bundles"))) - (and? (equal? method "GET") (startsWith? "/_arboricx/bundle/hash/" target))) + lazyBool + (_ : + lazyMaybe + (_ : healthRoute method target) + (hash : + bind (getBundleByHash root hash) (result : + matchResult + (errMsg _ : pure (errorResponse 404 errMsg)) + (bytes _ : pure (response 200 "application/vnd.arboricx.bundle" bytes)) + result)) + (stripPrefix bundleHashPrefix target)) + (_ : + lazyBool + (_ : putBundleRoute root method target body) + (_ : pure notFoundResponse) + (equal? method "POST")) + (equal? method "GET")) -- --------------------------------------------------------------------------- -- Server entrypoint -- --------------------------------------------------------------------------- arboricxHandler = (root client peer : - httpHandlerIO (arboricxRouter root) client peer) + httpHandlerIO + (method target headers body : + arboricxRouter root method target headers body) + client + peer) arboricxServer = (root addr port : onResult_ (listenSocket addr port 128) diff --git a/lib/binary.tri b/lib/binary.tri index 880d26e..6bc0180 100644 --- a/lib/binary.tri +++ b/lib/binary.tri @@ -56,7 +56,6 @@ expectU8 = (expected bs : read2 = (bs : readBytes 2 bs) read4 = (bs : readBytes 4 bs) -readU16BEBytes = (bs : read2 bs) readU32BEBytes = (bs : read4 bs) -- --------------------------------------------------------------------------- @@ -105,14 +104,3 @@ eof? = (bs : (emptyList? bs)) expectAscii = expectBytes - --- --------------------------------------------------------------------------- --- Endian / int conversion helpers --- --------------------------------------------------------------------------- - -u16BE = bytes : add (mul 256 (head bytes)) (head (tail bytes)) - -u16LE = bytes : add (mul 256 (head (tail bytes))) (head bytes) - -readU16BE = bs : bindParser read2 (bytes rest : ok (u16BE bytes) rest) bs -readU16LE = bs : bindParser read2 (bytes rest : ok (u16LE bytes) rest) bs diff --git a/lib/http.tri b/lib/http.tri index f6732ec..e23360d 100644 --- a/lib/http.tri +++ b/lib/http.tri @@ -3,7 +3,7 @@ !import "socket.tri" !Local -- --------------------------------------------------------------------------- --- 1. Constants +-- Constants -- --------------------------------------------------------------------------- maxHeaderBytes = 65536 @@ -14,39 +14,7 @@ crlf = pair 13 (pair 10 t) crlfcrlf = pair 13 (pair 10 (pair 13 (pair 10 t))) -- --------------------------------------------------------------------------- --- 2. Lazy eliminators --- --------------------------------------------------------------------------- - -lazyBool = (thenK elseK cond : - ((chosen : chosen t) - (matchBool - thenK - elseK - cond))) - -lazyList = (nilK consK xs : - ((chosen : chosen t) - (matchList - nilK - (h r : (_ : consK h r)) - xs))) - -lazyMaybe = (noneK someK m : - ((chosen : chosen t) - (matchMaybe - noneK - (x : (_ : someK x)) - m))) - -lazyResult = (errK okK result : - ((chosen : chosen t) - (matchResult - (code rest : (_ : errK code rest)) - (value rest : (_ : okK value rest)) - result))) - --- --------------------------------------------------------------------------- --- 3. Small byte/list helpers +-- Small byte/list helpers -- --------------------------------------------------------------------------- chomp = (xs : @@ -60,7 +28,7 @@ chomp = (xs : (reverse xs)) -- --------------------------------------------------------------------------- --- 4. Response construction +-- Response construction -- --------------------------------------------------------------------------- statusPhrase = (code : @@ -184,7 +152,7 @@ responseForMethod = (method resp : (equal? method "HEAD")) -- --------------------------------------------------------------------------- --- 5. Header receive / framing +-- Header receive / framing -- --------------------------------------------------------------------------- recvUntilMax_ = (y (self sock pattern maxBytes acc accLen : @@ -221,7 +189,7 @@ recvHeaders = (sock : recvUntilMax sock crlfcrlf maxHeaderBytes) -- --------------------------------------------------------------------------- --- 6. Request line parsing +-- Request line parsing -- --------------------------------------------------------------------------- readLineBytes_ = (y (self bs acc : @@ -277,7 +245,7 @@ parseRequestLine = (bs : (readLineBytes bs))) -- --------------------------------------------------------------------------- --- 7. Header parsing +-- Header parsing -- --------------------------------------------------------------------------- @@ -429,7 +397,7 @@ parseHeaders = (bs : y parseHeaders_ bs t t t false true) -- --------------------------------------------------------------------------- --- 8. Content-Length parsing +-- Content-Length parsing -- --------------------------------------------------------------------------- bit0? = (x : @@ -544,7 +512,7 @@ contentLength = (headers : y contentLength_ headers) -- --------------------------------------------------------------------------- --- 9. Body reading +-- Body reading -- --------------------------------------------------------------------------- consumeAvailable_ = (y (self bytes remaining acc : diff --git a/lib/lazy.tri b/lib/lazy.tri new file mode 100644 index 0000000..2ba3677 --- /dev/null +++ b/lib/lazy.tri @@ -0,0 +1,30 @@ +!import "base.tri" !Local +!import "list.tri" !Local + +lazyBool = (thenK elseK cond : + ((chosen : chosen t) + (matchBool + thenK + elseK + cond))) + +lazyList = (nilK consK xs : + ((chosen : chosen t) + (matchList + nilK + (h r : (_ : consK h r)) + xs))) + +lazyMaybe = (noneK someK m : + ((chosen : chosen t) + (matchMaybe + noneK + (x : (_ : someK x)) + m))) + +lazyResult = (errK okK result : + ((chosen : chosen t) + (matchResult + (code rest : (_ : errK code rest)) + (value rest : (_ : okK value rest)) + result))) diff --git a/lib/prelude.tri b/lib/prelude.tri index 91f779f..d2c2093 100644 --- a/lib/prelude.tri +++ b/lib/prelude.tri @@ -3,4 +3,5 @@ !import "base.tri" !Local !import "list.tri" !Local !import "bytes.tri" !Local +!import "lazy.tri" !Local !import "conversions.tri" !Local diff --git a/notes/recursive-consumers.md b/notes/recursive-consumers.md deleted file mode 100644 index aa21f93..0000000 --- a/notes/recursive-consumers.md +++ /dev/null @@ -1,81 +0,0 @@ -# Recursive Consumer Argument Order - -## Rule - -Put consumed data first in recursive workers in `tricu` code. - -*AVOID* this shape: - -```text -worker control state input -``` - -*USE* this shape: - -```text -worker input control state -``` - -The consumed structure should block recursion when it is unknown. Counters, indexes, lengths, and accumulator state should not be able to drive recursion over abstract input. - -## Bad shape - -The original `readBytes_` worker put loop-control arguments before the byte stream: - -```tricu -readBytes_ = y (self n i bs original acc : - matchBool - (ok (reverse acc) bs) - (matchResult - (code rest : err code original) - (actual rest : - self n (succ i) rest original (pair actual acc)) - (readU8 bs)) - (equal? i n)) - -readBytes = (n bs : readBytes_ n 0 bs bs t) -``` - -With a partial application like: - -```tricu -readBytes 2 -``` - -the evaluator knows `n = 2` and `i = 0`, but `bs` is still abstract. That lets the counter check drive recursive specialization before the byte stream is available, which can build a huge symbolic residual tree. This has been proven; do not reason about it further. - -## Good shape - -The corrected worker takes the byte stream first and immediately case-analyzes it: - -```tricu -readBytes_ = y (self bs n i original acc : - matchList - (matchBool - (ok (reverse acc) bs) - (err errUnexpectedEof original) - (equal? i n)) - (h r : - matchBool - (ok (reverse acc) bs) - (self r n (succ i) original (pair h acc)) - (equal? i n)) - bs) - -readBytes = (n bs : readBytes_ bs n 0 bs t) -``` - -Now: - -```tricu -readBytes 2 -``` - -becomes a function waiting on `bs`. Since the worker immediately performs `matchList ... bs`, evaluation blocks on the missing input instead of unrolling the counter loop. - -## Takeaway - -```text -Let consumed data drive recursion. -Do not let counters unroll over abstract input. -``` diff --git a/notes/tricu-normalization-rules.md b/notes/tricu-normalization-rules.md new file mode 100644 index 0000000..8ee76f9 --- /dev/null +++ b/notes/tricu-normalization-rules.md @@ -0,0 +1,248 @@ +# The takeaway + +Consumed data must block recursion. +Control data must not drive recursion. +Branches with work must be lazy. +Top-level fixed points must be hidden behind wrappers. +Fixed-format data should be destructured finitely, not sliced recursively. + +## Rules for normalization-safe `tricu` + +A top-level definition must normalize when its runtime inputs are still abstract. Therefore, avoid any shape where known control data can unfold recursion before the consumed data is available. + +## 1. Put consumed data first + +Recursive workers should take the structure they consume before counters, indexes, limits, accumulators, or other control state. + +Avoid: + +```tricu +worker index records state +``` + +Prefer: + +```tricu +worker records index state +``` + +The worker’s first real operation should usually be a case split on the consumed value: + +```tricu +worker_ = (self records state : + lazyList + nilCase + consCase + records) +``` + +## 2. Do not use generic recursive consumers on abstract fixed-format data + +Avoid applying helpers like these to abstract values in top-level-normalized definitions: + +```tricu +take n xs +drop n xs +nth n xs +length xs +startsWith? prefix xs +bytesTake n bytes +bytesDrop n bytes +``` + +These can be driven by known counters, indexes, lengths, or prefixes while `xs` is still abstract. + +For fixed-format data, use finite destructuring helpers instead: + +```tricu +withNodePayloadForkIndices payload shortK indicesK +hashShard hash +``` + +This keeps the recursion bounded by syntax, not by a runtime counter. + +## 3. Use lazy eliminators when a branch contains work + +If a branch contains recursion, IO construction, parsing, lookup, response construction, or anything that may recurse internally, do not pass it as an ordinary branch value. + +Avoid: + +```tricu +matchBool + resultNow + (self rest state) + cond +``` + +Prefer: + +```tricu +lazyBool + (_ : resultNow) + (_ : self rest state) + cond +``` + +Same rule for result, maybe, and list elimination: + +```tricu +lazyBool +lazyResult +lazyMaybe +lazyList +``` + +Strict eliminators are safe only when both branches are already cheap normal forms. + +## 4. Do not expose top-level fixed points directly + +Avoid top-level definitions like: + +```tricu +foo_ = y (self input state : ...) +``` + +Prefer the library-style split: + +```tricu +foo_ = (self input state : ...) + +foo = (input state : + y foo_ input state) +``` + +This prevents each independently-normalized top-level definition from trying to normalize the fixed point itself. + +## 5. Keep recursive self-application small and structurally progressing + +Prefer recursive calls shaped like: + +```tricu +self rest nextState +``` + +over wide calls like: + +```tricu +self rest index i limit acc flags +``` + +Pack non-consumed state into a record/pair if needed. + +The consumed argument should visibly progress: + +```tricu +self rest nextState +``` + +not restart from the original structure: + +```tricu +self originalRecords newIndex newState +``` + +Restarting from the original input inside recursive branches can create residual trees with no obvious structural progress. + +## 6. Recursive state updates must be non-recursive + +Do not call a recursive helper while constructing the next recursive state. + +Avoid: + +```tricu +self rest (listSnoc acc value) +``` + +because `listSnoc` is itself recursive. + +Prefer constant-time constructors: + +```tricu +self rest (pair value acc) +``` + +If order matters, reverse later only when the input is concrete, or store explicit indexes in an association list. + +## 7. Do not rebuild from the whole input when a prefix invariant exists + +If validation guarantees child references point backward, use that invariant. + +Avoid: + +```tricu +buildTree allRecords childIndex +``` + +inside the build of each node. + +Prefer: + +```tricu +lookup childIndex builtPrefix +``` + +For Arboricx nodes, this meant scanning records once left-to-right and resolving children from `builtTrees`. + +## 8. Make route/path helpers consumed-data-driven + +For request paths, hashes, and byte strings, avoid counter/prefix-driven recursive operations over abstract request data. + +Avoid: + +```tricu +take 3 hash +drop 23 target +startsWith? prefix target +``` + +Prefer: + +```tricu +hashShard hash +stripPrefix prefix target +``` + +where the helper case-analyzes the consumed runtime data before recurring. + +For fixed small slices like the first three hash bytes, use finite destructuring rather than `take`. + +## 9. Treat top-level normalization as stricter than runtime evaluation + +A function can be semantically correct at runtime and still fail import normalization. + +Ask this for every top-level definition: + +```text +Can this normalize while all of its arguments are unknown? +``` + +If the answer depends on “the branch will not be taken” or “the input will be concrete by then,” the definition is probably not normalization-safe. + +## 10. When a definition hangs alphabetically, inspect reachable dependencies + +The alphabetically first hanging definition is not necessarily the root cause. It may simply be the first definition that reaches a later problematic helper. + +Debug by replacing reachable branches with constants: + +```tricu +foo = (... : pure notFoundResponse) +``` + +Then add back one dependency at a time. If a constant version normalizes, the issue is in reachable branch work, not the wrapper itself. + +## Compact checklist + +Before adding or exporting a definition, check: + +```text +1. Does every recursive worker consume unknown data first? +2. Is every recursive branch thunked with lazy eliminators? +3. Is `y` applied inside the public wrapper, not exposed as a top-level worker value? +4. Are recursive self-calls visibly progressing on consumed data? +5. Are recursive state updates constant-time? +6. Are `take`, `drop`, `nth`, `length`, `startsWith?`, or byte slicing used on abstract data? +7. Could a known counter, index, prefix, or length drive recursion? +8. Are fixed-format fields parsed with finite destructuring helpers? +9. Does any branch construct dynamic paths/responses from abstract data using recursive list helpers? +10. Can the definition normalize with all runtime arguments still unknown? +```