# AGENTS.md - tricu Project Guide
> For AI agents and contributors working in this repository.
## 0. Test Driven Development
Write and discuss tests with the user before working on implementation code. Do not modify existing tests without explicit permission.
## 1. Build & Test
```bash
# Haskell tests (default check)
nix flake check
# Zig build
nix build .#tricu-zig
# Zig tests (separate target — not part of nix flake check)
nix build .#tricu-zig-tests
# Full build
nix build .#
```
### ⚠️ Never call `cabal` directly
> **Rule of thumb:** if it builds, links, or tests, it goes through `nix`.
## 2. Project Overview
**tricu** (pronounced "tree-shoe") is a programming-language experiment written in Haskell. It implements [Triage Calculus](https://olydis.medium.com/a-visual-introduction-to-tree-calculus-2f4a34ceffc2), an extension of Barry Jay's Tree Calculus, with lambda-abstraction sugar that gets eliminated back to pure tree calculus terms.
### Core types (in `src/Research.hs`)
| Type | Description |
|------|-------------|
| `T = Leaf \| Stem T \| Fork T T` | Tree Calculus term (the runtime value) |
| `TricuAST` | Parsed AST with `SDef`, `SApp`, `SLambda`, etc. |
| `LToken` | Lexer tokens |
| `Node` / `MerkleHash` | Content-addressed Merkle DAG nodes |
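To make the core types concrete, here is a minimal Python sketch of `T` and one `apply` step, assuming the standard Tree Calculus reduction rules (Triage Calculus extensions may differ); the authoritative implementations are `apply` in `src/Research.hs` and `src/reduce.zig`:

```python
from dataclasses import dataclass

# Hypothetical Python model of T = Leaf | Stem T | Fork T T, for illustration only.
class T: pass

@dataclass(frozen=True)
class Leaf(T): pass

@dataclass(frozen=True)
class Stem(T):
    child: T

@dataclass(frozen=True)
class Fork(T):
    left: T
    right: T

def apply(f: T, x: T) -> T:
    """One application step, assuming the standard Tree Calculus rules."""
    if isinstance(f, Leaf):                      # t x        builds a Stem
        return Stem(x)
    if isinstance(f, Stem):                      # (t a) x    builds a Fork
        return Fork(f.child, x)
    a, y = f.left, f.right                       # f = Fork a y: triage on a
    if isinstance(a, Leaf):                      # t t y x          -> y
        return y
    if isinstance(a, Stem):                      # t (t a') y x     -> (y x) (a' x)
        return apply(apply(y, x), apply(a.child, x))
    return apply(apply(x, a.left), a.right)      # t (t w v) y x    -> (x w) v
```

For example, `apply(apply(apply(Leaf(), Leaf()), A), B)` reduces `t t A B` to `A`, the K-combinator behaviour.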
### Source modules (Haskell)
| Module | Purpose |
|--------|---------|
| `Main.hs` | CLI entry point (`cmdargs`), three modes: `repl`, `eval`, `decode` |
| `Eval.hs` | Interpreter: `evalTricu`, `result`, `evalSingle` |
| `Parser.hs` | Megaparsec parser → `TricuAST` |
| `Lexer.hs` | Megaparsec lexer → `LToken` |
| `FileEval.hs` | File loading, module imports, `!import` |
| `REPL.hs` | Interactive Read-Eval-Print Loop (haskeline) |
| `Research.hs` | Core types, `apply` reduction, booleans, marshalling (`ofString`, `ofNumber`), output formatters (`toAscii`, `toTernaryString`, `decodeResult`) |
| `ContentStore.hs` | SQLite-backed term persistence |
| `Wire.hs` | Arboricx portable wire format — encode/decode/import/export of Merkle-DAG bundle blobs |
### Multi-language Arboricx ecosystem
Arboricx is the portable executable-object format used by tricu. The project now includes native parsing and execution in multiple languages:
| Language | Location | Capabilities |
|----------|----------|--------------|
| **Haskell** | `src/Wire.hs`, `src/Research.hs` | Reference implementation — bundle encode/decode, content store, full Tree Calculus reduction |
| **tricu (self-hosted)** | `kernel_run_arboricx_typed.dag` | A self-hosting Arboricx parser/executor written in tricu itself. Used as a kernel inside the Zig host for maximum portability ("cool but useless" — ~3s for `append`) |
| **Zig** | `ext/zig/` | **Production host** — native bundle parser, WHNF reducer, C ABI (`libarboricx.so` / `.a`), CLI (`tricu-zig`), Python FFI support |
| **JavaScript (Node)** | `ext/js/` | Native bundle parser, manifest decoder, Merkle DAG verifier, Tree Calculus reducer, CLI runner |
| **PHP** | `ext/php/` | FFI wrapper around `libarboricx.so`, CLI runner |
All hosts share the same bundle format and Merkle hashing scheme.
### File extensions
- `.hs` - Haskell source
- `.tri` - tricu language source (used in `lib/`, `test/`, `demos/`)
- `.arboricx` - Portable executable bundle
- `.dag` - Serialized kernel DAG (used by `gen_kernel.zig` at build time)
## 3. Test Suite
### Haskell tests
Tests live in `test/Spec.hs` and use **Tasty** + **HUnit**.
```bash
nix flake check
```
### Test groups
| Group | What it covers |
|-------|----------------|
| `lexer` | Megaparsec lexer - identifiers, keywords, strings, escapes, invalid tokens |
| `parser` | Parser - defs, lambda, applications, lists, comments, parentheses |
| `simpleEvaluation` | Core `apply` reduction rules, variable substitution, immutability |
| `lambdas` | Lambda elimination, SKI calculus, higher-order functions, currying, shadowing, free vars |
| `providedLibraries` | `lib/list.tri` - triage, booleans, list ops (`head`, `tail`, `map`, `emptyList?`, `append`, `equal?`) |
| `fileEval` | Loading `.tri` files, multi-file context, decode |
| `modules` | `!import`, cyclic deps, namespacing, multi-level imports, unresolved vars, local namespaces |
| `demos` | `demos/*.tri` - structural equality, `toSource`, `size`, level-order traversal |
| `decoding` | `decodeResult` - Leaf, numbers, strings, lists, mixed |
| `elimLambdaSingle` | Lambda elimination: eta reduction, SDef binding, semantics preservation |
| `stressElimLambda` | Lambda elimination stress test: 200 vars, 800-body curried lambda |
### Zig tests
Run separately via:
```bash
nix build .#tricu-zig-tests
```
These are **not** included in `nix flake check`. The test derivation compiles and runs:
| Test | What it covers |
|------|----------------|
| `c_abi_test.c` | Smoke tests — leaf, stem, fork, app, reduce, number/string roundtrip, kernel root |
| `c_abi_append_test.c` | Kernel path — `append.arboricx` with string arguments via Tricu kernel |
| `native_bundle_append_test.c` | Native fast path — `append.arboricx` loaded natively, applied, reduced |
| `native_bundle_id_test.c` | Native fast path — `id.arboricx` |
| `native_bundle_bools_test.c` | Native fast path — `true.arboricx` / `false.arboricx` |
| `python_ffi_test.py` | Python ctypes FFI — tests both kernel and native paths for `id` and `append` |
## 4. tricu Language Quick Reference
```
t → Leaf (the base term)
t t → Stem Leaf
t t t → Fork Leaf Leaf
x = t → Define term x = Leaf
id = (a : a) → Lambda identity (eliminates to tree calculus)
head (map f xs) → From lib/list.tri
!import "./path.tri" NS → Import file under namespace
-- line comment
```
**CRITICAL:** When working with recursion in `tricu` files:
1. Put consumed data first in recursive workers.
2. Let data shape drive recursion.
3. Do not let counters unroll over abstract input.
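The same principle, sketched in Python over a hypothetical tuple-encoded tree (`("leaf",)`, `("stem", t)`, `("fork", l, r)` are illustrative, not the real encoding):

```python
def size(t):
    # Good shape-driven recursion: the consumed data comes first and its
    # constructor selects the branch, so the recursion terminates exactly
    # when the input runs out -- no counter unrolling over abstract input.
    tag = t[0]
    if tag == "leaf":
        return 1
    if tag == "stem":
        return 1 + size(t[1])
    return 1 + size(t[1]) + size(t[2])  # fork

size(("fork", ("stem", ("leaf",)), ("leaf",)))  # -> 4
```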
## 5. Output Formats
The `eval` command accepts `--form` (shorthand `-t`):
| Format | Value | Description |
|--------|-------|-------------|
| `tree` | `TreeCalculus` | Simple `t` form (default) |
| `fsl` | `FSL` | Full show representation |
| `ast` | `AST` | Parsed AST representation |
| `ternary` | `Ternary` | Ternary string encoding |
| `ascii` | `Ascii` | ASCII-art tree diagram |
| `decode` | `Decode` | Human-readable (strings, numbers, lists) |
## 6. Content Addressing
Each `T` term is content-addressed via a Merkle DAG:
```
NLeaf → 0x00
NStem(h) → 0x01 || h (32 bytes)
NFork(l,r) → 0x02 || l (32 bytes) || r (32 bytes)
hash = SHA256("arboricx.merkle.node.v1" <> 0x00 <> serialized_node)
```
This is stored in SQLite via `ContentStore.hs`. Hash suffixes on identifiers (e.g., `foo_abc123...`) are validated as 16–64 hex characters (up to a full SHA-256 digest).
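The hashing scheme described above can be sketched in Python; only the byte layout and domain string come from the spec, the function names are illustrative:

```python
import hashlib

DOMAIN = b"arboricx.merkle.node.v1"

def leaf_node() -> bytes:
    return b"\x00"                       # NLeaf -> 0x00

def stem_node(child_hash: bytes) -> bytes:
    assert len(child_hash) == 32
    return b"\x01" + child_hash          # NStem(h) -> 0x01 || h

def fork_node(left_hash: bytes, right_hash: bytes) -> bytes:
    assert len(left_hash) == len(right_hash) == 32
    return b"\x02" + left_hash + right_hash  # NFork(l,r) -> 0x02 || l || r

def merkle_hash(serialized_node: bytes) -> bytes:
    # hash = SHA256(domain <> 0x00 <> serialized_node)
    return hashlib.sha256(DOMAIN + b"\x00" + serialized_node).digest()

leaf_h = merkle_hash(leaf_node())
stem_h = merkle_hash(stem_node(leaf_h))          # hash of Stem Leaf
fork_h = merkle_hash(fork_node(leaf_h, leaf_h))  # hash of Fork Leaf Leaf
```

Children are referenced by their 32-byte hashes, so any two structurally equal subtrees share one DAG node.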
## 7. Arboricx Portable Bundles (`.arboricx`)
Portable executable bundles are generated via `Wire.hs`. See `docs/arboricx-bundle-format.md` for the full binary format spec.
```bash
# Export a bundle from the content store
./result/bin/tricu export -o myterm.arboricx myterm
# Import a library, then export one of its terms (requires TRICU_DB_PATH)
./result/bin/tricu import -f lib/list.tri
TRICU_DB_PATH=/tmp/tricu.db ./result/bin/tricu export -o list_ops.arboricx append
```
## 8. Zig Arboricx Host (`ext/zig/`)
The Zig host is a fast implementation for running Arboricx bundles. It provides a native bundle parser and arena-based evaluator.
### Modules
| File | Role |
|------|------|
| `src/main.zig` | CLI entrypoint — default native path, `--kernel` fallback |
| `src/bundle.zig` | Native Arboricx bundle parser — verifies digests, hashes, loads DAG into arena |
| `src/c_abi.zig` | C FFI exports — `arboricx_init`, tree constructors, codecs, reduction, bundle loading |
| `src/reduce.zig` | WHNF reducer (Tree Calculus `apply` rules) |
| `src/arena.zig` | Node arena (`ArrayListUnmanaged`) |
| `src/tree.zig` | `Node` union + iterative `copyTree` |
| `src/codecs.zig` | Number/string/list/bytes encoding + result unwrapping |
| `src/kernel.zig` | Embeds DAG kernel into arena (fallback path only) |
| `src/ternary.zig` | Ternary string parser for Tree Calculus terms |
| `tools/gen_kernel.zig` | Build-time tool: converts `.dag` → `kernel_embed.zig` |
| `include/arboricx.h` | C header for `libarboricx` |
### C ABI
Key functions:
```c
arb_ctx_t* arboricx_init(void);
uint32_t arb_load_bundle(arb_ctx_t*, const uint8_t* bytes, size_t len, const char* name);
uint32_t arb_load_bundle_default(arb_ctx_t*, const uint8_t* bytes, size_t len);
uint32_t arb_reduce(arb_ctx_t*, uint32_t root, uint64_t fuel);
```
`arb_reduce` evaluates in a **fresh scratch arena** so garbage never accumulates.
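A hypothetical ctypes wrapper for these entry points; only the four prototypes above are taken from the C ABI, while the driver function, its paths, and the fuel default are illustrative:

```python
import ctypes

def bind(lib):
    """Attach the C ABI signatures listed above to a loaded library handle."""
    lib.arboricx_init.restype = ctypes.c_void_p
    lib.arboricx_init.argtypes = []
    lib.arb_load_bundle.restype = ctypes.c_uint32
    lib.arb_load_bundle.argtypes = [
        ctypes.c_void_p, ctypes.POINTER(ctypes.c_uint8),
        ctypes.c_size_t, ctypes.c_char_p,
    ]
    lib.arb_load_bundle_default.restype = ctypes.c_uint32
    lib.arb_load_bundle_default.argtypes = [
        ctypes.c_void_p, ctypes.POINTER(ctypes.c_uint8), ctypes.c_size_t,
    ]
    lib.arb_reduce.restype = ctypes.c_uint32
    lib.arb_reduce.argtypes = [ctypes.c_void_p, ctypes.c_uint32, ctypes.c_uint64]
    return lib

def reduce_bundle(lib_path, bundle_path, fuel=10**9):
    # Illustrative driver: load libarboricx.so, hand it a bundle, reduce the root.
    lib = bind(ctypes.CDLL(lib_path))
    data = open(bundle_path, "rb").read()
    buf = (ctypes.c_uint8 * len(data)).from_buffer_copy(data)
    ctx = lib.arboricx_init()
    root = lib.arb_load_bundle_default(ctx, buf, len(data))
    return lib.arb_reduce(ctx, root, fuel)
```

See `tests/python_ffi_test.py` for the real FFI usage exercised by `nix build .#tricu-zig-tests`.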
### Stack size requirement
Tree Calculus reduction is deeply recursive. Assume a segfault is a stack limitation until proven otherwise, and raise the stack limit before debugging anything else:
```bash
ulimit -s 32768 # 32 MB
```
### Performance comparison
| Fixture | Native path | Kernel path (`--kernel`) |
|---------|-------------|--------------------------|
| `append "hello " "world"` | **~0.007 s** | ~3.4 s |
| `id "hello"` | **~0.005 s** | ~0.38 s |
The kernel path is kept as a "cool but useless" fallback — the DAG is tiny (~30 KB) so the cost is negligible.
## 9. Nix Flake Outputs
| Output | Description |
|--------|-------------|
| `packages.default` / `packages.tricu` | Haskell tricu package |
| `packages.tricu-zig` | Zig CLI + `libarboricx.a` + `libarboricx.so` + `arboricx.h` |
| `packages.tricu-zig-tests` | **Separate test target** — C ABI + native bundle + Python FFI tests |
| `packages.tricu-php` | PHP source + `libarboricx.so` + `tricu-php` wrapper script |
| `packages.tricu-php-tests` | **Separate test target** — PHP FFI tests against fixture bundles |
| `packages.tricu-container` | Docker image |
| `checks.default` / `checks.tricu` | Haskell test suite via Tasty/HUnit |
`tricu-zig-tests` is deliberately **not** in `checks` so `nix flake check` remains fast.
## 10. Directory Layout
```
tricu/
├── flake.nix                  # Nix flake: packages, tests, devShell
├── tricu.cabal                # Cabal package (used via callCabal2nix)
├── AGENTS.md                  # This file
├── src/                       # Haskell modules
│   ├── Main.hs
│   ├── Eval.hs
│   ├── Parser.hs
│   ├── Lexer.hs
│   ├── FileEval.hs
│   ├── REPL.hs
│   ├── Research.hs
│   ├── ContentStore.hs
│   └── Wire.hs
├── test/
│   ├── Spec.hs                # Tasty + HUnit tests
│   ├── *.tri                  # tricu test programs
│   ├── *.arboricx             # Arboricx bundle fixtures
│   └── local-ns/              # Module namespace test files
├── lib/
│   ├── base.tri
│   ├── list.tri
│   └── patterns.tri
├── demos/
│   ├── equality.tri
│   ├── size.tri
│   ├── toSource.tri
│   ├── levelOrderTraversal.tri
│   └── patternMatching.tri
├── ext/                       # Multi-language Arboricx hosts
│   ├── js/                    # Node.js bundle parser + reducer
│   │   ├── src/
│   │   │   ├── bundle.js
│   │   │   ├── manifest.js
│   │   │   ├── merkle.js
│   │   │   ├── tree.js
│   │   │   ├── codecs.js
│   │   │   └── cli.js
│   │   └── test/
│   ├── php/                   # PHP FFI host for libarboricx.so
│   │   ├── src/
│   │   │   └── ffi.php
│   │   └── run.php
│   └── zig/                   # Zig production host
│       ├── build.zig
│       ├── build.zig.zon
│       ├── kernel_run_arboricx_typed.dag
│       ├── include/arboricx.h
│       ├── src/
│       │   ├── main.zig
│       │   ├── bundle.zig
│       │   ├── c_abi.zig
│       │   ├── codecs.zig
│       │   ├── kernel.zig
│       │   ├── reduce.zig
│       │   ├── arena.zig
│       │   ├── tree.zig
│       │   └── ternary.zig
│       ├── tests/
│       │   ├── c_abi_test.c
│       │   ├── c_abi_append_test.c
│       │   ├── native_bundle_append_test.c
│       │   ├── native_bundle_id_test.c
│       │   ├── native_bundle_bools_test.c
│       │   └── python_ffi_test.py
│       └── tools/
│           └── gen_kernel.zig
└── docs/
    └── arboricx-bundle-format.md
```
## 11. Content Store Workflow (Custom DB)
The content store location is controlled by the `TRICU_DB_PATH` environment variable. When set, `eval` mode automatically loads all stored terms into the initial environment, so you can call any previously imported/evaluated term by name.
```bash
# Use a local DB
export TRICU_DB_PATH=/tmp/tricu-local.db
# Import terms from the standard library
./result/bin/tricu import -f lib/list.tri
# Now use them in eval mode
echo "not? (t t)" | ./result/bin/tricu eval -t decode
# Output: t
echo "not? (t t t)" | ./result/bin/tricu eval -t decode
# Output: Stem Leaf
echo "equal? (t t) (t t t)" | ./result/bin/tricu eval -t decode
# Output: t
# Check what's in the store
./result/bin/tricu
t> !definitions
```
Without `TRICU_DB_PATH` set, `eval` uses only the terms defined in the input file(s).
## 12. Development Tips
- **REPL:** `nix run .#` starts the interactive tricu REPL.
- **Evaluate files:** `nix run .# -- eval -f demos/equality.tri`
- **Zig host:** `nix build .#tricu-zig` then `./result/bin/tricu-zig <bundle> [args...]`
- **Zig tests:** `nix build .#tricu-zig-tests`
- **GHC options:** `-threaded -rtsopts -with-rtsopts=-N` builds with the threaded runtime and uses all cores by default (`-N`).
- **Upx** is in the devShell for binary compression if needed.
## 13. Viewing Haskell Dependency Docs from Nix
When you need Haddock documentation for a Haskell dependency available in Nixpkgs, build the package's `doc` output directly with `^doc`.
Example (substitute the dependency you need for `megaparsec`):
```sh
pkg=megaparsec
nix build "nixpkgs#haskellPackages.${pkg}^doc"
```
View the available documentation files:
```sh
find ./result-doc -type f \( -name '*.html' -o -name '*.haddock' \) | sort
```