Small host execution ergos

2026-05-09 18:18:25 -05:00
parent 2773109b87
commit d0886ad886
4 changed files with 510 additions and 26 deletions


@@ -0,0 +1,384 @@
# Self-hosted Arboricx Host Prototype
This document describes how to build a minimal host-language shell that can execute Arboricx bundles through the self-hosted tricu Arboricx parser/executor.
The intended reader is an implementation agent building a first prototype in a host language such as PHP. The same approach should generalize to any language with a small Tree Calculus evaluator.
## Goal
Build a tiny host program that can:
1. Represent Tree Calculus values.
2. Reduce/evaluate Tree Calculus terms.
3. Load or embed the tricu Arboricx runtime kernel.
4. Read an application `.arboricx` bundle from disk.
5. Convert host inputs into canonical Tree Calculus values.
6. Apply the kernel to the application bundle and arguments.
7. Decode the result back into host values.
A concrete target example:
```tricu
-- Application bundle root is an unapplied function:
append "hello "
```
The host should be able to call that bundle with the host string `"james"` and receive:
```text
hello james
```
Conceptually the host evaluates:
```tricu
runArboricxArgs <applicationBundleBytes> ["james"]
```
where `runArboricxArgs` comes from the self-hosted Arboricx runtime kernel.
## Architectural overview
There are two Arboricx bundles involved:
1. **Kernel bundle**
   - Contains the self-hosted Arboricx parser/executor written in tricu.
   - Exposes ergonomic runtime entrypoints such as `runArboricxArgs`.
   - This can be hardcoded as a Tree Calculus value in the host, or loaded by a minimal host-side Arboricx parser.
2. **Application bundle**
   - The bundle the user wants to execute.
   - Example: a bundle whose exported root is `append "hello "`, waiting for one more string argument.
   - The host reads this file as raw bytes and encodes those bytes as a Tree Calculus byte list.
The minimal host does **not** need to understand the application bundle format if the kernel is already available as a Tree Calculus value. The host only passes the application bundle bytes to the kernel.
## Required host components
### 1. Tree representation
The host needs a representation for the three Tree Calculus constructors:
```text
Leaf
Stem child
Fork left right
```
Use whatever is idiomatic for the host language. In PHP, for a prototype, simple classes or tagged arrays are sufficient.
Example shape:
```php
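abstract class T {}
final class Leaf extends T {}
// Constructor promotion (PHP 8.0+) declares and initializes each property.
final class Stem extends T { public function __construct(public T $child) {} }
final class Fork extends T { public function __construct(public T $left, public T $right) {} }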
```
or tagged arrays:
```php
['tag' => 'leaf']
['tag' => 'stem', 'child' => $t]
['tag' => 'fork', 'left' => $l, 'right' => $r]
```
The evaluator and codecs only need these three constructors.
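For hosts other than PHP, the same three constructors can be sketched in a few lines. The Python version below is only an illustration: the class names mirror the constructors above, while the `show` helper and its `t`-based output format are hypothetical debugging conveniences, not part of the tricu codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Leaf:
    """The bare node."""


@dataclass(frozen=True)
class Stem:
    """A node with exactly one child."""
    child: object


@dataclass(frozen=True)
class Fork:
    """A node with a left and a right child."""
    left: object
    right: object


def show(tree):
    """Render a tree compactly for debugging (hypothetical helper)."""
    if isinstance(tree, Leaf):
        return "t"
    if isinstance(tree, Stem):
        return "(t " + show(tree.child) + ")"
    return "(t " + show(tree.left) + " " + show(tree.right) + ")"
```

Frozen dataclasses give structural equality and immutability without extra code, which is convenient when comparing a reduced tree against an expected one in tests.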
### 2. Tree Calculus evaluator
The host must implement Tree Calculus reduction. This is the core VM.
The evaluator should use normal-order evaluation, matching the runtime semantics expected by Arboricx manifests:
```text
runtimeEvaluation = "normal-order"
```
The evaluator only needs the Tree Calculus reduction rules. There is no parser requirement for the host prototype if terms are constructed directly as trees.
Implementation notes:
- Evaluation must support application: a tree applied to another tree.
- In this codebase, application is represented structurally as `Fork function argument` before reduction.
- The evaluator repeatedly reduces until normal form or until a configured step/fuel limit is reached.
- Add a fuel limit for the first prototype to avoid infinite reductions during debugging.
Reference implementation locations:
- Haskell evaluator/reduction: `src/Research.hs`
- JavaScript Arboricx runtime evaluator: `ext/js/src/` if present in the checkout
Use those as references for exact reduction behavior.
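As a starting point before reading the reference evaluators, the fuel limit can be sketched as follows. This is a Python illustration under stated assumptions, not the project's evaluator: it assumes the five triage-style Tree Calculus rules shown in the comments, it applies them eagerly through an `apply` method (so it does not provide the normal-order behavior required above and does not use the `Fork function argument` application encoding), and the names `Reducer` and `OutOfFuel` are hypothetical. Check every rule against `src/Research.hs` before relying on it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Leaf:
    pass


@dataclass(frozen=True)
class Stem:
    child: object


@dataclass(frozen=True)
class Fork:
    left: object
    right: object


class OutOfFuel(Exception):
    """Raised when the configured step budget is exhausted."""


class Reducer:
    def __init__(self, fuel=100_000):
        self.fuel = fuel  # remaining number of rule applications

    def apply(self, f, x):
        """Apply tree f to tree x; both are assumed to be fully reduced."""
        if self.fuel <= 0:
            raise OutOfFuel()
        self.fuel -= 1
        if isinstance(f, Leaf):        # t x           -> Stem x
            return Stem(x)
        if isinstance(f, Stem):        # (t a) x       -> Fork a x
            return Fork(f.child, x)
        head, c = f.left, f.right
        if isinstance(head, Leaf):     # t t c x       -> c
            return c
        if isinstance(head, Stem):     # t (t a) c x   -> (a x) (c x)
            return self.apply(self.apply(head.child, x), self.apply(c, x))
        # head is Fork(a, b): dispatch on the shape of the argument x.
        if isinstance(x, Leaf):        # t (t a b) c t       -> a
            return head.left
        if isinstance(x, Stem):        # t (t a b) c (t u)   -> b u
            return self.apply(head.right, x.child)
        # t (t a b) c (t u v) -> c u v
        return self.apply(self.apply(c, x.left), x.right)
```

Because this sketch is eager and recursive, it can diverge or hit Python's recursion limit on terms that a normal-order evaluator would finish; a real host would defer argument evaluation and likely use an explicit stack.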
### 3. Kernel availability
The host needs access to the self-hosted Arboricx runtime kernel as a Tree Calculus value.
There are two viable bootstrap strategies.
#### Strategy A: hardcode the kernel tree
This is the recommended strategy for the first host prototype.
Workflow:
1. Compile/export the tricu kernel entrypoint as an Arboricx bundle or tree value.
2. Convert the selected exported kernel function into a host-language Tree Calculus literal.
3. Commit/embed that literal in the host implementation.
Then the host does not need any Arboricx parser of its own for the kernel. It only needs Tree Calculus reduction.
#### Strategy B: bootstrap the kernel from an Arboricx bundle
Alternatively, the host can implement a minimal Arboricx parser just sufficient to load the kernel bundle.
This is more work up front, but avoids hardcoding a huge tree literal.
If using this strategy, the host-side parser needs to:
1. Parse the Arboricx container.
2. Parse enough manifest/export data to locate the desired kernel export.
3. Parse node records.
4. Reconstruct the selected root Tree Calculus value from the Merkle node DAG.
This duplicates logic that the self-hosted kernel already implements in tricu, so the hardcoded-kernel path is simpler for early ports.
## Kernel entrypoints
The ergonomic runtime API currently lives in `lib/arboricx.tri`.
Primary entrypoints:
```tricu
readArboricxExecutableByName nameBytes bundleBytes
readArboricxExecutable bundleBytes
runArboricxByName nameBytes bundleBytes arg
runArboricx bundleBytes arg
runArboricxArgsByName nameBytes bundleBytes args
runArboricxArgs bundleBytes args
```
Recommended host entrypoint:
```tricu
runArboricxArgs
```
It accepts:
1. Raw application bundle bytes as a Tree Calculus byte list.
2. A Tree Calculus list of arguments.
It returns a value wrapped in the `ok` / `err` result convention described under "Result convention".
For named exports, use:
```tricu
runArboricxArgsByName
```
It accepts:
1. Export name as bytes.
2. Application bundle bytes as a Tree Calculus byte list.
3. Argument list.
### Applying the kernel in the host evaluator
If the host has the Tree Calculus value for `runArboricxArgs`, call it by constructing nested application trees.
In Tree Calculus application form:
```text
((runArboricxArgs bundleBytesTree) argsTree)
```
Structurally, if `app(f, x)` constructs `Fork(f, x)`, then:
```php
$expr = app(app($kernelRunArboricxArgs, $bundleBytesTree), $argsTree);
$result = normalize($expr);
```
For named export execution:
```text
(((runArboricxArgsByName nameBytesTree) bundleBytesTree) argsTree)
```
Structurally:
```php
$expr = app(
    app(
        app($kernelRunArboricxArgsByName, $nameBytesTree),
        $bundleBytesTree
    ),
    $argsTree
);
$result = normalize($expr);
```
## Result convention
The runtime API returns results using the tricu `ok` / `err` convention from `lib/binary.tri`:
```tricu
ok value rest = pair true (pair value rest)
err code rest = pair false (pair code rest)
```
The host should unwrap this result before decoding the final value.
Expected success shape:
```tricu
ok value rest
```
For typical execution, `value` is the application result. `rest` is usually not important to the host shell unless debugging parser behavior.
Expected error shape:
```tricu
err code rest
```
The error code is a Tree Calculus number. Error constants are defined in:
- `lib/binary.tri`
- `lib/arboricx-common.tri`
A prototype host can simply report the numeric error code and optionally dump a compact representation of `rest`.
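Unwrapping can be sketched as a small pattern match on the result tree. The Python sketch below rests on assumptions that should be checked against the tricu library: that `pair a b` is the tree `Fork a b`, that `true` is `Stem Leaf`, and that `false` is `Leaf`. The function name `unwrap_result` is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Leaf:
    pass


@dataclass(frozen=True)
class Stem:
    child: object


@dataclass(frozen=True)
class Fork:
    left: object
    right: object


TRUE = Stem(Leaf())   # assumed encoding of tricu `true`
FALSE = Leaf()        # assumed encoding of tricu `false`


def unwrap_result(tree):
    """Split `pair flag (pair payload rest)` into (is_ok, payload, rest).

    On success the payload is the value; on failure it is the error code.
    """
    if not (isinstance(tree, Fork) and isinstance(tree.right, Fork)):
        raise ValueError("tree does not have the ok/err result shape")
    flag, payload, rest = tree.left, tree.right.left, tree.right.right
    if flag == TRUE:
        return True, payload, rest
    if flag == FALSE:
        return False, payload, rest
    raise ValueError("result flag is neither true nor false")
```

Raising on a malformed shape, rather than guessing, makes evaluator bugs visible early while the host is still being debugged.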
## Example execution flow
Suppose the application bundle exports this root:
```tricu
append "hello "
```
The bundle root is an unapplied function waiting for one more string argument.
Host flow:
1. Load kernel entrypoint tree:
```php
$runArboricxArgs = loadHardcodedKernelEntrypoint('runArboricxArgs');
```
2. Read application bundle bytes:
```php
$bytes = file_get_contents('append-hello.arboricx');
if ($bytes === false) { throw new RuntimeException('Cannot read bundle file'); }
```
3. Encode bundle bytes as a Tree Calculus byte list:
```php
$bundleBytesTree = encodeBytes($bytes);
```
4. Encode host argument(s):
```php
$arg = encodeString('james');
$args = encodeList([$arg]);
```
5. Build application expression:
```php
$expr = app(app($runArboricxArgs, $bundleBytesTree), $args);
```
6. Evaluate:
```php
$result = normalize($expr);
```
7. Unwrap `ok` result:
```php
[$ok, $value, $rest] = unwrapResult($result);
if (!$ok) { throw new RuntimeException('Arboricx error'); }
```
8. Decode the value:
```php
echo decodeString($value); // hello james
```
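The codecs used in steps 3, 4, and 8 can be sketched as follows. This Python version assumes the encodings recalled from the Haskell reference: numbers as little-endian bit lists (`Leaf` for zero, otherwise `Fork bit rest` with `Stem Leaf` for a 1 bit and `Leaf` for a 0 bit), lists as right-nested `Fork head tail` ending in `Leaf`, and strings as lists of character codes. These are assumptions to verify against `src/Research.hs`, and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Leaf:
    pass


@dataclass(frozen=True)
class Stem:
    child: object


@dataclass(frozen=True)
class Fork:
    left: object
    right: object


def encode_number(n):
    """Encode a non-negative integer, least significant bit first."""
    if n == 0:
        return Leaf()
    bit = Stem(Leaf()) if n & 1 else Leaf()
    return Fork(bit, encode_number(n >> 1))


def decode_number(tree):
    n, shift = 0, 0
    while isinstance(tree, Fork):
        if tree.left == Stem(Leaf()):
            n |= 1 << shift
        elif tree.left != Leaf():
            raise ValueError("not a number")
        shift += 1
        tree = tree.right
    if not isinstance(tree, Leaf):
        raise ValueError("not a number")
    return n


def encode_list(items):
    """Right-nested Fork chain terminated by Leaf."""
    tree = Leaf()
    for item in reversed(items):
        tree = Fork(item, tree)
    return tree


def decode_list(tree):
    items = []
    while isinstance(tree, Fork):
        items.append(tree.left)
        tree = tree.right
    if not isinstance(tree, Leaf):
        raise ValueError("not a list")
    return items


def encode_bytes(data):
    return encode_list([encode_number(b) for b in data])


def decode_bytes(tree):
    return bytes(decode_number(t) for t in decode_list(tree))


def encode_string(text):
    # Assumption: one list element per character code; for ASCII text
    # this coincides with the byte-list encoding.
    return encode_list([encode_number(ord(ch)) for ch in text])


def decode_string(tree):
    return "".join(chr(decode_number(t)) for t in decode_list(tree))
```

Under these assumptions a zero byte and an empty list both encode to `Leaf`, so decoders have to be chosen by the expected type rather than by inspecting the tree.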
## What the kernel does internally
`runArboricxArgs` performs the following steps inside Tree Calculus:
1. Parse and validate the raw Arboricx bundle bytes.
2. Parse the manifest.
3. Select the default export:
   - use export named `main` if present,
   - otherwise use the sole export if exactly one exists,
   - otherwise return an error (`errMissingSection` in the current implementation).
4. Read the nodes section.
5. Reconstruct the selected root tree from the Merkle DAG.
6. Apply each host-provided argument in order.
7. Return `ok result rest` or an `err`.
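`runArboricxArgsByName` is identical except that it selects the export with the given name. Passing an empty name falls back to the default selection above, which is how `runArboricxArgs` is defined.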
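The selection and argument-application steps can be modeled outside Tree Calculus to make the intended behavior easy to test. The Python sketch below is only a host-side model, not the kernel: exports are represented as hypothetical `(name, root)` pairs, results as `("ok", ...)` / `("err", ...)` tuples, and the error code as a string stand-in for the numeric `errMissingSection` constant.

```python
from functools import reduce

# Stand-in for the numeric error constant used by the kernel.
ERR_MISSING_SECTION = "errMissingSection"


def find_export(exports, wanted):
    """Return the first (name, root) pair with the given name, or None."""
    return next((e for e in exports if e[0] == wanted), None)


def select_export(exports, name=""):
    """Explicit name if given, else "main", else the sole export."""
    if name:
        found = find_export(exports, name)
        if found is None:
            return ("err", ERR_MISSING_SECTION)
        return ("ok", found)
    main = find_export(exports, "main")
    if main is not None:
        return ("ok", main)
    if len(exports) == 1:
        return ("ok", exports[0])
    return ("err", ERR_MISSING_SECTION)


def apply_args(apply, f, args):
    """Left fold: apply f to each argument in order."""
    return reduce(apply, args, f)
```

For example, with `apply = lambda f, x: f(x)` and a two-argument function that returns its first argument, `apply_args` applied to `["left", "right"]` yields `"left"`, which mirrors the `runArboricxArgs` test case in `test/Spec.hs`.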
## Tests proving the expected behavior
The relevant Haskell tests are in `test/Spec.hs` under `manifestReadingTests`.
Important cases:
- `readArboricxExecutable: reconstructs default export tree`
- `readArboricxExecutableByName: selects named export`
- `runArboricx: applies host-provided argument to default export`
- `runArboricxArgs: applies host-provided argument list in order`
These tests demonstrate the host-shell contract:
- application bundle bytes are supplied as a Tree Calculus byte list,
- host arguments are supplied as canonical Tree Calculus values,
- execution returns a result-wrapped Tree Calculus value.
## Minimal PHP prototype checklist
A PHP prototype should implement:
- [ ] Tree data constructors: `Leaf`, `Stem`, `Fork`.
- [ ] Application helper: `app($f, $x) = Fork($f, $x)`.
- [ ] Normal-order Tree Calculus reducer.
- [ ] Fuel/step limit for debugging.
- [ ] Hardcoded kernel entrypoint tree for `runArboricxArgs`.
- [ ] Encode application bundle file bytes into a Tree Calculus byte list.
- [ ] Encode host argument values into Tree Calculus values.
- [ ] Build expression: `((runArboricxArgs bundleBytes) args)`.
- [ ] Normalize expression.
- [ ] Unwrap `ok` / `err` result.
- [ ] Decode result value into host type.
For exact codec details, reference the Haskell implementation in `src/Research.hs` and the existing JS runtime if available.
## Current recommendation
For the first PHP implementation:
1. Hardcode only the `runArboricxArgs` kernel entrypoint as a Tree Calculus value.
2. Do not implement host-side Arboricx parsing yet.
3. Implement only enough codecs for:
   - bytes,
   - strings,
   - lists,
   - result unwrapping.
4. Use one test fixture: an Arboricx bundle whose root is `append "hello "`.
5. Assert that calling it with `"james"` returns `"hello james"`.
Once that works, add named export support via `runArboricxArgsByName` and expand codecs as needed.


@@ -243,35 +243,39 @@ getExportNames_ = y (self acc exports :
getExportNames = (exports :
getExportNames_ t exports)
-- Select an export: prefer explicit name, then "main", then single, then error.
selectExport_ = y (self exports name nameBytes :
mainExportName = "main"
maybeExportToResult = (maybeExport :
triage
(err errMissingSection t)
(export : ok export t)
(_ _ : err errMissingSection t)
maybeExport)
selectSingleExport = (exports :
matchList
(err errMissingSection t)
(export rest :
matchBool
-- Explicit name given
(matchBool
nothing
(ok export t)
(err errMissingSection t)
(_ _ : nothing)
(findExportByName exports nameBytes))
-- No explicit name: try "main"
(matchBool
nothing
(matchBool
(equal? (length exports) 1)
(ok (head exports) t)
(err errMissingSection t)
(bytesEq? (exportName (head exports)) nameBytes))
(_ _ : nothing)
(findExportByName exports nameBytes))
-- Single export: auto-select
(matchBool
(equal? (length exports) 1)
(ok (head exports) t)
(err errMissingSection t)
(emptyList? exports))
(emptyList? rest))
exports)
selectDefaultExport = (exports :
triage
(selectSingleExport exports)
(export : ok export t)
(_ _ : err errMissingSection t)
(findExportByName exports mainExportName))
-- Select an export: explicit name if provided, otherwise "main", otherwise
-- the sole export if the bundle has exactly one export.
selectExport = (exports nameBytes :
selectExport_ exports nameBytes nameBytes)
matchBool
(selectDefaultExport exports)
(maybeExportToResult (findExportByName exports nameBytes))
(emptyList? nameBytes))
selectExportOpt = (exports optNameBytes :
selectExport exports optNameBytes)
@@ -304,7 +308,7 @@ manifestRuntimeAbi = (core : pairFirst (pairSecond (pairSecond (pairSecond (pair
manifestCapabilities = (core : pairFirst (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond core))))))))))
manifestClosureByte = (core : pairFirst (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond core)))))))))))
manifestRoots = (core : pairFirst (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond core))))))))))))
manifestExports = (core : pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond core)))))))))))
manifestExports = (core : pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond core))))))))))))
-- Helper: compare a manifest field against an expected byte string.
manifestFieldMatch? = (actual expected : bytesEq? actual expected)


@@ -16,3 +16,38 @@ readArboricxBundle = (bs :
(validCore _ : ok (pair validCore metadataWithExtensions) afterContainer))
parsedManifest))
sections))
-- Select an export from a validated bundle and reconstruct its root tree.
-- Returns ok executable afterContainer, or propagates parse/selection/node errors.
readArboricxExecutableByName = (nameBytes bs :
bindResult (readArboricxBundle bs)
(bundleResult afterBundle :
matchPair
(validCore _ :
bindResult (selectExport (manifestExports validCore) nameBytes)
(selectedExport _ :
readArboricxTreeFromHash (exportRoot selectedExport) bs))
bundleResult))
readArboricxExecutable = (bs :
readArboricxExecutableByName [] bs)
applyArgs = (f args :
foldl
(acc arg : acc arg)
f
args)
runArboricxByName = (nameBytes bs arg :
bindResult (readArboricxExecutableByName nameBytes bs)
(executable rest : ok (executable arg) rest))
runArboricx = (bs arg :
runArboricxByName [] bs arg)
runArboricxArgsByName = (nameBytes bs args :
bindResult (readArboricxExecutableByName nameBytes bs)
(executable rest : ok (applyArgs executable args) rest))
runArboricxArgs = (bs args :
runArboricxArgsByName [] bs args)


@@ -2734,4 +2734,65 @@ manifestReadingTests = testGroup "Manifest Reading Tests"
let env = evalTricu library (parseTricu input)
let algoT = result env
toString algoT @?= Right "sha256"
, testCase "readArboricxExecutable: reconstructs default export tree" $ do
(srcConn, termHash, originalTerm) <- storeTermInTempDB $ unlines
[ "main = t t" ]
wireData <- exportBundle srcConn [termHash]
let input = "matchResult "
++ " (code rest : err code rest) "
++ " (tree rest : ok tree []) "
++ " (readArboricxExecutable " ++ bytesExpr (map toInteger $ BS.unpack wireData) ++ ")"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
result env @?= okT originalTerm (bytesT [])
close srcConn
, testCase "readArboricxExecutableByName: selects named export" $ do
srcConn <- newContentStore
let parsed = parseTricu $ unlines
[ "leaf = t"
, "stem = t t"
, "main = stem"
]
env = evalTricu Map.empty parsed
leafTerm = maybe (error "leaf missing") id (Map.lookup "leaf" env)
stemTerm = maybe (error "stem missing") id (Map.lookup "stem" env)
leafHash <- storeTerm srcConn ["leaf"] leafTerm
stemHash <- storeTerm srcConn ["stem"] stemTerm
wireData <- exportNamedBundle srcConn [("leaf", leafHash), ("stem", stemHash)]
let input = "matchResult "
++ " (code rest : err code rest) "
++ " (tree rest : ok tree []) "
++ " (readArboricxExecutableByName " ++ bytesExpr (map (fromIntegral . fromEnum) "stem") ++ " " ++ bytesExpr (map toInteger $ BS.unpack wireData) ++ ")"
library <- evaluateFile "./lib/arboricx.tri"
let resultEnv = evalTricu library (parseTricu input)
result resultEnv @?= okT stemTerm (bytesT [])
close srcConn
, testCase "runArboricx: applies host-provided argument to default export" $ do
(srcConn, termHash, _) <- storeTermInTempDB $ unlines
[ "main = (x : x)" ]
wireData <- exportBundle srcConn [termHash]
let input = "matchResult "
++ " (code rest : err code rest) "
++ " (value rest : value) "
++ " (runArboricx " ++ bytesExpr (map toInteger $ BS.unpack wireData) ++ " \"hello\")"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
toString (result env) @?= Right "hello"
close srcConn
, testCase "runArboricxArgs: applies host-provided argument list in order" $ do
(srcConn, termHash, _) <- storeTermInTempDB $ unlines
[ "main = (x y : x)" ]
wireData <- exportBundle srcConn [termHash]
let input = "matchResult "
++ " (code rest : err code rest) "
++ " (value rest : value) "
++ " (runArboricxArgs " ++ bytesExpr (map toInteger $ BS.unpack wireData) ++ " [(\"left\") (\"right\")])"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
toString (result env) @?= Right "left"
close srcConn
]