Small host execution ergos
docs/self-hosted-arboricx-host.md

# Self-hosted Arboricx Host Prototype

This document describes how to build a minimal host-language shell that can execute Arboricx bundles through the self-hosted tricu Arboricx parser/executor.

The intended reader is an implementation agent building a first prototype in a host language such as PHP. The same approach should generalize to any language with a small Tree Calculus evaluator.

## Goal

Build a tiny host program that can:

1. Represent Tree Calculus values.
2. Reduce/evaluate Tree Calculus terms.
3. Load or embed the tricu Arboricx runtime kernel.
4. Read an application `.arboricx` bundle from disk.
5. Convert host inputs into canonical Tree Calculus values.
6. Apply the kernel to the application bundle and arguments.
7. Decode the result back into host values.

A concrete target example:

```tricu
-- Application bundle root is an unapplied function:
append "hello "
```

The host should be able to call that bundle with the host string `"james"` and receive:

```text
hello james
```

Conceptually the host evaluates:

```tricu
runArboricxArgs <applicationBundleBytes> ["james"]
```

where `runArboricxArgs` comes from the self-hosted Arboricx runtime kernel.

## Architectural overview

There are two Arboricx bundles involved:

1. **Kernel bundle**
   - Contains the self-hosted Arboricx parser/executor written in tricu.
   - Exposes ergonomic runtime entrypoints such as `runArboricxArgs`.
   - This can be hardcoded as a Tree Calculus value in the host, or loaded by a minimal host-side Arboricx parser.

2. **Application bundle**
   - The bundle the user wants to execute.
   - Example: a bundle whose exported root is `append "hello "`, waiting for one more string argument.
   - The host reads this file as raw bytes and encodes those bytes as a Tree Calculus byte list.

The minimal host does **not** need to understand the application bundle format if the kernel is already available as a Tree Calculus value. The host only passes the application bundle bytes to the kernel.

## Required host components

### 1. Tree representation

The host needs a representation for the three Tree Calculus constructors:

```text
Leaf
Stem child
Fork left right
```

Use whatever is idiomatic for the host language. In PHP, for a prototype, simple classes or tagged arrays are sufficient.

Example shape:

```php
// PHP 8.0+ constructor promotion; on 7.4, declare typed properties and assign them in a constructor.
abstract class T {}
final class Leaf extends T {}
final class Stem extends T { public function __construct(public T $child) {} }
final class Fork extends T { public function __construct(public T $left, public T $right) {} }
```

or tagged arrays:

```php
['tag' => 'leaf']
['tag' => 'stem', 'child' => $t]
['tag' => 'fork', 'left' => $l, 'right' => $r]
```

The evaluator and codecs only need these three constructors.

### 2. Tree Calculus evaluator

The host must implement Tree Calculus reduction. This is the core VM.

The evaluator should use normal-order evaluation, matching the runtime semantics expected by Arboricx manifests:

```text
runtimeEvaluation = "normal-order"
```

The evaluator only needs the Tree Calculus reduction rules. There is no parser requirement for the host prototype if terms are constructed directly as trees.

Implementation notes:

- Evaluation must support application: a tree applied to another tree.
- In this codebase, application is represented structurally as `Fork function argument` before reduction.
- The evaluator repeatedly reduces until normal form or until a configured step/fuel limit is reached.
- Add a fuel limit for the first prototype to avoid infinite reductions during debugging.

Reference implementation locations:

- Haskell evaluator/reduction: `src/Research.hs`
- JavaScript Arboricx runtime evaluator: `ext/js/src/` if present in the checkout

Use those as references for exact reduction behavior.
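
As a starting point only, the rule table can be sketched as a fuel-limited `apply` function. This sketch assumes the standard five-rule (triage) formulation of Tree Calculus and PHP 8.0 or newer. It is strict rather than normal-order, so it can spend fuel on (or diverge in) subterms that a normal-order reducer would discard. It also exposes reduction as a direct `apply($f, $x)` call instead of building an application node and normalizing it. Both points, and the exact rules, should be checked against `src/Research.hs` before relying on it. The names `apply` and `OutOfFuel` are hypothetical.

```php
<?php
declare(strict_types=1);

abstract class T {}
final class Leaf extends T {}
final class Stem extends T { public function __construct(public T $child) {} }
final class Fork extends T { public function __construct(public T $left, public T $right) {} }

final class OutOfFuel extends RuntimeException {}

/**
 * Apply $f to $x. Rule comments write Leaf as "t".
 * $fuel is shared by reference and decremented once per application.
 * ASSUMPTION: standard triage rules, strict evaluation (not normal-order).
 */
function apply(T $f, T $x, int &$fuel): T
{
    if ($fuel-- <= 0) {
        throw new OutOfFuel('step limit reached');
    }
    if ($f instanceof Leaf) {
        return new Stem($x);                 // t x is already a value
    }
    if ($f instanceof Stem) {
        return new Fork($f->child, $x);      // t a x is already a value
    }
    // $f is Fork($a, $b): dispatch on the shape of $a.
    $a = $f->left;
    $b = $f->right;
    if ($a instanceof Leaf) {
        return $b;                           // t t b x -> b
    }
    if ($a instanceof Stem) {                // t (t c) b x -> c x (b x)
        $cx = apply($a->child, $x, $fuel);
        $bx = apply($b, $x, $fuel);          // evaluated eagerly in this sketch
        return apply($cx, $bx, $fuel);
    }
    // $a is Fork($w, $v): triage on the argument $x.
    if ($x instanceof Leaf) {
        return $a->left;                                      // leaf case -> w
    }
    if ($x instanceof Stem) {
        return apply($a->right, $x->child, $fuel);            // stem case -> v u
    }
    return apply(apply($b, $x->left, $fuel), $x->right, $fuel); // fork case -> b u u'
}
```

With `$fuel = 1000;` and `$k = new Stem(new Leaf());`, the call `apply(apply($k, $x, $fuel), $y, $fuel)` returns `$x` for any trees `$x` and `$y`, which is a quick smoke test for the first rule.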

### 3. Kernel availability

The host needs access to the self-hosted Arboricx runtime kernel as a Tree Calculus value.

There are two viable bootstrap strategies.

#### Strategy A: hardcode the kernel tree

For the first host prototype, this is recommended.

Workflow:

1. Compile/export the tricu kernel entrypoint as an Arboricx bundle or tree value.
2. Convert the selected exported kernel function into a host-language Tree Calculus literal.
3. Commit/embed that literal in the host implementation.

Then the host does not need any Arboricx parser of its own for the kernel. It only needs Tree Calculus reduction.
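
One way to carry out step 2 of the workflow is to store the kernel as a compact string and parse it at startup. The sketch below uses a made-up preorder text format (`0` for Leaf, `1` for Stem, `2` for Fork). That format, and the names `treeToLiteral`, `parseTree`, and `literalToTree`, are assumptions for illustration and not an Arboricx or tricu format. A flat preorder string also does not preserve sharing, so a kernel exported from a DAG may expand considerably.

```php
<?php
declare(strict_types=1);

abstract class T {}
final class Leaf extends T {}
final class Stem extends T { public function __construct(public T $child) {} }
final class Fork extends T { public function __construct(public T $left, public T $right) {} }

/** Serialize a tree in preorder: "0" = Leaf, "1" = Stem, "2" = Fork (hypothetical format). */
function treeToLiteral(T $t): string
{
    if ($t instanceof Leaf) {
        return '0';
    }
    if ($t instanceof Stem) {
        return '1' . treeToLiteral($t->child);
    }
    return '2' . treeToLiteral($t->left) . treeToLiteral($t->right);
}

/** Parse one tree starting at $pos and advance $pos past it. */
function parseTree(string $s, int &$pos): T
{
    if ($pos >= strlen($s)) {
        throw new InvalidArgumentException('unexpected end of literal');
    }
    $tag = $s[$pos++];
    switch ($tag) {
        case '0':
            return new Leaf();
        case '1':
            return new Stem(parseTree($s, $pos));
        case '2':
            $left = parseTree($s, $pos);
            $right = parseTree($s, $pos);
            return new Fork($left, $right);
        default:
            throw new InvalidArgumentException("bad tag '$tag' at offset " . ($pos - 1));
    }
}

/** Parse a whole literal and reject trailing characters. */
function literalToTree(string $s): T
{
    $pos = 0;
    $tree = parseTree($s, $pos);
    if ($pos !== strlen($s)) {
        throw new InvalidArgumentException('trailing characters after tree');
    }
    return $tree;
}
```

Both functions recurse once per node, so a very deep kernel tree may need an explicit stack instead. For a first prototype the recursive form keeps the embedded literal easy to regenerate and diff.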

#### Strategy B: bootstrap the kernel from an Arboricx bundle

Alternatively, the host can implement a minimal Arboricx parser just sufficient to load the kernel bundle.

This is more work up front, but avoids hardcoding a huge tree literal.

If using this strategy, the host-side parser needs to:

1. Parse the Arboricx container.
2. Parse enough manifest/export data to locate the desired kernel export.
3. Parse node records.
4. Reconstruct the selected root Tree Calculus value from the Merkle node DAG.

This is the same work the self-hosted tricu kernel already performs, so the hardcoded-kernel path is simpler for early ports.

## Kernel entrypoints

The ergonomic runtime API currently lives in `lib/arboricx.tri`.

Primary entrypoints:

```tricu
readArboricxExecutableByName nameBytes bundleBytes
readArboricxExecutable bundleBytes
runArboricxByName nameBytes bundleBytes arg
runArboricx bundleBytes arg
runArboricxArgsByName nameBytes bundleBytes args
runArboricxArgs bundleBytes args
```

Recommended host entrypoint:

```tricu
runArboricxArgs
```

It accepts:

1. Raw application bundle bytes as a Tree Calculus byte list.
2. A Tree Calculus list of arguments.

It returns a result-wrapped value.

For named exports, use:

```tricu
runArboricxArgsByName
```

It accepts:

1. Export name as a Tree Calculus byte list.
2. Application bundle bytes as a Tree Calculus byte list.
3. A Tree Calculus list of arguments.

### Applying the kernel in the host evaluator

If the host has the Tree Calculus value for `runArboricxArgs`, call it by constructing nested application trees.

In Tree Calculus application form:

```text
((runArboricxArgs bundleBytesTree) argsTree)
```

Structurally, if `app(f, x)` constructs `Fork(f, x)`, then:

```php
$expr = app(app($kernelRunArboricxArgs, $bundleBytesTree), $argsTree);
$result = normalize($expr);
```

For named export execution:

```text
(((runArboricxArgsByName nameBytesTree) bundleBytesTree) argsTree)
```

Structurally:

```php
$expr = app(
    app(
        app($kernelRunArboricxArgsByName, $nameBytesTree),
        $bundleBytesTree
    ),
    $argsTree
);
$result = normalize($expr);
```
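
The nested `app` calls above follow one pattern: fold the arguments onto the function from left to right. A small variadic helper can express that. The name `appN` is hypothetical, and the sketch keeps this document's convention that `app($f, $x)` constructs `Fork($f, $x)`.

```php
<?php
declare(strict_types=1);

abstract class T {}
final class Leaf extends T {}
final class Stem extends T { public function __construct(public T $child) {} }
final class Fork extends T { public function __construct(public T $left, public T $right) {} }

/** Structural application, following this document's convention. */
function app(T $f, T $x): T
{
    return new Fork($f, $x);
}

/** Left-nested application: appN($f, $a, $b) builds app(app($f, $a), $b). */
function appN(T $f, T ...$args): T
{
    foreach ($args as $arg) {
        $f = app($f, $arg);
    }
    return $f;
}
```

With this helper the named-export call becomes `appN($kernelRunArboricxArgsByName, $nameBytesTree, $bundleBytesTree, $argsTree)`.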

## Result convention

The runtime API returns results using the tricu `ok` / `err` convention from `lib/binary.tri`:

```tricu
ok value rest = pair true (pair value rest)
err code rest = pair false (pair code rest)
```

The host should unwrap this result before decoding the final value.

Expected success shape:

```tricu
ok value rest
```

For typical execution, `value` is the application result. `rest` is usually not important to the host shell unless debugging parser behavior.

Expected error shape:

```tricu
err code rest
```

The error code is a Tree Calculus number. Error constants are defined in:

- `lib/binary.tri`
- `lib/arboricx-common.tri`

A prototype host can simply report the numeric error code and optionally dump a compact representation of `rest`.
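
A host-side unwrapping helper might look like the sketch below. It assumes that `pair a b` normalizes to `Fork(a, b)`, that `true` is `Stem(Leaf)`, and that `false` is `Leaf`. These encodings are assumptions and should be confirmed against `lib/binary.tri` and `src/Research.hs`. The function name `unwrapResult` is illustrative.

```php
<?php
declare(strict_types=1);

abstract class T {}
final class Leaf extends T {}
final class Stem extends T { public function __construct(public T $child) {} }
final class Fork extends T { public function __construct(public T $left, public T $right) {} }

/**
 * Split a normalized result tree into [ok, valueOrCode, rest].
 * ASSUMPTIONS: pair a b = Fork(a, b); true = Stem(Leaf); false = Leaf.
 *
 * @return array{0: bool, 1: T, 2: T}
 */
function unwrapResult(T $result): array
{
    if (!($result instanceof Fork) || !($result->right instanceof Fork)) {
        throw new UnexpectedValueException('tree is not shaped like ok/err');
    }
    $flag = $result->left;
    if ($flag instanceof Leaf) {
        $ok = false;
    } elseif ($flag instanceof Stem && $flag->child instanceof Leaf) {
        $ok = true;
    } else {
        throw new UnexpectedValueException('result flag is not a boolean');
    }
    // For ok, the second element is the value; for err, it is the error code tree.
    return [$ok, $result->right->left, $result->right->right];
}
```

On the error path the second element is still a tree, so the host decodes it with its number codec before reporting it.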

## Example execution flow

Suppose the application bundle exports this root:

```tricu
append "hello "
```

The bundle root is an unapplied function waiting for one more string argument.

Host flow:

1. Load kernel entrypoint tree:

   ```php
   $runArboricxArgs = loadHardcodedKernelEntrypoint('runArboricxArgs');
   ```

2. Read application bundle bytes:

   ```php
   $bytes = file_get_contents('append-hello.arboricx');
   ```

3. Encode bundle bytes as a Tree Calculus byte list:

   ```php
   $bundleBytesTree = encodeBytes($bytes);
   ```

4. Encode host argument(s):

   ```php
   $arg = encodeString('james');
   $args = encodeList([$arg]);
   ```

5. Build application expression:

   ```php
   $expr = app(app($runArboricxArgs, $bundleBytesTree), $args);
   ```

6. Evaluate:

   ```php
   $result = normalize($expr);
   ```

7. Unwrap `ok` result:

   ```php
   [$ok, $value, $rest] = unwrapResult($result);
   if (!$ok) { throw new RuntimeException('Arboricx error'); }
   ```

8. Decode the value:

   ```php
   echo decodeString($value); // hello james
   ```
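
The codec helpers used in the flow (`encodeBytes`, `encodeString`, `encodeList`, `decodeString`) could be sketched as follows. The encodings are assumptions: numbers as little-endian bit lists with `Leaf` for 0 and `Stem(Leaf)` for 1, lists as right-nested `Fork` cells ending in `Leaf`, and strings as lists of byte values. Treating a string as its raw bytes only matches a code-point encoding for ASCII input. All of this must be verified against `src/Research.hs` before use.

```php
<?php
declare(strict_types=1);

abstract class T {}
final class Leaf extends T {}
final class Stem extends T { public function __construct(public T $child) {} }
final class Fork extends T { public function __construct(public T $left, public T $right) {} }

// ASSUMPTION: 0 is Leaf; otherwise Fork(lowBit, rest), low bit first.
function encodeNumber(int $n): T
{
    if ($n < 0) {
        throw new InvalidArgumentException('only non-negative integers are supported');
    }
    if ($n === 0) {
        return new Leaf();
    }
    $bit = ($n % 2 === 1) ? new Stem(new Leaf()) : new Leaf();
    return new Fork($bit, encodeNumber(intdiv($n, 2)));
}

function decodeNumber(T $t): int
{
    $n = 0;
    for ($i = 0; $t instanceof Fork; $i++, $t = $t->right) {
        if ($i >= 62) {
            throw new OverflowException('number too large for a PHP int');
        }
        $bit = $t->left;
        if ($bit instanceof Stem && $bit->child instanceof Leaf) {
            $n |= 1 << $i;
        } elseif (!($bit instanceof Leaf)) {
            throw new UnexpectedValueException('malformed bit');
        }
    }
    if (!($t instanceof Leaf)) {
        throw new UnexpectedValueException('malformed number');
    }
    return $n;
}

// ASSUMPTION: the empty list is Leaf; a cons cell is Fork(head, tail).
function encodeList(array $items): T
{
    $list = new Leaf();
    foreach (array_reverse($items) as $item) {
        $list = new Fork($item, $list);
    }
    return $list;
}

function decodeList(T $t): array
{
    $items = [];
    for (; $t instanceof Fork; $t = $t->right) {
        $items[] = $t->left;
    }
    if (!($t instanceof Leaf)) {
        throw new UnexpectedValueException('malformed list');
    }
    return $items;
}

function encodeBytes(string $bytes): T
{
    $items = [];
    for ($i = 0, $len = strlen($bytes); $i < $len; $i++) {
        $items[] = encodeNumber(ord($bytes[$i]));
    }
    return encodeList($items);
}

// ASSUMPTION: host strings are passed as their raw bytes.
function encodeString(string $s): T
{
    return encodeBytes($s);
}

function decodeString(T $t): string
{
    $out = '';
    foreach (decodeList($t) as $item) {
        $byte = decodeNumber($item);
        if ($byte > 255) {
            throw new UnexpectedValueException('list element is not a byte');
        }
        $out .= chr($byte);
    }
    return $out;
}
```

A round trip such as `decodeString(encodeString('james'))` is a cheap first check, but it only proves the two directions agree with each other, not with the kernel. Comparing an encoded value against a tree produced by the Haskell implementation is the stronger test.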

## What the kernel does internally

`runArboricxArgs` performs the following steps inside Tree Calculus:

1. Parse and validate the raw Arboricx bundle bytes.
2. Parse the manifest.
3. Select the default export:
   - use export named `main` if present,
   - otherwise use the sole export if exactly one exists,
   - otherwise return an error.
4. Read the nodes section.
5. Reconstruct the selected root tree from the Merkle DAG.
6. Apply each host-provided argument in order.
7. Return `ok result rest` or an `err`.

`runArboricxArgsByName` is identical except that it selects a named export.
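
The default-export rule in step 3 is small enough to mirror on the host, for example to predict in a test which export a fixture will run. The sketch below restates that rule over a plain PHP array of export names. The function name and the idea of having the names available host-side are assumptions; the kernel itself applies the rule inside Tree Calculus.

```php
<?php
declare(strict_types=1);

/**
 * Mirror of the default-export rule: prefer "main", else the sole export, else fail.
 *
 * @param string[] $exportNames export names from the manifest (hypothetical host-side view)
 */
function selectDefaultExport(array $exportNames): string
{
    if (in_array('main', $exportNames, true)) {
        return 'main';
    }
    if (count($exportNames) === 1) {
        return $exportNames[array_key_first($exportNames)];
    }
    throw new RuntimeException('no default export: need "main" or exactly one export');
}
```

Bundles with several exports and no `main` fall into the error branch, which is the case where `runArboricxArgsByName` is required.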

## Tests proving the expected behavior

The relevant Haskell tests are in `test/Spec.hs` under `manifestReadingTests`.

Important cases:

- `readArboricxExecutable: reconstructs default export tree`
- `readArboricxExecutableByName: selects named export`
- `runArboricx: applies host-provided argument to default export`
- `runArboricxArgs: applies host-provided argument list in order`

These tests demonstrate the host-shell contract:

- application bundle bytes are supplied as a Tree Calculus byte list,
- host arguments are supplied as canonical Tree Calculus values,
- execution returns a result-wrapped Tree Calculus value.

## Minimal PHP prototype checklist

A PHP prototype should implement:

- [ ] Tree data constructors: `Leaf`, `Stem`, `Fork`.
- [ ] Application helper: `app($f, $x) = Fork($f, $x)`.
- [ ] Normal-order Tree Calculus reducer.
- [ ] Fuel/step limit for debugging.
- [ ] Hardcoded kernel entrypoint tree for `runArboricxArgs`.
- [ ] Encode application bundle file bytes into a Tree Calculus byte list.
- [ ] Encode host argument values into Tree Calculus values.
- [ ] Build expression: `((runArboricxArgs bundleBytes) args)`.
- [ ] Normalize expression.
- [ ] Unwrap `ok` / `err` result.
- [ ] Decode result value into host type.

For exact codec details, reference the Haskell implementation in `src/Research.hs` and the existing JS runtime if available.

## Current recommendation

For the first PHP implementation:

1. Hardcode only the `runArboricxArgs` kernel entrypoint as a Tree Calculus value.
2. Do not implement host-side Arboricx parsing yet.
3. Implement only enough codecs for:
   - bytes,
   - strings,
   - lists,
   - result unwrapping.
4. Use one test fixture: an Arboricx bundle whose root is `append "hello "`.
5. Assert that calling it with `"james"` returns `"hello james"`.

Once that works, add named export support via `runArboricxArgsByName` and expand codecs as needed.