# Self-hosted Arboricx Host Prototype

This document describes how to build a minimal host-language shell that can execute Arboricx bundles through the self-hosted tricu Arboricx parser/executor.

The intended reader is an implementation agent building a first prototype in a host language such as PHP. The same approach should generalize to any language with a small Tree Calculus evaluator.

## Goal

Build a tiny host program that can:

1. Represent Tree Calculus values.
2. Reduce/evaluate Tree Calculus terms.
3. Load or embed the tricu Arboricx runtime kernel.
4. Read an application `.arboricx` bundle from disk.
5. Convert host inputs into canonical Tree Calculus values.
6. Apply the kernel to the application bundle and arguments.
7. Decode the result back into host values.

A concrete target example:

```tricu
-- Application bundle root is an unapplied function:
append "hello "
```

The host should be able to call that bundle with the host string `"james"` and receive:

```text
hello james
```

Conceptually the host evaluates:

```tricu
runArboricxArgs <applicationBundleBytes> ["james"]
```

where `runArboricxArgs` comes from the self-hosted Arboricx runtime kernel.

## Architectural overview

There are two Arboricx bundles involved:

1. **Kernel bundle**
   - Contains the self-hosted Arboricx parser/executor written in tricu.
   - Exposes ergonomic runtime entrypoints such as `runArboricxArgs`.
   - This can be hardcoded as a Tree Calculus value in the host, or loaded by a minimal host-side Arboricx parser.

2. **Application bundle**
   - The bundle the user wants to execute.
   - Example: a bundle whose exported root is `append "hello "`, waiting for one more string argument.
   - The host reads this file as raw bytes and encodes those bytes as a Tree Calculus byte list.

The minimal host does **not** need to understand the application bundle format if the kernel is already available as a Tree Calculus value. The host only passes the application bundle bytes to the kernel.

## Required host components

### 1. Tree representation

The host needs a representation for the three Tree Calculus constructors:

```text
Leaf
Stem child
Fork left right
```

Use whatever is idiomatic for the host language. In PHP, for a prototype, simple classes or tagged arrays are sufficient.

Example shape:

```php
abstract class T {}
final class Leaf extends T {}
final class Stem extends T { public T $child; }
final class Fork extends T { public T $left; public T $right; }
```

or tagged arrays:

```php
['tag' => 'leaf']
['tag' => 'stem', 'child' => $t]
['tag' => 'fork', 'left' => $l, 'right' => $r]
```

The evaluator and codecs only need these three constructors.

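A small printer for the tagged-array representation can help while debugging, for example when dumping `rest` after an error. The sketch below is only an illustration: `renderTree` is a hypothetical helper, and the `t` notation it prints is chosen for readability, not taken from the kernel. It recurses, so it suits small values rather than whole bundles.

```php
<?php
// Hypothetical debugging helper (not part of the tricu codebase): renders a
// tagged-array tree as a compact string.
function renderTree(array $tree): string
{
    switch ($tree['tag']) {
        case 'leaf':
            return 't';
        case 'stem':
            return '(t ' . renderTree($tree['child']) . ')';
        case 'fork':
            return '(t ' . renderTree($tree['left']) . ' ' . renderTree($tree['right']) . ')';
        default:
            throw new InvalidArgumentException('unknown tag: ' . $tree['tag']);
    }
}

$leaf = ['tag' => 'leaf'];
$example = ['tag' => 'fork', 'left' => $leaf, 'right' => ['tag' => 'stem', 'child' => $leaf]];

echo renderTree($example), "\n"; // (t t (t t))
```
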
### 2. Tree Calculus evaluator

The host must implement Tree Calculus reduction. This is the core VM.

The evaluator should use normal-order evaluation, matching the runtime semantics expected by Arboricx manifests:

```text
runtimeEvaluation = "normal-order"
```

The evaluator only needs the Tree Calculus reduction rules. There is no parser requirement for the host prototype if terms are constructed directly as trees.

Implementation notes:

- Evaluation must support application: a tree applied to another tree.
- In this codebase, application is represented structurally as `Fork function argument` before reduction.
- The evaluator repeatedly reduces until normal form or until a configured step/fuel limit is reached.
- Add a fuel limit for the first prototype to avoid infinite reductions during debugging.

Reference implementation locations:

- Haskell evaluator/reduction: `src/Research.hs`
- JavaScript Arboricx runtime evaluator: `ext/js/src/` if present in the checkout

Use those as references for exact reduction behavior.

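The fuel limit can be kept separate from the reduction rules themselves. The following is a minimal sketch of such a driver, not the reference evaluator: `normalizeWithFuel` and `FuelExhausted` are hypothetical names, and `$step` is a placeholder for a single normal-order reduction step that must be written against `src/Research.hs`. The only assumed contract is that `$step` returns `null` when its argument is already in normal form. A toy countdown stands in for real reduction so the block runs on its own.

```php
<?php
// Hypothetical names; only the fuel-limited driver is sketched here.
final class FuelExhausted extends RuntimeException {}

// Applies $step until it reports normal form (null) or more than $fuel
// reduction steps would be needed.
function normalizeWithFuel(callable $step, $term, int $fuel)
{
    for ($used = 0; ; $used++) {
        $next = $step($term);
        if ($next === null) {
            return $term; // normal form reached
        }
        if ($used >= $fuel) {
            throw new FuelExhausted("no normal form after {$fuel} steps");
        }
        $term = $next;
    }
}

// Toy step function for demonstration only: counts an integer down to zero.
$countdown = fn(int $n): ?int => $n > 0 ? $n - 1 : null;

echo normalizeWithFuel($countdown, 3, 10), "\n"; // 0
```

With a real step function in place, a `FuelExhausted` exception during early debugging usually points at either a reducer bug or a limit that is too low for the kernel's workload.
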
### 3. Kernel availability

The host needs access to the self-hosted Arboricx runtime kernel as a Tree Calculus value.

There are two viable bootstrap strategies.

#### Strategy A: hardcode the kernel tree

This is the recommended strategy for the first host prototype.

Workflow:

1. Compile or export the tricu kernel entrypoint as an Arboricx bundle or tree value.
2. Convert the selected exported kernel function into a host-language Tree Calculus literal.
3. Commit or embed that literal in the host implementation.

The host then needs no Arboricx parser of its own to obtain the kernel. It only needs Tree Calculus reduction.

#### Strategy B: bootstrap the kernel from an Arboricx bundle

Alternatively, the host can implement a minimal Arboricx parser just sufficient to load the kernel bundle.

This is more work up front, but avoids hardcoding a huge tree literal.

If using this strategy, the host-side parser needs to:

1. Parse the Arboricx container.
2. Parse enough manifest/export data to locate the desired kernel export.
3. Parse node records.
4. Reconstruct the selected root Tree Calculus value from the Merkle node DAG.

This duplicates logic that the self-hosted tricu kernel already implements, which is why the hardcoded-kernel path is simpler for early ports.

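For Strategy A with tagged arrays, one possible way to produce the host-language literal is PHP's built-in `var_export`, which emits valid PHP source for an array. The sketch below assumes the kernel tree is already available in memory as a tagged array; how it gets there from the tricu toolchain is not covered. The tiny stand-in tree and the file name are hypothetical.

```php
<?php
// Stand-in for the exported kernel tree; the real value would come from the
// tricu toolchain, which this sketch does not cover.
$leaf = ['tag' => 'leaf'];
$kernelTree = ['tag' => 'fork', 'left' => $leaf, 'right' => ['tag' => 'stem', 'child' => $leaf]];

// var_export() with a true second argument returns PHP source for the array.
$source = "<?php\nreturn " . var_export($kernelTree, true) . ";\n";

// Hypothetical file name; in a real port this file would be committed.
$path = sys_get_temp_dir() . '/kernel_runArboricxArgs.php';
file_put_contents($path, $source);

// At startup the host loads the literal with a plain require.
$loaded = require $path;

var_dump($loaded === $kernelTree); // bool(true)
```

A real kernel tree is far larger than this stand-in, so the generated file may be large and slow to load; a more compact serialization may be worth considering once the prototype works.
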
## Kernel entrypoints

The ergonomic runtime API currently lives in `lib/arboricx.tri`.

Primary entrypoints:

```tricu
readArboricxExecutableByName nameBytes bundleBytes
readArboricxExecutable bundleBytes
runArboricxByName nameBytes bundleBytes arg
runArboricx bundleBytes arg
runArboricxArgsByName nameBytes bundleBytes args
runArboricxArgs bundleBytes args
```

Recommended host entrypoint:

```tricu
runArboricxArgs
```

It accepts:

1. Raw application bundle bytes as a Tree Calculus byte list.
2. A Tree Calculus list of arguments.

It returns a result-wrapped value.

For named exports, use:

```tricu
runArboricxArgsByName
```

It accepts:

1. The export name as a Tree Calculus byte list.
2. Raw application bundle bytes as a Tree Calculus byte list.
3. A Tree Calculus list of arguments.

### Applying the kernel in the host evaluator

If the host has the Tree Calculus value for `runArboricxArgs`, call it by constructing nested application trees.

In Tree Calculus application form:

```text
((runArboricxArgs bundleBytesTree) argsTree)
```

Structurally, if `app(f, x)` constructs `Fork(f, x)`, then:

```php
$expr = app(app($kernelRunArboricxArgs, $bundleBytesTree), $argsTree);
$result = normalize($expr);
```

For named export execution:

```text
(((runArboricxArgsByName nameBytesTree) bundleBytesTree) argsTree)
```

Structurally:

```php
$expr = app(
    app(
        app($kernelRunArboricxArgsByName, $nameBytesTree),
        $bundleBytesTree
    ),
    $argsTree
);
$result = normalize($expr);
```

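The nesting can be hidden behind a small left fold. This is a sketch using the tagged-array representation; `appAll` is a hypothetical convenience helper, and the placeholder trees only stand in for the kernel entrypoint and the encoded inputs.

```php
<?php
// app($f, $x) constructs Fork($f, $x), as described above.
function app(array $f, array $x): array
{
    return ['tag' => 'fork', 'left' => $f, 'right' => $x];
}

// Hypothetical helper: appAll($f, $a, $b) builds app(app($f, $a), $b).
function appAll(array $f, array ...$args): array
{
    foreach ($args as $arg) {
        $f = app($f, $arg);
    }
    return $f;
}

// Placeholder trees, not real kernel or bundle values.
$leaf = ['tag' => 'leaf'];
$kernel = ['tag' => 'stem', 'child' => $leaf];
$nameBytesTree = $leaf;
$bundleBytesTree = $leaf;
$argsTree = $leaf;

$expr = appAll($kernel, $nameBytesTree, $bundleBytesTree, $argsTree);
var_dump($expr === app(app(app($kernel, $nameBytesTree), $bundleBytesTree), $argsTree)); // bool(true)
```
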
## Result convention

The runtime API returns results using the tricu `ok` / `err` convention from `lib/binary.tri`:

```tricu
ok value rest = pair true (pair value rest)
err code rest = pair false (pair code rest)
```

The host should unwrap this result before decoding the final value.

Expected success shape:

```tricu
ok value rest
```

For typical execution, `value` is the application result. `rest` is usually not important to the host shell unless debugging parser behavior.

Expected error shape:

```tricu
err code rest
```

The error code is a Tree Calculus number. Error constants are defined in:

- `lib/binary.tri`
- `lib/arboricx-common.tri`

A prototype host can simply report the numeric error code and optionally dump a compact representation of `rest`.

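Unwrapping can be sketched as a structural check on the normalized result. The block below rests on assumptions that are not stated in this document and should be confirmed against `lib/binary.tri` and `src/Research.hs`: that `pair a b` normalizes to `Fork a b`, that `true` is `Stem Leaf`, and that `false` is `Leaf`. The return shape `[$ok, $value, $rest]` follows the `unwrapResult` call used in the execution flow.

```php
<?php
// ASSUMPTIONS (verify against lib/binary.tri and src/Research.hs):
//   pair a b = Fork a b, true = Stem Leaf, false = Leaf.
function unwrapResult(array $result): array
{
    if ($result['tag'] !== 'fork' || $result['right']['tag'] !== 'fork') {
        throw new UnexpectedValueException('result is not shaped like pair flag (pair x rest)');
    }
    $flag = $result['left'];
    if ($flag['tag'] === 'leaf') {
        $ok = false; // assumed encoding of false
    } elseif ($flag['tag'] === 'stem' && $flag['child']['tag'] === 'leaf') {
        $ok = true; // assumed encoding of true
    } else {
        throw new UnexpectedValueException('result flag is neither true nor false');
    }
    // On success the second element is the value; on failure it is the error code.
    return [$ok, $result['right']['left'], $result['right']['right']];
}

$leaf = ['tag' => 'leaf'];
$true = ['tag' => 'stem', 'child' => $leaf];
$value = ['tag' => 'stem', 'child' => $leaf]; // arbitrary payload for the demo
$okResult = ['tag' => 'fork', 'left' => $true,
             'right' => ['tag' => 'fork', 'left' => $value, 'right' => $leaf]];

[$ok, $payload, $rest] = unwrapResult($okResult);
var_dump($ok); // bool(true)
```
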
## Example execution flow

Suppose the application bundle exports this root:

```tricu
append "hello "
```

The bundle root is an unapplied function waiting for one more string argument.

Host flow:

1. Load kernel entrypoint tree:

```php
$runArboricxArgs = loadHardcodedKernelEntrypoint('runArboricxArgs');
```

2. Read application bundle bytes:

```php
$bytes = file_get_contents('append-hello.arboricx');
```

3. Encode bundle bytes as a Tree Calculus byte list:

```php
$bundleBytesTree = encodeBytes($bytes);
```

4. Encode host argument(s):

```php
$arg = encodeString('james');
$args = encodeList([$arg]);
```

5. Build application expression:

```php
$expr = app(app($runArboricxArgs, $bundleBytesTree), $args);
```

6. Evaluate:

```php
$result = normalize($expr);
```

7. Unwrap `ok` result:

```php
[$ok, $value, $rest] = unwrapResult($result);
if (!$ok) { throw new RuntimeException('Arboricx error'); }
```

8. Decode the value:

```php
echo decodeString($value); // hello james
```

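The codec helpers used in the flow above (`encodeBytes`, `encodeString`, `encodeList`, `decodeString`) could look roughly like the sketch below, written for the tagged-array representation. The encodings are assumptions and must be checked against `src/Research.hs` before use: numbers as least-significant-bit-first bit lists, lists as `Fork head tail` chains ending in `Leaf`, and bytes or strings as lists of numbers. Treating a string as its raw bytes is a further assumption that only clearly holds for ASCII input. The helpers `leaf`, `stem`, `fork`, `encodeNumber`, `decodeNumber`, and `decodeList` are hypothetical names.

```php
<?php
// ASSUMED encodings (confirm against src/Research.hs):
//   number: 0 is Leaf; otherwise Fork(bit, rest), least-significant bit first,
//           with bit 0 = Leaf and bit 1 = Stem(Leaf)
//   list:   Leaf is the empty list; Fork(head, tail) is a cons cell
//   bytes and strings: a list of numbers, one per byte
function leaf(): array { return ['tag' => 'leaf']; }
function stem(array $c): array { return ['tag' => 'stem', 'child' => $c]; }
function fork(array $l, array $r): array { return ['tag' => 'fork', 'left' => $l, 'right' => $r]; }

function encodeList(array $items): array
{
    $list = leaf();
    // Build from the tail so long inputs need no recursion.
    for ($i = count($items) - 1; $i >= 0; $i--) {
        $list = fork($items[$i], $list);
    }
    return $list;
}

function encodeNumber(int $n): array
{
    if ($n < 0) {
        throw new InvalidArgumentException('only non-negative numbers are encoded');
    }
    $bits = [];
    for (; $n > 0; $n >>= 1) {
        $bits[] = ($n & 1) ? stem(leaf()) : leaf();
    }
    return encodeList($bits); // under the assumed scheme a number is its bit list
}

function encodeBytes(string $bytes): array
{
    $numbers = [];
    for ($i = 0, $len = strlen($bytes); $i < $len; $i++) {
        $numbers[] = encodeNumber(ord($bytes[$i]));
    }
    return encodeList($numbers);
}

function encodeString(string $s): array
{
    return encodeBytes($s); // assumption: strings are encoded byte by byte
}

function decodeList(array $tree): array
{
    $items = [];
    while ($tree['tag'] === 'fork') {
        $items[] = $tree['left'];
        $tree = $tree['right'];
    }
    if ($tree['tag'] !== 'leaf') {
        throw new UnexpectedValueException('list does not end in Leaf');
    }
    return $items;
}

function decodeNumber(array $tree): int
{
    $n = 0;
    foreach (decodeList($tree) as $i => $bit) {
        if ($bit['tag'] === 'stem' && $bit['child']['tag'] === 'leaf') {
            $n |= 1 << $i;
        } elseif ($bit['tag'] !== 'leaf') {
            throw new UnexpectedValueException('malformed bit in number');
        }
    }
    return $n;
}

function decodeString(array $tree): string
{
    $chars = array_map(fn(array $c): string => chr(decodeNumber($c)), decodeList($tree));
    return implode('', $chars);
}

echo decodeString(encodeString('james')), "\n"; // james
```

The round trip only shows that the encoder and decoder agree with each other. Whether they agree with the kernel is exactly what the `append "hello "` fixture is meant to establish.
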
## What the kernel does internally

`runArboricxArgs` performs the following steps inside Tree Calculus:

1. Parse and validate the raw Arboricx bundle bytes.
2. Parse the manifest.
3. Select the default export:
   - use export named `main` if present,
   - otherwise use the sole export if exactly one exists,
   - otherwise return an error.
4. Read the nodes section.
5. Reconstruct the selected root tree from the Merkle DAG.
6. Apply each host-provided argument in order.
7. Return `ok result rest` or an `err`.

`runArboricxArgsByName` is identical except that it selects a named export.

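The default-export rule in step 3 can be restated in host terms, which may help when predicting which export a bundle will run. This is an illustration only: the real selection happens inside the kernel in tricu, the host does not need to implement it, and `selectDefaultExport` is a hypothetical name.

```php
<?php
// Host-language restatement of the rule for illustration; the actual
// selection runs inside the tricu kernel.
function selectDefaultExport(array $exportNames): string
{
    if (in_array('main', $exportNames, true)) {
        return 'main'; // an export named main always wins
    }
    if (count($exportNames) === 1) {
        return array_values($exportNames)[0]; // the sole export
    }
    throw new RuntimeException('no default export: expected "main" or exactly one export');
}

echo selectDefaultExport(['helper', 'main']), "\n"; // main
echo selectDefaultExport(['append-hello']), "\n";   // append-hello
```
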
## Tests proving the expected behavior

The relevant Haskell tests are in `test/Spec.hs` under `manifestReadingTests`.

Important cases:

- `readArboricxExecutable: reconstructs default export tree`
- `readArboricxExecutableByName: selects named export`
- `runArboricx: applies host-provided argument to default export`
- `runArboricxArgs: applies host-provided argument list in order`

These tests demonstrate the host-shell contract:

- application bundle bytes are supplied as a Tree Calculus byte list,
- host arguments are supplied as canonical Tree Calculus values,
- execution returns a result-wrapped Tree Calculus value.

## Minimal PHP prototype checklist

A PHP prototype should implement:

- [ ] Tree data constructors: `Leaf`, `Stem`, `Fork`.
- [ ] Application helper: `app($f, $x) = Fork($f, $x)`.
- [ ] Normal-order Tree Calculus reducer.
- [ ] Fuel/step limit for debugging.
- [ ] Hardcoded kernel entrypoint tree for `runArboricxArgs`.
- [ ] Encode application bundle file bytes into a Tree Calculus byte list.
- [ ] Encode host argument values into Tree Calculus values.
- [ ] Build expression: `((runArboricxArgs bundleBytes) args)`.
- [ ] Normalize expression.
- [ ] Unwrap `ok` / `err` result.
- [ ] Decode result value into host type.

For exact codec details, reference the Haskell implementation in `src/Research.hs` and the existing JS runtime if available.

## Current recommendation

For the first PHP implementation:

1. Hardcode only the `runArboricxArgs` kernel entrypoint as a Tree Calculus value.
2. Do not implement host-side Arboricx parsing yet.
3. Implement only enough codecs for:
   - bytes,
   - strings,
   - lists,
   - result unwrapping.
4. Use one test fixture: an Arboricx bundle whose root is `append "hello "`.
5. Assert that calling it with `"james"` returns `"hello james"`.

Once that works, add named export support via `runArboricxArgsByName` and expand codecs as needed.