# Self-hosted Arboricx Host Prototype

This document describes how to build a minimal host-language shell that can execute Arboricx bundles through the self-hosted tricu Arboricx parser/executor.

The intended reader is an implementation agent building a first prototype in a host language such as PHP. The same approach should generalize to any language with a small Tree Calculus evaluator.

See also: [`docs/host-abi.md`](./host-abi.md) for the precise host-facing ABI value tags and typed runner contract.

## Goal

Build a tiny host program that can:

1. Represent Tree Calculus values.
2. Reduce/evaluate Tree Calculus terms.
3. Load or embed the tricu Arboricx runtime kernel.
4. Read an application `.arboricx` bundle from disk.
5. Convert host inputs into canonical Tree Calculus values.
6. Apply the kernel to the application bundle and arguments.
7. Unwrap a standardized host ABI result.
8. Decode the host ABI payload back into host values.

A concrete target example:

```tricu
-- Application bundle root is an unapplied function:
append "hello "
```

The host should be able to call that bundle with the host string `"james"` and receive:

```text
hello james
```

With the Host ABI layer, the preferred conceptual call is:

```tricu
runArboricxToString <applicationBundleBytes> ["james"]
```

This returns:

```tricu
ok (hostString "hello james") rest
```

where `runArboricxToString` comes from the self-hosted Arboricx runtime kernel.

## Architectural overview

There are two Arboricx bundles involved:

1. **Kernel bundle**
   - Contains the self-hosted Arboricx parser/executor written in tricu.
   - Exposes ergonomic runtime entrypoints such as `runArboricxArgs` and Host ABI entrypoints such as `runArboricxToString`.
   - This can be hardcoded as a Tree Calculus value in the host, or loaded by a minimal host-side Arboricx parser.

2. **Application bundle**
   - The bundle the user wants to execute.
   - Example: a bundle whose exported root is `append "hello "`, waiting for one more string argument.
   - The host reads this file as raw bytes and encodes those bytes as a Tree Calculus byte list.

The minimal host does **not** need to understand the application bundle format if the kernel is already available as a Tree Calculus value. The host only passes the application bundle bytes to the kernel.

## Required host components

### 1. Tree representation

The host needs a representation for the three Tree Calculus constructors:

```text
Leaf
Stem child
Fork left right
```

Use whatever is idiomatic for the host language. In PHP, for a prototype, simple classes or tagged arrays are sufficient.

Example shape:

```php
abstract class T {}
final class Leaf extends T {}
// PHP 8 constructor promotion declares and initializes the typed properties.
final class Stem extends T { public function __construct(public T $child) {} }
final class Fork extends T { public function __construct(public T $left, public T $right) {} }
```

or tagged arrays:

```php
['tag' => 'leaf']
['tag' => 'stem', 'child' => $t]
['tag' => 'fork', 'left' => $l, 'right' => $r]
```

The evaluator and codecs only need these three constructors.

### 2. Tree Calculus evaluator

The host must implement Tree Calculus reduction. This is the core VM.

The evaluator should use normal-order evaluation, matching the runtime semantics expected by Arboricx manifests:

```text
runtimeEvaluation = "normal-order"
```

The evaluator only needs the Tree Calculus reduction rules. There is no parser requirement for the host prototype if terms are constructed directly as trees.

Implementation notes:

- Evaluation must support application: a tree applied to another tree.
- In this codebase, application is represented structurally as `Fork function argument` before reduction.
- The evaluator repeatedly reduces until normal form or until a configured step/fuel limit is reached.
- Add a fuel limit for the first prototype to avoid infinite reductions during debugging.

Reference implementation locations:

- Haskell evaluator/reduction: `src/Research.hs`
- JavaScript Arboricx runtime evaluator: `ext/js/src/` if present in the checkout

Use those as references for exact reduction behavior.

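As a starting point, the sketch below shows one way to combine the reduction rules with a fuel limit, using the tagged-array shape from the previous section. It is not the reference implementation: the rules follow the commonly published triage form of Tree Calculus and are an assumption here, so check every case against `src/Research.hs`. The names `apply` and `OutOfFuel` are hypothetical, and the function applies one normal-form tree to another directly rather than first building a `Fork function argument` term. PHP is strict, so the second operand of the S-like rule is evaluated even when its value is never used; a faithful normal-order port would have to defer that work.

```php
<?php
// Tagged-array constructors (hypothetical helper names).
function leaf(): array { return ['tag' => 'leaf']; }
function stem(array $c): array { return ['tag' => 'stem', 'child' => $c]; }
function fork(array $l, array $r): array { return ['tag' => 'fork', 'left' => $l, 'right' => $r]; }

final class OutOfFuel extends RuntimeException {}

// Apply normal-form tree $f to normal-form tree $x.
// Each call spends one unit of fuel; running out throws OutOfFuel.
// ASSUMPTION: the rules below are the standard triage rules, not copied from the codebase.
function apply(array $f, array $x, int &$fuel): array
{
    if ($fuel-- <= 0) {
        throw new OutOfFuel('step limit reached');
    }
    switch ($f['tag']) {
        case 'leaf': return stem($x);               // Leaf x     -> Stem x
        case 'stem': return fork($f['child'], $x);  // (Stem a) x -> Fork a x
    }
    // $f is Fork(p, q) applied to $x: dispatch on the shape of p.
    $p = $f['left'];
    $q = $f['right'];
    switch ($p['tag']) {
        case 'leaf':                                // Fork Leaf q, applied to x -> q
            return $q;
        case 'stem':                                // Fork (Stem a) q, applied to x -> (a x) (q x)
            $ax = apply($p['child'], $x, $fuel);
            $qx = apply($q, $x, $fuel);             // strict: computed even if unused
            return apply($ax, $qx, $fuel);
    }
    // p is Fork(w, v): triage on the argument $x.
    switch ($x['tag']) {
        case 'leaf': return $p['left'];                                   // -> w
        case 'stem': return apply($p['right'], $x['child'], $fuel);       // -> v u
        default:     return apply(apply($q, $x['left'], $fuel), $x['right'], $fuel); // -> q u1 u2
    }
}
```

For example, with `$fuel = 100`, applying `Fork(Leaf, a)` to any tree returns `a` and spends one unit of fuel. Deep reductions also recurse deeply in PHP, so a production port would likely need an explicit stack.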
### 3. Kernel availability

The host needs access to the self-hosted Arboricx runtime kernel as a Tree Calculus value.

There are two viable bootstrap strategies.

#### Strategy A: hardcode the kernel tree

For the first host prototype, this is recommended.

Workflow:

1. Compile/export the tricu kernel entrypoint as an Arboricx bundle or tree value.
2. Convert the selected exported kernel function into a host-language Tree Calculus literal.
3. Commit/embed that literal in the host implementation.

Then the host does not need any Arboricx parser of its own for the kernel. It only needs Tree Calculus reduction.

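One way to embed the literal is as a compact string that the host decodes once at startup. The sketch below assumes a hypothetical prefix text form in which `0` is a Leaf, `1` is a Stem followed by its child, and `2` is a Fork followed by its left and right subtrees. This document does not specify that format, so treat it and the name `decodeTernary` as illustrative. The decoder scans right to left with an explicit stack, which avoids deep recursion on a large kernel.

```php
<?php
// Tagged-array constructors (hypothetical helper names).
function leaf(): array { return ['tag' => 'leaf']; }
function stem(array $c): array { return ['tag' => 'stem', 'child' => $c]; }
function fork(array $l, array $r): array { return ['tag' => 'fork', 'left' => $l, 'right' => $r]; }

// Decode a prefix-encoded tree: '0' = Leaf, '1' = Stem child, '2' = Fork left right.
// ASSUMPTION: this text format is illustrative, not a documented Arboricx or tricu format.
function decodeTernary(string $s): array
{
    $stack = [];
    // Scanning right to left means the subtrees of a node are already on the stack,
    // with the leftmost subtree on top.
    for ($i = strlen($s) - 1; $i >= 0; $i--) {
        switch ($s[$i]) {
            case '0':
                $stack[] = leaf();
                break;
            case '1':
                if (count($stack) < 1) {
                    throw new InvalidArgumentException("stem at offset $i has no child");
                }
                $child = array_pop($stack);
                $stack[] = stem($child);
                break;
            case '2':
                if (count($stack) < 2) {
                    throw new InvalidArgumentException("fork at offset $i is missing a subtree");
                }
                $left = array_pop($stack);
                $right = array_pop($stack);
                $stack[] = fork($left, $right);
                break;
            default:
                throw new InvalidArgumentException("unexpected character at offset $i");
        }
    }
    if (count($stack) !== 1) {
        throw new InvalidArgumentException('input does not encode exactly one tree');
    }
    return array_pop($stack);
}
```

With this encoding, `decodeTernary('2100')` yields `Fork(Stem(Leaf), Leaf)`.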
#### Strategy B: bootstrap the kernel from an Arboricx bundle

Alternatively, the host can implement a minimal Arboricx parser just sufficient to load the kernel bundle.

This is more work up front, but avoids hardcoding a huge tree literal.

If using this strategy, the host-side parser needs to:

1. Parse the Arboricx container.
2. Parse enough manifest/export data to locate the desired kernel export.
3. Parse node records.
4. Reconstruct the selected root Tree Calculus value from the Merkle node DAG.

This duplicates the work the self-hosted tricu kernel already performs, so the hardcoded-kernel path is simpler for early ports.

## Kernel entrypoints

The ergonomic runtime API currently lives in `lib/arboricx.tri`.

### Raw execution entrypoints

These return raw application results inside the existing `ok` / `err` result protocol:

```tricu
readArboricxExecutableByName nameBytes bundleBytes
readArboricxExecutable bundleBytes
runArboricxByName nameBytes bundleBytes arg
runArboricx bundleBytes arg
runArboricxArgsByName nameBytes bundleBytes args
runArboricxArgs bundleBytes args
```

`runArboricxArgs` accepts:

1. Raw application bundle bytes as a Tree Calculus byte list.
2. A Tree Calculus list of arguments.

For named exports, use `runArboricxArgsByName`, which accepts:

1. Export name as a Tree Calculus byte list.
2. Application bundle bytes as a Tree Calculus byte list.
3. A Tree Calculus list of arguments.

### Host ABI typed entrypoints

For host-language ports, prefer the Host ABI typed runners. These wrap successful outputs in a tagged host ABI value so every host can decode the same envelope shape.

Default export variants:

```tricu
runArboricxToTree bundleBytes args
runArboricxToString bundleBytes args
runArboricxToNumber bundleBytes args
runArboricxToBool bundleBytes args
runArboricxToList bundleBytes args
runArboricxToBytes bundleBytes args
```

Named export variants:

```tricu
runArboricxByNameToTree nameBytes bundleBytes args
runArboricxByNameToString nameBytes bundleBytes args
runArboricxByNameToNumber nameBytes bundleBytes args
runArboricxByNameToBool nameBytes bundleBytes args
runArboricxByNameToList nameBytes bundleBytes args
runArboricxByNameToBytes nameBytes bundleBytes args
```

Recommended first host entrypoint for the `append "hello "` example:

```tricu
runArboricxToString
```

## Applying the kernel in the host evaluator

If the host has the Tree Calculus value for `runArboricxToString`, call it by constructing nested application trees.

In Tree Calculus application form:

```text
((runArboricxToString bundleBytesTree) argsTree)
```

Structurally, if `app(f, x)` constructs `Fork(f, x)`, then:

```php
$expr = app(app($kernelRunArboricxToString, $bundleBytesTree), $argsTree);
$result = normalize($expr);
```

For named export execution:

```text
(((runArboricxByNameToString nameBytesTree) bundleBytesTree) argsTree)
```

Structurally:

```php
$expr = app(
    app(
        app($kernelRunArboricxByNameToString, $nameBytesTree),
        $bundleBytesTree
    ),
    $argsTree
);
$result = normalize($expr);
```

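Because every call is a left-nested chain of `app`, a small variadic helper keeps call sites readable as the number of kernel parameters grows. The sketch below assumes the tagged-array shape and the `app(f, x) = Fork(f, x)` convention stated above; `appMany` is a hypothetical name.

```php
<?php
// Tagged-array constructors (hypothetical helper names).
function leaf(): array { return ['tag' => 'leaf']; }
function fork(array $l, array $r): array { return ['tag' => 'fork', 'left' => $l, 'right' => $r]; }

// Application node, following the convention app(f, x) = Fork(f, x).
function app(array $f, array $x): array { return fork($f, $x); }

// Left-nested application: appMany(f, a, b) builds app(app(f, a), b).
function appMany(array $f, array ...$args): array
{
    return array_reduce($args, 'app', $f);
}
```

The named-export call then reads `appMany($kernelRunArboricxByNameToString, $nameBytesTree, $bundleBytesTree, $argsTree)`.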
## Result convention and Host ABI envelope

All runtime APIs return the existing tricu `ok` / `err` convention from `lib/binary.tri`:

```tricu
ok value rest = pair true (pair value rest)
err code rest = pair false (pair code rest)
```

The host should always unwrap this outer result first.

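A host-side unwrapper for this shape might look like the sketch below. It relies on encodings this document does not spell out: `pair a b` is assumed to be `Fork a b`, `true` is assumed to be `Stem Leaf`, and `false` is assumed to be `Leaf`. Confirm all three against the tricu library before relying on them. The signature of `unwrapResult` mirrors how the example flow later in this document uses it.

```php
<?php
// Tagged-array constructors (hypothetical helper names).
function leaf(): array { return ['tag' => 'leaf']; }
function stem(array $c): array { return ['tag' => 'stem', 'child' => $c]; }
function fork(array $l, array $r): array { return ['tag' => 'fork', 'left' => $l, 'right' => $r]; }

// Split `pair flag (pair valueOrCode rest)` into [bool $ok, array $valueOrCode, array $rest].
// ASSUMPTION: pair a b = Fork a b, true = Stem Leaf, false = Leaf.
function unwrapResult(array $t): array
{
    if ($t['tag'] !== 'fork' || $t['right']['tag'] !== 'fork') {
        throw new UnexpectedValueException('tree is not an ok/err result');
    }
    $flag = $t['left'];
    if ($flag == stem(leaf())) {
        $ok = true;
    } elseif ($flag == leaf()) {
        $ok = false;
    } else {
        throw new UnexpectedValueException('result flag is neither true nor false');
    }
    return [$ok, $t['right']['left'], $t['right']['right']];
}
```

On an `err` result the second element is the error code tree rather than a value, so callers should branch on the flag before decoding it.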
### Raw runners

Raw runners such as `runArboricxArgs` return:

```tricu
ok rawApplicationValue rest
```

The host must already know how to interpret `rawApplicationValue`, because raw runners attach no type tag to it.

### Host ABI typed runners

Typed runners such as `runArboricxToString` return:

```tricu
ok hostAbiValue rest
```

A host ABI value has shape:

```tricu
pair tag payload
```

The payload is still the canonical/raw Tree Calculus representation for that type.

Initial tags are specified in [`docs/host-abi.md`](./host-abi.md):

```tricu
hostTreeTag = 0
hostStringTag = 1
hostNumberTag = 2
hostBoolTag = 3
hostListTag = 4
hostBytesTag = 5
```

For example:

```tricu
runArboricxToString bundleBytes ["james"]
```

returns:

```tricu
ok (hostString "hello james") rest
```

which is structurally:

```tricu
ok (pair hostStringTag "hello james") rest
```

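Splitting the envelope on the host side could be sketched as follows. The tag values come from the list above; the rest is assumed: `pair a b` is taken to be `Fork a b`, and numbers are taken to be little-endian bit lists in which each bit is `Leaf` (0) or `Stem Leaf` (1). Verify both against `docs/host-abi.md` and the reference codecs. `HOST_TAG_NAMES` and `decodeNumber` are hypothetical names.

```php
<?php
// Tagged-array constructors (hypothetical helper names).
function leaf(): array { return ['tag' => 'leaf']; }
function stem(array $c): array { return ['tag' => 'stem', 'child' => $c]; }
function fork(array $l, array $r): array { return ['tag' => 'fork', 'left' => $l, 'right' => $r]; }

// Tag values as listed in this document.
const HOST_TAG_NAMES = [0 => 'tree', 1 => 'string', 2 => 'number', 3 => 'bool', 4 => 'list', 5 => 'bytes'];

// ASSUMPTION: a number is Leaf for 0, otherwise Fork(bit, higherBits), least significant bit first.
function decodeNumber(array $t): int
{
    $n = 0;
    $shift = 0;
    while ($t['tag'] === 'fork') {
        $bit = $t['left'];
        if ($bit == stem(leaf())) {
            $n |= (1 << $shift);
        } elseif ($bit != leaf()) {
            throw new UnexpectedValueException('malformed number bit');
        }
        $shift++;
        $t = $t['right'];
    }
    if ($t['tag'] !== 'leaf') {
        throw new UnexpectedValueException('malformed number terminator');
    }
    return $n;
}

// Split `pair tag payload` into [int $tag, array $payload].
// ASSUMPTION: pair a b = Fork a b.
function unwrapHostValue(array $hostValue): array
{
    if ($hostValue['tag'] !== 'fork') {
        throw new UnexpectedValueException('host ABI value is not a pair');
    }
    $tag = decodeNumber($hostValue['left']);
    if (!array_key_exists($tag, HOST_TAG_NAMES)) {
        throw new UnexpectedValueException("unknown host ABI tag $tag");
    }
    return [$tag, $hostValue['right']];
}
```

The payload is returned untouched, so the caller picks the decoder that matches the tag.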
### Error shape

Expected error shape:

```tricu
err code rest
```

The error code is a Tree Calculus number. Error constants are defined in:

- `lib/binary.tri`
- `lib/arboricx-common.tri`
- `lib/arboricx.tri` for Host ABI codec errors, currently `errHostCodecFailed = 14`

Typed runners return `errHostCodecFailed` if the application result cannot be interpreted as the requested type.

A prototype host can report the numeric error code and optionally dump a compact representation of `rest`.

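Reporting the code could be as small as the sketch below, which assumes the code tree has already been decoded to a host integer. Only `errHostCodecFailed = 14` is named in this document, so the table contains just that entry; the remaining constants would need to be copied from the `.tri` files listed above. `ArboricxError` and `arboricxError` are hypothetical names.

```php
<?php
// Only the constant named in this document; extend from the .tri sources as needed.
const KNOWN_ERROR_NAMES = [14 => 'errHostCodecFailed'];

final class ArboricxError extends RuntimeException {}

// Build an exception that carries the numeric code and, when known, its name.
function arboricxError(int $code): ArboricxError
{
    $name = KNOWN_ERROR_NAMES[$code] ?? 'unknown';
    return new ArboricxError("Arboricx error $code ($name)", $code);
}
```

A caller can then `throw arboricxError($code);` and later recover the number with `getCode()`.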
## Example execution flow

Suppose the application bundle exports this root:

```tricu
append "hello "
```

The bundle root is an unapplied function waiting for one more string argument.

Host flow:

1. Load kernel entrypoint tree:

   ```php
   $runArboricxToString = loadHardcodedKernelEntrypoint('runArboricxToString');
   ```

2. Read application bundle bytes:

   ```php
   $bytes = file_get_contents('append-hello.arboricx');
   ```

3. Encode bundle bytes as a Tree Calculus byte list:

   ```php
   $bundleBytesTree = encodeBytes($bytes);
   ```

4. Encode host argument(s):

   ```php
   $arg = encodeString('james');
   $args = encodeList([$arg]);
   ```

5. Build application expression:

   ```php
   $expr = app(app($runArboricxToString, $bundleBytesTree), $args);
   ```

6. Evaluate:

   ```php
   $result = normalize($expr);
   ```

7. Unwrap `ok` result:

   ```php
   [$ok, $hostValue, $rest] = unwrapResult($result);
   if (!$ok) { throw new RuntimeException('Arboricx error'); }
   ```

8. Unwrap Host ABI envelope:

   ```php
   [$tag, $payload] = unwrapHostValue($hostValue);
   if ($tag !== HOST_STRING_TAG) { throw new RuntimeException('Expected string'); }
   ```

9. Decode the payload:

   ```php
   echo decodeString($payload); // hello james
   ```

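The flow above calls `encodeBytes`, `encodeString`, `encodeList`, and `decodeString` without defining them. A minimal sketch follows, under assumed encodings that should be checked against the reference codecs: a list is `Fork(head, tail)` terminated by `Leaf`, a number is a little-endian bit list with `Leaf` for 0 and `Stem Leaf` for 1, and a string is the list of its byte values. The byte-wise string handling only matches a code-point encoding for ASCII, so non-ASCII input needs a decision the source does not make here.

```php
<?php
// Tagged-array constructors (hypothetical helper names).
function leaf(): array { return ['tag' => 'leaf']; }
function stem(array $c): array { return ['tag' => 'stem', 'child' => $c]; }
function fork(array $l, array $r): array { return ['tag' => 'fork', 'left' => $l, 'right' => $r]; }

// ASSUMPTION: 0 = Leaf; n > 0 = Fork(lowBit, n >> 1) with bit 0 = Leaf, bit 1 = Stem Leaf.
function encodeNumber(int $n): array
{
    if ($n < 0) { throw new InvalidArgumentException('negative numbers are not representable'); }
    if ($n === 0) { return leaf(); }
    $bit = ($n & 1) ? stem(leaf()) : leaf();
    return fork($bit, encodeNumber($n >> 1));
}

// ASSUMPTION: a list is a right-nested chain of Forks ending in Leaf.
function encodeList(array $items): array
{
    $t = leaf();
    foreach (array_reverse($items) as $item) { $t = fork($item, $t); }
    return $t;
}

// A byte string becomes a list of numbers in 0..255.
function encodeBytes(string $bytes): array
{
    $t = leaf();
    for ($i = strlen($bytes) - 1; $i >= 0; $i--) {
        $t = fork(encodeNumber(ord($bytes[$i])), $t);
    }
    return $t;
}

// ASSUMPTION: strings are encoded byte-wise (identical to code points for ASCII only).
function encodeString(string $s): array { return encodeBytes($s); }

function decodeNumber(array $t): int
{
    $n = 0;
    for ($shift = 0; $t['tag'] === 'fork'; $shift++, $t = $t['right']) {
        if ($t['left'] == stem(leaf())) { $n |= (1 << $shift); }
        elseif ($t['left'] != leaf()) { throw new UnexpectedValueException('malformed number bit'); }
    }
    if ($t['tag'] !== 'leaf') { throw new UnexpectedValueException('malformed number terminator'); }
    return $n;
}

function decodeString(array $t): string
{
    $out = '';
    for (; $t['tag'] === 'fork'; $t = $t['right']) {
        $byte = decodeNumber($t['left']);
        if ($byte > 255) { throw new UnexpectedValueException('element is not a byte'); }
        $out .= chr($byte);
    }
    if ($t['tag'] !== 'leaf') { throw new UnexpectedValueException('malformed list terminator'); }
    return $out;
}
```

Round-tripping a value such as `decodeString(encodeString('james'))` is a cheap first test before the kernel is wired in.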
## What the kernel does internally

`runArboricxToString` performs the following steps inside Tree Calculus:

1. Parse and validate the raw Arboricx bundle bytes.
2. Parse the manifest.
3. Select the default export:
   - use export named `main` if present,
   - otherwise use the sole export if exactly one exists,
   - otherwise return an error.
4. Read the nodes section.
5. Reconstruct the selected root tree from the Merkle DAG.
6. Apply each host-provided argument in order.
7. Validate that the raw result is string-like.
8. Return `ok (hostString result) rest`, or an `err`.

`runArboricxByNameToString` is identical except that it selects a named export.

Other typed runners follow the same pattern for their requested output type.

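The default-export rule in step 3 runs inside the kernel, but restating it on the host can help when writing fixtures or predicting which export a bundle will run. The sketch below is such a restatement; `selectDefaultExport` is a hypothetical helper that takes export names as plain host strings and returns `null` for the error case.

```php
<?php
// Mirror of the kernel's default-export rule, for host-side tests only.
function selectDefaultExport(array $exportNames): ?string
{
    // Prefer an export named `main` if present.
    if (in_array('main', $exportNames, true)) {
        return 'main';
    }
    // Otherwise accept the sole export if exactly one exists.
    if (count($exportNames) === 1) {
        return array_values($exportNames)[0];
    }
    // Zero exports, or several without `main`: the kernel returns an error.
    return null;
}
```

For instance, `['helper', 'main']` selects `main`, `['append']` selects `append`, and `['a', 'b']` selects nothing.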
## Tests proving the expected behavior

The relevant Haskell tests are in `test/Spec.hs` under `manifestReadingTests`.

Important cases:

- `readArboricxExecutable: reconstructs default export tree`
- `readArboricxExecutableByName: selects named export`
- `runArboricx: applies host-provided argument to default export`
- `runArboricxArgs: applies host-provided argument list in order`
- `host ABI: constructors expose tag and payload`
- `runArboricxToTree: wraps raw result as hostTree`
- `runArboricxToString: wraps string result as hostString`
- `runArboricxToNumber: wraps number result as hostNumber`
- `runArboricxToBool: rejects non-bool result`

These tests demonstrate the host-shell contract:

- application bundle bytes are supplied as a Tree Calculus byte list,
- host arguments are supplied as canonical Tree Calculus values,
- execution returns an outer result-wrapped value,
- Host ABI typed runners return a tagged ABI envelope inside `ok`.

## Minimal PHP prototype checklist

A PHP prototype should implement:

- [ ] Tree data constructors: `Leaf`, `Stem`, `Fork`.
- [ ] Application helper: `app($f, $x) = Fork($f, $x)`.
- [ ] Normal-order Tree Calculus reducer.
- [ ] Fuel/step limit for debugging.
- [ ] Hardcoded kernel entrypoint tree for `runArboricxToString` for the first string-output prototype.
- [ ] Encode application bundle file bytes into a Tree Calculus byte list.
- [ ] Encode host argument values into Tree Calculus values.
- [ ] Build expression: `((runArboricxToString bundleBytes) args)`.
- [ ] Normalize expression.
- [ ] Unwrap outer `ok` / `err` result.
- [ ] Unwrap Host ABI `pair tag payload` envelope.
- [ ] Decode payload according to tag.

For exact codec details, refer to the Haskell implementation in `src/Research.hs` and the existing JS runtime if available.

## Current recommendation

For the first PHP implementation:

1. Hardcode only the `runArboricxToString` kernel entrypoint as a Tree Calculus value.
2. Do not implement host-side Arboricx parsing yet.
3. Implement only enough codecs for:
   - bytes,
   - strings,
   - lists,
   - result unwrapping,
   - Host ABI envelope unwrapping.
4. Use one test fixture: an Arboricx bundle whose root is `append "hello "`.
5. Assert that calling it with `"james"` returns an outer `ok`, then a `hostString`, then payload `"hello james"`.

Once that works, add named export support via `runArboricxByNameToString` and expand Host ABI tags/codecs as needed.