Host ABI definition and ergonomics in TC

2026-05-09 18:33:03 -05:00
parent d0886ad886
commit 2e8a0a4c46
4 changed files with 542 additions and 52 deletions

docs/host-abi.md Normal file

@@ -0,0 +1,247 @@
# tricu Host ABI
This document specifies the first host-facing ABI for self-hosted Arboricx execution.
The ABI is intentionally small. A host language should only need to implement Tree Calculus construction/reduction plus a tiny set of canonical payload codecs. Higher-level execution policy lives in Tree Calculus.
## Goals
- Keep host-language implementations small and auditable.
- Preserve canonical Tree Calculus representations for payloads.
- Provide a stable tagged envelope so hosts do not need per-application result conventions.
- Reuse the existing `ok` / `err` result protocol.
- Support typed execution wrappers for common return types.
## Non-goals
- This ABI does not remove the need for host codecs entirely.
- This ABI does not define every possible application protocol.
- This ABI does not require auto-detecting arbitrary result types.
## Outer result protocol
Host ABI runners return the existing tricu result shape from `lib/binary.tri`:
```tricu
ok value rest = pair true (pair value rest)
err code rest = pair false (pair code rest)
```
On success, `value` is a host ABI value.
On failure, `code` is a canonical Tree Calculus number. The host may report the numeric code and optionally inspect `rest` for debugging.
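A host-side unwrap of this result shape can be sketched in Python as follows. The tuple-based tree encoding (`()` for a leaf, a 1-tuple for a stem, a 2-tuple for a fork) and the names `ok`, `err`, and `unwrap_result` are illustrative assumptions, not part of the ABI. The sketch also assumes `pair a b` is the fork of `a` and `b`, which matches the `Fork`-based pattern the test suite uses when matching an `err` result.

```python
# Assumed host-side tree encoding (hypothetical):
#   leaf = (), stem x = (x,), fork a b = (a, b)
LEAF = ()
FALSE = LEAF      # false = t
TRUE = (LEAF,)    # true  = t t


def pair(a, b):
    """pair a b, modelled as the fork of a and b."""
    return (a, b)


def ok(value, rest):
    return pair(TRUE, pair(value, rest))


def err(code, rest):
    return pair(FALSE, pair(code, rest))


def unwrap_result(tree):
    """Return (is_ok, value_or_code, rest) for an ok / err result tree."""
    if len(tree) != 2 or len(tree[1]) != 2:
        raise ValueError("not a result-shaped tree")
    flag, (body, rest) = tree
    if flag not in (TRUE, FALSE):
        raise ValueError("result flag is not a canonical bool")
    return flag == TRUE, body, rest


print(unwrap_result(ok(TRUE, LEAF))[0])   # True
print(unwrap_result(err(LEAF, LEAF))[0])  # False
```

Checking the flag against both canonical booleans, rather than treating anything non-true as an error, keeps malformed trees from being silently reported as application failures.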
## Host ABI value shape
A host ABI value is:
```tricu
pair tag payload
```
The `tag` says how the host should interpret `payload`.
The payload is always the canonical/raw Tree Calculus representation for that type. The ABI envelope tags the payload; it does not replace or recursively wrap canonical Tree Calculus data.
## Tags
Initial tags:
```tricu
hostTreeTag = 0
hostStringTag = 1
hostNumberTag = 2
hostBoolTag = 3
hostListTag = 4
hostBytesTag = 5
```
Reserved error tag, in case one is needed later:
```tricu
hostErrorTag = 6
```
The first implementation keeps errors in the outer `err` result protocol rather than returning `hostError` inside `ok`.
## Constructors
The ABI constructors are:
```tricu
hostTree value
hostString bytes
hostNumber n
hostBool b
hostList xs
hostBytes bytes
```
Each constructor returns:
```tricu
pair tag payload
```
Examples:
```tricu
hostString "hello"
hostNumber 42
hostBool true
hostList [1 2 3]
hostTree (t t t)
```
## Payload conventions
Payloads use existing canonical tricu encodings:
| ABI value | Payload |
| --- | --- |
| `hostTree` | arbitrary raw Tree Calculus value |
| `hostString` | canonical string/byte-list representation |
| `hostNumber` | canonical tricu number |
| `hostBool` | canonical tricu bool (`false = t`, `true = t t`) |
| `hostList` | canonical tricu list (`t` empty, `pair head tail` cons) |
| `hostBytes` | canonical byte list |
`hostList` payloads are raw canonical lists, **not** lists of host ABI values.
## Accessors / matching
The first ABI should expose simple accessors:
```tricu
hostValueTag hostValue
hostValuePayload hostValue
```
A host can decode the envelope by destructuring the pair directly, but these helpers make the ABI explicit and testable.
## Validation predicates
Typed runners should validate that the raw application result can be interpreted as the requested type before wrapping it.
Initial predicates:
```tricu
hostNumber? value
hostBool? value
hostList? value
hostString? value
hostBytes? value
```
These predicates are structural checks over canonical encodings. They are not general semantic type inference.
Important ambiguity note:
Tree Calculus encodings are not globally disjoint. For example, `t` is also `false`, `0`, and `[]`, so it passes every predicate above. Typed runners therefore interpret a value according to the requested type rather than inferring its type.
## Error behavior
Typed ABI runners return an error if the application result does not match the requested type.
Initial error code:
```tricu
errHostCodecFailed = 14
```
Example:
```tricu
runArboricxToString bundle args
```
returns:
```tricu
ok (hostString resultBytes) rest
```
if `resultBytes` passes `hostString?` (a canonical list whose every element is a canonical number), otherwise:
```tricu
err errHostCodecFailed result
```
where `result` is the raw application result that failed validation.
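The validate-then-wrap step can be modelled on the host side as below. This is a simplified model, not the tricu implementation: results are plain Python tuples tagged `"ok"` or `"err"` rather than trees, and `wrap_host_value` and `is_host_bool` are hypothetical names.

```python
# Assumed tuple encoding for trees: leaf = (), stem x = (x,), fork a b = (a, b).
LEAF = ()
TRUE = (LEAF,)

ERR_HOST_CODEC_FAILED = 14
HOST_BOOL_TAG = 3


def is_host_bool(value):
    return value in (LEAF, TRUE)


def wrap_host_value(validator, tag, result_value, rest):
    """Host-level model of a typed runner's final step."""
    if validator(result_value):
        return ("ok", (tag, result_value), rest)
    # On failure the raw result travels in the slot that normally holds `rest`.
    return ("err", ERR_HOST_CODEC_FAILED, result_value)


# 42 under an assumed least-significant-bit-first number encoding.
forty_two = (LEAF, (TRUE, (LEAF, (TRUE, (LEAF, (TRUE, LEAF))))))

print(wrap_host_value(is_host_bool, HOST_BOOL_TAG, TRUE, LEAF)[0])       # ok
print(wrap_host_value(is_host_bool, HOST_BOOL_TAG, forty_two, LEAF)[0])  # err
```

A host that dumps `rest` on error should therefore expect the rejected application result there when the code is `errHostCodecFailed`, not leftover parser input.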
## Execution wrappers
The base self-hosted Arboricx runners are defined in `lib/arboricx.tri`:
```tricu
runArboricxArgs bundleBytes args
runArboricxArgsByName nameBytes bundleBytes args
```
Host ABI wrappers layer typed output envelopes on top:
```tricu
runArboricxToTree bundleBytes args
runArboricxToString bundleBytes args
runArboricxToNumber bundleBytes args
runArboricxToBool bundleBytes args
runArboricxToList bundleBytes args
runArboricxToBytes bundleBytes args
```
Named-export variants:
```tricu
runArboricxByNameToTree nameBytes bundleBytes args
runArboricxByNameToString nameBytes bundleBytes args
runArboricxByNameToNumber nameBytes bundleBytes args
runArboricxByNameToBool nameBytes bundleBytes args
runArboricxByNameToList nameBytes bundleBytes args
runArboricxByNameToBytes nameBytes bundleBytes args
```
## Host usage
For a bundle whose default export is an unapplied function:
```tricu
append "hello "
```
A host that expects a string result evaluates:
```tricu
runArboricxToString bundleBytes ["james"]
```
On success, the result is:
```tricu
ok (hostString "hello james") rest
```
The host then:
1. unwraps `ok`,
2. checks `hostStringTag`,
3. decodes the canonical string payload.
## Implementation reference
- Tree constructors, numbers, strings, and lists: `src/Research.hs`
- Result protocol: `lib/binary.tri`
- Arboricx parser/executor: `lib/arboricx.tri`
- Host ABI implementation: `lib/arboricx.tri` in this first pass (the tests load it from there); it may move to `lib/host-abi.tri`, depending on final organization
## First-pass invariants
Tests should cover these invariants:
1. Each constructor stores the correct tag and payload.
2. `hostValueTag` and `hostValuePayload` destructure values correctly.
3. `runArboricxToTree` always wraps successful raw results as `hostTree`.
4. `runArboricxToString` wraps string-like results as `hostString`.
5. `runArboricxToNumber` wraps number-like results as `hostNumber`.
6. `runArboricxToBool` wraps canonical booleans as `hostBool`.
7. A typed runner returns `errHostCodecFailed` when validation fails.
8. Named-export typed runners select the requested export before wrapping.


@@ -4,6 +4,8 @@ This document describes how to build a minimal host-language shell that can exec
The intended reader is an implementation agent building a first prototype in a host language such as PHP. The same approach should generalize to any language with a small Tree Calculus evaluator.
+See also: [`docs/host-abi.md`](./host-abi.md) for the precise host-facing ABI value tags and typed runner contract.
## Goal
Build a tiny host program that can:
@@ -14,7 +16,8 @@ Build a tiny host program that can:
4. Read an application `.arboricx` bundle from disk.
5. Convert host inputs into canonical Tree Calculus values.
6. Apply the kernel to the application bundle and arguments.
-7. Decode the result back into host values.
+7. Unwrap a standardized host ABI result.
+8. Decode the host ABI payload back into host values.
A concrete target example:
@@ -29,13 +32,19 @@ The host should be able to call that bundle with the host string `"james"` and r
hello james
```
-Conceptually the host evaluates:
+With the Host ABI layer, the preferred conceptual call is:
```tricu
-runArboricxArgs <applicationBundleBytes> ["james"]
+runArboricxToString <applicationBundleBytes> ["james"]
```
-where `runArboricxArgs` comes from the self-hosted Arboricx runtime kernel.
+This returns:
+```tricu
+ok (hostString "hello james") rest
+```
+where `runArboricxToString` comes from the self-hosted Arboricx runtime kernel.
## Architectural overview
@@ -43,7 +52,7 @@ There are two Arboricx bundles involved:
1. **Kernel bundle**
- Contains the self-hosted Arboricx parser/executor written in tricu.
-- Exposes ergonomic runtime entrypoints such as `runArboricxArgs`.
+- Exposes ergonomic runtime entrypoints such as `runArboricxArgs` and Host ABI entrypoints such as `runArboricxToString`.
- This can be hardcoded as a Tree Calculus value in the host, or loaded by a minimal host-side Arboricx parser.
2. **Application bundle**
@@ -149,7 +158,9 @@ This logic is exactly what the tricu self-hosted kernel does, so the hardcoded-k
The ergonomic runtime API currently lives in `lib/arboricx.tri`.
-Primary entrypoints:
+### Raw execution entrypoints
+These return raw application results inside the existing `ok` / `err` result protocol:
```tricu
readArboricxExecutableByName nameBytes bundleBytes
@@ -160,52 +171,70 @@ runArboricxArgsByName nameBytes bundleBytes args
runArboricxArgs bundleBytes args
```
-Recommended host entrypoint:
-```tricu
-runArboricxArgs
-```
-It accepts:
+`runArboricxArgs` accepts:
1. Raw application bundle bytes as a Tree Calculus byte list.
2. A Tree Calculus list of arguments.
It returns a result-wrapped value.
-For named exports, use:
-```tricu
-runArboricxArgsByName
-```
-It accepts:
+For named exports, use `runArboricxArgsByName`, which accepts:
1. Export name as bytes.
2. Application bundle bytes as bytes.
3. Argument list.
-### Applying the kernel in the host evaluator
+### Host ABI typed entrypoints
-If the host has the Tree Calculus value for `runArboricxArgs`, call it by constructing nested application trees.
+For host-language ports, prefer the Host ABI typed runners. These wrap successful outputs in a tagged host ABI value so every host can decode the same envelope shape.
+Default export variants:
+```tricu
+runArboricxToTree bundleBytes args
+runArboricxToString bundleBytes args
+runArboricxToNumber bundleBytes args
+runArboricxToBool bundleBytes args
+runArboricxToList bundleBytes args
+runArboricxToBytes bundleBytes args
+```
+Named export variants:
+```tricu
+runArboricxByNameToTree nameBytes bundleBytes args
+runArboricxByNameToString nameBytes bundleBytes args
+runArboricxByNameToNumber nameBytes bundleBytes args
+runArboricxByNameToBool nameBytes bundleBytes args
+runArboricxByNameToList nameBytes bundleBytes args
+runArboricxByNameToBytes nameBytes bundleBytes args
+```
+Recommended first host entrypoint for the `append "hello "` example:
+```tricu
+runArboricxToString
+```
+## Applying the kernel in the host evaluator
+If the host has the Tree Calculus value for `runArboricxToString`, call it by constructing nested application trees.
In Tree Calculus application form:
```text
-((runArboricxArgs bundleBytesTree) argsTree)
+((runArboricxToString bundleBytesTree) argsTree)
```
Structurally, if `app(f, x)` constructs `Fork(f, x)`, then:
```php
-$expr = app(app($kernelRunArboricxArgs, $bundleBytesTree), $argsTree);
+$expr = app(app($kernelRunArboricxToString, $bundleBytesTree), $argsTree);
$result = normalize($expr);
```
For named export execution:
```text
-(((runArboricxArgsByName nameBytesTree) bundleBytesTree) argsTree)
+(((runArboricxByNameToString nameBytesTree) bundleBytesTree) argsTree)
```
Structurally:
@@ -213,7 +242,7 @@ Structurally:
```php
$expr = app(
app(
-app($kernelRunArboricxArgsByName, $nameBytesTree),
+app($kernelRunArboricxByNameToString, $nameBytesTree),
$bundleBytesTree
),
$argsTree
@@ -221,24 +250,73 @@ $expr = app(
$result = normalize($expr);
```
-## Result convention
+## Result convention and Host ABI envelope
-The runtime API returns results using the tricu `ok` / `err` convention from `lib/binary.tri`:
+All runtime APIs return the existing tricu `ok` / `err` convention from `lib/binary.tri`:
```tricu
ok value rest = pair true (pair value rest)
err code rest = pair false (pair code rest)
```
-The host should unwrap this result before decoding the final value.
+The host should always unwrap this outer result first.
-Expected success shape:
+### Raw runners
+Raw runners such as `runArboricxArgs` return:
```tricu
-ok value rest
+ok rawApplicationValue rest
```
-For typical execution, `value` is the application result. `rest` is usually not important to the host shell unless debugging parser behavior.
+The host must know how to interpret `rawApplicationValue`.
+### Host ABI typed runners
+Typed runners such as `runArboricxToString` return:
+```tricu
+ok hostAbiValue rest
+```
+A host ABI value has shape:
+```tricu
+pair tag payload
+```
+The payload is still the canonical/raw Tree Calculus representation for that type.
+Initial tags are specified in [`docs/host-abi.md`](./host-abi.md):
+```tricu
+hostTreeTag = 0
+hostStringTag = 1
+hostNumberTag = 2
+hostBoolTag = 3
+hostListTag = 4
+hostBytesTag = 5
+```
+For example:
+```tricu
+runArboricxToString bundleBytes ["james"]
+```
+returns:
+```tricu
+ok (hostString "hello james") rest
+```
+which is structurally:
+```tricu
+ok (pair hostStringTag "hello james") rest
+```
+### Error shape
Expected error shape:
@@ -250,8 +328,11 @@ The error code is a Tree Calculus number. Error constants are defined in:
- `lib/binary.tri`
- `lib/arboricx-common.tri`
+- `lib/arboricx.tri` for Host ABI codec errors, currently `errHostCodecFailed = 14`
-A prototype host can simply report the numeric error code and optionally dump a compact representation of `rest`.
+Typed runners return `errHostCodecFailed` if the application result cannot be interpreted as the requested type.
+A prototype host can report the numeric error code and optionally dump a compact representation of `rest`.
## Example execution flow
@@ -268,7 +349,7 @@ Host flow:
1. Load kernel entrypoint tree:
```php
-$runArboricxArgs = loadHardcodedKernelEntrypoint('runArboricxArgs');
+$runArboricxToString = loadHardcodedKernelEntrypoint('runArboricxToString');
```
2. Read application bundle bytes:
@@ -293,7 +374,7 @@ Host flow:
5. Build application expression:
```php
-$expr = app(app($runArboricxArgs, $bundleBytesTree), $args);
+$expr = app(app($runArboricxToString, $bundleBytesTree), $args);
```
6. Evaluate:
@@ -305,19 +386,26 @@ Host flow:
7. Unwrap `ok` result:
```php
-[$ok, $value, $rest] = unwrapResult($result);
+[$ok, $hostValue, $rest] = unwrapResult($result);
if (!$ok) { throw new RuntimeException('Arboricx error'); }
```
-8. Decode the value:
+8. Unwrap Host ABI envelope:
```php
-echo decodeString($value); // hello james
+[$tag, $payload] = unwrapHostValue($hostValue);
+if ($tag !== HOST_STRING_TAG) { throw new RuntimeException('Expected string'); }
```
+9. Decode the payload:
+```php
+echo decodeString($payload); // hello james
+```
## What the kernel does internally
-`runArboricxArgs` performs the following steps inside Tree Calculus:
+`runArboricxToString` performs the following steps inside Tree Calculus:
1. Parse and validate the raw Arboricx bundle bytes.
2. Parse the manifest.
@@ -328,9 +416,12 @@ Host flow:
4. Read the nodes section.
5. Reconstruct the selected root tree from the Merkle DAG.
6. Apply each host-provided argument in order.
-7. Return `ok result rest` or an `err`.
+7. Validate that the raw result is string-like.
+8. Return `ok (hostString result) rest`, or an `err`.
-`runArboricxArgsByName` is identical except that it selects a named export.
+`runArboricxByNameToString` is identical except that it selects a named export.
+Other typed runners follow the same pattern for their requested output type.
## Tests proving the expected behavior
@@ -342,12 +433,18 @@ Important cases:
- `readArboricxExecutableByName: selects named export`
- `runArboricx: applies host-provided argument to default export`
- `runArboricxArgs: applies host-provided argument list in order`
+- `host ABI: constructors expose tag and payload`
+- `runArboricxToTree: wraps raw result as hostTree`
+- `runArboricxToString: wraps string result as hostString`
+- `runArboricxToNumber: wraps number result as hostNumber`
+- `runArboricxToBool: rejects non-bool result`
These tests demonstrate the host-shell contract:
- application bundle bytes are supplied as a Tree Calculus byte list,
- host arguments are supplied as canonical Tree Calculus values,
-- execution returns a result-wrapped Tree Calculus value.
+- execution returns an outer result-wrapped value,
+- Host ABI typed runners return a tagged ABI envelope inside `ok`.
## Minimal PHP prototype checklist
@@ -357,13 +454,14 @@ A PHP prototype should implement:
- [ ] Application helper: `app($f, $x) = Fork($f, $x)`.
- [ ] Normal-order Tree Calculus reducer.
- [ ] Fuel/step limit for debugging.
-- [ ] Hardcoded kernel entrypoint tree for `runArboricxArgs`.
+- [ ] Hardcoded kernel entrypoint tree for `runArboricxToString` for the first string-output prototype.
- [ ] Encode application bundle file bytes into a Tree Calculus byte list.
- [ ] Encode host argument values into Tree Calculus values.
-- [ ] Build expression: `((runArboricxArgs bundleBytes) args)`.
+- [ ] Build expression: `((runArboricxToString bundleBytes) args)`.
- [ ] Normalize expression.
-- [ ] Unwrap `ok` / `err` result.
-- [ ] Decode result value into host type.
+- [ ] Unwrap outer `ok` / `err` result.
+- [ ] Unwrap Host ABI `pair tag payload` envelope.
+- [ ] Decode payload according to tag.
For exact codec details, reference the Haskell implementation in `src/Research.hs` and the existing JS runtime if available.
@@ -371,14 +469,15 @@ For exact codec details, reference the Haskell implementation in `src/Research.h
For the first PHP implementation:
-1. Hardcode only the `runArboricxArgs` kernel entrypoint as a Tree Calculus value.
+1. Hardcode only the `runArboricxToString` kernel entrypoint as a Tree Calculus value.
2. Do not implement host-side Arboricx parsing yet.
3. Implement only enough codecs for:
- bytes,
- strings,
- lists,
-- result unwrapping.
+- result unwrapping,
+- Host ABI envelope unwrapping.
4. Use one test fixture: an Arboricx bundle whose root is `append "hello "`.
-5. Assert that calling it with `"james"` returns `"hello james"`.
+5. Assert that calling it with `"james"` returns an outer `ok`, then a `hostString`, then payload `"hello james"`.
-Once that works, add named export support via `runArboricxArgsByName` and expand codecs as needed.
+Once that works, add named export support via `runArboricxByNameToString` and expand Host ABI tags/codecs as needed.


@@ -51,3 +51,86 @@ runArboricxArgsByName = (nameBytes bs args :
runArboricxArgs = (bs args :
runArboricxArgsByName [] bs args)
errHostCodecFailed = 14
hostTreeTag = 0
hostStringTag = 1
hostNumberTag = 2
hostBoolTag = 3
hostListTag = 4
hostBytesTag = 5
hostTree = (value : pair hostTreeTag value)
hostString = (bytes : pair hostStringTag bytes)
hostNumber = (n : pair hostNumberTag n)
hostBool = (b : pair hostBoolTag b)
hostList = (xs : pair hostListTag xs)
hostBytes = (bytes : pair hostBytesTag bytes)
hostValueTag = (hostValue : pairFirst hostValue)
hostValuePayload = (hostValue : pairSecond hostValue)
hostBool? = (value : or? (equal? value false) (equal? value true))
hostNumber? = y (self value :
triage
true
(_ : false)
(bit rest :
and?
(or? (equal? bit false) (equal? bit true))
(self rest))
value)
hostList? = y (self value :
triage
true
(_ : false)
(_ rest : self rest)
value)
hostString? = y (self value :
matchList
true
(byte rest : and? (hostNumber? byte) (self rest))
value)
hostBytes? = hostString?
wrapHostValue = (validator wrapper resultValue rest :
matchBool
(ok (wrapper resultValue) rest)
(err errHostCodecFailed resultValue)
(validator resultValue))
runArboricxByNameToTree = (nameBytes bs args :
bindResult (runArboricxArgsByName nameBytes bs args)
(value rest : ok (hostTree value) rest))
runArboricxByNameToString = (nameBytes bs args :
bindResult (runArboricxArgsByName nameBytes bs args)
(value rest : wrapHostValue hostString? hostString value rest))
runArboricxByNameToNumber = (nameBytes bs args :
bindResult (runArboricxArgsByName nameBytes bs args)
(value rest : wrapHostValue hostNumber? hostNumber value rest))
runArboricxByNameToBool = (nameBytes bs args :
bindResult (runArboricxArgsByName nameBytes bs args)
(value rest : wrapHostValue hostBool? hostBool value rest))
runArboricxByNameToList = (nameBytes bs args :
bindResult (runArboricxArgsByName nameBytes bs args)
(value rest : wrapHostValue hostList? hostList value rest))
runArboricxByNameToBytes = (nameBytes bs args :
bindResult (runArboricxArgsByName nameBytes bs args)
(value rest : wrapHostValue hostBytes? hostBytes value rest))
runArboricxToTree = (bs args : runArboricxByNameToTree [] bs args)
runArboricxToString = (bs args : runArboricxByNameToString [] bs args)
runArboricxToNumber = (bs args : runArboricxByNameToNumber [] bs args)
runArboricxToBool = (bs args : runArboricxByNameToBool [] bs args)
runArboricxToList = (bs args : runArboricxByNameToList [] bs args)
runArboricxToBytes = (bs args : runArboricxByNameToBytes [] bs args)


@@ -2795,4 +2795,65 @@ manifestReadingTests = testGroup "Manifest Reading Tests"
let env = evalTricu library (parseTricu input)
toString (result env) @?= Right "left"
close srcConn
, testCase "host ABI: constructors expose tag and payload" $ do
library <- evaluateFile "./lib/arboricx.tri"
let stringInput = "hostString \"hello\""
stringEnv = evalTricu library (parseTricu stringInput)
result stringEnv @?= pairT (ofNumber 1) (ofString "hello")
let tagEnv = evalTricu library (parseTricu "hostValueTag (hostNumber 42)")
result tagEnv @?= ofNumber 2
let payloadEnv = evalTricu library (parseTricu "hostValuePayload (hostBool true)")
result payloadEnv @?= trueT
, testCase "runArboricxToTree: wraps raw result as hostTree" $ do
(srcConn, termHash, originalTerm) <- storeTermInTempDB $ unlines
[ "main = t t" ]
wireData <- exportBundle srcConn [termHash]
let input = "matchResult "
++ " (code rest : err code rest) "
++ " (hostValue rest : ok hostValue []) "
++ " (runArboricxToTree " ++ bytesExpr (map toInteger $ BS.unpack wireData) ++ " [])"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
result env @?= okT (pairT (ofNumber 0) originalTerm) (bytesT [])
close srcConn
, testCase "runArboricxToString: wraps string result as hostString" $ do
(srcConn, termHash, _) <- storeTermInTempDB $ unlines
[ "main = (x : x)" ]
wireData <- exportBundle srcConn [termHash]
let input = "matchResult "
++ " (code rest : err code rest) "
++ " (hostValue rest : ok hostValue []) "
++ " (runArboricxToString " ++ bytesExpr (map toInteger $ BS.unpack wireData) ++ " [(\"hello\")])"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
result env @?= okT (pairT (ofNumber 1) (ofString "hello")) (bytesT [])
close srcConn
, testCase "runArboricxToNumber: wraps number result as hostNumber" $ do
(srcConn, termHash, _) <- storeTermInTempDB $ unlines
[ "main = 42" ]
wireData <- exportBundle srcConn [termHash]
let input = "matchResult "
++ " (code rest : err code rest) "
++ " (hostValue rest : ok hostValue []) "
++ " (runArboricxToNumber " ++ bytesExpr (map toInteger $ BS.unpack wireData) ++ " [])"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
result env @?= okT (pairT (ofNumber 2) (ofNumber 42)) (bytesT [])
close srcConn
, testCase "runArboricxToBool: rejects non-bool result" $ do
(srcConn, termHash, _) <- storeTermInTempDB $ unlines
[ "main = 42" ]
wireData <- exportBundle srcConn [termHash]
let input = "runArboricxToBool " ++ bytesExpr (map toInteger $ BS.unpack wireData) ++ " []"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
case result env of
Fork falseTag (Fork code _) | falseTag == falseT -> code @?= ofNumber 14
actual -> assertFailure $ "expected host codec error, got: " ++ show actual
close srcConn
]