4 Commits

21 changed files with 3474 additions and 1483 deletions

122
AGENTS.md
View File

@@ -2,6 +2,10 @@
> For AI agents and contributors working in this repository. > For AI agents and contributors working in this repository.
## 0. TDD
Write and discuss tests with the user before implementing any implementation code.
## 1. Build & Test ## 1. Build & Test
```bash ```bash
@@ -128,114 +132,18 @@ hash = SHA256("arboricx.merkle.node.v1" <> 0x00 <> serialized_node)
This is stored in SQLite via `ContentStore.hs`. Hash suffixes on identifiers (e.g., `foo_abc123...`) are validated: 1664 hex characters (SHA256). This is stored in SQLite via `ContentStore.hs`. Hash suffixes on identifiers (e.g., `foo_abc123...`) are validated: 1664 hex characters (SHA256).
## 7. Arboricx Portable Wire Format ## 7. Arboricx Portable Bundles (`.arboricx`)
The **Arboricx wire format** (module `Wire.hs`) defines a portable binary bundle for exchanging Tree Calculus terms, their Merkle DAGs, and associated metadata. It is versioned and schema-driven. Portable executable bundles are generated via `Wire.hs`. See `docs/arboricx-bundle-format.md` for the full binary format spec.
### Header ```bash
# Export a bundle from the content store
./result/bin/tricu export -o myterm.arboricx myterm
# Run a bundle (requires TRICU_DB_PATH)
./result/bin/tricu import -f lib/list.tri
TRICU_DB_PATH=/tmp/tricu.db ./result/bin/tricu export -o list_ops.arboricx append
``` ```
+------------------+-----------------+------------------+----------------+
| Magic (8 bytes) | Major (2 bytes) | Minor (2 bytes) | Section Count |
| | | | (4 bytes) |
+------------------+-----------------+------------------+----------------+
| Flags (8 bytes) | Dir Offset (8 bytes)
+------------------+-----------------+------------------+
```
- **Magic**: `ARBORICX` (`0x41 0x52 0x42 0x4f 0x52 0x49 0x43 0x58`)
- **Header length**: 32 bytes
- **Major version**: `1` | **Minor version**: `0`
### Section Directory
Immediately follows the header. Each section entry is 60 bytes:
```
+------------------+------------------+-----------------+------------------+
| Type (4 bytes) | Version (2 bytes)| Flags (2 bytes) | Compression (2) |
+------------------+------------------+-----------------+------------------+
| Digest Algo (2) | Offset (8 bytes) | Length (8 bytes)| SHA256 digest (32)|
+------------------+------------------+-----------------+------------------+
```
Known section types:
| Type | Name | Required | Description |
|------|-----------|----------|-------------|
| 1 | manifest | Yes | JSON manifest metadata |
| 2 | nodes | Yes | Binary Merkle node payloads |
### Section 1 — Manifest (JSON)
The manifest describes the bundle's semantics, exports, and schema. Key fields:
| Field | Value | Description |
|-------|-------|-------------|
| `schema` | `"arboricx.bundle.manifest.v1"` | Manifest schema version |
| `bundleType` | `"tree-calculus-executable-object"` | Bundle category |
| `tree.calculus` | `"tree-calculus.v1"` | Tree calculus version |
| `tree.nodeHash.algorithm` | `"sha256"` | Hash algorithm |
| `tree.nodeHash.domain` | `"arboricx.merkle.node.v1"` | Hash domain string |
| `tree.nodePayload` | `"arboricx.merkle.payload.v1"` | Payload encoding |
| `runtime.semantics` | `"tree-calculus.v1"` | Evaluation semantics |
| `runtime.abi` | `"arboricx.abi.tree.v1"` | Runtime ABI |
| `closure` | `"complete"` | Bundle must be a complete DAG |
| `roots` | `[{"hash": "...", "role": "..."}]` | Named root hashes |
| `exports` | `[{"name": "...", "root": "..."}]` | Export aliases for roots |
| `metadata.createdBy` | `"arboricx"` | Originator |
### Section 2 — Nodes (Binary)
```
+------------------+-------------------+-------------------+-----------------+
| Node Count (8) | Hash (32 bytes) | Payload Len (4) | Payload (N) |
+------------------+-------------------+-------------------+-----------------+
```
Each node entry contains:
- 32-byte Merkle hash (hex-encoded in identifiers, raw in binary)
- 4-byte big-endian payload length
- N bytes of serialized node payload (`0x00` for Leaf, `0x01 || hash` for Stem, `0x02 || left || right` for Fork)
### Bundle verification flow
1. Check magic bytes
2. Validate major version
3. Parse section directory
4. For each section: verify SHA256 digest against actual bytes
5. Decode JSON manifest
6. Decode binary node entries into Merkle DAG
7. Verify all root hashes present in manifest exist in node map
8. Verify export root hashes present
9. Verify children references are complete (no dangling nodes)
10. Reject unknown critical sections
### Data types (Wire.hs)
| Type | Purpose |
|------|---------|
| `Bundle` | Top-level bundle: version, roots, nodes map, manifest |
| `BundleManifest` | JSON metadata: schema, tree spec, runtime spec, roots, exports |
| `TreeSpec` | Tree calculus version + hash algorithm + payload encoding |
| `NodeHashSpec` | Hash algorithm and domain string |
| `RuntimeSpec` | Semantics, evaluation order, ABI, capabilities |
| `BundleRoot` | Root hash + role (`"default"` or `"root"`) |
| `BundleExport` | Export name + root hash + kind + ABI |
| `BundleMetadata` | Optional package, version, description, license, createdBy |
| `ClosureMode` | `ClosureComplete` or `ClosurePartial` |
### Key functions
| Function | Signature | Purpose |
|----------|-----------|---------|
| `encodeBundle` | `Bundle → ByteString` | Serialize bundle to wire bytes |
| `decodeBundle` | `ByteString → Either String Bundle` | Parse wire bytes into Bundle |
| `verifyBundle` | `Bundle → Either String ()` | Validate DAG, manifest, roots |
| `collectReachableNodes` | `Connection → MerkleHash → IO [(MerkleHash, ByteString)]` | Traverse DAG from root |
| `exportBundle` | `Connection → [MerkleHash] → IO ByteString` | Build bundle from content store |
| `exportNamedBundle` | `Connection → [(Text, MerkleHash)] → IO ByteString` | Build with named roots |
| `importBundle` | `Connection → ByteString → IO [MerkleHash]` | Import bundle into content store |
## 8. Directory Layout ## 8. Directory Layout
@@ -273,12 +181,12 @@ tricu/
## 9. JS Arboricx Runtime ## 9. JS Arboricx Runtime
A JavaScript implementation of the Arboricx portable bundle runtime lives in `ext/js/`. A JavaScript implementation of the Arboricx portable bundle runtime lives in `ext/js/`.
It is a reference implementation — not a tricu source parser. It reads `.tri.bundle` files produced by the Haskell toolchain, verifies Merkle node hashes, reconstructs tree values, and reduces them. It is a reference implementation — not a tricu source parser. It reads `.arboricx` files produced by the Haskell toolchain, verifies Merkle node hashes, reconstructs tree values, and reduces them.
From project root: From project root:
```bash ```bash
node ext/js/src/cli.js inspect test/fixtures/id.tri.bundle node ext/js/src/cli.js inspect test/fixtures/id.arboricx
node ext/js/src/cli.js run test/fixtures/true.tri.bundle node ext/js/src/cli.js run test/fixtures/true.arboricx
``` ```
The JS runtime implements: The JS runtime implements:

View File

@@ -1,339 +0,0 @@
# Arboricx Portable Bundle v1 (CBOR Manifest Profile)
Status: **Draft, implementation-aligned** (derived from `src/Wire.hs` as of 2026-05-07)
This document specifies the **actual on-wire format and validation behavior** currently implemented by `tricu` for Arboricx bundles, with a focus on the newer CBOR manifest path.
---
## 1. Scope
This profile defines:
1. The binary container envelope (header + section directory + section payloads).
2. The CBOR manifest section format.
3. The Merkle node section format.
4. Decode/verify/import behavior in `Wire.hs`.
5. Known gaps and sane resolutions.
Non-goals:
- tricu source parsing/lambda elimination/module semantics.
- Signature systems / trust policy.
- Compression codecs beyond `none`.
---
## 2. Container format
A bundle is a byte stream:
```
[32-byte header]
[section directory: section_count * 60 bytes]
[section payload bytes...]
```
### 2.1 Header (32 bytes)
| Field | Size | Encoding | Value / Notes |
|---|---:|---|---|
| Magic | 8 | raw bytes | `41 52 42 4f 52 49 58 00` (`"ARBORICX"`) |
| Major | 2 | u16 BE | Must be `1` |
| Minor | 2 | u16 BE | Currently `0` |
| SectionCount | 4 | u32 BE | Number of section directory entries |
| Flags | 8 | u64 BE | Currently emitted as `0`; not interpreted |
| DirectoryOffset | 8 | u64 BE | Offset of section directory (currently `32`) |
Reader behavior:
- Reject if total bytes < 32.
- Reject bad magic.
- Reject major != 1.
### 2.2 Section directory entry (60 bytes each)
| Field | Size | Encoding | Notes |
|---|---:|---|---|
| Type | 4 | u32 BE | e.g. 1=manifest, 2=nodes |
| Version | 2 | u16 BE | Currently emitted as `1`; not enforced on read |
| Flags | 2 | u16 BE | bit0 = critical |
| Compression | 2 | u16 BE | `0` = none (required) |
| DigestAlgorithm | 2 | u16 BE | `1` = SHA-256 (required) |
| Offset | 8 | u64 BE | Absolute byte offset |
| Length | 8 | u64 BE | Section payload length |
| Digest | 32 | raw bytes | SHA-256 of section bytes |
Reader behavior:
- Reject unknown **critical** section types.
- Reject compression != 0.
- Reject digest algorithm != 1.
- Reject out-of-bounds sections.
- Reject digest mismatch.
### 2.3 Required section types
| Type | Name | Required |
|---:|---|---|
| 1 | manifest | yes |
| 2 | nodes | yes |
Decode currently rejects duplicate section type 1 or 2.
---
## 3. Manifest section (CBOR)
Manifest bytes are CBOR-encoded map data (using `cborg`).
### 3.1 Top-level manifest schema
Top-level map has **exactly 8 keys** in this exact decode order in current implementation:
1. `schema` (text)
2. `bundleType` (text)
3. `tree` (map)
4. `runtime` (map)
5. `closure` (text: `"complete"|"partial"`)
6. `roots` (array)
7. `exports` (array)
8. `metadata` (map)
> Important: Current decoder is order-strict; it expects keys in this sequence.
### 3.2 Nested structures
#### `tree` map (3 keys, order-strict)
- `calculus`: text
- `nodeHash`: map
- `nodePayload`: text
`nodeHash` map (2 keys, order-strict):
- `algorithm`: text
- `domain`: text
#### `runtime` map (4 keys, order-strict)
- `semantics`: text
- `evaluation`: text
- `abi`: text
- `capabilities`: array(text)
#### `roots` array of maps
Each root map has 2 keys (order-strict):
- `hash`: bytes (raw 32-byte hash payload encoded as CBOR byte string)
- `role`: text
#### `exports` array of maps
Each export map has 4 keys (order-strict):
- `name`: text
- `root`: bytes (32-byte hash)
- `kind`: text
- `abi`: text
#### `metadata` map
Flexible key set; decoded as map(text -> text), then projected into optional fields:
- `package`
- `version`
- `description`
- `license`
- `createdBy`
Unknown metadata keys are ignored.
### 3.3 Default emitted manifest values
Writers in `Wire.hs` currently emit:
- `schema = "arboricx.bundle.manifest.v1"`
- `bundleType = "tree-calculus-executable-object"`
- `tree.calculus = "tree-calculus.v1"`
- `tree.nodeHash.algorithm = "sha256"`
- `tree.nodeHash.domain = "arboricx.merkle.node.v1"`
- `tree.nodePayload = "arboricx.merkle.payload.v1"`
- `runtime.semantics = "tree-calculus.v1"`
- `runtime.evaluation = "normal-order"`
- `runtime.abi = "arboricx.abi.tree.v1"`
- `runtime.capabilities = []`
- `closure = "complete"`
- `metadata.createdBy = "arboricx"`
---
## 4. Nodes section (binary)
Node section payload layout:
```
node_count: u64 BE
repeat node_count times:
hash: 32 bytes
payload_len: u32 BE
payload: payload_len bytes
```
Node payload grammar:
- `0x00` => Leaf
- `0x01 || child_hash(32)` => Stem
- `0x02 || left_hash(32)||right(32)` => Fork
Section decoder rejects:
- duplicate node hashes,
- truncated entries,
- payload overruns,
- trailing bytes after final node.
---
## 5. Verification behavior (`verifyBundle`)
`verifyBundle` enforces all of:
1. bundle version >= 1.
2. bundle has at least one node.
3. manifest constants match hardcoded v1 values (schema/type/calculus/hash algo/domain/payload/runtime semantics/ABI).
4. runtime capabilities must be empty.
5. closure must be `complete`.
6. manifest has at least one root and one export.
7. root sets in `bundleRoots` and `manifest.roots` must match exactly.
8. each root and export root exists in node map.
9. each node payload deserializes and re-hashes to declared node hash.
10. all referenced child hashes exist.
11. full closure reachability from roots succeeds.
`importBundle` runs decode + verify before storing nodes.
---
## 6. Export/import semantics
### 6.1 Export
`exportNamedBundle`:
- Traverses reachable nodes for each requested root hash.
- Builds node map.
- Builds default manifest and CBOR bytes.
- Emits two sections (manifest, nodes).
`exportBundle` auto-names exports:
- 1 root => `root`
- N>1 => `root0`, `root1`, ...
### 6.2 Import
`importBundle`:
1. Decode bundle.
2. Verify bundle.
3. Insert all node payloads into content store.
4. For each manifest export: reconstruct tree by export root and store name binding in DB.
5. Return bundle root list.
---
## 7. Determinism properties
Current implementation is deterministic for identical logical input because:
- Node map serialized in ascending hash order (`Map.toAscList`).
- Field order in manifest encoding is fixed by code.
- Section ordering is fixed: manifest then nodes.
So repeated exports of same roots produce byte-identical bundles.
---
## 8. Known gaps and sane resolutions
These are important design gaps visible from current code.
### Gap A: Node hash domain mismatch risk (critical)
Status: **resolved in current codebase**.
What was wrong:
- Manifest declared `tree.nodeHash.domain = "arboricx.merkle.node.v1"`.
- Hashing implementation previously used `"tricu.merkle.node.v1"`.
Current state:
- Haskell hashing now uses `"arboricx.merkle.node.v1"`.
- JS reference runtime hashing now uses `"arboricx.merkle.node.v1"`.
- JS manifest validation now requires `"arboricx.merkle.node.v1"`.
Remaining recommendation:
- Keep hash-domain constants centralized/shared to prevent future drift.
- Add explicit test vectors for Leaf/Stem/Fork hashes under the Arboricx domain.
### Gap B: CBOR decode is order-strict, not generic-map tolerant
Observed:
- Decoder expects exact key order for most maps.
Impact:
- Another canonical CBOR writer that reorders keys may decode-fail even if semantically equivalent.
Sane resolution:
- For v1 compatibility, decode maps as unordered key/value collections, require key presence and types, and reject unknown keys only where desired.
- Keep writer deterministic, but relax reader.
### Gap C: “Canonical CBOR” claim is stronger than implementation
Observed:
- Writer uses fixed order but does not explicitly sort keys per RFC 8949 canonical ordering rules.
Sane resolution:
- Either (a) rename as “deterministic CBOR” profile, or (b) implement explicit canonical key ordering and canonical-length/minimal integer forms checks.
### Gap D: Extra section preservation
Observed:
- Decoder tolerates unknown non-critical sections, but `Bundle` model/encoder drops them on re-encode.
Sane resolution:
- Add `bundleExtraSections :: [SectionEntry+Bytes]` if round-trip preservation is desired.
### Gap E: Section version not enforced
Observed:
- Section entry `Version` is parsed but unused.
Sane resolution:
- Enforce known version matrix (e.g., manifest v1, nodes v1), or explicitly document “advisory only”.
### Gap F: Runtime capability policy is hard fail
Observed:
- Any non-empty capabilities list is rejected.
Sane resolution:
- Keep strict for now, but define capability negotiation strategy for v1.1+ (unknown capabilities => reject unless explicitly allowed by host policy).
### Gap G: Error handling style in import/export path
Observed:
- Several paths throw `error` for malformed data/store misses.
Sane resolution:
- Return `Either`-style typed errors through public API (`decode`, `verify`, `import`), reserve exceptions for truly internal faults.
---
## 9. Conformance checklist (v1 current)
A conforming v1 reader/writer for this profile should:
- Implement the 32-byte header and 60-byte section records exactly.
- Support required sections 1 and 2.
- Verify section digests with SHA-256.
- Decode/encode manifest CBOR matching the field model above.
- Parse nodes section and validate node payload structure.
- Recompute and verify node hashes.
- Enforce complete closure for roots.
- Enforce manifest/runtime constants used by v1.
---
## 10. Suggested follow-up docs
To stabilize interoperability, add:
1. `docs/arboricx-bundle-test-vectors.md` (golden header/manifest/nodes + expected hashes).
2. `docs/arboricx-bundle-errors.md` (normative error codes/strings).
3. `docs/arboricx-bundle-evolution.md` (rules for minor/major upgrades, capability negotiation, extra sections).

View File

@@ -0,0 +1,419 @@
# Arboricx Portable Bundle Format Specification
**Version:** 0.1
**Status:** Exploratory
**Author:** A range of slopmachines guided by James Eversole
**Human Review Status:** 5 minute scan-through - this is an evolving and malleable document
The Arboricx Portable Bundle is a self-contained, content-addressed binary format for distributing Tree Calculus programs and their associated Merkle DAGs. It provides:
- A fixed binary container with header, section directory, and typed sections
- A language-neutral Merkle node layer for content-addressed tree values
- A fixed-order binary manifest for semantic metadata, exports, and optional extensions
## Table of Contents
1. [Top-Level Container Layout](#1-top-level-container-layout)
2. [Header](#2-header)
3. [Section Directory](#3-section-directory)
4. [Section: Manifest (type 1)](#4-section-manifest-type-1)
5. [Section: Nodes (type 2)](#5-section-nodes-type-2)
6. [Merkle Node Payload Format](#6-merkle-node-payload-format)
7. [Merkle Hash Computation](#7-merkle-hash-computation)
8. [Tree Calculus Reduction Semantics](#8-tree-calculus-reduction-semantics)
9. [Binary Primitives](#9-binary-primitives)
10. [Bundle Verification](#10-bundle-verification)
11. [Known Section Types](#11-known-section-types)
---
## 1. Top-Level Container Layout
An Arboricx bundle is a flat binary blob with the following layout:
```
+------------------+------------------+------------------+------------------+
| Header | Section Directory| Manifest Section | Nodes Section |
| (32 bytes) | (N × 60 bytes) | (variable) | (variable) |
+------------------+------------------+------------------+------------------+
```
The container uses **big-endian** byte order for all multi-byte integers.
Total bundle size = 32 + (sectionCount × 60) + manifestSize + nodesSize
---
## 2. Header
| Offset | Size | Field | Description |
|--------|------|-------|-------------|
| 0 | 8 bytes | Magic | ASCII `"ARBORICX"` (`0x41 0x52 0x42 0x4F 0x52 0x49 0x43 0x58`) |
| 8 | 2 bytes | Major version | `u16` BE. Currently `1` |
| 10 | 2 bytes | Minor version | `u16` BE. Currently `0` |
| 12 | 4 bytes | Section count | `u32` BE. Number of entries in the section directory |
| 16 | 8 bytes | Flags | `u64` BE. Reserved; currently all zeros |
| 24 | 8 bytes | Directory offset | `u64` BE. Byte offset from the start of the bundle to the section directory |
**Constraints:**
- Major version must be `1`. Bundles with unsupported major versions are rejected.
- The directory offset must point to a valid location within the bundle.
- The directory offset is always `32` for bundles with the current layout (header immediately followed by the directory).
---
## 3. Section Directory
The section directory is an array of `N` entries, where `N` is the section count from the header. Each entry is exactly **60 bytes**.
| Offset (within entry) | Size | Field | Description |
|----------------------|------|-------|-------------|
| 0 | 4 bytes | Type | `u32` BE. Section type identifier (see [Known Section Types](#11-known-section-types)) |
| 4 | 2 bytes | Version | `u16` BE. Section-specific version |
| 6 | 2 bytes | Flags | `u16` BE. Bit flags: bit 0 (`0x0001`) = critical section |
| 8 | 2 bytes | Compression | `u16` BE. Compression codec (currently only `0` = none) |
| 10 | 2 bytes | Digest algorithm | `u16` BE. Hash algorithm (currently only `1` = SHA-256) |
| 12 | 8 bytes | Offset | `u64` BE. Byte offset from the start of the bundle to the section data |
| 20 | 8 bytes | Length | `u64` BE. Length of the section data in bytes |
| 28 | 32 bytes | SHA-256 digest | Raw digest of the section data |
**Verification:**
- Unknown critical sections (flags & `0x0001`) are rejected.
- Compression must be `0` (none).
- Digest algorithm must be `1` (SHA-256).
- The SHA-256 digest in the directory entry must match `SHA256(section_data)`.
---
## 4. Section: Manifest (type 1)
The manifest is a binary encoding of bundle metadata. It uses a **fixed-order core** layout followed by an optional **TLV tail** for extensibility.
### 4.1 Format
```
Manifest =
magic 8 bytes "ARBMNFST"
major u16 BE Manifest major version (1)
minor u16 BE Manifest minor version (0)
schema string Length-prefixed UTF-8 text
bundleType string Length-prefixed UTF-8 text
treeCalculus string Length-prefixed UTF-8 text
treeHashAlgorithm string Length-prefixed UTF-8 text
treeHashDomain string Length-prefixed UTF-8 text
treeNodePayload string Length-prefixed UTF-8 text
runtimeSemantics string Length-prefixed UTF-8 text
runtimeEvaluation string Length-prefixed UTF-8 text
runtimeAbi string Length-prefixed UTF-8 text
capabilityCount u32 BE Number of capability strings
capabilities string[] Array of length-prefixed UTF-8 capability strings
closure u8 0 = complete, 1 = partial
rootCount u32 BE Number of root entries
roots Root[] Array of root entries
exportCount u32 BE Number of export entries
exports Export[] Array of export entries
metadataFieldCount u32 BE Number of metadata TLV entries
metadataFields TLV[] Metadata tag-value entries
extensionFieldCount u32 BE Number of extension TLV entries
extensionFields TLV[] Extension tag-value entries (skipped by parsers)
```
**Trailing bytes after the manifest must be zero** (no leftover data).
### 4.2 String Format
Every `string` field uses the same encoding:
```
string =
length u32 BE Number of UTF-8 bytes in the string (not the number of characters)
bytes byte[length] UTF-8 encoded string content
```
The length field carries the byte count, so parsers can skip strings without decoding UTF-8.
### 4.3 Root Entry
```
Root =
hash 32 bytes Raw SHA-256 hash of the Merkle node
role string Length-prefixed UTF-8 text ("default" for the first root, "root" for others)
```
The hash is stored as **raw bytes** (not hex-encoded). It corresponds to the Merkle hash of the node.
### 4.4 Export Entry
```
Export =
name string Length-prefixed UTF-8 text (export identifier)
root 32 bytes Raw SHA-256 hash of the Merkle node
kind string Length-prefixed UTF-8 text (currently "term")
abi string Length-prefixed UTF-8 text (ABI string)
```
### 4.5 TLV Entry
```
TLV =
tag u16 BE Tag identifier (type)
length u32 BE Number of bytes in the value
value byte[length] Raw bytes
```
TLV entries support variable-length values and are skippable by parsers that do not recognize a tag: read the `u32` length and advance by `2 + 4 + length` bytes.
### 4.6 Metadata Tags
| Tag | Name | Value |
|-----|------|-------|
| 1 | package | UTF-8 text: package name |
| 2 | version | UTF-8 text: version string |
| 3 | description | UTF-8 text: description |
| 4 | license | UTF-8 text: license identifier or text |
| 5 | createdBy | UTF-8 text: creator identifier |
Unknown metadata tags are ignored. Unknown extension tags are skipped by length.
### 4.7 Semantic Constraints
A valid bundle manifest must satisfy:
| Constraint | Value |
|-----------|-------|
| `schema` | `"arboricx.bundle.manifest.v1"` |
| `bundleType` | `"tree-calculus-executable-object"` |
| `treeCalculus` | `"tree-calculus.v1"` |
| `treeHashAlgorithm` | `"sha256"` |
| `treeHashDomain` | `"arboricx.merkle.node.v1"` |
| `treeNodePayload` | `"arboricx.merkle.payload.v1"` |
| `runtimeSemantics` | `"tree-calculus.v1"` |
| `runtimeAbi` | `"arboricx.abi.tree.v1"` |
| `runtimeCapabilities` | Empty array |
| `closure` | `0` (complete) |
| `rootCount` | At least 1 |
| `exportCount` | At least 1 |
| Export names | Non-empty |
| Export roots | Non-empty (32 bytes each) |
---
## 5. Section: Nodes (type 2)
The nodes section contains all Merkle DAG nodes referenced by the manifest. It is a sequence of node entries preceded by a count.
```
NodesSection =
nodeCount u64 BE Total number of node entries
entries NodeEntry[]
```
Each node entry:
```
NodeEntry =
hash 32 bytes Raw SHA-256 hash of this node
payloadLen u32 BE Length of the payload in bytes
payload byte[payloadLen] Node payload (see Section 6)
```
The node count is `u64` to support large bundles. Entries are stored in the order produced by the exporter (typically sorted by hash for determinism).
---
## 6. Merkle Node Payload Format
Each node in the Merkle DAG is one of three types. The payload is a single byte type tag followed by hash references:
### Leaf
```
Payload = 0x00
```
A leaf has no children. The payload is exactly 1 byte.
### Stem
```
Payload = 0x01 || child_hash (32 bytes raw)
```
A stem has exactly one child. The payload is 33 bytes.
### Fork
```
Payload = 0x02 || left_hash (32 bytes raw) || right_hash (32 bytes raw)
```
A fork has exactly two children. The payload is 65 bytes.
**Validation:**
- Leaf payloads must be exactly 1 byte (`0x00`).
- Stem payloads must be exactly 33 bytes.
- Fork payloads must be exactly 65 bytes.
- Unknown type bytes are rejected.
---
## 7. Merkle Hash Computation
Each node is identified by a SHA-256 hash of its canonical payload:
```
hash = SHA256( domain_tag || 0x00 || payload )
```
Where:
| Component | Value |
|-----------|-------|
| `domain_tag` | `"arboricx.merkle.node.v1"` as UTF-8 bytes |
| Separator | `0x00` (one zero byte) |
| `payload` | The node's canonical serialization from Section 6 |
**Examples:**
- **Leaf:** `SHA256("arboricx.merkle.node.v1" || 0x00 || 0x00)`
- **Stem:** `SHA256("arboricx.merkle.node.v1" || 0x00 || 0x01 || child_hash_bytes)`
- **Fork:** `SHA256("arboricx.merkle.node.v1" || 0x00 || 0x02 || left_hash_bytes || right_hash_bytes)`
The resulting SHA-256 hash is stored as a hex-encoded string in the manifest (64 hex characters). Within the nodes section, it is stored as raw bytes.
---
## 8. Tree Calculus Reduction Semantics
The bundle represents a **Tree Calculus** term as a Merkle DAG. The reduction rules are:
### Apply Rules
```
apply(Fork(Leaf, a), _) = a
apply(Fork(Stem(a), b), c) = apply(apply(a, c), apply(b, c))
apply(Fork(Fork, _, _), Leaf) = left of inner Fork
apply(Fork(Fork, _, _), Stem) = right of inner Fork
apply(Fork(Fork, _, _), Fork) = apply(apply(c, u), v) where c = Fork(u, v)
apply(Leaf, b) = Stem(b)
apply(Stem(a), b) = Fork(a, b)
```
### Internal Representation
In the reduction engine, Fork nodes use a `[right, left]` (stack) ordering:
- `Fork = [right_child, left_child]`
- `Stem = [child]`
- `Leaf = []`
This ordering supports stack-based reduction: pop two terms, apply, push results back.
### Closure
The bundle declares `closure = "complete"`, meaning all nodes reachable from export roots are present in the nodes section. No external references exist.
---
## 9. Binary Primitives
All multi-byte integers use **big-endian** byte order.
### u16 (2 bytes)
```
byte[0] | byte[1]
value = (byte[0] << 8) | byte[1]
```
### u32 (4 bytes)
```
byte[0] | byte[1] | byte[2] | byte[3]
value = (byte[0] << 24) | (byte[1] << 16) | (byte[2] << 8) | byte[3]
```
### u64 (8 bytes)
```
byte[0] ... byte[7]
value = (byte[0] << 56) | ... | byte[7]
```
### u8 (1 byte)
A single byte, value `0-255`.
---
## 10. Bundle Verification
A complete bundle verification proceeds in this order:
1. **Magic check:** First 8 bytes must be `"ARBORICX"`.
2. **Version check:** Major version must be `1`.
3. **Section directory:** Parse all entries; reject unknown critical sections.
4. **Digest verification:** For each section, compute `SHA256(section_data)` and compare with the digest in the directory entry.
5. **Manifest parsing:** Decode the fixed-order manifest; validate semantic constraints.
6. **Node section:** Parse all node entries; reject duplicates.
7. **Root verification:** All root hashes from the manifest must exist in the node map.
8. **Export verification:** All export root hashes must exist in the node map.
9. **Node hash verification:** For each node, compute `SHA256(domain || 0x00 || payload)` and compare with the stored hash.
10. **Children verification:** For each Stem/Fork node, both child hashes must exist in the node map.
11. **Closure verification:** Starting from each root hash, traverse the DAG and confirm all reachable nodes are present.
---
## 11. Known Section Types
| Type | Name | Required | Version | Description |
|------|------|----------|---------|-------------|
| 1 | Manifest | Yes | 1 | Bundle metadata in fixed-order binary format |
| 2 | Nodes | Yes | 1 | Merkle DAG node entries |
Unknown section types are permitted if not marked as critical (flags bit 0 is not set).
---
## Appendix A: Complete Example Layout (id.arboricx)
A minimal `id.arboricx` bundle has:
```
+---------------------------------------------------+
| Header (32 bytes) |
| Magic: "ARBORICX" |
| Major: 1, Minor: 0 |
| Section count: 2 |
| Flags: 0 |
| Dir offset: 32 |
+---------------------------------------------------+
| Section Directory (120 bytes = 2 × 60) |
| Entry 0: type=1 (manifest), offset=152, len=375 |
| Entry 1: type=2 (nodes), offset=527, len=284 |
+---------------------------------------------------+
| Manifest Section (375 bytes) |
| Magic: "ARBMNFST" |
| Version: 1.0 |
| Core strings (schema, bundleType, tree spec, |
| runtime spec, capabilities, closure, roots, |
| exports, metadata TLVs, extension fields) |
+---------------------------------------------------+
| Nodes Section (284 bytes) |
| Node count: 2 |
| Node entry 1: hash + payload (Leaf) |
| Node entry 2: hash + payload (Fork) |
+---------------------------------------------------+
```
The manifest section starts at byte 152 (0x98) and the nodes section at byte 527 (0x20F).
---
## Appendix B: File Extension
Bundles produced by the `tricu` tool use the `.arboricx` file extension. The `.tri` extension is used for plain source files; the `.arboricx` extension identifies the portable binary format.

247
docs/host-abi.md Normal file
View File

@@ -0,0 +1,247 @@
# tricu Host ABI
This document specifies the first host-facing ABI for self-hosted Arboricx execution.
The ABI is intentionally small. A host language should only need to implement Tree Calculus construction/reduction plus a tiny set of canonical payload codecs. Higher-level execution policy lives in Tree Calculus.
## Goals
- Keep host-language implementations small and auditable.
- Preserve canonical Tree Calculus representations for payloads.
- Provide a stable tagged envelope so hosts do not need per-application result conventions.
- Reuse the existing `ok` / `err` result protocol.
- Support typed execution wrappers for common return types.
## Non-goals
- This ABI does not remove the need for host codecs entirely.
- This ABI does not define every possible application protocol.
- This ABI does not require auto-detecting arbitrary result types.
## Outer result protocol
Host ABI runners return the existing tricu result shape from `lib/binary.tri`:
```tricu
ok value rest = pair true (pair value rest)
err code rest = pair false (pair code rest)
```
On success, `value` is a host ABI value.
On failure, `code` is a canonical Tree Calculus number. The host may report the numeric code and optionally inspect `rest` for debugging.
## Host ABI value shape
A host ABI value is:
```tricu
pair tag payload
```
The `tag` says how the host should interpret `payload`.
The payload is always the canonical/raw Tree Calculus representation for that type. The ABI envelope tags the payload; it does not replace or recursively wrap canonical Tree Calculus data.
## Tags
Initial tags:
```tricu
hostTreeTag = 0
hostStringTag = 1
hostNumberTag = 2
hostBoolTag = 3
hostListTag = 4
hostBytesTag = 5
```
Planned/error tag, if needed later:
```tricu
hostErrorTag = 6
```
The first implementation keeps errors in the outer `err` result protocol rather than returning `hostError` inside `ok`.
## Constructors
The ABI constructors are:
```tricu
hostTree value
hostString bytes
hostNumber n
hostBool b
hostList xs
hostBytes bytes
```
Each constructor returns:
```tricu
pair tag payload
```
Examples:
```tricu
hostString "hello"
hostNumber 42
hostBool true
hostList [1 2 3]
hostTree (t t t)
```
## Payload conventions
Payloads use existing canonical tricu encodings:
| ABI value | Payload |
| --- | --- |
| `hostTree` | arbitrary raw Tree Calculus value |
| `hostString` | canonical string/byte-list representation |
| `hostNumber` | canonical tricu number |
| `hostBool` | canonical tricu bool (`false = t`, `true = t t`) |
| `hostList` | canonical tricu list (`t` empty, `pair head tail` cons) |
| `hostBytes` | canonical byte list |
`hostList` payloads are raw canonical lists, **not** lists of host ABI values.
## Accessors / matching
The first ABI should expose simple accessors:
```tricu
hostValueTag hostValue
hostValuePayload hostValue
```
A host can decode the envelope by destructuring the pair directly, but these helpers make the ABI explicit and testable.
## Validation predicates
Typed runners should validate that the raw application result can be interpreted as the requested type before wrapping it.
Initial predicates:
```tricu
hostNumber? value
hostBool? value
hostList? value
hostString? value
hostBytes? value
```
These predicates are structural checks over canonical encodings. They are not general semantic type inference.
Important ambiguity note:
Tree Calculus encodings are not globally disjoint. For example, `t` is also `false`, `0`, and `[]`. Typed runners intentionally interpret values according to the requested type.
## Error behavior
Typed ABI runners return an error if the application result does not match the requested type.
Initial error code:
```tricu
errHostCodecFailed = 14
```
Example:
```tricu
runArboricxToString bundle args
```
returns:
```tricu
ok (hostString resultBytes) rest
```
if `resultBytes` is string-like, otherwise:
```tricu
err errHostCodecFailed result
```
where `result` is the raw application result that failed validation.
## Execution wrappers
The base self-hosted Arboricx runners are defined in `lib/arboricx.tri`:
```tricu
runArboricxArgs bundleBytes args
runArboricxArgsByName nameBytes bundleBytes args
```
Host ABI wrappers layer typed output envelopes on top:
```tricu
runArboricxToTree bundleBytes args
runArboricxToString bundleBytes args
runArboricxToNumber bundleBytes args
runArboricxToBool bundleBytes args
runArboricxToList bundleBytes args
runArboricxToBytes bundleBytes args
```
Named-export variants:
```tricu
runArboricxByNameToTree nameBytes bundleBytes args
runArboricxByNameToString nameBytes bundleBytes args
runArboricxByNameToNumber nameBytes bundleBytes args
runArboricxByNameToBool nameBytes bundleBytes args
runArboricxByNameToList nameBytes bundleBytes args
runArboricxByNameToBytes nameBytes bundleBytes args
```
## Host usage
For a bundle whose default export is an unapplied function:
```tricu
append "hello "
```
A host that expects a string result evaluates:
```tricu
runArboricxToString bundleBytes ["james"]
```
On success, the result is:
```tricu
ok (hostString "hello james") rest
```
The host then:
1. unwraps `ok`,
2. checks `hostStringTag`,
3. decodes the canonical string payload.
## Implementation reference
- Tree constructors, numbers, strings, and lists: `src/Research.hs`
- Result protocol: `lib/binary.tri`
- Arboricx parser/executor: `lib/arboricx.tri`
- Host ABI implementation: `lib/host-abi.tri` or `lib/arboricx.tri`, depending on final organization
## First-pass invariants
Tests should cover these invariants:
1. Each constructor stores the correct tag and payload.
2. `hostValueTag` and `hostValuePayload` destructure values correctly.
3. `runArboricxToTree` always wraps successful raw results as `hostTree`.
4. `runArboricxToString` wraps string-like results as `hostString`.
5. `runArboricxToNumber` wraps number-like results as `hostNumber`.
6. `runArboricxToBool` wraps canonical booleans as `hostBool`.
7. A typed runner returns `errHostCodecFailed` when validation fails.
8. Named-export typed runners select the requested export before wrapping.

View File

@@ -0,0 +1,483 @@
# Self-hosted Arboricx Host Prototype
This document describes how to build a minimal host-language shell that can execute Arboricx bundles through the self-hosted tricu Arboricx parser/executor.
The intended reader is an implementation agent building a first prototype in a host language such as PHP. The same approach should generalize to any language with a small Tree Calculus evaluator.
See also: [`docs/host-abi.md`](./host-abi.md) for the precise host-facing ABI value tags and typed runner contract.
## Goal
Build a tiny host program that can:
1. Represent Tree Calculus values.
2. Reduce/evaluate Tree Calculus terms.
3. Load or embed the tricu Arboricx runtime kernel.
4. Read an application `.arboricx` bundle from disk.
5. Convert host inputs into canonical Tree Calculus values.
6. Apply the kernel to the application bundle and arguments.
7. Unwrap a standardized host ABI result.
8. Decode the host ABI payload back into host values.
A concrete target example:
```tricu
-- Application bundle root is an unapplied function:
append "hello "
```
The host should be able to call that bundle with the host string `"james"` and receive:
```text
hello james
```
With the Host ABI layer, the preferred conceptual call is:
```tricu
runArboricxToString <applicationBundleBytes> ["james"]
```
This returns:
```tricu
ok (hostString "hello james") rest
```
where `runArboricxToString` comes from the self-hosted Arboricx runtime kernel.
## Architectural overview
There are two Arboricx bundles involved:
1. **Kernel bundle**
- Contains the self-hosted Arboricx parser/executor written in tricu.
- Exposes ergonomic runtime entrypoints such as `runArboricxArgs` and Host ABI entrypoints such as `runArboricxToString`.
- This can be hardcoded as a Tree Calculus value in the host, or loaded by a minimal host-side Arboricx parser.
2. **Application bundle**
- The bundle the user wants to execute.
- Example: a bundle whose exported root is `append "hello "`, waiting for one more string argument.
- The host reads this file as raw bytes and encodes those bytes as a Tree Calculus byte list.
The minimal host does **not** need to understand the application bundle format if the kernel is already available as a Tree Calculus value. The host only passes the application bundle bytes to the kernel.
## Required host components
### 1. Tree representation
The host needs a representation for the three Tree Calculus constructors:
```text
Leaf
Stem child
Fork left right
```
Use whatever is idiomatic for the host language. In PHP, for a prototype, simple classes or tagged arrays are sufficient.
Example shape:
```php
abstract class T {}
final class Leaf extends T {}
final class Stem extends T { public T $child; }
final class Fork extends T { public T $left; public T $right; }
```
or tagged arrays:
```php
['tag' => 'leaf']
['tag' => 'stem', 'child' => $t]
['tag' => 'fork', 'left' => $l, 'right' => $r]
```
The evaluator and codecs only need these three constructors.
### 2. Tree Calculus evaluator
The host must implement Tree Calculus reduction. This is the core VM.
The evaluator should use normal-order evaluation, matching the runtime semantics expected by Arboricx manifests:
```text
runtimeEvaluation = "normal-order"
```
The evaluator only needs the Tree Calculus reduction rules. There is no parser requirement for the host prototype if terms are constructed directly as trees.
Implementation notes:
- Evaluation must support application: a tree applied to another tree.
- In this codebase, application is represented structurally as `Fork function argument` before reduction.
- The evaluator repeatedly reduces until normal form or until a configured step/fuel limit is reached.
- Add a fuel limit for the first prototype to avoid infinite reductions during debugging.
Reference implementation locations:
- Haskell evaluator/reduction: `src/Research.hs`
- JavaScript Arboricx runtime evaluator: `ext/js/src/` if present in the checkout
Use those as references for exact reduction behavior.
### 3. Kernel availability
The host needs access to the self-hosted Arboricx runtime kernel as a Tree Calculus value.
There are two viable bootstrap strategies.
#### Strategy A: hardcode the kernel tree
For the first host prototype, this is recommended.
Workflow:
1. Compile/export the tricu kernel entrypoint as an Arboricx bundle or tree value.
2. Convert the selected exported kernel function into a host-language Tree Calculus literal.
3. Commit/embed that literal in the host implementation.
Then the host does not need any Arboricx parser of its own for the kernel. It only needs Tree Calculus reduction.
#### Strategy B: bootstrap the kernel from an Arboricx bundle
Alternatively, the host can implement a minimal Arboricx parser just sufficient to load the kernel bundle.
This is more work up front, but avoids hardcoding a huge tree literal.
If using this strategy, the host-side parser needs to:
1. Parse the Arboricx container.
2. Parse enough manifest/export data to locate the desired kernel export.
3. Parse node records.
4. Reconstruct the selected root Tree Calculus value from the Merkle node DAG.
This logic is exactly what the tricu self-hosted kernel does, so the hardcoded-kernel path is simpler for early ports.
## Kernel entrypoints
The ergonomic runtime API currently lives in `lib/arboricx.tri`.
### Raw execution entrypoints
These return raw application results inside the existing `ok` / `err` result protocol:
```tricu
readArboricxExecutableByName nameBytes bundleBytes
readArboricxExecutable bundleBytes
runArboricxByName nameBytes bundleBytes arg
runArboricx bundleBytes arg
runArboricxArgsByName nameBytes bundleBytes args
runArboricxArgs bundleBytes args
```
`runArboricxArgs` accepts:
1. Raw application bundle bytes as a Tree Calculus byte list.
2. A Tree Calculus list of arguments.
For named exports, use `runArboricxArgsByName`, which accepts:
1. Export name as bytes.
2. Application bundle bytes as bytes.
3. Argument list.
### Host ABI typed entrypoints
For host-language ports, prefer the Host ABI typed runners. These wrap successful outputs in a tagged host ABI value so every host can decode the same envelope shape.
Default export variants:
```tricu
runArboricxToTree bundleBytes args
runArboricxToString bundleBytes args
runArboricxToNumber bundleBytes args
runArboricxToBool bundleBytes args
runArboricxToList bundleBytes args
runArboricxToBytes bundleBytes args
```
Named export variants:
```tricu
runArboricxByNameToTree nameBytes bundleBytes args
runArboricxByNameToString nameBytes bundleBytes args
runArboricxByNameToNumber nameBytes bundleBytes args
runArboricxByNameToBool nameBytes bundleBytes args
runArboricxByNameToList nameBytes bundleBytes args
runArboricxByNameToBytes nameBytes bundleBytes args
```
Recommended first host entrypoint for the `append "hello "` example:
```tricu
runArboricxToString
```
## Applying the kernel in the host evaluator
If the host has the Tree Calculus value for `runArboricxToString`, call it by constructing nested application trees.
In Tree Calculus application form:
```text
((runArboricxToString bundleBytesTree) argsTree)
```
Structurally, if `app(f, x)` constructs `Fork(f, x)`, then:
```php
$expr = app(app($kernelRunArboricxToString, $bundleBytesTree), $argsTree);
$result = normalize($expr);
```
For named export execution:
```text
(((runArboricxByNameToString nameBytesTree) bundleBytesTree) argsTree)
```
Structurally:
```php
$expr = app(
app(
app($kernelRunArboricxByNameToString, $nameBytesTree),
$bundleBytesTree
),
$argsTree
);
$result = normalize($expr);
```
## Result convention and Host ABI envelope
All runtime APIs return the existing tricu `ok` / `err` convention from `lib/binary.tri`:
```tricu
ok value rest = pair true (pair value rest)
err code rest = pair false (pair code rest)
```
The host should always unwrap this outer result first.
### Raw runners
Raw runners such as `runArboricxArgs` return:
```tricu
ok rawApplicationValue rest
```
The host must know how to interpret `rawApplicationValue`.
### Host ABI typed runners
Typed runners such as `runArboricxToString` return:
```tricu
ok hostAbiValue rest
```
A host ABI value has shape:
```tricu
pair tag payload
```
The payload is still the canonical/raw Tree Calculus representation for that type.
Initial tags are specified in [`docs/host-abi.md`](./host-abi.md):
```tricu
hostTreeTag = 0
hostStringTag = 1
hostNumberTag = 2
hostBoolTag = 3
hostListTag = 4
hostBytesTag = 5
```
For example:
```tricu
runArboricxToString bundleBytes ["james"]
```
returns:
```tricu
ok (hostString "hello james") rest
```
which is structurally:
```tricu
ok (pair hostStringTag "hello james") rest
```
### Error shape
Expected error shape:
```tricu
err code rest
```
The error code is a Tree Calculus number. Error constants are defined in:
- `lib/binary.tri`
- `lib/arboricx-common.tri`
- `lib/arboricx.tri` for Host ABI codec errors, currently `errHostCodecFailed = 14`
Typed runners return `errHostCodecFailed` if the application result cannot be interpreted as the requested type.
A prototype host can report the numeric error code and optionally dump a compact representation of `rest`.
## Example execution flow
Suppose the application bundle exports this root:
```tricu
append "hello "
```
The bundle root is an unapplied function waiting for one more string argument.
Host flow:
1. Load kernel entrypoint tree:
```php
$runArboricxToString = loadHardcodedKernelEntrypoint('runArboricxToString');
```
2. Read application bundle bytes:
```php
$bytes = file_get_contents('append-hello.arboricx');
```
3. Encode bundle bytes as a Tree Calculus byte list:
```php
$bundleBytesTree = encodeBytes($bytes);
```
4. Encode host argument(s):
```php
$arg = encodeString('james');
$args = encodeList([$arg]);
```
5. Build application expression:
```php
$expr = app(app($runArboricxToString, $bundleBytesTree), $args);
```
6. Evaluate:
```php
$result = normalize($expr);
```
7. Unwrap `ok` result:
```php
[$ok, $hostValue, $rest] = unwrapResult($result);
if (!$ok) { throw new RuntimeException('Arboricx error'); }
```
8. Unwrap Host ABI envelope:
```php
[$tag, $payload] = unwrapHostValue($hostValue);
if ($tag !== HOST_STRING_TAG) { throw new RuntimeException('Expected string'); }
```
9. Decode the payload:
```php
echo decodeString($payload); // hello james
```
## What the kernel does internally
`runArboricxToString` performs the following steps inside Tree Calculus:
1. Parse and validate the raw Arboricx bundle bytes.
2. Parse the manifest.
3. Select the default export:
- use export named `main` if present,
- otherwise use the sole export if exactly one exists,
- otherwise return an error.
4. Read the nodes section.
5. Reconstruct the selected root tree from the Merkle DAG.
6. Apply each host-provided argument in order.
7. Validate that the raw result is string-like.
8. Return `ok (hostString result) rest`, or an `err`.
`runArboricxByNameToString` is identical except that it selects a named export.
Other typed runners follow the same pattern for their requested output type.
## Tests proving the expected behavior
The relevant Haskell tests are in `test/Spec.hs` under `manifestReadingTests`.
Important cases:
- `readArboricxExecutable: reconstructs default export tree`
- `readArboricxExecutableByName: selects named export`
- `runArboricx: applies host-provided argument to default export`
- `runArboricxArgs: applies host-provided argument list in order`
- `host ABI: constructors expose tag and payload`
- `runArboricxToTree: wraps raw result as hostTree`
- `runArboricxToString: wraps string result as hostString`
- `runArboricxToNumber: wraps number result as hostNumber`
- `runArboricxToBool: rejects non-bool result`
These tests demonstrate the host-shell contract:
- application bundle bytes are supplied as a Tree Calculus byte list,
- host arguments are supplied as canonical Tree Calculus values,
- execution returns an outer result-wrapped value,
- Host ABI typed runners return a tagged ABI envelope inside `ok`.
## Minimal PHP prototype checklist
A PHP prototype should implement:
- [ ] Tree data constructors: `Leaf`, `Stem`, `Fork`.
- [ ] Application helper: `app($f, $x) = Fork($f, $x)`.
- [ ] Normal-order Tree Calculus reducer.
- [ ] Fuel/step limit for debugging.
- [ ] Hardcoded kernel entrypoint tree for `runArboricxToString` for the first string-output prototype.
- [ ] Encode application bundle file bytes into a Tree Calculus byte list.
- [ ] Encode host argument values into Tree Calculus values.
- [ ] Build expression: `((runArboricxToString bundleBytes) args)`.
- [ ] Normalize expression.
- [ ] Unwrap outer `ok` / `err` result.
- [ ] Unwrap Host ABI `pair tag payload` envelope.
- [ ] Decode payload according to tag.
For exact codec details, reference the Haskell implementation in `src/Research.hs` and the existing JS runtime if available.
## Current recommendation
For the first PHP implementation:
1. Hardcode only the `runArboricxToString` kernel entrypoint as a Tree Calculus value.
2. Do not implement host-side Arboricx parsing yet.
3. Implement only enough codecs for:
- bytes,
- strings,
- lists,
- result unwrapping,
- Host ABI envelope unwrapping.
4. Use one test fixture: an Arboricx bundle whose root is `append "hello "`.
5. Assert that calling it with `"james"` returns an outer `ok`, then a `hostString`, then payload `"hello james"`.
Once that works, add named export support via `runArboricxByNameToString` and expand Host ABI tags/codecs as needed.

View File

@@ -18,12 +18,12 @@
* Offset 8B u64 BE * Offset 8B u64 BE
* Length 8B u64 BE * Length 8B u64 BE
* SHA256Digest 32B raw * SHA256Digest 32B raw
* Manifest: canonical CBOR-encoded map (cborg output from Haskell) * Manifest: fixed-order core + TLV tail (ARBMNFST magic)
* Nodes: binary section * Nodes: binary section
*/ */
import { createHash } from "node:crypto"; import { createHash } from "node:crypto";
import { decodeCbor } from "./cbor.js"; import { decodeManifest } from "./manifest.js";
// ── Constants ─────────────────────────────────────────────────────────────── // ── Constants ───────────────────────────────────────────────────────────────
@@ -173,37 +173,12 @@ export function parseBundle(buffer) {
} }
/** /**
* Post-process a CBOR-decoded manifest to normalize hash fields * Convenience: parse and return the manifest from the fixed-order binary format.
* from raw bytes to hex strings (matching the old JSON wire format).
*/
function normalizeManifest(raw) {
const tree = raw.tree;
if (tree && tree.nodeHash && tree.nodeHash.domain) {
tree.nodeHash.domain = tree.nodeHash.domain;
}
// Convert root hashes from raw bytes to hex
const roots = (raw.roots || []).map((r) => ({
...r,
hash: r.hash instanceof Uint8Array ? Buffer.from(r.hash).toString("hex") : r.hash,
}));
// Convert export root hashes from raw bytes to hex
const exports = (raw.exports || []).map((e) => ({
...e,
root: e.root instanceof Uint8Array ? Buffer.from(e.root).toString("hex") : e.root,
}));
return { ...raw, roots, exports };
}
/**
* Convenience: parse and return the manifest from CBOR.
*/ */
export function parseManifest(buffer) { export function parseManifest(buffer) {
const bundle = parseBundle(buffer); const bundle = parseBundle(buffer);
const manifestEntry = bundle.sections.get(SECTION_MANIFEST); const manifestEntry = bundle.sections.get(SECTION_MANIFEST);
return normalizeManifest(decodeCbor(manifestEntry.data)); return decodeManifest(manifestEntry.data);
} }
/** /**

View File

@@ -1,130 +0,0 @@
/**
* cbor.js — Minimal CBOR decoder for the Arboricx manifest format.
*
* Decodes the canonical CBOR produced by the Haskell cborg library:
* - Maps: major type 5 (0xa0 + length)
* - Arrays: major type 4 (0x80 + length)
* - Text strings: major type 3, UTF-8 encoded
* - Byte strings: major type 2
* - Unsigned ints: major type 0
* - Simple values: 0xc2 = false, 0xc3 = true
*
* Only covers the subset needed for the manifest.
*/
// ── Decoding state ──────────────────────────────────────────────────────────
/**
* @param {Buffer} data
* @returns {number} remaining buffer
*/
function makeDecoder(data) {
let offset = 0;
return {
/** @returns {number} current offset */
getPos() { return offset; },
/** @returns {number} remaining bytes */
remaining() { return data.length - offset; },
/** @returns {number} total length */
length() { return data.length; },
/** Read N bytes and advance */
read(n) {
if (offset + n > data.length) {
throw new Error(`CBOR read: expected ${n} bytes, ${data.length - offset} remaining at offset ${offset}`);
}
const slice = data.slice(offset, offset + n);
offset += n;
return slice;
},
/** Read a single byte */
readByte() {
if (offset >= data.length) {
throw new Error(`CBOR readByte: no bytes remaining at offset ${offset}`);
}
return data[offset++];
},
};
}
// ── CBOR helpers ────────────────────────────────────────────────────────────
/**
* Read a CBOR length (major type initial byte encodes length for values < 24).
* For 24+, reads additional bytes per spec.
* @returns {number}
*/
function cborReadLength(dec, startByte) {
const additional = startByte & 0x1f;
if (additional < 24) return additional;
if (additional === 24) return dec.read(1)[0];
if (additional === 25) return dec.read(2).readUint16BE(0);
if (additional === 26) return dec.read(4).readUint32BE(0);
throw new Error(`CBOR: unsupported additional info ${additional}`);
}
// ── Top-level decode ────────────────────────────────────────────────────────
/**
* Decode a single CBOR value from buffer bytes.
* @param {Buffer} buf
* @returns {*}
*/
export function decodeCbor(buf) {
const dec = makeDecoder(buf);
const result = cborDecode(dec);
return result;
}
function cborDecode(dec) {
const first = dec.readByte();
const major = (first >> 5) & 0x07;
const info = first & 0x1f;
switch (major) {
case 0: // unsigned int
case 1: // negative int
return cborReadLength(dec, first);
case 2: // byte string
return dec.read(cborReadLength(dec, first));
case 3: // text string (UTF-8)
const len = cborReadLength(dec, first);
return dec.read(len).toString("utf-8");
case 4: // array
const arrLen = cborReadLength(dec, first);
const arr = [];
for (let i = 0; i < arrLen; i++) {
arr.push(cborDecode(dec));
}
return arr;
case 5: // map
const mapLen = cborReadLength(dec, first);
const map = {};
for (let i = 0; i < mapLen; i++) {
const key = cborDecode(dec);
const val = cborDecode(dec);
map[key] = val;
}
return map;
case 7: // simple values / floats
if (info === 20) return false;
if (info === 21) return true;
if (info === 22) return null; // undefined
if (info === 23) return null; // break (shouldn't appear in definite-length)
// 0xf9-fb are half/float/double floats — not used by our writer
throw new Error(`CBOR: unsupported simple value ${info}`);
default:
// Tags (major 6) and break (0xff) — not used in our manifest
throw new Error(`CBOR: unsupported major type ${major}, info ${info}`);
}
}

View File

@@ -1,13 +1,220 @@
/** /**
* manifest.js — Minimal manifest parsing and export lookup. * manifest.js — Fixed-order manifest parsing and export lookup.
* *
* The manifest is a JSON object with fields: * The manifest binary format (ManifestV1):
* schema, bundleType, tree, runtime, closure, roots, exports, * magic(8) + major(u16) + minor(u16)
* imports, sections, metadata * + schema(string) + bundleType(string)
* + treeCalculus(string) + treeHashAlgorithm(string) + treeHashDomain(string) + treeNodePayload(string)
* + runtimeSemantics(string) + runtimeEvaluation(string) + runtimeAbi(string)
* + capabilityCount(u32) + capabilities(string[])
* + closure(u8)
* + rootCount(u32) + roots[]
* + exportCount(u32) + exports[]
* + metadataFieldCount(u32) + metadataTLVs[]
* + extensionFieldCount(u32) + extensionTLVs[]
* *
* We parse only what we need for runtime entrypoint selection. * String format: u32 BE length + UTF-8 bytes.
* Root: 32 bytes raw hash + role(string).
* Export: name(string) + 32 bytes raw root hash + kind(string) + abi(string).
* TLV: u16 tag + u32 length + value bytes.
*/ */
// ── Constants ───────────────────────────────────────────────────────────────
const MANIFEST_MAGIC = "ARBMNFST";
const MANIFEST_MAJOR = 1;
const MANIFEST_MINOR = 0;
// Metadata TLV tags
const TAG_PACKAGE = 1;
const TAG_VERSION = 2;
const TAG_DESCRIPTION = 3;
const TAG_LICENSE = 4;
const TAG_CREATED_BY = 5;
// Closure bytes
const CLOSURE_COMPLETE = 0;
const CLOSURE_PARTIAL = 1;
// ── Binary helpers ──────────────────────────────────────────────────────────
function u16(buf, off) {
if (off + 2 > buf.length) throw new Error("manifest: not enough bytes for u16");
return { value: buf.readUint16BE(off), next: off + 2 };
}
function u32(buf, off) {
if (off + 4 > buf.length) throw new Error("manifest: not enough bytes for u32");
return { value: buf.readUint32BE(off), next: off + 4 };
}
function u8(buf, off) {
if (off >= buf.length) throw new Error("manifest: not enough bytes for u8");
return { value: buf.readUint8(off), next: off + 1 };
}
/**
* Read a length-prefixed UTF-8 string: u32 BE length + UTF-8 bytes.
* Returns { text, next }.
*/
function readStr(buf, off) {
const { value: len, next: afterLen } = u32(buf, off);
if (afterLen + len > buf.length) throw new Error("manifest: string extends beyond input");
return { text: buf.toString("utf-8", afterLen, afterLen + len), next: afterLen + len };
}
/**
* Read raw bytes of given length.
* Returns { bytes, next }.
*/
function readRaw(buf, off, n) {
if (off + n > buf.length) throw new Error(`manifest: not enough bytes for ${n}-byte read`);
return { value: buf.slice(off, off + n), next: off + n };
}
// ── Manifest decoder ────────────────────────────────────────────────────────
/**
* Decode the manifest binary from a Buffer.
*
* Returns a normalized manifest object matching the shape expected
* by validateManifest / selectExport.
*/
export function decodeManifest(buf) {
let off = 0;
// Magic (8 bytes)
const magic = buf.toString("utf-8", 0, 8);
if (magic !== MANIFEST_MAGIC) {
throw new Error(`invalid manifest magic: expected ${MANIFEST_MAGIC}, got "${magic}"`);
}
off = 8;
// Version
const { value: major } = u16(buf, off);
if (major !== MANIFEST_MAJOR) throw new Error(`unsupported manifest major version: ${major}`);
off += 4; // u16 major + u16 minor
// Helper: read length-prefixed text
const readText = () => {
const { text, next } = readStr(buf, off);
off = next;
return text;
};
// Core strings
const schema = readText();
const bundleType = readText();
const treeCalculus = readText();
const treeHashAlgorithm = readText();
const treeHashDomain = readText();
const treeNodePayload = readText();
const runtimeSemantics = readText();
const runtimeEvaluation = readText();
const runtimeAbi = readText();
// Capabilities (u32 count + string[])
const { value: capCount } = u32(buf, off);
off += 4;
const capabilities = [];
for (let i = 0; i < capCount; i++) {
capabilities.push(readText());
}
// Closure (u8)
const { value: closureByte } = u8(buf, off);
off += 1;
const closure = closureByte === CLOSURE_COMPLETE ? "complete" : "partial";
// Roots (u32 count + Root[])
// Root: 32 bytes raw hash + role(string)
const { value: rootCount } = u32(buf, off);
off += 4;
const roots = [];
for (let i = 0; i < rootCount; i++) {
const { value: hashRaw } = readRaw(buf, off, 32);
off += 32;
const { text: role, next: rOff } = readStr(buf, off);
off = rOff;
roots.push({ hash: hashRaw.toString("hex"), role });
}
// Exports (u32 count + Export[])
// Export: name(string) + 32 bytes raw root hash + kind(string) + abi(string)
const { value: exportCount } = u32(buf, off);
off += 4;
const exports = [];
for (let i = 0; i < exportCount; i++) {
const { text: name, next: nOff } = readStr(buf, off);
off = nOff;
const { value: expHashRaw } = readRaw(buf, off, 32);
off += 32;
const { text: kind, next: kOff } = readStr(buf, off);
off = kOff;
const { text: abi, next: aOff } = readStr(buf, off);
off = aOff;
exports.push({ name, root: expHashRaw.toString("hex"), kind, abi });
}
// Metadata (u32 count + TLV[])
// TLV: u16 tag + u32 length + value bytes
const { value: metaCount } = u32(buf, off);
off += 4;
const metadata = {};
for (let i = 0; i < metaCount; i++) {
const { value: tag } = u16(buf, off);
off += 2;
const { value: tlvLen } = u32(buf, off);
off += 4;
const { value: tlvRaw } = readRaw(buf, off, tlvLen);
off += tlvLen;
const val = tlvRaw.toString("utf-8");
switch (tag) {
case TAG_PACKAGE: metadata.package = val; break;
case TAG_VERSION: metadata.version = val; break;
case TAG_DESCRIPTION: metadata.description = val; break;
case TAG_LICENSE: metadata.license = val; break;
case TAG_CREATED_BY: metadata.createdBy = val; break;
}
}
// Extensions (u32 count + TLV[] — skip all)
const { value: extCount } = u32(buf, off);
off += 4;
for (let i = 0; i < extCount; i++) {
const { value: _tag } = u16(buf, off);
off += 2;
const { value: tlvLen } = u32(buf, off);
off += 4;
off += tlvLen; // skip value
}
return {
schema,
bundleType,
tree: {
calculus: treeCalculus,
nodeHash: {
algorithm: treeHashAlgorithm,
domain: treeHashDomain,
},
nodePayload: treeNodePayload,
},
runtime: {
semantics: runtimeSemantics,
evaluation: runtimeEvaluation,
abi: runtimeAbi,
capabilities,
},
closure,
roots,
exports,
metadata: Object.keys(metadata).length > 0 ? metadata : undefined,
};
}
// ── Validation ──────────────────────────────────────────────────────────────
/** /**
* Validate the manifest against the runtime profile requirements. * Validate the manifest against the runtime profile requirements.
* Throws on violation. * Throws on violation.

432
lib/arboricx-common.tri Normal file
View File

@@ -0,0 +1,432 @@
!import "base.tri" !Local
!import "list.tri" !Local
!import "bytes.tri" !Local
!import "binary.tri" !Local
arboricxMagic = [(65) (82) (66) (79) (82) (73) (67) (88)]
arboricxMajorVersion = [(0) (1)]
arboricxMinorVersion = [(0) (0)]
arboricxManifestSectionId = [(0) (0) (0) (1)]
arboricxNodesSectionId = [(0) (0) (0) (2)]
-- Manifest magic and version constants
arboricxManifestMagic = [(65) (82) (66) (77) (78) (70) (83) (84)]
arboricxManifestMajorVersion = [(0) (1)]
arboricxManifestMinorVersion = [(0) (0)]
errMissingSection = 4
errUnsupportedVersion = 5
errDuplicateSection = 6
errDuplicateNode = 7
errInvalidNodePayload = 8
errMissingNode = 9
errInvalidManifestMagic = 10
errUnsupportedManifestVersion = 11
errTrailingManifestBytes = 12
errManifestValidationFailed = 13
nodePayloadLeafTag = 0
nodePayloadStemTag = 1
nodePayloadForkTag = 2
readArboricxMagic = (bs : expectBytes arboricxMagic bs)
readArboricxHeader = (bs :
bindResult (readArboricxMagic bs)
(_ afterMagic :
bindResult (readBytes 2 afterMagic)
(majorVersion afterMajor :
bindResult (readBytes 2 afterMajor)
(minorVersion afterMinor :
bindResult (readBytes 4 afterMinor)
(sectionCount afterSectionCount :
bindResult (readBytes 8 afterSectionCount)
(flags afterFlags :
bindResult (readBytes 8 afterFlags)
(dirOffset afterDirOffset :
ok
(pair majorVersion
(pair minorVersion
(pair sectionCount
(pair flags dirOffset))))
afterDirOffset)))))))
readSectionRecord = (bs :
bindResult (readBytes 4 bs)
(sectionId afterSectionId :
bindResult (readBytes 2 afterSectionId)
(sectionVersion afterSectionVersion :
bindResult (readBytes 2 afterSectionVersion)
(sectionFlags afterSectionFlags :
bindResult (readBytes 2 afterSectionFlags)
(compression afterCompression :
bindResult (readBytes 2 afterCompression)
(digestAlgorithm afterDigestAlgorithm :
bindResult (readBytes 8 afterDigestAlgorithm)
(offset afterOffset :
bindResult (readBytes 8 afterOffset)
(length afterLength :
bindResult (readBytes 32 afterLength)
(digest afterDigest :
ok
(pair sectionId
(pair sectionVersion
(pair sectionFlags
(pair compression
(pair digestAlgorithm
(pair offset
(pair length digest)))))))
afterDigest)))))))))
readSectionDirectory_ = y (self bs sectionCount i acc :
matchBool
(ok (reverse acc) bs)
(bindResult (readSectionRecord bs)
(sectionRecord afterSectionRecord :
self afterSectionRecord sectionCount (succ i) (pair sectionRecord acc)))
(equal? i sectionCount))
readSectionDirectory = (sectionCount bs : readSectionDirectory_ bs sectionCount 0 t)
sectionRecordId = (sectionRecord :
matchPair
(sectionId _ : sectionId)
sectionRecord)
sectionRecordVersion = (sectionRecord :
matchPair
(_ payload :
matchPair
(sectionVersion _ : sectionVersion)
payload)
sectionRecord)
sectionRecordFlags = (sectionRecord :
matchPair
(_ payload :
matchPair
(_ payload2 :
matchPair
(sectionFlags _ : sectionFlags)
payload2)
payload)
sectionRecord)
sectionRecordCompression = (sectionRecord :
matchPair
(_ payload :
matchPair
(_ payload2 :
matchPair
(_ payload3 :
matchPair
(compression _ : compression)
payload3)
payload2)
payload)
sectionRecord)
sectionRecordDigestAlgorithm = (sectionRecord :
matchPair
(_ payload :
matchPair
(_ payload2 :
matchPair
(_ payload3 :
matchPair
(_ payload4 :
matchPair
(digestAlgorithm _ : digestAlgorithm)
payload4)
payload3)
payload2)
payload)
sectionRecord)
sectionRecordOffset = (sectionRecord :
matchPair
(_ payload :
matchPair
(_ payload2 :
matchPair
(_ payload3 :
matchPair
(_ payload4 :
matchPair
(_ payload5 :
matchPair
(offset _ : offset)
payload5)
payload4)
payload3)
payload2)
payload)
sectionRecord)
sectionRecordLength = (sectionRecord :
matchPair
(_ payload :
matchPair
(_ payload2 :
matchPair
(_ payload3 :
matchPair
(_ payload4 :
matchPair
(_ payload5 :
matchPair
(_ payload6 :
matchPair
(length _ : length)
payload6)
payload5)
payload4)
payload3)
payload2)
payload)
sectionRecord)
sectionRecordDigest = (sectionRecord :
matchPair
(_ payload :
matchPair
(_ payload2 :
matchPair
(_ payload3 :
matchPair
(_ payload4 :
matchPair
(_ payload5 :
matchPair
(_ payload6 :
matchPair
(_ digest : digest)
payload6)
payload5)
payload4)
payload3)
payload2)
payload)
sectionRecord)
lookupSectionRecord_ = y (self directory sectionId :
matchList
nothing
(sectionRecord rest :
matchBool
(just sectionRecord)
(self rest sectionId)
(bytesEq? sectionId (sectionRecordId sectionRecord)))
directory)
lookupSectionRecord = (sectionId directory : lookupSectionRecord_ directory sectionId)
sectionDirectoryHasId?_ = y (self directory sectionId :
matchList
false
(sectionRecord rest :
or?
(bytesEq? sectionId (sectionRecordId sectionRecord))
(self rest sectionId))
directory)
sectionDirectoryHasId? = (sectionId directory : sectionDirectoryHasId?_ directory sectionId)
sectionDirectoryHasDuplicateIds? = y (self directory :
matchList
false
(sectionRecord rest :
or?
(sectionDirectoryHasId?_ rest (sectionRecordId sectionRecord))
(self rest))
directory)
validateSectionDirectory = (directory rest :
matchBool
(err errDuplicateSection rest)
(ok directory rest)
(sectionDirectoryHasDuplicateIds? directory))
byteSlice = (offset length bytes : bytesTake length (bytesDrop offset bytes))
natMake = (bit rest :
matchBool
0
(pair bit rest)
(and? (equal? bit 0) (equal? rest 0)))
natAdd = y (self a b :
triage
b
(_ : b)
(aBit aRest :
triage
a
(_ : a)
(bBit bRest :
matchBool
(natMake 0 (succ (self aRest bRest)))
(natMake (matchBool (matchBool 0 1 bBit) (matchBool 1 0 bBit) aBit)
(self aRest bRest))
(and? (equal? aBit 1) (equal? bBit 1)))
b)
a)
natDouble = (n : matchBool 0 (pair 0 n) (equal? n 0))
natTimes256 = (n :
natDouble
(natDouble
(natDouble
(natDouble
(natDouble
(natDouble
(natDouble
(natDouble n))))))))
byteNatShiftAppend_ = y (self byte acc i :
matchBool
acc
(triage
(natMake 0 (self 0 acc (succ i)))
(_ : acc)
(bit rest : natMake bit (self rest acc (succ i)))
byte)
(equal? i 8))
byteNatShiftAppend = (byte acc : byteNatShiftAppend_ byte acc 0)
beBytesToNat = (bytes :
foldl
(acc byte : byteNatShiftAppend byte acc)
0
bytes)
u32BEBytesToNat = beBytesToNat
u64BEBytesToNat = beBytesToNat
arboricxHeaderMajorVersion = (header :
matchPair
(majorVersion _ : majorVersion)
header)
arboricxHeaderMinorVersion = (header :
matchPair
(_ payload :
matchPair
(minorVersion _ : minorVersion)
payload)
header)
arboricxHeaderSectionCount = (header :
matchPair
(_ payload :
matchPair
(_ payload2 :
matchPair
(sectionCount _ : sectionCount)
payload2)
payload)
header)
arboricxHeaderFlags = (header :
matchPair
(_ payload :
matchPair
(_ payload2 :
matchPair
(_ payload3 :
matchPair
(flags _ : flags)
payload3)
payload2)
payload)
header)
arboricxHeaderDirOffset = (header :
matchPair
(_ payload :
matchPair
(_ payload2 :
matchPair
(_ payload3 :
matchPair
(_ dirOffset : dirOffset)
payload3)
payload2)
payload)
header)
validateArboricxHeader = (header rest :
matchBool
(ok header rest)
(err errUnsupportedVersion rest)
(and?
(bytesEq? arboricxMajorVersion (arboricxHeaderMajorVersion header))
(bytesEq? arboricxMinorVersion (arboricxHeaderMinorVersion header))))
readArboricxContainer = (bs :
bindResult (readArboricxHeader bs)
(header afterHeader :
bindResult (validateArboricxHeader header afterHeader)
(validHeader afterValidHeader :
bindResult (readSectionDirectory
(u32BEBytesToNat (arboricxHeaderSectionCount validHeader))
(bytesDrop (u64BEBytesToNat (arboricxHeaderDirOffset validHeader)) bs))
(directory afterDirectory :
bindResult (validateSectionDirectory directory afterDirectory)
(validDirectory afterValidDirectory :
ok (pair validHeader validDirectory) afterValidDirectory)))))
sectionRecordOffsetNat = (sectionRecord :
u64BEBytesToNat (sectionRecordOffset sectionRecord))
sectionRecordLengthNat = (sectionRecord :
u64BEBytesToNat (sectionRecordLength sectionRecord))
extractSectionBytes = (sectionRecord containerBytes :
byteSlice
(sectionRecordOffsetNat sectionRecord)
(sectionRecordLengthNat sectionRecord)
containerBytes)
extractSectionBytesResult = (sectionRecord containerBytes rest :
(sectionBytes :
matchBool
(ok sectionBytes rest)
(err errUnexpectedEof rest)
(equal? (bytesLength sectionBytes) (sectionRecordLengthNat sectionRecord)))
(extractSectionBytes sectionRecord containerBytes))
lookupSectionBytes = (sectionId directory containerBytes :
triage
nothing
(sectionRecord : just (extractSectionBytes sectionRecord containerBytes))
(_ _ : nothing)
(lookupSectionRecord sectionId directory))
sectionBytesOrErr = (sectionId directory containerBytes rest :
triage
(err errMissingSection rest)
(sectionRecord : extractSectionBytesResult sectionRecord containerBytes rest)
(_ _ : err errMissingSection rest)
(lookupSectionRecord sectionId directory))
readArboricxSectionBytes = (sectionId bs :
bindResult (readArboricxContainer bs)
(container afterContainer :
matchPair
(_ directory : sectionBytesOrErr sectionId directory bs afterContainer)
container))
readArboricxRequiredSections = (bs :
bindResult (readArboricxContainer bs)
(container afterContainer :
matchPair
(_ directory :
bindResult (sectionBytesOrErr arboricxManifestSectionId directory bs afterContainer)
(manifestBytes _ :
bindResult (sectionBytesOrErr arboricxNodesSectionId directory bs afterContainer)
(nodesBytes _ :
ok (pair manifestBytes nodesBytes) afterContainer)))
container))

343
lib/arboricx-manifest.tri Normal file
View File

@@ -0,0 +1,343 @@
!import "arboricx-nodes.tri" !Local
readManifestMagic = (bs :
expectBytes arboricxManifestMagic bs)
-- Read a u32 BE length, then that many raw bytes.
-- Returns the payload bytes and remaining input.
readLengthPrefixedString = (bs :
bindResult (readBytes 4 bs)
(lengthBytes afterLengthBytes :
bindResult (readBytes (u32BEBytesToNat lengthBytes) afterLengthBytes)
(payload afterPayload :
ok payload afterPayload)))
-- Helper: read a single capability string (length-prefixed string)
readCapability = (bs :
readLengthPrefixedString bs)
-- Helper worker: read N capability strings (counts up from 0)
readCapabilities_ = y (self bs count i acc :
matchBool
(ok (reverse acc) bs)
(bindResult (readCapability bs)
(cap afterCap :
self afterCap count (succ i) (pair cap acc)))
(equal? i count))
-- Helper: read N capabilities
readCapabilities = (count bs :
readCapabilities_ bs count 0 t)
-- Helper: read a single root entry (32-byte raw hash + length-prefixed role)
readRootEntry = (bs :
bindResult (readBytes 32 bs)
(hashRaw afterHash :
bindResult (readLengthPrefixedString afterHash)
(role afterRole :
ok (pair hashRaw role) afterRole)))
-- Helper worker: read N root entries (counts up from 0)
readRoots_ = y (self bs count i acc :
matchBool
(ok (reverse acc) bs)
(bindResult (readRootEntry bs)
(root afterRoot :
self afterRoot count (succ i) (pair root acc)))
(equal? i count))
-- Helper: read N roots
readRoots = (count bs :
readRoots_ bs count 0 t)
-- Helper: read a single export entry
readExportEntry = (bs :
bindResult (readLengthPrefixedString bs)
(name afterName :
bindResult (readBytes 32 afterName)
(rootHashRaw afterRootHash :
bindResult (readLengthPrefixedString afterRootHash)
(kind afterKind :
bindResult (readLengthPrefixedString afterKind)
(abi afterAbi :
ok (pair name (pair rootHashRaw (pair kind abi))) afterAbi)))))
-- Helper worker: read N export entries (counts up from 0)
readExports_ = y (self bs count i acc :
matchBool
(ok (reverse acc) bs)
(bindResult (readExportEntry bs)
(exp afterExp :
self afterExp count (succ i) (pair exp acc)))
(equal? i count))
-- Helper: read N exports
readExports = (count bs :
readExports_ bs count 0 t)
-- Main core manifest parser.
-- Reads: magic, version, core strings, capabilities, closure, roots, exports.
readManifestCore = (bs :
bindResult (readManifestMagic bs)
(_ afterMagic :
bindResult (readBytes 2 afterMagic)
(majorVersion afterMajor :
bindResult (readBytes 2 afterMajor)
(minorVersion afterMinor :
bindResult (readLengthPrefixedString afterMinor)
(schema afterSchema :
bindResult (readLengthPrefixedString afterSchema)
(bundleType afterBundleType :
bindResult (readLengthPrefixedString afterBundleType)
(treeCalculus afterTreeCalculus :
bindResult (readLengthPrefixedString afterTreeCalculus)
(treeHashAlgorithm afterTreeHashAlgorithm :
bindResult (readLengthPrefixedString afterTreeHashAlgorithm)
(treeHashDomain afterTreeHashDomain :
bindResult (readLengthPrefixedString afterTreeHashDomain)
(treeNodePayload afterTreeNodePayload :
bindResult (readLengthPrefixedString afterTreeNodePayload)
(runtimeSemantics afterRuntimeSemantics :
bindResult (readLengthPrefixedString afterRuntimeSemantics)
(runtimeEvaluation afterRuntimeEvaluation :
bindResult (readLengthPrefixedString afterRuntimeEvaluation)
(runtimeAbi afterRuntimeAbi :
bindResult (readBytes 4 afterRuntimeAbi)
(capCountRaw afterCapCountRaw :
bindResult (readCapabilities (u32BEBytesToNat capCountRaw) afterCapCountRaw)
(capabilities afterCapabilities :
bindResult (readBytes 1 afterCapabilities)
(closureByte afterClosureByte :
bindResult (readBytes 4 afterClosureByte)
(rootCountRaw afterRootCountRaw :
bindResult (readRoots (u32BEBytesToNat rootCountRaw) afterRootCountRaw)
(roots afterRoots :
bindResult (readBytes 4 afterRoots)
(exportCountRaw afterExportCountRaw :
bindResult (readExports (u32BEBytesToNat exportCountRaw) afterExportCountRaw)
(exports afterExports :
ok
(pair schema
(pair bundleType
(pair treeCalculus
(pair treeHashAlgorithm
(pair treeHashDomain
(pair treeNodePayload
(pair runtimeSemantics
(pair runtimeEvaluation
(pair runtimeAbi
(pair capabilities
(pair closureByte (pair roots exports)))))))))))) afterExports))))))))))))))))))))
-- Metadata tag constants (u16 values)
tagPackage = [(0) (1)]
tagVersion = [(0) (2)]
tagDescription = [(0) (3)]
tagLicense = [(0) (4)]
tagCreatedBy = [(0) (5)]
-- Read a single TLV entry: u16 tag + u32 length + value bytes.
-- Returns the pair (tag, value) and remaining input.
readTLV = (bs :
bindResult (readBytes 2 bs)
(tag afterTag :
bindResult (readBytes 4 afterTag)
(tlvLenRaw afterTlvLenRaw :
bindResult (readBytes (u32BEBytesToNat tlvLenRaw) afterTlvLenRaw)
(tlvValue afterTlvValue :
ok (pair tag tlvValue) afterTlvValue))))
-- Worker: read N TLV entries (counts up from 0)
readTLVs_ = y (self bs count i acc :
matchBool
(ok (reverse acc) bs)
(bindResult (readTLV bs)
(tlv afterTlv :
self afterTlv count (succ i) (pair tlv acc)))
(equal? i count))
-- Read a count followed by that many TLV entries.
readTLVList = (count bs :
readTLVs_ bs count 0 t)
-- Skip N extension TLV entries (counts up from 0)
skipTLVs_ = y (self bs count i :
matchBool
(ok unit bs)
(bindResult (readTLV bs)
(_ afterTlv :
self afterTlv count (succ i)))
(equal? i count))
-- Full manifest parser: core fields + metadata TLV list + extension TLV list.
readManifest = (bs :
bindResult (readManifestCore bs)
(coreManifest afterCore :
bindResult (readBytes 4 afterCore)
(metaCountRaw afterMetaCountRaw :
bindResult (readTLVList (u32BEBytesToNat metaCountRaw) afterMetaCountRaw)
(metadataFields afterMetadataFields :
bindResult (readBytes 4 afterMetadataFields)
(extCountRaw afterExtCountRaw :
bindResult (skipTLVs_ afterExtCountRaw (u32BEBytesToNat extCountRaw) 0)
(afterExtensions _ :
ok
(pair coreManifest (pair metadataFields afterExtensions))
afterExtensions))))))
-- Lookup a metadata value by tag from a TLV list.
-- Returns nothing if not found, just value if found.
lookupMetadata_ = y (self tlvs tag :
matchList
nothing
(tlv rest :
matchBool
(just (matchPair (_ value : value) tlv))
(self rest tag)
(bytesEq? (matchPair (tlvTag _ : tlvTag) tlv) tag))
tlvs)
lookupMetadata = (tlvs tag :
lookupMetadata_ tlvs tag)
-- Get export name from an export entry (pair name (pair rootHash (pair kind abi)))
exportName = (exp :
matchPair
(name _ : name)
exp)
exportRoot = (exp :
matchPair
(_ payload :
matchPair
(root _ : root)
payload)
exp)
-- Check if an export name matches a given byte string.
exportNameEq? = (nameBytes exp :
bytesEq? nameBytes (exportName exp))
-- Find first export matching a name, or nothing.
findExportByName_ = y (self exports name :
matchList
nothing
(exp rest :
matchBool
(just exp)
(self rest name)
(exportNameEq? name exp))
exports)
findExportByName = (exports name :
findExportByName_ exports name)
-- Get list of all export names from a list of exports.
getExportNames_ = y (self acc exports :
matchList
(reverse acc)
(exp rest :
self (pair (exportName exp) acc) rest)
exports)
getExportNames = (exports :
getExportNames_ t exports)
mainExportName = "main"
maybeExportToResult = (maybeExport :
triage
(err errMissingSection t)
(export : ok export t)
(_ _ : err errMissingSection t)
maybeExport)
selectSingleExport = (exports :
matchList
(err errMissingSection t)
(export rest :
matchBool
(ok export t)
(err errMissingSection t)
(emptyList? rest))
exports)
selectDefaultExport = (exports :
triage
(selectSingleExport exports)
(export : ok export t)
(_ _ : err errMissingSection t)
(findExportByName exports mainExportName))
-- Select an export: explicit name if provided, otherwise "main", otherwise
-- the sole export if the bundle has exactly one export.
selectExport = (exports nameBytes :
matchBool
(selectDefaultExport exports)
(maybeExportToResult (findExportByName exports nameBytes))
(emptyList? nameBytes))
selectExportOpt = (exports optNameBytes :
selectExport exports optNameBytes)
-- Expected core string values (raw UTF-8 bytes, not decoded to Unicode characters).
expectedSchema = "arboricx.bundle.manifest.v1"
expectedBundleType = "tree-calculus-executable-object"
expectedTreeCalculus = "tree-calculus.v1"
expectedTreeHashAlgorithm = "sha256"
expectedTreeHashDomain = "arboricx.merkle.node.v1"
expectedTreeNodePayload = "arboricx.merkle.payload.v1"
expectedRuntimeSemantics = "tree-calculus.v1"
expectedRuntimeEvaluation = "normal-order"
expectedRuntimeAbi = "arboricx.abi.tree.v1"
-- Manifest core field accessors.
-- readManifestCore returns: (pair schema (pair bundleType (... (pair closureByte (pair roots exports)))))
pairFirst = (p : matchPair (a _ : a) p)
pairSecond = (p : matchPair (_ b : b) p)
manifestSchema = (core : pairFirst core)
manifestBundleType = (core : pairFirst (pairSecond core))
manifestTreeCalculus = (core : pairFirst (pairSecond (pairSecond core)))
manifestTreeHashAlgorithm = (core : pairFirst (pairSecond (pairSecond (pairSecond core))))
manifestTreeHashDomain = (core : pairFirst (pairSecond (pairSecond (pairSecond (pairSecond core)))))
manifestTreeNodePayload = (core : pairFirst (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond core))))))
manifestRuntimeSemantics = (core : pairFirst (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond core)))))))
manifestRuntimeEvaluation = (core : pairFirst (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond core))))))))
manifestRuntimeAbi = (core : pairFirst (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond core)))))))))
manifestCapabilities = (core : pairFirst (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond core))))))))))
manifestClosureByte = (core : pairFirst (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond core)))))))))))
manifestRoots = (core : pairFirst (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond core))))))))))))
manifestExports = (core : pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond (pairSecond core))))))))))))
-- Helper: compare a manifest field against an expected byte string.
manifestFieldMatch? = (actual expected : bytesEq? actual expected)
-- Validate core manifest fields against expected values.
validateManifestCore = (core rest :
matchBool
(ok core rest)
(err errManifestValidationFailed rest)
(and?
(manifestFieldMatch? (manifestSchema core) expectedSchema)
(and?
(manifestFieldMatch? (manifestBundleType core) expectedBundleType)
(and?
(manifestFieldMatch? (manifestTreeCalculus core) expectedTreeCalculus)
(and?
(manifestFieldMatch? (manifestTreeHashAlgorithm core) expectedTreeHashAlgorithm)
(and?
(manifestFieldMatch? (manifestTreeHashDomain core) expectedTreeHashDomain)
(and?
(manifestFieldMatch? (manifestTreeNodePayload core) expectedTreeNodePayload)
(and?
(manifestFieldMatch? (manifestRuntimeSemantics core) expectedRuntimeSemantics)
(and?
(manifestFieldMatch? (manifestRuntimeEvaluation core) expectedRuntimeEvaluation)
(and?
(manifestFieldMatch? (manifestRuntimeAbi core) expectedRuntimeAbi)
(and?
(bytesEq? (manifestClosureByte core) [(0)])
(and?
(not? (emptyList? (manifestRoots core)))
(not? (emptyList? (manifestExports core)))))))))))))))

232
lib/arboricx-nodes.tri Normal file
View File

@@ -0,0 +1,232 @@
!import "arboricx-common.tri" !Local
readNodeRecord = (bs :
bindResult (readBytes 32 bs)
(nodeHash afterNodeHash :
bindResult (readBytes 4 afterNodeHash)
(payloadLength afterPayloadLength :
bindResult (readBytes (u32BEBytesToNat payloadLength) afterPayloadLength)
(payload afterPayload :
ok
(pair nodeHash
(pair payloadLength payload))
afterPayload))))
nodeRecordHash = (nodeRecord :
matchPair
(nodeHash _ : nodeHash)
nodeRecord)
nodeRecordPayloadLength = (nodeRecord :
matchPair
(_ payload :
matchPair
(payloadLength _ : payloadLength)
payload)
nodeRecord)
nodeRecordPayload = (nodeRecord :
matchPair
(_ payload :
matchPair
(_ nodePayload : nodePayload)
payload)
nodeRecord)
nodePayloadKind = (nodePayload : bytesHead nodePayload)
nodePayloadHasTag? = (tag nodePayload :
triage
false
(actualTag : byteEq? actualTag tag)
(_ _ : false)
(nodePayloadKind nodePayload))
nodePayloadLeaf? = (nodePayload : bytesEq? [(0)] nodePayload)
nodePayloadStem? = (nodePayload :
and?
(nodePayloadHasTag? nodePayloadStemTag nodePayload)
(equal? (bytesLength nodePayload) 33))
nodePayloadFork? = (nodePayload :
and?
(nodePayloadHasTag? nodePayloadForkTag nodePayload)
(equal? (bytesLength nodePayload) 65))
nodePayloadValid? = (nodePayload :
or?
(nodePayloadLeaf? nodePayload)
(or?
(nodePayloadStem? nodePayload)
(nodePayloadFork? nodePayload)))
nodePayloadStemChildHash = (nodePayload : bytesTake 32 (bytesDrop 1 nodePayload))
nodePayloadForkLeftHash = (nodePayload : bytesTake 32 (bytesDrop 1 nodePayload))
nodePayloadForkRightHash = (nodePayload : bytesTake 32 (bytesDrop 33 nodePayload))
nodeRecordPayloadValid? = (nodeRecord : nodePayloadValid? (nodeRecordPayload nodeRecord))
nodeRecordsHaveInvalidPayload? = y (self nodeRecords :
matchList
false
(nodeRecord rest :
or?
(not? (nodeRecordPayloadValid? nodeRecord))
(self rest))
nodeRecords)
nodeRecordsHaveHash? = y (self nodeRecords nodeHash :
matchList
false
(nodeRecord rest :
or?
(bytesEq? nodeHash (nodeRecordHash nodeRecord))
(self rest nodeHash))
nodeRecords)
nodeRecordsHaveDuplicateHashes? = y (self nodeRecords :
matchList
false
(nodeRecord rest :
or?
(nodeRecordsHaveHash? rest (nodeRecordHash nodeRecord))
(self rest))
nodeRecords)
lookupNodeRecord_ = y (self nodeRecords nodeHash :
matchList
nothing
(nodeRecord rest :
matchBool
(just nodeRecord)
(self rest nodeHash)
(bytesEq? nodeHash (nodeRecordHash nodeRecord)))
nodeRecords)
lookupNodeRecord = (nodeHash nodeRecords : lookupNodeRecord_ nodeRecords nodeHash)
nodeRecordChildHashes = (nodeRecord :
(nodePayload :
matchBool
t
(matchBool
(pair (nodePayloadStemChildHash nodePayload) t)
(pair (nodePayloadForkLeftHash nodePayload)
(pair (nodePayloadForkRightHash nodePayload) t))
(nodePayloadStem? nodePayload))
(nodePayloadLeaf? nodePayload))
(nodeRecordPayload nodeRecord))
nodeHashPresent? = (nodeHash nodeRecords : nodeRecordsHaveHash? nodeRecords nodeHash)
nodeChildHashesPresent? = y (self childHashes nodeRecords :
matchList
true
(childHash rest :
and?
(nodeHashPresent? childHash nodeRecords)
(self rest nodeRecords))
childHashes)
nodeRecordChildrenPresent? = (nodeRecord nodeRecords :
nodeChildHashesPresent? (nodeRecordChildHashes nodeRecord) nodeRecords)
nodeRecordsClosed? = y (self nodeRecords allNodeRecords :
matchList
true
(nodeRecord rest :
and?
(nodeRecordChildrenPresent? nodeRecord allNodeRecords)
(self rest allNodeRecords))
nodeRecords)
validateNodeRecords = (nodeRecords rest :
matchBool
(err errInvalidNodePayload rest)
(matchBool
(err errDuplicateNode rest)
(matchBool
(ok nodeRecords rest)
(err errMissingNode rest)
(nodeRecordsClosed? nodeRecords nodeRecords))
(nodeRecordsHaveDuplicateHashes? nodeRecords))
(nodeRecordsHaveInvalidPayload? nodeRecords))
readNodeRecords_ = y (self bs nodeCount i acc :
matchBool
(ok (reverse acc) bs)
(bindResult (readNodeRecord bs)
(nodeRecord afterNodeRecord :
self afterNodeRecord nodeCount (succ i) (pair nodeRecord acc)))
(equal? i nodeCount))
readNodeRecords = (nodeCount bs : readNodeRecords_ bs nodeCount 0 t)
readNodesSection = (bs :
bindResult (readBytes 8 bs)
(nodeCount afterNodeCount :
bindResult (readNodeRecords (u64BEBytesToNat nodeCount) afterNodeCount)
(nodeRecords afterNodeRecords :
bindResult (validateNodeRecords nodeRecords afterNodeRecords)
(validNodeRecords afterValidNodeRecords :
ok (pair nodeCount validNodeRecords) afterValidNodeRecords))))
readNodesSectionComplete = (bs :
bindResult (readNodesSection bs)
(nodesSection afterNodesSection :
matchBool
(ok nodesSection afterNodesSection)
(err errUnexpectedBytes afterNodesSection)
(bytesNil? afterNodesSection)))
readArboricxNodesSection = (bs :
bindResult (readArboricxContainer bs)
(container afterContainer :
matchPair
(_ directory :
bindResult (sectionBytesOrErr arboricxNodesSectionId directory bs afterContainer)
(nodesBytes _ :
bindResult (readNodesSectionComplete nodesBytes)
(nodesSection _ : ok nodesSection afterContainer)))
container))
nodesSectionCount = (nodesSection :
matchPair
(nodeCount _ : nodeCount)
nodesSection)
nodesSectionRecords = (nodesSection :
matchPair
(_ nodeRecords : nodeRecords)
nodesSection)
nodeRecordToTreeWith = (self nodeRecords nodeRecord :
(nodePayload :
matchBool
(ok t t)
(matchBool
(bindResult (self (nodePayloadStemChildHash nodePayload) nodeRecords)
(child _ : ok (t child) t))
(bindResult (self (nodePayloadForkLeftHash nodePayload) nodeRecords)
(left _ :
bindResult (self (nodePayloadForkRightHash nodePayload) nodeRecords)
(right _ : ok (pair left right) t)))
(nodePayloadStem? nodePayload))
(nodePayloadLeaf? nodePayload))
(nodeRecordPayload nodeRecord))
nodeHashToTree = y (self nodeHash nodeRecords :
triage
(err errMissingNode t)
(nodeRecord : nodeRecordToTreeWith self nodeRecords nodeRecord)
(_ _ : err errMissingNode t)
(lookupNodeRecord nodeHash nodeRecords))
readArboricxTreeFromHash = (rootHash bs :
bindResult (readArboricxNodesSection bs)
(nodesSection afterContainer :
bindResult (nodeHashToTree rootHash (nodesSectionRecords nodesSection))
(tree _ : ok tree afterContainer)))
readArboricxExecutableFromHash = readArboricxTreeFromHash

View File

@@ -1,654 +1,136 @@
!import "base.tri" !Local !import "arboricx-manifest.tri" !Local
!import "list.tri" !Local
!import "bytes.tri" !Local
!import "binary.tri" !Local
arboricxMagic = [(65) (82) (66) (79) (82) (73) (67) (88)] -- Read and validate a full Arboricx bundle.
arboricxMajorVersion = [(0) (1)] -- Returns (pair validManifest afterContainer).
arboricxMinorVersion = [(0) (0)] -- The manifest core fields are validated against expected values.
arboricxManifestSectionId = [(0) (0) (0) (1)] readArboricxBundle = (bs :
arboricxNodesSectionId = [(0) (0) (0) (2)] bindResult (readArboricxRequiredSections bs)
(sections afterContainer :
errMissingSection = 4
errUnsupportedVersion = 5
errDuplicateSection = 6
errDuplicateNode = 7
errInvalidNodePayload = 8
errMissingNode = 9
nodePayloadLeafTag = 0
nodePayloadStemTag = 1
nodePayloadForkTag = 2
readArboricxMagic = (bs : expectBytes arboricxMagic bs)
readArboricxHeader = (bs :
bindResult (readArboricxMagic bs)
(_ afterMagic :
bindResult (readBytes 2 afterMagic)
(majorVersion afterMajor :
bindResult (readBytes 2 afterMajor)
(minorVersion afterMinor :
bindResult (readBytes 4 afterMinor)
(sectionCount afterSectionCount :
bindResult (readBytes 8 afterSectionCount)
(flags afterFlags :
bindResult (readBytes 8 afterFlags)
(dirOffset afterDirOffset :
ok
(pair majorVersion
(pair minorVersion
(pair sectionCount
(pair flags dirOffset))))
afterDirOffset)))))))
readSectionRecord = (bs :
bindResult (readBytes 4 bs)
(sectionId afterSectionId :
bindResult (readBytes 2 afterSectionId)
(sectionVersion afterSectionVersion :
bindResult (readBytes 2 afterSectionVersion)
(sectionFlags afterSectionFlags :
bindResult (readBytes 2 afterSectionFlags)
(compression afterCompression :
bindResult (readBytes 2 afterCompression)
(digestAlgorithm afterDigestAlgorithm :
bindResult (readBytes 8 afterDigestAlgorithm)
(offset afterOffset :
bindResult (readBytes 8 afterOffset)
(length afterLength :
bindResult (readBytes 32 afterLength)
(digest afterDigest :
ok
(pair sectionId
(pair sectionVersion
(pair sectionFlags
(pair compression
(pair digestAlgorithm
(pair offset
(pair length digest)))))))
afterDigest)))))))))
readSectionDirectory_ = y (self bs sectionCount i acc :
matchBool
(ok (reverse acc) bs)
(bindResult (readSectionRecord bs)
(sectionRecord afterSectionRecord :
self afterSectionRecord sectionCount (succ i) (pair sectionRecord acc)))
(equal? i sectionCount))
readSectionDirectory = (sectionCount bs : readSectionDirectory_ bs sectionCount 0 t)
sectionRecordId = (sectionRecord :
matchPair
(sectionId _ : sectionId)
sectionRecord)
sectionRecordVersion = (sectionRecord :
matchPair
(_ payload :
matchPair matchPair
(sectionVersion _ : sectionVersion) (manifestBytes _ :
payload) bindResult (readManifest manifestBytes)
sectionRecord) (parsedManifest afterManifest :
sectionRecordFlags = (sectionRecord :
matchPair
(_ payload :
matchPair
(_ payload2 :
matchPair
(sectionFlags _ : sectionFlags)
payload2)
payload)
sectionRecord)
sectionRecordCompression = (sectionRecord :
matchPair
(_ payload :
matchPair
(_ payload2 :
matchPair
(_ payload3 :
matchPair matchPair
(compression _ : compression) (coreManifest metadataWithExtensions :
payload3) bindResult (validateManifestCore coreManifest afterManifest)
payload2) (validCore _ : ok (pair validCore metadataWithExtensions) afterContainer))
payload) parsedManifest))
sectionRecord) sections))
sectionRecordDigestAlgorithm = (sectionRecord : -- Select an export from a validated bundle and reconstruct its root tree.
matchPair -- Returns ok executable afterContainer, or propagates parse/selection/node errors.
(_ payload : readArboricxExecutableByName = (nameBytes bs :
bindResult (readArboricxBundle bs)
(bundleResult afterBundle :
matchPair matchPair
(_ payload2 : (validCore _ :
matchPair bindResult (selectExport (manifestExports validCore) nameBytes)
(_ payload3 : (selectedExport _ :
matchPair readArboricxTreeFromHash (exportRoot selectedExport) bs))
(_ payload4 : bundleResult))
matchPair
(digestAlgorithm _ : digestAlgorithm)
payload4)
payload3)
payload2)
payload)
sectionRecord)
sectionRecordOffset = (sectionRecord : readArboricxExecutable = (bs :
matchPair readArboricxExecutableByName [] bs)
(_ payload :
matchPair
(_ payload2 :
matchPair
(_ payload3 :
matchPair
(_ payload4 :
matchPair
(_ payload5 :
matchPair
(offset _ : offset)
payload5)
payload4)
payload3)
payload2)
payload)
sectionRecord)
sectionRecordLength = (sectionRecord : applyArgs = (f args :
matchPair
(_ payload :
matchPair
(_ payload2 :
matchPair
(_ payload3 :
matchPair
(_ payload4 :
matchPair
(_ payload5 :
matchPair
(_ payload6 :
matchPair
(length _ : length)
payload6)
payload5)
payload4)
payload3)
payload2)
payload)
sectionRecord)
sectionRecordDigest = (sectionRecord :
matchPair
(_ payload :
matchPair
(_ payload2 :
matchPair
(_ payload3 :
matchPair
(_ payload4 :
matchPair
(_ payload5 :
matchPair
(_ payload6 :
matchPair
(_ digest : digest)
payload6)
payload5)
payload4)
payload3)
payload2)
payload)
sectionRecord)
lookupSectionRecord_ = y (self directory sectionId :
matchList
nothing
(sectionRecord rest :
matchBool
(just sectionRecord)
(self rest sectionId)
(bytesEq? sectionId (sectionRecordId sectionRecord)))
directory)
lookupSectionRecord = (sectionId directory : lookupSectionRecord_ directory sectionId)
sectionDirectoryHasId?_ = y (self directory sectionId :
matchList
false
(sectionRecord rest :
or?
(bytesEq? sectionId (sectionRecordId sectionRecord))
(self rest sectionId))
directory)
sectionDirectoryHasId? = (sectionId directory : sectionDirectoryHasId?_ directory sectionId)
sectionDirectoryHasDuplicateIds? = y (self directory :
matchList
false
(sectionRecord rest :
or?
(sectionDirectoryHasId?_ rest (sectionRecordId sectionRecord))
(self rest))
directory)
validateSectionDirectory = (directory rest :
matchBool
(err errDuplicateSection rest)
(ok directory rest)
(sectionDirectoryHasDuplicateIds? directory))
byteSlice = (offset length bytes : bytesTake length (bytesDrop offset bytes))
natMake = (bit rest :
matchBool
0
(pair bit rest)
(and? (equal? bit 0) (equal? rest 0)))
natAdd = y (self a b :
triage
b
(_ : b)
(aBit aRest :
triage
a
(_ : a)
(bBit bRest :
matchBool
(natMake 0 (succ (self aRest bRest)))
(natMake (matchBool (matchBool 0 1 bBit) (matchBool 1 0 bBit) aBit)
(self aRest bRest))
(and? (equal? aBit 1) (equal? bBit 1)))
b)
a)
natDouble = (n : matchBool 0 (pair 0 n) (equal? n 0))
natTimes256 = (n :
natDouble
(natDouble
(natDouble
(natDouble
(natDouble
(natDouble
(natDouble
(natDouble n))))))))
byteNatShiftAppend_ = y (self byte acc i :
matchBool
acc
(triage
(natMake 0 (self 0 acc (succ i)))
(_ : acc)
(bit rest : natMake bit (self rest acc (succ i)))
byte)
(equal? i 8))
byteNatShiftAppend = (byte acc : byteNatShiftAppend_ byte acc 0)
beBytesToNat = (bytes :
foldl foldl
(acc byte : byteNatShiftAppend byte acc) (acc arg : acc arg)
0 f
bytes) args)
u32BEBytesToNat = beBytesToNat runArboricxByName = (nameBytes bs arg :
u64BEBytesToNat = beBytesToNat bindResult (readArboricxExecutableByName nameBytes bs)
(executable rest : ok (executable arg) rest))
arboricxHeaderMajorVersion = (header : runArboricx = (bs arg :
matchPair runArboricxByName [] bs arg)
(majorVersion _ : majorVersion)
header)
arboricxHeaderMinorVersion = (header : runArboricxArgsByName = (nameBytes bs args :
matchPair bindResult (readArboricxExecutableByName nameBytes bs)
(_ payload : (executable rest : ok (applyArgs executable args) rest))
matchPair
(minorVersion _ : minorVersion)
payload)
header)
arboricxHeaderSectionCount = (header : runArboricxArgs = (bs args :
matchPair runArboricxArgsByName [] bs args)
(_ payload :
matchPair
(_ payload2 :
matchPair
(sectionCount _ : sectionCount)
payload2)
payload)
header)
arboricxHeaderFlags = (header : errHostCodecFailed = 14
matchPair
(_ payload :
matchPair
(_ payload2 :
matchPair
(_ payload3 :
matchPair
(flags _ : flags)
payload3)
payload2)
payload)
header)
arboricxHeaderDirOffset = (header : hostTreeTag = 0
matchPair hostStringTag = 1
(_ payload : hostNumberTag = 2
matchPair hostBoolTag = 3
(_ payload2 : hostListTag = 4
matchPair hostBytesTag = 5
(_ payload3 :
matchPair
(_ dirOffset : dirOffset)
payload3)
payload2)
payload)
header)
validateArboricxHeader = (header rest : hostTree = (value : pair hostTreeTag value)
matchBool hostString = (bytes : pair hostStringTag bytes)
(ok header rest) hostNumber = (n : pair hostNumberTag n)
(err errUnsupportedVersion rest) hostBool = (b : pair hostBoolTag b)
(and? hostList = (xs : pair hostListTag xs)
(bytesEq? arboricxMajorVersion (arboricxHeaderMajorVersion header)) hostBytes = (bytes : pair hostBytesTag bytes)
(bytesEq? arboricxMinorVersion (arboricxHeaderMinorVersion header))))
readArboricxContainer = (bs : hostValueTag = (hostValue : pairFirst hostValue)
bindResult (readArboricxHeader bs) hostValuePayload = (hostValue : pairSecond hostValue)
(header afterHeader :
bindResult (validateArboricxHeader header afterHeader)
(validHeader afterValidHeader :
bindResult (readSectionDirectory
(u32BEBytesToNat (arboricxHeaderSectionCount validHeader))
(bytesDrop (u64BEBytesToNat (arboricxHeaderDirOffset validHeader)) bs))
(directory afterDirectory :
bindResult (validateSectionDirectory directory afterDirectory)
(validDirectory afterValidDirectory :
ok (pair validHeader validDirectory) afterValidDirectory)))))
sectionRecordOffsetNat = (sectionRecord : hostBool? = (value : or? (equal? value false) (equal? value true))
u64BEBytesToNat (sectionRecordOffset sectionRecord))
sectionRecordLengthNat = (sectionRecord : hostNumber? = y (self value :
u64BEBytesToNat (sectionRecordLength sectionRecord))
extractSectionBytes = (sectionRecord containerBytes :
byteSlice
(sectionRecordOffsetNat sectionRecord)
(sectionRecordLengthNat sectionRecord)
containerBytes)
extractSectionBytesResult = (sectionRecord containerBytes rest :
(sectionBytes :
matchBool
(ok sectionBytes rest)
(err errUnexpectedEof rest)
(equal? (bytesLength sectionBytes) (sectionRecordLengthNat sectionRecord)))
(extractSectionBytes sectionRecord containerBytes))
lookupSectionBytes = (sectionId directory containerBytes :
triage triage
nothing true
(sectionRecord : just (extractSectionBytes sectionRecord containerBytes)) (_ : false)
(_ _ : nothing) (bit rest :
(lookupSectionRecord sectionId directory)) and?
(or? (equal? bit false) (equal? bit true))
sectionBytesOrErr = (sectionId directory containerBytes rest :
triage
(err errMissingSection rest)
(sectionRecord : extractSectionBytesResult sectionRecord containerBytes rest)
(_ _ : err errMissingSection rest)
(lookupSectionRecord sectionId directory))
readArboricxSectionBytes = (sectionId bs :
bindResult (readArboricxContainer bs)
(container afterContainer :
matchPair
(_ directory : sectionBytesOrErr sectionId directory bs afterContainer)
container))
readArboricxRequiredSections = (bs :
bindResult (readArboricxContainer bs)
(container afterContainer :
matchPair
(_ directory :
bindResult (sectionBytesOrErr arboricxManifestSectionId directory bs afterContainer)
(manifestBytes _ :
bindResult (sectionBytesOrErr arboricxNodesSectionId directory bs afterContainer)
(nodesBytes _ :
ok (pair manifestBytes nodesBytes) afterContainer)))
container))
readNodeRecord = (bs :
bindResult (readBytes 32 bs)
(nodeHash afterNodeHash :
bindResult (readBytes 4 afterNodeHash)
(payloadLength afterPayloadLength :
bindResult (readBytes (u32BEBytesToNat payloadLength) afterPayloadLength)
(payload afterPayload :
ok
(pair nodeHash
(pair payloadLength payload))
afterPayload))))
nodeRecordHash = (nodeRecord :
matchPair
(nodeHash _ : nodeHash)
nodeRecord)
nodeRecordPayloadLength = (nodeRecord :
matchPair
(_ payload :
matchPair
(payloadLength _ : payloadLength)
payload)
nodeRecord)
nodeRecordPayload = (nodeRecord :
matchPair
(_ payload :
matchPair
(_ nodePayload : nodePayload)
payload)
nodeRecord)
nodePayloadKind = (nodePayload : bytesHead nodePayload)
nodePayloadHasTag? = (tag nodePayload :
triage
false
(actualTag : byteEq? actualTag tag)
(_ _ : false)
(nodePayloadKind nodePayload))
nodePayloadLeaf? = (nodePayload : bytesEq? [(0)] nodePayload)
nodePayloadStem? = (nodePayload :
and?
(nodePayloadHasTag? nodePayloadStemTag nodePayload)
(equal? (bytesLength nodePayload) 33))
nodePayloadFork? = (nodePayload :
and?
(nodePayloadHasTag? nodePayloadForkTag nodePayload)
(equal? (bytesLength nodePayload) 65))
nodePayloadValid? = (nodePayload :
or?
(nodePayloadLeaf? nodePayload)
(or?
(nodePayloadStem? nodePayload)
(nodePayloadFork? nodePayload)))
nodePayloadStemChildHash = (nodePayload : bytesTake 32 (bytesDrop 1 nodePayload))
nodePayloadForkLeftHash = (nodePayload : bytesTake 32 (bytesDrop 1 nodePayload))
nodePayloadForkRightHash = (nodePayload : bytesTake 32 (bytesDrop 33 nodePayload))
nodeRecordPayloadValid? = (nodeRecord : nodePayloadValid? (nodeRecordPayload nodeRecord))
nodeRecordsHaveInvalidPayload? = y (self nodeRecords :
matchList
false
(nodeRecord rest :
or?
(not? (nodeRecordPayloadValid? nodeRecord))
(self rest)) (self rest))
nodeRecords) value)
nodeRecordsHaveHash? = y (self nodeRecords nodeHash : hostList? = y (self value :
matchList triage
false true
(nodeRecord rest : (_ : false)
or? (_ rest : self rest)
(bytesEq? nodeHash (nodeRecordHash nodeRecord)) value)
(self rest nodeHash))
nodeRecords)
nodeRecordsHaveDuplicateHashes? = y (self nodeRecords : hostString? = y (self value :
matchList
false
(nodeRecord rest :
or?
(nodeRecordsHaveHash? rest (nodeRecordHash nodeRecord))
(self rest))
nodeRecords)
lookupNodeRecord_ = y (self nodeRecords nodeHash :
matchList
nothing
(nodeRecord rest :
matchBool
(just nodeRecord)
(self rest nodeHash)
(bytesEq? nodeHash (nodeRecordHash nodeRecord)))
nodeRecords)
lookupNodeRecord = (nodeHash nodeRecords : lookupNodeRecord_ nodeRecords nodeHash)
nodeRecordChildHashes = (nodeRecord :
(nodePayload :
matchBool
t
(matchBool
(pair (nodePayloadStemChildHash nodePayload) t)
(pair (nodePayloadForkLeftHash nodePayload)
(pair (nodePayloadForkRightHash nodePayload) t))
(nodePayloadStem? nodePayload))
(nodePayloadLeaf? nodePayload))
(nodeRecordPayload nodeRecord))
nodeHashPresent? = (nodeHash nodeRecords : nodeRecordsHaveHash? nodeRecords nodeHash)
nodeChildHashesPresent? = y (self childHashes nodeRecords :
matchList matchList
true true
(childHash rest : (byte rest : and? (hostNumber? byte) (self rest))
and? value)
(nodeHashPresent? childHash nodeRecords)
(self rest nodeRecords))
childHashes)
nodeRecordChildrenPresent? = (nodeRecord nodeRecords : hostBytes? = hostString?
nodeChildHashesPresent? (nodeRecordChildHashes nodeRecord) nodeRecords)
nodeRecordsClosed? = y (self nodeRecords allNodeRecords : wrapHostValue = (validator wrapper resultValue rest :
matchList
true
(nodeRecord rest :
and?
(nodeRecordChildrenPresent? nodeRecord allNodeRecords)
(self rest allNodeRecords))
nodeRecords)
validateNodeRecords = (nodeRecords rest :
matchBool matchBool
(err errInvalidNodePayload rest) (ok (wrapper resultValue) rest)
(matchBool (err errHostCodecFailed resultValue)
(err errDuplicateNode rest) (validator resultValue))
(matchBool
(ok nodeRecords rest)
(err errMissingNode rest)
(nodeRecordsClosed? nodeRecords nodeRecords))
(nodeRecordsHaveDuplicateHashes? nodeRecords))
(nodeRecordsHaveInvalidPayload? nodeRecords))
readNodeRecords_ = y (self bs nodeCount i acc : runArboricxByNameToTree = (nameBytes bs args :
matchBool bindResult (runArboricxArgsByName nameBytes bs args)
(ok (reverse acc) bs) (value rest : ok (hostTree value) rest))
(bindResult (readNodeRecord bs)
(nodeRecord afterNodeRecord :
self afterNodeRecord nodeCount (succ i) (pair nodeRecord acc)))
(equal? i nodeCount))
readNodeRecords = (nodeCount bs : readNodeRecords_ bs nodeCount 0 t) runArboricxByNameToString = (nameBytes bs args :
bindResult (runArboricxArgsByName nameBytes bs args)
(value rest : wrapHostValue hostString? hostString value rest))
readNodesSection = (bs : runArboricxByNameToNumber = (nameBytes bs args :
bindResult (readBytes 8 bs) bindResult (runArboricxArgsByName nameBytes bs args)
(nodeCount afterNodeCount : (value rest : wrapHostValue hostNumber? hostNumber value rest))
bindResult (readNodeRecords (u64BEBytesToNat nodeCount) afterNodeCount)
(nodeRecords afterNodeRecords :
bindResult (validateNodeRecords nodeRecords afterNodeRecords)
(validNodeRecords afterValidNodeRecords :
ok (pair nodeCount validNodeRecords) afterValidNodeRecords))))
readNodesSectionComplete = (bs : runArboricxByNameToBool = (nameBytes bs args :
bindResult (readNodesSection bs) bindResult (runArboricxArgsByName nameBytes bs args)
(nodesSection afterNodesSection : (value rest : wrapHostValue hostBool? hostBool value rest))
matchBool
(ok nodesSection afterNodesSection)
(err errUnexpectedBytes afterNodesSection)
(bytesNil? afterNodesSection)))
readArboricxNodesSection = (bs : runArboricxByNameToList = (nameBytes bs args :
bindResult (readArboricxContainer bs) bindResult (runArboricxArgsByName nameBytes bs args)
(container afterContainer : (value rest : wrapHostValue hostList? hostList value rest))
matchPair
(_ directory :
bindResult (sectionBytesOrErr arboricxNodesSectionId directory bs afterContainer)
(nodesBytes _ :
bindResult (readNodesSectionComplete nodesBytes)
(nodesSection _ : ok nodesSection afterContainer)))
container))
nodesSectionCount = (nodesSection : runArboricxByNameToBytes = (nameBytes bs args :
matchPair bindResult (runArboricxArgsByName nameBytes bs args)
(nodeCount _ : nodeCount) (value rest : wrapHostValue hostBytes? hostBytes value rest))
nodesSection)
nodesSectionRecords = (nodesSection : runArboricxToTree = (bs args : runArboricxByNameToTree [] bs args)
matchPair runArboricxToString = (bs args : runArboricxByNameToString [] bs args)
(_ nodeRecords : nodeRecords) runArboricxToNumber = (bs args : runArboricxByNameToNumber [] bs args)
nodesSection) runArboricxToBool = (bs args : runArboricxByNameToBool [] bs args)
runArboricxToList = (bs args : runArboricxByNameToList [] bs args)
nodeRecordToTreeWith = (self nodeRecords nodeRecord : runArboricxToBytes = (bs args : runArboricxByNameToBytes [] bs args)
(nodePayload :
matchBool
(ok t t)
(matchBool
(bindResult (self (nodePayloadStemChildHash nodePayload) nodeRecords)
(child _ : ok (t child) t))
(bindResult (self (nodePayloadForkLeftHash nodePayload) nodeRecords)
(left _ :
bindResult (self (nodePayloadForkRightHash nodePayload) nodeRecords)
(right _ : ok (pair left right) t)))
(nodePayloadStem? nodePayload))
(nodePayloadLeaf? nodePayload))
(nodeRecordPayload nodeRecord))
nodeHashToTree = y (self nodeHash nodeRecords :
triage
(err errMissingNode t)
(nodeRecord : nodeRecordToTreeWith self nodeRecords nodeRecord)
(_ _ : err errMissingNode t)
(lookupNodeRecord nodeHash nodeRecords))
readArboricxTreeFromHash = (rootHash bs :
bindResult (readArboricxNodesSection bs)
(nodesSection afterContainer :
bindResult (nodeHashToTree rootHash (nodesSectionRecords nodesSection))
(tree _ : ok tree afterContainer)))
readArboricxExecutableFromHash = readArboricxTreeFromHash

View File

@@ -0,0 +1,17 @@
# tricu CLI debugging notes
For ad-hoc expressions, prefer stdin mode and set `TRICU_DB_PATH` to a DB that already has library definitions imported:
```sh
TRICU_DB_PATH=/tmp/gpt.db ./result/bin/tricu eval -t decode <<'EOF'
main = <expression-to-run>
EOF
```
Important details:
- `eval` from stdin evaluates the submitted program and uses its final/main result.
- When using `-f FILE`, the CLI expects a `main` definition in the evaluated file context.
- With `TRICU_DB_PATH=/tmp/gpt.db`, definitions already loaded into that content store are in scope; do not add `!import` lines unless you intentionally want file import preprocessing.
- `!import "lib/arboricx.tri" !Local` is relative to the file being preprocessed; from temp files it will look under `/tmp`, so avoid that pattern for scratch files.
- Do not inspect huge Arboricx values with `-t fsl`; write small predicates/accessors and return booleans, numbers, or byte strings decoded with `-t decode`.

View File

@@ -24,40 +24,22 @@ module Wire
import ContentStore (getNodeMerkle, loadTree, putMerkleNode, storeTerm) import ContentStore (getNodeMerkle, loadTree, putMerkleNode, storeTerm)
import Research import Research
import Codec.CBOR.Decoding ( Decoder
, decodeString
, decodeBytes
, decodeListLen
, decodeMapLen
)
import Control.Monad (replicateM, forM)
import Codec.CBOR.Encoding ( Encoding
, encodeMapLen
, encodeListLen
, encodeString
, encodeBytes
)
import Codec.CBOR.Write (toLazyByteString)
import Data.Monoid (mconcat)
import Codec.CBOR.Read (deserialiseFromBytes, DeserialiseFailure(..))
import Control.Exception (SomeException, evaluate, try) import Control.Exception (SomeException, evaluate, try)
import Control.Monad (foldM, unless, when) import Control.Monad (foldM, unless, when)
import Crypto.Hash (Digest, SHA256, hash) import Crypto.Hash (Digest, SHA256, hash)
import Data.Bits ((.&.), (.|.), shiftL, shiftR) import Data.Bits ((.|.), (.&.), shiftL, shiftR)
import Data.ByteArray (convert) import Data.ByteArray (convert)
import Data.ByteString (ByteString) import Data.ByteString (ByteString)
import Data.Foldable (traverse_) import Data.Foldable (traverse_)
import Data.Map (Map) import Data.Map (Map)
import Data.Text (Text, unpack) import Data.Text (Text, unpack)
import Data.Text.Encoding (decodeUtf8, encodeUtf8) import Data.Text.Encoding (decodeUtf8, decodeUtf8', encodeUtf8)
import Data.Word (Word16, Word32, Word64) import Data.Word (Word16, Word32, Word64, Word8)
import Database.SQLite.Simple (Connection) import Database.SQLite.Simple (Connection)
import GHC.Generics (Generic) import GHC.Generics (Generic)
import qualified Data.ByteString as BS import qualified Data.ByteString as BS
import qualified Data.ByteString.Base16 as Base16 import qualified Data.ByteString.Base16 as Base16
import qualified Data.ByteString.Lazy as BL
import qualified Data.Map as Map import qualified Data.Map as Map
import qualified Data.Set as Set import qualified Data.Set as Set
import qualified Data.Text as T import qualified Data.Text as T
@@ -91,92 +73,316 @@ compressionNone = 0
digestSha256 = 1 digestSha256 = 1
-- --------------------------------------------------------------------------- -- ---------------------------------------------------------------------------
-- CBOR encoding helpers -- Manifest binary constants
-- --------------------------------------------------------------------------- -- ---------------------------------------------------------------------------
-- | Canonical CBOR map length encoder. -- | Magic prefix identifying the fixed-order manifest v1 format.
cmkLen :: Int -> Encoding manifestMagic :: ByteString
cmkLen n = encodeMapLen (fromIntegral n) manifestMagic = "ARBMNFST"
-- | Decode a CBOR array of n elements. -- | Manifest major version.
decodeListN :: Decoder s a -> Int -> Decoder s [a] manifestMajorVersion :: Word16
decodeListN dec n = replicateM n dec manifestMajorVersion = 1
-- | Decode a CBOR map (sequence of key-value pairs). -- | Manifest minor version.
decodeMapN :: Decoder s a -> Decoder s b -> Int -> Decoder s [(a, b)] manifestMinorVersion :: Word16
decodeMapN keyDec valDec n = forM [1..n] $ \_ -> manifestMinorVersion = 0
keyDec >>= \k -> valDec >>= \v -> pure (k, v)
decodeKey :: Text -> Decoder s () -- | Closure mode to byte.
decodeKey expected = do closureToByte :: ClosureMode -> Word8
actual <- decodeString closureToByte = \case
unless (actual == expected) $ ClosureComplete -> 0
fail $ "expected key " ++ show expected ++ ", got " ++ show actual ClosurePartial -> 1
-- | Canonical CBOR array length encoder. closureFromByte :: Word8 -> Either String ClosureMode
cakLen :: Int -> Encoding closureFromByte = \case
cakLen n = encodeListLen (fromIntegral n) 0 -> Right ClosureComplete
1 -> Right ClosurePartial
n -> Left $ "unsupported closure byte: " ++ show n
-- | Encode a canonical CBOR map with key-value pairs as flat sequence. -- | Metadata tag constants.
cmkPairs :: [(Text, Encoding)] -> Encoding tagPackage, tagVersion, tagDescription, tagLicense, tagCreatedBy :: Word16
cmkPairs [] = cmkLen 0 tagPackage = 1
cmkPairs kvs = cmkLen (length kvs) <> mconcat [encodeString k <> v | (k, v) <- kvs] tagVersion = 2
tagDescription = 3
-- | Encode a canonical CBOR array. tagLicense = 4
cakSeq :: [Encoding] -> Encoding tagCreatedBy = 5
cakSeq [] = cakLen 0
cakSeq xs = cakLen (length xs) <> mconcat xs
-- | Encode a canonical CBOR text string.
encText :: Text -> Encoding
encText = encodeString
-- | Encode a canonical CBOR byte string.
encBytes :: ByteString -> Encoding
encBytes = encodeBytes
-- --------------------------------------------------------------------------- -- ---------------------------------------------------------------------------
-- Data types with CBOR instances -- Fixed-order manifest binary helpers
-- ---------------------------------------------------------------------------
-- | Encode a UTF-8 text string as: u32 length + UTF-8 bytes.
encodeLengthPrefixedText :: Text -> ByteString
encodeLengthPrefixedText t = encode32 (fromIntegral $ BS.length bs) <> bs
where bs = encodeUtf8 t
-- | Decode a length-prefixed UTF-8 text string.
-- Returns the decoded Text and the remaining ByteString.
decodeLengthPrefixedText :: ByteString -> Either String (Text, ByteString)
decodeLengthPrefixedText bs =
case decode32be "text_length" bs of
Left err -> Left $ "decodeLengthPrefixedText: " ++ err
Right (len, rest) -> do
let payloadLen = fromIntegral len
when (BS.length rest < payloadLen) $
Left "decodeLengthPrefixedText: string extends beyond input"
let (textBytes, after) = BS.splitAt payloadLen rest
case decodeUtf8' textBytes of
Right txt -> Right (txt, after)
Left _ -> Left "decodeLengthPrefixedText: invalid UTF-8"
-- | Encode a metadata value as a TLV entry: u16 tag + u32 length + raw bytes.
encodeMetadataTLV :: Word16 -> ByteString -> ByteString
encodeMetadataTLV tag val = encode16 tag <> encode32 (fromIntegral $ BS.length val) <> val
-- ---------------------------------------------------------------------------
-- Fixed-order manifest encoders
-- ---------------------------------------------------------------------------
-- | Encode the entire manifest in fixed-order core + TLV tail layout.
encodeManifest :: BundleManifest -> ByteString
encodeManifest m =
manifestMagic
<> encode16 manifestMajorVersion
<> encode16 manifestMinorVersion
<> encodeLengthPrefixedText (manifestSchema m)
<> encodeLengthPrefixedText (manifestBundleType m)
<> encodeLengthPrefixedText (treeCalculus (manifestTree m))
<> encodeLengthPrefixedText (nodeHashAlgorithm (treeNodeHash (manifestTree m)))
<> encodeLengthPrefixedText (nodeHashDomain (treeNodeHash (manifestTree m)))
<> encodeLengthPrefixedText (treeNodePayload (manifestTree m))
<> encodeLengthPrefixedText (runtimeSemantics (manifestRuntime m))
<> encodeLengthPrefixedText (runtimeEvaluation (manifestRuntime m))
<> encodeLengthPrefixedText (runtimeAbi (manifestRuntime m))
<> encode32 (fromIntegral $ length (runtimeCapabilities (manifestRuntime m)))
<> encodeCapabilities (runtimeCapabilities (manifestRuntime m))
<> BS.pack [closureToByte (manifestClosure m)]
<> encode32 (fromIntegral $ length (manifestRoots m))
<> encodeRoots (manifestRoots m)
<> encode32 (fromIntegral $ length (manifestExports m))
<> encodeExports (manifestExports m)
<> encodeMetadataTLVs (manifestMetadata m)
<> encode32 0 -- zero extension fields
encodeCapabilities :: [Text] -> ByteString
encodeCapabilities caps = mconcat (map encodeLengthPrefixedText caps)
encodeRoots :: [BundleRoot] -> ByteString
encodeRoots = mconcat . map encodeRoot
encodeRoot :: BundleRoot -> ByteString
encodeRoot root =
merkleHashToRaw (rootHash root)
<> encodeLengthPrefixedText (rootRole root)
encodeExports :: [BundleExport] -> ByteString
encodeExports = mconcat . map encodeExport
encodeExport :: BundleExport -> ByteString
encodeExport exp =
encodeLengthPrefixedText (exportName exp)
<> merkleHashToRaw (exportRoot exp)
<> encodeLengthPrefixedText (exportKind exp)
<> encodeLengthPrefixedText (exportAbi exp)
-- | Encode metadata as: u32 field count + TLV entries for present fields.
-- Metadata TLV values are raw UTF-8 bytes; the TLV length already carries size.
encodeMetadataTLVs :: BundleMetadata -> ByteString
encodeMetadataTLVs m =
let entries = metadataTLVEntries m
in encode32 (fromIntegral $ length entries) <> encodeTLVs entries
metadataTLVEntries :: BundleMetadata -> [(Word16, ByteString)]
metadataTLVEntries m =
maybeEntry tagPackage (metadataPackage m)
++ maybeEntry tagVersion (metadataVersion m)
++ maybeEntry tagDescription (metadataDescription m)
++ maybeEntry tagLicense (metadataLicense m)
++ maybeEntry tagCreatedBy (metadataCreatedBy m)
where
maybeEntry _ Nothing = []
maybeEntry tag (Just value) = [(tag, encodeUtf8 value)]
encodeTLVs :: [(Word16, ByteString)] -> ByteString
encodeTLVs tlvs = mconcat (map (uncurry encodeMetadataTLV) tlvs)
-- ---------------------------------------------------------------------------
-- Fixed-order manifest decoders
-- ---------------------------------------------------------------------------
-- | Decode the manifest from fixed-order core + TLV tail bytes.
-- All remaining bytes after the core fields are treated as the TLV tail.
decodeManifest :: ByteString -> Either String BundleManifest
decodeManifest bs = do
-- Header
when (BS.length bs < 8) $ Left "manifest too short for magic"
when (BS.take 8 bs /= manifestMagic) $ Left "invalid manifest magic"
let rest = BS.drop 8 bs
(major, rest') <- decode16be "major" rest
when (major /= manifestMajorVersion) $ Left $ "unsupported manifest major version: " ++ show major
(_minor, rest'') <- decode16be "minor" rest'
-- Core strings
(schema, rest''') <- decodeLengthPrefixedText rest''
(bundleType, rest'''') <- decodeLengthPrefixedText rest'''
-- Tree spec fields (flat)
(calc, rest1) <- decodeLengthPrefixedText rest''''
(alg, rest2) <- decodeLengthPrefixedText rest1
(domain, rest3) <- decodeLengthPrefixedText rest2
(payload, rest4) <- decodeLengthPrefixedText rest3
-- Runtime spec fields (flat)
(sem, restR1) <- decodeLengthPrefixedText rest4
(eval, restR2) <- decodeLengthPrefixedText restR1
(abi, restR3) <- decodeLengthPrefixedText restR2
(capCount, restR4) <- decode32be "capability_count" restR3
let capLen = fromIntegral capCount
(caps, restR5) <- decodeCapabilities capLen restR4
-- Closure
when (BS.length restR5 < 1) $ Left "manifest truncated: missing closure byte"
let (closureByte, restR6) = BS.splitAt 1 restR5
closure <- closureFromByte (head $ BS.unpack closureByte)
-- Roots
(rootCount, restR7) <- decode32be "root_count" restR6
let rootCountInt = fromIntegral rootCount
(roots, restR8) <- decodeRoots rootCountInt restR7
-- Exports
(exportCount, restR9) <- decode32be "export_count" restR8
let exportCountInt = fromIntegral exportCount
(exports, restR10) <- decodeExports exportCountInt restR9
-- TLV tail
(metadata, _ext) <- decodeMetadataAndExtensions restR10
pure BundleManifest
{ manifestSchema = schema
, manifestBundleType = bundleType
, manifestTree = TreeSpec
{ treeCalculus = calc
, treeNodeHash = NodeHashSpec
{ nodeHashAlgorithm = alg
, nodeHashDomain = domain
}
, treeNodePayload = payload
}
, manifestRuntime = RuntimeSpec
{ runtimeSemantics = sem
, runtimeEvaluation = eval
, runtimeAbi = abi
, runtimeCapabilities = caps
}
, manifestClosure = closure
, manifestRoots = roots
, manifestExports = exports
, manifestMetadata = metadata
}
-- | Decode length-prefixed capability strings.
decodeCapabilities :: Int -> ByteString -> Either String ([Text], ByteString)
decodeCapabilities 0 bs = Right ([], bs)
decodeCapabilities n bs = do
(txt, rest) <- decodeLengthPrefixedText bs
(restTxts, restFinal) <- decodeCapabilities (n - 1) rest
Right (txt : restTxts, restFinal)
-- | Decode root entries.
decodeRoots :: Int -> ByteString -> Either String ([BundleRoot], ByteString)
decodeRoots 0 bs = Right ([], bs)
decodeRoots n bs = do
when (BS.length bs < 32) $ Left "decodeRoots: truncated root hash"
let (hashBytes, rest) = BS.splitAt 32 bs
role <- decodeLengthPrefixedText rest
(restRoots, restFinal) <- decodeRoots (n - 1) (snd role)
Right (BundleRoot (rawToMerkleHash hashBytes) (fst role) : restRoots, restFinal)
-- | Decode export entries.
decodeExports :: Int -> ByteString -> Either String ([BundleExport], ByteString)
decodeExports 0 bs = Right ([], bs)
decodeExports n bs = do
name <- decodeLengthPrefixedText bs
when (BS.length (snd name) < 32) $ Left "decodeExports: truncated export root hash"
let (hashBytes, rest) = BS.splitAt 32 (snd name)
kind <- decodeLengthPrefixedText rest
abi <- decodeLengthPrefixedText (snd kind)
(restExports, restFinal) <- decodeExports (n - 1) (snd abi)
Right (BundleExport (fst name) (rawToMerkleHash hashBytes) (fst kind) (fst abi) : restExports, restFinal)
-- | Decode TLV tail into metadata and extensions.
-- Layout: u32 metadata-count, metadata TLVs, u32 extension-count, extension TLVs.
-- For now, known metadata tags are decoded and extension TLVs are skipped.
decodeMetadataAndExtensions :: ByteString -> Either String (BundleMetadata, ByteString)
decodeMetadataAndExtensions bs = do
(metadataCount, rest1) <- decode32be "metadata_field_count" bs
(metadataTlvs, rest2) <- decodeTLVs (fromIntegral metadataCount) rest1
metadata <- decodeMetadataTLVs metadataTlvs
(extensionCount, rest3) <- decode32be "extension_field_count" rest2
(_extensionTlvs, rest4) <- decodeTLVs (fromIntegral extensionCount) rest3
unless (BS.null rest4) $ Left "trailing bytes after manifest TLV tail"
Right (metadata, rest4)
-- | Decode a fixed number of TLV entries.
decodeTLVs :: Int -> ByteString -> Either String ([TLVEntry], ByteString)
decodeTLVs 0 bs = Right ([], bs)
decodeTLVs n bs = do
(tag, rest1) <- decode16be "tlv_tag" bs
(len, rest2) <- decode32be "tlv_length" rest1
let payloadLen = fromIntegral len
when (BS.length rest2 < payloadLen) $ Left "TLV value extends beyond input"
let (value, after) = BS.splitAt payloadLen rest2
(restTlvs, restFinal) <- decodeTLVs (n - 1) after
Right ((tag, value) : restTlvs, restFinal)
-- | Decode known metadata TLV entries into BundleMetadata.
-- Unknown tags are ignored.
decodeMetadataTLVs :: [(Word16, ByteString)] -> Either String BundleMetadata
decodeMetadataTLVs tlvs = do
pkg <- decodeOptionalMetadataText tagPackage
ver <- decodeOptionalMetadataText tagVersion
desc <- decodeOptionalMetadataText tagDescription
lic <- decodeOptionalMetadataText tagLicense
by <- decodeOptionalMetadataText tagCreatedBy
pure BundleMetadata
{ metadataPackage = pkg
, metadataVersion = ver
, metadataDescription = desc
, metadataLicense = lic
, metadataCreatedBy = by
}
where
lookupTag t = go t tlvs
go _ [] = Nothing
go t ((tag, val):rest)
| tag == t = Just val
| otherwise = go t rest
decodeOptionalMetadataText tag =
case lookupTag tag of
Nothing -> Right Nothing
Just raw -> case decodeUtf8' raw of
Right txt -> Right (Just txt)
Left _ -> Left $ "metadata TLV has invalid UTF-8 for tag " ++ show tag
type TLVEntry = (Word16, ByteString)
-- ---------------------------------------------------------------------------
-- Data types
-- --------------------------------------------------------------------------- -- ---------------------------------------------------------------------------
-- | Closure declaration. -- | Closure declaration.
data ClosureMode = ClosureComplete | ClosurePartial data ClosureMode = ClosureComplete | ClosurePartial
deriving (Show, Eq, Ord, Generic) deriving (Show, Eq, Ord, Generic)
toCBORClosure :: ClosureMode -> Encoding
toCBORClosure = encText . \case
ClosureComplete -> "complete"
ClosurePartial -> "partial"
closureFromCBOR :: Decoder s ClosureMode
closureFromCBOR = decodeString >>= \case
"complete" -> pure ClosureComplete
"partial" -> pure ClosurePartial
other -> fail $ "ClosureMode: " ++ show other
-- | Hash specification (algorithm + domain strings). -- | Hash specification (algorithm + domain strings).
data NodeHashSpec = NodeHashSpec data NodeHashSpec = NodeHashSpec
{ nodeHashAlgorithm :: Text { nodeHashAlgorithm :: Text
, nodeHashDomain :: Text , nodeHashDomain :: Text
} deriving (Show, Eq, Ord, Generic) } deriving (Show, Eq, Ord, Generic)
toCBORNodeHashSpec :: NodeHashSpec -> Encoding
toCBORNodeHashSpec (NodeHashSpec alg dom) =
cmkPairs
[ ("algorithm", encText alg)
, ("domain", encText dom)
]
nodeHashSpecFromCBOR :: Decoder s NodeHashSpec
nodeHashSpecFromCBOR = do
n <- decodeMapLen
unless (n == 2) $ fail "NodeHashSpec: must have exactly 2 entries"
decodeKey "algorithm"
alg <- decodeString
decodeKey "domain"
dom <- decodeString
pure (NodeHashSpec alg dom)
-- | Tree specification. -- | Tree specification.
data TreeSpec = TreeSpec data TreeSpec = TreeSpec
{ treeCalculus :: Text { treeCalculus :: Text
@@ -184,26 +390,6 @@ data TreeSpec = TreeSpec
, treeNodePayload :: Text , treeNodePayload :: Text
} deriving (Show, Eq, Ord, Generic) } deriving (Show, Eq, Ord, Generic)
toCBORTreeSpec :: TreeSpec -> Encoding
toCBORTreeSpec (TreeSpec calc hspec payload) =
cmkPairs
[ ("calculus", encText calc)
, ("nodeHash", toCBORNodeHashSpec hspec)
, ("nodePayload", encText payload)
]
treeSpecFromCBOR :: Decoder s TreeSpec
treeSpecFromCBOR = do
n <- decodeMapLen
unless (n == 3) $ fail "TreeSpec: must have exactly 3 entries"
decodeKey "calculus"
calc <- decodeString
decodeKey "nodeHash"
hspec <- nodeHashSpecFromCBOR
decodeKey "nodePayload"
payload <- decodeString
pure (TreeSpec calc hspec payload)
-- | Runtime specification. -- | Runtime specification.
data RuntimeSpec = RuntimeSpec data RuntimeSpec = RuntimeSpec
{ runtimeSemantics :: Text { runtimeSemantics :: Text
@@ -212,53 +398,12 @@ data RuntimeSpec = RuntimeSpec
, runtimeCapabilities :: [Text] , runtimeCapabilities :: [Text]
} deriving (Show, Eq, Ord, Generic) } deriving (Show, Eq, Ord, Generic)
toCBORRuntimeSpec :: RuntimeSpec -> Encoding
toCBORRuntimeSpec (RuntimeSpec sem eval abi caps) =
cmkPairs
[ ("semantics", encText sem)
, ("evaluation", encText eval)
, ("abi", encText abi)
, ("capabilities", cakSeq (map encText caps))
]
runtimeSpecFromCBOR :: Decoder s RuntimeSpec
runtimeSpecFromCBOR = do
n <- decodeMapLen
unless (n == 4) $ fail "RuntimeSpec: must have exactly 4 entries"
decodeKey "semantics"
sem <- decodeString
decodeKey "evaluation"
eval <- decodeString
decodeKey "abi"
abi <- decodeString
decodeKey "capabilities"
clen <- decodeListLen
caps <- decodeListN decodeString clen
pure (RuntimeSpec sem eval abi caps)
-- | A root hash reference. -- | A root hash reference.
data BundleRoot = BundleRoot data BundleRoot = BundleRoot
{ rootHash :: MerkleHash { rootHash :: MerkleHash
, rootRole :: Text , rootRole :: Text
} deriving (Show, Eq, Ord, Generic) } deriving (Show, Eq, Ord, Generic)
toCBORBundleRoot :: BundleRoot -> Encoding
toCBORBundleRoot (BundleRoot h role) =
cmkPairs
[ ("hash", encBytes (merkleHashToRaw h))
, ("role", encText role)
]
bundleRootFromCBOR :: Decoder s BundleRoot
bundleRootFromCBOR = do
n <- decodeMapLen
unless (n == 2) $ fail "BundleRoot: must have exactly 2 entries"
decodeKey "hash"
hRaw <- decodeBytes
decodeKey "role"
role <- decodeString
pure (BundleRoot (rawToMerkleHash hRaw) role)
-- | An export entry. -- | An export entry.
data BundleExport = BundleExport data BundleExport = BundleExport
{ exportName :: Text { exportName :: Text
@@ -267,29 +412,6 @@ data BundleExport = BundleExport
, exportAbi :: Text , exportAbi :: Text
} deriving (Show, Eq, Ord, Generic) } deriving (Show, Eq, Ord, Generic)
toCBORBundleExport :: BundleExport -> Encoding
toCBORBundleExport (BundleExport name h kind abi) =
cmkPairs
[ ("name", encText name)
, ("root", encBytes (merkleHashToRaw h))
, ("kind", encText kind)
, ("abi", encText abi)
]
bundleExportFromCBOR :: Decoder s BundleExport
bundleExportFromCBOR = do
n <- decodeMapLen
unless (n == 4) $ fail "BundleExport: must have exactly 4 entries"
decodeKey "name"
name <- decodeString
decodeKey "root"
hRaw <- decodeBytes
decodeKey "kind"
kind <- decodeString
decodeKey "abi"
abi <- decodeString
pure (BundleExport name (rawToMerkleHash hRaw) kind abi)
-- | Optional package metadata. -- | Optional package metadata.
data BundleMetadata = BundleMetadata data BundleMetadata = BundleMetadata
{ metadataPackage :: Maybe Text { metadataPackage :: Maybe Text
@@ -299,33 +421,6 @@ data BundleMetadata = BundleMetadata
, metadataCreatedBy :: Maybe Text , metadataCreatedBy :: Maybe Text
} deriving (Show, Eq, Ord, Generic) } deriving (Show, Eq, Ord, Generic)
metadataFromCBOR :: Decoder s BundleMetadata
metadataFromCBOR = do
mlen <- decodeMapLen
entries <- decodeMapN decodeString decodeString mlen
let lookupText k = go k entries
go _ [] = Nothing
go k ((k', v):rest)
| k == k' = Just v
| otherwise = go k rest
pure BundleMetadata
{ metadataPackage = lookupText "package"
, metadataVersion = lookupText "version"
, metadataDescription = lookupText "description"
, metadataLicense = lookupText "license"
, metadataCreatedBy = lookupText "createdBy"
}
metadataToCBOR :: BundleMetadata -> Encoding
metadataToCBOR (BundleMetadata pkg ver desc lic by) =
let pairs =
maybe [] (\v -> [("package", encText v)]) pkg
++ maybe [] (\v -> [("version", encText v)]) ver
++ maybe [] (\v -> [("description", encText v)]) desc
++ maybe [] (\v -> [("license", encText v)]) lic
++ maybe [] (\v -> [("createdBy", encText v)]) by
in cmkPairs pairs
-- | The manifest: top-level bundle metadata. -- | The manifest: top-level bundle metadata.
data BundleManifest = BundleManifest data BundleManifest = BundleManifest
{ manifestSchema :: Text { manifestSchema :: Text
@@ -338,43 +433,6 @@ data BundleManifest = BundleManifest
, manifestMetadata :: BundleMetadata , manifestMetadata :: BundleMetadata
} deriving (Show, Eq, Generic) } deriving (Show, Eq, Generic)
manifestToCBOR :: BundleManifest -> Encoding
manifestToCBOR m =
cmkPairs
[ ("schema", encText (manifestSchema m))
, ("bundleType", encText (manifestBundleType m))
, ("tree", toCBORTreeSpec (manifestTree m))
, ("runtime", toCBORRuntimeSpec (manifestRuntime m))
, ("closure", toCBORClosure (manifestClosure m))
, ("roots", cakSeq (map toCBORBundleRoot (manifestRoots m)))
, ("exports", cakSeq (map toCBORBundleExport (manifestExports m)))
, ("metadata", metadataToCBOR (manifestMetadata m))
]
manifestFromCBOR :: Decoder s BundleManifest
manifestFromCBOR = do
n <- decodeMapLen
unless (n == 8) $ fail "BundleManifest: must have exactly 8 entries"
decodeKey "schema"
schema <- decodeString
decodeKey "bundleType"
bundleType <- decodeString
decodeKey "tree"
tree <- treeSpecFromCBOR
decodeKey "runtime"
runtime <- runtimeSpecFromCBOR
decodeKey "closure"
closure <- closureFromCBOR
decodeKey "roots"
rlen <- decodeListLen
roots <- decodeListN bundleRootFromCBOR rlen
decodeKey "exports"
elen <- decodeListLen
exports <- decodeListN bundleExportFromCBOR elen
decodeKey "metadata"
metadata <- metadataFromCBOR
pure (BundleManifest schema bundleType tree runtime closure roots exports metadata)
-- | Portable executable-object bundle. -- | Portable executable-object bundle.
-- --
-- Merkle node payloads remain the language-neutral executable core: -- Merkle node payloads remain the language-neutral executable core:
@@ -388,28 +446,12 @@ data Bundle = Bundle
, bundleManifestBytes :: ByteString , bundleManifestBytes :: ByteString
} deriving (Show, Eq) } deriving (Show, Eq)
-- ---------------------------------------------------------------------------
-- CBOR manifest serialization
-- ---------------------------------------------------------------------------
-- | Encode the manifest as canonical CBOR.
encodeManifest :: BundleManifest -> ByteString
encodeManifest m = BL.toStrict (toLazyByteString (manifestToCBOR m))
-- | Decode a manifest from CBOR bytes.
decodeManifest :: ByteString -> Either String BundleManifest
decodeManifest bs =
case deserialiseFromBytes manifestFromCBOR (BL.fromStrict bs) of
Right (rest, m)
| BS.null (BL.toStrict rest) -> Right m
| otherwise -> Left "trailing bytes after manifest CBOR"
Left (DeserialiseFailure _ msg) -> Left msg
-- --------------------------------------------------------------------------- -- ---------------------------------------------------------------------------
-- Bundle encoding -- Bundle encoding
-- --------------------------------------------------------------------------- -- ---------------------------------------------------------------------------
-- | Encode a Bundle to portable Bundle v1 bytes. -- | Encode a Bundle to portable Bundle v1 bytes.
-- The manifest is serialized using the fixed-order core + TLV tail format.
encodeBundle :: Bundle -> ByteString encodeBundle :: Bundle -> ByteString
encodeBundle bundle = encodeBundle bundle =
let nodeSection = encodeNodeSection (bundleNodes bundle) let nodeSection = encodeNodeSection (bundleNodes bundle)

View File

@@ -49,6 +49,7 @@ tests = testGroup "Tricu Tests"
, wireTests , wireTests
, byteListUtilities , byteListUtilities
, binaryReaderTests , binaryReaderTests
, manifestReadingTests
] ]
lexer :: TestTree lexer :: TestTree
@@ -2182,3 +2183,677 @@ binaryReaderTests = testGroup "Binary Reader Tests"
let env = evalTricu library (parseTricu input) let env = evalTricu library (parseTricu input)
result env @?= Fork Leaf Leaf result env @?= Fork Leaf Leaf
] ]
-- ---------------------------------------------------------------------------
-- Manifest reading tests (Steps 1-9)
-- ---------------------------------------------------------------------------
-- Build a minimal manifest:
-- magic "ARBMNFST" (8) + version 1.0 (4) +
-- schema "arboricx.bundle.manifest.v1" (4+27=31) +
-- bundleType "tree-calculus-executable-object" (4+31=35) +
-- treeCalculus "tree-calculus.v1" (4+16=20) +
-- treeHashAlgorithm "sha256" (4+6=10) +
-- treeHashDomain "arboricx.merkle.node.v1" (4+23=27) +
-- treeNodePayload "arboricx.merkle.payload.v1" (4+26=30) +
-- runtimeSemantics "tree-calculus.v1" (4+16=20) +
-- runtimeEvaluation "normal-order" (4+12=16) +
-- runtimeAbi "arboricx.abi.tree.v1" (4+20=24) +
-- capabilityCount 0 (4) +
-- closure 0 (1) +
-- rootCount 1 (4) +
-- root: hash (32) + role "default" (4+7=11) = 43 +
-- exportCount 1 (4) +
-- export: name "term" (4+4=8) + root (32) + kind "term" (4+4=8) + abi "arboricx.abi.tree.v1" (4+20=24) = 72 +
-- Total core = 8+4+31+35+20+10+27+30+20+16+24+4+1+4+43+4+72 = 378 bytes
minimalManifestCoreBytes :: [Integer]
minimalManifestCoreBytes = [65,82,66,77,78,70,83,84] -- ARBMNFST magic
++ u16 1 ++ u16 0 -- version 1.0
++ lengthPrefixed "arboricx.bundle.manifest.v1" -- schema
++ lengthPrefixed "tree-calculus-executable-object" -- bundleType
++ lengthPrefixed "tree-calculus.v1" -- treeCalculus
++ lengthPrefixed "sha256" -- treeHashAlgorithm
++ lengthPrefixed "arboricx.merkle.node.v1" -- treeHashDomain
++ lengthPrefixed "arboricx.merkle.payload.v1" -- treeNodePayload
++ lengthPrefixed "tree-calculus.v1" -- runtimeSemantics
++ lengthPrefixed "normal-order" -- runtimeEvaluation
++ lengthPrefixed "arboricx.abi.tree.v1" -- runtimeAbi
++ u32 0 -- 0 capabilities
++ [0] -- closure complete
++ u32 1 -- 1 root
++ replicate 32 0 -- placeholder root hash
++ lengthPrefixed "default" -- root role
++ u32 1 -- 1 export
++ lengthPrefixed "term" -- export name
++ replicate 32 0 -- placeholder export root hash
++ lengthPrefixed "term" -- export kind
++ lengthPrefixed "arboricx.abi.tree.v1" -- export abi
lengthPrefixed :: String -> [Integer]
lengthPrefixed s = u32 (fromIntegral (length s)) ++ map (fromIntegral . fromEnum) s
-- Full manifest: core + 0 metadata + 0 extension = core + u32(0) + u32(0)
fullMinimalManifestBytes :: [Integer]
fullMinimalManifestBytes = minimalManifestCoreBytes ++ u32 0 ++ u32 0
-- Create TLV list with two entries:
-- tag 1 (package), value "my-pkg", then tag 2 (version), value "1.0"
-- then "rest" bytes
tlvForTagAndValue :: Integer -> String -> [Integer]
tlvForTagAndValue tag val =
u16 (fromIntegral tag) ++ lengthPrefixed val
-- Build a pair of (tag, value) TLV
makeTLVPair :: Integer -> String -> String
makeTLVPair tag val =
"[(pair " ++ bytesExpr [0, fromIntegral tag] ++ " "
++ bytesExpr (map (fromIntegral . fromEnum) val) ++ ")]"
exportEntryExpr :: String -> [Integer] -> String -> String -> String
exportEntryExpr name rootHashBytes kind abi =
"(pair " ++ bytesExpr (map (fromIntegral . fromEnum) name) ++ " "
++ "(pair " ++ bytesExpr rootHashBytes ++ " "
++ "(pair " ++ bytesExpr (map (fromIntegral . fromEnum) kind) ++ " "
++ bytesExpr (map (fromIntegral . fromEnum) abi) ++ ")))"
-- Build list of export entries for the test
singleExportExpr :: String
singleExportExpr =
"[" ++ exportEntryExpr "main" (replicate 32 0) "term" "arboricx.abi.tree.v1" ++ "]"
multiExportExpr :: String
multiExportExpr =
"["
++ exportEntryExpr "main" (replicate 32 0) "term" "arboricx.abi.tree.v1"
-- ++ ", "
++ exportEntryExpr "test" (replicate 32 1) "term" "arboricx.abi.tree.v1"
++ "]"
-- Helper to build a minimal valid manifest core
-- Returns a tricu expression representing the parsed core structure
buildValidCoreExpr :: String
buildValidCoreExpr =
"(pair "
++ bytesExpr (map (fromIntegral . fromEnum) "arboricx.bundle.manifest.v1") ++ " " -- schema
++ "(pair "
++ bytesExpr (map (fromIntegral . fromEnum) "tree-calculus-executable-object") ++ " " -- bundleType
++ "(pair "
++ bytesExpr (map (fromIntegral . fromEnum) "tree-calculus.v1") ++ " " -- treeCalculus
++ "(pair "
++ bytesExpr (map (fromIntegral . fromEnum) "sha256") ++ " " -- treeHashAlgorithm
++ "(pair "
++ bytesExpr (map (fromIntegral . fromEnum) "arboricx.merkle.node.v1") ++ " " -- treeHashDomain
++ "(pair "
++ bytesExpr (map (fromIntegral . fromEnum) "arboricx.merkle.payload.v1") ++ " " -- treeNodePayload
++ "(pair "
++ bytesExpr (map (fromIntegral . fromEnum) "tree-calculus.v1") ++ " " -- runtimeSemantics
++ "(pair "
++ bytesExpr (map (fromIntegral . fromEnum) "normal-order") ++ " " -- runtimeEvaluation
++ "(pair "
++ bytesExpr (map (fromIntegral . fromEnum) "arboricx.abi.tree.v1") ++ " " -- runtimeAbi
++ "(pair "
++ "[] " -- capabilities
++ "(pair "
++ "0 " -- closure
++ "(pair "
++ "[(pair " ++ bytesExpr (replicate 32 0) ++ " "
++ bytesExpr (map (fromIntegral . fromEnum) "default") ++ ")" -- roots (1 root)
++ "] "
++ "[(pair "
++ bytesExpr (map (fromIntegral . fromEnum) "term") ++ " "
++ "(pair " ++ bytesExpr (replicate 32 0) ++ " "
++ "(pair "
++ bytesExpr (map (fromIntegral . fromEnum) "term") ++ " "
++ bytesExpr (map (fromIntegral . fromEnum) "arboricx.abi.tree.v1") ++ ")))" -- exports (1 export)
++ "])"
++ "]"
++ "]"
++ "]"
++ "]"
++ "]"
++ "]"
++ "]"
++ "]"
++ "]"
++ "]"
++ "]"
++ "]"
++ ")"
-- Build a tricu expression that extracts a specific manifest field from
-- readArboricxBundle result and returns it as a byte-list T value.
-- The Haskell test then uses toString to convert it to a String.
extractManifestField :: ByteString -> String -> String
extractManifestField fixtureBytes fieldName =
"matchResult "
++ " (errCode rest : errCode) "
++ " (bundleResult rest : "
++ " matchPair "
++ " (validCore metadataWithExtensions : "
++ " " ++ fieldName ++ " validCore) "
++ " bundleResult) "
++ " (readArboricxBundle " ++ bytesExpr (map toInteger $ BS.unpack fixtureBytes) ++ ")"
manifestReadingTests :: TestTree
manifestReadingTests = testGroup "Manifest Reading Tests"
[
-- ------------------------------------------------------------------------
-- Step 1: readManifestMagic
-- ------------------------------------------------------------------------
testCase "readManifestMagic: accepts correct manifest magic and preserves rest" $ do
let input = "readManifestMagic ((append arboricxManifestMagic) [(1) (2)])"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
result env @?= okT unitT (bytesT [1,2])
, testCase "readManifestMagic: rejects wrong magic" $ do
let input = "readManifestMagic [(65) (83) (66) (77) (78) (70) (83) (84)]"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
result env @?= errT unexpectedBytesT (bytesT [65,83,66,77,78,70,83,84])
, testCase "readManifestMagic: short input returns EOF" $ do
let input = "readManifestMagic [(65) (82) (66) (77)]"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
result env @?= errT eofT (bytesT [65,82,66,77])
-- ------------------------------------------------------------------------
-- Step 2: readLengthPrefixedString
-- ------------------------------------------------------------------------
, testCase "readLengthPrefixedString: reads a 5-byte string" $ do
let input = "readLengthPrefixedString [(0) (0) (0) (5) (104) (101) (108) (108) (111) (99) (111) (110) (116) (101) (114)]"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
result env @?= okT (bytesT [104,101,108,108,111]) (bytesT [99,111,110,116,101,114])
, testCase "readLengthPrefixedString: reads an empty string" $ do
let input = "readLengthPrefixedString [(0) (0) (0) (0) (97) (98)]"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
result env @?= okT (bytesT []) (bytesT [97,98])
, testCase "readLengthPrefixedString: short payload returns EOF" $ do
let input = "readLengthPrefixedString [(0) (0) (0) (5) (104) (101) (108)]"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
result env @?= errT eofT (bytesT [104,101,108])
-- ------------------------------------------------------------------------
-- Step 3: readManifestCore (construct a minimal valid manifest)
-- ------------------------------------------------------------------------
, testCase "readManifestCore: reads a minimal valid manifest core" $ do
let input = "readManifestCore " ++ bytesExpr minimalManifestCoreBytes
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let actualResult = result env
case actualResult of
(Fork Leaf Leaf) -> assertFailure "should be ok, not t"
(Fork _ (Fork _ rest)) -> return () -- ok case: pair true (pair value rest)
_ -> assertFailure $ "expected ok result, got: " ++ show actualResult
, testCase "readManifestCore: returns error on wrong magic" $ do
let badMagic = [65,83,66,77,78,70,83,84] ++ (drop 8 minimalManifestCoreBytes)
let input = "readManifestCore " ++ bytesExpr badMagic
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let actualResult = result env
case actualResult of
(Fork falseT _) -> return () -- err case: pair false (pair code rest)
_ -> assertFailure $ "expected err result, got: " ++ show actualResult
-- ------------------------------------------------------------------------
-- Step 4: TLV reader
-- ------------------------------------------------------------------------
, testCase "readTLV: reads a metadata TLV entry" $ do
-- tag = u16 1 = [(0)(1)], length = u32 3 = [(0)(0)(0)(3)], value = "foo" = [102,111,111]
let input = "readTLV [(0) (1) (0) (0) (0) (3) (102) (111) (111) (99) (111) (110) (116) (114) (101) (115) (116)]"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let actualResult = result env
case actualResult of
(Fork _ (Fork _ rest)) -> do
-- ok case: verify the value pair
let value = case result env of
(Fork _ (Fork val _)) -> case val of
(Fork tagVal _) -> tagVal
_ -> Leaf
return ()
_ -> assertFailure $ "expected ok result, got: " ++ show actualResult
, testCase "readTLV: returns EOF on empty input" $ do
let input = "readTLV []"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
result env @?= errT eofT (bytesT [])
, testCase "readTLV: returns EOF on short tag" $ do
let input = "readTLV [(0)]"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
result env @?= errT eofT (bytesT [0])
, testCase "readTLVList: reads zero TLV entries" $ do
let input = "readTLVList 0 [(1) (2) (3)]"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
result env @?= okT (ofList []) (bytesT [1,2,3])
, testCase "readTLVList: reads one TLV entry and preserves rest" $ do
-- tag=1, len=3, value="foo"
let input = "readTLVList 1 [(0) (1) (0) (0) (0) (3) (102) (111) (111) (99) (111) (110) (116) (114) (101) (115) (116)]"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let actualResult = result env
case actualResult of
(Fork _ (Fork _ rest)) -> do
-- ok: value is list with one TLV, rest should be [(99)...]
return ()
_ -> assertFailure $ "expected ok result, got: " ++ show actualResult
-- ------------------------------------------------------------------------
-- Step 5: readManifest (full parser)
-- ------------------------------------------------------------------------
, testCase "readManifest: parses a minimal manifest with no metadata" $ do
let input = "readManifest " ++ bytesExpr fullMinimalManifestBytes
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let actualResult = result env
case actualResult of
(Fork _ (Fork _ _)) -> return () -- ok result
_ -> assertFailure $ "expected ok result, got: " ++ show actualResult
, testCase "readManifest: preserves trailing extension bytes" $ do
let input = "readManifest (append " ++ bytesExpr fullMinimalManifestBytes ++ " [(99) (111) (110) (116) (101) (110) (116) (101) (114)])"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let actualResult = result env
case actualResult of
(Fork trueTag (Fork _ _)) | trueTag == trueT -> return ()
_ -> assertFailure $ "expected ok result, got: " ++ show actualResult
-- ------------------------------------------------------------------------
-- Step 6: lookupMetadata
-- ------------------------------------------------------------------------
, testCase "lookupMetadata: finds metadata by tag" $ do
let tlv1 = makeTLVPair 1 "my-pkg"
let tlv2 = makeTLVPair 2 "1.0"
let input = "lookupMetadata (" ++ tlv1 ++ ") " ++ bytesExpr [(0), (1)]
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
result env @?= justT (bytesT [109,121,45,112,107,103])
, testCase "lookupMetadata: returns nothing for unknown tag" $ do
let tlv1 = makeTLVPair 1 "my-pkg"
let input = "lookupMetadata " ++ tlv1 ++ " " ++ bytesExpr [(0), (2)]
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
result env @?= nothingT
, testCase "lookupMetadata: returns nothing for empty list" $ do
let input = "lookupMetadata [] " ++ bytesExpr [(0), (1)]
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
result env @?= nothingT
-- ------------------------------------------------------------------------
-- Step 7: Export selection
-- ------------------------------------------------------------------------
-- Build export entry: (pair name (pair rootHash (pair kind abi)))
-- Test: select export by explicit name ("main")
, testCase "selectExport: finds export by explicit name" $ do
let input = "selectExport " ++ multiExportExpr ++ " " ++ bytesExpr (map (fromIntegral . fromEnum) "main")
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let actualResult = result env
case actualResult of
(Fork _ (Fork _ _)) -> return () -- ok result
_ -> assertFailure $ "expected ok result, got: " ++ show actualResult
-- Test: selectExport prefers "main" when no explicit name
, testCase "selectExport: selects 'main' when no explicit name and 'main' exists" $ do
let input = "selectExport " ++ multiExportExpr ++ " " ++ bytesExpr []
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let actualResult = result env
case actualResult of
(Fork _ (Fork _ _)) -> return () -- ok result
_ -> assertFailure $ "expected ok result, got: " ++ show actualResult
-- Test: selectExport selects single export when only one exists
, testCase "selectExport: auto-selects single export" $ do
let input = "selectExport " ++ singleExportExpr ++ " " ++ bytesExpr []
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let actualResult = result env
case actualResult of
(Fork _ (Fork _ _)) -> return () -- ok result
_ -> assertFailure $ "expected ok result, got: " ++ show actualResult
-- Test: getExportNames lists all export names
, testCase "getExportNames: returns list of all export names" $ do
let input = "getExportNames " ++ multiExportExpr
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let actualResult = result env
-- Should return a list of two byte strings
case actualResult of
(Fork (Fork _ _) (Fork (Fork _ _) _)) -> return () -- list with 2 items
_ -> assertFailure $ "expected list of 2 items, got: " ++ show actualResult
-- Test: selectExport errors when multiple exports but no "main" and no explicit name
, testCase "selectExport: errors with multiple exports but no 'main'" $ do
let multiNoMain =
"["
++ exportEntryExpr "validate" (replicate 32 0) "term" "arboricx.abi.tree.v1"
++ " "
++ exportEntryExpr "test" (replicate 32 1) "term" "arboricx.abi.tree.v1"
++ "]"
let input = "selectExport " ++ multiNoMain ++ " " ++ bytesExpr []
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let actualResult = result env
case actualResult of
(Fork falseT _) -> return () -- err result
_ -> assertFailure $ "expected err result, got: " ++ show actualResult
-- Test: selectExportOpt works with Just bytes (explicit name given)
, testCase "selectExportOpt: selects by explicit name when given" $ do
let input = "selectExportOpt " ++ multiExportExpr ++ " " ++ bytesExpr (map (fromIntegral . fromEnum) "validate")
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let actualResult = result env
case actualResult of
(Fork _ (Fork _ _)) -> return () -- ok result
_ -> assertFailure $ "expected ok result, got: " ++ show actualResult
-- ------------------------------------------------------------------------
-- Step 8: validateManifestCore
-- ------------------------------------------------------------------------
, testCase "validateManifestCore: passes on valid core" $ do
let input = "matchResult (code rest : err code rest) (core rest : validateManifestCore core " ++ bytesExpr [(1), (2)] ++ ") (readManifestCore " ++ bytesExpr minimalManifestCoreBytes ++ ")"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let actualResult = result env
case actualResult of
(Fork trueTag (Fork _ rest)) | trueTag == trueT -> rest @?= bytesT [1,2]
_ -> assertFailure $ "expected ok result, got: " ++ show actualResult
, testCase "validateManifestCore: fails on wrong schema" $ do
let badCoreBytes = take 16 minimalManifestCoreBytes ++ map (fromIntegral . fromEnum) "z" ++ drop 17 minimalManifestCoreBytes
let input = "matchResult (code rest : err code rest) (core rest : validateManifestCore core " ++ bytesExpr [] ++ ") (readManifestCore " ++ bytesExpr badCoreBytes ++ ")"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let actualResult = result env
case actualResult of
(Fork falseTag _) | falseTag == falseT -> return ()
_ -> assertFailure $ "expected err result, got: " ++ show actualResult
-- ------------------------------------------------------------------------
-- Step 9: readArboricxBundle (end-to-end with real fixture)
-- ------------------------------------------------------------------------
, testCase "readArboricxBundle: parses id.arboricx fixture" $ do
fixtureBytes <- BS.readFile "test/fixtures/id.arboricx"
case decodeBundle fixtureBytes of
Left err -> assertFailure $ "decodeBundle failed: " ++ err
Right bundle -> do
let manifestBytes = bundleManifestBytes bundle
-- The manifest section should be parseable
let input = "readManifest " ++ bytesExpr (map toInteger (BS.unpack manifestBytes))
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let actualResult = result env
case actualResult of
(Fork trueTag (Fork _ _)) | trueTag == trueT -> return ()
_ -> assertFailure $ "readManifest failed on id.arboricx manifest: " ++ show actualResult
, testCase "readArboricxBundle: end-to-end bundle parse" $ do
fixtureBytes <- BS.readFile "test/fixtures/id.arboricx"
let input = "readArboricxBundle " ++ bytesExpr (map toInteger (BS.unpack fixtureBytes))
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let actualResult = result env
case actualResult of
(Fork _ (Fork _ _)) -> return () -- ok: (pair validManifest afterManifest)
_ -> assertFailure $ "readArboricxBundle failed: " ++ show actualResult
, testCase "readArboricxBundle: rejects bundle with wrong manifest core" $ do
fixtureBytes <- BS.readFile "test/fixtures/id.arboricx"
-- Modify a byte in the manifest section to invalidate it
-- The manifest starts at offset 152 in the bundle (from header dirOffset=32)
-- Section directory: 2 entries * 60 = 120 bytes, starting at offset 32
-- Manifest entry at directory offset 32: type(4) + version(2) + flags(2) + compression(2) + digestAlg(2) + offset(8) + length(8) + digest(32) = 60
-- Manifest offset = 32 + 60 = 92
-- The manifest itself starts at offset 152 (0x98)
-- Change byte at position 152+8 = 160 from 'a' (97) to 'z' (122) to break the schema string
let bs = map toInteger (BS.unpack fixtureBytes)
let modifiedBs = take 160 bs ++ [122] ++ drop 161 bs
let input = "readArboricxBundle " ++ bytesExpr modifiedBs
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let actualResult = result env
case actualResult of
(Fork falseT _) -> return () -- err result (validation failure)
_ -> assertFailure $ "expected err result, got: " ++ show actualResult
-- ------------------------------------------------------------------------
-- Comprehensive end-to-end: extract manifest fields and verify as strings
-- ------------------------------------------------------------------------
, testCase "readArboricxBundle: extracts and validates manifest schema" $ do
fixtureBytes <- BS.readFile "test/fixtures/id.arboricx"
let input = extractManifestField fixtureBytes "manifestSchema"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let schemaT = result env
toString schemaT @?= Right "arboricx.bundle.manifest.v1"
, testCase "readArboricxBundle: extracts and validates bundleType" $ do
fixtureBytes <- BS.readFile "test/fixtures/id.arboricx"
let input = extractManifestField fixtureBytes "manifestBundleType"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let bundleTypeT = result env
toString bundleTypeT @?= Right "tree-calculus-executable-object"
, testCase "readArboricxBundle: extracts and validates runtime evaluation" $ do
fixtureBytes <- BS.readFile "test/fixtures/id.arboricx"
let input = extractManifestField fixtureBytes "manifestRuntimeEvaluation"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let evalT = result env
toString evalT @?= Right "normal-order"
, testCase "readArboricxBundle: extracts and validates runtime ABI" $ do
fixtureBytes <- BS.readFile "test/fixtures/id.arboricx"
let input = extractManifestField fixtureBytes "manifestRuntimeAbi"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let abiT = result env
toString abiT @?= Right "arboricx.abi.tree.v1"
, testCase "readArboricxBundle: extracts and validates root names" $ do
fixtureBytes <- BS.readFile "test/fixtures/id.arboricx"
let input = "matchResult "
++ " (errCode rest : errCode) "
++ " (bundleResult rest : "
++ " matchPair "
++ " (validCore metadataWithExtensions : "
++ " matchList "
++ " (err 99 t) " -- empty roots
++ " (rootEntry rest : "
++ " matchPair "
++ " (_ roleField : roleField) "
++ " rootEntry) "
++ " (manifestRoots validCore)) "
++ " bundleResult) "
++ " (readArboricxBundle " ++ bytesExpr (map toInteger $ BS.unpack fixtureBytes) ++ ")"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let rootRoleT = result env
-- Should find at least one root with a role (either "default" or "root")
case toString rootRoleT of
Right role -> assertBool "root role should be 'default' or 'root'"
(role == "default" || role == "root")
Left err -> assertFailure $ "failed to extract root role: " ++ err
, testCase "readArboricxBundle: extracts and validates closure" $ do
fixtureBytes <- BS.readFile "test/fixtures/id.arboricx"
let input = "matchResult "
++ " (errCode rest : errCode) "
++ " (bundleResult rest : "
++ " matchPair "
++ " (validCore metadataWithExtensions : "
++ " matchPair "
++ " (closure _ : closure) "
++ " (manifestClosureByte validCore)) "
++ " bundleResult) "
++ " (readArboricxBundle " ++ bytesExpr (map toInteger $ BS.unpack fixtureBytes) ++ ")"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let closureT = result env
case toNumber closureT of
Right 0 -> return ()
Right n -> assertFailure $ "closure should be 0, got " ++ show n
Left err -> assertFailure $ "failed to extract closure: " ++ err
, testCase "readArboricxBundle: extracts and validates hash algorithm" $ do
fixtureBytes <- BS.readFile "test/fixtures/id.arboricx"
let input = extractManifestField fixtureBytes "manifestTreeHashAlgorithm"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
let algoT = result env
toString algoT @?= Right "sha256"
, testCase "readArboricxExecutable: reconstructs default export tree" $ do
(srcConn, termHash, originalTerm) <- storeTermInTempDB $ unlines
[ "main = t t" ]
wireData <- exportBundle srcConn [termHash]
let input = "matchResult "
++ " (code rest : err code rest) "
++ " (tree rest : ok tree []) "
++ " (readArboricxExecutable " ++ bytesExpr (map toInteger $ BS.unpack wireData) ++ ")"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
result env @?= okT originalTerm (bytesT [])
close srcConn
, testCase "readArboricxExecutableByName: selects named export" $ do
srcConn <- newContentStore
let parsed = parseTricu $ unlines
[ "leaf = t"
, "stem = t t"
, "main = stem"
]
env = evalTricu Map.empty parsed
leafTerm = maybe (error "leaf missing") id (Map.lookup "leaf" env)
stemTerm = maybe (error "stem missing") id (Map.lookup "stem" env)
leafHash <- storeTerm srcConn ["leaf"] leafTerm
stemHash <- storeTerm srcConn ["stem"] stemTerm
wireData <- exportNamedBundle srcConn [("leaf", leafHash), ("stem", stemHash)]
let input = "matchResult "
++ " (code rest : err code rest) "
++ " (tree rest : ok tree []) "
++ " (readArboricxExecutableByName " ++ bytesExpr (map (fromIntegral . fromEnum) "stem") ++ " " ++ bytesExpr (map toInteger $ BS.unpack wireData) ++ ")"
library <- evaluateFile "./lib/arboricx.tri"
let resultEnv = evalTricu library (parseTricu input)
result resultEnv @?= okT stemTerm (bytesT [])
close srcConn
, testCase "runArboricx: applies host-provided argument to default export" $ do
(srcConn, termHash, _) <- storeTermInTempDB $ unlines
[ "main = (x : x)" ]
wireData <- exportBundle srcConn [termHash]
let input = "matchResult "
++ " (code rest : err code rest) "
++ " (value rest : value) "
++ " (runArboricx " ++ bytesExpr (map toInteger $ BS.unpack wireData) ++ " \"hello\")"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
toString (result env) @?= Right "hello"
close srcConn
, testCase "runArboricxArgs: applies host-provided argument list in order" $ do
(srcConn, termHash, _) <- storeTermInTempDB $ unlines
[ "main = (x y : x)" ]
wireData <- exportBundle srcConn [termHash]
let input = "matchResult "
++ " (code rest : err code rest) "
++ " (value rest : value) "
++ " (runArboricxArgs " ++ bytesExpr (map toInteger $ BS.unpack wireData) ++ " [(\"left\") (\"right\")])"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
toString (result env) @?= Right "left"
close srcConn
, testCase "host ABI: constructors expose tag and payload" $ do
library <- evaluateFile "./lib/arboricx.tri"
let stringInput = "hostString \"hello\""
stringEnv = evalTricu library (parseTricu stringInput)
result stringEnv @?= pairT (ofNumber 1) (ofString "hello")
let tagEnv = evalTricu library (parseTricu "hostValueTag (hostNumber 42)")
result tagEnv @?= ofNumber 2
let payloadEnv = evalTricu library (parseTricu "hostValuePayload (hostBool true)")
result payloadEnv @?= trueT
, testCase "runArboricxToTree: wraps raw result as hostTree" $ do
(srcConn, termHash, originalTerm) <- storeTermInTempDB $ unlines
[ "main = t t" ]
wireData <- exportBundle srcConn [termHash]
let input = "matchResult "
++ " (code rest : err code rest) "
++ " (hostValue rest : ok hostValue []) "
++ " (runArboricxToTree " ++ bytesExpr (map toInteger $ BS.unpack wireData) ++ " [])"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
result env @?= okT (pairT (ofNumber 0) originalTerm) (bytesT [])
close srcConn
, testCase "runArboricxToString: wraps string result as hostString" $ do
(srcConn, termHash, _) <- storeTermInTempDB $ unlines
[ "main = (x : x)" ]
wireData <- exportBundle srcConn [termHash]
let input = "matchResult "
++ " (code rest : err code rest) "
++ " (hostValue rest : ok hostValue []) "
++ " (runArboricxToString " ++ bytesExpr (map toInteger $ BS.unpack wireData) ++ " [(\"hello\")])"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
result env @?= okT (pairT (ofNumber 1) (ofString "hello")) (bytesT [])
close srcConn
, testCase "runArboricxToNumber: wraps number result as hostNumber" $ do
(srcConn, termHash, _) <- storeTermInTempDB $ unlines
[ "main = 42" ]
wireData <- exportBundle srcConn [termHash]
let input = "matchResult "
++ " (code rest : err code rest) "
++ " (hostValue rest : ok hostValue []) "
++ " (runArboricxToNumber " ++ bytesExpr (map toInteger $ BS.unpack wireData) ++ " [])"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
result env @?= okT (pairT (ofNumber 2) (ofNumber 42)) (bytesT [])
close srcConn
, testCase "runArboricxToBool: rejects non-bool result" $ do
(srcConn, termHash, _) <- storeTermInTempDB $ unlines
[ "main = 42" ]
wireData <- exportBundle srcConn [termHash]
let input = "runArboricxToBool " ++ bytesExpr (map toInteger $ BS.unpack wireData) ++ " []"
library <- evaluateFile "./lib/arboricx.tri"
let env = evalTricu library (parseTricu input)
case result env of
Fork falseTag (Fork code _) | falseTag == falseT -> code @?= ofNumber 14
actual -> assertFailure $ "expected host codec error, got: " ++ show actual
close srcConn
]

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

View File

@@ -41,7 +41,6 @@ executable tricu
, base16-bytestring , base16-bytestring
, base64-bytestring , base64-bytestring
, bytestring , bytestring
, cborg
, cmdargs , cmdargs
, containers , containers
, cryptonite , cryptonite
@@ -94,7 +93,6 @@ test-suite tricu-tests
, base16-bytestring , base16-bytestring
, base64-bytestring , base64-bytestring
, bytestring , bytestring
, cborg
, cmdargs , cmdargs
, containers , containers
, cryptonite , cryptonite