Drop CBOR for simple custom manifest
AGENTS.md (118 lines changed)
@@ -128,114 +128,18 @@ hash = SHA256("arboricx.merkle.node.v1" <> 0x00 <> serialized_node)
 This is stored in SQLite via `ContentStore.hs`. Hash suffixes on identifiers (e.g., `foo_abc123...`) are validated: 16–64 hex characters (SHA256).

-## 7. Arboricx Portable Wire Format
+## 7. Arboricx Portable Bundles (`.arboricx`)

-The **Arboricx wire format** (module `Wire.hs`) defines a portable binary bundle for exchanging Tree Calculus terms, their Merkle DAGs, and associated metadata. It is versioned and schema-driven.
+Portable executable bundles are generated via `Wire.hs`. See `docs/arboricx-bundle-format.md` for the full binary format spec.

-### Header
+```bash
+# Export a bundle from the content store
+./result/bin/tricu export -o myterm.arboricx myterm
+
+# Import a library, then export a bundle (requires TRICU_DB_PATH)
+./result/bin/tricu import -f lib/list.tri
+TRICU_DB_PATH=/tmp/tricu.db ./result/bin/tricu export -o list_ops.arboricx append
+```
-
-```
-+------------------+-----------------+------------------+----------------+
-| Magic (8 bytes)  | Major (2 bytes) | Minor (2 bytes)  | Section Count  |
-|                  |                 |                  | (4 bytes)      |
-+------------------+-----------------+------------------+----------------+
-| Flags (8 bytes)  | Dir Offset (8 bytes)               |
-+------------------+------------------------------------+
-```
-
-- **Magic**: `ARBORICX` (`0x41 0x52 0x42 0x4f 0x52 0x49 0x43 0x58`)
-- **Header length**: 32 bytes
-- **Major version**: `1` | **Minor version**: `0`
-
-### Section Directory
-
-Immediately follows the header. Each section entry is 60 bytes:
-
-```
-+------------------+------------------+-----------------+-------------------+
-| Type (4 bytes)   | Version (2 bytes)| Flags (2 bytes) | Compression (2)   |
-+------------------+------------------+-----------------+-------------------+
-| Digest Algo (2)  | Offset (8 bytes) | Length (8 bytes)| SHA256 digest (32)|
-+------------------+------------------+-----------------+-------------------+
-```
-
-Known section types:
-
-| Type | Name | Required | Description |
-|------|----------|----------|-------------|
-| 1 | manifest | Yes | JSON manifest metadata |
-| 2 | nodes | Yes | Binary Merkle node payloads |
-
-### Section 1 — Manifest (JSON)
-
-The manifest describes the bundle's semantics, exports, and schema. Key fields:
-
-| Field | Value | Description |
-|-------|-------|-------------|
-| `schema` | `"arboricx.bundle.manifest.v1"` | Manifest schema version |
-| `bundleType` | `"tree-calculus-executable-object"` | Bundle category |
-| `tree.calculus` | `"tree-calculus.v1"` | Tree calculus version |
-| `tree.nodeHash.algorithm` | `"sha256"` | Hash algorithm |
-| `tree.nodeHash.domain` | `"arboricx.merkle.node.v1"` | Hash domain string |
-| `tree.nodePayload` | `"arboricx.merkle.payload.v1"` | Payload encoding |
-| `runtime.semantics` | `"tree-calculus.v1"` | Evaluation semantics |
-| `runtime.abi` | `"arboricx.abi.tree.v1"` | Runtime ABI |
-| `closure` | `"complete"` | Bundle must be a complete DAG |
-| `roots` | `[{"hash": "...", "role": "..."}]` | Named root hashes |
-| `exports` | `[{"name": "...", "root": "..."}]` | Export aliases for roots |
-| `metadata.createdBy` | `"arboricx"` | Originator |
-
-### Section 2 — Nodes (Binary)
-
-```
-+------------------+-------------------+-------------------+-----------------+
-| Node Count (8)   | Hash (32 bytes)   | Payload Len (4)   | Payload (N)     |
-+------------------+-------------------+-------------------+-----------------+
-```
-
-Each node entry contains:
-
-- 32-byte Merkle hash (hex-encoded in identifiers, raw in binary)
-- 4-byte big-endian payload length
-- N bytes of serialized node payload (`0x00` for Leaf, `0x01 || hash` for Stem, `0x02 || left || right` for Fork)
-
-### Bundle verification flow
-
-1. Check magic bytes
-2. Validate major version
-3. Parse section directory
-4. For each section: verify SHA256 digest against actual bytes
-5. Decode JSON manifest
-6. Decode binary node entries into Merkle DAG
-7. Verify all root hashes present in manifest exist in node map
-8. Verify export root hashes present
-9. Verify children references are complete (no dangling nodes)
-10. Reject unknown critical sections
-
-### Data types (Wire.hs)
-
-| Type | Purpose |
-|------|---------|
-| `Bundle` | Top-level bundle: version, roots, nodes map, manifest |
-| `BundleManifest` | JSON metadata: schema, tree spec, runtime spec, roots, exports |
-| `TreeSpec` | Tree calculus version + hash algorithm + payload encoding |
-| `NodeHashSpec` | Hash algorithm and domain string |
-| `RuntimeSpec` | Semantics, evaluation order, ABI, capabilities |
-| `BundleRoot` | Root hash + role (`"default"` or `"root"`) |
-| `BundleExport` | Export name + root hash + kind + ABI |
-| `BundleMetadata` | Optional package, version, description, license, createdBy |
-| `ClosureMode` | `ClosureComplete` or `ClosurePartial` |
-
-### Key functions
-
-| Function | Signature | Purpose |
-|----------|-----------|---------|
-| `encodeBundle` | `Bundle → ByteString` | Serialize bundle to wire bytes |
-| `decodeBundle` | `ByteString → Either String Bundle` | Parse wire bytes into Bundle |
-| `verifyBundle` | `Bundle → Either String ()` | Validate DAG, manifest, roots |
-| `collectReachableNodes` | `Connection → MerkleHash → IO [(MerkleHash, ByteString)]` | Traverse DAG from root |
-| `exportBundle` | `Connection → [MerkleHash] → IO ByteString` | Build bundle from content store |
-| `exportNamedBundle` | `Connection → [(Text, MerkleHash)] → IO ByteString` | Build with named roots |
-| `importBundle` | `Connection → ByteString → IO [MerkleHash]` | Import bundle into content store |
## 8. Directory Layout

@@ -273,12 +177,12 @@ tricu/

 ## 9. JS Arboricx Runtime

 A JavaScript implementation of the Arboricx portable bundle runtime lives in `ext/js/`.
-It is a reference implementation — not a tricu source parser. It reads `.tri.bundle` files produced by the Haskell toolchain, verifies Merkle node hashes, reconstructs tree values, and reduces them.
+It is a reference implementation — not a tricu source parser. It reads `.arboricx` files produced by the Haskell toolchain, verifies Merkle node hashes, reconstructs tree values, and reduces them.

 From project root:

 ```bash
-node ext/js/src/cli.js inspect test/fixtures/id.tri.bundle
-node ext/js/src/cli.js run test/fixtures/true.tri.bundle
+node ext/js/src/cli.js inspect test/fixtures/id.arboricx
+node ext/js/src/cli.js run test/fixtures/true.arboricx
 ```

 The JS runtime implements:

@@ -1,339 +0,0 @@

# Arboricx Portable Bundle v1 (CBOR Manifest Profile)

Status: **Draft, implementation-aligned** (derived from `src/Wire.hs` as of 2026-05-07)

This document specifies the **actual on-wire format and validation behavior** currently implemented by `tricu` for Arboricx bundles, with a focus on the newer CBOR manifest path.

---

## 1. Scope

This profile defines:

1. The binary container envelope (header + section directory + section payloads).
2. The CBOR manifest section format.
3. The Merkle node section format.
4. Decode/verify/import behavior in `Wire.hs`.
5. Known gaps and sane resolutions.

Non-goals:

- tricu source parsing/lambda elimination/module semantics.
- Signature systems / trust policy.
- Compression codecs beyond `none`.

---

## 2. Container format

A bundle is a byte stream:

```
[32-byte header]
[section directory: section_count * 60 bytes]
[section payload bytes...]
```

### 2.1 Header (32 bytes)

| Field | Size | Encoding | Value / Notes |
|---|---:|---|---|
| Magic | 8 | raw bytes | `41 52 42 4F 52 49 43 58` (`"ARBORICX"`) |
| Major | 2 | u16 BE | Must be `1` |
| Minor | 2 | u16 BE | Currently `0` |
| SectionCount | 4 | u32 BE | Number of section directory entries |
| Flags | 8 | u64 BE | Currently emitted as `0`; not interpreted |
| DirectoryOffset | 8 | u64 BE | Offset of section directory (currently `32`) |

Reader behavior:
- Reject if total bytes < 32.
- Reject bad magic.
- Reject major != 1.

### 2.2 Section directory entry (60 bytes each)

| Field | Size | Encoding | Notes |
|---|---:|---|---|
| Type | 4 | u32 BE | e.g. 1=manifest, 2=nodes |
| Version | 2 | u16 BE | Currently emitted as `1`; not enforced on read |
| Flags | 2 | u16 BE | bit0 = critical |
| Compression | 2 | u16 BE | `0` = none (required) |
| DigestAlgorithm | 2 | u16 BE | `1` = SHA-256 (required) |
| Offset | 8 | u64 BE | Absolute byte offset |
| Length | 8 | u64 BE | Section payload length |
| Digest | 32 | raw bytes | SHA-256 of section bytes |

Reader behavior:
- Reject unknown **critical** section types.
- Reject compression != 0.
- Reject digest algorithm != 1.
- Reject out-of-bounds sections.
- Reject digest mismatch.

### 2.3 Required section types

| Type | Name | Required |
|---:|---|---|
| 1 | manifest | yes |
| 2 | nodes | yes |

Decode currently rejects duplicate section type 1 or 2.

---

## 3. Manifest section (CBOR)

Manifest bytes are CBOR-encoded map data (using `cborg`).

### 3.1 Top-level manifest schema

The top-level map has **exactly 8 keys**, decoded in this exact order by the current implementation:

1. `schema` (text)
2. `bundleType` (text)
3. `tree` (map)
4. `runtime` (map)
5. `closure` (text: `"complete"|"partial"`)
6. `roots` (array)
7. `exports` (array)
8. `metadata` (map)

> Important: The current decoder is order-strict; it expects keys in this sequence.

### 3.2 Nested structures

#### `tree` map (3 keys, order-strict)
- `calculus`: text
- `nodeHash`: map
- `nodePayload`: text

`nodeHash` map (2 keys, order-strict):
- `algorithm`: text
- `domain`: text

#### `runtime` map (4 keys, order-strict)
- `semantics`: text
- `evaluation`: text
- `abi`: text
- `capabilities`: array(text)

#### `roots` array of maps
Each root map has 2 keys (order-strict):
- `hash`: bytes (raw 32-byte hash payload encoded as a CBOR byte string)
- `role`: text

#### `exports` array of maps
Each export map has 4 keys (order-strict):
- `name`: text
- `root`: bytes (32-byte hash)
- `kind`: text
- `abi`: text

#### `metadata` map
Flexible key set; decoded as map(text -> text), then projected into optional fields:
- `package`
- `version`
- `description`
- `license`
- `createdBy`

Unknown metadata keys are ignored.

### 3.3 Default emitted manifest values

Writers in `Wire.hs` currently emit:

- `schema = "arboricx.bundle.manifest.v1"`
- `bundleType = "tree-calculus-executable-object"`
- `tree.calculus = "tree-calculus.v1"`
- `tree.nodeHash.algorithm = "sha256"`
- `tree.nodeHash.domain = "arboricx.merkle.node.v1"`
- `tree.nodePayload = "arboricx.merkle.payload.v1"`
- `runtime.semantics = "tree-calculus.v1"`
- `runtime.evaluation = "normal-order"`
- `runtime.abi = "arboricx.abi.tree.v1"`
- `runtime.capabilities = []`
- `closure = "complete"`
- `metadata.createdBy = "arboricx"`

---

## 4. Nodes section (binary)

Node section payload layout:

```
node_count: u64 BE
repeat node_count times:
  hash: 32 bytes
  payload_len: u32 BE
  payload: payload_len bytes
```

Node payload grammar:

- `0x00` => Leaf
- `0x01 || child_hash(32)` => Stem
- `0x02 || left_hash(32) || right_hash(32)` => Fork

Section decoder rejects:
- duplicate node hashes,
- truncated entries,
- payload overruns,
- trailing bytes after final node.

---

## 5. Verification behavior (`verifyBundle`)

`verifyBundle` enforces all of:

1. bundle version >= 1.
2. bundle has at least one node.
3. manifest constants match hardcoded v1 values (schema/type/calculus/hash algo/domain/payload/runtime semantics/ABI).
4. runtime capabilities must be empty.
5. closure must be `complete`.
6. manifest has at least one root and one export.
7. root sets in `bundleRoots` and `manifest.roots` must match exactly.
8. each root and export root exists in node map.
9. each node payload deserializes and re-hashes to declared node hash.
10. all referenced child hashes exist.
11. full closure reachability from roots succeeds.

`importBundle` runs decode + verify before storing nodes.

---

## 6. Export/import semantics

### 6.1 Export

`exportNamedBundle`:
- Traverses reachable nodes for each requested root hash.
- Builds node map.
- Builds default manifest and CBOR bytes.
- Emits two sections (manifest, nodes).

`exportBundle` auto-names exports:
- 1 root => `root`
- N>1 => `root0`, `root1`, ...

### 6.2 Import

`importBundle`:
1. Decode bundle.
2. Verify bundle.
3. Insert all node payloads into content store.
4. For each manifest export: reconstruct tree by export root and store name binding in DB.
5. Return bundle root list.

---

## 7. Determinism properties

The current implementation is deterministic for identical logical input because:
- The node map is serialized in ascending hash order (`Map.toAscList`).
- Field order in manifest encoding is fixed by code.
- Section ordering is fixed: manifest then nodes.

So repeated exports of the same roots produce byte-identical bundles.

---

## 8. Known gaps and sane resolutions

These are important design gaps visible from current code.

### Gap A: Node hash domain mismatch risk (critical)

Status: **resolved in current codebase**.

What was wrong:
- Manifest declared `tree.nodeHash.domain = "arboricx.merkle.node.v1"`.
- Hashing implementation previously used `"tricu.merkle.node.v1"`.

Current state:
- Haskell hashing now uses `"arboricx.merkle.node.v1"`.
- JS reference runtime hashing now uses `"arboricx.merkle.node.v1"`.
- JS manifest validation now requires `"arboricx.merkle.node.v1"`.

Remaining recommendation:
- Keep hash-domain constants centralized/shared to prevent future drift.
- Add explicit test vectors for Leaf/Stem/Fork hashes under the Arboricx domain.

### Gap B: CBOR decode is order-strict, not generic-map tolerant

Observed:
- Decoder expects exact key order for most maps.

Impact:
- Another canonical CBOR writer that reorders keys may fail to decode even if semantically equivalent.

Sane resolution:
- For v1 compatibility, decode maps as unordered key/value collections, require key presence and types, and reject unknown keys only where desired.
- Keep the writer deterministic, but relax the reader.

### Gap C: "Canonical CBOR" claim is stronger than implementation

Observed:
- The writer uses a fixed order but does not explicitly sort keys per RFC 8949 canonical ordering rules.

Sane resolution:
- Either (a) rename this a "deterministic CBOR" profile, or (b) implement explicit canonical key ordering and canonical-length/minimal-integer form checks.

### Gap D: Extra section preservation

Observed:
- The decoder tolerates unknown non-critical sections, but the `Bundle` model/encoder drops them on re-encode.

Sane resolution:
- Add `bundleExtraSections :: [SectionEntry+Bytes]` if round-trip preservation is desired.

### Gap E: Section version not enforced

Observed:
- Section entry `Version` is parsed but unused.

Sane resolution:
- Enforce a known version matrix (e.g., manifest v1, nodes v1), or explicitly document it as "advisory only".

### Gap F: Runtime capability policy is hard fail

Observed:
- Any non-empty capabilities list is rejected.

Sane resolution:
- Keep strict for now, but define a capability negotiation strategy for v1.1+ (unknown capabilities => reject unless explicitly allowed by host policy).

### Gap G: Error handling style in import/export path

Observed:
- Several paths throw `error` for malformed data/store misses.

Sane resolution:
- Return `Either`-style typed errors through the public API (`decode`, `verify`, `import`), reserving exceptions for truly internal faults.

---

## 9. Conformance checklist (v1 current)

A conforming v1 reader/writer for this profile should:

- Implement the 32-byte header and 60-byte section records exactly.
- Support required sections 1 and 2.
- Verify section digests with SHA-256.
- Decode/encode manifest CBOR matching the field model above.
- Parse the nodes section and validate node payload structure.
- Recompute and verify node hashes.
- Enforce complete closure for roots.
- Enforce the manifest/runtime constants used by v1.

---

## 10. Suggested follow-up docs

To stabilize interoperability, add:

1. `docs/arboricx-bundle-test-vectors.md` (golden header/manifest/nodes + expected hashes).
2. `docs/arboricx-bundle-errors.md` (normative error codes/strings).
3. `docs/arboricx-bundle-evolution.md` (rules for minor/major upgrades, capability negotiation, extra sections).

docs/arboricx-bundle-format.md (new file, 419 lines)

@@ -0,0 +1,419 @@

# Arboricx Portable Bundle Format Specification

**Version:** 0.1
**Status:** Exploratory
**Author:** A range of slopmachines guided by James Eversole
**Human Review Status:** 5-minute scan-through - this is an evolving and malleable document

The Arboricx Portable Bundle is a self-contained, content-addressed binary format for distributing Tree Calculus programs and their associated Merkle DAGs. It provides:

- A fixed binary container with header, section directory, and typed sections
- A language-neutral Merkle node layer for content-addressed tree values
- A fixed-order binary manifest for semantic metadata, exports, and optional extensions

## Table of Contents

1. [Top-Level Container Layout](#1-top-level-container-layout)
2. [Header](#2-header)
3. [Section Directory](#3-section-directory)
4. [Section: Manifest (type 1)](#4-section-manifest-type-1)
5. [Section: Nodes (type 2)](#5-section-nodes-type-2)
6. [Merkle Node Payload Format](#6-merkle-node-payload-format)
7. [Merkle Hash Computation](#7-merkle-hash-computation)
8. [Tree Calculus Reduction Semantics](#8-tree-calculus-reduction-semantics)
9. [Binary Primitives](#9-binary-primitives)
10. [Bundle Verification](#10-bundle-verification)
11. [Known Section Types](#11-known-section-types)

---

## 1. Top-Level Container Layout

An Arboricx bundle is a flat binary blob with the following layout:

```
+------------------+------------------+------------------+------------------+
| Header           | Section Directory| Manifest Section | Nodes Section    |
| (32 bytes)       | (N × 60 bytes)   | (variable)       | (variable)       |
+------------------+------------------+------------------+------------------+
```

The container uses **big-endian** byte order for all multi-byte integers.

Total bundle size = 32 + (sectionCount × 60) + manifestSize + nodesSize

---

## 2. Header

| Offset | Size | Field | Description |
|--------|------|-------|-------------|
| 0 | 8 bytes | Magic | ASCII `"ARBORICX"` (`0x41 0x52 0x42 0x4F 0x52 0x49 0x43 0x58`) |
| 8 | 2 bytes | Major version | `u16` BE. Currently `1` |
| 10 | 2 bytes | Minor version | `u16` BE. Currently `0` |
| 12 | 4 bytes | Section count | `u32` BE. Number of entries in the section directory |
| 16 | 8 bytes | Flags | `u64` BE. Reserved; currently all zeros |
| 24 | 8 bytes | Directory offset | `u64` BE. Byte offset from the start of the bundle to the section directory |

**Constraints:**
- Major version must be `1`. Bundles with unsupported major versions are rejected.
- The directory offset must point to a valid location within the bundle.
- The directory offset is always `32` for bundles with the current layout (header immediately followed by the directory).
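As a concrete illustration of the header layout, here is a minimal parser sketch. Python is used as a neutral illustration language; this is not code from the tricu repository:

```python
import struct

MAGIC = b"ARBORICX"

def parse_header(buf: bytes) -> dict:
    """Parse the 32-byte Arboricx bundle header (all integers big-endian)."""
    if len(buf) < 32:
        raise ValueError("bundle shorter than 32-byte header")
    if buf[0:8] != MAGIC:
        raise ValueError("bad magic")
    # u16 major, u16 minor, u32 section count, u64 flags, u64 directory offset.
    major, minor, section_count, flags, dir_offset = struct.unpack(">HHIQQ", buf[8:32])
    if major != 1:
        raise ValueError(f"unsupported major version {major}")
    return {"major": major, "minor": minor, "section_count": section_count,
            "flags": flags, "directory_offset": dir_offset}
```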
---

## 3. Section Directory

The section directory is an array of `N` entries, where `N` is the section count from the header. Each entry is exactly **60 bytes**.

| Offset (within entry) | Size | Field | Description |
|----------------------|------|-------|-------------|
| 0 | 4 bytes | Type | `u32` BE. Section type identifier (see [Known Section Types](#11-known-section-types)) |
| 4 | 2 bytes | Version | `u16` BE. Section-specific version |
| 6 | 2 bytes | Flags | `u16` BE. Bit flags: bit 0 (`0x0001`) = critical section |
| 8 | 2 bytes | Compression | `u16` BE. Compression codec (currently only `0` = none) |
| 10 | 2 bytes | Digest algorithm | `u16` BE. Hash algorithm (currently only `1` = SHA-256) |
| 12 | 8 bytes | Offset | `u64` BE. Byte offset from the start of the bundle to the section data |
| 20 | 8 bytes | Length | `u64` BE. Length of the section data in bytes |
| 28 | 32 bytes | SHA-256 digest | Raw digest of the section data |

**Verification:**
- Unknown critical sections (flags & `0x0001`) are rejected.
- Compression must be `0` (none).
- Digest algorithm must be `1` (SHA-256).
- The SHA-256 digest in the directory entry must match `SHA256(section_data)`.
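The entry layout and verification rules above can be sketched as follows (an illustrative sketch, not the reference implementation):

```python
import hashlib
import struct

def parse_section_entry(entry: bytes) -> dict:
    """Decode one 60-byte section directory entry (all integers big-endian)."""
    if len(entry) != 60:
        raise ValueError("section directory entry must be 60 bytes")
    stype, version, flags, compression, digest_algo, offset, length = \
        struct.unpack(">IHHHHQQ", entry[:28])
    return {"type": stype, "version": version, "flags": flags,
            "compression": compression, "digest_algo": digest_algo,
            "offset": offset, "length": length, "digest": entry[28:60]}

def verify_section(bundle: bytes, entry: dict) -> None:
    """Enforce the codec, bounds, and digest constraints listed above."""
    if entry["compression"] != 0:
        raise ValueError("unsupported compression")
    if entry["digest_algo"] != 1:
        raise ValueError("unsupported digest algorithm")
    data = bundle[entry["offset"]:entry["offset"] + entry["length"]]
    if len(data) != entry["length"]:
        raise ValueError("section out of bounds")
    if hashlib.sha256(data).digest() != entry["digest"]:
        raise ValueError("section digest mismatch")
```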
---

## 4. Section: Manifest (type 1)

The manifest is a binary encoding of bundle metadata. It uses a **fixed-order core** layout followed by an optional **TLV tail** for extensibility.

### 4.1 Format

```
Manifest =
  magic               8 bytes    "ARBMNFST"
  major               u16 BE     Manifest major version (1)
  minor               u16 BE     Manifest minor version (0)

  schema              string     Length-prefixed UTF-8 text
  bundleType          string     Length-prefixed UTF-8 text

  treeCalculus        string     Length-prefixed UTF-8 text
  treeHashAlgorithm   string     Length-prefixed UTF-8 text
  treeHashDomain      string     Length-prefixed UTF-8 text
  treeNodePayload     string     Length-prefixed UTF-8 text

  runtimeSemantics    string     Length-prefixed UTF-8 text
  runtimeEvaluation   string     Length-prefixed UTF-8 text
  runtimeAbi          string     Length-prefixed UTF-8 text
  capabilityCount     u32 BE     Number of capability strings
  capabilities        string[]   Array of length-prefixed UTF-8 capability strings

  closure             u8         0 = complete, 1 = partial
  rootCount           u32 BE     Number of root entries
  roots               Root[]     Array of root entries
  exportCount         u32 BE     Number of export entries
  exports             Export[]   Array of export entries

  metadataFieldCount  u32 BE     Number of metadata TLV entries
  metadataFields      TLV[]      Metadata tag-value entries
  extensionFieldCount u32 BE     Number of extension TLV entries
  extensionFields     TLV[]      Extension tag-value entries (skipped by parsers)
```

**There must be no trailing bytes after the manifest** (no leftover data).

### 4.2 String Format

Every `string` field uses the same encoding:

```
string =
  length  u32 BE        Number of UTF-8 bytes in the string (not the number of characters)
  bytes   byte[length]  UTF-8 encoded string content
```

The length field carries the byte count, so parsers can skip strings without decoding UTF-8.
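A minimal read/write sketch of this encoding (illustrative only; helper names are not from the repository):

```python
import struct

def read_string(buf: bytes, pos: int):
    """Read one length-prefixed UTF-8 string; return (text, position after it)."""
    (length,) = struct.unpack_from(">I", buf, pos)   # u32 BE byte count
    start = pos + 4
    return buf[start:start + length].decode("utf-8"), start + length

def write_string(text: str) -> bytes:
    """Encode a string as a u32 BE byte count followed by UTF-8 bytes."""
    raw = text.encode("utf-8")
    return struct.pack(">I", len(raw)) + raw
```

Note that for multi-byte characters the length prefix differs from the character count, which is exactly why the spec stores bytes, not characters.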
### 4.3 Root Entry

```
Root =
  hash  32 bytes  Raw SHA-256 hash of the Merkle node
  role  string    Length-prefixed UTF-8 text ("default" for the first root, "root" for others)
```

The hash is stored as **raw bytes** (not hex-encoded). It corresponds to the Merkle hash of the node.

### 4.4 Export Entry

```
Export =
  name  string    Length-prefixed UTF-8 text (export identifier)
  root  32 bytes  Raw SHA-256 hash of the Merkle node
  kind  string    Length-prefixed UTF-8 text (currently "term")
  abi   string    Length-prefixed UTF-8 text (ABI string)
```

### 4.5 TLV Entry

```
TLV =
  tag     u16 BE        Tag identifier (type)
  length  u32 BE        Number of bytes in the value
  value   byte[length]  Raw bytes
```

TLV entries support variable-length values and are skippable by parsers that do not recognize a tag: read the `u32` length and advance by `2 + 4 + length` bytes.
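The skip rule can be written directly (a sketch; the function name is illustrative):

```python
import struct

def read_tlv(buf: bytes, pos: int):
    """Read one TLV entry; return (tag, value, position after the entry)."""
    tag, length = struct.unpack_from(">HI", buf, pos)  # u16 tag, u32 value length
    start = pos + 6                                    # 2 (tag) + 4 (length)
    return tag, buf[start:start + length], start + length
```

An unrecognized tag is handled the same way: the returned position already skips `2 + 4 + length` bytes, so the parser simply ignores the value.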

### 4.6 Metadata Tags

| Tag | Name | Value |
|-----|------|-------|
| 1 | package | UTF-8 text: package name |
| 2 | version | UTF-8 text: version string |
| 3 | description | UTF-8 text: description |
| 4 | license | UTF-8 text: license identifier or text |
| 5 | createdBy | UTF-8 text: creator identifier |

Unknown metadata tags are ignored. Unknown extension tags are skipped by length.

### 4.7 Semantic Constraints

A valid bundle manifest must satisfy:

| Constraint | Value |
|-----------|-------|
| `schema` | `"arboricx.bundle.manifest.v1"` |
| `bundleType` | `"tree-calculus-executable-object"` |
| `treeCalculus` | `"tree-calculus.v1"` |
| `treeHashAlgorithm` | `"sha256"` |
| `treeHashDomain` | `"arboricx.merkle.node.v1"` |
| `treeNodePayload` | `"arboricx.merkle.payload.v1"` |
| `runtimeSemantics` | `"tree-calculus.v1"` |
| `runtimeAbi` | `"arboricx.abi.tree.v1"` |
| `runtimeCapabilities` | Empty array |
| `closure` | `0` (complete) |
| `rootCount` | At least 1 |
| `exportCount` | At least 1 |
| Export names | Non-empty |
| Export roots | Non-empty (32 bytes each) |

---

## 5. Section: Nodes (type 2)

The nodes section contains all Merkle DAG nodes referenced by the manifest. It is a sequence of node entries preceded by a count.

```
NodesSection =
  nodeCount  u64 BE  Total number of node entries
  entries    NodeEntry[]
```

Each node entry:

```
NodeEntry =
  hash        32 bytes          Raw SHA-256 hash of this node
  payloadLen  u32 BE            Length of the payload in bytes
  payload     byte[payloadLen]  Node payload (see Section 6)
```

The node count is `u64` to support large bundles. Entries are stored in the order produced by the exporter (typically sorted by hash for determinism).
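A parser sketch for this section, including the duplicate and trailing-byte checks that the verification section below calls for (illustrative; not the reference decoder):

```python
import struct

def parse_nodes_section(data: bytes) -> dict:
    """Parse the nodes section into a hash -> payload map, rejecting duplicates."""
    (count,) = struct.unpack_from(">Q", data, 0)  # u64 BE node count
    pos, nodes = 8, {}
    for _ in range(count):
        node_hash = data[pos:pos + 32]
        (payload_len,) = struct.unpack_from(">I", data, pos + 32)
        payload = data[pos + 36:pos + 36 + payload_len]
        if len(node_hash) != 32 or len(payload) != payload_len:
            raise ValueError("truncated node entry")
        if node_hash in nodes:
            raise ValueError("duplicate node hash")
        nodes[node_hash] = payload
        pos += 36 + payload_len
    if pos != len(data):
        raise ValueError("trailing bytes after final node")
    return nodes
```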

---

## 6. Merkle Node Payload Format

Each node in the Merkle DAG is one of three types. The payload is a single type-tag byte followed by hash references:

### Leaf

```
Payload = 0x00
```

A leaf has no children. The payload is exactly 1 byte.

### Stem

```
Payload = 0x01 || child_hash (32 bytes raw)
```

A stem has exactly one child. The payload is 33 bytes.

### Fork

```
Payload = 0x02 || left_hash (32 bytes raw) || right_hash (32 bytes raw)
```

A fork has exactly two children. The payload is 65 bytes.

**Validation:**
- Leaf payloads must be exactly 1 byte (`0x00`).
- Stem payloads must be exactly 33 bytes.
- Fork payloads must be exactly 65 bytes.
- Unknown type bytes are rejected.
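The three payload shapes and their validation rules can be sketched in one decoder (illustrative):

```python
def decode_payload(payload: bytes):
    """Decode a node payload into ('leaf',), ('stem', child) or ('fork', left, right)."""
    if payload == b"\x00":
        return ("leaf",)
    if payload[:1] == b"\x01" and len(payload) == 33:
        return ("stem", payload[1:33])
    if payload[:1] == b"\x02" and len(payload) == 65:
        return ("fork", payload[1:33], payload[33:65])
    raise ValueError("invalid node payload")
```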
---

## 7. Merkle Hash Computation

Each node is identified by a SHA-256 hash of its canonical payload:

```
hash = SHA256( domain_tag || 0x00 || payload )
```

Where:

| Component | Value |
|-----------|-------|
| `domain_tag` | `"arboricx.merkle.node.v1"` as UTF-8 bytes |
| Separator | `0x00` (one zero byte) |
| `payload` | The node's canonical serialization from Section 6 |

**Examples:**

- **Leaf:** `SHA256("arboricx.merkle.node.v1" || 0x00 || 0x00)`
- **Stem:** `SHA256("arboricx.merkle.node.v1" || 0x00 || 0x01 || child_hash_bytes)`
- **Fork:** `SHA256("arboricx.merkle.node.v1" || 0x00 || 0x02 || left_hash_bytes || right_hash_bytes)`

The resulting SHA-256 hash is stored as raw 32-byte values in the manifest and nodes sections; identifiers elsewhere (e.g., in the content store) use the hex-encoded form (64 hex characters).
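The domain-separated hash is a one-liner; this sketch builds a leaf, a stem over it, and a fork:

```python
import hashlib

DOMAIN = b"arboricx.merkle.node.v1"

def node_hash(payload: bytes) -> bytes:
    """hash = SHA256(domain_tag || 0x00 || payload)"""
    return hashlib.sha256(DOMAIN + b"\x00" + payload).digest()

leaf = node_hash(b"\x00")                # Leaf
stem = node_hash(b"\x01" + leaf)         # Stem whose child is the leaf
fork = node_hash(b"\x02" + leaf + leaf)  # Fork with two leaf children
```

The domain prefix ensures these hashes can never collide with hashes computed under a different scheme for the same payload bytes.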
---

## 8. Tree Calculus Reduction Semantics

The bundle represents a **Tree Calculus** term as a Merkle DAG. The reduction rules are:

### Apply Rules

```
apply(Fork(Leaf, a), _)                = a
apply(Fork(Stem(a), b), c)             = apply(apply(a, c), apply(b, c))
apply(Fork(Fork(w, x), y), Leaf)       = w
apply(Fork(Fork(w, x), y), Stem(u))    = apply(x, u)
apply(Fork(Fork(w, x), y), Fork(u, v)) = apply(apply(y, u), v)
apply(Leaf, b)                         = Stem(b)
apply(Stem(a), b)                      = Fork(a, b)
```

### Internal Representation

In the reduction engine, Fork nodes use a `[right, left]` (stack) ordering:
- `Fork = [right_child, left_child]`
- `Stem = [child]`
- `Leaf = []`

This ordering supports stack-based reduction: pop two terms, apply, push results back.

### Closure

The bundle declares `closure = "complete"`, meaning all nodes reachable from export roots are present in the nodes section. No external references exist.
---
|
||||
|
||||
## 9. Binary Primitives

All multi-byte integers use **big-endian** byte order.

### u16 (2 bytes)

```
byte[0] | byte[1]
value = (byte[0] << 8) | byte[1]
```

### u32 (4 bytes)

```
byte[0] | byte[1] | byte[2] | byte[3]
value = (byte[0] << 24) | (byte[1] << 16) | (byte[2] << 8) | byte[3]
```

### u64 (8 bytes)

```
byte[0] ... byte[7]
value = (byte[0] << 56) | ... | byte[7]
```

### u8 (1 byte)

A single byte, value `0-255`.

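These formulas translate directly into code; a sketch of standalone readers (Node's `Buffer` also ships `readUInt16BE`/`readUInt32BE`/`readBigUInt64BE` for the same job):

```javascript
// value = (byte[0] << 8) | byte[1]
function readU16BE(bytes, off = 0) {
  return (bytes[off] << 8) | bytes[off + 1];
}

// value = (byte[0] << 24) | ... | byte[3]; ">>> 0" keeps the result unsigned.
function readU32BE(bytes, off = 0) {
  return ((bytes[off] << 24) | (bytes[off + 1] << 16) | (bytes[off + 2] << 8) | bytes[off + 3]) >>> 0;
}

// u64 can exceed Number.MAX_SAFE_INTEGER, so accumulate into a BigInt.
function readU64BE(bytes, off = 0) {
  let v = 0n;
  for (let i = 0; i < 8; i++) v = (v << 8n) | BigInt(bytes[off + i]);
  return v;
}
```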
---
## 10. Bundle Verification

A complete bundle verification proceeds in this order:

1. **Magic check:** First 8 bytes must be `"ARBORICX"`.
2. **Version check:** Major version must be `1`.
3. **Section directory:** Parse all entries; reject unknown critical sections.
4. **Digest verification:** For each section, compute `SHA256(section_data)` and compare with the digest in the directory entry.
5. **Manifest parsing:** Decode the fixed-order manifest; validate semantic constraints.
6. **Node section:** Parse all node entries; reject duplicates.
7. **Root verification:** All root hashes from the manifest must exist in the node map.
8. **Export verification:** All export root hashes must exist in the node map.
9. **Node hash verification:** For each node, compute `SHA256(domain || 0x00 || payload)` and compare with the stored hash.
10. **Children verification:** For each Stem/Fork node, every child hash must exist in the node map.
11. **Closure verification:** Starting from each root hash, traverse the DAG and confirm all reachable nodes are present.

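The first steps can be sketched as follows (a hypothetical `checkHeaderAndDigest` helper, not the project's actual verifier):

```javascript
import { createHash } from "node:crypto";

// Steps 1, 2, and 4: magic, major version, and one section digest.
function checkHeaderAndDigest(bundle, sectionData, expectedDigest) {
  if (bundle.toString("utf-8", 0, 8) !== "ARBORICX") {
    throw new Error("bad magic");
  }
  if (bundle.readUInt16BE(8) !== 1) {
    throw new Error("unsupported major version");
  }
  const digest = createHash("sha256").update(sectionData).digest();
  if (!digest.equals(expectedDigest)) {
    throw new Error("section digest mismatch");
  }
}
```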
---
## 11. Known Section Types

| Type | Name | Required | Version | Description |
|------|------|----------|---------|-------------|
| 1 | Manifest | Yes | 1 | Bundle metadata in fixed-order binary format |
| 2 | Nodes | Yes | 1 | Merkle DAG node entries |

Unknown section types are permitted if not marked as critical (flags bit 0 is not set).

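The criticality check reduces to one bit test (sketch; `knownTypes` is an assumed parameter name):

```javascript
// Flags bit 0 marks a section as critical: unknown + critical => reject.
function mustReject(sectionType, flags, knownTypes) {
  const isKnown = knownTypes.has(sectionType);
  const isCritical = (flags & 0x1) !== 0;
  return !isKnown && isCritical;
}
```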
---
## Appendix A: Complete Example Layout (id.arboricx)

A minimal `id.arboricx` bundle has:

```
+---------------------------------------------------+
| Header (32 bytes)                                 |
|   Magic: "ARBORICX"                               |
|   Major: 1, Minor: 0                              |
|   Section count: 2                                |
|   Flags: 0                                        |
|   Dir offset: 32                                  |
+---------------------------------------------------+
| Section Directory (120 bytes = 2 × 60)            |
|   Entry 0: type=1 (manifest), offset=152, len=375 |
|   Entry 1: type=2 (nodes), offset=527, len=284    |
+---------------------------------------------------+
| Manifest Section (375 bytes)                      |
|   Magic: "ARBMNFST"                               |
|   Version: 1.0                                    |
|   Core strings (schema, bundleType, tree spec,    |
|   runtime spec, capabilities, closure, roots,     |
|   exports, metadata TLVs, extension fields)       |
+---------------------------------------------------+
| Nodes Section (284 bytes)                         |
|   Node count: 2                                   |
|   Node entry 1: hash + payload (Leaf)             |
|   Node entry 2: hash + payload (Fork)             |
+---------------------------------------------------+
```

The manifest section starts at byte 152 (0x98) and the nodes section at byte 527 (0x20F).

---
## Appendix B: File Extension

Bundles produced by the `tricu` tool use the `.arboricx` file extension. The `.tri` extension is used for plain source files; the `.arboricx` extension identifies the portable binary format.

@@ -18,12 +18,12 @@
 * Offset        8B u64 BE
 * Length        8B u64 BE
 * SHA256Digest  32B raw
 * Manifest: canonical CBOR-encoded map (cborg output from Haskell)
 * Manifest: fixed-order core + TLV tail (ARBMNFST magic)
 * Nodes: binary section
 */

import { createHash } from "node:crypto";
import { decodeCbor } from "./cbor.js";
import { decodeManifest } from "./manifest.js";

// ── Constants ───────────────────────────────────────────────────────────────

@@ -173,37 +173,12 @@ export function parseBundle(buffer) {
}

/**
 * Post-process a CBOR-decoded manifest to normalize hash fields
 * from raw bytes to hex strings (matching the old JSON wire format).
 */
function normalizeManifest(raw) {
  const tree = raw.tree;
  if (tree && tree.nodeHash && tree.nodeHash.domain) {
    tree.nodeHash.domain = tree.nodeHash.domain;
  }

  // Convert root hashes from raw bytes to hex
  const roots = (raw.roots || []).map((r) => ({
    ...r,
    hash: r.hash instanceof Uint8Array ? Buffer.from(r.hash).toString("hex") : r.hash,
  }));

  // Convert export root hashes from raw bytes to hex
  const exports = (raw.exports || []).map((e) => ({
    ...e,
    root: e.root instanceof Uint8Array ? Buffer.from(e.root).toString("hex") : e.root,
  }));

  return { ...raw, roots, exports };
}

/**
 * Convenience: parse and return the manifest from CBOR.
 * Convenience: parse and return the manifest from the fixed-order binary format.
 */
export function parseManifest(buffer) {
  const bundle = parseBundle(buffer);
  const manifestEntry = bundle.sections.get(SECTION_MANIFEST);
  return normalizeManifest(decodeCbor(manifestEntry.data));
  return decodeManifest(manifestEntry.data);
}

/**

@@ -1,130 +0,0 @@
/**
 * cbor.js — Minimal CBOR decoder for the Arboricx manifest format.
 *
 * Decodes the canonical CBOR produced by the Haskell cborg library:
 *   - Maps: major type 5 (0xa0 + length)
 *   - Arrays: major type 4 (0x80 + length)
 *   - Text strings: major type 3, UTF-8 encoded
 *   - Byte strings: major type 2
 *   - Unsigned ints: major type 0
 *   - Simple values: 0xf4 = false, 0xf5 = true
 *
 * Only covers the subset needed for the manifest.
 */

// ── Decoding state ──────────────────────────────────────────────────────────

/**
 * @param {Buffer} data
 * @returns {object} stateful decoder over `data`
 */
function makeDecoder(data) {
  let offset = 0;

  return {
    /** @returns {number} current offset */
    getPos() { return offset; },

    /** @returns {number} remaining bytes */
    remaining() { return data.length - offset; },

    /** @returns {number} total length */
    length() { return data.length; },

    /** Read N bytes and advance */
    read(n) {
      if (offset + n > data.length) {
        throw new Error(`CBOR read: expected ${n} bytes, ${data.length - offset} remaining at offset ${offset}`);
      }
      const slice = data.slice(offset, offset + n);
      offset += n;
      return slice;
    },

    /** Read a single byte */
    readByte() {
      if (offset >= data.length) {
        throw new Error(`CBOR readByte: no bytes remaining at offset ${offset}`);
      }
      return data[offset++];
    },
  };
}

// ── CBOR helpers ────────────────────────────────────────────────────────────

/**
 * Read a CBOR length (the initial byte encodes the length for values < 24).
 * For 24+, reads additional bytes per spec.
 * @returns {number}
 */
function cborReadLength(dec, startByte) {
  const additional = startByte & 0x1f;
  if (additional < 24) return additional;
  if (additional === 24) return dec.read(1)[0];
  if (additional === 25) return dec.read(2).readUint16BE(0);
  if (additional === 26) return dec.read(4).readUint32BE(0);
  throw new Error(`CBOR: unsupported additional info ${additional}`);
}

// ── Top-level decode ────────────────────────────────────────────────────────

/**
 * Decode a single CBOR value from buffer bytes.
 * @param {Buffer} buf
 * @returns {*}
 */
export function decodeCbor(buf) {
  const dec = makeDecoder(buf);
  const result = cborDecode(dec);
  return result;
}

function cborDecode(dec) {
  const first = dec.readByte();
  const major = (first >> 5) & 0x07;
  const info = first & 0x1f;

  switch (major) {
    case 0: // unsigned int
    case 1: // negative int
      return cborReadLength(dec, first);

    case 2: // byte string
      return dec.read(cborReadLength(dec, first));

    case 3: // text string (UTF-8)
      const len = cborReadLength(dec, first);
      return dec.read(len).toString("utf-8");

    case 4: // array
      const arrLen = cborReadLength(dec, first);
      const arr = [];
      for (let i = 0; i < arrLen; i++) {
        arr.push(cborDecode(dec));
      }
      return arr;

    case 5: // map
      const mapLen = cborReadLength(dec, first);
      const map = {};
      for (let i = 0; i < mapLen; i++) {
        const key = cborDecode(dec);
        const val = cborDecode(dec);
        map[key] = val;
      }
      return map;

    case 7: // simple values / floats
      if (info === 20) return false;
      if (info === 21) return true;
      if (info === 22) return null; // null
      if (info === 23) return null; // undefined (decoded as null)
      // 0xf9-fb are half/float/double floats — not used by our writer
      throw new Error(`CBOR: unsupported simple value ${info}`);

    default:
      // Tags (major 6) and break (0xff) — not used in our manifest
      throw new Error(`CBOR: unsupported major type ${major}, info ${info}`);
  }
}

@@ -1,13 +1,220 @@
/**
 * manifest.js — Minimal manifest parsing and export lookup.
 * manifest.js — Fixed-order manifest parsing and export lookup.
 *
 * The manifest is a JSON object with fields:
 *   schema, bundleType, tree, runtime, closure, roots, exports,
 *   imports, sections, metadata
 * The manifest binary format (ManifestV1):
 *   magic(8) + major(u16) + minor(u16)
 *   + schema(string) + bundleType(string)
 *   + treeCalculus(string) + treeHashAlgorithm(string) + treeHashDomain(string) + treeNodePayload(string)
 *   + runtimeSemantics(string) + runtimeEvaluation(string) + runtimeAbi(string)
 *   + capabilityCount(u32) + capabilities(string[])
 *   + closure(u8)
 *   + rootCount(u32) + roots[]
 *   + exportCount(u32) + exports[]
 *   + metadataFieldCount(u32) + metadataTLVs[]
 *   + extensionFieldCount(u32) + extensionTLVs[]
 *
 * We parse only what we need for runtime entrypoint selection.
 * String format: u32 BE length + UTF-8 bytes.
 * Root: 32 bytes raw hash + role(string).
 * Export: name(string) + 32 bytes raw root hash + kind(string) + abi(string).
 * TLV: u16 tag + u32 length + value bytes.
 */

// ── Constants ───────────────────────────────────────────────────────────────

const MANIFEST_MAGIC = "ARBMNFST";
const MANIFEST_MAJOR = 1;
const MANIFEST_MINOR = 0;

// Metadata TLV tags
const TAG_PACKAGE = 1;
const TAG_VERSION = 2;
const TAG_DESCRIPTION = 3;
const TAG_LICENSE = 4;
const TAG_CREATED_BY = 5;

// Closure bytes
const CLOSURE_COMPLETE = 0;
const CLOSURE_PARTIAL = 1;

// ── Binary helpers ──────────────────────────────────────────────────────────

function u16(buf, off) {
  if (off + 2 > buf.length) throw new Error("manifest: not enough bytes for u16");
  return { value: buf.readUint16BE(off), next: off + 2 };
}

function u32(buf, off) {
  if (off + 4 > buf.length) throw new Error("manifest: not enough bytes for u32");
  return { value: buf.readUint32BE(off), next: off + 4 };
}

function u8(buf, off) {
  if (off >= buf.length) throw new Error("manifest: not enough bytes for u8");
  return { value: buf.readUint8(off), next: off + 1 };
}

/**
 * Read a length-prefixed UTF-8 string: u32 BE length + UTF-8 bytes.
 * Returns { text, next }.
 */
function readStr(buf, off) {
  const { value: len, next: afterLen } = u32(buf, off);
  if (afterLen + len > buf.length) throw new Error("manifest: string extends beyond input");
  return { text: buf.toString("utf-8", afterLen, afterLen + len), next: afterLen + len };
}

/**
 * Read raw bytes of given length.
 * Returns { value, next }.
 */
function readRaw(buf, off, n) {
  if (off + n > buf.length) throw new Error(`manifest: not enough bytes for ${n}-byte read`);
  return { value: buf.slice(off, off + n), next: off + n };
}

// ── Manifest decoder ────────────────────────────────────────────────────────

/**
 * Decode the manifest binary from a Buffer.
 *
 * Returns a normalized manifest object matching the shape expected
 * by validateManifest / selectExport.
 */
export function decodeManifest(buf) {
  let off = 0;

  // Magic (8 bytes)
  const magic = buf.toString("utf-8", 0, 8);
  if (magic !== MANIFEST_MAGIC) {
    throw new Error(`invalid manifest magic: expected ${MANIFEST_MAGIC}, got "${magic}"`);
  }
  off = 8;

  // Version
  const { value: major } = u16(buf, off);
  if (major !== MANIFEST_MAJOR) throw new Error(`unsupported manifest major version: ${major}`);
  off += 4; // u16 major + u16 minor

  // Helper: read length-prefixed text
  const readText = () => {
    const { text, next } = readStr(buf, off);
    off = next;
    return text;
  };

  // Core strings
  const schema = readText();
  const bundleType = readText();
  const treeCalculus = readText();
  const treeHashAlgorithm = readText();
  const treeHashDomain = readText();
  const treeNodePayload = readText();
  const runtimeSemantics = readText();
  const runtimeEvaluation = readText();
  const runtimeAbi = readText();

  // Capabilities (u32 count + string[])
  const { value: capCount } = u32(buf, off);
  off += 4;
  const capabilities = [];
  for (let i = 0; i < capCount; i++) {
    capabilities.push(readText());
  }

  // Closure (u8)
  const { value: closureByte } = u8(buf, off);
  off += 1;
  const closure = closureByte === CLOSURE_COMPLETE ? "complete" : "partial";

  // Roots (u32 count + Root[])
  // Root: 32 bytes raw hash + role(string)
  const { value: rootCount } = u32(buf, off);
  off += 4;
  const roots = [];
  for (let i = 0; i < rootCount; i++) {
    const { value: hashRaw } = readRaw(buf, off, 32);
    off += 32;
    const { text: role, next: rOff } = readStr(buf, off);
    off = rOff;
    roots.push({ hash: hashRaw.toString("hex"), role });
  }

  // Exports (u32 count + Export[])
  // Export: name(string) + 32 bytes raw root hash + kind(string) + abi(string)
  const { value: exportCount } = u32(buf, off);
  off += 4;
  const exports = [];
  for (let i = 0; i < exportCount; i++) {
    const { text: name, next: nOff } = readStr(buf, off);
    off = nOff;
    const { value: expHashRaw } = readRaw(buf, off, 32);
    off += 32;
    const { text: kind, next: kOff } = readStr(buf, off);
    off = kOff;
    const { text: abi, next: aOff } = readStr(buf, off);
    off = aOff;
    exports.push({ name, root: expHashRaw.toString("hex"), kind, abi });
  }

  // Metadata (u32 count + TLV[])
  // TLV: u16 tag + u32 length + value bytes
  const { value: metaCount } = u32(buf, off);
  off += 4;
  const metadata = {};
  for (let i = 0; i < metaCount; i++) {
    const { value: tag } = u16(buf, off);
    off += 2;
    const { value: tlvLen } = u32(buf, off);
    off += 4;
    const { value: tlvRaw } = readRaw(buf, off, tlvLen);
    off += tlvLen;
    const val = tlvRaw.toString("utf-8");
    switch (tag) {
      case TAG_PACKAGE: metadata.package = val; break;
      case TAG_VERSION: metadata.version = val; break;
      case TAG_DESCRIPTION: metadata.description = val; break;
      case TAG_LICENSE: metadata.license = val; break;
      case TAG_CREATED_BY: metadata.createdBy = val; break;
    }
  }

  // Extensions (u32 count + TLV[] — skip all)
  const { value: extCount } = u32(buf, off);
  off += 4;
  for (let i = 0; i < extCount; i++) {
    const { value: _tag } = u16(buf, off);
    off += 2;
    const { value: tlvLen } = u32(buf, off);
    off += 4;
    off += tlvLen; // skip value
  }

  return {
    schema,
    bundleType,
    tree: {
      calculus: treeCalculus,
      nodeHash: {
        algorithm: treeHashAlgorithm,
        domain: treeHashDomain,
      },
      nodePayload: treeNodePayload,
    },
    runtime: {
      semantics: runtimeSemantics,
      evaluation: runtimeEvaluation,
      abi: runtimeAbi,
      capabilities,
    },
    closure,
    roots,
    exports,
    metadata: Object.keys(metadata).length > 0 ? metadata : undefined,
  };
}

// ── Validation ──────────────────────────────────────────────────────────────

/**
 * Validate the manifest against the runtime profile requirements.
 * Throws on violation.

544 src/Wire.hs
@@ -24,40 +24,22 @@ module Wire
import ContentStore (getNodeMerkle, loadTree, putMerkleNode, storeTerm)
import Research

import Codec.CBOR.Decoding ( Decoder
                           , decodeString
                           , decodeBytes
                           , decodeListLen
                           , decodeMapLen
                           )
import Control.Monad (replicateM, forM)
import Codec.CBOR.Encoding ( Encoding
                           , encodeMapLen
                           , encodeListLen
                           , encodeString
                           , encodeBytes
                           )
import Codec.CBOR.Write (toLazyByteString)
import Data.Monoid (mconcat)
import Codec.CBOR.Read (deserialiseFromBytes, DeserialiseFailure(..))

import Control.Exception (SomeException, evaluate, try)
import Control.Monad (foldM, unless, when)
import Crypto.Hash (Digest, SHA256, hash)
import Data.Bits ((.&.), (.|.), shiftL, shiftR)
import Data.Bits ((.|.), (.&.), shiftL, shiftR)
import Data.ByteArray (convert)
import Data.ByteString (ByteString)
import Data.Foldable (traverse_)
import Data.Map (Map)
import Data.Text (Text, unpack)
import Data.Text.Encoding (decodeUtf8, encodeUtf8)
import Data.Word (Word16, Word32, Word64)
import Data.Text.Encoding (decodeUtf8, decodeUtf8', encodeUtf8)
import Data.Word (Word16, Word32, Word64, Word8)
import Database.SQLite.Simple (Connection)
import GHC.Generics (Generic)

import qualified Data.ByteString as BS
import qualified Data.ByteString.Base16 as Base16
import qualified Data.ByteString.Lazy as BL
import qualified Data.Map as Map
import qualified Data.Set as Set
import qualified Data.Text as T

@@ -91,92 +73,316 @@ compressionNone = 0
digestSha256 = 1

-- ---------------------------------------------------------------------------
-- CBOR encoding helpers
-- Manifest binary constants
-- ---------------------------------------------------------------------------

-- | Canonical CBOR map length encoder.
cmkLen :: Int -> Encoding
cmkLen n = encodeMapLen (fromIntegral n)
-- | Magic prefix identifying the fixed-order manifest v1 format.
manifestMagic :: ByteString
manifestMagic = "ARBMNFST"

-- | Decode a CBOR array of n elements.
decodeListN :: Decoder s a -> Int -> Decoder s [a]
decodeListN dec n = replicateM n dec
-- | Manifest major version.
manifestMajorVersion :: Word16
manifestMajorVersion = 1

-- | Decode a CBOR map (sequence of key-value pairs).
decodeMapN :: Decoder s a -> Decoder s b -> Int -> Decoder s [(a, b)]
decodeMapN keyDec valDec n = forM [1..n] $ \_ ->
  keyDec >>= \k -> valDec >>= \v -> pure (k, v)
-- | Manifest minor version.
manifestMinorVersion :: Word16
manifestMinorVersion = 0

decodeKey :: Text -> Decoder s ()
decodeKey expected = do
  actual <- decodeString
  unless (actual == expected) $
    fail $ "expected key " ++ show expected ++ ", got " ++ show actual
-- | Closure mode to byte.
closureToByte :: ClosureMode -> Word8
closureToByte = \case
  ClosureComplete -> 0
  ClosurePartial -> 1

-- | Canonical CBOR array length encoder.
cakLen :: Int -> Encoding
cakLen n = encodeListLen (fromIntegral n)
closureFromByte :: Word8 -> Either String ClosureMode
closureFromByte = \case
  0 -> Right ClosureComplete
  1 -> Right ClosurePartial
  n -> Left $ "unsupported closure byte: " ++ show n

-- | Encode a canonical CBOR map with key-value pairs as flat sequence.
cmkPairs :: [(Text, Encoding)] -> Encoding
cmkPairs [] = cmkLen 0
cmkPairs kvs = cmkLen (length kvs) <> mconcat [encodeString k <> v | (k, v) <- kvs]

-- | Encode a canonical CBOR array.
cakSeq :: [Encoding] -> Encoding
cakSeq [] = cakLen 0
cakSeq xs = cakLen (length xs) <> mconcat xs

-- | Encode a canonical CBOR text string.
encText :: Text -> Encoding
encText = encodeString

-- | Encode a canonical CBOR byte string.
encBytes :: ByteString -> Encoding
encBytes = encodeBytes
-- | Metadata tag constants.
tagPackage, tagVersion, tagDescription, tagLicense, tagCreatedBy :: Word16
tagPackage = 1
tagVersion = 2
tagDescription = 3
tagLicense = 4
tagCreatedBy = 5

-- ---------------------------------------------------------------------------
-- Data types with CBOR instances
-- Fixed-order manifest binary helpers
-- ---------------------------------------------------------------------------

-- | Encode a UTF-8 text string as: u32 length + UTF-8 bytes.
encodeLengthPrefixedText :: Text -> ByteString
encodeLengthPrefixedText t = encode32 (fromIntegral $ BS.length bs) <> bs
  where bs = encodeUtf8 t

-- | Decode a length-prefixed UTF-8 text string.
-- Returns the decoded Text and the remaining ByteString.
decodeLengthPrefixedText :: ByteString -> Either String (Text, ByteString)
decodeLengthPrefixedText bs =
  case decode32be "text_length" bs of
    Left err -> Left $ "decodeLengthPrefixedText: " ++ err
    Right (len, rest) -> do
      let payloadLen = fromIntegral len
      when (BS.length rest < payloadLen) $
        Left "decodeLengthPrefixedText: string extends beyond input"
      let (textBytes, after) = BS.splitAt payloadLen rest
      case decodeUtf8' textBytes of
        Right txt -> Right (txt, after)
        Left _ -> Left "decodeLengthPrefixedText: invalid UTF-8"

-- | Encode a metadata value as a TLV entry: u16 tag + u32 length + raw bytes.
encodeMetadataTLV :: Word16 -> ByteString -> ByteString
encodeMetadataTLV tag val = encode16 tag <> encode32 (fromIntegral $ BS.length val) <> val

-- ---------------------------------------------------------------------------
-- Fixed-order manifest encoders
-- ---------------------------------------------------------------------------

-- | Encode the entire manifest in fixed-order core + TLV tail layout.
encodeManifest :: BundleManifest -> ByteString
encodeManifest m =
  manifestMagic
    <> encode16 manifestMajorVersion
    <> encode16 manifestMinorVersion
    <> encodeLengthPrefixedText (manifestSchema m)
    <> encodeLengthPrefixedText (manifestBundleType m)
    <> encodeLengthPrefixedText (treeCalculus (manifestTree m))
    <> encodeLengthPrefixedText (nodeHashAlgorithm (treeNodeHash (manifestTree m)))
    <> encodeLengthPrefixedText (nodeHashDomain (treeNodeHash (manifestTree m)))
    <> encodeLengthPrefixedText (treeNodePayload (manifestTree m))
    <> encodeLengthPrefixedText (runtimeSemantics (manifestRuntime m))
    <> encodeLengthPrefixedText (runtimeEvaluation (manifestRuntime m))
    <> encodeLengthPrefixedText (runtimeAbi (manifestRuntime m))
    <> encode32 (fromIntegral $ length (runtimeCapabilities (manifestRuntime m)))
    <> encodeCapabilities (runtimeCapabilities (manifestRuntime m))
    <> BS.pack [closureToByte (manifestClosure m)]
    <> encode32 (fromIntegral $ length (manifestRoots m))
    <> encodeRoots (manifestRoots m)
    <> encode32 (fromIntegral $ length (manifestExports m))
    <> encodeExports (manifestExports m)
    <> encodeMetadataTLVs (manifestMetadata m)
    <> encode32 0 -- zero extension fields

encodeCapabilities :: [Text] -> ByteString
encodeCapabilities caps = mconcat (map encodeLengthPrefixedText caps)

encodeRoots :: [BundleRoot] -> ByteString
encodeRoots = mconcat . map encodeRoot

encodeRoot :: BundleRoot -> ByteString
encodeRoot root =
  merkleHashToRaw (rootHash root)
    <> encodeLengthPrefixedText (rootRole root)

encodeExports :: [BundleExport] -> ByteString
encodeExports = mconcat . map encodeExport

encodeExport :: BundleExport -> ByteString
encodeExport exp =
  encodeLengthPrefixedText (exportName exp)
    <> merkleHashToRaw (exportRoot exp)
    <> encodeLengthPrefixedText (exportKind exp)
    <> encodeLengthPrefixedText (exportAbi exp)

-- | Encode metadata as: u32 field count + TLV entries for present fields.
-- Metadata TLV values are raw UTF-8 bytes; the TLV length already carries size.
encodeMetadataTLVs :: BundleMetadata -> ByteString
encodeMetadataTLVs m =
  let entries = metadataTLVEntries m
  in encode32 (fromIntegral $ length entries) <> encodeTLVs entries

metadataTLVEntries :: BundleMetadata -> [(Word16, ByteString)]
metadataTLVEntries m =
  maybeEntry tagPackage (metadataPackage m)
    ++ maybeEntry tagVersion (metadataVersion m)
    ++ maybeEntry tagDescription (metadataDescription m)
    ++ maybeEntry tagLicense (metadataLicense m)
    ++ maybeEntry tagCreatedBy (metadataCreatedBy m)
  where
    maybeEntry _ Nothing = []
    maybeEntry tag (Just value) = [(tag, encodeUtf8 value)]

encodeTLVs :: [(Word16, ByteString)] -> ByteString
encodeTLVs tlvs = mconcat (map (uncurry encodeMetadataTLV) tlvs)

-- ---------------------------------------------------------------------------
-- Fixed-order manifest decoders
-- ---------------------------------------------------------------------------

-- | Decode the manifest from fixed-order core + TLV tail bytes.
-- All remaining bytes after the core fields are treated as the TLV tail.
decodeManifest :: ByteString -> Either String BundleManifest
decodeManifest bs = do
  -- Header
  when (BS.length bs < 8) $ Left "manifest too short for magic"
  when (BS.take 8 bs /= manifestMagic) $ Left "invalid manifest magic"
  let rest = BS.drop 8 bs
  (major, rest') <- decode16be "major" rest
  when (major /= manifestMajorVersion) $ Left $ "unsupported manifest major version: " ++ show major
  (_minor, rest'') <- decode16be "minor" rest'

  -- Core strings
  (schema, rest''') <- decodeLengthPrefixedText rest''
  (bundleType, rest'''') <- decodeLengthPrefixedText rest'''

  -- Tree spec fields (flat)
  (calc, rest1) <- decodeLengthPrefixedText rest''''
  (alg, rest2) <- decodeLengthPrefixedText rest1
  (domain, rest3) <- decodeLengthPrefixedText rest2
  (payload, rest4) <- decodeLengthPrefixedText rest3

  -- Runtime spec fields (flat)
  (sem, restR1) <- decodeLengthPrefixedText rest4
  (eval, restR2) <- decodeLengthPrefixedText restR1
  (abi, restR3) <- decodeLengthPrefixedText restR2

  (capCount, restR4) <- decode32be "capability_count" restR3
  let capLen = fromIntegral capCount
  (caps, restR5) <- decodeCapabilities capLen restR4

  -- Closure
  when (BS.length restR5 < 1) $ Left "manifest truncated: missing closure byte"
  let (closureByte, restR6) = BS.splitAt 1 restR5
  closure <- closureFromByte (head $ BS.unpack closureByte)

  -- Roots
  (rootCount, restR7) <- decode32be "root_count" restR6
  let rootCountInt = fromIntegral rootCount
  (roots, restR8) <- decodeRoots rootCountInt restR7

  -- Exports
  (exportCount, restR9) <- decode32be "export_count" restR8
  let exportCountInt = fromIntegral exportCount
  (exports, restR10) <- decodeExports exportCountInt restR9

  -- TLV tail
  (metadata, _ext) <- decodeMetadataAndExtensions restR10

  pure BundleManifest
    { manifestSchema = schema
    , manifestBundleType = bundleType
    , manifestTree = TreeSpec
        { treeCalculus = calc
        , treeNodeHash = NodeHashSpec
            { nodeHashAlgorithm = alg
            , nodeHashDomain = domain
            }
        , treeNodePayload = payload
        }
    , manifestRuntime = RuntimeSpec
        { runtimeSemantics = sem
        , runtimeEvaluation = eval
        , runtimeAbi = abi
        , runtimeCapabilities = caps
        }
    , manifestClosure = closure
    , manifestRoots = roots
    , manifestExports = exports
    , manifestMetadata = metadata
    }

-- | Decode length-prefixed capability strings.
decodeCapabilities :: Int -> ByteString -> Either String ([Text], ByteString)
decodeCapabilities 0 bs = Right ([], bs)
decodeCapabilities n bs = do
  (txt, rest) <- decodeLengthPrefixedText bs
  (restTxts, restFinal) <- decodeCapabilities (n - 1) rest
  Right (txt : restTxts, restFinal)

-- | Decode root entries.
decodeRoots :: Int -> ByteString -> Either String ([BundleRoot], ByteString)
decodeRoots 0 bs = Right ([], bs)
decodeRoots n bs = do
  when (BS.length bs < 32) $ Left "decodeRoots: truncated root hash"
  let (hashBytes, rest) = BS.splitAt 32 bs
  role <- decodeLengthPrefixedText rest
  (restRoots, restFinal) <- decodeRoots (n - 1) (snd role)
  Right (BundleRoot (rawToMerkleHash hashBytes) (fst role) : restRoots, restFinal)

-- | Decode export entries.
decodeExports :: Int -> ByteString -> Either String ([BundleExport], ByteString)
decodeExports 0 bs = Right ([], bs)
decodeExports n bs = do
  name <- decodeLengthPrefixedText bs
  when (BS.length (snd name) < 32) $ Left "decodeExports: truncated export root hash"
  let (hashBytes, rest) = BS.splitAt 32 (snd name)
  kind <- decodeLengthPrefixedText rest
  abi <- decodeLengthPrefixedText (snd kind)
  (restExports, restFinal) <- decodeExports (n - 1) (snd abi)
  Right (BundleExport (fst name) (rawToMerkleHash hashBytes) (fst kind) (fst abi) : restExports, restFinal)

-- | Decode TLV tail into metadata and extensions.
-- Layout: u32 metadata-count, metadata TLVs, u32 extension-count, extension TLVs.
-- For now, known metadata tags are decoded and extension TLVs are skipped.
decodeMetadataAndExtensions :: ByteString -> Either String (BundleMetadata, ByteString)
decodeMetadataAndExtensions bs = do
  (metadataCount, rest1) <- decode32be "metadata_field_count" bs
  (metadataTlvs, rest2) <- decodeTLVs (fromIntegral metadataCount) rest1
  metadata <- decodeMetadataTLVs metadataTlvs
  (extensionCount, rest3) <- decode32be "extension_field_count" rest2
  (_extensionTlvs, rest4) <- decodeTLVs (fromIntegral extensionCount) rest3
  unless (BS.null rest4) $ Left "trailing bytes after manifest TLV tail"
  Right (metadata, rest4)

-- | Decode a fixed number of TLV entries.
decodeTLVs :: Int -> ByteString -> Either String ([TLVEntry], ByteString)
decodeTLVs 0 bs = Right ([], bs)
decodeTLVs n bs = do
  (tag, rest1) <- decode16be "tlv_tag" bs
  (len, rest2) <- decode32be "tlv_length" rest1
  let payloadLen = fromIntegral len
  when (BS.length rest2 < payloadLen) $ Left "TLV value extends beyond input"
  let (value, after) = BS.splitAt payloadLen rest2
  (restTlvs, restFinal) <- decodeTLVs (n - 1) after
  Right ((tag, value) : restTlvs, restFinal)

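A standalone sketch of the TLV wire shape consumed above: each entry is a big-endian u16 tag, a big-endian u32 length, then the raw value bytes. The helpers `be16`, `be32`, and `encodeTLV` are illustrative names, not part of `Wire.hs`:

```haskell
-- Illustrative encoder for the TLV layout that decodeTLVs consumes:
-- u16 tag (big-endian), u32 length (big-endian), then the value bytes.
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as C8
import Data.Bits (shiftR)
import Data.Word (Word16, Word32)

be16 :: Word16 -> BS.ByteString
be16 w = BS.pack [fromIntegral (w `shiftR` 8), fromIntegral w]

be32 :: Word32 -> BS.ByteString
be32 w = BS.pack
  [ fromIntegral (w `shiftR` 24)
  , fromIntegral (w `shiftR` 16)
  , fromIntegral (w `shiftR` 8)
  , fromIntegral w
  ]

encodeTLV :: (Word16, BS.ByteString) -> BS.ByteString
encodeTLV (tag, val) = be16 tag <> be32 (fromIntegral (BS.length val)) <> val

main :: IO ()
main = print (BS.unpack (encodeTLV (0x0001, C8.pack "tricu")))
-- [0,1,0,0,0,5,116,114,105,99,117]
```

Tag `0x0001` becomes `[0,1]`, the 5-byte length becomes `[0,0,0,5]`, and the value bytes follow; `decodeTLVs` simply walks this shape in reverse.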
-- | Decode known metadata TLV entries into BundleMetadata.
-- Unknown tags are ignored.
decodeMetadataTLVs :: [(Word16, ByteString)] -> Either String BundleMetadata
decodeMetadataTLVs tlvs = do
  pkg <- decodeOptionalMetadataText tagPackage
  ver <- decodeOptionalMetadataText tagVersion
  desc <- decodeOptionalMetadataText tagDescription
  lic <- decodeOptionalMetadataText tagLicense
  by <- decodeOptionalMetadataText tagCreatedBy
  pure BundleMetadata
    { metadataPackage = pkg
    , metadataVersion = ver
    , metadataDescription = desc
    , metadataLicense = lic
    , metadataCreatedBy = by
    }
  where
    lookupTag t = lookup t tlvs
    decodeOptionalMetadataText tag =
      case lookupTag tag of
        Nothing -> Right Nothing
        Just raw -> case decodeUtf8' raw of
          Right txt -> Right (Just txt)
          Left _ -> Left $ "metadata TLV has invalid UTF-8 for tag " ++ show tag

type TLVEntry = (Word16, ByteString)

-- ---------------------------------------------------------------------------
-- Data types
-- ---------------------------------------------------------------------------

-- | Closure declaration.
data ClosureMode = ClosureComplete | ClosurePartial
  deriving (Show, Eq, Ord, Generic)

toCBORClosure :: ClosureMode -> Encoding
toCBORClosure = encText . \case
  ClosureComplete -> "complete"
  ClosurePartial -> "partial"

closureFromCBOR :: Decoder s ClosureMode
closureFromCBOR = decodeString >>= \case
  "complete" -> pure ClosureComplete
  "partial" -> pure ClosurePartial
  other -> fail $ "ClosureMode: " ++ show other

-- | Hash specification (algorithm + domain strings).
data NodeHashSpec = NodeHashSpec
  { nodeHashAlgorithm :: Text
  , nodeHashDomain :: Text
  } deriving (Show, Eq, Ord, Generic)

toCBORNodeHashSpec :: NodeHashSpec -> Encoding
toCBORNodeHashSpec (NodeHashSpec alg dom) =
  cmkPairs
    [ ("algorithm", encText alg)
    , ("domain", encText dom)
    ]

nodeHashSpecFromCBOR :: Decoder s NodeHashSpec
nodeHashSpecFromCBOR = do
  n <- decodeMapLen
  unless (n == 2) $ fail "NodeHashSpec: must have exactly 2 entries"
  decodeKey "algorithm"
  alg <- decodeString
  decodeKey "domain"
  dom <- decodeString
  pure (NodeHashSpec alg dom)

-- | Tree specification.
data TreeSpec = TreeSpec
  { treeCalculus :: Text
@@ -184,26 +390,6 @@ data TreeSpec = TreeSpec
  , treeNodePayload :: Text
  } deriving (Show, Eq, Ord, Generic)

toCBORTreeSpec :: TreeSpec -> Encoding
toCBORTreeSpec (TreeSpec calc hspec payload) =
  cmkPairs
    [ ("calculus", encText calc)
    , ("nodeHash", toCBORNodeHashSpec hspec)
    , ("nodePayload", encText payload)
    ]

treeSpecFromCBOR :: Decoder s TreeSpec
treeSpecFromCBOR = do
  n <- decodeMapLen
  unless (n == 3) $ fail "TreeSpec: must have exactly 3 entries"
  decodeKey "calculus"
  calc <- decodeString
  decodeKey "nodeHash"
  hspec <- nodeHashSpecFromCBOR
  decodeKey "nodePayload"
  payload <- decodeString
  pure (TreeSpec calc hspec payload)

-- | Runtime specification.
data RuntimeSpec = RuntimeSpec
  { runtimeSemantics :: Text
@@ -212,53 +398,12 @@ data RuntimeSpec = RuntimeSpec
  , runtimeCapabilities :: [Text]
  } deriving (Show, Eq, Ord, Generic)

toCBORRuntimeSpec :: RuntimeSpec -> Encoding
toCBORRuntimeSpec (RuntimeSpec sem eval abi caps) =
  cmkPairs
    [ ("semantics", encText sem)
    , ("evaluation", encText eval)
    , ("abi", encText abi)
    , ("capabilities", cakSeq (map encText caps))
    ]

runtimeSpecFromCBOR :: Decoder s RuntimeSpec
runtimeSpecFromCBOR = do
  n <- decodeMapLen
  unless (n == 4) $ fail "RuntimeSpec: must have exactly 4 entries"
  decodeKey "semantics"
  sem <- decodeString
  decodeKey "evaluation"
  eval <- decodeString
  decodeKey "abi"
  abi <- decodeString
  decodeKey "capabilities"
  clen <- decodeListLen
  caps <- decodeListN decodeString clen
  pure (RuntimeSpec sem eval abi caps)

-- | A root hash reference.
data BundleRoot = BundleRoot
  { rootHash :: MerkleHash
  , rootRole :: Text
  } deriving (Show, Eq, Ord, Generic)

toCBORBundleRoot :: BundleRoot -> Encoding
toCBORBundleRoot (BundleRoot h role) =
  cmkPairs
    [ ("hash", encBytes (merkleHashToRaw h))
    , ("role", encText role)
    ]

bundleRootFromCBOR :: Decoder s BundleRoot
bundleRootFromCBOR = do
  n <- decodeMapLen
  unless (n == 2) $ fail "BundleRoot: must have exactly 2 entries"
  decodeKey "hash"
  hRaw <- decodeBytes
  decodeKey "role"
  role <- decodeString
  pure (BundleRoot (rawToMerkleHash hRaw) role)

-- | An export entry.
data BundleExport = BundleExport
  { exportName :: Text
@@ -267,29 +412,6 @@ data BundleExport = BundleExport
  , exportAbi :: Text
  } deriving (Show, Eq, Ord, Generic)

toCBORBundleExport :: BundleExport -> Encoding
toCBORBundleExport (BundleExport name h kind abi) =
  cmkPairs
    [ ("name", encText name)
    , ("root", encBytes (merkleHashToRaw h))
    , ("kind", encText kind)
    , ("abi", encText abi)
    ]

bundleExportFromCBOR :: Decoder s BundleExport
bundleExportFromCBOR = do
  n <- decodeMapLen
  unless (n == 4) $ fail "BundleExport: must have exactly 4 entries"
  decodeKey "name"
  name <- decodeString
  decodeKey "root"
  hRaw <- decodeBytes
  decodeKey "kind"
  kind <- decodeString
  decodeKey "abi"
  abi <- decodeString
  pure (BundleExport name (rawToMerkleHash hRaw) kind abi)

-- | Optional package metadata.
data BundleMetadata = BundleMetadata
  { metadataPackage :: Maybe Text
@@ -299,33 +421,6 @@ data BundleMetadata = BundleMetadata
  , metadataCreatedBy :: Maybe Text
  } deriving (Show, Eq, Ord, Generic)

metadataFromCBOR :: Decoder s BundleMetadata
metadataFromCBOR = do
  mlen <- decodeMapLen
  entries <- decodeMapN decodeString decodeString mlen
  let lookupText k = lookup k entries
  pure BundleMetadata
    { metadataPackage = lookupText "package"
    , metadataVersion = lookupText "version"
    , metadataDescription = lookupText "description"
    , metadataLicense = lookupText "license"
    , metadataCreatedBy = lookupText "createdBy"
    }

metadataToCBOR :: BundleMetadata -> Encoding
metadataToCBOR (BundleMetadata pkg ver desc lic by) =
  let pairs =
        maybe [] (\v -> [("package", encText v)]) pkg
          ++ maybe [] (\v -> [("version", encText v)]) ver
          ++ maybe [] (\v -> [("description", encText v)]) desc
          ++ maybe [] (\v -> [("license", encText v)]) lic
          ++ maybe [] (\v -> [("createdBy", encText v)]) by
  in cmkPairs pairs

-- | The manifest: top-level bundle metadata.
data BundleManifest = BundleManifest
  { manifestSchema :: Text
@@ -338,43 +433,6 @@ data BundleManifest = BundleManifest
  , manifestMetadata :: BundleMetadata
  } deriving (Show, Eq, Generic)

manifestToCBOR :: BundleManifest -> Encoding
manifestToCBOR m =
  cmkPairs
    [ ("schema", encText (manifestSchema m))
    , ("bundleType", encText (manifestBundleType m))
    , ("tree", toCBORTreeSpec (manifestTree m))
    , ("runtime", toCBORRuntimeSpec (manifestRuntime m))
    , ("closure", toCBORClosure (manifestClosure m))
    , ("roots", cakSeq (map toCBORBundleRoot (manifestRoots m)))
    , ("exports", cakSeq (map toCBORBundleExport (manifestExports m)))
    , ("metadata", metadataToCBOR (manifestMetadata m))
    ]

manifestFromCBOR :: Decoder s BundleManifest
manifestFromCBOR = do
  n <- decodeMapLen
  unless (n == 8) $ fail "BundleManifest: must have exactly 8 entries"
  decodeKey "schema"
  schema <- decodeString
  decodeKey "bundleType"
  bundleType <- decodeString
  decodeKey "tree"
  tree <- treeSpecFromCBOR
  decodeKey "runtime"
  runtime <- runtimeSpecFromCBOR
  decodeKey "closure"
  closure <- closureFromCBOR
  decodeKey "roots"
  rlen <- decodeListLen
  roots <- decodeListN bundleRootFromCBOR rlen
  decodeKey "exports"
  elen <- decodeListLen
  exports <- decodeListN bundleExportFromCBOR elen
  decodeKey "metadata"
  metadata <- metadataFromCBOR
  pure (BundleManifest schema bundleType tree runtime closure roots exports metadata)

-- | Portable executable-object bundle.
--
-- Merkle node payloads remain the language-neutral executable core:
@@ -388,28 +446,12 @@ data Bundle = Bundle
  , bundleManifestBytes :: ByteString
  } deriving (Show, Eq)

-- ---------------------------------------------------------------------------
-- CBOR manifest serialization
-- ---------------------------------------------------------------------------

-- | Encode the manifest as canonical CBOR.
encodeManifest :: BundleManifest -> ByteString
encodeManifest m = BL.toStrict (toLazyByteString (manifestToCBOR m))

-- | Decode a manifest from CBOR bytes.
decodeManifest :: ByteString -> Either String BundleManifest
decodeManifest bs =
  case deserialiseFromBytes manifestFromCBOR (BL.fromStrict bs) of
    Right (rest, m)
      | BS.null (BL.toStrict rest) -> Right m
      | otherwise -> Left "trailing bytes after manifest CBOR"
    Left (DeserialiseFailure _ msg) -> Left msg

-- ---------------------------------------------------------------------------
-- Bundle encoding
-- ---------------------------------------------------------------------------

-- | Encode a Bundle to portable Bundle v1 bytes.
-- The manifest is serialized using the fixed-order core + TLV tail format.
encodeBundle :: Bundle -> ByteString
encodeBundle bundle =
  let nodeSection = encodeNodeSection (bundleNodes bundle)

BIN test/fixtures/false.arboricx vendored (binary file not shown)
BIN test/fixtures/id.arboricx vendored (binary file not shown)
BIN test/fixtures/map.arboricx vendored (binary file not shown)
BIN test/fixtures/notQ.arboricx vendored (binary file not shown)
BIN test/fixtures/true.arboricx vendored (binary file not shown)
@@ -41,7 +41,6 @@ executable tricu
    , base16-bytestring
    , base64-bytestring
    , bytestring
    , cborg
    , cmdargs
    , containers
    , cryptonite
@@ -94,7 +93,6 @@ test-suite tricu-tests
    , base16-bytestring
    , base64-bytestring
    , bytestring
    , cborg
    , cmdargs
    , containers
    , cryptonite