diff --git a/AGENTS.md b/AGENTS.md
index d6292c9..d843ef9 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -5,7 +5,9 @@
 ## 1. Build & Test
 
 ```bash
-# Full build + tests
+# Tests
+nix flake check
+# Full build
 nix build .#
 ```
diff --git a/docs/arborix-bundle-cbor-v1.md b/docs/arborix-bundle-cbor-v1.md
new file mode 100644
index 0000000..b9f9441
--- /dev/null
+++ b/docs/arborix-bundle-cbor-v1.md
@@ -0,0 +1,339 @@
+# Arborix Portable Bundle v1 (CBOR Manifest Profile)
+
+Status: **Draft, implementation-aligned** (derived from `src/Wire.hs` as of 2026-05-07)
+
+This document specifies the **actual on-wire format and validation behavior** currently implemented by `tricu` for Arborix bundles, with a focus on the newer CBOR manifest path.
+
+---
+
+## 1. Scope
+
+This profile defines:
+
+1. The binary container envelope (header + section directory + section payloads).
+2. The CBOR manifest section format.
+3. The Merkle node section format.
+4. Decode/verify/import behavior in `Wire.hs`.
+5. Known gaps and sane resolutions.
+
+Non-goals:
+
+- tricu source parsing/lambda elimination/module semantics.
+- Signature systems / trust policy.
+- Compression codecs beyond `none`.
+
+---
+
+## 2. Container format
+
+A bundle is a byte stream:
+
+```
+[32-byte header]
+[section directory: section_count * 60 bytes]
+[section payload bytes...]
+```
+
+### 2.1 Header (32 bytes)
+
+| Field | Size | Encoding | Value / Notes |
+|---|---:|---|---|
+| Magic | 8 | raw bytes | `41 52 42 4f 52 49 58 00` (`"ARBORIX\0"`) |
+| Major | 2 | u16 BE | Must be `1` |
+| Minor | 2 | u16 BE | Currently `0` |
+| SectionCount | 4 | u32 BE | Number of section directory entries |
+| Flags | 8 | u64 BE | Currently emitted as `0`; not interpreted |
+| DirectoryOffset | 8 | u64 BE | Offset of section directory (currently `32`) |
+
+Reader behavior:
+
+- Reject if total bytes < 32.
+- Reject bad magic.
+- Reject major != 1.
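A minimal sketch of a reader for this header, assuming a Node.js `Buffer` (the `parseHeader` helper and its field names are illustrative, not the shipped `bundle.js` API):

```javascript
// Sketch: parse the 32-byte Arborix header described above.
// Illustrative only -- parseHeader is not the shipped bundle.js API.
const MAGIC = Buffer.from("ARBORIX\0", "utf-8");

function parseHeader(buf) {
  if (buf.length < 32) throw new Error("truncated header");
  if (!buf.subarray(0, 8).equals(MAGIC)) throw new Error("bad magic");
  const major = buf.readUInt16BE(8);
  if (major !== 1) throw new Error(`unsupported bundle major version ${major}`);
  return {
    major,
    minor: buf.readUInt16BE(10),
    sectionCount: buf.readUInt32BE(12),
    flags: buf.readBigUInt64BE(16),           // emitted as 0, not interpreted
    directoryOffset: buf.readBigUInt64BE(24), // currently 32
  };
}
```

A reader built this way fails fast on the three checks listed above before ever touching the section directory.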
+
+### 2.2 Section directory entry (60 bytes each)
+
+| Field | Size | Encoding | Notes |
+|---|---:|---|---|
+| Type | 4 | u32 BE | e.g. 1=manifest, 2=nodes |
+| Version | 2 | u16 BE | Currently emitted as `1`; not enforced on read |
+| Flags | 2 | u16 BE | bit0 = critical |
+| Compression | 2 | u16 BE | `0` = none (required) |
+| DigestAlgorithm | 2 | u16 BE | `1` = SHA-256 (required) |
+| Offset | 8 | u64 BE | Absolute byte offset |
+| Length | 8 | u64 BE | Section payload length |
+| Digest | 32 | raw bytes | SHA-256 of section bytes |
+
+Reader behavior:
+
+- Reject unknown **critical** section types.
+- Reject compression != 0.
+- Reject digest algorithm != 1.
+- Reject out-of-bounds sections.
+- Reject digest mismatch.
+
+### 2.3 Required section types
+
+| Type | Name | Required |
+|---:|---|---|
+| 1 | manifest | yes |
+| 2 | nodes | yes |
+
+Decode currently rejects duplicate section type 1 or 2.
+
+---
+
+## 3. Manifest section (CBOR)
+
+Manifest bytes are CBOR-encoded map data (using `cborg`).
+
+### 3.1 Top-level manifest schema
+
+The top-level map has **exactly 8 keys**, in this exact decode order in the current implementation:
+
+1. `schema` (text)
+2. `bundleType` (text)
+3. `tree` (map)
+4. `runtime` (map)
+5. `closure` (text: `"complete"|"partial"`)
+6. `roots` (array)
+7. `exports` (array)
+8. `metadata` (map)
+
+> Important: The current decoder is order-strict; it expects keys in this sequence.
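For contrast with the order-strict decoder, a relaxed, presence-based shape check over an already-decoded manifest object might look like the following sketch (`checkManifestShape` is a hypothetical helper, not the shipped validator):

```javascript
// Sketch: presence-based shape check for a decoded v1 manifest object.
// Illustrative only; the real decoder is order-strict.
const MANIFEST_KEYS = [
  "schema", "bundleType", "tree", "runtime",
  "closure", "roots", "exports", "metadata",
];

function checkManifestShape(manifest) {
  // Every required key must be present...
  for (const k of MANIFEST_KEYS) {
    if (!(k in manifest)) throw new Error(`missing manifest key: ${k}`);
  }
  // ...and no unknown top-level keys are allowed.
  const extra = Object.keys(manifest).filter((k) => !MANIFEST_KEYS.includes(k));
  if (extra.length > 0) throw new Error(`unknown manifest keys: ${extra.join(", ")}`);
  if (manifest.closure !== "complete" && manifest.closure !== "partial") {
    throw new Error(`invalid closure mode: ${manifest.closure}`);
  }
  if (!Array.isArray(manifest.roots) || !Array.isArray(manifest.exports)) {
    throw new Error("roots and exports must be arrays");
  }
}
```

A check of this shape accepts any key ordering, which is what a key-order-tolerant v1 reader would need.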
+
+### 3.2 Nested structures
+
+#### `tree` map (3 keys, order-strict)
+
+- `calculus`: text
+- `nodeHash`: map
+- `nodePayload`: text
+
+`nodeHash` map (2 keys, order-strict):
+
+- `algorithm`: text
+- `domain`: text
+
+#### `runtime` map (4 keys, order-strict)
+
+- `semantics`: text
+- `evaluation`: text
+- `abi`: text
+- `capabilities`: array(text)
+
+#### `roots` array of maps
+
+Each root map has 2 keys (order-strict):
+
+- `hash`: bytes (raw 32-byte hash payload encoded as a CBOR byte string)
+- `role`: text
+
+#### `exports` array of maps
+
+Each export map has 4 keys (order-strict):
+
+- `name`: text
+- `root`: bytes (32-byte hash)
+- `kind`: text
+- `abi`: text
+
+#### `metadata` map
+
+Flexible key set; decoded as map(text -> text), then projected into optional fields:
+
+- `package`
+- `version`
+- `description`
+- `license`
+- `createdBy`
+
+Unknown metadata keys are ignored.
+
+### 3.3 Default emitted manifest values
+
+Writers in `Wire.hs` currently emit:
+
+- `schema = "arborix.bundle.manifest.v1"`
+- `bundleType = "tree-calculus-executable-object"`
+- `tree.calculus = "tree-calculus.v1"`
+- `tree.nodeHash.algorithm = "sha256"`
+- `tree.nodeHash.domain = "arborix.merkle.node.v1"`
+- `tree.nodePayload = "arborix.merkle.payload.v1"`
+- `runtime.semantics = "tree-calculus.v1"`
+- `runtime.evaluation = "normal-order"`
+- `runtime.abi = "arborix.abi.tree.v1"`
+- `runtime.capabilities = []`
+- `closure = "complete"`
+- `metadata.createdBy = "arborix"`
+
+---
+
+## 4. Nodes section (binary)
+
+Node section payload layout:
+
+```
+node_count: u64 BE
+repeat node_count times:
+  hash: 32 bytes
+  payload_len: u32 BE
+  payload: payload_len bytes
+```
+
+Node payload grammar:
+
+- `0x00` => Leaf
+- `0x01 || child_hash(32)` => Stem
+- `0x02 || left_hash(32) || right_hash(32)` => Fork
+
+The section decoder rejects:
+
+- duplicate node hashes,
+- truncated entries,
+- payload overruns,
+- trailing bytes after the final node.
+
+---
+
+## 5. Verification behavior (`verifyBundle`)
+
+`verifyBundle` enforces all of:
+
+1. Bundle version >= 1.
+2. Bundle has at least one node.
+3. Manifest constants match the hardcoded v1 values (schema/type/calculus/hash algorithm/domain/payload/runtime semantics/ABI).
+4. Runtime capabilities must be empty.
+5. Closure must be `complete`.
+6. Manifest has at least one root and one export.
+7. The root sets in `bundleRoots` and `manifest.roots` must match exactly.
+8. Each root and export root exists in the node map.
+9. Each node payload deserializes and re-hashes to its declared node hash.
+10. All referenced child hashes exist.
+11. Full closure reachability from the roots succeeds.
+
+`importBundle` runs decode + verify before storing nodes.
+
+---
+
+## 6. Export/import semantics
+
+### 6.1 Export
+
+`exportNamedBundle`:
+
+- Traverses reachable nodes for each requested root hash.
+- Builds the node map.
+- Builds the default manifest and CBOR bytes.
+- Emits two sections (manifest, nodes).
+
+`exportBundle` auto-names exports:
+
+- 1 root => `root`
+- N>1 => `root0`, `root1`, ...
+
+### 6.2 Import
+
+`importBundle`:
+
+1. Decode the bundle.
+2. Verify the bundle.
+3. Insert all node payloads into the content store.
+4. For each manifest export: reconstruct the tree by export root and store the name binding in the DB.
+5. Return the bundle root list.
+
+---
+
+## 7. Determinism properties
+
+The current implementation is deterministic for identical logical input because:
+
+- The node map is serialized in ascending hash order (`Map.toAscList`).
+- Field order in the manifest encoding is fixed by code.
+- Section ordering is fixed: manifest, then nodes.
+
+So repeated exports of the same roots produce byte-identical bundles.
+
+---
+
+## 8. Known gaps and sane resolutions
+
+These are important design gaps visible in the current code.
+
+### Gap A: Node hash domain mismatch risk (critical)
+
+Status: **resolved in current codebase**.
+
+What was wrong:
+
+- Manifest declared `tree.nodeHash.domain = "arborix.merkle.node.v1"`.
+- Hashing implementation previously used `"tricu.merkle.node.v1"`.
+
+Current state:
+
+- Haskell hashing now uses `"arborix.merkle.node.v1"`.
+- JS reference runtime hashing now uses `"arborix.merkle.node.v1"`.
+- JS manifest validation now requires `"arborix.merkle.node.v1"`.
+
+Remaining recommendation:
+
+- Keep hash-domain constants centralized/shared to prevent future drift.
+- Add explicit test vectors for Leaf/Stem/Fork hashes under the Arborix domain.
+
+### Gap B: CBOR decode is order-strict, not generic-map tolerant
+
+Observed:
+
+- Decoder expects exact key order for most maps.
+
+Impact:
+
+- Another canonical CBOR writer that reorders keys may fail to decode even when semantically equivalent.
+
+Sane resolution:
+
+- For v1 compatibility, decode maps as unordered key/value collections, require key presence and types, and reject unknown keys only where desired.
+- Keep the writer deterministic, but relax the reader.
+
+### Gap C: “Canonical CBOR” claim is stronger than the implementation
+
+Observed:
+
+- Writer uses a fixed order but does not explicitly sort keys per RFC 8949 canonical ordering rules.
+
+Sane resolution:
+
+- Either (a) rename this a “deterministic CBOR” profile, or (b) implement explicit canonical key ordering plus canonical-length/minimal integer form checks.
+
+### Gap D: Extra section preservation
+
+Observed:
+
+- Decoder tolerates unknown non-critical sections, but the `Bundle` model/encoder drops them on re-encode.
+
+Sane resolution:
+
+- Add `bundleExtraSections :: [SectionEntry+Bytes]` if round-trip preservation is desired.
+
+### Gap E: Section version not enforced
+
+Observed:
+
+- Section entry `Version` is parsed but unused.
+
+Sane resolution:
+
+- Enforce a known version matrix (e.g., manifest v1, nodes v1), or explicitly document it as “advisory only”.
+
+### Gap F: Runtime capability policy is a hard fail
+
+Observed:
+
+- Any non-empty capabilities list is rejected.
+
+Sane resolution:
+
+- Keep strict for now, but define a capability negotiation strategy for v1.1+ (unknown capabilities => reject unless explicitly allowed by host policy).
+
+### Gap G: Error handling style in the import/export path
+
+Observed:
+
+- Several paths throw `error` for malformed data or store misses.
+
+Sane resolution:
+
+- Return `Either`-style typed errors through the public API (`decode`, `verify`, `import`); reserve exceptions for truly internal faults.
+
+---
+
+## 9. Conformance checklist (v1 current)
+
+A conforming v1 reader/writer for this profile should:
+
+- Implement the 32-byte header and 60-byte section records exactly.
+- Support required sections 1 and 2.
+- Verify section digests with SHA-256.
+- Decode/encode manifest CBOR matching the field model above.
+- Parse the nodes section and validate node payload structure.
+- Recompute and verify node hashes.
+- Enforce complete closure for roots.
+- Enforce the manifest/runtime constants used by v1.
+
+---
+
+## 10. Suggested follow-up docs
+
+To stabilize interoperability, add:
+
+1. `docs/arborix-bundle-test-vectors.md` (golden header/manifest/nodes + expected hashes).
+2. `docs/arborix-bundle-errors.md` (normative error codes/strings).
+3. `docs/arborix-bundle-evolution.md` (rules for minor/major upgrades, capability negotiation, extra sections).
diff --git a/ext/bundle-runtime-profile-v1.md b/ext/bundle-runtime-profile-v1.md
deleted file mode 100644
index 0f926ce..0000000
--- a/ext/bundle-runtime-profile-v1.md
+++ /dev/null
@@ -1,49 +0,0 @@
-1. Scope
-   This profile defines the minimum required behavior for runtimes that execute tricu bundles.
-
-2. Non-goals
-   No tricu source parsing.
-   No lambda elimination.
-   No module system.
-   No package manager.
-   No local DB requirement.
-   No authoring names beyond bundle exports.
-
-3. Required bundle sections
-   Header
-   Manifest/exports
-   Merkle nodes
-
-4. Optional/skippable sections
-   Source, debug, package metadata, signatures, provenance, etc.
-
-5.
Entrypoint selection - Explicit export name first. - Else export named main. - Else single default root. - Else error. - -6. Node payload format - Leaf/Stem/Fork byte layouts. - -7. Hash verification - Domain string and payload hashing rules. - -8. Closure verification - All referenced child hashes must exist. - -9. Runtime representation - Suggested JS representation, but not normative. - -10. Reduction semantics - The six Tree Calculus apply rules. - -11. Codecs for v1 - Raw tree required. - Maybe string/bool optional or experimental. - -12. Required error cases - Bad magic/version, missing export, hash mismatch, malformed payload, missing child. - -13. Test fixtures - List of bundles the implementation must pass. diff --git a/ext/js/src/bundle.js b/ext/js/src/bundle.js index b58501f..593c6eb 100644 --- a/ext/js/src/bundle.js +++ b/ext/js/src/bundle.js @@ -18,9 +18,12 @@ * Offset 8B u64 BE * Length 8B u64 BE * SHA256Digest 32B raw + * Manifest: canonical CBOR-encoded map (cborg output from Haskell) + * Nodes: binary section */ import { createHash } from "node:crypto"; +import { decodeCbor } from "./cbor.js"; // ── Constants ─────────────────────────────────────────────────────────────── @@ -170,12 +173,37 @@ export function parseBundle(buffer) { } /** - * Convenience: parse and return just the manifest JSON. + * Post-process a CBOR-decoded manifest to normalize hash fields + * from raw bytes to hex strings (matching the old JSON wire format). + */ +function normalizeManifest(raw) { + const tree = raw.tree; + if (tree && tree.nodeHash && tree.nodeHash.domain) { + tree.nodeHash.domain = tree.nodeHash.domain; + } + + // Convert root hashes from raw bytes to hex + const roots = (raw.roots || []).map((r) => ({ + ...r, + hash: r.hash instanceof Uint8Array ? Buffer.from(r.hash).toString("hex") : r.hash, + })); + + // Convert export root hashes from raw bytes to hex + const exports = (raw.exports || []).map((e) => ({ + ...e, + root: e.root instanceof Uint8Array ? 
Buffer.from(e.root).toString("hex") : e.root, + })); + + return { ...raw, roots, exports }; +} + +/** + * Convenience: parse and return the manifest from CBOR. */ export function parseManifest(buffer) { const bundle = parseBundle(buffer); const manifestEntry = bundle.sections.get(SECTION_MANIFEST); - return JSON.parse(manifestEntry.data.toString("utf-8")); + return normalizeManifest(decodeCbor(manifestEntry.data)); } /** diff --git a/ext/js/src/cbor.js b/ext/js/src/cbor.js new file mode 100644 index 0000000..9b31f2b --- /dev/null +++ b/ext/js/src/cbor.js @@ -0,0 +1,130 @@ +/** + * cbor.js — Minimal CBOR decoder for the Arborix manifest format. + * + * Decodes the canonical CBOR produced by the Haskell cborg library: + * - Maps: major type 5 (0xa0 + length) + * - Arrays: major type 4 (0x80 + length) + * - Text strings: major type 3, UTF-8 encoded + * - Byte strings: major type 2 + * - Unsigned ints: major type 0 + * - Simple values: 0xc2 = false, 0xc3 = true + * + * Only covers the subset needed for the manifest. 
+ */ + +// ── Decoding state ────────────────────────────────────────────────────────── + +/** + * @param {Buffer} data + * @returns {number} remaining buffer + */ +function makeDecoder(data) { + let offset = 0; + + return { + /** @returns {number} current offset */ + getPos() { return offset; }, + + /** @returns {number} remaining bytes */ + remaining() { return data.length - offset; }, + + /** @returns {number} total length */ + length() { return data.length; }, + + /** Read N bytes and advance */ + read(n) { + if (offset + n > data.length) { + throw new Error(`CBOR read: expected ${n} bytes, ${data.length - offset} remaining at offset ${offset}`); + } + const slice = data.slice(offset, offset + n); + offset += n; + return slice; + }, + + /** Read a single byte */ + readByte() { + if (offset >= data.length) { + throw new Error(`CBOR readByte: no bytes remaining at offset ${offset}`); + } + return data[offset++]; + }, + }; +} + +// ── CBOR helpers ──────────────────────────────────────────────────────────── + +/** + * Read a CBOR length (major type initial byte encodes length for values < 24). + * For 24+, reads additional bytes per spec. + * @returns {number} + */ +function cborReadLength(dec, startByte) { + const additional = startByte & 0x1f; + if (additional < 24) return additional; + if (additional === 24) return dec.read(1)[0]; + if (additional === 25) return dec.read(2).readUint16BE(0); + if (additional === 26) return dec.read(4).readUint32BE(0); + throw new Error(`CBOR: unsupported additional info ${additional}`); +} + +// ── Top-level decode ──────────────────────────────────────────────────────── + +/** + * Decode a single CBOR value from buffer bytes. 
+ * @param {Buffer} buf + * @returns {*} + */ +export function decodeCbor(buf) { + const dec = makeDecoder(buf); + const result = cborDecode(dec); + return result; +} + +function cborDecode(dec) { + const first = dec.readByte(); + const major = (first >> 5) & 0x07; + const info = first & 0x1f; + + switch (major) { + case 0: // unsigned int + case 1: // negative int + return cborReadLength(dec, first); + + case 2: // byte string + return dec.read(cborReadLength(dec, first)); + + case 3: // text string (UTF-8) + const len = cborReadLength(dec, first); + return dec.read(len).toString("utf-8"); + + case 4: // array + const arrLen = cborReadLength(dec, first); + const arr = []; + for (let i = 0; i < arrLen; i++) { + arr.push(cborDecode(dec)); + } + return arr; + + case 5: // map + const mapLen = cborReadLength(dec, first); + const map = {}; + for (let i = 0; i < mapLen; i++) { + const key = cborDecode(dec); + const val = cborDecode(dec); + map[key] = val; + } + return map; + + case 7: // simple values / floats + if (info === 20) return false; + if (info === 21) return true; + if (info === 22) return null; // undefined + if (info === 23) return null; // break (shouldn't appear in definite-length) + // 0xf9-fb are half/float/double floats — not used by our writer + throw new Error(`CBOR: unsupported simple value ${info}`); + + default: + // Tags (major 6) and break (0xff) — not used in our manifest + throw new Error(`CBOR: unsupported major type ${major}, info ${info}`); + } +} diff --git a/ext/js/src/manifest.js b/ext/js/src/manifest.js index 85a3a86..64a5e91 100644 --- a/ext/js/src/manifest.js +++ b/ext/js/src/manifest.js @@ -33,7 +33,7 @@ export function validateManifest(manifest) { `unsupported node hash algorithm: ${tree.nodeHash.algorithm}` ); } - if (tree.nodeHash.domain !== "tricu.merkle.node.v1" && tree.nodeHash.domain !== "arborix.merkle.node.v1") { + if (tree.nodeHash.domain !== "arborix.merkle.node.v1") { throw new Error( `unsupported node hash domain: 
${tree.nodeHash.domain}` ); diff --git a/ext/js/src/merkle.js b/ext/js/src/merkle.js index 107e9c2..57f938c 100644 --- a/ext/js/src/merkle.js +++ b/ext/js/src/merkle.js @@ -7,14 +7,14 @@ * Fork: 0x02 || left_hash (32 bytes raw) || right_hash (32 bytes raw) * * Hash computation: - * hash = SHA256( "tricu.merkle.node.v1" || 0x00 || node_payload ) + * hash = SHA256( "arborix.merkle.node.v1" || 0x00 || node_payload ) */ import { createHash } from "node:crypto"; // ── Constants ─────────────────────────────────────────────────────────────── -const DOMAIN_TAG = "tricu.merkle.node.v1"; +const DOMAIN_TAG = "arborix.merkle.node.v1"; const HASH_LENGTH = 32; // raw hash bytes const HEX_LENGTH = 64; // hex-encoded hash length diff --git a/ext/js/test/bundle.test.js b/ext/js/test/bundle.test.js index 6445606..26b2830 100644 --- a/ext/js/test/bundle.test.js +++ b/ext/js/test/bundle.test.js @@ -1,5 +1,6 @@ import { readFileSync } from "node:fs"; import { strictEqual, ok, throws } from "node:assert"; +import { createHash } from "node:crypto"; import { describe, it } from "node:test"; import { parseBundle, @@ -13,12 +14,12 @@ import { parseNodeSection as parseNodes, } from "../src/merkle.js"; -const fixtureDir = "test/fixtures"; +const fixtureDir = "../../test/fixtures"; describe("bundle parsing", () => { it("valid bundle parses header and sections", () => { const bundle = parseBundle( - readFileSync(`${fixtureDir}/id.tri.bundle`) + readFileSync(`${fixtureDir}/id.arborix`) ); strictEqual(bundle.version, "1.0"); strictEqual(bundle.sectionCount, 2); @@ -26,15 +27,16 @@ describe("bundle parsing", () => { ok(bundle.sections.has(2)); // nodes }); - it("parseManifest returns valid JSON", () => { + it("parseManifest returns valid manifest", () => { const manifest = parseManifest( - readFileSync(`${fixtureDir}/id.tri.bundle`) + readFileSync(`${fixtureDir}/id.arborix`) ); strictEqual(manifest.schema, "arborix.bundle.manifest.v1"); strictEqual(manifest.bundleType, 
"tree-calculus-executable-object"); strictEqual(manifest.closure, "complete"); strictEqual(manifest.tree.calculus, "tree-calculus.v1"); strictEqual(manifest.tree.nodeHash.algorithm, "sha256"); + strictEqual(manifest.tree.nodeHash.domain, "arborix.merkle.node.v1"); strictEqual(manifest.runtime.semantics, "tree-calculus.v1"); strictEqual(manifest.runtime.abi, "arborix.abi.tree.v1"); }); @@ -43,7 +45,7 @@ describe("bundle parsing", () => { describe("hash verification", () => { it("valid bundle nodes verify", () => { const data = bundleParseNodeSection( - readFileSync(`${fixtureDir}/id.tri.bundle`) + readFileSync(`${fixtureDir}/id.arborix`) ); const { nodeMap } = parseNodes(data); const { verified } = verifyNodeHashes(nodeMap); @@ -64,4 +66,69 @@ describe("errors", () => { buf.writeUInt16BE(2, 8); // major version 2 throws(() => parseBundle(buf), /unsupported bundle major version/); }); + + it("bad section digest fails", () => { + const buf = readFileSync(`${fixtureDir}/id.arborix`); + // Corrupt one byte in the manifest section + buf[152] ^= 0x01; + throws(() => parseBundle(buf), /digest mismatch/); + }); + + it("truncated bundle fails", () => { + const buf = readFileSync(`${fixtureDir}/id.arborix`); + const truncated = buf.slice(0, 40); + throws(() => parseBundle(truncated), /truncated/); + }); + + it("missing nodes section fails", () => { + // Build a bundle with only manifest entry in the directory (1 section instead of 2) + const header = Buffer.alloc(32, 0); + header.write("ARBORIX\0", 0, 8); + header.writeUInt16BE(1, 8); // major version + header.writeUInt16BE(0, 10); // minor version + header.writeUInt32BE(1, 12); // 1 section + + // Build a manifest JSON + const manifestObj = { + schema: "arborix.bundle.manifest.v1", + bundleType: "tree-calculus-executable-object", + tree: { + calculus: "tree-calculus.v1", + nodeHash: { + algorithm: "sha256", + domain: "arborix.merkle.node.v1" + }, + nodePayload: "arborix.merkle.payload.v1" + }, + runtime: { + semantics: 
"tree-calculus.v1", + evaluation: "normal-order", + abi: "arborix.abi.tree.v1", + capabilities: [] + }, + closure: "complete", + roots: [{ hash: Buffer.alloc(32).toString("hex"), role: "default" }], + exports: [{ name: "root", root: Buffer.alloc(32).toString("hex"), kind: "term", abi: "arborix.abi.tree.v1" }], + metadata: { createdBy: "arborix" } + }; + const manifestJson = JSON.stringify(manifestObj); + const manifestBytes = Buffer.from(manifestJson); + + // Section directory entry (60 bytes, all fields are u64 after the u16s) + const entry = Buffer.alloc(60, 0); + entry.writeUInt32BE(1, 0); // type: manifest + entry.writeUInt16BE(1, 4); // version + entry.writeUInt16BE(1, 6); // flags: critical + entry.writeUInt16BE(0, 8); // compression: none + entry.writeUInt16BE(1, 10); // digest algorithm: sha256 + entry.writeBigUInt64BE(BigInt(32 + 60), 12); // offset (u64) + entry.writeBigUInt64BE(BigInt(manifestBytes.length), 20); // length (u64) + entry.set(createHash("sha256").update(manifestBytes).digest(), 28); // digest (32 bytes) + + // Set dirOffset to 32 so parseBundle reads directory from after header + header.writeBigUInt64BE(BigInt(32), 24); + + const bundleBuf = Buffer.concat([header, entry, manifestBytes]); + throws(() => parseBundle(bundleBuf), /missing required section/); + }); }); diff --git a/ext/js/test/merkle.test.js b/ext/js/test/merkle.test.js index c724fe0..bb233a5 100644 --- a/ext/js/test/merkle.test.js +++ b/ext/js/test/merkle.test.js @@ -1,13 +1,14 @@ import { readFileSync } from "node:fs"; import { strictEqual, ok } from "node:assert"; import { describe, it } from "node:test"; -import { parseNodeSection } from "../src/bundle.js"; +import { parseNodeSection as bundleParseNodeSection, parseBundle, parseManifest } from "../src/bundle.js"; import { verifyNodeHashes, verifyClosure, verifyRootClosure, deserializePayload, computeNodeHash, + parseNodeSection, } from "../src/merkle.js"; describe("merkle — deserializePayload", () => { @@ -49,46 +50,70 @@ 
describe("merkle — computeNodeHash", () => { const hash = computeNodeHash(leaf); strictEqual(hash.length, 64); }); + + it("Leaf hash matches expected Arborix domain", () => { + const leaf = { type: "leaf" }; + const hash = computeNodeHash(leaf); + strictEqual(hash, "e54db458aa8e94782f7c61ad6c1f19a1c0c6fca7ffe53674f0d2bc5ff7ab02ff"); + }); }); describe("merkle — node section parsing", () => { - const fixtureDir = "test/fixtures"; + const fixtureDir = "../../test/fixtures"; - it("parses id.tri.bundle with correct node count", () => { - const data = parseNodeSection( - readFileSync(`${fixtureDir}/id.tri.bundle`) + it("parses id.arborix with correct node count", () => { + const data = bundleParseNodeSection( + readFileSync(`${fixtureDir}/id.arborix`) ); - const { nodeMap } = parseNodes(data); + const { nodeMap } = parseNodeSection(data); strictEqual(nodeMap.size, 4); }); - it("parses true.tri.bundle with correct node count", () => { - const data = parseNodeSection( - readFileSync(`${fixtureDir}/true.tri.bundle`) + it("parses true.arborix with correct node count", () => { + const data = bundleParseNodeSection( + readFileSync(`${fixtureDir}/true.arborix`) ); - const { nodeMap } = parseNodes(data); + const { nodeMap } = parseNodeSection(data); strictEqual(nodeMap.size, 2); }); + + it("parses false.arborix with correct node count", () => { + const data = bundleParseNodeSection( + readFileSync(`${fixtureDir}/false.arborix`) + ); + const { nodeMap } = parseNodeSection(data); + strictEqual(nodeMap.size, 1); + }); }); describe("merkle — hash verification", () => { - const fixtureDir = "test/fixtures"; + const fixtureDir = "../../test/fixtures"; - it("id.tri.bundle nodes all verify", () => { - const data = parseNodeSection( - readFileSync(`${fixtureDir}/id.tri.bundle`) + it("id.arborix nodes all verify", () => { + const data = bundleParseNodeSection( + readFileSync(`${fixtureDir}/id.arborix`) ); - const { nodeMap } = parseNodes(data); + const { nodeMap } = 
parseNodeSection(data); const { verified, mismatches } = verifyNodeHashes(nodeMap); - ok(verified, "id.tri.bundle node hashes should verify"); + ok(verified, "id.arborix node hashes should verify"); + strictEqual(mismatches.length, 0); + }); + + it("true.arborix nodes all verify", () => { + const data = bundleParseNodeSection( + readFileSync(`${fixtureDir}/true.arborix`) + ); + const { nodeMap } = parseNodeSection(data); + const { verified, mismatches } = verifyNodeHashes(nodeMap); + ok(verified, "true.arborix node hashes should verify"); strictEqual(mismatches.length, 0); }); it("corrupted node payload fails hash verification", () => { - const data = parseNodeSection( - readFileSync(`${fixtureDir}/id.tri.bundle`) + const data = bundleParseNodeSection( + readFileSync(`${fixtureDir}/id.arborix`) ); - const { nodeMap } = parseNodes(data); + const { nodeMap } = parseNodeSection(data); // Find a stem node to corrupt let stemKey = null; for (const [key, node] of nodeMap) { @@ -110,32 +135,39 @@ describe("merkle — hash verification", () => { }); describe("merkle — closure verification", () => { - const fixtureDir = "test/fixtures"; + const fixtureDir = "../../test/fixtures"; - it("id.tri.bundle has complete closure", () => { - const data = parseNodeSection( - readFileSync(`${fixtureDir}/id.tri.bundle`) + it("id.arborix has complete closure", () => { + const data = bundleParseNodeSection( + readFileSync(`${fixtureDir}/id.arborix`) ); - const { nodeMap } = parseNodes(data); + const { nodeMap } = parseNodeSection(data); const { complete, missing } = verifyClosure(nodeMap); - ok(complete, "id.tri.bundle should have complete closure"); + ok(complete, "id.arborix should have complete closure"); strictEqual(missing.length, 0); }); it("verifyRootClosure checks transitive reachability", () => { - const data = parseNodeSection( - readFileSync(`${fixtureDir}/id.tri.bundle`) + const data = bundleParseNodeSection( + readFileSync(`${fixtureDir}/id.arborix`) ); - const { nodeMap } = 
parseNodes(data); - const rootHash = "039cc9aacf5be78ec1975713e6ad154a36988e3f3df18589b0d0c801d0825d78"; + const { nodeMap } = parseNodeSection(data); + // Use the actual root hash from the fixture's manifest + const manifest = parseManifest(readFileSync(`${fixtureDir}/id.arborix`)); + const rootHash = manifest.exports[0].root; const { complete, missingRoots } = verifyRootClosure(nodeMap, rootHash); ok(complete, "root should be reachable"); strictEqual(missingRoots.length, 0); }); -}); -// Helper import -import { parseNodeSection as parseNodes } from "../src/merkle.js"; + it("parseNodeSection returns correct node count", () => { + const data = bundleParseNodeSection( + readFileSync(`${fixtureDir}/id.arborix`) + ); + const result = parseNodeSection(data); + strictEqual(result.count, 4); + }); +}); // Helper for throws function throws(fn, expected) { diff --git a/ext/js/test/run-bundle.test.js b/ext/js/test/run-bundle.test.js index 854edfc..4a87d45 100644 --- a/ext/js/test/run-bundle.test.js +++ b/ext/js/test/run-bundle.test.js @@ -7,10 +7,10 @@ import { validateManifest, selectExport } from "../src/manifest.js"; import { verifyNodeHashes, parseNodeSection as parseNodes } from "../src/merkle.js"; import { buildTreeFromNodeMap } from "../src/cli.js"; -const fixtureDir = "test/fixtures"; +const fixtureDir = "../../test/fixtures"; -describe("run bundle — id.tri.bundle", () => { - const bundle = readFileSync(`${fixtureDir}/id.tri.bundle`); +describe("run bundle — id.arborix", () => { + const bundle = readFileSync(`${fixtureDir}/id.arborix`); const manifest = parseManifest(bundle); const nodeSectionData = bundleParseNodeSection(bundle); const { nodeMap } = parseNodes(nodeSectionData); @@ -24,25 +24,21 @@ describe("run bundle — id.tri.bundle", () => { ok(verified); }); - it("export 'id' is selectable", () => { - const exp = selectExport(manifest, "id"); - strictEqual(exp.name, "id"); + it("export 'root' is selectable", () => { + const exp = selectExport(manifest, "root"); 
+ strictEqual(exp.name, "root"); }); it("tree reconstructs as a Fork", () => { - const exp = selectExport(manifest, "id"); + const exp = selectExport(manifest, "root"); const tree = buildTreeFromNodeMap(nodeMap, exp.root); ok(Array.isArray(tree)); - // id = t (t t) = Fork (Stem Leaf) Leaf... - // In Haskell: id = S = t (t (t t)) t - // This is Fork (Fork (Fork Leaf Leaf) Leaf) Leaf - // In array form: [[[], []], [], []] ok(tree.length >= 2, "tree should be a Fork (length >= 2)"); }); }); -describe("run bundle — true.tri.bundle", () => { - const bundle = readFileSync(`${fixtureDir}/true.tri.bundle`); +describe("run bundle — true.arborix", () => { + const bundle = readFileSync(`${fixtureDir}/true.arborix`); const manifest = parseManifest(bundle); const nodeSectionData = bundleParseNodeSection(bundle); const { nodeMap } = parseNodes(nodeSectionData); @@ -51,20 +47,60 @@ describe("run bundle — true.tri.bundle", () => { validateManifest(manifest); }); - it("export 'const' is selectable", () => { - const exp = selectExport(manifest, "const"); - strictEqual(exp.name, "const"); + it("export 'root' is selectable", () => { + const exp = selectExport(manifest, "root"); + strictEqual(exp.name, "root"); }); - it("tree reconstructs", () => { - const exp = selectExport(manifest, "const"); + it("tree reconstructs as Stem Leaf", () => { + const exp = selectExport(manifest, "root"); const tree = buildTreeFromNodeMap(nodeMap, exp.root); ok(Array.isArray(tree)); + strictEqual(tree.length, 1, "true should be a Stem (single child)"); + strictEqual(tree[0].length, 0, "child should be Leaf"); + }); +}); + +describe("run bundle — false.arborix", () => { + const bundle = readFileSync(`${fixtureDir}/false.arborix`); + const manifest = parseManifest(bundle); + const nodeSectionData = bundleParseNodeSection(bundle); + const { nodeMap } = parseNodes(nodeSectionData); + + it("manifest validates", () => { + validateManifest(manifest); + }); + + it("export 'root' is selectable", () => { + const 
exp = selectExport(manifest, "root"); + strictEqual(exp.name, "root"); + }); + + it("tree reconstructs as Leaf", () => { + const exp = selectExport(manifest, "root"); + const tree = buildTreeFromNodeMap(nodeMap, exp.root); + strictEqual(tree.length, 0, "false should be Leaf (empty array)"); + }); +}); + +describe("run bundle — notQ.arborix", () => { + const bundle = readFileSync(`${fixtureDir}/notQ.arborix`); + const manifest = parseManifest(bundle); + const nodeSectionData = bundleParseNodeSection(bundle); + const { nodeMap } = parseNodes(nodeSectionData); + + it("manifest validates", () => { + validateManifest(manifest); + }); + + it("node hashes verify", () => { + const { verified } = verifyNodeHashes(nodeMap); + ok(verified); }); }); describe("run bundle — missing export", () => { - const bundle = readFileSync(`${fixtureDir}/id.tri.bundle`); + const bundle = readFileSync(`${fixtureDir}/id.arborix`); const manifest = parseManifest(bundle); it("nonexistent export fails clearly", () => { @@ -73,8 +109,8 @@ describe("run bundle — missing export", () => { }); describe("run bundle — auto-select", () => { - // true.tri.bundle has only one export, should auto-select - const bundle = readFileSync(`${fixtureDir}/true.tri.bundle`); + // true.arborix has only one export, should auto-select + const bundle = readFileSync(`${fixtureDir}/true.arborix`); const manifest = parseManifest(bundle); it("single export auto-selects", () => { diff --git a/flake.nix b/flake.nix index 5aca8da..acc7401 100644 --- a/flake.nix +++ b/flake.nix @@ -18,9 +18,14 @@ tricuStatic = hsLib.justStaticExecutables self.packages.${system}.default; - tricuPackage = + tricuPackageTests = haskellPackages.callCabal2nix packageName self {}; + tricuPackage = + hsLib.dontCheck ( + haskellPackages.callCabal2nix packageName self {} + ); + customGHC = haskellPackages.ghcWithPackages (hpkgs: with hpkgs; [ megaparsec ]); @@ -28,8 +33,8 @@ packages.${packageName} = tricuPackage; packages.default = tricuPackage; - 
checks.${packageName} = tricuPackage; - checks.default = tricuPackage; + checks.${packageName} = tricuPackageTests; + checks.default = tricuPackageTests; defaultPackage = self.packages.${system}.default; diff --git a/src/Research.hs index 86101a3..e20b581 100644 --- a/src/Research.hs +++ b/src/Research.hs @@ -85,12 +85,12 @@ serializeNode (NFork l r) = BS.pack [0x02] <> go (decode (encodeUtf8 l)) <> go ( go (Right bs) = bs -- | Hash a node per the Merkle content-addressing spec. --- hash = SHA256( "tricu.merkle.node.v1" <> 0x00 <> node_payload ) +-- hash = SHA256( "arborix.merkle.node.v1" <> 0x00 <> node_payload ) nodeHash :: Node -> MerkleHash nodeHash node = decodeUtf8 (encode (sha256WithPrefix (serializeNode node))) where sha256WithPrefix payload = convert . (hash :: BS.ByteString -> Digest SHA256) $ utf8Tag <> BS.pack [0x00] <> payload - utf8Tag = BS.pack $ map fromIntegral $ BS.unpack "tricu.merkle.node.v1" + utf8Tag = BS.pack $ map fromIntegral $ BS.unpack "arborix.merkle.node.v1" -- | Deserialize a Node from canonical bytes. deserializeNode :: BS.ByteString -> Node diff --git a/src/Wire.hs index 82ba2bb..836a5b8 100644 --- a/src/Wire.hs +++ b/src/Wire.hs @@ -1,4 +1,6 @@ {-# LANGUAGE DeriveGeneric #-} +{-# LANGUAGE LambdaCase #-} +{-# LANGUAGE OverloadedStrings #-}
, encodeBundle , decodeBundle @@ -23,21 +24,26 @@ module Wire import ContentStore (getNodeMerkle, loadTree, putMerkleNode, storeTerm) import Research +import Codec.CBOR.Decoding ( Decoder + , decodeString + , decodeBytes + , decodeListLen + , decodeMapLen + ) +import Control.Monad (replicateM, forM) +import Codec.CBOR.Encoding ( Encoding + , encodeMapLen + , encodeListLen + , encodeString + , encodeBytes + ) +import Codec.CBOR.Write (toLazyByteString) +import Data.Monoid (mconcat) +import Codec.CBOR.Read (deserialiseFromBytes, DeserialiseFailure(..)) + import Control.Exception (SomeException, evaluate, try) import Control.Monad (foldM, unless, when) import Crypto.Hash (Digest, SHA256, hash) -import Data.Aeson ( FromJSON (..) - , ToJSON (..) - , Value (String) - , eitherDecodeStrict' - , encode - , object - , withObject - , (.:) - , (.:?) - , (.!=) - , (.=) - ) import Data.Bits ((.&.), (.|.), shiftL, shiftR) import Data.ByteArray (convert) import Data.ByteString (ByteString) @@ -84,54 +90,121 @@ compressionNone, digestSha256 :: Word16 compressionNone = 0 digestSha256 = 1 --- | Closure declaration. V1 only accepts complete bundles for import. +-- --------------------------------------------------------------------------- +-- CBOR encoding helpers +-- --------------------------------------------------------------------------- + +-- | Canonical CBOR map length encoder. +cmkLen :: Int -> Encoding +cmkLen n = encodeMapLen (fromIntegral n) + +-- | Decode a CBOR array of n elements. +decodeListN :: Decoder s a -> Int -> Decoder s [a] +decodeListN dec n = replicateM n dec + +-- | Decode a CBOR map (sequence of key-value pairs). 
+decodeMapN :: Decoder s a -> Decoder s b -> Int -> Decoder s [(a, b)] +decodeMapN keyDec valDec n = forM [1..n] $ \_ -> + keyDec >>= \k -> valDec >>= \v -> pure (k, v) + +decodeKey :: Text -> Decoder s () +decodeKey expected = do + actual <- decodeString + unless (actual == expected) $ + fail $ "expected key " ++ show expected ++ ", got " ++ show actual + +-- | Canonical CBOR array length encoder. +cakLen :: Int -> Encoding +cakLen n = encodeListLen (fromIntegral n) + +-- | Encode a canonical CBOR map with key-value pairs as flat sequence. +cmkPairs :: [(Text, Encoding)] -> Encoding +cmkPairs [] = cmkLen 0 +cmkPairs kvs = cmkLen (length kvs) <> mconcat [encodeString k <> v | (k, v) <- kvs] + +-- | Encode a canonical CBOR array. +cakSeq :: [Encoding] -> Encoding +cakSeq [] = cakLen 0 +cakSeq xs = cakLen (length xs) <> mconcat xs + +-- | Encode a canonical CBOR text string. +encText :: Text -> Encoding +encText = encodeString + +-- | Encode a canonical CBOR byte string. +encBytes :: ByteString -> Encoding +encBytes = encodeBytes + +-- --------------------------------------------------------------------------- +-- Data types with CBOR instances +-- --------------------------------------------------------------------------- + +-- | Closure declaration. data ClosureMode = ClosureComplete | ClosurePartial deriving (Show, Eq, Ord, Generic) -instance ToJSON ClosureMode where - toJSON ClosureComplete = String "complete" - toJSON ClosurePartial = String "partial" +toCBORClosure :: ClosureMode -> Encoding +toCBORClosure = encText . 
\case + ClosureComplete -> "complete" + ClosurePartial -> "partial" -instance FromJSON ClosureMode where - parseJSON (String "complete") = pure ClosureComplete - parseJSON (String "partial") = pure ClosurePartial - parseJSON _ = fail "closure must be \"complete\" or \"partial\"" +closureFromCBOR :: Decoder s ClosureMode +closureFromCBOR = decodeString >>= \case + "complete" -> pure ClosureComplete + "partial" -> pure ClosurePartial + other -> fail $ "ClosureMode: " ++ show other +-- | Hash specification (algorithm + domain strings). data NodeHashSpec = NodeHashSpec { nodeHashAlgorithm :: Text , nodeHashDomain :: Text } deriving (Show, Eq, Ord, Generic) -instance ToJSON NodeHashSpec where - toJSON s = object - [ "algorithm" .= nodeHashAlgorithm s - , "domain" .= nodeHashDomain s +toCBORNodeHashSpec :: NodeHashSpec -> Encoding +toCBORNodeHashSpec (NodeHashSpec alg dom) = + cmkPairs + [ ("algorithm", encText alg) + , ("domain", encText dom) ] -instance FromJSON NodeHashSpec where - parseJSON = withObject "NodeHashSpec" $ \o -> NodeHashSpec - <$> o .: "algorithm" - <*> o .: "domain" +nodeHashSpecFromCBOR :: Decoder s NodeHashSpec +nodeHashSpecFromCBOR = do + n <- decodeMapLen + unless (n == 2) $ fail "NodeHashSpec: must have exactly 2 entries" + decodeKey "algorithm" + alg <- decodeString + decodeKey "domain" + dom <- decodeString + pure (NodeHashSpec alg dom) +-- | Tree specification. 
data TreeSpec = TreeSpec { treeCalculus :: Text , treeNodeHash :: NodeHashSpec , treeNodePayload :: Text } deriving (Show, Eq, Ord, Generic) -instance ToJSON TreeSpec where - toJSON s = object - [ "calculus" .= treeCalculus s - , "nodeHash" .= treeNodeHash s - , "nodePayload" .= treeNodePayload s +toCBORTreeSpec :: TreeSpec -> Encoding +toCBORTreeSpec (TreeSpec calc hspec payload) = + cmkPairs + [ ("calculus", encText calc) + , ("nodeHash", toCBORNodeHashSpec hspec) + , ("nodePayload", encText payload) ] -instance FromJSON TreeSpec where - parseJSON = withObject "TreeSpec" $ \o -> TreeSpec - <$> o .: "calculus" - <*> o .: "nodeHash" - <*> o .: "nodePayload" +treeSpecFromCBOR :: Decoder s TreeSpec +treeSpecFromCBOR = do + n <- decodeMapLen + unless (n == 3) $ fail "TreeSpec: must have exactly 3 entries" + decodeKey "calculus" + calc <- decodeString + decodeKey "nodeHash" + hspec <- nodeHashSpecFromCBOR + decodeKey "nodePayload" + payload <- decodeString + pure (TreeSpec calc hspec payload) +-- | Runtime specification. data RuntimeSpec = RuntimeSpec { runtimeSemantics :: Text , runtimeEvaluation :: Text @@ -139,65 +212,85 @@ data RuntimeSpec = RuntimeSpec , runtimeCapabilities :: [Text] } deriving (Show, Eq, Ord, Generic) -instance ToJSON RuntimeSpec where - toJSON s = object - [ "semantics" .= runtimeSemantics s - , "evaluation" .= runtimeEvaluation s - , "abi" .= runtimeAbi s - , "capabilities" .= runtimeCapabilities s +toCBORRuntimeSpec :: RuntimeSpec -> Encoding +toCBORRuntimeSpec (RuntimeSpec sem eval abi caps) = + cmkPairs + [ ("semantics", encText sem) + , ("evaluation", encText eval) + , ("abi", encText abi) + , ("capabilities", cakSeq (map encText caps)) ] -instance FromJSON RuntimeSpec where - parseJSON = withObject "RuntimeSpec" $ \o -> RuntimeSpec - <$> o .: "semantics" - <*> o .: "evaluation" - <*> o .: "abi" - <*> o .:? 
"capabilities" .!= [] +runtimeSpecFromCBOR :: Decoder s RuntimeSpec +runtimeSpecFromCBOR = do + n <- decodeMapLen + unless (n == 4) $ fail "RuntimeSpec: must have exactly 4 entries" + decodeKey "semantics" + sem <- decodeString + decodeKey "evaluation" + eval <- decodeString + decodeKey "abi" + abi <- decodeString + decodeKey "capabilities" + clen <- decodeListLen + caps <- decodeListN decodeString clen + pure (RuntimeSpec sem eval abi caps) +-- | A root hash reference. data BundleRoot = BundleRoot { rootHash :: MerkleHash , rootRole :: Text } deriving (Show, Eq, Ord, Generic) -instance ToJSON BundleRoot where - toJSON r = object - [ "hash" .= rootHash r - , "role" .= rootRole r +toCBORBundleRoot :: BundleRoot -> Encoding +toCBORBundleRoot (BundleRoot h role) = + cmkPairs + [ ("hash", encBytes (merkleHashToRaw h)) + , ("role", encText role) ] -instance FromJSON BundleRoot where - parseJSON = withObject "BundleRoot" $ \o -> BundleRoot - <$> o .: "hash" - <*> o .:? "role" .!= "root" +bundleRootFromCBOR :: Decoder s BundleRoot +bundleRootFromCBOR = do + n <- decodeMapLen + unless (n == 2) $ fail "BundleRoot: must have exactly 2 entries" + decodeKey "hash" + hRaw <- decodeBytes + decodeKey "role" + role <- decodeString + pure (BundleRoot (rawToMerkleHash hRaw) role) +-- | An export entry. 
data BundleExport = BundleExport { exportName :: Text , exportRoot :: MerkleHash , exportKind :: Text , exportAbi :: Text - , exportInput :: Maybe Text - , exportOutput :: Maybe Text } deriving (Show, Eq, Ord, Generic) -instance ToJSON BundleExport where - toJSON e = object - [ "name" .= exportName e - , "root" .= exportRoot e - , "kind" .= exportKind e - , "abi" .= exportAbi e - , "input" .= exportInput e - , "output" .= exportOutput e +toCBORBundleExport :: BundleExport -> Encoding +toCBORBundleExport (BundleExport name h kind abi) = + cmkPairs + [ ("name", encText name) + , ("root", encBytes (merkleHashToRaw h)) + , ("kind", encText kind) + , ("abi", encText abi) ] -instance FromJSON BundleExport where - parseJSON = withObject "BundleExport" $ \o -> BundleExport - <$> o .: "name" - <*> o .: "root" - <*> o .:? "kind" .!= "term" - <*> o .:? "abi" .!= "arborix.abi.tree.v1" - <*> o .:? "input" - <*> o .:? "output" +bundleExportFromCBOR :: Decoder s BundleExport +bundleExportFromCBOR = do + n <- decodeMapLen + unless (n == 4) $ fail "BundleExport: must have exactly 4 entries" + decodeKey "name" + name <- decodeString + decodeKey "root" + hRaw <- decodeBytes + decodeKey "kind" + kind <- decodeString + decodeKey "abi" + abi <- decodeString + pure (BundleExport name (rawToMerkleHash hRaw) kind abi) +-- | Optional package metadata. 
data BundleMetadata = BundleMetadata { metadataPackage :: Maybe Text , metadataVersion :: Maybe Text @@ -206,23 +299,34 @@ data BundleMetadata = BundleMetadata , metadataCreatedBy :: Maybe Text } deriving (Show, Eq, Ord, Generic) -instance ToJSON BundleMetadata where - toJSON m = object - [ "package" .= metadataPackage m - , "version" .= metadataVersion m - , "description" .= metadataDescription m - , "license" .= metadataLicense m - , "createdBy" .= metadataCreatedBy m - ] +metadataFromCBOR :: Decoder s BundleMetadata +metadataFromCBOR = do + mlen <- decodeMapLen + entries <- decodeMapN decodeString decodeString mlen + let lookupText k = go k entries + go _ [] = Nothing + go k ((k', v):rest) + | k == k' = Just v + | otherwise = go k rest + pure BundleMetadata + { metadataPackage = lookupText "package" + , metadataVersion = lookupText "version" + , metadataDescription = lookupText "description" + , metadataLicense = lookupText "license" + , metadataCreatedBy = lookupText "createdBy" + } -instance FromJSON BundleMetadata where - parseJSON = withObject "BundleMetadata" $ \o -> BundleMetadata - <$> o .:? "package" - <*> o .:? "version" - <*> o .:? "description" - <*> o .:? "license" - <*> o .:? "createdBy" +metadataToCBOR :: BundleMetadata -> Encoding +metadataToCBOR (BundleMetadata pkg ver desc lic by) = + let pairs = + maybe [] (\v -> [("package", encText v)]) pkg + ++ maybe [] (\v -> [("version", encText v)]) ver + ++ maybe [] (\v -> [("description", encText v)]) desc + ++ maybe [] (\v -> [("license", encText v)]) lic + ++ maybe [] (\v -> [("createdBy", encText v)]) by + in cmkPairs pairs +-- | The manifest: top-level bundle metadata. 
data BundleManifest = BundleManifest { manifestSchema :: Text , manifestBundleType :: Text @@ -231,37 +335,45 @@ data BundleManifest = BundleManifest , manifestClosure :: ClosureMode , manifestRoots :: [BundleRoot] , manifestExports :: [BundleExport] - , manifestImports :: [Value] - , manifestSections :: Value , manifestMetadata :: BundleMetadata } deriving (Show, Eq, Generic) -instance ToJSON BundleManifest where - toJSON m = object - [ "schema" .= manifestSchema m - , "bundleType" .= manifestBundleType m - , "tree" .= manifestTree m - , "runtime" .= manifestRuntime m - , "closure" .= manifestClosure m - , "roots" .= manifestRoots m - , "exports" .= manifestExports m - , "imports" .= manifestImports m - , "sections" .= manifestSections m - , "metadata" .= manifestMetadata m +manifestToCBOR :: BundleManifest -> Encoding +manifestToCBOR m = + cmkPairs + [ ("schema", encText (manifestSchema m)) + , ("bundleType", encText (manifestBundleType m)) + , ("tree", toCBORTreeSpec (manifestTree m)) + , ("runtime", toCBORRuntimeSpec (manifestRuntime m)) + , ("closure", toCBORClosure (manifestClosure m)) + , ("roots", cakSeq (map toCBORBundleRoot (manifestRoots m))) + , ("exports", cakSeq (map toCBORBundleExport (manifestExports m))) + , ("metadata", metadataToCBOR (manifestMetadata m)) ] -instance FromJSON BundleManifest where - parseJSON = withObject "BundleManifest" $ \o -> BundleManifest - <$> o .: "schema" - <*> o .: "bundleType" - <*> o .: "tree" - <*> o .: "runtime" - <*> o .: "closure" - <*> o .: "roots" - <*> o .: "exports" - <*> o .:? "imports" .!= [] - <*> o .:? "sections" .!= object [] - <*> o .:? 
"metadata" .!= BundleMetadata Nothing Nothing Nothing Nothing Nothing +manifestFromCBOR :: Decoder s BundleManifest +manifestFromCBOR = do + n <- decodeMapLen + unless (n == 8) $ fail "BundleManifest: must have exactly 8 entries" + decodeKey "schema" + schema <- decodeString + decodeKey "bundleType" + bundleType <- decodeString + decodeKey "tree" + tree <- treeSpecFromCBOR + decodeKey "runtime" + runtime <- runtimeSpecFromCBOR + decodeKey "closure" + closure <- closureFromCBOR + decodeKey "roots" + rlen <- decodeListLen + roots <- decodeListN bundleRootFromCBOR rlen + decodeKey "exports" + elen <- decodeListLen + exports <- decodeListN bundleExportFromCBOR elen + decodeKey "metadata" + metadata <- metadataFromCBOR + pure (BundleManifest schema bundleType tree runtime closure roots exports metadata) -- | Portable executable-object bundle. -- @@ -276,12 +388,33 @@ data Bundle = Bundle , bundleManifestBytes :: ByteString } deriving (Show, Eq) +-- --------------------------------------------------------------------------- +-- CBOR manifest serialization +-- --------------------------------------------------------------------------- + +-- | Encode the manifest as canonical CBOR. +encodeManifest :: BundleManifest -> ByteString +encodeManifest m = BL.toStrict (toLazyByteString (manifestToCBOR m)) + +-- | Decode a manifest from CBOR bytes. +decodeManifest :: ByteString -> Either String BundleManifest +decodeManifest bs = + case deserialiseFromBytes manifestFromCBOR (BL.fromStrict bs) of + Right (rest, m) + | BS.null (BL.toStrict rest) -> Right m + | otherwise -> Left "trailing bytes after manifest CBOR" + Left (DeserialiseFailure _ msg) -> Left msg + +-- --------------------------------------------------------------------------- +-- Bundle encoding +-- --------------------------------------------------------------------------- + -- | Encode a Bundle to portable Bundle v1 bytes. 
encodeBundle :: Bundle -> ByteString encodeBundle bundle = let nodeSection = encodeNodeSection (bundleNodes bundle) manifestBytes = if BS.null (bundleManifestBytes bundle) - then BL.toStrict (encode (bundleManifest bundle)) + then encodeManifest (bundleManifest bundle) else bundleManifestBytes bundle sectionCount = 2 dirOffset = fromIntegral headerLength @@ -346,15 +479,14 @@ decodePortableBundle bs = do dirBytes = fromIntegral sectionCount * sectionEntryLength when (BS.length bs < dirStart + dirBytes) $ Left "bundle truncated in section directory" - entries <- decodeSectionEntries sectionCount (BS.take dirBytes $ BS.drop dirStart bs) + let dirRaw = BS.take dirBytes $ BS.drop dirStart bs + entries <- decodeSectionEntries sectionCount dirRaw traverse_ rejectUnknownCritical entries manifestEntry <- requireSection sectionManifest entries nodesEntry <- requireSection sectionNodes entries manifestBytes <- readAndVerifySection bs manifestEntry nodesBytes <- readAndVerifySection bs nodesEntry - manifest <- case eitherDecodeStrict' manifestBytes of - Left err -> Left $ "invalid manifest JSON: " ++ err - Right m -> Right m + manifest <- decodeManifest manifestBytes nodes <- decodeNodeSection nodesBytes let roots = map rootHash (manifestRoots manifest) return Bundle @@ -429,8 +561,8 @@ decodeSectionEntries count bytes = reverse <$> go count bytes [] -- Manifest construction -- --------------------------------------------------------------------------- -defaultManifest :: [(Text, MerkleHash)] -> Int -> BundleManifest -defaultManifest namedRoots nodeCount = BundleManifest +defaultManifest :: [(Text, MerkleHash)] -> BundleManifest +defaultManifest namedRoots = BundleManifest { manifestSchema = "arborix.bundle.manifest.v1" , manifestBundleType = "tree-calculus-executable-object" , manifestTree = TreeSpec @@ -450,18 +582,11 @@ defaultManifest namedRoots nodeCount = BundleManifest , manifestClosure = ClosureComplete , manifestRoots = zipWith mkRoot [0 :: Int ..] 
(map snd namedRoots) , manifestExports = map mkExport namedRoots - , manifestImports = [] - , manifestSections = object - [ "nodes" .= object - [ "count" .= nodeCount - , "payload" .= ("arborix.merkle.payload.v1" :: Text) - ] - ] , manifestMetadata = BundleMetadata - { metadataPackage = Nothing - , metadataVersion = Nothing + { metadataPackage = Nothing + , metadataVersion = Nothing , metadataDescription = Nothing - , metadataLicense = Nothing + , metadataLicense = Nothing , metadataCreatedBy = Just "arborix" } } @@ -473,8 +598,6 @@ defaultManifest namedRoots nodeCount = BundleManifest , exportRoot = h , exportKind = "term" , exportAbi = "arborix.abi.tree.v1" - , exportInput = Nothing - , exportOutput = Nothing } -- --------------------------------------------------------------------------- @@ -568,12 +691,10 @@ verifyManifest manifest = do Left $ "unsupported runtime semantics: " ++ unpack (runtimeSemantics runtimeSpec) when (runtimeAbi runtimeSpec /= "arborix.abi.tree.v1") $ Left $ "unsupported runtime ABI: " ++ unpack (runtimeAbi runtimeSpec) - unless (null $ runtimeCapabilities runtimeSpec) $ - Left "host/runtime capabilities are not supported by bundle v1" + unless (null $ runtimeCapabilities runtimeSpec) $ + Left "unsupported runtime capabilities" when (manifestClosure manifest /= ClosureComplete) $ - Left "bundle v1 imports require closure = complete" - unless (null $ manifestImports manifest) $ - Left "bundle v1 imports require an empty imports list" + Left "bundle v1 requires closure = complete" when (null $ manifestRoots manifest) $ Left "manifest has no roots" when (null $ manifestExports manifest) $ @@ -674,8 +795,8 @@ exportNamedBundle conn namedHashes = do let hashes = map snd namedHashes entries <- concat <$> mapM (collectReachableNodes conn) hashes let nodeMap = Map.fromList entries - manifest = defaultManifest namedHashes (Map.size nodeMap) - manifestBytes = BL.toStrict (encode manifest) + manifest = defaultManifest namedHashes + manifestBytes 
= encodeManifest manifest bundle = Bundle { bundleVersion = bundleMajorVersion * 1000 + bundleMinorVersion , bundleRoots = hashes diff --git a/test/Spec.hs index 8f3687d..23a119d 100644 --- a/test/Spec.hs +++ b/test/Spec.hs @@ -2041,7 +2041,7 @@ binaryReaderTests = testGroup "Binary Reader Tests" result env @?= okT Leaf (bytesT ([101,102,103] ++ nodesBytes)) , testCase "readArborixNodesSection: reads id fixture bundle" $ do - fixtureBytes <- BS.readFile "test/fixtures/id.tri.bundle" + fixtureBytes <- BS.readFile "test/fixtures/id.arborix" case decodeBundle fixtureBytes of Left err -> assertFailure $ "decodeBundle failed: " ++ err Right _ -> do @@ -2053,7 +2053,7 @@ binaryReaderTests = testGroup "Binary Reader Tests" result env @?= ofNumber 0 , testCase "readArborixNodesSection: reads notQ fixture bundle" $ do - fixtureBytes <- BS.readFile "test/fixtures/notQ.tri.bundle" + fixtureBytes <- BS.readFile "test/fixtures/notQ.arborix" case decodeBundle fixtureBytes of Left err -> assertFailure $ "decodeBundle failed: " ++ err Right _ -> do @@ -2065,7 +2065,7 @@ binaryReaderTests = testGroup "Binary Reader Tests" result env @?= ofNumber 0 , testCase "readArborixNodesSection: reads map fixture bundle" $ do - fixtureBytes <- BS.readFile "test/fixtures/map.tri.bundle" + fixtureBytes <- BS.readFile "test/fixtures/map.arborix" case decodeBundle fixtureBytes of Left err -> assertFailure $ "decodeBundle failed: " ++ err Right _ -> do @@ -2077,7 +2077,7 @@ binaryReaderTests = testGroup "Binary Reader Tests" result env @?= ofNumber 0 , testCase "readArborixExecutableFromHash: reconstructs id fixture root" $ do - fixtureBytes <- BS.readFile "test/fixtures/id.tri.bundle" + fixtureBytes <- BS.readFile "test/fixtures/id.arborix" case 
decodeBundle fixtureBytes of Left err -> assertFailure $ "decodeBundle failed: " ++ err Right bundle -> case bundleRoots bundle of @@ -2093,7 +2093,7 @@ binaryReaderTests = testGroup "Binary Reader Tests" result env @?= ofNumber 0 , testCase "readArborixExecutableFromHash: reconstructs notQ fixture root" $ do - fixtureBytes <- BS.readFile "test/fixtures/notQ.tri.bundle" + fixtureBytes <- BS.readFile "test/fixtures/notQ.arborix" case decodeBundle fixtureBytes of Left err -> assertFailure $ "decodeBundle failed: " ++ err Right bundle -> case bundleRoots bundle of @@ -2109,7 +2109,7 @@ binaryReaderTests = testGroup "Binary Reader Tests" result env @?= ofNumber 0 , testCase "readArborixExecutableFromHash: reconstructs map fixture root" $ do - fixtureBytes <- BS.readFile "test/fixtures/map.tri.bundle" + fixtureBytes <- BS.readFile "test/fixtures/map.arborix" case decodeBundle fixtureBytes of Left err -> assertFailure $ "decodeBundle failed: " ++ err Right bundle -> case bundleRoots bundle of @@ -2125,7 +2125,7 @@ binaryReaderTests = testGroup "Binary Reader Tests" result env @?= ofNumber 0 , testCase "readArborixExecutableFromHash: executes id fixture root" $ do - fixtureBytes <- BS.readFile "test/fixtures/id.tri.bundle" + fixtureBytes <- BS.readFile "test/fixtures/id.arborix" case decodeBundle fixtureBytes of Left err -> assertFailure $ "decodeBundle failed: " ++ err Right bundle -> case bundleRoots bundle of @@ -2141,7 +2141,7 @@ binaryReaderTests = testGroup "Binary Reader Tests" result env @?= ofNumber 42 , testCase "readArborixExecutableFromHash: executes notQ fixture on true" $ do - fixtureBytes <- BS.readFile "test/fixtures/notQ.tri.bundle" + fixtureBytes <- BS.readFile "test/fixtures/notQ.arborix" case decodeBundle fixtureBytes of Left err -> assertFailure $ "decodeBundle failed: " ++ err Right bundle -> case bundleRoots bundle of @@ -2157,7 +2157,7 @@ binaryReaderTests = testGroup "Binary Reader Tests" result env @?= falseT , testCase 
"readArborixExecutableFromHash: executes notQ fixture on false" $ do - fixtureBytes <- BS.readFile "test/fixtures/notQ.tri.bundle" + fixtureBytes <- BS.readFile "test/fixtures/notQ.arborix" case decodeBundle fixtureBytes of Left err -> assertFailure $ "decodeBundle failed: " ++ err Right bundle -> case bundleRoots bundle of @@ -2173,7 +2173,7 @@ binaryReaderTests = testGroup "Binary Reader Tests" result env @?= trueT , testCase "readArborixExecutableFromHash: executes map fixture root" $ do - fixtureBytes <- BS.readFile "test/fixtures/map.tri.bundle" + fixtureBytes <- BS.readFile "test/fixtures/map.arborix" case decodeBundle fixtureBytes of Left err -> assertFailure $ "decodeBundle failed: " ++ err Right bundle -> case bundleRoots bundle of diff --git a/test/fixtures/false.arborix b/test/fixtures/false.arborix new file mode 100644 index 0000000..53816b3 Binary files /dev/null and b/test/fixtures/false.arborix differ diff --git a/test/fixtures/id.arborix b/test/fixtures/id.arborix new file mode 100644 index 0000000..deb7f5e Binary files /dev/null and b/test/fixtures/id.arborix differ diff --git a/test/fixtures/id.tri.bundle b/test/fixtures/id.tri.bundle deleted file mode 100644 index 47589dc..0000000 Binary files a/test/fixtures/id.tri.bundle and /dev/null differ diff --git a/test/fixtures/map.arborix b/test/fixtures/map.arborix new file mode 100644 index 0000000..e3d4e43 Binary files /dev/null and b/test/fixtures/map.arborix differ diff --git a/test/fixtures/map.tri.bundle b/test/fixtures/map.tri.bundle deleted file mode 100644 index e3014a4..0000000 Binary files a/test/fixtures/map.tri.bundle and /dev/null differ diff --git a/test/fixtures/notQ.arborix b/test/fixtures/notQ.arborix new file mode 100644 index 0000000..3d1b7e8 Binary files /dev/null and b/test/fixtures/notQ.arborix differ diff --git a/test/fixtures/notQ.tri.bundle b/test/fixtures/notQ.tri.bundle deleted file mode 100644 index 2cbb59d..0000000 Binary files a/test/fixtures/notQ.tri.bundle and /dev/null 
differ diff --git a/test/fixtures/true.arborix b/test/fixtures/true.arborix new file mode 100644 index 0000000..4fb2275 Binary files /dev/null and b/test/fixtures/true.arborix differ diff --git a/tricu.cabal b/tricu.cabal index 8e97dfa..46f5fd2 100644 --- a/tricu.cabal +++ b/tricu.cabal @@ -37,11 +37,11 @@ executable tricu -fPIC build-depends: base >=4.7 - , aeson , ansi-terminal , base16-bytestring , base64-bytestring , bytestring + , cborg , cmdargs , containers , cryptonite @@ -90,11 +90,11 @@ test-suite tricu-tests ScopedTypeVariables build-depends: base >=4.7 - , aeson , ansi-terminal , base16-bytestring , base64-bytestring , bytestring + , cborg , cmdargs , containers , cryptonite @@ -115,8 +115,8 @@ test-suite tricu-tests , text , time , transformers - , warp , wai + , warp , zlib default-language: Haskell2010 other-modules: