Arboricx bundle format 1.1
We don't need SHA verification or Merkle dags in our transport bundle. Content stores can handle both bundle and term verification and hashing.
This commit is contained in:
@@ -1,117 +1,119 @@
|
||||
# Arboricx Portable Bundle Format Specification
|
||||
|
||||
**Version:** 0.1
|
||||
**Status:** Exploratory
|
||||
**Author:** A range of slopmachines guided by James Eversole
|
||||
**Human Review Status:** 5 minute scan-through - this is an evolving and malleable document
|
||||
**Version:** 1.1 (Indexed)
|
||||
|
||||
The Arboricx Portable Bundle is a self-contained, content-addressed binary format for distributing Tree Calculus programs and their associated Merkle DAGs. It provides:
|
||||
**Status:** Stable
|
||||
|
||||
- A fixed binary container with header, section directory, and typed sections
|
||||
- A language-neutral Merkle node layer for content-addressed tree values
|
||||
- A fixed-order binary manifest for semantic metadata, exports, and optional extensions
|
||||
**Author:** Slopmachines guided by James Eversole
|
||||
|
||||
The Arboricx Portable Bundle is a self-contained binary format for distributing Tree Calculus programs. It uses topological indexing instead of cryptographic hashing for node identity, making it writable from pure Tree Calculus and verifiable via structural inspection.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [Top-Level Container Layout](#1-top-level-container-layout)
|
||||
2. [Header](#2-header)
|
||||
3. [Section Directory](#3-section-directory)
|
||||
4. [Section: Manifest (type 1)](#4-section-manifest-type-1)
|
||||
5. [Section: Nodes (type 2)](#5-section-nodes-type-2)
|
||||
6. [Merkle Node Payload Format](#6-merkle-node-payload-format)
|
||||
7. [Merkle Hash Computation](#7-merkle-hash-computation)
|
||||
1. [Design Principles](#1-design-principles)
|
||||
2. [Top-Level Container Layout](#2-top-level-container-layout)
|
||||
3. [Header](#3-header)
|
||||
4. [Section Directory](#4-section-directory)
|
||||
5. [Section: Manifest (type 1)](#5-section-manifest-type-1)
|
||||
6. [Section: Nodes (type 2)](#6-section-nodes-type-2)
|
||||
7. [Node Payload Format](#7-node-payload-format)
|
||||
8. [Tree Calculus Reduction Semantics](#8-tree-calculus-reduction-semantics)
|
||||
9. [Binary Primitives](#9-binary-primitives)
|
||||
10. [Bundle Verification](#10-bundle-verification)
|
||||
11. [Known Section Types](#11-known-section-types)
|
||||
11. [Canonicalization](#11-canonicalization)
|
||||
12. [Known Section Types](#12-known-section-types)
|
||||
|
||||
---
|
||||
|
||||
## 1. Top-Level Container Layout
|
||||
## 1. Design Principles
|
||||
|
||||
An Arboricx bundle is a flat binary blob with the following layout:
|
||||
- **No cryptographic primitives required.** Node identity is topological (array index), not a SHA-256 hash.
|
||||
- **Self-contained.** A bundle includes all nodes reachable from its exports. No external references.
|
||||
- **Deterministic.** Canonical bundles produce byte-identical output for identical input terms.
|
||||
- **Small.** ~5 bytes per node entry (length + payload) versus ~36 bytes in hash-based formats.
|
||||
- **Verifiable via structure.** Bounds checking and acyclicity verification replace hash recomputation.
|
||||
|
||||
Global artifact identity (for registries, lockfiles, or content-addressed caches) is achieved by hashing the complete canonical bundle file externally. The bundle format itself knows nothing about this hash.
|
||||
|
||||
---
|
||||
|
||||
## 2. Top-Level Container Layout
|
||||
|
||||
```
|
||||
+------------------+------------------+------------------+------------------+
|
||||
| Header | Section Directory| Manifest Section | Nodes Section |
|
||||
| (32 bytes) | (N × 60 bytes) | (variable) | (variable) |
|
||||
| (32 bytes) | (N × 32 bytes) | (variable) | (variable) |
|
||||
+------------------+------------------+------------------+------------------+
|
||||
```
|
||||
|
||||
The container uses **big-endian** byte order for all multi-byte integers.
|
||||
Total bundle size = 32 + (sectionCount × 32) + manifestSize + nodesSize
|
||||
|
||||
Total bundle size = 32 + (sectionCount × 60) + manifestSize + nodesSize
|
||||
All multi-byte integers use **big-endian** byte order.
|
||||
|
||||
---
|
||||
|
||||
## 2. Header
|
||||
## 3. Header
|
||||
|
||||
| Offset | Size | Field | Description |
|
||||
|--------|------|-------|-------------|
|
||||
| 0 | 8 bytes | Magic | ASCII `"ARBORICX"` (`0x41 0x52 0x42 0x4F 0x52 0x49 0x43 0x58`) |
|
||||
| 0 | 8 bytes | Magic | ASCII `"ARBORICX"` |
|
||||
| 8 | 2 bytes | Major version | `u16` BE. Currently `1` |
|
||||
| 10 | 2 bytes | Minor version | `u16` BE. Currently `0` |
|
||||
| 12 | 4 bytes | Section count | `u32` BE. Number of entries in the section directory |
|
||||
| 16 | 8 bytes | Flags | `u64` BE. Reserved; currently all zeros |
|
||||
| 24 | 8 bytes | Directory offset | `u64` BE. Byte offset from the start of the bundle to the section directory |
|
||||
|
||||
**Constraints:**
|
||||
- Major version must be `1`. Bundles with unsupported major versions are rejected.
|
||||
- The directory offset must point to a valid location within the bundle.
|
||||
- The directory offset is always `32` for bundles with the current layout (header immediately followed by the directory).
|
||||
| 24 | 8 bytes | Directory offset | `u64` BE. Byte offset to the section directory (always `32`) |
|
||||
|
||||
---
|
||||
|
||||
## 3. Section Directory
|
||||
## 4. Section Directory
|
||||
|
||||
The section directory is an array of `N` entries, where `N` is the section count from the header. Each entry is exactly **60 bytes**.
|
||||
Array of `N` entries, each exactly **32 bytes**.
|
||||
|
||||
| Offset (within entry) | Size | Field | Description |
|
||||
|----------------------|------|-------|-------------|
|
||||
| 0 | 4 bytes | Type | `u32` BE. Section type identifier (see [Known Section Types](#11-known-section-types)) |
|
||||
| 0 | 4 bytes | Type | `u32` BE. Section type identifier |
|
||||
| 4 | 2 bytes | Version | `u16` BE. Section-specific version |
|
||||
| 6 | 2 bytes | Flags | `u16` BE. Bit flags: bit 0 (`0x0001`) = critical section |
|
||||
| 8 | 2 bytes | Compression | `u16` BE. Compression codec (currently only `0` = none) |
|
||||
| 10 | 2 bytes | Digest algorithm | `u16` BE. Hash algorithm (currently only `1` = SHA-256) |
|
||||
| 12 | 8 bytes | Offset | `u64` BE. Byte offset from the start of the bundle to the section data |
|
||||
| 20 | 8 bytes | Length | `u64` BE. Length of the section data in bytes |
|
||||
| 28 | 32 bytes | SHA-256 digest | Raw digest of the section data |
|
||||
| 6 | 2 bytes | Flags | `u16` BE. Bit 0 (`0x0001`) = critical section |
|
||||
| 8 | 2 bytes | Compression | `u16` BE. `0` = none (currently the only value) |
|
||||
| 10 | 2 bytes | Reserved | `u16` BE. Padding; must be zero |
|
||||
| 12 | 8 bytes | Offset | `u64` BE. Byte offset from bundle start to section data |
|
||||
| 20 | 8 bytes | Length | `u64` BE. Length of section data in bytes |
|
||||
| 28 | 4 bytes | Reserved | Padding; must be zero |
|
||||
|
||||
**Verification:**
|
||||
- Unknown critical sections (flags & `0x0001`) are rejected.
|
||||
- Unknown critical sections are rejected.
|
||||
- Compression must be `0` (none).
|
||||
- Digest algorithm must be `1` (SHA-256).
|
||||
- The SHA-256 digest in the directory entry must match `SHA256(section_data)`.
|
||||
- Reserved fields must be zero.
|
||||
|
||||
**Note:** No per-section digest is stored. Integrity is verified at the distribution layer (e.g. SHA-256 of the complete bundle file) rather than inside the container.
|
||||
|
||||
---
|
||||
|
||||
## 4. Section: Manifest (type 1)
|
||||
## 5. Section: Manifest (type 1)
|
||||
|
||||
The manifest is a binary encoding of bundle metadata. It uses a **fixed-order core** layout followed by an optional **TLV tail** for extensibility.
|
||||
|
||||
### 4.1 Format
|
||||
Binary encoding of bundle metadata. Fixed-order core layout followed by optional TLV tail.
|
||||
|
||||
```
|
||||
Manifest =
|
||||
magic 8 bytes "ARBMNFST"
|
||||
major u16 BE Manifest major version (1)
|
||||
minor u16 BE Manifest minor version (0)
|
||||
minor u16 BE Manifest minor version (1)
|
||||
|
||||
schema string Length-prefixed UTF-8 text
|
||||
bundleType string Length-prefixed UTF-8 text
|
||||
schema string "arboricx.bundle.manifest.v1"
|
||||
bundleType string "tree-calculus-executable-object"
|
||||
|
||||
treeCalculus string Length-prefixed UTF-8 text
|
||||
treeHashAlgorithm string Length-prefixed UTF-8 text
|
||||
treeHashDomain string Length-prefixed UTF-8 text
|
||||
treeNodePayload string Length-prefixed UTF-8 text
|
||||
treeCalculus string "tree-calculus.v1"
|
||||
treeHashAlgorithm string "indexed"
|
||||
treeHashDomain string "arboricx.indexed.node.v1"
|
||||
treeNodePayload string "arboricx.indexed.payload.v1"
|
||||
|
||||
runtimeSemantics string Length-prefixed UTF-8 text
|
||||
runtimeEvaluation string Length-prefixed UTF-8 text
|
||||
runtimeAbi string Length-prefixed UTF-8 text
|
||||
capabilityCount u32 BE Number of capability strings
|
||||
capabilities string[] Array of length-prefixed UTF-8 capability strings
|
||||
runtimeSemantics string "tree-calculus.v1"
|
||||
runtimeEvaluation string "normal-order"
|
||||
runtimeAbi string "arboricx.abi.tree.v1"
|
||||
capabilityCount u32 BE Number of capability strings (currently 0)
|
||||
capabilities string[] Array of length-prefixed UTF-8 strings
|
||||
|
||||
closure u8 0 = complete, 1 = partial
|
||||
closure u8 0 = complete
|
||||
rootCount u32 BE Number of root entries
|
||||
roots Root[] Array of root entries
|
||||
exportCount u32 BE Number of export entries
|
||||
@@ -119,93 +121,76 @@ Manifest =
|
||||
|
||||
metadataFieldCount u32 BE Number of metadata TLV entries
|
||||
metadataFields TLV[] Metadata tag-value entries
|
||||
extensionFieldCount u32 BE Number of extension TLV entries
|
||||
extensionFields TLV[] Extension tag-value entries (skipped by parsers)
|
||||
extensionFieldCount u32 BE Number of extension TLV entries (currently 0)
|
||||
extensionFields TLV[] Extension entries (skipped by parsers)
|
||||
```
|
||||
|
||||
**Trailing bytes after the manifest must be zero** (no leftover data).
|
||||
|
||||
### 4.2 String Format
|
||||
|
||||
Every `string` field uses the same encoding:
|
||||
### String Format
|
||||
|
||||
```
|
||||
string =
|
||||
length u32 BE Number of UTF-8 bytes in the string (not the number of characters)
|
||||
bytes byte[length] UTF-8 encoded string content
|
||||
length u32 BE Number of UTF-8 bytes
|
||||
bytes byte[length] UTF-8 content
|
||||
```
|
||||
|
||||
The length field carries the byte count, so parsers can skip strings without decoding UTF-8.
|
||||
|
||||
### 4.3 Root Entry
|
||||
### Root Entry
|
||||
|
||||
```
|
||||
Root =
|
||||
hash 32 bytes Raw SHA-256 hash of the Merkle node
|
||||
role string Length-prefixed UTF-8 text ("default" for the first root, "root" for others)
|
||||
index u32 BE Node index into the nodes section
|
||||
role string Length-prefixed UTF-8 ("default" for first root, "root" for others)
|
||||
```
|
||||
|
||||
The hash is stored as **raw bytes** (not hex-encoded). It corresponds to the Merkle hash of the node.
|
||||
|
||||
### 4.4 Export Entry
|
||||
### Export Entry
|
||||
|
||||
```
|
||||
Export =
|
||||
name string Length-prefixed UTF-8 text (export identifier)
|
||||
root 32 bytes Raw SHA-256 hash of the Merkle node
|
||||
kind string Length-prefixed UTF-8 text (currently "term")
|
||||
abi string Length-prefixed UTF-8 text (ABI string)
|
||||
name string Length-prefixed UTF-8 export identifier
|
||||
root u32 BE Node index into the nodes section
|
||||
kind string Length-prefixed UTF-8 (currently "term")
|
||||
abi string Length-prefixed UTF-8 ABI string
|
||||
```
|
||||
|
||||
### 4.5 TLV Entry
|
||||
### TLV Entry
|
||||
|
||||
```
|
||||
TLV =
|
||||
tag u16 BE Tag identifier (type)
|
||||
length u32 BE Number of bytes in the value
|
||||
value byte[length] Raw bytes
|
||||
tag u16 BE Tag identifier
|
||||
length u32 BE Value length in bytes
|
||||
value byte[length]
|
||||
```
|
||||
|
||||
TLV entries support variable-length values and are skippable by parsers that do not recognize a tag: read the `u32` length and advance by `2 + 4 + length` bytes.
|
||||
|
||||
### 4.6 Metadata Tags
|
||||
### Metadata Tags
|
||||
|
||||
| Tag | Name | Value |
|
||||
|-----|------|-------|
|
||||
| 1 | package | UTF-8 text: package name |
|
||||
| 2 | version | UTF-8 text: version string |
|
||||
| 3 | description | UTF-8 text: description |
|
||||
| 4 | license | UTF-8 text: license identifier or text |
|
||||
| 5 | createdBy | UTF-8 text: creator identifier |
|
||||
| 1 | package | UTF-8 text |
|
||||
| 2 | version | UTF-8 text |
|
||||
| 3 | description | UTF-8 text |
|
||||
| 4 | license | UTF-8 text |
|
||||
| 5 | createdBy | UTF-8 text |
|
||||
|
||||
Unknown metadata tags are ignored. Unknown extension tags are skipped by length.
|
||||
|
||||
### 4.7 Semantic Constraints
|
||||
|
||||
A valid bundle manifest must satisfy:
|
||||
### Semantic Constraints
|
||||
|
||||
| Constraint | Value |
|
||||
|-----------|-------|
|
||||
| `schema` | `"arboricx.bundle.manifest.v1"` |
|
||||
| `bundleType` | `"tree-calculus-executable-object"` |
|
||||
| `treeCalculus` | `"tree-calculus.v1"` |
|
||||
| `treeHashAlgorithm` | `"sha256"` |
|
||||
| `treeHashDomain` | `"arboricx.merkle.node.v1"` |
|
||||
| `treeNodePayload` | `"arboricx.merkle.payload.v1"` |
|
||||
| `treeHashAlgorithm` | `"indexed"` |
|
||||
| `treeHashDomain` | `"arboricx.indexed.node.v1"` |
|
||||
| `treeNodePayload` | `"arboricx.indexed.payload.v1"` |
|
||||
| `runtimeSemantics` | `"tree-calculus.v1"` |
|
||||
| `runtimeAbi` | `"arboricx.abi.tree.v1"` |
|
||||
| `runtimeCapabilities` | Empty array |
|
||||
| `closure` | `0` (complete) |
|
||||
| `rootCount` | At least 1 |
|
||||
| `exportCount` | At least 1 |
|
||||
| Export names | Non-empty |
|
||||
| Export roots | Non-empty (32 bytes each) |
|
||||
|
||||
---
|
||||
|
||||
## 5. Section: Nodes (type 2)
|
||||
|
||||
The nodes section contains all Merkle DAG nodes referenced by the manifest. It is a sequence of node entries preceded by a count.
|
||||
## 6. Section: Nodes (type 2)
|
||||
|
||||
```
|
||||
NodesSection =
|
||||
@@ -213,22 +198,21 @@ NodesSection =
|
||||
entries NodeEntry[]
|
||||
```
|
||||
|
||||
Each node entry:
|
||||
### Node Entry
|
||||
|
||||
```
|
||||
NodeEntry =
|
||||
hash 32 bytes Raw SHA-256 hash of this node
|
||||
payloadLen u32 BE Length of the payload in bytes
|
||||
payload byte[payloadLen] Node payload (see Section 6)
|
||||
payloadLen u32 BE Length of payload in bytes
|
||||
payload byte[payloadLen]
|
||||
```
|
||||
|
||||
The node count is `u64` to support large bundles. Entries are stored in the order produced by the exporter (typically sorted by hash for determinism).
|
||||
There is **no hash field**. The node is identified solely by its position in the array.
|
||||
|
||||
---
|
||||
|
||||
## 6. Merkle Node Payload Format
|
||||
## 7. Node Payload Format
|
||||
|
||||
Each node in the Merkle DAG is one of three types. The payload is a single byte type tag followed by hash references:
|
||||
Child references are `u32` big-endian indices into the node array. The array **must** be topologically sorted: every child index must be strictly less than the entry's own position.
|
||||
|
||||
### Leaf
|
||||
|
||||
@@ -236,152 +220,116 @@ Each node in the Merkle DAG is one of three types. The payload is a single byte
|
||||
Payload = 0x00
|
||||
```
|
||||
|
||||
A leaf has no children. The payload is exactly 1 byte.
|
||||
Exactly 1 byte.
|
||||
|
||||
### Stem
|
||||
|
||||
```
|
||||
Payload = 0x01 || child_hash (32 bytes raw)
|
||||
Payload = 0x01 || child_index (u32 BE)
|
||||
```
|
||||
|
||||
A stem has exactly one child. The payload is 33 bytes.
|
||||
Exactly 5 bytes.
|
||||
|
||||
### Fork
|
||||
|
||||
```
|
||||
Payload = 0x02 || left_hash (32 bytes raw) || right_hash (32 bytes raw)
|
||||
Payload = 0x02 || left_index (u32 BE) || right_index (u32 BE)
|
||||
```
|
||||
|
||||
A fork has exactly two children. The payload is 65 bytes.
|
||||
|
||||
**Validation:**
|
||||
- Leaf payloads must be exactly 1 byte (`0x00`).
|
||||
- Stem payloads must be exactly 33 bytes.
|
||||
- Fork payloads must be exactly 65 bytes.
|
||||
- Unknown type bytes are rejected.
|
||||
|
||||
---
|
||||
|
||||
## 7. Merkle Hash Computation
|
||||
|
||||
Each node is identified by a SHA-256 hash of its canonical payload:
|
||||
|
||||
```
|
||||
hash = SHA256( domain_tag || 0x00 || payload )
|
||||
```
|
||||
|
||||
Where:
|
||||
|
||||
| Component | Value |
|
||||
|-----------|-------|
|
||||
| `domain_tag` | `"arboricx.merkle.node.v1"` as UTF-8 bytes |
|
||||
| Separator | `0x00` (one zero byte) |
|
||||
| `payload` | The node's canonical serialization from Section 6 |
|
||||
|
||||
**Examples:**
|
||||
|
||||
- **Leaf:** `SHA256("arboricx.merkle.node.v1" || 0x00 || 0x00)`
|
||||
- **Stem:** `SHA256("arboricx.merkle.node.v1" || 0x00 || 0x01 || child_hash_bytes)`
|
||||
- **Fork:** `SHA256("arboricx.merkle.node.v1" || 0x00 || 0x02 || left_hash_bytes || right_hash_bytes)`
|
||||
|
||||
The resulting SHA-256 hash is stored as a hex-encoded string in the manifest (64 hex characters). Within the nodes section, it is stored as raw bytes.
|
||||
Exactly 9 bytes.
|
||||
|
||||
---
|
||||
|
||||
## 8. Tree Calculus Reduction Semantics
|
||||
|
||||
The bundle represents a **Tree Calculus** term as a Merkle DAG. The reduction rules are:
|
||||
|
||||
### Apply Rules
|
||||
The bundle represents a **Tree Calculus** term. The reduction rules are:
|
||||
|
||||
```
|
||||
apply(Fork(Leaf, a), _) = a
|
||||
apply(Fork(Stem(a), b), c) = apply(apply(a, c), apply(b, c))
|
||||
apply(Fork(Fork, _, _), Leaf) = left of inner Fork
|
||||
apply(Fork(Fork, _, _), Stem) = right of inner Fork
|
||||
apply(Fork(Fork, _, _), Fork) = apply(apply(c, u), v) where c = Fork(u, v)
|
||||
apply(Leaf, b) = Stem(b)
|
||||
apply(Stem(a), b) = Fork(a, b)
|
||||
The t operator is left associative.
|
||||
1. t t a b -> a
|
||||
2. t (t a) b c -> a c (b c)
|
||||
3a. t (t a b) c t -> a
|
||||
3b. t (t a b) c (t u) -> b u
|
||||
3c. t (t a b) c (t u v) -> c u v
|
||||
```
|
||||
|
||||
### Internal Representation
|
||||
|
||||
In the reduction engine, Fork nodes use a `[right, left]` (stack) ordering:
|
||||
- `Fork = [right_child, left_child]`
|
||||
- `Stem = [child]`
|
||||
- `Leaf = []`
|
||||
|
||||
This ordering supports stack-based reduction: pop two terms, apply, push results back.
|
||||
|
||||
### Closure
|
||||
|
||||
The bundle declares `closure = "complete"`, meaning all nodes reachable from export roots are present in the nodes section. No external references exist.
|
||||
**Closure:** The bundle declares `closure = "complete"`, meaning all nodes reachable from export roots are present in the nodes section. No external references exist.
|
||||
|
||||
---
|
||||
|
||||
## 9. Binary Primitives
|
||||
|
||||
All multi-byte integers use **big-endian** byte order.
|
||||
### u8
|
||||
|
||||
Single byte, value `0-255`.
|
||||
|
||||
### u16 (2 bytes)
|
||||
|
||||
```
|
||||
byte[0] | byte[1]
|
||||
value = (byte[0] << 8) | byte[1]
|
||||
```
|
||||
|
||||
### u32 (4 bytes)
|
||||
|
||||
```
|
||||
byte[0] | byte[1] | byte[2] | byte[3]
|
||||
value = (byte[0] << 24) | (byte[1] << 16) | (byte[2] << 8) | byte[3]
|
||||
```
|
||||
|
||||
### u64 (8 bytes)
|
||||
|
||||
```
|
||||
byte[0] ... byte[7]
|
||||
value = (byte[0] << 56) | ... | byte[7]
|
||||
```
|
||||
|
||||
### u8 (1 byte)
|
||||
|
||||
A single byte, value `0-255`.
|
||||
|
||||
---
|
||||
|
||||
## 10. Bundle Verification
|
||||
|
||||
A complete bundle verification proceeds in this order:
|
||||
|
||||
1. **Magic check:** First 8 bytes must be `"ARBORICX"`.
|
||||
2. **Version check:** Major version must be `1`.
|
||||
3. **Section directory:** Parse all entries; reject unknown critical sections.
|
||||
4. **Digest verification:** For each section, compute `SHA256(section_data)` and compare with the digest in the directory entry.
|
||||
5. **Manifest parsing:** Decode the fixed-order manifest; validate semantic constraints.
|
||||
6. **Node section:** Parse all node entries; reject duplicates.
|
||||
7. **Root verification:** All root hashes from the manifest must exist in the node map.
|
||||
8. **Export verification:** All export root hashes must exist in the node map.
|
||||
9. **Node hash verification:** For each node, compute `SHA256(domain || 0x00 || payload)` and compare with the stored hash.
|
||||
10. **Children verification:** For each Stem/Fork node, both child hashes must exist in the node map.
|
||||
11. **Closure verification:** Starting from each root hash, traverse the DAG and confirm all reachable nodes are present.
|
||||
3. **Section directory:** Parse all entries; reject unknown critical sections. Verify reserved fields are zero.
|
||||
4. **Manifest parsing:** Decode fixed-order manifest; validate semantic constraints.
|
||||
5. **Nodes section:** Parse all entries.
|
||||
6. **Bounds checking:**
|
||||
- Every root index `< nodeCount`
|
||||
- Every export index `< nodeCount`
|
||||
- In every Stem payload, `child_index < entry_position` and `child_index < nodeCount`
|
||||
- In every Fork payload, both indices `< entry_position` and `< nodeCount`
|
||||
7. **Acyclicity:** Guaranteed by the `child < parent` rule above.
|
||||
8. **Closure:** Traverse from all root/export indices; confirm every reached index is valid.
|
||||
|
||||
No hash computation is required.
|
||||
|
||||
---
|
||||
|
||||
## 11. Known Section Types
|
||||
## 11. Canonicalization
|
||||
|
||||
A bundle is **canonical** iff:
|
||||
|
||||
1. **Maximal deduplication.** No two entries represent structurally identical subtrees.
|
||||
2. **Topological order.** Children precede parents.
|
||||
3. **Deterministic post-order traversal.** Nodes are emitted in the order discovered by a left-to-right recursive post-order walk.
|
||||
4. **No trailing bytes** in any section.
|
||||
5. **Reserved fields are zero.**
|
||||
|
||||
Canonical bundles produce deterministic bytes and can be file-level hashed for global identity.
|
||||
|
||||
---
|
||||
|
||||
## 12. Known Section Types
|
||||
|
||||
| Type | Name | Required | Version | Description |
|
||||
|------|------|----------|---------|-------------|
|
||||
| 1 | Manifest | Yes | 1 | Bundle metadata in fixed-order binary format |
|
||||
| 2 | Nodes | Yes | 1 | Merkle DAG node entries |
|
||||
| 1 | Manifest | Yes | 1 | Bundle metadata |
|
||||
| 2 | Nodes | Yes | 1 | Topological DAG node entries |
|
||||
|
||||
Unknown section types are permitted if not marked as critical (flags bit 0 is not set).
|
||||
Unknown section types are permitted if not marked critical.
|
||||
|
||||
---
|
||||
|
||||
## Appendix A: Complete Example Layout (id.arboricx)
|
||||
## Appendix A: Complete Example Layout
|
||||
|
||||
A minimal `id.arboricx` bundle has:
|
||||
A minimal bundle for `Stem(Leaf)` (the Tree Calculus encoding of `t t`):
|
||||
|
||||
```
|
||||
+---------------------------------------------------+
|
||||
@@ -392,28 +340,25 @@ A minimal `id.arboricx` bundle has:
|
||||
| Flags: 0 |
|
||||
| Dir offset: 32 |
|
||||
+---------------------------------------------------+
|
||||
| Section Directory (120 bytes = 2 × 60) |
|
||||
| Entry 0: type=1 (manifest), offset=152, len=375 |
|
||||
| Entry 1: type=2 (nodes), offset=527, len=284 |
|
||||
| Section Directory (64 bytes = 2 × 32) |
|
||||
| Entry 0: type=1 (manifest), offset=96, len=~200 |
|
||||
| Entry 1: type=2 (nodes), offset=~296, len=10 |
|
||||
+---------------------------------------------------+
|
||||
| Manifest Section (375 bytes) |
|
||||
| Magic: "ARBMNFST" |
|
||||
| Version: 1.0 |
|
||||
| Core strings (schema, bundleType, tree spec, |
|
||||
| runtime spec, capabilities, closure, roots, |
|
||||
| exports, metadata TLVs, extension fields) |
|
||||
| Manifest Section (~200 bytes) |
|
||||
| Magic: "ARBMNFST", Version: 1.1 |
|
||||
| Schema, bundleType, tree spec, runtime spec |
|
||||
| Closure: 0, Roots: [1], Exports: ["main" -> 1] |
|
||||
| Metadata TLVs, zero extension fields |
|
||||
+---------------------------------------------------+
|
||||
| Nodes Section (284 bytes) |
|
||||
| Nodes Section (10 bytes) |
|
||||
| Node count: 2 |
|
||||
| Node entry 1: hash + payload (Leaf) |
|
||||
| Node entry 2: hash + payload (Fork) |
|
||||
| Entry 0: payloadLen=1, payload=[0x00] |
|
||||
| Entry 1: payloadLen=5, payload=[0x01, 0,0,0,0] |
|
||||
+---------------------------------------------------+
|
||||
```
|
||||
|
||||
The manifest section starts at byte 152 (0x98) and the nodes section at byte 527 (0x20F).
|
||||
|
||||
---
|
||||
|
||||
## Appendix B: File Extension
|
||||
|
||||
Bundles produced by the `tricu` tool use the `.arboricx` file extension. The `.tri` extension is used for plain source files; the `.arboricx` extension identifies the portable binary format.
|
||||
Bundles use the `.arboricx` file extension. Plain source files use `.tri`.
|
||||
|
||||
Reference in New Issue
Block a user