
DataStore


Overview

DataStores are NFT-based data containers on the Chia blockchain that combine the ownership properties of NFTs with the content-integrity guarantees of Merkle trees. Each DataStore maintains an on-chain Merkle root that enables cryptographic verification of data inclusion and integrity, serving as a compact commitment to the existence and form of off-chain data.

Technical Architecture

NFT Structure

DataStores extend Chia's NFT1 specification as on-chain singletons:

interface DataStore extends NFT1 {
  // Standard NFT fields
  launcherId: bytes32;             // Store ID - unique identifier
  nftCoinId: bytes32;

  // DataStore extensions
  dataStoreMetadata: {
    merkleRoot: bytes32;           // Current content hash
    storeId: bytes32;              // Immutable unique identifier
    generation: uint64;            // Sequential state counter
    contentSize: uint64;           // Total bytes
    fileCount: uint32;             // Number of files

    // Access control
    owner: bytes32;                // NFT owner puzzle hash
    delegatedWriters: bytes32[];   // Additional write permissions

    // Generation history
    generationHistory: {
      generation: uint64;
      merkleRoot: bytes32;
      timestamp: uint64;
      updatedBy: bytes32;
    }[];
  };
}

Merkle Tree Implementation

DataStores use a Merkle tree structure to encode key/value pairs, where both keys and values are stored as binary data for maximum flexibility:

                        Root Hash
                       /         \
                 Hash 1           Hash 2
                /      \         /      \
          Hash 1a   Hash 1b  Hash 2a   Hash 2b
             |         |        |         |
           K/V 1     K/V 2    K/V 3     K/V 4

Properties:

  • Hash Function: SHA-256
  • Leaf Nodes: SHA-256(key || value)
  • Internal Nodes: SHA-256(left_hash || right_hash)
  • Sorting: Deterministic key ordering required
  • Empty Values: SHA-256("")
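These properties are enough to compute a root locally. A minimal sketch in Python — the `merkle_root` helper and its odd-node promotion rule are assumptions for illustration, not the canonical DIG implementation:

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(pairs: dict[bytes, bytes]) -> bytes:
    # Leaf nodes: SHA-256(key || value), built in deterministic sorted-key order
    level = [sha256(key + pairs[key]) for key in sorted(pairs)]
    if not level:
        return sha256(b"")  # empty store: SHA-256("")
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level) - 1, 2):
            # Internal nodes: SHA-256(left_hash || right_hash)
            next_level.append(sha256(level[i] + level[i + 1]))
        if len(level) % 2:
            next_level.append(level[-1])  # odd node promoted unpaired (assumed rule)
        level = next_level
    return level[0]
```

Because keys are sorted before hashing, two clients holding the same key/value pairs compute the same root regardless of insertion order.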

Generations

A generation represents the sequential state of a DataStore:

class Generation:
    def __init__(self, data: dict, previous_generation: int):
        self.generation_number = previous_generation + 1
        self.merkle_tree = construct_tree(data)
        self.root_hash = self.merkle_tree.root_hash
        self.timestamp = current_time()

Each root hash change creates a new generation, allowing:

  • Historical verification
  • State rollback
  • Audit trails
  • Sync optimization
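A sketch of how the generationHistory field supports these uses; the class and method names are illustrative, and treating rollback as committing an old root under a new generation number is an assumption consistent with the sequential counter:

```python
import time
from dataclasses import dataclass

@dataclass
class GenerationEntry:
    generation: int
    merkle_root: bytes
    timestamp: float
    updated_by: bytes

class GenerationLog:
    """Local mirror of the on-chain generationHistory field (names illustrative)."""

    def __init__(self) -> None:
        self.entries: list[GenerationEntry] = []

    def commit(self, merkle_root: bytes, updated_by: bytes) -> GenerationEntry:
        # Each root change appends the next sequential generation
        entry = GenerationEntry(len(self.entries) + 1, merkle_root,
                                time.time(), updated_by)
        self.entries.append(entry)
        return entry

    def root_at(self, generation: int) -> bytes:
        # Historical verification: recover the root any past generation had
        return self.entries[generation - 1].merkle_root

    def rollback(self, generation: int, updated_by: bytes) -> GenerationEntry:
        # State rollback: re-commit an old root as a new generation
        return self.commit(self.root_at(generation), updated_by)
```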

Proof of Inclusion

DataStores enable cryptographic proof that specific data belongs to a root hash:

To prove K/V 1 belongs to Root Hash:
1. Provide K/V 1
2. Provide Hash 1b (sibling)
3. Provide Hash 2 (uncle)
4. Client computes: Hash(K/V 1) → Hash 1a
5. Client computes: Hash(Hash 1a || Hash 1b) → Hash 1
6. Client computes: Hash(Hash 1 || Hash 2) → Root Hash
7. Verify computed root matches on-chain root

This enables trustless verification without downloading the entire dataset.
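The seven steps above can be sketched as a verifier. The `(sibling, sibling_is_left)` proof encoding is an assumed format for illustration, not the network's standardized proof structure:

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify_inclusion(key: bytes, value: bytes,
                     path: list[tuple[bytes, bool]],
                     onchain_root: bytes) -> bool:
    # Step 4: hash the claimed K/V pair into its leaf
    node = sha256(key + value)
    # Steps 5-6: fold in each sibling/uncle hash on the way up
    for sibling, sibling_is_left in path:
        node = sha256(sibling + node) if sibling_is_left else sha256(node + sibling)
    # Step 7: compare the computed root against the on-chain root
    return node == onchain_root
```

The verifier touches only one leaf plus one hash per tree level, which is what makes verification possible without downloading the dataset.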

Deterministic Operations

Insert Operation

def insert_key_value(tree: MerkleTree, key: bytes, value: bytes):
    # 1. Find insertion point maintaining sorted order
    position = find_sorted_position(tree.keys, key)

    # 2. Insert at correct position
    tree.insert(position, key, value)

    # 3. Rebalance tree if necessary
    tree.rebalance()

    # 4. Recalculate hashes from leaf to root
    tree.recalculate_hashes(position)

    return tree.root_hash

Delete Operation

def delete_key(tree: MerkleTree, key: bytes):
    # 1. Find key position
    position = tree.find_key(key)

    # 2. Remove leaf node
    tree.remove(position)

    # 3. Merge or rebalance nodes
    tree.rebalance()

    # 4. Recalculate affected hashes
    tree.recalculate_hashes(position)

    return tree.root_hash
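Because both operations keep keys sorted and rehash up to the root, any client replaying the same operations reaches the same root. A toy demonstration, rebuilding the tree from scratch as a shortcut in place of in-place rebalancing (the odd-node promotion rule is an assumption):

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def root_of(pairs: dict[bytes, bytes]) -> bytes:
    # Full rebuild from sorted keys, standing in for rebalance + recalculate_hashes
    level = [sha256(k + pairs[k]) for k in sorted(pairs)] or [sha256(b"")]
    while len(level) > 1:
        level = [sha256(level[i] + level[i + 1]) if i + 1 < len(level) else level[i]
                 for i in range(0, len(level), 2)]
    return level[0]

store = {b"style.css": b"body {}"}
root_v1 = root_of(store)

store[b"index.html"] = b"<html></html>"   # insert -> new root (new generation)
root_v2 = root_of(store)

del store[b"index.html"]                  # delete -> root returns to v1
```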

Access Control

Permission Matrix

Operation             Owner   Delegated Writer   Public
Read                  Yes     Yes                Yes
Update Content        Yes     Yes                No
Modify Permissions    Yes     No                 No
Transfer Ownership    Yes     No                 No
Melt DataStore        Yes     No                 No

State Transitions

enum StateTransition:
    UPDATE_ROOT          # Update merkle root (new generation)
    ADD_WRITER           # Grant write permission
    REMOVE_WRITER        # Revoke write permission
    TRANSFER_OWNERSHIP   # Transfer NFT
    UPDATE_METADATA      # Modify metadata
    MELT_DATASTORE       # Destroy permanently

Integration Workflow

CLI Operations

# Create DataStore
dig datastore create --name "my-project"

# Add content
dig add ./files/*
dig commit -m "Initial version"

# Push to network (creates new generation)
dig push

# Update content
dig add ./updated-file
dig commit -m "Update v2"
dig push

# Sync from mirror
dig sync --store-id abc123 --mirror https://mirror1.dig.net

Programmatic Access

# Create DataStore
datastore = DataStore.create(
    name="my-project",
    owner=wallet.puzzle_hash
)

# Add key/value pairs
datastore.insert("index.html", html_content)
datastore.insert("style.css", css_content)

# Commit new generation
new_root = datastore.commit()
await datastore.push_to_chain(new_root)

# Verify data inclusion
proof = datastore.get_proof("index.html")
is_valid = verify_proof(proof, new_root)

Network Integration

Publishing Flow

Local Changes → Generate Merkle Tree → Push Root to Chain

Economic Commitments → DIG Handle Registration + CapsuleStakeCoin Creation

Network Discovery → DIG Nodes detect economic signals → Plot content

Validation → Witness Nodes validate storage → Multisig authorizes rewards

User Access ← DIG Nodes serve content ← PlotCoin registry

Discovery Mechanisms

  1. Direct Access: Via DataStore launcher ID (store ID)
  2. DIG Handles: Human-readable names (required for network propagation)
  3. Content Hash: Specific generation by merkle root
  4. PlotCoin Registry: Find storage providers with verified content
  5. NFT Marketplaces: Standard NFT discovery

Note: For content to be propagated across the DIG Network, publishers must register a DIG Handle and create CapsuleStakeCoins. Without these economic commitments, DataStores remain private and are not automatically distributed by storage providers.

Use Cases

Primary Applications

  1. DeFi Frontends: Censorship-resistant application hosting
  2. NFT Metadata: Permanent storage for NFT collections
  3. Software Distribution: Verified software releases with integrity proofs
  4. Document Archives: Immutable document storage with audit trails
  5. Global CDN: Decentralized content delivery network

Benefits

  • Trustless Mirrors: Anyone can mirror without trust requirements
  • Censorship Resistance: Geographic and jurisdictional diversity
  • Data Integrity: Cryptographic verification of all content
  • Version Control: Complete history with rollback capability
  • Anonymous Operation: Mirrors can operate anonymously

Technical Specifications

Performance Characteristics

  • Merkle Proof Generation: O(log n)
  • Proof Verification: O(log n)
  • State Update: ~30 second confirmation
  • Query Time: O(1) by store ID
  • Storage Overhead: ~1KB per key/value metadata
  • Sync Speed: 10-100 MB/s depending on method
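The O(log n) proof figures translate into concrete sizes. A back-of-the-envelope helper — pure arithmetic under the 32-byte SHA-256 hash size, not a measured benchmark:

```python
import math

def proof_size(file_count: int, hash_bytes: int = 32) -> tuple[int, int]:
    # An O(log n) inclusion proof carries one sibling hash per tree level
    depth = max(1, math.ceil(math.log2(file_count)))
    return depth, depth * hash_bytes

# A store with 1,000,000 files needs only a 20-hash, 640-byte proof
depth, size = proof_size(1_000_000)
```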

Implementation Requirements

  • Deterministic Sorting: Required for cross-client compatibility
  • Binary Data Support: Keys and values as raw bytes
  • Hash Algorithm: SHA-256 throughout
  • Generation Tracking: Sequential numbering from 1
  • Proof Format: Standardized inclusion proof structure

Best Practices

Development Workflow

  1. Local Testing: Verify tree generation locally
  2. Staging Mirrors: Test propagation on testnet
  3. Generation Planning: Plan update frequency
  4. Key Management: Use hardware wallets for high-value stores
  5. Mirror Monitoring: Track mirror health and availability

Optimization Strategies

  1. Batch Updates: Group changes into single generation
  2. Key Design: Use efficient key structures for fast lookup
  3. Compression: Apply before storing in tree
  4. Caching: Cache frequently accessed proofs
  5. Mirror Selection: Choose geographically diverse mirrors