Kolme: The Future of Blockchain Applications

Introduction

Kolme is a Rust-powered blockchain framework that empowers developers to build fast, secure, and scalable decentralized applications, each running on its own dedicated, public, and transparent blockchain. Created by FP Block, Kolme addresses the limitations of shared blockchains like Ethereum, Solana, and Near—such as congestion, gas fees, execution limits, and complex multichain integrations—offering unparalleled performance and flexibility. Whether you’re a validator ensuring network integrity, a developer crafting innovative apps, an investor seeking scalable solutions, or a founder aiming to accelerate your vision, Kolme is designed to drive your success.

This document provides an overview of Kolme’s architecture, real-world use cases, and key benefits, serving as your entry point to building the future of decentralized technology.

Want to see the code? Check out the Kolme GitHub repo.

Why Kolme?

Kolme redefines blockchain development by combining the speed of modern web applications, the security of leading blockchains, and seamless multichain accessibility. Its innovative design delivers:

Dedicated Blockchains: Each app runs on its own public blockchain, eliminating block space competition and ensuring instant, predictable transaction processing without congestion or delays.
No Gas Fees or Limits: Kolme removes gas costs, execution time restrictions, and scheduled block times, allowing apps to operate flexibly and cost-free.
Rust Flexibility: Developers write full application logic in Rust, leveraging its high-performance ecosystem without the constraints of smart contracts, enabling rapid development and complex features.
Multichain Support: Secure bridge contracts enable users to interact with Kolme apps via Solana, Ethereum, Near, and future chains like Aptos or Cosmos, broadening accessibility.
External Chain Resilience: Kolme apps remain operational during external chain disruptions (e.g., Ethereum outages), relying on them only for fund deposits and withdrawals.
Transparency and Trust: Public, verifiable blockchains record all actions—trades, swaps, or administrative updates—ensuring auditability for users, validators, and auditors.

Architecture

Kolme’s architecture is built for speed, security, and scalability, with three core components:

Dedicated Blockchains

Every Kolme application operates on its own blockchain, free from the congestion of shared platforms. This ensures instant transaction processing and consistent performance, ideal for high-frequency trading, cross-chain swaps, or betting apps. All actions are recorded on-chain, enhancing transparency and trust.

Triadic Security Model

Kolme’s security relies on three validator groups working in a checks-and-balances system:

Listeners: Monitor bridge contracts on external blockchains (e.g., Solana, Ethereum) for events like fund deposits, confirming actions to the Kolme chain when a quorum of listeners agrees.
Processor: Executes transactions and produces blocks, running in a high-availability setup for uninterrupted operation.
Approvers: Authorize outgoing actions, such as fund withdrawals, ensuring secure cross-chain movements.

Administrative tasks, like software upgrades, require a quorum of at least two groups, balancing rapid adaptation with tamper resistance. This model rivals the security of leading blockchains while minimizing centralized risks.

External Integration

Kolme seamlessly connects with external data and blockchains:

Secure Data Loads: Apps fetch data from public oracles (e.g., Pyth, Chainlink), proprietary APIs, or custom feeds via HTTP requests. Data is stored on-chain with cryptographic signatures for verification, so apps do not have to wait for an on-chain oracle update. For example, trading apps access live price feeds, while betting apps use real-time sports odds.
Multichain Bridges: Bridge contracts manage funds across Solana, Ethereum, Near, and future chains, allowing users to deposit and withdraw from their preferred ecosystem. Developers can add new chains without rewriting apps, ensuring future-proof scalability.

Real-World Use Cases

Kolme’s versatility powers diverse applications, showcasing its ability to deliver fast, secure, and accessible solutions. Below are three examples:

Perpetual Swaps App

A decentralized trading platform offers perpetual swaps for leveraged asset speculation.

How It Works: The app fetches cryptographically signed Pyth price feeds via HTTP, storing them on-chain for verification. Rust-based logic processes trades, margin calculations, and liquidations instantly. Users link Web3 wallets (Solana, Ethereum, Near) to Kolme accounts, signing trades with secure ephemeral keys generated in the browser.
Benefits: No congestion ensures instant execution during volatile markets. Gas-free operation and full Rust logic eliminate delays and constraints of shared blockchains like Ethereum.

Cross-Chain Swap App

A multichain AMM enables users to swap tokens across Solana, Ethereum, and Near.

How It Works: Users deposit tokens via bridge contracts, validated by listeners and confirmed on the Kolme chain. The Rust-based AMM processes swaps instantly, with approvers authorizing withdrawals to users’ preferred chains. Web3 wallet integration ensures seamless onboarding.
Benefits: Dedicated chains eliminate fees and delays. Multichain support and extensible bridge contracts simplify cross-chain swaps and future chain additions, unlike Ethereum’s complex middleware.

Sports Betting App

A decentralized betting platform processes bets with proprietary odds data.

How It Works: The app fetches real-time odds via HTTP, storing them on-chain with signatures. Rust logic calculates payouts and settles bets. Users deposit funds via Web3 wallets and place bets with ephemeral keys for security. All actions are recorded on-chain.
Benefits: Instant processing handles high-traffic events. Gas-free, flexible logic supports complex payouts, and transparency builds trust, avoiding the centralization risks of shared chain oracles.

Advantages Over Traditional Blockchains

Kolme outperforms shared blockchains like Ethereum and Solana:

Congestion-Free: Dedicated chains eliminate delays from block space competition.
Gas-Free: No transaction fees, unlike gas-heavy platforms.
Unlimited Execution: Full Rust logic on-chain, free from smart contract compute limits.
Fast Data Integration: Direct HTTP data loads with on-chain verification, bypassing oracle delays.
Simplified Multichain: Bridge contracts and extensible chain support reduce cross-chain complexity.
Full Transparency: All actions are recorded on public blockchains, avoiding off-chain opacity.

User Experience

Kolme prioritizes accessibility and security for users:

Web3 Wallet Integration: Users connect Solana, Ethereum, or Near wallets to deposit funds via bridge contracts, leveraging familiar tools.
Ephemeral Keys: Browser-generated keys, validated by Web3 wallets, secure transactions like trades or bets without exposing long-term keys.
Instant Notifications: Failed transactions trigger immediate alerts, with planned watchdog nodes to monitor fairness.
Multichain Accessibility: Users join from any supported chain, with developers able to add new chains as needed.

Call to Action

Kolme is your gateway to building fast, secure, and innovative blockchain applications. Contact FP Block to explore how our platform and expert engineering team can accelerate your vision. Whether you’re a validator, developer, investor, or founder, let’s shape the future of decentralized technology together. Visit FP Block to get started!

Kolme Technical Overview

Purpose

Kolme is a Rust-powered blockchain framework that enables developers to build fast, secure, and flexible decentralized applications, each running on its own dedicated, public, and transparent blockchain. Designed to overcome the limitations of shared blockchains like Ethereum, Solana, and Near—such as congestion, gas fees, and restrictive smart contract environments—Kolme provides a robust platform for engineers to create scalable apps with seamless multichain integration. This page offers a high-level technical overview of Kolme’s architecture and core features, serving as an entry point for developers building on the platform. It introduces key concepts like dedicated chains, Rust-based logic, and efficient state management, preparing you for deeper dives into specific components.

Dedicated Blockchains

Kolme’s defining feature is its dedicated blockchain architecture: each application operates on its own isolated, public blockchain, eliminating the block space competition that slows down shared platforms like Ethereum or Solana. This ensures instant transaction processing, free from congestion-based delays, making Kolme ideal for high-performance apps like trading platforms, cross-chain swaps, or betting systems. Each blockchain is transparent, with all transactions—user actions, fund transfers, or software upgrades—recorded on-chain and verifiable by any node, developer, or auditor. Dedicated chains enable developers to focus on app functionality without worrying about network contention, while validators benefit from simplified, efficient node operations. This scalability and independence make Kolme a powerful platform for building and deploying decentralized applications.

Deterministic Rust Execution

Kolme liberates developers from the constraints of traditional smart contracts by allowing full application logic to be written in Rust, a high-performance, memory-safe programming language. Unlike Ethereum’s Solidity or Solana’s constrained environments, which impose gas fees, execution time limits, and split logic requirements, Kolme apps run entirely on-chain with no such restrictions. Rust’s deterministic execution ensures that transactions produce consistent, reproducible results across all nodes, critical for maintaining blockchain integrity. Developers can leverage the full Rust ecosystem—including libraries and server-side tools—to build complex features, such as real-time trading algorithms, payout calculations for betting apps, or automated market makers (AMMs) for cross-chain swaps. This flexibility accelerates development, enhances security, and supports rapid prototyping, addressing concerns about the complexity of blockchain programming.

Efficient State Management

Kolme manages application and system state efficiently using a MerkleMap, a base-16 balanced tree optimized for large datasets (hundreds of MBs or GBs). This structure supports fast hashing, cloning, and updates, ensuring scalability without compromising performance.

Framework State: Represents the cumulative system state from all transactions since the genesis block, including account balances, validator sets, nonces (for transaction ordering), admin proposals (e.g., upgrades), and the chain_version (the code version used by the chain). The framework state is stored in a MerkleMap, and only its hash is included in each block, because embedding the full state would exceed storage and network limits.
App State: Arbitrary data defined by the application, also stored in a MerkleMap, allowing developers to manage custom data structures (e.g., trade histories, betting outcomes) tailored to their app’s needs.

The MerkleMap’s design ensures efficient state updates and verification, enabling nodes to validate processor outputs quickly. Note that the framework state is not just a block number but a comprehensive snapshot of the system, crucial for developers building robust apps.
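As a rough illustration of what "base-16" means for a tree, the sketch below splits a key into 4-bit digits, so each digit selects one of 16 children at a given depth. It assumes keys are routed by their nibbles; the helper names are hypothetical and the actual MerkleMap internals may differ.

```rust
/// Split a key into base-16 digits (nibbles), high nibble first.
/// Hypothetical helper: it shows how a base-16 tree could route a key,
/// not how Kolme's MerkleMap is actually implemented.
fn key_to_nibbles(key: &[u8]) -> Vec<u8> {
    key.iter().flat_map(|&b| [b >> 4, b & 0x0f]).collect()
}

/// Number of leading digits two keys share, i.e. how many levels they
/// would travel together before branching in such a tree.
fn shared_prefix_len(a: &[u8], b: &[u8]) -> usize {
    key_to_nibbles(a)
        .iter()
        .zip(key_to_nibbles(b).iter())
        .take_while(|(x, y)| x == y)
        .count()
}
```

Keys that share a long prefix share the same upper nodes, which is one reason an update only needs to rehash the path from the changed leaf to the root.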

Multichain Integration

Kolme seamlessly integrates with multiple blockchains, including Solana, Ethereum, and Near, allowing apps to support users across ecosystems. Bridge contracts on external chains handle fund deposits and withdrawals, validated by listeners and approvers, ensuring secure, transparent interactions. Kolme’s design makes it easy to add new chains (e.g., Aptos, Cosmos) without rewriting apps, addressing concerns about choosing the “wrong” chain initially. For example, a cross-chain swap app can start with Solana and Ethereum support and later add Near with minimal changes, all while maintaining on-chain security and transparency. This flexibility reduces technical debt and enhances scalability, making Kolme attractive for developers building multichain apps and investors seeking adaptable platforms.

External Chain Resilience

Kolme’s architecture minimizes reliance on external blockchains, ensuring apps remain operational during disruptions. External chains are only needed for payment on-ramps and off-ramps (e.g., depositing or withdrawing funds), meaning downtime or congestion on Solana, Ethereum, or Near won’t interrupt core app operations like transaction processing or state updates. For instance, a trading app continues executing orders even if Ethereum faces delays. Additionally, Kolme simplifies future chain migrations compared to other systems. If you need to switch or add chains (e.g., from Solana to Cosmos), Kolme’s bridge contract system allows seamless transitions without extensive reengineering, preserving the security and transparency of on-chain logic. This resilience and adaptability make Kolme a reliable choice for developers and founders building high-availability apps.

Key Features

Kolme’s technical foundation includes several features that empower developers:

No Gas Costs or Execution Limits: Build apps without worrying about transaction fees or compute constraints, enabling complex, cost-free logic.
No Scheduled Block Times: Process transactions on-demand, optimizing performance for your app’s needs.
Public, Verifiable Chains: Ensure transparency with all actions recorded on-chain, verifiable by any stakeholder.
High Availability: Run validators in robust clusters to guarantee uptime, with no risk of forks.
Secure Data Integration: Fetch data from any source (Pyth, Chainlink, custom APIs) and store it on-chain for verification, bypassing oracle delays.
Web3 Wallet Support: Integrate with Solana, Ethereum, and Near wallets, using ephemeral keys validated by wallets for secure, user-friendly interactions.

These features address common developer concerns, such as performance, cost, and complexity, making Kolme an accessible platform for building innovative decentralized applications.

Core Chain Mechanics

Blocks

Kolme’s blockchain is an append-only chain of blocks, each containing exactly one transaction and signed by the processor. Unlike traditional blockchains with fixed block times (e.g., Ethereum’s 12-second slots), Kolme blocks are produced on-demand, optimizing performance for each app’s needs and eliminating delays from scheduled intervals.

Structure:
- Metadata: Includes timestamp, block height, and previous block hash, ensuring chain continuity and ordering.
- Transaction: The single transaction processed in the block, containing one or more messages (see Transactions).
- Data Loads: External data fetched during execution (e.g., Pyth price feeds, sports odds), stored with cryptographic signatures for verification.
- Logs: Execution outputs, such as trade confirmations or error messages, for debugging and auditing.
- State Hashes: Hashes of the updated framework and application state, ensuring efficient validation without storing full state in blocks.
Purpose: Blocks record state transitions, enabling other nodes to validate the processor’s outputs and maintain a consistent, transparent chain. The on-demand production supports high-performance apps, addressing concerns about congestion or delays.
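The block structure above can be sketched as a plain Rust struct, together with the continuity check that the metadata makes possible. All field names, types, and the demo_block helper are assumptions made for illustration; they are not Kolme's actual definitions.

```rust
type Hash32 = [u8; 32];

/// Illustrative block layout mirroring the structure list above.
#[allow(dead_code)]
#[derive(Debug, Clone)]
struct Block {
    height: u64,
    timestamp_ms: u64,
    parent_hash: Hash32,
    block_hash: Hash32, // stand-in; in practice derived from the block contents
    transaction: Vec<u8>,     // the single signed transaction, serialized
    data_loads: Vec<Vec<u8>>, // external data recorded during execution
    logs: Vec<String>,
    framework_state_hash: Hash32,
    app_state_hash: Hash32,
}

#[derive(Debug, PartialEq)]
enum ContinuityError {
    UnexpectedHeight { expected: u64, found: u64 },
    ParentMismatch,
}

/// Check that `next` directly extends `prev`: height grows by one and the
/// recorded parent hash matches the previous block's hash.
fn check_continuity(prev: &Block, next: &Block) -> Result<(), ContinuityError> {
    let expected = prev.height + 1;
    if next.height != expected {
        return Err(ContinuityError::UnexpectedHeight { expected, found: next.height });
    }
    if next.parent_hash != prev.block_hash {
        return Err(ContinuityError::ParentMismatch);
    }
    Ok(())
}

/// Demo-only constructor that fills the remaining fields with empty values.
fn demo_block(height: u64, parent_hash: Hash32, block_hash: Hash32) -> Block {
    Block {
        height,
        timestamp_ms: 0,
        parent_hash,
        block_hash,
        transaction: Vec::new(),
        data_loads: Vec::new(),
        logs: Vec::new(),
        framework_state_hash: [0; 32],
        app_state_hash: [0; 32],
    }
}
```

A validating node would run a check like this before re-executing the transaction and comparing state hashes.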

Transactions

Transactions are the core units of action in Kolme: signed instructions submitted to the blockchain for processing. A transaction is not the same thing as a message, which is a common point of confusion for developers: the transaction is the signed envelope, and messages are the individual actions it carries.

Definition:
- A transaction is a signed package containing:
  - Messages: One or more individual actions, such as placing a trade, proposing a software upgrade, or transferring funds. Messages are app-specific (e.g., execute a swap), administrative (e.g., rotate a validator key), or related to fund transfers (e.g., deposit via bridge contract).
  - Nonce: A unique sequence number per account, ensuring transaction ordering and preventing replayed or duplicate transactions.
  - Signature: Cryptographic signature from the account’s private key, verifying authenticity.
- Transactions are broadcast to the mempool via libp2p, where the processor selects them for inclusion in blocks.
Nonces:
- Stored in the framework state’s account mapping (within a MerkleMap), which maps account IDs to account information, including the next expected nonce.
- Updated atomically during transaction execution, ensuring correctness without scanning blockchain history. For example, if an account’s next nonce is 5, a transaction with nonce 6 is rejected, triggering a failure notification.
- This addresses concerns about transaction ordering, as nonces provide a clear, efficient mechanism to prevent out-of-order or duplicate submissions.
Validation:
- The processor checks the signature, nonce, and message validity before execution.
- Invalid transactions (e.g., wrong nonce, malformed messages) are dropped, with signed notifications broadcast via libp2p to inform users, as detailed in failed transactions.
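The nonce rule described above can be sketched in a few lines. This is a minimal illustration, assuming that an account with no entry starts at nonce 0 (the text does not say) and using hypothetical type and method names; signature and message checks are omitted.

```rust
use std::collections::BTreeMap;

type AccountId = u64;

#[derive(Debug, PartialEq)]
enum NonceError {
    Mismatch { expected: u64, found: u64 },
}

/// Stand-in for the account mapping in the framework state.
#[derive(Default)]
struct Accounts {
    next_nonce: BTreeMap<AccountId, u64>,
}

impl Accounts {
    /// Accept the transaction only if it carries the next expected nonce,
    /// then advance the nonce. State is left untouched on rejection.
    fn check_and_bump(&mut self, account: AccountId, nonce: u64) -> Result<(), NonceError> {
        // Assumption: unseen accounts start at nonce 0.
        let expected = self.next_nonce.get(&account).copied().unwrap_or(0);
        if nonce != expected {
            return Err(NonceError::Mismatch { expected, found: nonce });
        }
        self.next_nonce.insert(account, expected + 1);
        Ok(())
    }
}
```

Because the check reads a single map entry, it never needs to scan earlier blocks, which matches the "without scanning blockchain history" point above.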

Processor Role

The processor is a critical component of Kolme’s architecture, responsible for executing transactions and producing blocks. Its design ensures high availability and reliability while addressing concerns about centralized block production.

Function:
- Validation: Checks each transaction’s signature, nonce, and message validity to ensure it can be executed.
- Execution: Runs the transaction’s messages using deterministic Rust code, updating the framework and application state.
- Block Production: Creates a signed block containing the transaction, metadata, data loads, logs, and state hashes, then broadcasts it via libp2p.
- State Updates: Applies changes to the MerkleMap-based framework and app state, ensuring consistency across nodes.
Sole Block Producer:
- Only the processor produces signed blocks, which prevents multiple nodes from creating conflicting blocks that could fork the chain. Multiple processor executables can still run in a high-availability cluster, because a PostgreSQL advisory lock (construct lock) ensures that only one of them is active at a time.
- This addresses confusion about block production, as other nodes validate blocks but cannot produce them, maintaining decentralization through validation.
Validation by Other Nodes:
- Non-processor nodes (e.g., listeners, approvers, or community nodes) execute transactions locally to verify the processor’s blocks, checking state hashes and execution results.
- This ensures the processor’s outputs are trustworthy, with planned watchdog nodes monitoring for discrepancies, as described in watchdogs.
High Availability:
- The processor runs in a cluster of three nodes across availability zones, with the construct lock coordinating leadership. If the active processor fails, another node takes over, keeping downtime to a minimum, as detailed in high availability.
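The "one active processor at a time" behavior can be mimicked in-process with a non-blocking mutex. This is only a stand-in for the construct lock: the real lock lives in PostgreSQL (which offers non-blocking advisory locks through pg_try_advisory_lock), and whether Kolme uses that exact call, as well as the type and method names below, are assumptions.

```rust
use std::sync::{Mutex, MutexGuard};

/// In-process stand-in for the construct lock shared by processor replicas.
struct ConstructLock {
    inner: Mutex<()>,
}

impl ConstructLock {
    fn new() -> Self {
        Self { inner: Mutex::new(()) }
    }

    /// Non-blocking attempt to become the active processor.
    /// Returns a guard while leadership is held, or None if another
    /// replica currently holds the lock.
    fn try_acquire(&self) -> Option<MutexGuard<'_, ()>> {
        self.inner.try_lock().ok()
    }
}
```

A standby replica would retry try_acquire periodically; once the active holder releases the lock (or its session ends, in the database case), the next attempt succeeds and that replica starts producing blocks.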

Framework State

The framework state is the cumulative system state resulting from all transactions since the genesis block, providing a verifiable snapshot of Kolme’s operations.

Components:
- Account Balances: Tracks funds for each account ID, updated during deposits, withdrawals, or app-specific actions.
- Validator Sets: Lists keys for the processor, listeners (with quorum), and approvers (with quorum), modified via key rotations or upgrades.
- Nonces: Maps account IDs to the next expected nonce, ensuring transaction ordering.
- Admin Proposal State: Stores pending proposals (e.g., upgrades, key set changes) in a MerkleMap, supporting multiple proposals with first-come, first-served approval.
- Chain Version (chain_version): Tracks the code version used by the chain (e.g., v1), stored in the MerkleMap and genesis block, ensuring reproducibility during upgrades, as described in version upgrades.
Storage:
- Held in a MerkleMap, a base-16 balanced tree optimized for efficient hashing, cloning, and updates, capable of handling large datasets (hundreds of MBs or GBs).
- Only state hashes are stored in blocks, as full state inclusion would exceed storage and network limits, addressing scalability concerns.
Role:
- Provides a single source of truth for the system, enabling nodes to validate processor outputs and maintain consistency.
- The framework state is not just a block number but a comprehensive snapshot of the system, crucial for developers building robust apps.

App State

The app state is the application-specific data defined by the developer, stored separately from the framework state but also in a MerkleMap.

Structure: Arbitrary data structures tailored to the app’s needs, such as trade histories for a swaps platform, bet outcomes for a betting app, or liquidity pools for an AMM.
Storage: Held in a MerkleMap for efficient updates and hashing, with state hashes included in blocks alongside framework state hashes.
Updates: Modified during transaction execution based on the app’s Rust logic, ensuring deterministic results across nodes.
Purpose: Allows developers to manage custom data flexibly, supporting complex app requirements without impacting system state. This separation clarifies concerns about state management, as app state is isolated but validated similarly to framework state.
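To make the idea of deterministic app state concrete, the sketch below defines a tiny betting state, applies messages to it, and derives a state hash. Everything here is hypothetical: the AppState and Message types are invented for the example, and std's DefaultHasher is a non-cryptographic stand-in for whatever hash the MerkleMap really uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hypothetical app state for a betting app: open stake per user.
/// BTreeMap iterates in key order, so the hash does not depend on
/// insertion order.
#[derive(Debug, Clone, Default, PartialEq, Hash)]
struct AppState {
    open_stakes: BTreeMap<String, u64>,
}

/// Hypothetical app-specific messages.
#[derive(Debug, Clone)]
enum Message {
    PlaceBet { user: String, stake: u64 },
    CancelBet { user: String },
}

impl AppState {
    /// Apply one message; the result depends only on the current state
    /// and the message, never on wall-clock time or randomness.
    fn apply(&mut self, msg: &Message) -> Result<(), String> {
        match msg {
            Message::PlaceBet { user, stake } => {
                let current = self.open_stakes.get(user).copied().unwrap_or(0);
                let updated = current
                    .checked_add(*stake)
                    .ok_or_else(|| "stake overflow".to_string())?;
                self.open_stakes.insert(user.clone(), updated);
                Ok(())
            }
            Message::CancelBet { user } => self
                .open_stakes
                .remove(user)
                .map(|_| ())
                .ok_or_else(|| format!("no open bet for {}", user)),
        }
    }

    /// Stand-in state hash (not cryptographic).
    fn state_hash(&self) -> u64 {
        let mut hasher = DefaultHasher::new();
        self.hash(&mut hasher);
        hasher.finish()
    }
}
```

Two nodes that start from the same state and apply the same messages end with the same hash, which is what lets a validating node compare its own result against the hash recorded in a block.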

Multichain Context

Kolme’s chain mechanics support multichain integration with blockchains like Solana, Ethereum, and Near, enabling apps to serve users across ecosystems. Transactions involving external chains (e.g., deposits, withdrawals) are processed via bridge contracts, validated by listeners and approvers, as detailed in external chain resilience. Kolme’s design allows developers to add new chains (e.g., Aptos, Cosmos) without rewriting apps, ensuring future-proof scalability. The framework state tracks bridge-related data (e.g., validator sets for external chains), while the processor handles on-chain execution, maintaining security and transparency across multichain operations.

Key Features

Kolme’s core mechanics enable several developer-friendly features:

Instant Transaction Processing: On-demand block production eliminates delays, ideal for high-performance apps.
No Gas Costs or Execution Limits: Process complex logic without fees or constraints, leveraging Rust’s capabilities.
Deterministic Execution: Ensures consistent results across nodes, critical for blockchain integrity.
Scalable State Management: MerkleMap and state hashes support large datasets without storage/network bloat.
Transparent Validation: Public blocks and state hashes enable node verification, with watchdogs enhancing trust.
Multichain Flexibility: Seamless integration with external chains, with easy migration to new chains.

These features address common developer concerns, such as performance, cost, and multichain complexity, making Kolme a robust platform for building decentralized apps.

Triadic Security Model

Validator Groups

Kolme’s triadic security model distributes responsibilities across three validator groups—listeners, a processor, and approvers—forming a checks-and-balances system that ensures trust, transparency, and resilience, rivaling the security of blockchains like Solana, Ethereum, and Near while avoiding centralized risks. Each group has distinct roles in maintaining the integrity of the Kolme chain and its multichain integrations.

Processor

Role: Executes transactions and produces signed blocks for the application’s dedicated blockchain. Validates transaction signatures, nonces, and message validity, runs deterministic Rust code, updates framework and application state, and broadcasts blocks via libp2p, as detailed in Core Chain Mechanics.
Key Characteristics:
- Uses a single private key for block production, ensuring clarity and consistency.
- Runs in a high-availability cluster of three nodes across availability zones, coordinated by a PostgreSQL advisory lock (construct lock) to prevent forks, as described in High Availability.
- Sole block producer to avoid conflicting blocks, addressing concerns about forks. Other nodes validate blocks but cannot produce them, preserving decentralization through verification.
Security Contribution: Centralizes block production for efficiency while relying on listeners and approvers for external validations, balancing performance and security.

Listeners

Role: Monitor bridge contracts on external blockchains (Solana, Ethereum, Near) for events like deposits, public key associations, or withdrawal requests. Submit signed confirmations to the processor when a quorum agrees, for inclusion in the Kolme chain.
Key Characteristics:
- Operate as a multi-signature group with a configurable quorum (e.g., 2-of-3 signatures), ensuring resilience to node failures.
- Can run as single instances or replicas, with quorum-based design tolerating downtime without disrupting chain operations, per High Availability.
- Downtime delays deposit confirmations but not core app functionality, reinforcing external chain resilience, as noted in External Chain Resilience.
Security Contribution: Provide decentralized validation of external events, preventing fraudulent bridge actions and ensuring trust.
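A configurable listener quorum such as 2-of-3 can be sketched as a count of distinct, recognized signers. The example assumes signatures have already been verified and reduces each signer to a key label; ListenerSet and has_quorum are hypothetical names.

```rust
use std::collections::BTreeSet;

/// Key label standing in for a real public key.
type PublicKey = &'static str;

/// Hypothetical listener set with a configurable quorum.
struct ListenerSet {
    members: BTreeSet<PublicKey>,
    quorum: usize,
}

impl ListenerSet {
    /// True when enough distinct members have signed.
    /// Duplicate signatures and keys outside the set are ignored.
    fn has_quorum(&self, signers: &[PublicKey]) -> bool {
        let distinct: BTreeSet<&PublicKey> = signers
            .iter()
            .filter(|key| self.members.contains(*key))
            .collect();
        distinct.len() >= self.quorum
    }
}
```

With three members and a quorum of two, one listener can be offline without blocking deposit confirmations, which is the resilience property described above.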

Approvers

Role: Validate outgoing actions, such as withdrawals to external blockchains, by signing off on bridge contract transactions, ensuring legitimacy and sufficient funds.
Key Characteristics:
- Multi-signature group with configurable quorum, enhancing fault tolerance.
- Single or replicated nodes, with downtime delaying withdrawals but not app operations, per High Availability.
- Collaborate with listeners to secure external fund movements.
Security Contribution: Act as gatekeepers for external actions, protecting against unauthorized withdrawals and maintaining user trust.

Quorum-Based Approval

Administrative tasks, such as software upgrades, validator key rotations, or validator set changes, use a quorum-based approval process for flexibility and robustness.

Mechanism:
- Requires agreement from at least two of the three validator groups (processor, listeners, approvers). All three can propose and approve actions, addressing concerns about quorum flexibility and preventing single-group dominance.
- Proposals are stored in the framework state’s admin proposal state (a MerkleMap mapping proposal IDs to details), supporting multiple proposals with first-come, first-served approval, as detailed in Core Chain Mechanics.
- Approvals update the framework state and, for external actions (e.g., validator set changes), emit atomic bridge contract updates, per External Chain Resilience.
Examples:
- Upgrades: A listener proposes a new chain_version (e.g., v2). Once the processor and approvers approve, the framework state is updated, as described in Version Upgrades.
- Key Rotations: An approver proposes a validator set change. Once the listeners and processor approve, the bridge contracts are updated atomically, per Key Rotation.
Security Benefit: Quorum-based approval ensures no single group can unilaterally alter the chain, while flexibility allows rapid adaptation, supporting multichain operations and chain migrations (e.g., adding Aptos, Cosmos) without compromising security.
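The two-of-three rule across validator groups can be sketched as follows. The sketch treats each group as a single vote (in practice listeners and approvers first reach their own internal quorum) and does not decide whether a proposer's own group counts automatically; the type names are hypothetical.

```rust
use std::collections::BTreeSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum ValidatorGroup {
    Processor,
    Listeners,
    Approvers,
}

/// Hypothetical admin proposal tracking which groups have approved it.
#[derive(Debug)]
struct AdminProposal {
    description: String,
    approvals: BTreeSet<ValidatorGroup>,
}

impl AdminProposal {
    fn new(description: &str) -> Self {
        Self { description: description.to_string(), approvals: BTreeSet::new() }
    }

    /// Record an approval and report whether the proposal now passes.
    /// Repeated approvals from the same group count once.
    fn approve(&mut self, group: ValidatorGroup) -> bool {
        self.approvals.insert(group);
        self.is_approved()
    }

    /// At least two of the three groups must agree.
    fn is_approved(&self) -> bool {
        self.approvals.len() >= 2
    }
}
```

Because approvals are stored as a set of groups, no single group can pass a proposal by approving it repeatedly.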

Security Features

Kolme’s triadic model incorporates several features to enhance security and transparency:

Transparent Chains: All transactions, including user actions (e.g., trades, bets) and administrative messages (e.g., upgrades, key rotations), are recorded on the public blockchain, verifiable by any node, developer, or auditor, ensuring trust across ecosystems like Solana, Ethereum, and Near.
Decentralized Validation: Non-processor nodes (listeners, approvers, community nodes) validate blocks by re-executing transactions and checking state hashes, confirming the processor’s integrity without producing blocks.
Quorum Resilience: The two-of-three quorum tolerates downtime or compromise of one group, maintaining chain operations and external integrations, critical for high-availability apps.
Planned Watchdogs: Future watchdog nodes will monitor processor behavior, verifying failed transaction notifications, state hash accuracy, and quorum compliance, detecting issues like censorship or state manipulation, as outlined in Watchdogs.
Multichain Security: Bridge contracts secure fund movements across external chains, with listeners and approvers validating events, ensuring consistency and trust. Kolme’s design simplifies adding new chains (e.g., Cosmos) while preserving on-chain security, per External Chain Resilience.

These features address developer concerns about centralized risks, validator reliability, and multichain integrity, making Kolme a secure platform for building decentralized applications.

External Data Handling

The problem with oracles

In standard blockchain architecture, smart contracts are unable to directly load any data from outside the blockchain. As a result, any data that a smart contract will use must be provided on-chain. The standard solution for this is oracles: smart contracts or other mechanisms that bridge external data to the chain. This can be used for providing such information as:

Pricing for futures markets
Real-world results for prediction markets
Live odds for gambling applications

While oracles allow such applications to work, they introduce a number of complications:

You're limited in what data you can pull into your application. By contrast, a non-blockchain application can pull in data from any source it can access.
Oracles can be a source of security holes, specifically around the robustness of their upgrade mechanism, and censorship attacks through congestion of the underlying chain.
Updates to oracles on many blockchains represent a significant gas cost to the operators.

Kolme takes a different approach to loading external data.

Data Fetching Mechanism

Kolme allows an application to load up any piece of data from any source, just like non-blockchain applications. Normally, this would defeat reproducibility and transparency. However, Kolme's approach requires that all external data loads be logged in the block itself when produced by the processor.

As an example, consider the pricing information from the Pyth Network, which includes cryptographic Wormhole signatures attesting to the validity of the data. A Kolme application can pull in these attestations. The Kolme framework will automatically include the full attestation in the block itself, and any node in the network will be able to rerun the block, validating that the cryptographic signatures match and that the output of running the transaction is identical.

With this approach, a Kolme application can automatically leverage existing oracle data on any existing blockchain. But additionally, a Kolme application can pull in data from any other source. While ideally this data should be verifiable--either via querying an external source or validating signatures--Kolme leaves the application developers the latitude to choose their own security and trust models.

As an example, some applications may require access to proprietary data sources that cannot be validated by external parties. Kolme is unopinionated about this. It is the decision of each application how its trust model works, and users of an application can make informed decisions about how much trust to place in the application authors and validators.

Process

During transaction execution, the processor fetches data via HTTP requests to specified endpoints (e.g., Pyth’s price feed API, a custom sports odds service), as directed by the application’s Rust code. Because an HTTP response can change between calls, execution stays reproducible only because the fetched response is recorded in the block, per Core Chain Mechanics.
Fetched data is included in the block’s data load field, alongside cryptographic signatures or other verification metadata (e.g., checksums) to ensure authenticity. Secrets such as API keys must not be written to the block, since every block is public.
Data loads are recorded in blocks with the transaction, metadata, logs, and state hashes, ensuring all nodes can access and validate the data during block verification.

Validation

To maintain security and trust, Kolme records external data in the block so that every node can check it, either by verifying signatures or by re-fetching the data, within the trust model the application has chosen.

Non-processor nodes (listeners, approvers, community nodes) validate data loads during block verification by checking cryptographic signatures or re-fetching data from the same endpoint, as outlined in Triadic Security Model.
If signatures are invalid or re-fetched data differs, nodes reject the block, flagging potential processor errors or tampering, with planned watchdogs enhancing detection, per Watchdogs.
By rejecting new blocks, validators are able to stop fund transfers from being approved, protecting user funds.
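
The re-fetch path described above can be sketched as follows. This is a minimal illustration, not Kolme's actual API: the `DataLoad` and `Verdict` types and the `validate_data_load` function are hypothetical names, and the `refetch` closure stands in for whatever HTTP client or signature check an application would really use.

```rust
/// Hypothetical record of one external data load stored in a block.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DataLoad {
    /// Description of the request (for example, the URL that was fetched).
    pub request: String,
    /// Raw response bytes that the processor recorded in the block.
    pub response: Vec<u8>,
}

/// Outcome of validating a data load (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Accept,
    Reject(String),
}

/// Validate a recorded data load by fetching the same request again and
/// comparing the bytes. `refetch` is an assumption standing in for a real
/// HTTP client; a signature check could be plugged in the same way.
pub fn validate_data_load<F>(load: &DataLoad, refetch: F) -> Verdict
where
    F: Fn(&str) -> Result<Vec<u8>, String>,
{
    match refetch(&load.request) {
        Ok(fresh) if fresh == load.response => Verdict::Accept,
        Ok(_) => Verdict::Reject(format!("data mismatch for {}", load.request)),
        Err(e) => Verdict::Reject(format!("could not re-fetch {}: {}", load.request, e)),
    }
}
```

A node that receives `Verdict::Reject` for any data load in a block would refuse the block as a whole, which is what stops fund transfers from being approved.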

Advantages

Kolme’s external data handling offers significant benefits over traditional blockchain platforms, addressing developer pain points:

No Oracle Delays: Unlike Ethereum or Solana, which rely on oracles like Chainlink that introduce latency and potential failures, Kolme’s direct HTTP fetching delivers near-instant data access, critical for time-sensitive apps like trading or betting.
Flexible Data Sources: Supports any data provider—public oracles (Pyth, Chainlink), proprietary APIs, or custom feeds—without requiring specialized oracle contracts, enabling apps like sports betting with unique data needs.
On-Chain Transparency: Data loads and signatures are stored in blocks, verifiable by all nodes, unlike off-chain oracle systems that obscure data provenance, enhancing trust for users and auditors.
Deterministic Execution: Ensures consistent data across nodes, maintaining blockchain integrity, as detailed in Core Chain Mechanics.
Multichain Scalability: Easily integrates data for new chains (e.g., Aptos) without app rewrites, supporting Kolme’s extensible multichain design, per External Chain Resilience.
No Gas Costs: Data fetching incurs no fees, unlike gas-heavy oracle calls on shared blockchains, allowing cost-free, complex data operations.

These advantages make Kolme a developer-friendly platform for building data-driven, multichain applications with security and performance.

Implementation Considerations

Developers integrating external data into Kolme apps should consider:

Source Reliability: Choose stable, high-availability data providers (e.g., Pyth for price feeds) to minimize fetch failures, which trigger transaction failures and notifications, per Failed Transactions.
Signature Strategy: Use signed data where possible to reduce re-fetching overhead, balancing performance and security. For unsigned data, ensure endpoints are consistent to avoid validation failures.
Caching: Implement app-level caching in Rust logic for frequently accessed data (e.g., stable sports odds), reducing external requests while maintaining determinism.
Multichain Data: Design data fetching to support multiple chains’ ecosystems, leveraging Kolme’s flexibility to add new sources as chains are integrated, per External Chain Resilience.
Error Handling: Handle fetch failures gracefully in Rust code, using Kolme’s failure notifications to inform users, ensuring a robust user experience.

These considerations ensure efficient, secure data integration, aligning with Kolme’s high-performance, transparent architecture.

High Availability

Processor High Availability

Kolme’s processor, responsible for transaction execution and block production, is designed for maximum uptime through a robust, fault-tolerant cluster configuration, ensuring continuous operation for applications like trading platforms or cross-chain swaps.

Cluster Setup:
- Runs as a cluster of three nodes deployed across different availability zones to mitigate regional outages, as detailed in Core Chain Mechanics.
- A PostgreSQL advisory lock, known as the construct lock, ensures only one processor node produces signed blocks at a time, preventing forks by coordinating leadership exclusively for the processor, as it’s the only component requiring a single active process.
Failover:
- If the active processor node fails (e.g., due to hardware issues or network disruptions), the construct lock is released, and another node in the cluster assumes leadership, achieving zero downtime.
- Hot standbys are always ready, with state synchronized via shared storage (e.g., Fjall volumes) or fast sync, per Node Synchronization.
Security:
- The construct lock ensures a single block producer, addressing concerns about fork risks, while other nodes validate blocks, per Triadic Security Model.
- High availability supports multichain operations (Solana, Ethereum, Near), with no impact on core chain functionality during external disruptions, as noted in External Chain Resilience.
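
To make the failover behavior concrete, here is a small in-memory model of the construct lock's semantics. It is only a sketch: the real lock is a PostgreSQL advisory lock, while the `ConstructLock` type and its methods below are hypothetical and exist only to show that at most one node can be the active block producer at a time.

```rust
/// In-memory stand-in for the construct lock. In a real deployment this role
/// is played by a PostgreSQL advisory lock; this type only models the idea.
#[derive(Debug, Default)]
pub struct ConstructLock {
    holder: Option<String>,
}

impl ConstructLock {
    /// Try to become the active block producer. Succeeds if nobody holds the
    /// lock yet, or if the caller already holds it.
    pub fn try_acquire(&mut self, node: &str) -> bool {
        if self.holder.is_none() {
            self.holder = Some(node.to_string());
        }
        self.holder.as_deref() == Some(node)
    }

    /// Release the lock, for example when the holder fails and its database
    /// session ends. Calls from nodes that do not hold the lock are ignored.
    pub fn release(&mut self, node: &str) {
        if self.holder.as_deref() == Some(node) {
            self.holder = None;
        }
    }

    /// The node currently allowed to produce blocks, if any.
    pub fn holder(&self) -> Option<&str> {
        self.holder.as_deref()
    }
}
```

In this model a standby keeps calling `try_acquire` and only starts producing blocks once the call returns `true`, which mirrors the hot-standby behavior described above.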

Listeners and Approvers

Listeners and approvers, critical for validating external chain events (e.g., deposits, withdrawals), are designed for resilience, ensuring Kolme apps remain operational even during partial validator downtime.

Deployment Options:
- Can run as single instances, relying on quorum-based resilience (e.g., 2-of-3 signatures), or as replicated nodes for enhanced uptime, supporting ecosystems like Solana, Ethereum, and Near.
- Quorum configurations tolerate individual node failures, ensuring external event validation continues, as described in Triadic Security Model.
Impact of Downtime:
- Listeners: Downtime delays deposit confirmations from external chains but does not affect core app operations (e.g., transaction processing, state updates), reinforcing external chain resilience, per External Chain Resilience.
- Approvers: Downtime delays withdrawal authorizations but allows in-app transactions to proceed uninterrupted, critical for user experience in high-traffic apps.
Scalability:
- Replicated setups scale with demand, supporting multichain apps that add new chains (e.g., Aptos, Cosmos) without reengineering, aligning with Kolme’s flexible design.

Other Services

Kolme’s supporting services—API servers and indexers—are architected for high availability, ensuring robust access and data processing for applications.

API Servers:
- Scale horizontally behind load balancers, handling ephemeral queries for user interfaces or external integrations (e.g., Web3 wallet interactions).
- Use persistent storage (e.g., PostgreSQL) for critical data or ephemeral storage for transient queries, ensuring fault tolerance.
- Support multichain user access (Solana, Ethereum, Near) with no downtime impact from external chain issues, per Wallets and Keys.
Indexers:
- Process blockchain data in batches, storing results in shared databases (e.g., PostgreSQL) for high availability and efficient querying.
- Scale horizontally to handle large transaction volumes, supporting apps like trading or betting with heavy data needs.
- Maintain consistency across multichain operations, enabling seamless chain migrations, as noted in External Chain Resilience.
Resilience:
- Load balancers and shared storage ensure service continuity during node failures, with rapid recovery via persistent volumes or fast sync.

Startup and Recovery

Kolme’s high-availability design extends to node startup and recovery, minimizing downtime and ensuring rapid deployment for new or restarting nodes.

Persistent Storage:
- Nodes use Fjall volumes to retain recent blocks and state (framework and app), reducing sync time for restarts, as detailed in Storage.
- Persistent storage ensures quick recovery after failures, maintaining app availability.
Fast Sync:
- New or ephemeral nodes use fast sync to load the full framework and application state, signed by the processor, enabling rapid onboarding, per Node Synchronization.
- Fast sync supports multichain apps by quickly aligning with external chain states (e.g., bridge contract events), ensuring no disruption during chain migrations.
Recovery:
- Failed nodes restart with Fjall volumes or fast sync, with HA clusters ensuring other nodes handle operations during recovery.
- Planned watchdogs will monitor recovery processes, detecting delays or errors, per Watchdogs.

Advantages

Kolme’s high-availability design offers several benefits for developers building robust, multichain applications:

Zero Downtime: Processor clusters and quorum-based validators ensure continuous operation, critical for apps like trading or betting, unlike shared blockchains prone to congestion delays.
Fault Tolerance: HA clusters, replicated services, and persistent storage tolerate failures without disrupting app functionality, supporting Solana, Ethereum, Near integrations.
Scalability: Horizontal scaling of API servers and indexers handles high transaction volumes, with flexible chain migration (e.g., adding Cosmos) maintaining uptime, per External Chain Resilience.
Rapid Recovery: Persistent volumes and fast sync minimize startup time, ensuring quick restoration after failures, unlike slower sync methods on traditional platforms.
Multichain Resilience: External chain downtime affects only deposits/withdrawals, not core operations, with easy chain additions enhancing adaptability.
Transparency: HA operations are recorded on-chain, verifiable by nodes, ensuring trust, as described in Triadic Security Model.

These advantages make Kolme a reliable platform for building high-performance, fault-tolerant decentralized applications.

Node Synchronization

Slow Sync

Slow sync is Kolme’s most trustless synchronization method, ideal for nodes requiring maximum security, such as validators or auditors verifying applications like trading platforms or cross-chain swaps across ecosystems like Solana, Ethereum, and Near.

Process:
- Nodes process every block sequentially from the genesis block, validating each transaction, data load, and state transition, as detailed in Core Chain Mechanics.
- Validates signatures, nonces, external data (e.g., Pyth price feeds), and state hashes against the processor’s outputs, per External Data Handling.
Bandwidth Usage:
- Heavy during initial load or catching up, as it downloads and verifies all block data, including transactions, metadata, logs, and data loads.
- Light after catching up, as nodes only download new blocks, which exclude state data (only state hashes are included), reducing bandwidth needs significantly.
Characteristics:
- Trustless, relying solely on cryptographic verification without assuming processor integrity, addressing concerns about centralized trust.
Use Cases:
- Suitable for high-security nodes (e.g., listeners, approvers) ensuring chain integrity, per Triadic Security Model.
- Supports multichain apps by validating bridge contract events, ensuring consistency across Solana, Ethereum, Near, or new chains like Aptos, per External Chain Resilience.
Performance:
- Slower during initial sync due to comprehensive validation, but optimized for ongoing operation with lightweight block downloads after catching up.
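
The replay loop at the heart of slow sync can be illustrated with a toy chain. Everything here is an assumption made for the example: the `Block` layout, the single-counter application state, and the FNV-1a hash (used only as a simple deterministic stand-in for a real cryptographic hash).

```rust
/// Tiny FNV-1a hash, a stand-in for a real cryptographic state hash.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for b in bytes {
        hash ^= *b as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Hypothetical block: one transaction (an amount to add to the state) plus
/// the state hash claimed by the processor.
pub struct Block {
    pub height: u64,
    pub add: u64,
    pub state_hash: u64,
}

/// Toy state transition: the app state is a single running total.
fn apply(state: u64, block: &Block) -> u64 {
    state.wrapping_add(block.add)
}

/// Build a correctly hashed block on top of `prev_state` (the processor side).
pub fn produce(height: u64, prev_state: u64, add: u64) -> Block {
    let state = prev_state.wrapping_add(add);
    Block { height, add, state_hash: fnv1a(&state.to_le_bytes()) }
}

/// Replay every block from genesis, checking heights and state hashes.
/// Returns the final state, or the height of the first block that fails.
pub fn slow_sync(blocks: &[Block]) -> Result<u64, u64> {
    let mut state = 0u64;
    for (expected_height, block) in blocks.iter().enumerate() {
        if block.height != expected_height as u64 {
            return Err(block.height);
        }
        state = apply(state, block);
        if fnv1a(&state.to_le_bytes()) != block.state_hash {
            return Err(block.height);
        }
    }
    Ok(state)
}
```

The point of the sketch is that the node never trusts the claimed hash: it recomputes the state itself and only compares the result against what the block says.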

Fast Sync

Fast sync is Kolme’s quickest synchronization method, designed for rapid node onboarding, such as new validators or ephemeral nodes, offering support for optional block validation for added security.

Process:
- Can either transfer full framework and application state, signed by the processor, from an existing node or shared storage, or download individual blocks and execute locally, depending on which will be faster, as outlined in Core Chain Mechanics.
- Nodes verify the processor’s signature on the state (if transferred), then begin processing new blocks without validating historical transactions by default.
- Once caught up, fast sync uses block-by-block execution for new blocks, similar to slow sync, validating transactions and data loads with minimal bandwidth, as only block data (not state) is downloaded.
- Optional Validation: Nodes can validate historical block data (e.g., transactions, data loads) post-sync, re-executing a subset or all blocks to confirm state integrity, balancing speed and security, per Triadic Security Model.
Characteristics:
- Optimizes the trade-off between bandwidth and compute during initial sync, enabling nodes to join in seconds or minutes, depending on state size (hundreds of MBs or GBs), with block-by-block execution post-catchup further reducing bandwidth needs.
- Requires trust in the processor’s signature if state is transferred, less secure than slow sync unless optional validation is enabled, suitable for trusted environments, per High Availability.
Use Cases:
- Critical for rapid startup in high-availability clusters, such as processor standbys or API servers, per High Availability.
- Supports multichain apps by quickly aligning with external chain states (e.g., bridge events), facilitating chain migrations (e.g., adding Cosmos), per External Chain Resilience.
- Used during version upgrades to transfer state to new nodes, as detailed in Version Upgrades.
Performance:
- Fastest sync method, with optional validation and block-by-block execution providing flexibility for scenarios prioritizing speed or enhanced verification.

Startup

Kolme’s node startup process leverages persistent storage and sync methods to minimize downtime and ensure rapid deployment, supporting robust, multichain applications.

Persistent Storage:
- Nodes use persistent storage volumes to store recent blocks and state (framework and app), enabling quick restarts by loading cached data, as described in Storage.
- Persistent storage reduces reliance on full sync, maintaining availability during failures, per High Availability.
Sync Selection:
- Nodes with persistent storage resume with slow sync for new blocks, minimizing startup time with lightweight block downloads.
- New or ephemeral nodes use fast sync for rapid onboarding, optionally validating block data for added security, ideal for high-availability setups.
- Security-focused nodes (e.g., listeners) opt for slow sync to validate all data, supporting trustless operations across Solana, Ethereum, Near.
Multichain:
- Startup aligns with external chain states via bridge contract events, ensuring seamless multichain operation and easy additions (e.g., Aptos), per External Chain Resilience.
- Fast sync accelerates recovery during chain migrations, maintaining uptime.
Monitoring:
- Planned watchdogs will monitor sync processes, detecting delays or errors, enhancing reliability, per Watchdogs.
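
The sync selection rules above can be written down as a small decision function. This is a sketch under stated assumptions: `NodeProfile`, `SyncMode`, and `choose_sync` are hypothetical names, and the priority order (persistent state first, then security focus) is one reasonable reading of the rules rather than documented behavior.

```rust
/// How a node catches up at startup (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SyncMode {
    /// Load state from persistent storage, then process new blocks one by one.
    ResumeFromStorage,
    /// Validate every block starting from genesis.
    SlowSyncFromGenesis,
    /// Transfer signed state, optionally re-validating history afterwards.
    FastSync { validate_history: bool },
}

/// Properties of a starting node (hypothetical type).
#[derive(Debug, Clone, Copy)]
pub struct NodeProfile {
    pub has_persistent_state: bool,
    pub security_focused: bool,
    pub validate_after_fast_sync: bool,
}

/// Pick a sync mode. Assumed priority: reuse persistent state when present,
/// otherwise slow sync for security-focused nodes, otherwise fast sync.
pub fn choose_sync(node: NodeProfile) -> SyncMode {
    if node.has_persistent_state {
        SyncMode::ResumeFromStorage
    } else if node.security_focused {
        SyncMode::SlowSyncFromGenesis
    } else {
        SyncMode::FastSync { validate_history: node.validate_after_fast_sync }
    }
}
```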

Advantages

Kolme’s node synchronization offers several benefits for developers building scalable, multichain applications:

Flexible Sync Options: Slow and fast sync cater to trustless validation or rapid onboarding, with fast sync’s optional block validation and block-by-block execution adding security and efficiency, unlike one-size-fits-all methods on shared blockchains.
High Availability: Fast sync and persistent storage ensure quick startup and recovery, supporting zero-downtime apps, per High Availability.
Multichain Scalability: Sync methods align with external chain states, enabling seamless operation across Solana, Ethereum, Near, and new chains like Cosmos, per External Chain Resilience.
Transparency: Slow sync validates all data on-chain, while fast sync’s signed state, optional validation, and block-by-block execution maintain verifiable integrity, per Triadic Security Model.
Performance: Optimized trade-off between bandwidth and compute supports large-scale apps, with both sync methods lightweight post-catchup, unlike resource-intensive syncs on Ethereum or Solana.
Resilience: Sync processes tolerate external chain disruptions, maintaining core app functionality, enhancing Kolme’s reliability.

These advantages make Kolme a developer-friendly platform for building high-performance, fault-tolerant decentralized applications.

Gossip

Libp2p Integration

Kolme leverages libp2p, a peer-to-peer networking framework, to enable efficient, decentralized communication between nodes.

Role:
- Facilitates broadcasting of proposed transactions, produced blocks, and other notifications between nodes (processor, listeners, approvers, community nodes), as detailed in Core Chain Mechanics.
- Ensures reliable message delivery for validator coordination and chain synchronization, per Triadic Security Model.
Configuration:
- Nodes establish peer connections using libp2p’s discovery mechanisms, forming a gossip network for real-time data exchange.
Security:
- Relies on cryptographic signatures for all transactions, blocks, and security messages (e.g., listener confirmations, approver authorizations) to ensure authenticity and integrity.

Message Types

Kolme’s gossip network handles several message types critical for chain operation and coordination, ensuring robust, transparent communication.

Transactions:
- Users submit signed transactions (containing messages, nonces, signatures) to the mempool via libp2p, where the processor selects them for block inclusion, as described in Core Chain Mechanics.
- Broadcast to all nodes for validation and mempool synchronization, supporting high-throughput apps.
Blocks:
- Processor broadcasts signed blocks (containing transactions, data loads, logs, state hashes) to all nodes for validation, per Triadic Security Model.
- Ensures nodes stay synchronized, supporting slow and fast sync processes, per Node Synchronization.
Failure Notifications:
- Processor broadcasts signed notifications for failed transactions (e.g., invalid nonce, data fetch errors), informing users and nodes, as outlined in Failed Transactions.
- Planned watchdogs will monitor notifications for fairness, per Watchdogs.
Validator Messages:
- Listeners broadcast signed confirmations of external chain events (e.g., deposits), and approvers broadcast withdrawal authorizations, processed as transactions, per External Chain Resilience.
- Supports quorum-based approvals for administrative tasks (e.g., upgrades, key rotations), per Version Upgrades and Key Rotation.

Network Resilience

Kolme’s gossip network is designed for resilience, ensuring continuous operation even under adverse conditions.

Peer Redundancy:
- Nodes maintain multiple peer connections, ensuring message delivery despite node failures or network partitions, per High Availability.
Message Retransmission:
- Libp2p retransmits messages to ensure delivery, handling temporary network issues without data loss.
- Critical for validator messages (e.g., listener confirmations) to maintain quorum-based security, per Triadic Security Model.
Bandwidth Optimization:
- Gossip protocol minimizes redundant transmissions by propagating messages efficiently across the network, supporting high-throughput apps with minimal overhead.
- Lightweight block downloads (excluding state data) further optimize bandwidth, per Node Synchronization.
- Bandwidth-intensive operations are sent directly to other nodes via libp2p's request/response mechanism, avoiding spamming the entire network with broadcast messages.

Advantages

Kolme’s gossip network offers several benefits for developers building scalable, multichain applications:

Decentralized Communication: Libp2p enables peer-to-peer message exchange, eliminating single points of failure, unlike centralized relay systems in some blockchains.
High Throughput: Efficient broadcasting supports high-frequency apps.
Security: Cryptographic signatures on all messages ensure data integrity and authenticity.
Resilience: Peer redundancy and retransmission maintain network operation during failures, supporting zero-downtime apps, per High Availability.
Transparency: All gossip messages (e.g., blocks, notifications) are verifiable on-chain, enhancing trust, per Triadic Security Model.

These advantages make Kolme a robust platform for building high-performance, fault-tolerant decentralized applications.

Version Upgrades

A cornerstone of the security and transparency of Kolme is reproducibility in block production. This means that, given the same input chain state and the same block information (transaction and data loads), Kolme must guarantee byte-identical results for the various states it produces. This is the most important reason for including framework and app state hashes in the serialized blocks.

New versions of an application will almost always introduce changes in this binary output, either through changes in app logic, updates to underlying libraries, or modifications to the data storage format. To handle this in a reproducible, transparent manner, we explicitly track version upgrades within the chain.

Versions within Kolme

Kolme uses a simple string to represent versions. The motivation for this is that we only care about versions matching or not matching, not whether a version is older or newer.

Versions appear in four different places:

When creating a Kolme value, we specify the current version of the codebase. We call this the code_version.
The framework state tracks the current version of the code used by the chain. This is the chain_version.
The genesis information for a chain contains the initial value for the chain_version.
The upgrade procedure (described below) includes admin messages that set the version, which will eventually be used to update the framework state's chain_version.

All operations that work on processing blocks (e.g., executing transactions, producing new blocks) check if the current chain version matches the running code version. If not, execution is blocked, since this may result in incorrect state representations. This necessitates the usage of fast sync, as described below.
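
The version gate can be pictured as a plain string equality check. The function and error type below are hypothetical names for illustration; only the `code_version` and `chain_version` terms come from the text.

```rust
/// Error returned when the running code and the chain disagree on the version.
#[derive(Debug, PartialEq, Eq)]
pub struct VersionMismatch {
    pub code_version: String,
    pub chain_version: String,
}

/// Allow block processing only when the two version strings are identical.
/// Versions are opaque strings: there is no notion of "older" or "newer".
pub fn ensure_versions_match(
    code_version: &str,
    chain_version: &str,
) -> Result<(), VersionMismatch> {
    if code_version == chain_version {
        Ok(())
    } else {
        Err(VersionMismatch {
            code_version: code_version.to_string(),
            chain_version: chain_version.to_string(),
        })
    }
}
```

Because the comparison is equality only, a node running `v2` code is blocked on a `v1` chain in exactly the same way as a node running `v1` code on a `v2` chain.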

The upgrade process

Upgrading is handled as an admin message, where the validator set must propose and vote on a migration to a new code version. This follows the same voting procedure as used for other admin messages like validator set changes, namely that once 2 out of 3 groups within the validator set agree, the change goes through.

Let's consider a situation where we have a chain that started at version v1 and is trying to upgrade to v2. The process works as follows:

One of the validators (e.g., one of the listeners) proposes an upgrade to v2. This updates the framework state's proposals data structure. This is generally handled by the upgrader component, described below.
Other validators detect and vote in favor of this proposal until a quorum is reached. (This is also handled by the upgrader component.) Once the quorum is reached, the processor running the v1 code will produce one final block that switches the framework state's chain_version to v2. At this point, the v1 processor no longer produces any more blocks, since it's running the wrong code version.
Nodes on the network running v2 are unable to execute any blocks that have occurred so far, since they are all using chain version v1. Instead, these nodes must use fast sync to transfer the entirety of the framework and app state directly to those nodes.
Once validators transfer the framework and app state, they resume chain operation as usual.

Upgrader component

The upgrader component is a recommended component for all validators to run. It handles the logic of the upgrade procedure above, namely:

Check if the desired and actual chain version differ
Check if an existing proposal exists to move to the desired chain version
Voting on the existing proposal if it exists
Creating the proposal if it doesn't

Applications should accept runtime parameters (e.g., environment variables or command line arguments) to indicate if a version upgrade is desired. Note that the upgrader component must be running on the old version of the code, e.g. the v1 processors, listeners, and approvers from the example above.

Once the upgrade process is complete, the old nodes should be taken down, as they will simply drain network bandwidth by performing state transfers.
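
The upgrader's decision logic can be sketched as a single function. The types and names here (`Proposal`, `UpgraderAction`, `upgrader_step`) are hypothetical, and the sketch deliberately ignores details such as whether this validator has already voted on the proposal.

```rust
/// A pending upgrade proposal in the framework state (hypothetical shape).
pub struct Proposal {
    pub id: u64,
    pub target_version: String,
}

/// What the upgrader should do next (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum UpgraderAction {
    /// Desired and actual chain versions already match.
    Nothing,
    /// A matching proposal exists, so vote for it.
    Vote { proposal_id: u64 },
    /// No matching proposal exists, so create one.
    Propose { version: String },
}

/// One pass of the upgrader: compare versions, look for an existing
/// proposal for the desired version, then vote or propose.
pub fn upgrader_step(
    desired_version: &str,
    chain_version: &str,
    proposals: &[Proposal],
) -> UpgraderAction {
    if desired_version == chain_version {
        return UpgraderAction::Nothing;
    }
    match proposals.iter().find(|p| p.target_version == desired_version) {
        Some(p) => UpgraderAction::Vote { proposal_id: p.id },
        None => UpgraderAction::Propose { version: desired_version.to_string() },
    }
}
```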

Ensuring high availability

To keep high availability, we recommend the following deployment strategy:

Publish a new version of the executable with the new code version.
Launch a parallel set of all validator nodes running this new code version (v2).
Modify the existing validator nodes to begin running the upgrader component, setting the desired version to v2.
Wait for the chain to upgrade to v2 (should be very fast once the validators are reconfigured).
Shut down the old v1 validators.

Key rotation

Motivation

There are three groups of specially recognized public keys within Kolme: the processor node, the listener set, and the approver set. Each set has its own quorum rules, requiring a certain number of members from the set to perform their operations. Since the goal of the processor is to allow fast, centralized block production, the processor has only one key and operates autonomously.

Key rotation recognizes the fact that, at some point in the future, these keys may need to be replaced. Our design must handle these cases:

Normal key rotation for security or hardware migration for a single operator (processor, listener, or approver). This should not require any assistance from other operators.
A non-responsive or misbehaving operator needs to be replaced. Misbehaving can either mean:
- The original operator is issuing incorrect data (e.g., a listener reporting on fund transfers that never happened).
- A key has been compromised and is now being abused by a third party attacker.

In any of these cases, we need to both initiate a key rotation, and then execute a key rotation.

Initiating key rotation

The use cases above can roughly be divided into "key replaces itself" and "others replace key." The former is a routine maintenance operation and does not require additional approval. The latter is a response to a security threat to the network, and requires quorum to initiate. Kolme provides two different routes for initiating key rotation.

Self replacement

In the self replacement case, we have a single message that says "replace me as the processor, listener, or approver." The message fails if the signing key is not currently a member of one of those sets. If the same key is used in multiple sets, each set would require a separate message to initiate the change.

No further action is needed to initiate key rotation. At this point, the chain can proceed with the execute case.
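
A minimal sketch of the self-replacement rule, assuming keys are represented as plain strings and that signature verification has already happened. `ValidatorSets`, `Role`, and `self_replace` are hypothetical names, not Kolme's actual types.

```rust
use std::collections::BTreeSet;

/// Which set the signer wants to be replaced in.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Processor,
    Listener,
    Approver,
}

/// Simplified view of the three recognized key groups.
#[derive(Debug, Default)]
pub struct ValidatorSets {
    pub processor: String,
    pub listeners: BTreeSet<String>,
    pub approvers: BTreeSet<String>,
}

/// Swap `signer` for `new_key` inside one set, failing if `signer` is absent.
fn replace_in(set: &mut BTreeSet<String>, signer: &str, new_key: &str) -> Result<(), String> {
    if !set.remove(signer) {
        return Err(format!("{} is not a member of this set", signer));
    }
    set.insert(new_key.to_string());
    Ok(())
}

impl ValidatorSets {
    /// Apply a "replace me" message. One message touches exactly one set, so
    /// a key used in several sets needs one message per set.
    pub fn self_replace(&mut self, role: Role, signer: &str, new_key: &str) -> Result<(), String> {
        match role {
            Role::Processor => {
                if self.processor != signer {
                    return Err(format!("{} is not the processor", signer));
                }
                self.processor = new_key.to_string();
                Ok(())
            }
            Role::Listener => replace_in(&mut self.listeners, signer, new_key),
            Role::Approver => replace_in(&mut self.approvers, signer, new_key),
        }
    }
}
```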

Change the set

Instead of replacing a single key, this message initiates a complete replacement of the current set of keys. The new set may contain any set of keys, including keys used in previous sets. This message includes:

Processor key
Listener keys and quorum requirement
Approver keys and quorum requirement

Any member of the processor, listener, or approver sets may propose a set change. Each set change gets its own unique ID (potentially the block height it was issued at), allowing multiple proposals to exist simultaneously so that a misbehaving set member cannot disrupt the voting process.

Any member of the current set can submit a message voting for the change. (Question: should we also support voting against?) Voting requires 2 out of 3 of the processor, listener, and approver sets to vote in favor of the change. For the listener and approver sets, a normal quorum is needed for the group to vote in favor of the change.

Once a change proposal receives enough votes, it is approved and can move on to execution. At that point, all previous proposals are canceled.
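
The 2-out-of-3 rule can be sketched as follows, assuming votes are collected as a set of key names and signatures have already been verified. `Group`, `ValidatorSet`, `keys`, and `change_approved` are hypothetical names used only for this illustration.

```rust
use std::collections::BTreeSet;

/// A listener or approver group with its own quorum requirement.
pub struct Group {
    pub members: BTreeSet<String>,
    pub quorum: usize,
}

impl Group {
    /// The group is in favor once at least `quorum` of its members voted.
    fn approves(&self, votes: &BTreeSet<String>) -> bool {
        self.members.intersection(votes).count() >= self.quorum
    }
}

/// The three groups that take part in voting.
pub struct ValidatorSet {
    pub processor: String,
    pub listeners: Group,
    pub approvers: Group,
}

/// Convenience helper for building key sets in examples.
pub fn keys(names: &[&str]) -> BTreeSet<String> {
    names.iter().map(|n| n.to_string()).collect()
}

/// A change passes when at least 2 of the 3 groups are in favor. The
/// processor counts as one group with a single key.
pub fn change_approved(set: &ValidatorSet, votes: &BTreeSet<String>) -> bool {
    let in_favor = [
        votes.contains(&set.processor),
        set.listeners.approves(votes),
        set.approvers.approves(votes),
    ];
    in_favor.iter().filter(|ok| **ok).count() >= 2
}
```

Note that in this model the listeners and approvers together can approve a change without the processor, which matters when the processor itself is the key being replaced.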

Executing the change

Each accepted change is stored within the FrameworkState, in a MerkleMap with monotonically increasing keys. This sequence of changes includes the full signature history. The motivation for this is that, by just observing this history, you can prove the current set of keys. This allows for a secure fast sync, requiring only trust in the original set of signers.

Immediately upon executing the change, the FrameworkState is also updated with the modified key set. All listener and approver actions will require a quorum from the new set of keys. If the processor key changed, the next block will be signed by that new key.

Each block that executes a key set change will also emit an external chain action to be performed. This will update the contract with the new set of keys. Note that this requires bridge contracts to track more than the processor and approver keys used for normal execution: they also need to be aware of listeners in order to validate a "change the set" action, which may rely upon listener votes to execute.

Transition period

The basic idea of this key rotation flow is:

Perform actions on the Kolme chain
Wait for finality on the Kolme chain (immediate for self-replace, or wait for sufficient approvals for changing the set)
Kolme chain begins using the new set of validators
New set of processor and approvers generate bridge actions to update bridge contracts
Submitter takes the newly signed action and submits to the bridge contracts

The tricky bit here is the fact that the validators that signed off in steps (1) and (2) are the old validators, while the bridge action itself will be signed by the new validators. Therefore, the bridge contracts need to have special logic to handle this transition period, namely:

Confirm that there are sufficient signatures from the old validators on the change itself
Confirm that the new set of validators have signed the bridge action message itself

This leads to some code duplication: we need to reimplement the quorum rule checks in both the smart contracts and Kolme itself. However, this is an unavoidable duplication.

Force-replacing a processor

If the old processor is no longer behaving correctly, it won't be able to produce blocks to allow itself to be replaced. This case is not yet implemented; it is tracked in https://github.com/fpco/kolme/issues/207. A theoretical approach:

Add a new special "replace the processor" message
It requires signatures from listeners and approvers be included in that one message
It has the special behavior that, unlike every other message in the system, it changes the expected processor immediately, allowing that block to be produced by the new processor instead of the old one.

Wallets and keys

Kolme keeps an internal concept of accounts. Accounts are able to receive and send funds and perform other actions. Each account is either a multisig account or a regular account, equivalent to Externally Owned Accounts (EOAs) from other chains.

Each regular account has 0 or more wallets and public keys associated with it, and must at all times have at least 1 wallet or 1 public key. Public keys are the only authentication mechanism supported within Kolme, meaning every transaction you send to the chain must be signed with a public/private keypair. Wallets, on the other hand, represent a wallet on an external blockchain. Since many blockchains use the same wallet addresses (e.g., all EVM chains use identical representations), we internally track wallet addresses as simple strings, not tied to a specific chain.

Wallet addresses can only be used for controlling an account through the bridge contract. It's easiest to understand the workflow by following how a user will normally initiate and use an account.

1. Most applications begin with a fund transfer to bridge funds into the Kolme app. Let's say that the user has a wallet address 0xDEADBEEF. They first initiate a transfer of 100 USDC into the bridge contract on an external chain. The message includes a public key, PUBKEY1.
2. Listeners see this bridge event and submit it to the chain as a new deposit. When a quorum of listeners signs off, the processor accepts the event and executes it. During execution, it does the following:
- Look up the 0xDEADBEEF wallet address. If it has an existing account ID associated with it, we use that ID. Otherwise, we add a new account entry with the next available account ID and associate 0xDEADBEEF with it.
- Look up the PUBKEY1 public key. If it is currently unused, we associate that public key with our account.
- Increase the balance for the account by 100 USDC.
3. The user uses PUBKEY1's secret key to sign a message to interact with Kolme. The public key is looked up, we find the appropriate account, and all actions are taken on behalf of this account.
4. A few days later, the user is on a new machine and no longer has the secret key for PUBKEY1. They only have the 0xDEADBEEF wallet. They need to add a new key to their account, so they send a new bridge contract message. This one includes no USDC, but specifies the public key PUBKEY2.
5. The same listener/processor dance occurs as in step (2), and unless PUBKEY2 is already used by another account, we add it to our existing account.
6. The user can now perform Kolme transactions on their new device. They can also choose to remove PUBKEY1 from their account, or even disassociate the 0xDEADBEEF wallet to allow it to be used with a new account.
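
The bookkeeping in this workflow can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: the `Accounts` and `Account` types, their fields, and `apply_bridge_event` are hypothetical names invented for this example, balances are tracked for a single token, and quorum checking by listeners is assumed to have already happened. It is not Kolme's actual implementation.

```rust
use std::collections::HashMap;

/// Hypothetical account identifier; Kolme's real type may differ.
type AccountId = u64;

#[derive(Default, Debug)]
struct Account {
    wallets: Vec<String>,
    pubkeys: Vec<String>,
    usdc_balance: u64,
}

/// Hypothetical registry mapping wallets and public keys to accounts.
#[derive(Default, Debug)]
struct Accounts {
    next_id: AccountId,
    accounts: HashMap<AccountId, Account>,
    by_wallet: HashMap<String, AccountId>,
    by_pubkey: HashMap<String, AccountId>,
}

impl Accounts {
    /// Apply a bridge event that a quorum of listeners has already confirmed.
    /// Returns the account the event was applied to.
    fn apply_bridge_event(&mut self, wallet: &str, pubkey: Option<&str>, usdc: u64) -> AccountId {
        // Reuse the wallet's account, or create one with the next available ID.
        let existing = self.by_wallet.get(wallet).copied();
        let id = match existing {
            Some(id) => id,
            None => {
                let id = self.next_id;
                self.next_id += 1;
                self.by_wallet.insert(wallet.to_owned(), id);
                self.accounts.entry(id).or_default().wallets.push(wallet.to_owned());
                id
            }
        };
        // Associate the public key only if no account uses it yet.
        if let Some(pk) = pubkey {
            if !self.by_pubkey.contains_key(pk) {
                self.by_pubkey.insert(pk.to_owned(), id);
                self.accounts.entry(id).or_default().pubkeys.push(pk.to_owned());
            }
        }
        // Credit the deposit (which may be zero for a key-only message).
        self.accounts.entry(id).or_default().usdc_balance += usdc;
        id
    }

    /// Find the account a signed Kolme transaction acts on behalf of.
    fn account_for_pubkey(&self, pubkey: &str) -> Option<AccountId> {
        self.by_pubkey.get(pubkey).copied()
    }
}
```

Calling `apply_bridge_event("0xDEADBEEF", Some("PUBKEY1"), 100)` followed by `apply_bridge_event("0xDEADBEEF", Some("PUBKEY2"), 0)` mirrors the deposit and the later key addition described above: both calls resolve to the same account.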

Failed transactions

There are two different categories of "failed transactions" within Kolme: transactions submitted within Kolme that are rejected by the processor, and transactions that fail on an external chain. We'll cover each of these categories separately.

Kolme chain transaction failures

The process for submitting a transaction for inclusion in a block is:

Broadcast the proposed, signed transaction to any node in the network
Node uses the gossip component to share that proposed transaction with other nodes
One of the processor nodes picks up the transaction
The processor attempts to execute the transaction
If the execution is successful
- The processor produces a new block
- The processor gossips that block to all nodes in the network
- The nodes are able to observe that the transaction has been added and remove it from their mempools

If, however, the execution is unsuccessful, what do we do? Many blockchains will include the transaction as a failed transaction within a block. We could elect to do that as well in Kolme. However, doing so would unnecessarily bloat the size of our chain, something we're trying to avoid.

Instead, the processor simply drops the transaction and gossips a notification indicating that the transaction failed. At that point, the processor's job is done.
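
As a rough sketch of this behavior, the processor can execute against a copy of the current state and either produce a block or a failure notice. Everything here is hypothetical and simplified for illustration: `State`, `Tx`, `Outcome`, and `process` are invented names, and the "transaction" is just a spend from a single balance.

```rust
/// Hypothetical, heavily simplified chain state.
#[derive(Clone, Debug, PartialEq)]
struct State {
    balance: u64,
    height: u64,
}

/// Hypothetical transaction: spend `amount` from the tracked balance.
struct Tx {
    amount: u64,
}

#[derive(Debug, PartialEq)]
enum Outcome {
    /// Execution succeeded: a new block with the new state is gossiped.
    Block { height: u64, state: State },
    /// Execution failed: nothing is added to the chain, only a notice is gossiped.
    Failed { reason: String },
}

fn process(current: &State, tx: &Tx) -> Outcome {
    // Work on a copy so that a failure leaves the committed state untouched.
    let mut next = current.clone();
    match next.balance.checked_sub(tx.amount) {
        Some(rest) => {
            next.balance = rest;
            next.height += 1;
            Outcome::Block { height: next.height, state: next }
        }
        None => Outcome::Failed {
            reason: format!("insufficient balance for {}", tx.amount),
        },
    }
}
```

Because a failed transaction never reaches a block, the chain history stays lean; the cost is that other nodes must re-run rejected transactions themselves if they want to check the processor's verdict.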

For security and censorship protection, other nodes in the network should confirm that the attempt to run that transaction fails. The purpose of this is to detect if the processor is unfairly rejecting transactions.

Current plan: This will ultimately be added as part of the watchdog component.

External chain failures

External chain failures cover any kind of failure case with the bridge contracts on external chains.

User deposits

User deposits can fail due to insufficient gas, insufficient user funds, or something else. In all such cases, either no transaction is generated, or a failed transaction is generated. In any event, these transactions should be completely ignored by listeners. They do not generate a new bridge event ID and are not relayed to the Kolme chain.

Invalid funds deposited

A variation on the above is when a user deposits unsupported funds in a bridge contract. In the current design, those funds will simply be lost within the contract. This may sound surprising, but it falls in line with standard behavior for unsupported transfers into a contract (e.g., sending funds to a contract with a bank MsgSend in Cosmos does not trigger the contract's execute messages).

We have potential alternatives to consider, such as:

Keeping a list of permitted received coins/tokens, and rejecting transactions without them.
Providing an administrative "send untracked tokens" feature for recovery of funds.

We'll wait for sufficient demand before implementing such a change.

Failed actions/withdrawals

Kolme blocks can emit actions to be run on external chains. These actions have the potential to fail. Kolme provides a mechanism for one common failure mode (insufficient funds), and a back door for fixing any other failing action.

One thing to note in particular is that actions must be executed in ascending order. That means that if action 56 fails, no later action will be able to proceed until it is resolved. Therefore, unblocking a broken action is a requirement for correct chain operation.

Insufficient funds

It's possible for a Kolme application to generate a fund transfer message that cannot be supported on chain. A simple example would be a multichain application supporting USDC. A client can legitimately deposit 1,000 USDC on chain A, then issue a withdrawal request for chain B, essentially turning Kolme into a token bridge.

We need to hold off on issuing an action until there are sufficient funds on chain B. This may require generating an external bridge transaction, for instance, that will move USDC from one chain to another (probably using Circle's Cross-Chain Transfer Protocol, CCTP).

To allow for this, we have the following model (note: not yet implemented!):

For each chain, we maintain an internal accounting balance of tokens held by the bridge contract. This can be calculated by summing deposits and withdrawals.
Additionally, we provide listeners with a special message type to synchronize balances. This can be used to account for a token transfer initiated outside the normal deposit flow.
When a transaction generates a fund transfer action, we check if there are sufficient funds to transfer. If so, we immediately emit the action. Otherwise, we add it to a FIFO queue of pending transfers per chain.
Each time the balance of funds on a chain changes, we check the pending transfers queue and emit as many actions as we can currently support.
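
A minimal sketch of this model follows. Since the feature is not yet implemented, all names here (`ChainFunds`, `Transfer`, `credit`, `request`) are hypothetical. The sketch also makes one assumption the text leaves open: the queue is strictly FIFO, so a large transfer at the head of the queue holds back smaller ones behind it.

```rust
use std::collections::VecDeque;

#[derive(Debug, Clone, PartialEq)]
struct Transfer {
    recipient: String,
    amount: u64,
}

/// Hypothetical per-chain bookkeeping for a single token.
#[derive(Default)]
struct ChainFunds {
    /// Internal accounting balance held by the bridge contract.
    balance: u64,
    /// Transfers waiting for funds, oldest first.
    pending: VecDeque<Transfer>,
}

impl ChainFunds {
    /// Record a deposit or a listener balance sync, then release queued transfers.
    fn credit(&mut self, amount: u64) -> Vec<Transfer> {
        self.balance += amount;
        self.drain_pending()
    }

    /// Request a transfer. It is emitted immediately if it is funded and
    /// nothing is queued ahead of it; otherwise it waits in the queue.
    fn request(&mut self, transfer: Transfer) -> Vec<Transfer> {
        self.pending.push_back(transfer);
        self.drain_pending()
    }

    /// Emit as many queued transfers as the balance supports, in FIFO order.
    fn drain_pending(&mut self) -> Vec<Transfer> {
        let mut emitted = Vec::new();
        while let Some(amount) = self.pending.front().map(|t| t.amount) {
            if amount > self.balance {
                break;
            }
            self.balance -= amount;
            emitted.extend(self.pending.pop_front());
        }
        emitted
    }
}
```

A non-strict variant that skips over unfunded transfers would improve throughput but could starve large withdrawals, which is why the strict ordering is used in this sketch.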

Hard override

Ideally no other actions should ever fail, assuming all code is written correctly. Given that such an assumption is guaranteed to be proven false, we include a hard override mechanism. This is a manual workaround for a stalled action.

Any approver may issue a message to either skip a bridge action, or replace a bridge action with a new action. The other approvers must confirm this message, with a final confirmation by the processor. Once the processor confirms the change, the new action is emitted by the chain, and the submitter components will attempt to broadcast the new transaction (or skip it).

Note that this is a fully manual process, intended to be used in exceptional circumstances.

Hard fork/reverted transaction

If a blockchain has a hard fork or reverts a transaction we have already observed, we risk solvency of the system. For all bridge events added to Kolme that no longer exist on the external chain (note: not yet implemented):

A listener can send a message (manually) requesting that an event be reverted.
Once a quorum of listeners votes in agreement, the processor must confirm and add a new block to the chain.
That new block will roll back the next expected event ID to the previous one.
And, if possible, funds remaining in the account will be burned to revert the transaction. Unfortunately, if the funds have already been used, this will be impossible, and may introduce an insolvency issue.

The best defense against this is to ensure that listeners wait for sufficient confirmations on transactions before submitting them to Kolme.

Storage

Understanding blocks

Each block in a Kolme chain is made up of essentially five parts:

Metadata about the block: timestamp, signature, block height, previous block hash, etc.
The transaction itself: a request from a client to perform some state transformation. This includes messages (individual actions to take, which may be standard Kolme messages or app-specific messages), a submission timestamp, and signature information. (Note: in Kolme, each block always contains exactly one transaction.)
Any data loads required to reexecute the messages in the transaction.
Logs generated when running messages.
The new state of the blockchain. More on this below.

Our goal with Kolme is to maintain a lean chain of blocks that can be used to fully replay the chain history and validate that the state of the blockchain--as claimed by the processor that produces the blocks--is accurate.

What is state?

Now let's return to a question from above: what's the state of the blockchain? It comes down to two pieces of data:

Framework state: information that Kolme itself maintains about your app chain, such as account balances, public key associations, and bridge contract addresses.
App state: fully defined by the application. This can be any arbitrary data that an application decides to store.

There are some interesting properties about these pieces of data:

They can be relatively large, easily in the hundreds of megabytes for active applications.
The data will change on nearly every single block. For example, the framework state will change every block due to usage of nonces from accounts.
Most of the data, however, will remain unchanged.
We need to be able to cheaply clone this data in memory for executing transactions. We need to cheaply clone because we need to be able to maintain the old state, either for supporting concurrent queries or rolling back a failed transaction.
Since the data is large, we don't want to store the data itself inside a block. Therefore, we store only a hash of the data in a block. As a result, we need to be able to cheaply hash these states.

The storage mechanism of Kolme is built around optimizing for these properties.

MerkleMap

The core data structure we leverage is a MerkleMap. This is a Rust data structure with a BTreeMap-like API. Internally, it is a base16 tree with aggressive caching, clone-on-write functionality, and various other optimizations. It won't be faster than a BTreeMap or HashMap for most operations. However, it provides an incredibly cheap clone (just an Arc clone) and does not require recomputing hashes for unchanged subtrees. By using the merkle-map package for maintaining framework and app state, a Kolme application gets aggressive data sharing, further reducing the total storage size needed for holding onto state from multiple blocks, without requiring pruning of the data. MerkleMap data can also efficiently be transferred over a network, allowing for fast sync.
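
To make the "cheap clone plus clone-on-write" idea concrete, here is a small sketch using only the standard library. It does not use or reproduce the merkle-map API: `SharedState` and its fields are hypothetical stand-ins, and a real MerkleMap shares data at the level of tree nodes rather than whole maps.

```rust
use std::collections::BTreeMap;
use std::sync::Arc;

/// Stand-in for chain state. NOT the merkle-map API, only an
/// illustration of clone-on-write sharing with `Arc`.
#[derive(Clone, Default)]
struct SharedState {
    accounts: Arc<BTreeMap<String, u64>>,
    settings: Arc<BTreeMap<String, String>>,
}

impl SharedState {
    fn set_balance(&mut self, account: &str, balance: u64) {
        // `Arc::make_mut` copies the inner map only if another clone still shares it.
        Arc::make_mut(&mut self.accounts).insert(account.to_owned(), balance);
    }
}

/// Returns (balance in old state, balance in new state,
/// whether the untouched `settings` map is still shared).
fn clone_and_update() -> (u64, u64, bool) {
    let mut old = SharedState::default();
    old.set_balance("alice", 10);

    // Cheap clone: only reference counts are incremented.
    let mut new = old.clone();
    new.set_balance("alice", 25);

    (
        old.accounts["alice"],
        new.accounts["alice"],
        Arc::ptr_eq(&old.settings, &new.settings),
    )
}
```

The old state keeps its original value after the update, which is what allows concurrent queries and rollback of failed transactions, while the part that did not change is still physically shared between both versions.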

Pluggable storage

Kolme offers a pluggable storage backend mechanism. Each storage backend needs to hold onto essentially three pieces of data:

A mapping between hashes and payloads. This is used by MerkleMap for storage of state data.
The block history itself. This contains just the hashes referencing the state, not the state itself.
Some way of efficiently determining what the latest block is. This may be a separate field, or could be derived from the block history itself (such as a MAX query on a SQL database).

Additionally, some storage backends provide a construction lock. This is used to allow multiple processors to run in parallel with only one producing a block at a time. This is part of Kolme's high availability mechanism.
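
The three responsibilities above could be captured by a trait along the following lines. This is a hypothetical sketch, not Kolme's real storage interface: the `Backend` trait, its method names, the 32-byte `Hash` alias, and the assumption that block heights start at 0 are all invented for illustration. The construction lock is omitted.

```rust
use std::collections::{BTreeMap, HashMap};

/// Assumed hash representation for this sketch.
type Hash = [u8; 32];

/// Hypothetical storage backend interface.
trait Backend {
    /// Hash-to-payload mapping, used for state data.
    fn store_payload(&mut self, hash: Hash, payload: Vec<u8>);
    fn load_payload(&self, hash: &Hash) -> Option<&[u8]>;
    /// Block history: stores only the hash referencing the state.
    fn add_block(&mut self, height: u64, state_hash: Hash) -> Result<(), String>;
    /// Efficient lookup of the latest block.
    fn latest_height(&self) -> Option<u64>;
}

/// In-memory implementation, suitable only for tests.
#[derive(Default)]
struct InMemory {
    payloads: HashMap<Hash, Vec<u8>>,
    blocks: BTreeMap<u64, Hash>,
}

impl Backend for InMemory {
    fn store_payload(&mut self, hash: Hash, payload: Vec<u8>) {
        self.payloads.insert(hash, payload);
    }

    fn load_payload(&self, hash: &Hash) -> Option<&[u8]> {
        self.payloads.get(hash).map(Vec::as_slice)
    }

    fn add_block(&mut self, height: u64, state_hash: Hash) -> Result<(), String> {
        // Blocks must be appended in order, with no gaps.
        let expected = self.latest_height().map_or(0, |h| h + 1);
        if height != expected {
            return Err(format!("expected block {}, got {}", expected, height));
        }
        self.blocks.insert(height, state_hash);
        Ok(())
    }

    fn latest_height(&self) -> Option<u64> {
        // Derived from the history itself, similar to a MAX query in SQL.
        self.blocks.keys().next_back().copied()
    }
}
```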

The following storage backends are currently provided.

In memory

This is a simple storage mechanism intended only for testing. However, it could potentially be useful for ephemeral services. All data is kept in memory. There is a trivial construction lock provided to allow for better simulated testing.

Fjall

Uses the Fjall crate as a local filesystem key-value store. This backend does not provide any construction lock mechanism.

PostgreSQL

The PostgreSQL backend is primarily intended for high availability processors. It still uses Fjall for Merkle hash storage for efficiency, since in early testing using a PostgreSQL table for storing hashes was too inefficient.

Side note: It may be worth revisiting this in the future, and at the very least have a background synchronization job between hashes in the local Fjall store and the PostgreSQL database. This would allow for faster launch of new nodes in a cluster without needing to synchronize data from other nodes on the network.

In addition to providing storage of the block data within a PostgreSQL table, this backend also provides a construction lock. It leverages advisory locks in PostgreSQL.

External Chain Resilience

Minimal Dependency

Kolme minimizes reliance on external blockchains, ensuring applications remain operational during disruptions in either external chains or node providers providing data from them. Additionally, by minimizing the logic living on external chains, Kolme applications can easily and naturally live on multiple chains or migrate to other chains as business needs evolve.

Scope:
- External chains are used only for payment on-ramps (deposits) and off-ramps (withdrawals) via bridge contracts and ephemeral key verification, as detailed in Core Chain Mechanics.
- Core app operations (e.g., transaction processing, state updates) run independently on Kolme’s dedicated chain.
Impact:
- Downtime or congestion on external chains delays only deposits and withdrawals, not in-app actions like trades or bets, ensuring user experience continuity.
- Supports multichain apps by maintaining functionality during chain migrations (e.g., adding Cosmos), per Node Synchronization.
Validation:
- Listeners and approvers validate bridge events (e.g., deposits, withdrawals) as transactions, ensuring security without external chain dependency, per Triadic Security Model.

Bridge Contracts

Bridge contracts facilitate secure fund movements between Kolme and external blockchains and association of ephemeral keys with user wallets.

Function:
- Handle deposits (users lock funds on external chains, listeners confirm to Kolme) and withdrawals (approvers authorize fund releases), as described in Core Chain Mechanics.
- Securely associate new public keys with an external wallet, allowing users to continue to rely on their preferred wallet provider for private key management.
- Processed as transactions, gossiped via libp2p, ensuring decentralized validation, per Gossip.
Resilience:
- External chain downtime delays bridge operations but not core app functionality, supporting high-availability apps, per High Availability.
- Atomic updates during key rotations or chain migrations prevent mismatches, per Key Rotation.

Advantages

Kolme’s external chain resilience offers several benefits for developers building multichain applications:

Uninterrupted Operations: Minimal dependency ensures core app functionality persists during external chain disruptions, unlike shared blockchains, per High Availability.
Seamless Migration: Adding or switching chains requires minimal app changes, supporting scalability (e.g., Cosmos), per Node Synchronization.
Security: Quorum-validated bridge events and signed contracts ensure trust, per Triadic Security Model.
Transparency: Bridge transactions are recorded on-chain, verifiable by nodes, enhancing trust, per Core Chain Mechanics.
Flexibility: Supports diverse ecosystems (Solana, Ethereum, Near) with easy chain additions, per Gossip.
Efficiency: Fast sync and atomic updates minimize migration overhead, per Version Upgrades.

These advantages make Kolme a robust platform for building high-performance, fault-tolerant decentralized applications.

Multisig accounts

Kolme supports two different account types: regular accounts controlled by external wallets and public keys, and multisig accounts. (NOTE at time of writing, multisig accounts have not yet been implemented.) Multisig accounts allow for a quorum of users to control actions from an account, a common desire for more secure management. The basic workflow of multisig accounts is:

Create the account. This is done by using a normal account to send a "create multisig account" transaction. Any account can perform this action, and the new account will be created with the given set of keys and quorum rules, with no connection to the original account.
Any member of the multisig set can propose a list of messages to be run by the multisig account. These messages will be assigned a multisig proposal ID and await voting.
Other members can vote yes or no. If a quorum of yes can no longer be made, the proposal is removed from the pending proposals. If a quorum for yes is achieved, the proposal is removed and the messages are executed in the same block that the final yes vote occurred.
- Note that we need to consider how we associate log messages with each individual transaction; we may end up needing some kind of "hierarchical messages" (TBD). We also need to handle the possibility of these messages failing, probably by putting an explicit "this proposal failed" message in the logs.
- Question: any reason to consider separating voting from execution?
A special message can be used to change the voting set needed for a multisig. This must be voted on by the existing quorum.
- Question: do we want to generalize this to a "convert account" message, and allow converting multisigs to regular accounts and vice-versa?
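
Since multisig accounts are not yet implemented, the following is only a sketch of the voting rule described above, with hypothetical names (`Proposal`, `Status`, `vote`). It assumes a simple "N yes votes" quorum, ignores votes from non-members, and lets a member overwrite an earlier vote; the real design may differ on all three points.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, PartialEq)]
enum Status {
    /// Still waiting for more votes.
    Pending,
    /// Quorum of yes votes reached: messages run in the same block.
    Approved,
    /// A yes quorum can no longer be reached: proposal is removed.
    Rejected,
}

/// Hypothetical pending multisig proposal.
struct Proposal {
    members: BTreeSet<String>,
    quorum: usize,
    votes: BTreeMap<String, bool>,
}

impl Proposal {
    fn new(members: &[&str], quorum: usize) -> Self {
        Proposal {
            members: members.iter().map(|m| m.to_string()).collect(),
            quorum,
            votes: BTreeMap::new(),
        }
    }

    /// Record a vote (ignored for non-members) and report the new status.
    fn vote(&mut self, member: &str, yes: bool) -> Status {
        if self.members.contains(member) {
            self.votes.insert(member.to_owned(), yes);
        }
        self.status()
    }

    fn status(&self) -> Status {
        let yes = self.votes.values().filter(|v| **v).count();
        let no = self.votes.len() - yes;
        if yes >= self.quorum {
            Status::Approved
        } else if self.members.len() - no < self.quorum {
            // Even if every remaining member voted yes, quorum is out of reach.
            Status::Rejected
        } else {
            Status::Pending
        }
    }
}
```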

Watchdogs

Monitoring Role

Kolme’s planned watchdog nodes enhance network integrity by monitoring validator behavior and system operations across multichain ecosystems like Solana, Ethereum, and Near.

Function:
- Observe processor outputs (blocks, failure notifications), listener confirmations, and approver authorizations to detect anomalies (e.g., censorship, invalid state hashes), as detailed in Core Chain Mechanics.
- Validate transactions, data loads, and state transitions, ensuring compliance with protocol rules, per Triadic Security Model.
Scope:
- Monitor gossip messages (e.g., transactions, blocks) for consistency and fairness, per Gossip.
- Verify bridge contract events (e.g., deposits, withdrawals) to maintain multichain trust, per External Chain Resilience.
Operation:
- Run independently, syncing via slow or fast sync to access chain state, per Node Synchronization.
- Report discrepancies to the network or external auditors, enhancing transparency.

Detection Capabilities

Watchdogs are designed to identify and flag potential issues, ensuring robust security and reliability.

Processor Errors:
- Detect invalid blocks (e.g., incorrect state hashes, unsigned transactions) by re-executing transactions, per External Data Handling.
- Flag censorship if valid transactions are excluded from blocks, per Failed Transactions.
Validator Misbehavior:
- Identify inconsistencies in listener confirmations or approver authorizations (e.g., quorum violations), per Triadic Security Model.
- Monitor key rotation and version upgrade proposals for unauthorized actions, per Key Rotation and Version Upgrades.
Multichain Issues:
- Verify bridge contract event processing to prevent fraudulent deposits or withdrawals, supporting Solana, Ethereum, Near, or new chains like Aptos, per External Chain Resilience.
- Ensure chain migrations maintain integrity, per Node Synchronization.
Storage and Sync:
- Check storage consistency (e.g., MerkleMap state) and sync processes for errors, per Storage and Node Synchronization.

Reporting Mechanism

Watchdogs provide transparent reporting to maintain trust and facilitate rapid issue resolution.

Alerts:
- Broadcast signed alerts via libp2p gossip for detected issues (e.g., invalid blocks, quorum failures), informing nodes and users, per Gossip.
- Alerts are recorded on-chain for auditability, enhancing transparency, per Core Chain Mechanics.
User and Validator Feedback:
- Notifications reach user interfaces via API servers, enabling corrective actions, per High Availability.
- Validators (processor, listeners, approvers) receive alerts to address issues, per Triadic Security Model.
Security:
- Alerts use cryptographic signatures for authenticity, verified by nodes, per External Data Handling.

Advantages

Kolme’s watchdog system offers several benefits for developers building secure, multichain applications:

Enhanced Security: Detects and flags validator errors or misbehavior, ensuring trust, per Triadic Security Model.
Transparency: On-chain alerts and verifiable reports maintain auditability, per Core Chain Mechanics.
Resilience: Independent operation ensures monitoring persists during failures, supporting zero-downtime apps, per High Availability.
Multichain Integrity: Validates bridge events and chain migrations (e.g., Cosmos), per External Chain Resilience.
Proactive Monitoring: Early detection of issues like censorship or storage errors improves reliability, per Storage.
Scalability: Supports high-throughput apps with minimal overhead, per Node Synchronization.

These advantages make Kolme a robust platform for building high-performance, fault-tolerant decentralized applications.

Timestamp verification

Every transaction submitted to the Kolme network includes a timestamp of when--on the client machine--the transaction was generated. Similarly, every block contains a timestamp of the processor's machine time when producing the block.

Timestamps are generally considered "best effort" in Kolme, and should not be relied on for anything security-sensitive. Kolme does not make significant efforts to avoid clock skew among network participants. That said, the design includes the following basic verification checks:

No nodes accept timestamps from the future. To account for clock skew, we define "future" as "more than 2 minutes ahead of the current machine clock time." Accepting here means accepting a transaction into a mempool, or accepting a block from the processor (checking the timestamps on both the block and its transaction).
Block time must be monotonically increasing. Each subsequent block must take place after the preceding block.
The difference between a block's timestamp and its transaction's timestamp must always be, at most, 2 minutes. This also means that all transactions in the mempool can be flushed after 2 minutes, and should be resubmitted.
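
The three checks can be expressed as a single validation function. This is a sketch of the planned rules rather than existing Kolme code: the function name, the error type, and the use of plain Unix seconds are assumptions made for the example.

```rust
/// Allowed clock skew: 2 minutes, in seconds.
const MAX_SKEW_SECS: u64 = 120;

#[derive(Debug, PartialEq)]
enum TimestampError {
    /// Block or transaction timestamp is more than 2 minutes ahead of our clock.
    FromFuture,
    /// Block timestamp is not strictly after the previous block's.
    NotAfterPreviousBlock,
    /// Block and transaction timestamps are more than 2 minutes apart.
    TransactionSkew,
}

/// Validate a block's timestamps. All values are Unix timestamps in seconds.
/// `now` is the local machine clock of the validating node.
fn check_block_timestamps(
    now: u64,
    prev_block: u64,
    block: u64,
    tx: u64,
) -> Result<(), TimestampError> {
    let latest_allowed = now + MAX_SKEW_SECS;
    if block > latest_allowed || tx > latest_allowed {
        return Err(TimestampError::FromFuture);
    }
    if block <= prev_block {
        return Err(TimestampError::NotAfterPreviousBlock);
    }
    if block.abs_diff(tx) > MAX_SKEW_SECS {
        return Err(TimestampError::TransactionSkew);
    }
    Ok(())
}
```

The same `FromFuture` check, applied to the transaction timestamp alone, would cover acceptance of transactions into a mempool.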

NOTE these checks have not been implemented at time of writing. This is currently a design document outlining plans.

Relying on timestamp

While timestamps are in general best effort, if an application needs to rely on a timestamp it can do so. This follows the general trust model of Kolme: trusting the processor with significant autonomy. Consider, however, that a processor has the option of significantly changing timestamps, so that timestamps used for things like calculating payouts may be manipulated. This is probably fine for things like interest charges, where two minutes won't make much difference. If, however, this is used for something where a minute of difference is significant, it opens an attack vector for a processor to abuse timestamps.

Watchdogs

We haven't built out the watchdog component yet. The idea is for it to observe the network and raise an alert if it sees potentially abusive operations. For timestamp verification, one possibility would be to track the transaction timestamps in the blockchain and, if they regularly arrive out of order, consider the possibility that the processor is reordering transactions.

Note that even this idea isn't foolproof, since out-of-order timestamps could simply be a consequence of clock skew among clients. More real-world testing would be needed to ascertain whether this is a good idea.

Versioning policy

Kolme is a Rust library, and as such can follow a standard semver-inspired versioning scheme for backwards compatibility of the API. However, due to the nature of how Kolme is used, the story is a bit more complicated. In particular:

Changes to Kolme may impact downstream applications using its APIs. This is the standard versioning issue that all libraries face.
New versions of Kolme may modify any number of protocol components exposed to the outside world, network APIs and serialization formats being the most obvious.
When applications themselves make changes, they may change their own APIs, serialization, or application logic, all of which would make it impossible to reproducibly reexecute prior transactions.

The last point is mostly out of scope for this document. It is a responsibility of application authors, and is enabled by the version upgrade system provided by Kolme. Please see that document for a better understanding of the goals in this document.

The purpose of this document is to ensure:

Developers of Kolme make changes in a way that allows for backwards compatibility with old serialized data.
We have a clear signposting mechanism for communicating breaking changes that downstream application developers need to handle.

Single version number

In theory, we could use multiple version numbers for Kolme:

Version each sublibrary (like merkle-map) separately.
Version the serialized block format.
Version the network API.

And we may ultimately decide to go in that direction. However, we're early in Kolme's development, and that level of complexity isn't currently warranted. Instead, we currently simply track one version number of Kolme: the version of the kolme crate itself. This represents all different pieces of the system as one.

We follow Semantic Versioning (SemVer) for this version number.

Library versioning policy

As a Rust library, Kolme does not need to reinvent any wheels. We can follow standard Rust versioning rules. These are documented at length.

However, at its current state of development, Kolme does not strive to keep stable APIs. It is primarily an internal FP Block tool used for our internal development. As such, we strive to reduce unnecessary code breakage, but need not insist on such compatibility.

This will change at some point in the future, but not yet.

As a result of this, we currently have no specific policy around whether a change below results in a major, minor, or patch version bump. We'll refine this over time.

Application versioning impact

Applications maintain a version string to indicate compatibility with old block production, as discussed in the version upgrading guide. As a simple, conservative measure: any time you release any new version of the application, you should bump the application version number (code/chain version) and go through the full version upgrade process.

Technically, however, you only need to perform such a version bump if a change could result in differences in block production. In practice, almost any change could result in that, even a simple bump to a decimal library (since it may result in slightly different arithmetic results).

Unless explicitly stated otherwise, any change discussed below should be considered as requiring a new application version.

Changing Merkle serialization

Merkle serialization is the most important piece of Kolme to maintain compatibility for. Without this, old block data will be unreadable by newer versions of the library.

Here are some basic rules:

Any data structures that may be modified in the future should provide MerkleSerialize/MerkleDeserialize impls, instead of their Raw variants.
Any modification to the serialized data must result in a bump to the merkle_version method's return value.
As a strong recommendation, new fields should be added at the end of a data structure.
Any newly added fields can be serialized as normal, but when deserializing, you need to check the version number and ensure the field is parsed only for versions it was serialized in.
New fields must include a fallback value for parsing old data. This could either be via wrapping with Option, or providing a default value.

If all that seems a bit abstract, the easiest way to understand it is via the merkle-map versioning test code, e.g.:

// Assumes these items are exported from the root of the merkle_map crate.
use merkle_map::{MerkleDeserialize, MerkleSerialError, MerkleSerialize, MerkleSerializer};

// Version 0 of the type: only `name` and `age` are serialized.
#[derive(Clone, PartialEq, Eq, Debug)]
struct Person0 {
    name: String,
    age: u16,
}

impl MerkleSerialize for Person0 {
    fn merkle_serialize(&self, serializer: &mut MerkleSerializer) -> Result<(), MerkleSerialError> {
        let Self { name, age } = self;
        serializer.store(name)?;
        serializer.store(age)?;
        Ok(())
    }

    fn merkle_version() -> usize {
        0
    }
}

impl MerkleDeserialize for Person0 {
    fn merkle_deserialize(
        deserializer: &mut merkle_map::MerkleDeserializer,
        _version: usize, // only one version exists so far, so it is not inspected
    ) -> Result<Self, MerkleSerialError> {
        Ok(Self {
            name: deserializer.load()?,
            age: deserializer.load()?,
        })
    }
}

// Version 1 of the type: `street` is a new field, added at the end.
#[derive(Clone, PartialEq, Eq, Debug)]
struct Person1 {
    name: String,
    age: u16,
    street: String,
}

// Fallback value used when parsing data written by version 0.
const DEFAULT_STREET: &str = "Default street";

impl From<Person0> for Person1 {
    fn from(Person0 { name, age }: Person0) -> Self {
        Self {
            name,
            age,
            street: DEFAULT_STREET.to_owned(),
        }
    }
}

impl MerkleSerialize for Person1 {
    fn merkle_serialize(&self, serializer: &mut MerkleSerializer) -> Result<(), MerkleSerialError> {
        let Self { name, age, street } = self;
        serializer.store(name)?;
        serializer.store(age)?;
        serializer.store(street)?;
        Ok(())
    }

    // Bumped because the serialized form changed.
    fn merkle_version() -> usize {
        1
    }
}

impl MerkleDeserialize for Person1 {
    fn merkle_deserialize(
        deserializer: &mut merkle_map::MerkleDeserializer,
        version: usize,
    ) -> Result<Self, MerkleSerialError> {
        Ok(Self {
            name: deserializer.load()?,
            age: deserializer.load()?,
            // Only parse `street` for versions that actually serialized it.
            street: if version == 0 {
                DEFAULT_STREET.to_owned()
            } else {
                deserializer.load()?
            },
        })
    }
}

Changes to logs

Changing log messages may seem like something that doesn't affect downstream. However, it's something that can cause breakage in two ways:

Some log messages may be relied upon and parsed by downstream tools.
Since hashes of logs are stored in blocks, any change in logging will impact reproducibility of blocks.

Make sure that any change to logs is well documented in the changelog.

Changes to messages

This is more obvious than logs. Any change to built-in messages (admin, fund transfer, etc.) will result in changes to transactions and therefore blocks. This doesn't just apply to the API itself, but any change in the handling may result in differences in binary output.

Modifying gossip

Gossip modifications are less severe than the changes above. They impact the network protocol, but do not directly affect block production. Keeping compatibility with the immediately prior version of gossip is a good thing for seamless migrations.

Changelog

The changelog for Kolme is maintained in CHANGELOG.md at the repository root, following the Keep a Changelog format.

The initial version of the changelog is generated using git-cliff, which parses the commit history and creates a structured changelog.
After the initial generation, the changelog is updated manually by the team for each release or significant change.
There is no required commit message convention; all changelog updates are made directly in CHANGELOG.md as part of the release process.
When making changes, update the relevant sections ("Added", "Changed", "Fixed", etc.) in CHANGELOG.md to reflect what has been done since the last release.
The changelog is committed to the repository and should be kept up to date as part of the release process.

We're going to follow the "bump right before" strategy of bumping version numbers in Cargo.toml files just before cutting a release. That means that the repo will always have the newest released version number in the Cargo.toml files.

Sync manager

Motivation

One of the core functionalities of Kolme's network layer (the Gossip component) is the ability to sync blocks. Block syncing can occur in two different ways:

Block sync, where the raw block data is gossiped between nodes, and each node independently executes the transactions to generate blockchain state.
State sync, where the entirety of the state is gossiped between nodes.

Block sync is great for the common case of staying up to date with the chain, and provides for better verification by insisting on executing transactions locally. However, state sync is vital because it allows for:

Rapid synchronization of new nodes, without requiring running through the entire chain history and executing each transaction.
Handling version upgrades, where one version of the code base cannot run both old and new blocks.

The logic for performing both block and state sync is surprisingly complex, exacerbated by interacting with libp2p and needing to deal with various network limitations such as rate limiting and data caps. To handle this complexity, Gossip has its own state machine, the sync manager, which handles the process of:

Determining which block needs to be synced
Determining whether to use block or state sync, based on config parameters and local versus remote chain state
Tracking the complex process of collecting Merkle layers for state sync
Inserting new data into the data store

This document gives an overview of the sync manager to help developers understand how it works.

State sync: naive version

The basic mechanism for state sync is simple:

Find the latest block height (or a specific block height you're interested in).
Use gossip to ask the network who has that block.
Use libp2p's request/response to transfer that block.

The naive way of doing this is to serialize the entirety of the block and all of its Merkle data (framework state, app state, and logs) in one message. There are two problems with this:

It means that synchronizing additional blocks requires transferring all data again, bypassing the data deduplication logic of merkle-map.
For large enough stores, we blow past libp2p's 16 MB buffer limit, making state transfer impossible.

Therefore, we need something more intelligent.

State sync: real version

The real process of state sync is:

Get the raw information on a block, which includes the Merkle hashes of the framework state, app state, and logs. (From now on, to avoid breaking fingers during typing, we'll just call these three "block state.")
Request a single layer of the Merkle data for that block state. A single layer consists of the serialized content of that layer, plus the hashes of any children of that layer.
Recursively download all children until we get to layers that either have no children, or for which we already have all the children.
Store those "leaf layers" in the Merkle store.
Traverse back up the tree, writing successive parents to the store as we discover all their children.
When we finally store the entirety of the block state, write the block itself.

It's important that we only write a layer or block after we have all of its child layers. Otherwise, we would leave behind an inconsistent state in the Merkle store. For efficiency, we keep the invariant that a store must always have all the child layers of any layer written to the store. This avoids the need for a full recursive descent when resaving a Merkle map.
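The steps and the invariant above can be sketched with a toy, in-memory model. This is only an illustration under stated assumptions: `Hash`, `Layer`, `Store`, and `sync_layer` are hypothetical names, the "network" is a plain `HashMap` lookup, and plain recursion is used for readability, which ignores the networking constraints a real implementation has to deal with.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a Merkle hash.
type Hash = u64;

/// One Merkle layer: its serialized payload plus the hashes of its children.
#[derive(Clone, Debug)]
struct Layer {
    payload: Vec<u8>,
    children: Vec<Hash>,
}

/// Toy Merkle store. Invariant: a layer is only present if all of its
/// children are present as well.
#[derive(Default)]
struct Store {
    layers: HashMap<Hash, Layer>,
}

/// Fetch `hash` and everything beneath it from `network`, writing children
/// before parents. Returns how many layers were fetched, or `None` if the
/// network is missing a layer.
fn sync_layer(store: &mut Store, network: &HashMap<Hash, Layer>, hash: Hash) -> Option<usize> {
    if store.layers.contains_key(&hash) {
        // Thanks to the invariant, the whole subtree is already local.
        return Some(0);
    }
    let layer = network.get(&hash)?.clone();
    let mut fetched = 1;
    for child in &layer.children {
        fetched += sync_layer(store, network, *child)?;
    }
    // Every child is stored by now, so writing the parent keeps the invariant.
    store.layers.insert(hash, layer);
    Some(fetched)
}
```

Because a stored layer implies a stored subtree, syncing a second block that shares layers with the first only fetches the layers that differ. If a fetch fails partway through, the children already written still satisfy the invariant, since no parent was written before its children.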

Complexities

Unfortunately, there are quite a few complexities in implementing this:

libp2p's implementation is highly state-machine focused. You can't simply write a recursive descent algorithm to continue querying additional layers. Instead, you need a state machine to track the progress.
We need to deal with the possibility of timeouts and nodes going offline.
We also want to be able to do this in a CPU-efficient way. Retraversing all pending layers each time we write a new layer would not be acceptable.

`kolme::gossip::sync_manager` module

The heart of this implementation lives in the kolme::gossip::sync_manager module. This module provides a state machine with an API exposed to kolme::gossip. The API allows the gossip code itself to perform simple things:

Ask which pieces of data need to be requested from the network
Submit data from the network back to the state machine

The sync manager provides a Trigger which the gossip layer subscribes to. Every time new work is available, the trigger is tripped, causing more network requests to be made. This allows us to batch data loads while still proceeding quickly through processing the data.
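To illustrate why a trigger lets requests be batched, here is a std-only sketch of a coalescing trigger. It is not Kolme's actual `Trigger` type, whose definition is not shown in this document; `trigger_pair`, `trip`, `take`, and `Listener` are hypothetical names, and a bounded channel of capacity one stands in for whatever mechanism the real code uses.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};

/// Hypothetical trigger handle held by the side that produces work.
#[derive(Clone)]
struct Trigger {
    tx: SyncSender<()>,
}

/// Hypothetical listener handle held by the side that issues network requests.
struct Listener {
    rx: Receiver<()>,
}

/// Create a connected trigger/listener pair.
fn trigger_pair() -> (Trigger, Listener) {
    // Capacity 1: at most one "work available" signal is ever queued.
    let (tx, rx) = sync_channel(1);
    (Trigger { tx }, Listener { rx })
}

impl Trigger {
    /// Signal that new work is available. If a signal is already queued,
    /// this one is dropped, so many trips collapse into one wake-up.
    fn trip(&self) {
        let _ = self.tx.try_send(());
    }
}

impl Listener {
    /// Non-blocking check: returns true if the trigger was tripped since
    /// the last call, and clears the queued signal.
    fn take(&self) -> bool {
        self.rx.try_recv().is_ok()
    }
}
```

With this shape, adding ten layers in quick succession trips the trigger ten times but wakes the listener once, and the listener can then ask for the full batch of pending requests in a single pass.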

But the heart of the mechanism, and the focus of this document, is the state machine itself.

The state machine

The state machine explicitly tracks the syncing of only one block at a time. This is to deal with the realities of rate limiting within libp2p. Blocks are processed in ascending order.

A block first goes into the needed state, which means we need to download the raw block data from the network. Once we have the block, the sync manager determines whether we should perform a block or state sync. If we can perform a block sync, no further data downloads are needed: we execute the block and store the executed data in our data store.

However, if we need to perform a state sync, the entirety of the state data needs to be downloaded as well. We transition that block into the pending state and begin processing the layers.

We start off by setting the three top level layers (framework state, application state, and logs) as needed layers. Needed layers are requested from the network, and arrive with their payload and a list of their children.

Once we have all the children of a layer in the data store, we can add the parent. However, if a layer arrives with children we don't yet have, we have to:

Add the parent layer to the pending layers map
Add a reverse dependency from each child to the parent
Add all the missing children to the needed layers list

The list of needed layers is fed (in rate-limited chunks) to gossip to request from the network. As each layer arrives, gossip calls back into sync manager to add the layer.

Each time a layer is completed and written to storage, we also need to "traverse up":

Check all reverse dependencies to find the parents waiting for this layer
For each parent, check if all of its children are now in the data store
If so, recursively perform this process on the parent

Once all needed layers are downloaded, we have all of our block state available, and we can finally write the block to storage and proceed with any further block syncing.
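The layer bookkeeping described in this section can be sketched as a small, synchronous state machine. This is a simplified model rather than the real `sync_manager` code: `LayerSync`, `next_requests`, `add_layer`, and `traverse_up` are hypothetical names, hashes are plain integers, and the data store is an in-memory map.

```rust
use std::collections::{BTreeSet, HashMap};

type Hash = u64; // hypothetical stand-in for a Merkle hash

#[derive(Clone)]
struct Layer {
    payload: Vec<u8>,
    children: Vec<Hash>,
}

#[derive(Default)]
struct LayerSync {
    /// Layers fully written to the (toy) data store.
    stored: HashMap<Hash, Layer>,
    /// Layers that still have to be requested from the network.
    needed: BTreeSet<Hash>,
    /// Layers received but not yet writable because children are missing.
    pending: HashMap<Hash, Layer>,
    /// Reverse dependencies: child hash -> parents waiting on it.
    reverse_deps: HashMap<Hash, Vec<Hash>>,
}

impl LayerSync {
    /// Start with the top-level layers of the block state as needed.
    fn new(roots: &[Hash]) -> Self {
        LayerSync { needed: roots.iter().copied().collect(), ..Default::default() }
    }

    /// A rate-limited chunk of hashes to request next.
    fn next_requests(&self, limit: usize) -> Vec<Hash> {
        self.needed.iter().take(limit).copied().collect()
    }

    /// Called when a layer arrives from the network.
    fn add_layer(&mut self, hash: Hash, layer: Layer) {
        if !self.needed.remove(&hash) {
            return; // duplicate or unsolicited layer
        }
        let missing: Vec<Hash> = layer
            .children
            .iter()
            .copied()
            .filter(|c| !self.stored.contains_key(c))
            .collect();
        if missing.is_empty() {
            self.stored.insert(hash, layer);
            self.traverse_up(hash);
        } else {
            for child in missing {
                self.reverse_deps.entry(child).or_default().push(hash);
                // A child that is itself pending has already been downloaded.
                if !self.pending.contains_key(&child) {
                    self.needed.insert(child);
                }
            }
            self.pending.insert(hash, layer);
        }
    }

    /// After `completed` is written, store any parents that are now complete.
    /// Only the waiting parents are inspected, never the whole pending map.
    fn traverse_up(&mut self, completed: Hash) {
        let mut work = vec![completed];
        while let Some(done) = work.pop() {
            for parent in self.reverse_deps.remove(&done).unwrap_or_default() {
                let ready = match self.pending.get(&parent) {
                    Some(layer) => layer.children.iter().all(|c| self.stored.contains_key(c)),
                    None => false, // already written via another child
                };
                if ready {
                    let layer = self.pending.remove(&parent).unwrap();
                    self.stored.insert(parent, layer);
                    work.push(parent);
                }
            }
        }
    }

    /// True once every layer of the block state is in the store.
    fn is_done(&self) -> bool {
        self.needed.is_empty() && self.pending.is_empty()
    }
}
```

A driver would loop on `next_requests`, fetch those layers, and feed each one to `add_layer` until `is_done` returns true, at which point the block itself can be written. The sketch uses an explicit work list instead of recursion in `traverse_up`, which is a design choice of this example, not something the text prescribes.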

Peer discovery

We need to know how to find peers that have the data we need. We do this in two ways:

When gossip originally gets a response from a peer providing information on a block, we tag that peer as the peer we'll request data from. This gets stored in RequestStatus.
If a piece of data remains in needed for too long (by default, 5 seconds), we assume the peer we're talking to is slow or offline, and then we request additional peers from the network. This is tracked by the request_new_peers field of DataRequest.
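The timeout rule can be sketched as follows. Only the `request_new_peers` field name and the 5-second default come from the text above; `NeededItem`, `ToyDataRequest`, `build_request`, and the integer `PeerId` are hypothetical and do not reflect the real `RequestStatus` or `DataRequest` definitions.

```rust
use std::time::{Duration, Instant};

/// Hypothetical peer identifier; libp2p has its own `PeerId` type.
type PeerId = u32;

/// Default time an item may stay needed before we look for more peers.
const DEFAULT_PEER_TIMEOUT: Duration = Duration::from_secs(5);

/// Hypothetical record for a piece of data we are still waiting on.
struct NeededItem {
    /// Peer that originally told us about the block.
    peer: PeerId,
    /// When this item entered the needed state.
    since: Instant,
}

/// Hypothetical request handed to the gossip layer.
struct ToyDataRequest {
    /// Peer to ask for the data.
    peer: PeerId,
    /// Whether gossip should also ask the network for additional peers.
    request_new_peers: bool,
}

/// Build the next request for `item`, flagging it for new peers once it has
/// been waiting longer than `timeout`.
fn build_request(item: &NeededItem, now: Instant, timeout: Duration) -> ToyDataRequest {
    ToyDataRequest {
        peer: item.peer,
        request_new_peers: now.saturating_duration_since(item.since) > timeout,
    }
}
```

Passing `now` explicitly instead of calling `Instant::now()` inside the function keeps the rule easy to test with synthetic timestamps.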

Choosing blocks

Gossip provides three different sync modes which impact how sync manager chooses which blocks to sync:

In block sync mode, we only ever use block sync, not state sync. To make that work, we must (1) run the same code version as the chain version, and (2) synchronize all blocks from the beginning of the chain.
In state sync mode, we try to jump to the latest block via state sync. From then on, if possible, we'll use block sync to stay up to date. Additionally, if we're missing old blocks, the wait_for_block API call in Kolme will trigger a state sync of that older block.
In archive mode, we synchronize blocks using state sync, but ensure we have a full chain history from the beginning. This is good for running an archive node, thus the sync mode name.
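One way to picture how the three modes affect block selection is the sketch below. It is a simplified model built on assumptions: the names `SyncMode`, `Method`, and `choose_next` are hypothetical, block heights are assumed to start at 0, "if possible" is reduced to a single `same_version` flag, and the `wait_for_block` path for older blocks is left out.

```rust
/// Hypothetical names for the three sync modes described above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SyncMode {
    Block,
    State,
    Archive,
}

/// How a chosen block will be brought into the local store.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Method {
    BlockSync,
    StateSync,
}

/// Pick the next block height to sync and the method to use.
///
/// * `next_local`: height right after our newest contiguous local block
///   (0 if we have nothing).
/// * `latest`: latest height known on the network.
/// * `same_version`: whether our code version matches the chain version.
fn choose_next(
    mode: SyncMode,
    next_local: u64,
    latest: u64,
    same_version: bool,
) -> Option<(u64, Method)> {
    if next_local > latest {
        return None; // already up to date
    }
    match mode {
        // Block mode replays every block in order and needs matching code.
        SyncMode::Block if same_version => Some((next_local, Method::BlockSync)),
        SyncMode::Block => None,
        // State mode jumps to the tip, then prefers block sync once caught up.
        SyncMode::State if same_version && next_local == latest => {
            Some((latest, Method::BlockSync))
        }
        SyncMode::State => Some((latest, Method::StateSync)),
        // Archive mode fills in the full history using state sync.
        SyncMode::Archive => Some((next_local, Method::StateSync)),
    }
}
```

For example, a fresh node in state sync mode with the chain at height 10 would state sync block 10 directly, while the same node in archive mode would start from height 0 and work upward.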

Kolme Framework Developer Docs