BitVMX Off-chain Communication System: Key Components and Secure Strategies

Kevin Wahle · 18.10.2024

BitVMX aims to build a flexible framework for blockchain bridges, oracles, and proof verifiers. A crucial component of BitVMX is its off-chain communication system. This system provides an authenticated method for secure communication between the Prover and the Challenger throughout the entire protocol, including setup and key sharing. The implementation must prioritize security to eliminate vulnerabilities, while maintaining a high level of flexibility to accommodate various scenarios.

General Characteristics

The BitVMX off-chain communication system uses an asynchronous framework, built with Tokio, to run operations without blocking other parts of the protocol. While these asynchronous tasks handle specific events, other components of the BitVMX system can continue to run in a synchronous way. From here on, we will refer to these as sync and async processes, which are carefully isolated to ensure they do not interfere with one another.

Async events are triggered in various scenarios, such as during connection, handshake, identity validation, message transmission and verification. This async handling enables the network to maintain continuous, decentralized operations while the rest of the BitVMX protocol executes.

The behavior of the system is guided by specific events, which determine the key actions the program will focus on. The most important event in this setup is the Connection-Response event. This event plays a crucial role in enabling message exchanges between nodes.

Network communication is inherently async, and the networking library we rely on is built on async operations. To keep this from spreading into the rest of the codebase, we have encapsulated the async code behind a sync interface that communicates through channels (queues). This design keeps the rest of the system isolated, allowing it to consume messages at its own pace while ensuring that connections, requests, and messages are handled in a timely manner.

Single point of failure

To prevent single points of failure, we rely on peer-to-peer (P2P) communication protocols rather than a centralized network. In a centralized network, all nodes or users depend on a central authority or server to manage communication, data storage, and overall network functionality. This creates a single point of failure: if the central server experiences an outage, gets compromised, or becomes overloaded, the entire network can fail, leading to service disruptions or complete downtime.

In contrast, a P2P network distributes the responsibilities across all participating nodes. Each node can communicate directly with others without relying on a central authority. This decentralized approach ensures that the network remains operational even if one or more nodes fail or are compromised. Since there is no single node the entire network depends on, the impact of a failure in any part of the network is minimized.

Keys and IDs

To establish communication with a node, its IP address must be known in advance. Once a connection is made, the node is identified within the network by its Peer ID. Both the IP address and Peer ID are unique identifiers, but guaranteeing the Peer ID’s uniqueness and securing the communication channel require different cryptographic keys:

  • Identity Keys: Every node in the network has a pair of identity keys (public and private). The Peer ID is derived from the public identity key, essentially functioning as a hashed version of it. This Peer ID uniquely identifies the node within the network and remains permanently tied to that node.
  • Ephemeral Keys: These are temporary key pairs (public and private) generated for each individual communication session. They play a crucial role in the Diffie-Hellman key exchange, which allows two nodes to compute a shared secret key without ever transmitting it over the network.
  • Static Keys: These keys are used by nodes to authenticate each other during the key exchange process, ensuring trust between the communicating parties.

By using this combination of identity, ephemeral, and static keys, nodes can securely identify each other and establish encrypted communication channels. This setup protects against security threats like Man-in-the-Middle (MitM) attacks and ensures the integrity of the communication.
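To make the role of the ephemeral keys concrete, here is a minimal sketch of a Diffie-Hellman exchange using deliberately tiny textbook numbers. The constants `P` and `G`, the `EphemeralKeyPair` type, and its methods are hypothetical names chosen for illustration; they are not part of BitVMX. Real Noise implementations typically use an elliptic-curve function such as X25519 rather than modular exponentiation over small integers, so this only illustrates the principle.

```rust
// Toy parameters for illustration only. They are far too small to be secure.
const P: u64 = 23; // public prime modulus
const G: u64 = 5; // public generator

/// Computes base^exp mod modulus by repeated squaring.
fn mod_pow(mut base: u64, mut exp: u64, modulus: u64) -> u64 {
    let mut result = 1u64;
    base %= modulus;
    while exp > 0 {
        if exp & 1 == 1 {
            result = ((result as u128 * base as u128) % modulus as u128) as u64;
        }
        base = ((base as u128 * base as u128) % modulus as u128) as u64;
        exp >>= 1;
    }
    result
}

/// A per-session key pair (hypothetical type). Only `public` is ever sent.
struct EphemeralKeyPair {
    private: u64,
    public: u64,
}

impl EphemeralKeyPair {
    /// Derives the public half from a private value. A real node would
    /// draw the private value from a cryptographic random source.
    fn from_private(private: u64) -> Self {
        Self { private, public: mod_pow(G, private, P) }
    }

    /// Combines our private key with the peer's public key.
    /// Both sides arrive at the same value without transmitting it.
    fn shared_secret(&self, peer_public: u64) -> u64 {
        mod_pow(peer_public, self.private, P)
    }
}
```

With private values 6 and 15, the two sides publish 8 and 19 and each computes the shared secret 2 locally. An eavesdropper sees only the public values.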

Initial Handshake

In previous sections, we discussed encrypted channels and key types. Now, let’s break down how this works in practice. The initial handshake is crucial for establishing secure communication and identifying the nodes.

We use TCP for data transmission, which ensures reliable delivery of messages. TCP on its own provides neither encryption nor authentication, so BitVMX relies on the Noise Protocol Framework to protect the communication channel. The protocol is designed to:

  • Authenticate both parties: Each node must prove its identity to the other to prevent impersonation attacks.
  • Secure the communication channel: The protocol establishes a shared secret that encrypts all subsequent communication. By using ephemeral keys and Diffie-Hellman exchanges, Noise ensures that even if an attacker intercepts the communication, they cannot impersonate or eavesdrop on the session.

Here’s how the initial handshake works when establishing a connection using the Noise Protocol:

  1. Key Exchange: During the initial handshake, both nodes exchange their ephemeral public keys. These keys are essential for the Diffie-Hellman key exchange, allowing both parties to derive a shared secret without revealing it over the network. The shared secret is later used to encrypt communication.
  2. Authentication: After creating the shared secret using ephemeral keys, the nodes authenticate each other using their static keys. Each static key is signed with the node’s private identity key, which ties it to the node’s Peer ID and proves that the node you are talking to is the one you expect. This protects against impersonation.
  3. Encryption: With both the shared secret and verified identities, all further communication between the nodes is encrypted. This means that even if someone intercepts the data, they won’t be able to read it.

By employing Noise for the handshake and session establishment, BitVMX ensures that the communication between nodes is both secure and efficient, maintaining the integrity of the P2P network while protecting against various types of attacks.
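The three steps above can be read as a small state machine. The sketch below is only an illustration of that ordering: the `HandshakeState` and `HandshakeEvent` names and the rule that any out-of-order event aborts the handshake are assumptions, not the actual BitVMX or Noise implementation.

```rust
/// Phases of the handshake as described in the steps above (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum HandshakeState {
    KeyExchange,
    Authentication,
    Encrypted,
    Failed,
}

/// Things that can happen while a handshake is in progress.
#[derive(Debug, Clone, Copy)]
enum HandshakeEvent {
    EphemeralKeysExchanged,
    StaticKeyVerified,
    StaticKeyRejected,
}

/// Moves the handshake forward. Any event that arrives in the wrong
/// phase is treated as a failure (an assumption made for this sketch).
fn advance(state: HandshakeState, event: HandshakeEvent) -> HandshakeState {
    use HandshakeEvent::*;
    use HandshakeState::*;
    match (state, event) {
        (KeyExchange, EphemeralKeysExchanged) => Authentication,
        (Authentication, StaticKeyVerified) => Encrypted,
        (Authentication, StaticKeyRejected) => Failed,
        _ => Failed,
    }
}

/// Application messages are only allowed once the channel is encrypted.
fn can_send_application_data(state: HandshakeState) -> bool {
    state == HandshakeState::Encrypted
}
```

Modeling the phases explicitly makes it hard to send application data by accident before both the shared secret and the peer's identity are in place.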

Allow list

In this initial implementation, we use a pre-established allow list that contains the peer IDs and IP addresses of nodes permitted to connect to the network. If a node is not on this allow list, it is immediately blocked and cannot even begin the handshake process. Similarly, if a node is on the list but its peer ID does not match its IP address, it will be automatically disconnected.
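The two rejection rules can be sketched as a lookup keyed by peer ID. The `AllowList` and `Admission` types below are hypothetical, peer IDs are shown as plain strings for brevity, and the sketch assumes each peer ID maps to exactly one IP address.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// Outcome of checking a connecting node against the allow list.
#[derive(Debug, PartialEq, Eq)]
enum Admission {
    Accept,
    /// The peer ID is not on the list: block before the handshake.
    RejectUnknownPeer,
    /// The peer ID is listed, but under a different IP address: disconnect.
    RejectAddressMismatch,
}

/// Pre-established mapping from peer ID to the IP address it may use.
struct AllowList {
    entries: HashMap<String, IpAddr>,
}

impl AllowList {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Registers (or updates) a permitted peer.
    fn allow(&mut self, peer_id: &str, addr: IpAddr) {
        self.entries.insert(peer_id.to_string(), addr);
    }

    /// Applies both rules: the peer must be listed, and its address must match.
    fn check(&self, peer_id: &str, addr: IpAddr) -> Admission {
        match self.entries.get(peer_id) {
            None => Admission::RejectUnknownPeer,
            Some(expected) if *expected != addr => Admission::RejectAddressMismatch,
            Some(_) => Admission::Accept,
        }
    }
}
```

Returning a distinct variant for each failure lets the caller report the reason for a rejection instead of a bare yes or no.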

However, being on the allow list does not guarantee that a node is harmless. To guard against potential attacks, we have implemented a rate limiter, which prevents a single node from overwhelming the system with certain types of attacks. We’ll dive into how this works in a future article.

Any disconnections or suspicious activities are reported to the sync part of the system, which can then take appropriate action.

The allow list is designed with the potential for updates in mind, allowing for adjustments as needed based on the consensus of other nodes. This approach ensures that the network can adapt to new circumstances and evolving requirements over time.

Message exchange

A crucial part of message exchange is the request-response event: when one node sends a message, the other node has the opportunity to respond within specific limits. The sender can transmit any combination of bytes, up to 1 MB in size. Once the message is received, the recipient has a limited amount of time (usually a few seconds, but this is adjustable) to send a response. The reply can contain up to 10 MB of data.

While the structure of message data is predefined, the system is designed to handle a wide variety of data types. To ensure flexibility, the message sent by the sender is represented as a vector of bytes. In response, the recipient typically sends a simple boolean value, confirming whether the message was successfully received. If the message was not received correctly, the recipient may respond with a false value, or in other cases, they might not respond at all, depending on the nature of the error.
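As a rough illustration of these limits, the sketch below checks payload sizes and produces the boolean acknowledgement. The constant and function names are hypothetical, and treating 1 MB as 1024 × 1024 bytes is an assumption; the text does not say which convention is used.

```rust
// Size limits taken from the text; the binary interpretation is an assumption.
const MAX_REQUEST_BYTES: usize = 1024 * 1024;
const MAX_RESPONSE_BYTES: usize = 10 * 1024 * 1024;

/// Error returned when a payload exceeds its limit.
#[derive(Debug, PartialEq, Eq)]
struct TooLarge {
    len: usize,
    limit: usize,
}

fn check_size(payload: &[u8], limit: usize) -> Result<(), TooLarge> {
    if payload.len() > limit {
        Err(TooLarge { len: payload.len(), limit })
    } else {
        Ok(())
    }
}

/// Requests are arbitrary bytes, capped at the request limit.
fn check_request(payload: &[u8]) -> Result<(), TooLarge> {
    check_size(payload, MAX_REQUEST_BYTES)
}

/// Responses have a larger cap than requests.
fn check_response(payload: &[u8]) -> Result<(), TooLarge> {
    check_size(payload, MAX_RESPONSE_BYTES)
}

/// Recipient side: answer `true` if the request was accepted, `false` otherwise.
fn acknowledge(request: &[u8]) -> bool {
    check_request(request).is_ok()
}
```

Keeping the request as a plain byte slice mirrors the vector-of-bytes representation described above; decoding into a concrete message type would happen in a later layer.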

Importantly, this communication process only occurs between nodes that have successfully completed the handshake protocol and verified each other's identity. Once verification is complete, nodes are added to a trusted peer list, ensuring secure communication channels.

The channels created for sending responses have a limited lifespan, as mentioned earlier, and can only be used once per message. These channels must follow a specific pre-established format for communication to function correctly. If the response message doesn’t meet the format or protocol requirements, it might not reach the original sender. However, if it does manage to arrive despite the error, it may still have consequences for the sender, which will be explained in a future article when we describe how rate limiting works.
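The single-use, time-limited nature of a response channel can be sketched with the standard library alone. The `ResponseChannel`, `PendingResponse`, and `response_pair` names are hypothetical, and `std::sync::mpsc` is used here only as a stand-in for whatever channel type the real async code uses.

```rust
use std::sync::mpsc::{self, Receiver, SyncSender};
use std::time::Duration;

/// Held by the recipient of a request. It can carry exactly one reply.
struct ResponseChannel {
    tx: SyncSender<bool>,
}

impl ResponseChannel {
    /// Takes `self` by value, so the compiler rejects a second reply.
    /// Returns false if the requester has already stopped waiting.
    fn respond(self, ok: bool) -> bool {
        self.tx.send(ok).is_ok()
    }
}

/// Held by the requester while it waits for the reply.
struct PendingResponse {
    rx: Receiver<bool>,
}

impl PendingResponse {
    /// Waits up to `timeout`. Returns None if no reply arrives in time
    /// or if the channel was dropped without a reply.
    fn wait(self, timeout: Duration) -> Option<bool> {
        self.rx.recv_timeout(timeout).ok()
    }
}

/// Creates the two ends of a one-shot response channel.
fn response_pair() -> (ResponseChannel, PendingResponse) {
    let (tx, rx) = mpsc::sync_channel(1);
    (ResponseChannel { tx }, PendingResponse { rx })
}
```

Because `respond` consumes the channel, "once per message" is enforced at compile time, while the timeout passed to `wait` models the adjustable response window.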

Channels

Since the P2P code runs asynchronously, channels play a crucial role in enabling communication between different parts of the system. There are multiple channels, each with a specific function. For instance, as mentioned earlier, there is a dedicated channel for handling responses within the request-response protocol, where each message request receives a corresponding response.

However, channels are not only used for handling external communication between peers. They also manage internal communication between the async and sync parts of the code. This separation is vital, as it ensures that the entire protocol does not need to be async, allowing for better flexibility and isolation of the different components.

To achieve this, the system implements two channels: one for sending messages (the sender) and one for receiving them (the receiver). Each serves a distinct purpose in facilitating communication across the async and sync components.

From the perspective of the sync part of the system, the receiver channel is responsible for handling incoming messages and events from other nodes, while the sender channel manages outgoing messages and directives.
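A minimal sketch of this two-channel bridge follows. It uses a plain thread and `std::sync::mpsc` as a stand-in for the Tokio task and its channels, and the `Command`, `Event`, `NetworkHandle`, and `spawn_network` names are hypothetical. The worker simply reports each outgoing message back as an event instead of touching a real network.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread;

/// Directives flowing from the sync side to the network side.
enum Command {
    Send { peer_id: String, payload: Vec<u8> },
    Shutdown,
}

/// Messages and events flowing from the network side to the sync side.
#[derive(Debug, PartialEq)]
enum Event {
    Delivered { peer_id: String, bytes: usize },
    Stopped,
}

/// What the sync side holds: a sender for commands and a receiver for events.
struct NetworkHandle {
    commands: Sender<Command>,
    events: Receiver<Event>,
    worker: thread::JoinHandle<()>,
}

/// Starts the network side in the background and returns the sync-side handle.
fn spawn_network() -> NetworkHandle {
    let (command_tx, command_rx) = mpsc::channel::<Command>();
    let (event_tx, event_rx) = mpsc::channel::<Event>();

    let worker = thread::spawn(move || {
        // Process commands until Shutdown or until the sync side hangs up.
        for command in command_rx {
            match command {
                Command::Send { peer_id, payload } => {
                    // A real node would write to the encrypted connection here.
                    let _ = event_tx.send(Event::Delivered { peer_id, bytes: payload.len() });
                }
                Command::Shutdown => break,
            }
        }
        let _ = event_tx.send(Event::Stopped);
    });

    NetworkHandle { commands: command_tx, events: event_rx, worker }
}
```

The sync side never blocks on network I/O: it queues a `Command` and reads `Event`s whenever it is ready, which is the isolation property described above.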

Summary

The BitVMX off-chain communication system provides a secure, decentralized P2P framework using an async model to allow continuous, secure interaction between nodes. By eliminating single points of failure, BitVMX maintains resilience, incorporating robust identity protocols and secure handshakes to safeguard against common attacks. Key features like an allow list and rate limiter protect against network overload, while channels enable interaction between sync and async tasks. In the next article, we’ll explore BitVMX's advanced channel structure, rate-limiting mechanism, practical usage, and upcoming work.
