Monitoring and Verification
Real-Time Monitoring Overview
Monitoring is the backbone of Centive Network’s performance validation. To ensure that nodes consistently deliver on their commitments, Centive deploys real-time Application Performance Monitoring (APM) agents that track node performance, resource utilization, and availability. This proactive monitoring allows for immediate detection of inefficiencies and faults, ensuring that only reliable contributors are rewarded.
APM Agents in Action
APM agents continuously monitor node performance metrics such as CPU usage, memory, bandwidth, and uptime. These metrics are fed back into the Centive Network in real time to update the node's contribution score.
For example, a compute node contributing 16 cores is monitored based on actual CPU utilization and uptime. If a node's performance drops below a certain threshold, it may lose reward eligibility for that period.
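The eligibility check described above can be sketched roughly as follows. This is a minimal illustration, not Centive's implementation: the `NodeSample` shape, the function name, and both threshold values are assumptions, since the source does not state the actual thresholds.

```python
from dataclasses import dataclass


@dataclass
class NodeSample:
    """One monitoring period for a compute node (hypothetical shape)."""
    pledged_cores: int       # cores the node committed, e.g. 16
    avg_cores_used: float    # average cores actually serving work
    uptime_fraction: float   # share of the period the node was reachable, 0.0 to 1.0


# Assumed thresholds for illustration only; the source does not specify them.
MIN_UTILIZATION = 0.50
MIN_UPTIME = 0.95


def is_reward_eligible(sample: NodeSample) -> bool:
    """Return True when both utilization and uptime clear their thresholds."""
    utilization = sample.avg_cores_used / sample.pledged_cores
    return utilization >= MIN_UTILIZATION and sample.uptime_fraction >= MIN_UPTIME


healthy = NodeSample(pledged_cores=16, avg_cores_used=12.0, uptime_fraction=0.99)
degraded = NodeSample(pledged_cores=16, avg_cores_used=12.0, uptime_fraction=0.80)
print(is_reward_eligible(healthy))   # True
print(is_reward_eligible(degraded))  # False
```

Here the degraded node uses its cores just as heavily as the healthy one, but its uptime falls below the assumed minimum, so it would lose eligibility for that period.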
Monitoring Metrics
Centive Network uses several key performance indicators (KPIs) to evaluate the contributions of each node:
- Latency: Measures the delay between a request and the node’s response.
- Throughput: Assesses the amount of data processed within a given time.
- Uptime: Evaluates the total time the node remains active and available.
- Resource Utilization: Tracks how effectively the node is using its committed resources (CPU, memory, etc.).
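As a rough sketch of how such KPIs might be derived from raw request records, the snippet below summarizes one observation window. The `RequestRecord` fields and the `summarize` helper are hypothetical, and the success ratio is only a stand-in for uptime; the source does not describe how the APM agents compute these values.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RequestRecord:
    """Outcome of a single request handled by a node (hypothetical shape)."""
    latency_ms: float
    bytes_processed: int
    succeeded: bool


def summarize(records: list[RequestRecord], window_s: float) -> dict[str, float]:
    """Compute simple KPIs over one observation window of `window_s` seconds."""
    ok = [r for r in records if r.succeeded]
    return {
        # Latency: average delay of successful requests.
        "avg_latency_ms": mean(r.latency_ms for r in ok) if ok else float("inf"),
        # Throughput: data processed per second of the window.
        "throughput_bytes_per_s": sum(r.bytes_processed for r in ok) / window_s,
        # Availability proxy: share of requests that were served at all.
        "success_ratio": len(ok) / len(records) if records else 0.0,
    }


records = [
    RequestRecord(latency_ms=100.0, bytes_processed=1000, succeeded=True),
    RequestRecord(latency_ms=200.0, bytes_processed=3000, succeeded=True),
    RequestRecord(latency_ms=0.0, bytes_processed=0, succeeded=False),
]
kpis = summarize(records, window_s=2.0)
print(kpis["avg_latency_ms"])          # 150.0
print(kpis["throughput_bytes_per_s"])  # 2000.0
```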
Efficiency Calculation
The efficiency of a node is calculated based on how well it utilizes its resources:
$$E = \frac{R_{\text{used}}}{R_{\text{pledged}}}$$
Where:
- $R_{\text{used}}$ represents the actual resources used for service.
- $R_{\text{pledged}}$ represents the total resources the node initially pledged.
Nodes that maintain high efficiency consistently over time are rewarded accordingly.
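A minimal sketch of this ratio, resources actually used divided by resources pledged, averaged over several periods to reflect consistency over time. The function names are illustrative, and capping the value at 1.0 is an assumption rather than something the source specifies.

```python
from statistics import mean


def efficiency(used: float, pledged: float) -> float:
    """Ratio of resources actually used to resources pledged."""
    if pledged <= 0:
        raise ValueError("pledged resources must be positive")
    # Assumption: usage above the pledge does not push efficiency past 1.0.
    return min(used / pledged, 1.0)


def average_efficiency(periods: list[tuple[float, float]]) -> float:
    """Mean efficiency over a list of (used, pledged) periods."""
    return mean(efficiency(used, pledged) for used, pledged in periods)


print(efficiency(12, 16))                                 # 0.75
print(average_efficiency([(12, 16), (8, 16), (16, 16)]))  # 0.75
```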
End-to-End Testing Process
Beyond performance metrics, Centive Network employs end-to-end (E2E) testing to simulate real-world usage scenarios. These tests ensure that the contributions made by nodes are not only consistent but also meet the quality standards required by DePIN services.
How E2E Tests Work
- Simulated Workloads: E2E tests create simulated tasks or workloads, mimicking actual service requests for resources.
- Performance Metrics: The tests monitor how quickly nodes respond to these workloads and how consistently they perform.
- Fault Detection: Any failure to meet the expected performance, such as downtime or resource underutilization, is logged and flagged for review.
Example
In a decentralized storage system, E2E tests may request data retrievals from different nodes. If a node is unable to retrieve the data within the required time or the data is corrupt, the node will be flagged for failure. The validators will review the logs and determine whether the node should be penalized.
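The storage example can be illustrated with a small check that times a retrieval and verifies the returned bytes against a known SHA-256 digest. This is a sketch under stated assumptions: the `check_retrieval` name, the `fetch` callable, the outcome labels, and the use of SHA-256 for integrity are all hypothetical, as the source does not say how corruption is detected.

```python
import hashlib
import time
from typing import Callable


def check_retrieval(fetch: Callable[[], bytes], expected_sha256: str, deadline_s: float) -> str:
    """Run one retrieval test and classify the outcome.

    Returns "ok", "timeout", "corrupt", or "unavailable".
    The elapsed time is measured after the fetch returns; the call is not cancelled.
    """
    start = time.monotonic()
    try:
        data = fetch()
    except Exception:
        return "unavailable"
    elapsed = time.monotonic() - start
    if elapsed > deadline_s:
        return "timeout"
    if hashlib.sha256(data).hexdigest() != expected_sha256:
        return "corrupt"
    return "ok"


payload = b"example object"
digest = hashlib.sha256(payload).hexdigest()

print(check_retrieval(lambda: payload, digest, deadline_s=1.0))        # ok
print(check_retrieval(lambda: b"tampered", digest, deadline_s=1.0))    # corrupt
```

Any result other than "ok" would be logged so that validators can review it.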
Key Testing Metrics
- Response Time: How fast a node processes a request.
- Reliability: The node’s ability to consistently meet performance standards.
- Integrity: Whether the node returns the correct data or performs the correct computation.
Performance Metrics
Centive Network evaluates node contributions using several performance metrics that define the overall quality of a node’s contribution. The goal is to reward nodes that not only contribute resources but also deliver them reliably and efficiently.
Primary Metrics
- Latency: Measures how quickly a node can process requests.
- Throughput: The volume of data or tasks a node can process within a specific time.
- Uptime: Measures how consistently a node remains available.
- Fault Tolerance: Determines how well the node can handle faults without service interruptions.
Performance Score Calculation
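Each node is assigned a performance score based on weighted metrics:
$$P = w_L \cdot L + w_T \cdot T + w_U \cdot U + w_F \cdot F$$
Where $L$, $T$, $U$, and $F$ are the node's latency, throughput, uptime, and fault tolerance scores, and $w_L$, $w_T$, $w_U$, and $w_F$ are weighting factors that depend on the service type and its requirements.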
For example, in a compute service, throughput and latency might carry more weight than uptime, whereas in a storage service, uptime and fault tolerance might be prioritized.
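One way to picture service-dependent weighting is a weighted sum over metrics that have already been normalized to the range 0 to 1, with higher meaning better (so a low raw latency maps to a high latency score). The weight values, the `Metrics` type, and the normalization are assumptions made for illustration; the source does not publish actual weights.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    """Normalized metric scores in [0, 1]; higher is better (assumed convention)."""
    latency: float
    throughput: float
    uptime: float
    fault_tolerance: float


# Hypothetical weights per service type; each set sums to 1.0.
WEIGHTS = {
    "compute": {"latency": 0.35, "throughput": 0.35, "uptime": 0.15, "fault_tolerance": 0.15},
    "storage": {"latency": 0.15, "throughput": 0.15, "uptime": 0.35, "fault_tolerance": 0.35},
}


def performance_score(m: Metrics, service_type: str) -> float:
    """Weighted sum of the four metric scores for the given service type."""
    w = WEIGHTS[service_type]
    return (
        w["latency"] * m.latency
        + w["throughput"] * m.throughput
        + w["uptime"] * m.uptime
        + w["fault_tolerance"] * m.fault_tolerance
    )


node = Metrics(latency=0.9, throughput=0.8, uptime=0.6, fault_tolerance=0.5)
print(round(performance_score(node, "compute"), 2))  # 0.76
print(round(performance_score(node, "storage"), 2))  # 0.64
```

The same node scores higher as a compute provider than as a storage provider because its strengths (latency and throughput) are the metrics weighted most heavily for compute in this sketch.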
Fault Detection & Resolution
Fault detection is critical to maintaining the integrity of a decentralized network. Centive Network’s APM agents and validators work in tandem to identify faulty nodes and address issues proactively.
How Fault Detection Works
- Real-Time Detection: APM agents continuously monitor nodes for performance drops or unresponsiveness.
- Validator Review: When a node is flagged for poor performance, validators review the fault and determine whether to penalize the node.
- Secret Voting: A secret voting process by validators determines whether the node should be slashed or removed from the network.
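The source does not specify how vote secrecy is achieved. One common approach is a commit-reveal scheme, sketched below purely as an illustration: each validator first publishes a hash of its vote plus a random salt, then later reveals both so the vote can be verified and tallied. The function names, the vote labels, and the simple-majority rule are all assumptions.

```python
import hashlib
import secrets


def commit(vote: str, salt: bytes) -> str:
    """Commitment published during the voting phase: hides the vote until reveal."""
    return hashlib.sha256(salt + vote.encode("utf-8")).hexdigest()


def reveal_is_valid(commitment: str, vote: str, salt: bytes) -> bool:
    """Check that a revealed (vote, salt) pair matches the earlier commitment."""
    return secrets.compare_digest(commitment, commit(vote, salt))


def tally(revealed_votes: list[str]) -> str:
    """Assumed rule: slash only when a strict majority voted 'slash'."""
    slash_votes = revealed_votes.count("slash")
    return "slash" if slash_votes * 2 > len(revealed_votes) else "no_action"


# Commit phase: each validator picks a random salt and publishes only the hash.
salt = secrets.token_bytes(16)
commitment = commit("slash", salt)

# Reveal phase: the vote and salt are disclosed and checked against the hash.
print(reveal_is_valid(commitment, "slash", salt))  # True
print(reveal_is_valid(commitment, "keep", salt))   # False
print(tally(["slash", "slash", "keep"]))           # slash
```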
Example
In a content delivery network, if a node is unable to serve requested content within the expected time, the fault is logged. If the issue persists over multiple requests, the node is flagged, and validators vote on whether it should be penalized.
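The "persists over multiple requests" condition could be tracked with a simple consecutive-failure counter, as in the sketch below. The `FaultTracker` class and the limit of three failures are hypothetical; the source does not say how many failed requests trigger a flag.

```python
class FaultTracker:
    """Counts consecutive failed requests for one node (illustrative only)."""

    def __init__(self, max_consecutive_failures: int = 3) -> None:
        self.max_consecutive_failures = max_consecutive_failures
        self._streak = 0

    def record(self, served_in_time: bool) -> bool:
        """Record one request outcome; return True when the node should be flagged."""
        if served_in_time:
            self._streak = 0   # a successful request resets the streak
        else:
            self._streak += 1  # each failure is logged toward the limit
        return self._streak >= self.max_consecutive_failures


tracker = FaultTracker(max_consecutive_failures=3)
outcomes = [False, False, True, False, False, False]
flags = [tracker.record(ok) for ok in outcomes]
print(flags)  # [False, False, False, False, False, True]
```

The isolated failures at the start do not flag the node because a successful request intervenes; only the final run of three failures does.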
Fault Detection Threshold
To ensure fairness, Centive uses a threshold system to determine whether a node has failed: if the node's fault rate exceeds a certain percentage, the node is flagged for review.
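The threshold check can be sketched as follows, assuming the fault rate is the percentage of failed requests among all requests in a review window. Both that definition and the 10% threshold are assumptions for illustration; the source gives neither.

```python
# Assumed threshold for illustration; the source only says "a certain percentage".
FAULT_THRESHOLD_PERCENT = 10.0


def fault_rate(failed: int, total: int) -> float:
    """Percentage of requests in the window that failed."""
    if total <= 0:
        raise ValueError("total requests must be positive")
    return 100.0 * failed / total


def should_flag(failed: int, total: int, threshold_percent: float = FAULT_THRESHOLD_PERCENT) -> bool:
    """Flag the node for validator review when its fault rate exceeds the threshold."""
    return fault_rate(failed, total) > threshold_percent


print(fault_rate(5, 100))    # 5.0
print(should_flag(5, 100))   # False
print(should_flag(15, 100))  # True
```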