Presenting NodeStats: The One-Stop Shop for Ethereum Node Metrics
Although a number of smart contract platforms are punching well above their weight, it cannot be denied that Ethereum remains the most popular platform for experimenting with decentralized ideas in the true sense of the term.
The meteoric rise in popularity of Ethereum has spurred the growth of industry and tech intelligence platforms that can provide robust and dynamic data pertaining to nodes, servers, and other metrics of the smart contracts platform. Seeing the large market appetite for such data-crunching services, TokenAnalyst, in collaboration with BitMEX Research, has launched nodestats.org.
What Does NodeStats Do?
NodeStats functions as a one-stop shop for Ethereum network node data. The website gathers data from five Ethereum nodes every five seconds. This way, NodeStats relays near real-time metrics covering the computational resources used by each node.
The website essentially juxtaposes figures from the two most widely adopted Ethereum client implementations, Geth and Parity. Across these two clients, NodeStats compares the performance of different node configurations: fast, full, and archive nodes.
According to BitMEX Research’s blog post, the three primary objectives of NodeStats are:
- To offer people metrics related to the computational efficiency of various Ethereum node implementations. These metrics can be in the form of CPU usage, memory usage, peer count, bandwidth, and storage space, among other things.
- To provide precise data on the resource requirements of running different node software. For example, the website lets users compare the resource requirements of running Ethereum node software with those of other coins, such as Bitcoin (BTC) and Litecoin (LTC).
- Lastly, NodeStats aims to extract data that could help analysts and intelligence firms gauge the strength of the Ethereum peer-to-peer network and its transaction processing speed. This is done by collecting data from various Ethereum nodes and checking whether they have processed blocks fast enough to remain at the tip of the chain. If blocks are processed too slowly, a node can remain out of sync for a long period of time.
The Type of Data Extracted by NodeStats
NodeStats records data from the Ethereum client implementations every five seconds. That works out to 720 samples per hour (3,600 seconds divided by five), which gives the platform a reasonably dense data set from which to draw its hourly conclusions.
Some of the major data metrics collected by NodeStats are as follows:
- Percentage of time in sync
This metric represents the percentage of time the Ethereum node has downloaded and verified all block data up to the chain tip. NodeStats checks whether the node reports being at the tip, that is, at the latest block added to the blockchain. The hourly figure is built from the 720 queries sent to each node every hour.
According to the data gathered by NodeStats so far, nodes report being at the tip around 99.8 percent of the time. In other words, of the 720 queries sent to a node each hour, only about one finds the node away from the chain tip.
NodeStats adds that the data integrity of this metric is “poor” and that going forward they aim to devise a more effective way of calculating this particular data point.
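The arithmetic behind this metric can be sketched in a few lines of Python. This is only an illustration, not NodeStats' actual code: the function name `percent_in_sync` and the sample data are hypothetical, and the only figures taken from the article are the five-second interval and the 720 hourly queries.

```python
SAMPLE_INTERVAL_SECONDS = 5
SAMPLES_PER_HOUR = 3600 // SAMPLE_INTERVAL_SECONDS  # 720 queries per hour


def percent_in_sync(at_tip_flags):
    """Percentage of samples in which the node reported being at the chain tip."""
    if not at_tip_flags:
        raise ValueError("need at least one sample")
    return 100.0 * sum(at_tip_flags) / len(at_tip_flags)


# Hypothetical hour of data: the node misses the tip in one query out of 720.
samples = [True] * (SAMPLES_PER_HOUR - 1) + [False]
print(round(percent_in_sync(samples), 2))
```

A single missed query in an hour already puts the figure close to the 99.8 percent level quoted above, which shows how coarse the metric is when it relies only on the node's own report.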
- Percentage of time on a conflicting chain
This metric represents the percentage of time the Ethereum node follows a chain that conflicts with the one followed by the node it is compared against on the site. To determine this, NodeStats checks whether the two nodes have different block hashes at the same chain height. If they do, they are considered to be on conflicting chains. NodeStats makes this comparison possible by storing all the block hashes it sees in its database.
To date, NodeStats has not identified any instance of the client implementations following conflicting chains. Hence, this metric stands at zero percent, or zero times out of 720 in a one-hour period.
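The hash comparison described above can be sketched as follows. This is an assumption-laden illustration rather than the site's implementation: the dictionaries stand in for the block-hash database, and the function name, heights, and hashes are all made up.

```python
def conflicting_heights(hashes_a, hashes_b):
    """Heights at which both nodes recorded a block but the hashes differ."""
    shared = hashes_a.keys() & hashes_b.keys()
    return sorted(h for h in shared if hashes_a[h] != hashes_b[h])


# Hypothetical records: height -> block hash seen by each node.
geth_hashes = {100: "0xaa", 101: "0xbb", 102: "0xcc"}
parity_hashes = {100: "0xaa", 101: "0xbb", 102: "0xdd", 103: "0xee"}

print(conflicting_heights(geth_hashes, parity_hashes))
```

Only heights recorded by both nodes are compared, so a node that is simply behind (height 103 here) is not counted as being on a conflicting chain.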
- CPU usage
This metric determines the average utilization of the machine’s CPU resources.
Based on the relatively small volume of data collected by NodeStats so far, CPU usage typically varies between 0.01 percent and one percent. It is also worth highlighting that the Geth client implementation uses comparatively less CPU power than Parity.
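The article notes that most NodeStats metrics are reported as a rolling one-hour average. A minimal sketch of such an average over five-second samples is shown below; the class name `RollingAverage` and the readings are hypothetical, and how NodeStats actually computes its averages is not described in the source.

```python
from collections import deque


class RollingAverage:
    """Average of the most recent `window` samples (720 = one hour at 5 s)."""

    def __init__(self, window=720):
        self._values = deque(maxlen=window)
        self._total = 0.0

    def add(self, value):
        # Drop the oldest sample from the running total once the window is full.
        if len(self._values) == self._values.maxlen:
            self._total -= self._values[0]
        self._values.append(value)
        self._total += value
        return self._total / len(self._values)


# Small window so the effect is easy to follow by hand.
cpu = RollingAverage(window=3)
for reading in (1.0, 2.0, 3.0, 4.0):
    latest_average = cpu.add(reading)
print(latest_average)
```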
- Memory usage
This metric measures how much memory is consumed by the Ethereum client. To compute it, NodeStats takes memory readings from the machines every five seconds.
According to the data collected by NodeStats, the nodes use up the vast majority of the available memory, more than 95 percent. Notably, it was also found that the memory demands of Ethereum client implementations are “reasonably stable.”
- Peer count
NodeStats gathers data continually from nodes (every five seconds) to determine the number of network peers.
NodeStats has drawn conclusions about Ethereum’s two largest client implementations. While the Parity client's peer count tends to hover around 450, Geth, on average, has only around eight peers.
Further, it was also found that Geth’s peer count is more volatile than that of Parity, as the former’s peer count appears to occasionally fall to around six.
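One way a monitoring script could obtain a peer count is the standard Ethereum JSON-RPC method `net_peerCount`, which returns the count as a hex string. Whether NodeStats uses this call is an assumption; the sketch below only builds the request body and parses a made-up response, without contacting a node.

```python
import json


def peer_count_request(request_id=1):
    """JSON-RPC 2.0 request body for the standard net_peerCount method."""
    return json.dumps(
        {"jsonrpc": "2.0", "method": "net_peerCount", "params": [], "id": request_id}
    )


def parse_peer_count(response_text):
    """Convert the hex-encoded result (for example "0x8") to an integer."""
    return int(json.loads(response_text)["result"], 16)


# Hypothetical responses matching the rough figures quoted in the article.
geth_response = '{"jsonrpc": "2.0", "id": 1, "result": "0x8"}'
parity_response = '{"jsonrpc": "2.0", "id": 1, "result": "0x1c2"}'
print(parse_peer_count(geth_response), parse_peer_count(parity_response))
```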
- Chain data size
NodeStats also gauges the chain data size of Ethereum clients. In essence, this metric represents the total volume of data used by all the directories dedicated to the client.
However, this particular metric is fundamentally different from all the other aforementioned metrics in that it discloses the absolute value and is not a rolling one hour average.
NodeStats reports that the Ethereum full archive node currently uses 2.36TB of data, while the Parity and Geth clients use around 180GB and 200GB, respectively.
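Summing the size of every file under a client's data directory can be sketched as below. The function name and the demo file names are hypothetical, and the demo runs on a throwaway temporary directory rather than a real client directory.

```python
import os
import tempfile


def directory_size_bytes(path):
    """Total size in bytes of all regular files under `path`, recursively."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):  # skip symlinks to avoid double counting
                total += os.path.getsize(file_path)
    return total


# Demo: a temporary directory with two placeholder files (1024 + 64 bytes).
with tempfile.TemporaryDirectory() as data_dir:
    os.makedirs(os.path.join(data_dir, "chaindata"))
    with open(os.path.join(data_dir, "chaindata", "000001.ldb"), "wb") as f:
        f.write(b"\0" * 1024)
    with open(os.path.join(data_dir, "nodekey"), "wb") as f:
        f.write(b"\0" * 64)
    demo_size = directory_size_bytes(data_dir)
print(demo_size)
```

Unlike the other metrics, a value like this is a point-in-time total, which matches the article's note that chain data size is reported as an absolute figure rather than an hourly average.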
Ethereum Parity Full Node Data Integrity Issues
Although the Ethereum Parity full node runs on a high-end machine with 14GB of RAM and a 10GB/s internet connection, issues with the integrity of its data have recently come to the surface.
According to BitMEX, the Parity full node, at times, has reported that it is in sync, despite being thousands of blocks behind the chain tip.
As can be inferred from the graph below, the "highest block seen on network" figure (shown in blue) is factually incorrect. This figure at times dips in value as time passes, and it has historically lagged behind the actual chain tip (highlighted in green).
In essence, NodeStats acknowledges that its website at times incorrectly reports the node as “in sync.” As for the possible repercussions of such incorrect data, some Ethereum users might consider it a potential danger, since the Parity node has a large number of connections to the network.
The blog post reads in part:
“For example, a user could accept an incoming payment or smart contract execution as verified, while their node claims to be at the network chain tip. However, the client may not really be at the chain tip and an attacker could exploit this to trick the recipient into delivering a good or service. The attacker would need to double spend at a height the vulnerable node wrongly thought was the chain tip, which could have a lower proof of work requirement than the main chain tip. Although successful execution of this attack is highly unlikely and users are not likely to be using the highest seen block feature anyway.”
However, despite the seemingly incorrect metric, NodeStats continues to include it on its website, as the site displays data gathered directly from the Ethereum nodes. The team is open to implementing its own improved metric in the future to fine-tune this data.
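The failure described in this section, a node claiming to be in sync while trailing the real chain tip, can be caught by comparing the node's height against an independent reference. The sketch below is one hypothetical way to do that and is not NodeStats' planned metric: the function names, the tolerance of three blocks, and the block heights are all assumptions.

```python
def reference_tip(heights):
    """Use the highest block height reported by the other monitored nodes as the tip."""
    return max(heights)


def is_falsely_in_sync(reports_in_sync, node_height, tip_height, tolerance_blocks=3):
    """True if the node claims to be in sync but trails the tip by more than the tolerance."""
    lag = tip_height - node_height
    return reports_in_sync and lag > tolerance_blocks


# Hypothetical snapshot: the node says it is in sync but is 2,500 blocks behind.
tip = reference_tip([7_002_500, 7_002_499, 7_002_498])
print(is_falsely_in_sync(True, 7_000_000, tip))
print(is_falsely_in_sync(True, 7_002_499, tip))
```

The point of the design is that the verdict no longer depends on the node's own view of the network, which is exactly the value the article describes as unreliable.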