Blockchain environment health
Helping engineers monitor and track metrics that matter
Kaleido (a Consensys company) is a full-stack blockchain software-as-a-service platform that simplifies and accelerates the enterprise journey to production. Using Kaleido, clients can launch nodes, create multiple test environments, and onboard other clients into a consortium to collaborate and build blockchain solutions.
A common need for DevOps, IT, and blockchain engineers is to track and manage the overall health and operations of the network infrastructure, including nodes and environments. This is why we built a dedicated dashboard to meet these needs.
Problem: How do we identify what qualifies as a “healthy” environment, and how can we bring those data points to users in a way that helps them?
Solution: Work with DevOps and protocol engineers to identify the data points and time frames that matter, then build data visualizations and tools that paint a picture of overall environment health.
Outcome: A dashboard and powerful tool that helps users track incidents and errors, analyze data and trends, anticipate issues, and monitor nodes.
Role: Lead product design, UX, visual design.
Establishing goals
To better understand environment health, I interviewed and worked with protocol and DevOps blockchain engineers to learn which pieces of information they monitor on a daily basis. It became apparent that the solution we created needed to do the following:
Show what is going on across the chain and nodes in an environment: Use progressive disclosure. Lean on interaction patterns to allow users to troubleshoot and find out more.
Optimize for scannability: Use trends to find anomalies. Lean on graphing techniques and color.
Make big wins: See how users interact with the UI to learn which metrics are interesting to them.
Work towards “what can I do next…”: Give users tools to help them troubleshoot errors.
Chain health
The first section we focused on was the chain summary. It gives users information about the blockchain over the past 24 hours, shows the current block height, and contains two panels:
Blocks - provides the transaction count in each of the previous 20 blocks
Transactions - provides the transaction count over time (public or private transactions)
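As a rough illustration of the data behind the Blocks panel, here is a minimal Python sketch. The block shape (a dict with a "number" and a list of "transactions") and the function name chain_summary are assumptions made for this example, not Kaleido's actual API or data model.

```python
from typing import Dict, List


def chain_summary(blocks: List[Dict], panel_size: int = 20) -> Dict:
    """Summarize a chain for a dashboard panel.

    `blocks` is a hypothetical list of dicts, each with a "number" (int)
    and a "transactions" list. Returns the current block height and the
    transaction count for each of the most recent `panel_size` blocks.
    """
    if not blocks:
        return {"block_height": None, "tx_per_block": []}

    # Sort by block number so the newest block is last.
    ordered = sorted(blocks, key=lambda b: b["number"])
    recent = ordered[-panel_size:]

    return {
        "block_height": ordered[-1]["number"],
        "tx_per_block": [(b["number"], len(b["transactions"])) for b in recent],
    }


# Made-up sample data: 25 blocks with 0, 1, or 2 transactions each.
sample = [{"number": n, "transactions": ["tx"] * (n % 3)} for n in range(1, 26)]
summary = chain_summary(sample)
```

With the sample above, summary["tx_per_block"] holds 20 (block number, count) pairs, which is the series a bar chart in the Blocks panel would plot.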
Node health
The second section includes more data points that help show an overall picture of all the nodes in an environment. We also allowed users to isolate specific nodes to diagnose issues with better clarity. This view helps engineers scan the data and decide how best to troubleshoot issues. The graphs focus on the following data points:
CPU Utilization
Memory Utilization
Peer Count - This represents how aware each node is of the others in the network.
Disk Utilization - Total disk usage from data generated by the node (blockchain data and logs)
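To make the four data points concrete, the sketch below shows one way a single node's readings could be turned into warnings an engineer can scan. Everything here is an assumption for illustration: the NodeSample fields, the 90% limit, and the idea that a healthy node should see a fixed expected number of peers are hypothetical and are not taken from the Kaleido dashboard.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeSample:
    """One hypothetical reading of a node's health metrics."""
    name: str
    cpu_pct: float      # CPU utilization, 0-100
    memory_pct: float   # memory utilization, 0-100
    disk_pct: float     # disk utilization, 0-100
    peer_count: int     # number of peers the node currently sees


def flag_node(sample: NodeSample, expected_peers: int,
              limit_pct: float = 90.0) -> List[str]:
    """Return human-readable warnings for one node.

    `limit_pct` and `expected_peers` are assumed thresholds chosen
    for this example only.
    """
    warnings = []
    metrics = (
        ("cpu", sample.cpu_pct),
        ("memory", sample.memory_pct),
        ("disk", sample.disk_pct),
    )
    for label, value in metrics:
        if value >= limit_pct:
            warnings.append(f"{label} at {value:.0f}%")
    # A node that sees fewer peers than expected may be partitioned.
    if sample.peer_count < expected_peers:
        warnings.append(f"peers {sample.peer_count}/{expected_peers}")
    return warnings


busy = NodeSample("node-a", cpu_pct=95.0, memory_pct=40.0, disk_pct=10.0, peer_count=2)
healthy = NodeSample("node-b", cpu_pct=20.0, memory_pct=35.0, disk_pct=15.0, peer_count=3)
busy_warnings = flag_node(busy, expected_peers=3)
healthy_warnings = flag_node(healthy, expected_peers=3)
```

Keeping the thresholds as parameters reflects the goal of letting users isolate a node and judge it in context, rather than hard-coding a single definition of "healthy".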