Monitoring Node Health
Logs are emitted to stdout
as JSON formed strings. Where these logs are available on your system depends on how you launch your containers. On systemd
systems for example, the logs will be sent to the system journal daemon journald
. Other systems may log to /var/log
.
Flow nodes produce health metrics in the form of Prometheus metrics, exposed from the node software on /metrics
.
If you wish to make use of these metrics, you'll need to set up a Prometheus server to scrape your Nodes. Alternatively, you can deploy the Prometheus Server on top of your current Flow node to see the metrics without creating an additional server.
Copy the following Prometheus configuration into your current flow node
prometheus.ymlglobal: scrape_interval: 15s # By default, scrape targets every 15 seconds. scrape_configs: # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config. - job_name: 'prometheus' # Override the global default and scrape targets from this job every 5 seconds. scrape_interval: 5s static_configs: - targets: ['localhost:8080']
Start Prometheus server
docker run \
--network=host \
-p 9090:9090 \
-v /path/to/prometheus.yml:/etc/prometheus/prometheus.yml \
prom/prometheus"
(optional) Port forward to the node if you are not able to access port 9090 directly via the browser
ssh -L 9090:127.0.0.1:9090 YOUR_NODE
Open your browser and go to the URL
http://localhost:9090/graph
to load the Prometheus Dashboard
Key Metric Overview
The following are some important metrics produced by the node
Metric Name | Description |
---|---|
go_* | Go runtime metrics |
consensus_finalized_blocks | Number of blocks this node has finalized |
consensus_hotstuff_busy_duration | How long Hotstuff spends in a busy state before returning to idle |
consensus_hostuff_busy_second_total | The total time Hotstuff has spent in a busy state |
consensus_hotstuff_view_of_newest_known_qc | The latest block known to Hotstuff |