Fixups
parent 521eb692ad
commit 32279b7780
@@ -115,10 +115,28 @@ Please mail [dn42@burble.com](mailto:dn42@burble.com) for further details.

## Network Status and Reporting

### Hosted Grafana Service

[http://grafana.burble.dn42](http://grafana.burble.dn42) dn42 link
[https://grafana.burble.com](https://grafana.burble.com) public internet link

The hosted grafana service has its own page [here](/home/grafana-service).

### DN42 Infrastructure Monitoring

burble.dn42 hosts monitoring and alerting of key DN42 services, see the
[hosted grafana service](/home/grafana-service) for more details.

### burble.dn42 status

[dn42.status.burble.com](https://dn42.status.burble.com/)

Each node in the network is monitored by [UptimeRobot](https://uptimerobot.com/) with alerts if a node becomes unavailable.
Each node in the network is monitored by [UptimeRobot](https://uptimerobot.com/) with alerts
if a node becomes unavailable.

Internally, nodes are measured by [netdata](https://github.com/netdata/netdata) which provides a real time view of each node. [prometheus](https://prometheus.io/) is then used to collect and store that data for historical reporting. [grafana](https://grafana.com/) is used for visualisation, but this is not currently a public service.
Internally, nodes are measured by [netdata](https://github.com/netdata/netdata) which provides
a real time view of each node. [prometheus](https://prometheus.io/) is then used to collect and
store that data for historical reporting. [grafana](https://grafana.com/) is used for
visualisation. Some public graphs are available on the [hosted grafana service](/home/grafana-service).

Syslogs are exported in real time to a central logging node on the internal network.
@@ -1,17 +1,23 @@
---
title: Hosted Grafana
visible: true
---

Details of the burble.dn42 hosted Grafana service.

===

# Hosted Grafana Service
## Hosted Grafana Service

|Host / URL|Service|
|:--|:--|
|[http://grafana.burble.dn42](http://grafana.burble.dn42)|Grafana Dashboards (dn42 link)|
|[https://grafana.burble.com](https://grafana.burble.com)|Grafana Dashboards (public internet link)|
|[http://grafana.burble.dn42/](http://grafana.burble.dn42/)|Grafana Dashboards (dn42 link)|
|[https://grafana.burble.com/](https://grafana.burble.com/)|Grafana Dashboards (public internet link)|
|influx.burble.dn42:8086|InfluxDB Endpoint|


The hosted grafana service provides an [InfluxDB](https://www.influxdata.com/) and
[Grafana](https://grafana.com/) combination for storing and displaying stats and metrics.
[Grafana](https://grafana.com/) combination for storing and displaying stats and metrics.
The service can accept metrics from any source that is able to
[publish](https://docs.influxdata.com/influxdb/v1.7/supported_protocols/) to the InfluxDB, including
[Prometheus](https://prometheus.io/) and
@@ -27,9 +33,9 @@ The grafana service is hosted on dn42-fr-rbx1.burble.dn42. Service users are enc
directly with the service node in order to lower latencies and avoid sending large amounts of
data through other nodes in DN42.
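
As a rough sketch of what publishing to the InfluxDB endpoint could look like, the Python below formats a single metric in InfluxDB line protocol and builds a write URL for the InfluxDB 1.x HTTP API. The database name, measurement, tag and field names are hypothetical, plain HTTP is an assumption, and any credentials the service requires are not shown.

```python
from urllib.parse import urlencode

INFLUX_HOST = "influx.burble.dn42:8086"  # InfluxDB endpoint listed for the service
DATABASE = "example_db"                  # hypothetical database name


def _escape(value: str, chars: str) -> str:
    """Backslash-escape the characters that line protocol treats as separators."""
    for ch in chars:
        value = value.replace(ch, "\\" + ch)
    return value


def to_line(measurement: str, tags: dict, fields: dict, timestamp_ns: int) -> str:
    """Format one point as InfluxDB line protocol (float fields only)."""
    name = _escape(measurement, ", ")
    tag_part = "".join(
        f",{_escape(k, ',= ')}={_escape(v, ',= ')}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(
        f"{_escape(k, ',= ')}={float(v)}" for k, v in sorted(fields.items())
    )
    return f"{name}{tag_part} {field_part} {timestamp_ns}"


def write_url(database: str = DATABASE) -> str:
    """Build the InfluxDB 1.x /write URL; the line is sent as the POST body."""
    return f"http://{INFLUX_HOST}/write?" + urlencode({"db": database, "precision": "ns"})


if __name__ == "__main__":
    # Hypothetical measurement: round-trip time to a node, in milliseconds.
    print(to_line("ping_rtt", {"node": "example node"}, {"ms": 12.5}, 1000))
    print(write_url())
```

The sketch only builds the request; sending it is a single HTTP POST of the line to the printed URL, using whatever HTTP client is to hand.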

# DN42 Infrastructure Monitoring
## DN42 Infrastructure Monitoring

The burble.dn42 network provides monitoring and alerting of key DN42 infrastructure.
The burble.dn42 network hosts monitoring and alerting of key DN42 infrastructure.
The monitoring service logs metrics to the hosted grafana service, and presents alerts to
the #dn42-bots channel and slack. Two monitoring nodes hosted in separate regions ensure that
alerts will be generated if the main monitoring node fails.
@@ -40,7 +46,7 @@ The monitoring architecture is detailed below:

#### Nodes

The main monitoring node is hosted on dn42-de-fra1, with a secondary backup node on dn42-us-nyc1.
The main monitoring node is hosted on dn42-de-fra1, with a secondary backup node on dn42-us-nyc1.
Both nodes monitor the availability of services on each other and are capable of alerting if the
peer node is unavailable.

@@ -51,8 +57,8 @@ Metrics collected by the service are presented as public graphs in the burble.dn

#### Alerting

AlertManager is configured as a cluster, operating across both monitoring nodes. Alerts are
published in real time to the #dn42-bots hackint IRC channel (using
AlertManager is configured as a cluster, operating across both monitoring nodes.
Alerts are published in real time to the #dn42-bots hackint IRC channel (using
[alertmanager-irc-relay](https://github.com/google/alertmanager-irc-relay)) and
burble.dn42/dn42-alerts channel in slack.

@@ -61,7 +67,8 @@ Alerts typically fire when a problem occurs for 5 minutes or longer.
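
The "5 minutes or longer" behaviour can be modelled roughly as a timer that starts on the first failed probe and is reset by any success. The Python below is a simplified illustration of that idea only; the class and method names are hypothetical and it is not the service's actual alerting configuration.

```python
from dataclasses import dataclass
from typing import Optional

HOLD_SECONDS = 5 * 60  # "5 minutes or longer", taken from the text


@dataclass
class PendingAlert:
    """Fires only once a problem has persisted for hold_seconds."""

    hold_seconds: float = HOLD_SECONDS
    failing_since: Optional[float] = None

    def observe(self, now: float, healthy: bool) -> bool:
        """Record one probe result; return True while the alert is firing."""
        if healthy:
            self.failing_since = None  # any success resets the timer
            return False
        if self.failing_since is None:
            self.failing_since = now   # first failure starts the timer
        return now - self.failing_since >= self.hold_seconds


if __name__ == "__main__":
    alert = PendingAlert()
    # Probe once a minute; the target is down throughout.
    for minute in range(7):
        print(minute, alert.observe(minute * 60, healthy=False))
```

Holding an alert back like this trades a few minutes of detection delay for fewer notifications about short-lived blips.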
#### Collection and Storage

Prometheus is used to collect metrics from the various probes and publish them to the hosted Influx
database. Typically metrics are collected every minute, although this is reduced to every five minutes
database.
Typically metrics are collected every minute, although this is reduced to every five minutes
for the clearnet DN42 services to avoid excessive load.

The main node for data collection is monitor.de-fra1.burble.dn42
@@ -72,4 +79,4 @@ The main node for data collection is monitor.de-fra1.burble.dn42
|:--|:--|
|[blackbox_exporter](https://github.com/prometheus/blackbox_exporter)|Used to ping hosts or query services (e.g. HTTP/s probes)|
|[netdata](https://github.com/netdata/netdata)|Used to collect many host system metrics|
|[dn42promsrv](https://git.burble.com/burble.dn42/dn42promsrv)|Custom scripts for DN42 specific probdes|
|[dn42promsrv](https://git.burble.com/burble.dn42/dn42promsrv)|Custom collector for DN42 specific probes|