Broadly speaking, we monitor two kinds of infrastructure: Global Points of Presence (PoPs) and App Gateways. For details on the architecture, please refer to this article: https://support.safous.com/kb/general-architecture.
Method Overview
- Web test: Scenario-based monitoring in which an agent performs a set of predefined HTTP requests. It can also be used to test the functionality of a web application. The collected data, which is sent back to the monitoring host, can be used to set up monitoring alerts.
- Active agent: With a passive agent, the monitoring host requests monitoring data from the agent each time the host reaches its update interval. An active agent works the other way around: the agent sends monitoring data to the monitoring host each time the agent reaches its update interval. The data can then be used directly to set up threshold-based alerts, without pre-processing.
- Internal check: In the simplest terms, this is a self-check. The monitoring host asks the agent to collect monitoring data about itself, which is then sent back to the monitoring host. As with an active agent, the data is ready to use.
- HTTP agent: An agent is set up solely to answer requests from an authorized monitoring host with an HTML page whose content is the monitoring data. The returned page must be pre-processed before the monitoring data can be used on the host.
- External check: The monitoring host asks an agent to execute a script that supports custom parameters. The agent simply sends the script's result back to the monitoring host. As with an active agent, the data is ready to use.
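To make the web test method more concrete, the following is a minimal sketch of a scenario runner, not the implementation used in production. The names `Step`, `http_status`, and `run_web_test` are hypothetical and chosen only for illustration. The sketch assumes a scenario is an ordered list of GET requests, each with an expected status code.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple
import urllib.error
import urllib.request


@dataclass
class Step:
    """One request in a scenario (hypothetical structure)."""
    name: str
    url: str
    expected_status: int = 200


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code of a GET request, or 0 if the host is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses are raised as HTTPError; the code is still useful.
        return exc.code
    except (urllib.error.URLError, OSError):
        return 0


def run_web_test(
    steps: List[Step],
    fetch: Callable[[str], int] = http_status,
) -> Tuple[bool, Optional[str]]:
    """Run the steps in order and stop at the first unexpected status code.

    Returns (True, None) if every step passed, otherwise (False, <failed step name>).
    """
    for step in steps:
        if fetch(step.url) != step.expected_status:
            return False, step.name
    return True, None
```

The `fetch` parameter is injectable so the scenario logic can be exercised without network access. The returned pair corresponds to the kind of data a monitoring host could use to raise a "Response code is not 200" alert.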
On the service side, Global PoPs are monitored using an agent-based system. The monitored items include:
Regional Availability
Item | Description | Interval | Threshold | Method |
Affinity check | Monitor PoP availability | 1m | Response code is not 200 | Web test |
System Resources
Item | Description | Interval | Threshold | Method |
Load average | Monitor current edge load avg (1m, 5m, 15m) | 1m | > 1.5 CPU load avg | Active agent |
Memory usage | Monitor currently available edge memory in percentage | 1m | > 90% | Active agent |
Disk usage | Monitor currently used edge disk space in percentage | 1m | > 90% | Active agent |
Inodes usage | Monitor currently available edge disk inodes | 1m | < 10 inodes | Active agent |
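As an illustration of how the thresholds in the table above could be evaluated against a single sample, here is a minimal sketch. The metric keys (`load_average`, `memory_usage_pct`, `disk_usage_pct`, `inodes`) and the `breached` function are hypothetical names. The comparison operators and limits are taken from the Threshold column as written.

```python
import operator
from typing import Callable, Dict, List, Tuple

# (comparison, limit) per metric, mirroring the Threshold column.
# Key names are assumptions made for this example.
THRESHOLDS: Dict[str, Tuple[Callable[[float, float], bool], float]] = {
    "load_average": (operator.gt, 1.5),      # > 1.5 CPU load avg
    "memory_usage_pct": (operator.gt, 90.0), # > 90%
    "disk_usage_pct": (operator.gt, 90.0),   # > 90%
    "inodes": (operator.lt, 10),             # < 10 inodes
}


def breached(sample: Dict[str, float]) -> List[str]:
    """Return the sorted names of metrics in `sample` that cross their threshold.

    Metrics missing from the sample are skipped rather than treated as breaches.
    """
    alerts = []
    for name, (compare, limit) in THRESHOLDS.items():
        if name in sample and compare(sample[name], limit):
            alerts.append(name)
    return sorted(alerts)
```

For example, a sample with a load average of 2.0 and disk usage of 95% would report both `disk_usage_pct` and `load_average`, while memory at 50% and 1000 free inodes would not trigger anything.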
Edge Properties
Item | Description | Interval | Threshold | Method |
Agent availability | Monitor agent availability | 1m | Response timeout = 0 | Internal check |
Edge NATS | Monitor availability of edge NATS function | 1m | Status code > 0 | Active agent |
Edge router | Monitor availability of edge router function | 1m | Status code > 0 | Active agent |
Edge watchtower | Monitor availability of edge watchtower function | 1m | Status code > 0 | Active agent |
Meanwhile, on the customer side, App Gateways are monitored through agentless HTTP requests. The monitored items include:
System Resources
Item | Description | Interval | Threshold | Method |
CPU number | Total CPU on host | 1m | - | HTTP agent |
Free disk | Free space left on disk | 1m | 75% | HTTP agent |
Free memory | Free memory on host | 1m | - | HTTP agent |
Host health check | Monitor host availability | 1m | No health check data | HTTP agent |
App Gateway Properties
Item | Description | Interval | Threshold | Method |
SSL Cert. Validity | Check SSL certificate expiration | 1d | < 1d | External check |
App Gateway Num. Check | Number of active App Gateway | 1m | > #ordered | HTTP agent |
SSL Cert. Expiration (7d) | Check SSL certificate expiration in 7 days | 1d | < 7d | External check |
App Gateway Health Check | Monitor App Gateway availability | 1m | Response code is not 200 | Web test |
Login Health Check | Monitor login page availability | 1m | Response code is not 200 | Web test |
User Num. Check | Number of active users | 6h | > #ordered | HTTP agent |
License Expiration | Check license status in 90 days | 12h | < 90d | HTTP agent |
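The two SSL certificate items above differ only in their threshold (less than 1 day versus less than 7 days remaining). The following is a rough sketch of how a script could decide which of them fires, assuming the certificate's `notAfter` value is already available as a string in the format used by Python's `ssl` module. The function names are hypothetical, and this is not the script used by the external check itself.

```python
import ssl
from typing import List

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after: str, now: float) -> float:
    """Days between `now` (epoch seconds) and the certificate's notAfter time.

    `not_after` uses the format "Jan  5 09:34:43 2018 GMT".
    """
    return (ssl.cert_time_to_seconds(not_after) - now) / SECONDS_PER_DAY


def fired_cert_alerts(not_after: str, now: float) -> List[str]:
    """Return the names of the certificate items whose threshold is crossed."""
    days = days_until_expiry(not_after, now)
    alerts = []
    if days < 1:
        alerts.append("SSL Cert. Validity")         # threshold: < 1d
    if days < 7:
        alerts.append("SSL Cert. Expiration (7d)")  # threshold: < 7d
    return alerts
```

In a live check, `not_after` could be read from the `notAfter` key of the dictionary returned by `ssl.SSLSocket.getpeercert()`, and `now` from `time.time()`. Passing `now` explicitly keeps the logic deterministic and easy to verify.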
Because a large amount of data is collected from many points around the world, we also set up monitoring proxies in different regions. This minimizes latency and allows the collected data to be pre-processed with greater flexibility.
The Safous internal team receives alerts for both PoPs and App Gateways, while the customer PIC only receives alerts for their own App Gateway, sent to the registered PIC tenant email address.