Infrastructure Monitoring
Real-time metrics, alerting and observability with Grafana-style dashboards
https://grafana.pyco.cloud/d/overview
Infrastructure Overview
Last 6 hours
CPU Usage
67%
3% from avg
Memory
82%
5% from avg
Network I/O
1.2 GB/s
12% from avg
Requests/sec
4,521
8% from avg
System Resources
[Chart: CPU, Memory, Network; y-axis in percent]
API Gateway
99.99% uptime
Avg latency: 45ms
Database
99.95% uptime
Avg latency: 12ms
Cache Layer
98.5% uptime
Avg latency: 8ms
Active Alerts
Alert Rules
3 firing
🔴
High Memory Usage - cache-01
Memory usage exceeded 90% threshold
5m ago
🟡
Elevated Error Rate - api-gateway
Error rate above 1% for 10 minutes
12m ago
🟡
High Latency - database-primary
P99 latency exceeded 500ms
18m ago
🟢
CPU Usage Normalized
CPU usage returned below 80%
1h ago
🟢
Disk Space Cleared
Disk usage returned below 70%
2h ago
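The alert descriptions above combine a threshold with, in one case, a duration ("above 1% for 10 minutes"). A minimal sketch of how such a rule could be evaluated follows. The `ThresholdRule` class and its fields are hypothetical names chosen for illustration, not the dashboard's actual rule engine, and the sample values are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThresholdRule:
    """Hypothetical rule: fire when a metric stays above a threshold."""
    name: str
    threshold: float
    for_seconds: float = 0.0  # how long the breach must persist before firing
    breach_started: Optional[float] = None  # time the current breach began

    def evaluate(self, value: float, now: float) -> bool:
        """Return True if the rule is firing for this sample."""
        if value <= self.threshold:
            self.breach_started = None  # breach ended, reset the timer
            return False
        if self.breach_started is None:
            self.breach_started = now
        return now - self.breach_started >= self.for_seconds


# Thresholds taken from the alert list above; sample values are invented.
memory = ThresholdRule("High Memory Usage - cache-01", threshold=90.0)
errors = ThresholdRule(
    "Elevated Error Rate - api-gateway", threshold=1.0, for_seconds=600
)

memory_firing = memory.evaluate(93.0, now=0)      # no duration, fires at once
errors_at_start = errors.evaluate(1.4, now=0)     # breach has only just begun
errors_after_10m = errors.evaluate(1.4, now=600)  # sustained for 10 minutes
```

Tracking when the breach began, rather than counting samples, keeps the rule independent of the scrape interval; any sample at or below the threshold resets the timer.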
Service Health
API Gateway
99.99% uptime
Latency: 45ms · RPS: 2,450
Auth Service
99.98% uptime
Latency: 28ms · RPS: 890
Database Primary
99.95% uptime
Latency: 12ms · QPS: 5,200
Cache Layer
98.5% uptime
Latency: 8ms · Hit rate: 94%
Message Queue
99.97% uptime
Latency: 5ms · Throughput: 12K/s
Search Service
99.92% uptime
Latency: 85ms · RPS: 320
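Uptime percentages are easier to compare when converted into the downtime they imply. The sketch below does that arithmetic for the services listed above. The 30-day window is an assumption, since the dashboard does not state the period over which uptime is measured, and `downtime_minutes` is a hypothetical helper name.

```python
def downtime_minutes(uptime_percent: float, window_days: float = 30.0) -> float:
    """Minutes of downtime implied by an uptime percentage over a window."""
    window_minutes = window_days * 24 * 60
    return window_minutes * (100.0 - uptime_percent) / 100.0


# Uptime figures copied from the Service Health list above.
services = {
    "API Gateway": 99.99,
    "Auth Service": 99.98,
    "Database Primary": 99.95,
    "Cache Layer": 98.5,
    "Message Queue": 99.97,
    "Search Service": 99.92,
}

for name, uptime in services.items():
    # Assumed 30-day window; pass window_days to change it.
    print(f"{name}: {downtime_minutes(uptime):.1f} min of downtime")
```

Under that assumed window, 99.99% corresponds to roughly 4.3 minutes of downtime, while the Cache Layer's 98.5% corresponds to 648 minutes, or about 10.8 hours.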
Settings
Data Sources
Prometheus
Loki
Jaeger
Notification Channels
Slack #alerts
PagerDuty
Email Team
Retention
Metrics
30 days
Logs
14 days
Traces
7 days
Overview
CPU
67%
Memory
82%
Network
1.2 GB/s
Requests
4,521
Alerts
High Memory - cache-01
Error Rate - api-gateway
High Latency - database
CPU Normalized
Services
API Gateway
99.99% · 45ms
Database
99.95% · 12ms
Cache Layer
98.5% · 8ms
Message Queue
99.97% · 5ms
Settings
Data Sources
Prometheus
Loki
Notifications
Slack, PagerDuty, Email