Blog
Lessons from building and operating infrastructure at scale.
2026-03-25 · 8 min · observability, victoriametrics, prometheus
Why VictoriaMetrics Over Prometheus
After running both in production across 70+ clusters, here's why I chose VictoriaMetrics as the metrics backend and what trade-offs come with it.
2026-03-18 · 10 min · platform-engineering, python, redis
Building a Health Dashboard for 10K+ Services
How I replaced manual PRTG endpoint management with an automated health monitoring system that processes 12M+ logs daily and serves responses in under 50 ms.
2026-03-10 · 7 min · alerting, automation, sre
Designing a Custom Alert Severity System
One YAML template, three severity levels, automatic inhibition. How I built an alert pipeline that reduced noise and made on-call manageable.