Engineering craft
SRE
Keep the service alive.
- SLOs and Error Budgets (200 XP): What 99.9% actually costs you.
- Observability (200 XP): Metrics, logs, and traces, and when each one wins.
- Incident Response (200 XP): Declare, stabilize, communicate, roll back.
- On-Call (200 XP): Alert hygiene, runbooks, and sustainable rotations.
- Capacity Planning (250 XP): Knowing what breaks first, before it does.
- SLI Design (250 XP): Picking measures that catch what hurts customers.
- Percentiles & Distributions (200 XP): Why averages lie and what the tail tells you.
- Queueing Theory Basics (200 XP): Why running hot makes everything worse.
- Reading Flame Graphs (200 XP): Where did all the time actually go?
- Backpressure & Load Shedding (250 XP): How systems stay alive under stress.