Destigmatising Mistakes: A Game Launch Incident Review

Making a mistake can be a horrible feeling. Guilt, shame, fear and anxiety all rolled into one. So, to try to reduce this pressure, I’m sharing a recent mistake our team made.

Category:

Incident Response

Time:

6 minute read

It's Just a Monitoring Change

Have you ever had a seemingly innocuous change to one system affect another in a catastrophic way? If yes, you might notice a few familiar themes in this write-up. If no, then read it now, before it’s too late.

Category:

Incident Response

Time:

11 minute read

Zero-Downtime Kubernetes Deployments

When migrating services to shiny new cloud-native infrastructure, special care must be taken to ensure that releases that were zero-downtime continue to be so. When said service is the login system for your entire customer-facing product offering, a little extra effort is probably needed.

Category:

Operations

Time:

10 minute read

Visualising Complex Systems

I recently gave a Tech Talk called “Beyond Dashboards - Visualising Complex Systems”.

Author:

Andy Burgin

Category:

Community

Time:

1 minute read

"What's the worst that could happen?": A worked example of how we deal with live incidents.

This post outlines roughly how we make changes, and what we should do when those changes go bad. It uses an incident that actually occurred as an example of how we should deal with these incidents, and how we did in that specific case.

Author:

Craig Stewart

Category:

Devops

Time:

10 minute read

Rising from the Ashes

We’ve always enjoyed running incident response drills, but they were becoming stale. This post covers how we addressed the problems with our fire drills and iterated on them.

Category:

Operations

Time:

8 minute read

performance.now() Conference 2019.

Key takeaways from the 2019 performance.now() conference in Amsterdam.

Author:

Paul Whitehead

Category:

Conferences

Time:

15 minute read

Chaos + Resilience Community Day 2019

Our resident chaos monkey ols went to London for the first Chaos and Resilience Community Day held in Europe.

Category:

Conferences

Time:

16 minute read

Automated Prize Draws with Kafka

A look at how we improve on our existing manual processes and remove customer pain using technology.

Author:

Dave Chaston

Time:

11 minute read

Never give up!

The importance of mental wellbeing and overcoming adversity

Author:

Rami Alnawas

Category:

Community

Time:

2 minute read