The Psychology of Chaos Engineering

A presentation by Matt Stratton at DevOpsDays Baltimore 2019, April 2019, Baltimore, MD, USA

Slide 1

The Psychology of Chaos Engineering Matty Stratton, PagerDuty @mattstratton

Slide 2

Chaos Engineering is the discipline of experimenting on a system in order to build confidence in the system’s capability to withstand turbulent conditions in production. https://principlesofchaos.org/

Slide 3

What chaos engineering is NOT

Slide 4

Slide 5

It’s not about breaking things

Slide 6

Slide 7

Experimenting in production is preferred

Slide 8

You can’t do this without good measurement

Slide 9

Minimize your blast radius
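
One way to picture a small blast radius is to cap how many hosts an experiment may touch. The sketch below is only an illustration and is not from the talk: the function name `pick_targets`, the host names, and the 5% / 3-host limits are all hypothetical choices.

```python
import random


def pick_targets(hosts, fraction=0.05, max_targets=3, seed=None):
    """Pick a small random subset of hosts to include in an experiment.

    Hypothetical helper: the default limits are illustrative, not recommendations.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    hosts = list(hosts)
    if not hosts:
        return []
    rng = random.Random(seed)
    # Always at least one host, never more than max_targets.
    count = max(1, min(max_targets, int(len(hosts) * fraction)))
    return rng.sample(hosts, count)


# Hypothetical fleet of 40 web hosts.
hosts = [f"web-{i:02d}" for i in range(40)]
targets = pick_targets(hosts, fraction=0.05, max_targets=3, seed=7)
print(targets)
```

Starting with a hard cap like this makes it easier to widen the experiment gradually once the first small runs behave as expected.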

Slide 10

Some helpful tools
• Netflix Simian Army - https://principlesofchaos.org/
• Gremlin - https://www.gremlin.com/
• ChaosToolkit - https://chaostoolkit.org/

Slide 11

But what about the people?

Slide 12

How does it make you feel to know Netflix practices chaos engineering?

Slide 13

What about your bank?

Slide 14

Slide 15

Management can get… nervous

Slide 16

Consider your words

Slide 17

It’s about the philosophy

Slide 18

Slide 19

Safety first

Slide 20

• Know your conditions - when will you shut down the experiment?
• This isn’t about causing stress on your people - be transparent
• There are humans at the other end of those numbers
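
The first point, knowing when you will shut the experiment down, can be written into the experiment itself. The following is a minimal sketch under stated assumptions, not the speaker's method: `run_experiment`, the 5% error-rate threshold, and the simulated readings are all hypothetical, and a real run would read from your monitoring system and wait between checks.

```python
def run_experiment(inject, rollback, read_error_rate, abort_above=0.05, checks=5):
    """Inject a fault, watch an error-rate metric, and stop early if it gets too high.

    Hypothetical helper: the callables and the threshold are supplied by the caller.
    """
    # Do not start if the system is already unhealthy.
    if read_error_rate() > abort_above:
        return "skipped"
    inject()
    try:
        for _ in range(checks):
            if read_error_rate() > abort_above:
                return "aborted"
        return "completed"
    finally:
        # Always undo the fault, whether the run completed or was aborted.
        rollback()


# Simulated metric readings: healthy baseline, then the error rate climbs.
readings = iter([0.01, 0.02, 0.09])
events = []
result = run_experiment(
    inject=lambda: events.append("inject"),
    rollback=lambda: events.append("rollback"),
    read_error_rate=lambda: next(readings),
)
print(result, events)  # aborted ['inject', 'rollback']
```

Agreeing on the abort threshold before the run, and sharing it with everyone involved, also supports the transparency point above.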

Slide 21

Further Reading
• Chaos Engineering Traps - Nora Jones bit.ly/2Pr53ZH
• ChaosCat: Automating Failure Injection at PagerDuty bit.ly/2UCbdXN
• ChaoSlingr: Introducing Security into Chaos Testing bit.ly/2GDZN1V