Better Call Fallback: Designing resilient services
14:30 - 15:10, Wednesday, 24 May 2023 / DEV ARCHITECTURE STAGE
"If builders built houses the way programmers built programs, the first woodpecker to come along would destroy civilization". That woodpecker could be just one stray query to a database without a time limit. Or that tiny API that displays an insignificant icon in the footer of a page. Without which the entire application won't load. And what about those innocent temporary files that quietly accumulate on the disk in hundreds of thousands? What if the woodpecker accidentally strikes twice? Will we charge the card twice? I'm not talking about black swans here. But about woodpeckers, which even if they appear once in a million, with a thousand transactions per second, we will see them on average... every quarter of an hour.
We spend a lot of time implementing the so-called happy path. And sometimes even testing it. But we dedicate very little time to potential errors and to how the application will react to them. I would like to share with you proven techniques for finding and hardening dangerous areas in the code. I will show you a dozen ready-to-use, tested patterns and pieces of advice, including:
* exception handling and retrying
* circuit breaker and bulkheading
* idempotence, deduplication, and the outbox pattern
* dry-run, graceful degradation, and sharding
* and how to test all of this
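To give a flavour of how two of the patterns listed above fit together, here is a minimal sketch that combines retrying with idempotence so that a repeated request does not charge a card twice. Everything in it is hypothetical and exists only for illustration: `PaymentGateway` is an in-memory stand-in rather than a real provider API, and `TransientError`, `charge_with_retry`, and the idempotency key format are assumed names. It is not taken from the talk itself.

```python
import time


class TransientError(Exception):
    """Hypothetical error type for failures that are safe to retry."""


class PaymentGateway:
    """Hypothetical in-memory stand-in for a provider that deduplicates by key."""

    def __init__(self, fail_first_n=0):
        self.processed = {}   # idempotency key -> receipt
        self.charges = []     # every charge that was actually applied
        self._failures_left = fail_first_n

    def charge(self, idempotency_key, card, amount):
        # Deduplication: a key seen before returns the stored receipt
        # instead of charging again.
        if idempotency_key in self.processed:
            return self.processed[idempotency_key]
        self.charges.append((card, amount))
        receipt = {"key": idempotency_key, "amount": amount}
        self.processed[idempotency_key] = receipt
        if self._failures_left > 0:
            # Simulate the response being lost after the charge was applied.
            self._failures_left -= 1
            raise TransientError("response lost")
        return receipt


def charge_with_retry(gateway, idempotency_key, card, amount,
                      attempts=3, base_delay=0.01):
    """Retry transient failures with exponential backoff, reusing the same key."""
    for attempt in range(attempts):
        try:
            return gateway.charge(idempotency_key, card, amount)
        except TransientError:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)


# The first attempt applies the charge but "loses" the response;
# the retry is recognised by its key and does not charge again.
gateway = PaymentGateway(fail_first_n=1)
receipt = charge_with_retry(gateway, "order-42", "card-token", 100)
print(len(gateway.charges), receipt)
```

The design choice worth noting is that the caller generates the key once and reuses it on every attempt; a fresh key per retry would defeat the deduplication.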
The world is not perfect, and our code is not perfect. Let's stop pretending it's otherwise. And let's build systems that can heal themselves instead of constantly waking us up at night with alerts.