I assume many of you have faced outages or slowdowns of your favorite services in the last few weeks.
Our digitized world won't work without the internet, electricity, and payment capabilities. If these services fail, customers suffer, and businesses lose billions in revenue and damage their reputation.
Amazon: loses 1% of sales for every 100 ms of slowdown
Bank in Singapore: almost 700 million in additional risk capital
E-commerce company: 6 million
Imagine how valuable system performance optimization is for such businesses. If investing just 1% of their damage costs had prevented these incidents, they would have kept the other 99% and saved hundreds of millions of dollars.
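The arithmetic behind this claim can be sketched in a few lines. This is only a back-of-the-envelope illustration: it assumes, hypothetically, that spending 1% of the damage cost would have fully prevented the incident, and the function name and the 700 million input are chosen purely for illustration.

```python
def performance_investment_savings(damage_cost: float, investment_percent: int = 1) -> dict:
    """Estimate savings if a small investment had fully prevented an outage.

    Assumption (hypothetical): the investment avoids the entire damage cost.
    """
    investment = damage_cost * investment_percent / 100
    net_savings = damage_cost - investment
    return {
        "investment": investment,
        "net_savings": net_savings,
        # Share of the damage cost the business keeps after paying for prevention.
        "share_of_damage_kept": net_savings / damage_cost,
        # Net savings expressed as a multiple of the money invested.
        "return_multiple": net_savings / investment,
    }


result = performance_investment_savings(700_000_000)
print(result)
```

Under this assumption the business keeps 99% of the avoided damage, which corresponds to a net return of 99 times the amount invested.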
Banking services were not available for almost one day
In 2021, a bank in Singapore was already required to hold almost 700 million in additional risk capital after a system interruption, and the renewed failure in 2023 will probably be similarly expensive.
Here is the report:
https://www.marketscreener.com/news/latest/Singapore-s-cenbank-says-digital-services-outage-at-DBS-unacceptable--43363754/
Stable and high-performance banking systems are becoming increasingly important. I would not be surprised if the supervisory authorities carried out stricter controls in this area and enforced stress tests that cover IT systems as well as financial resilience.
It's great to see that financial regulators take performance and reliability issues seriously now.
Ticketmaster Crash
A few months ago, the famous Ticketmaster website crashed and could not sell Taylor Swift's Eras Tour tickets. Customers were very frustrated, and the businesses involved lost millions in revenue.
A closer look behind the scenes uncovered that the high demand for Taylor Swift's tickets pushed the system to its limits until the site was no longer responsive.
The lessons learned that day are:
We must create performance requirements.
We must revisit our performance requirements regularly.
We must stress test our platform frequently.
How to mitigate such risks
The performance and reliability of our mission-critical services are no longer nice to have. Instead, they must be treated like our most valuable resources. They require careful design, validation, and ongoing care.
I recommend the following mitigations.
Practices: performance requirements, risk-based performance testing, performance quality gates, shift left, continuous performance, shift right
Tools: load injection, application performance monitoring, tracing, prediction, AI-powered problem detection and remediation, alerting, knowledge management
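To make the idea of a performance quality gate more concrete, here is a minimal sketch of one that could run after a load or stress test. The thresholds (500 ms for the 95th percentile, 1% error rate) and the function names are hypothetical examples of performance requirements, not values taken from any of the incidents above.

```python
def percentile(samples: list[float], pct: int) -> float:
    """Return the nearest-rank percentile (pct is an integer from 1 to 100)."""
    if not samples:
        raise ValueError("no samples collected")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def passes_quality_gate(
    latencies_ms: list[float],
    failed_requests: int,
    max_p95_ms: float = 500.0,     # hypothetical response-time requirement
    max_error_rate: float = 0.01,  # hypothetical error-rate requirement
) -> bool:
    """Check load test results against the performance requirements."""
    p95 = percentile(latencies_ms, 95)
    error_rate = failed_requests / len(latencies_ms)
    return p95 <= max_p95_ms and error_rate <= max_error_rate


# Example: 100 requests, five of them slow, none failed.
healthy_run = [100.0] * 95 + [900.0] * 5
print(passes_quality_gate(healthy_run, failed_requests=0))

# Example: the same load, but ten slow requests push the p95 over the limit.
degraded_run = [100.0] * 90 + [900.0] * 10
print(passes_quality_gate(degraded_run, failed_requests=0))
```

A check like this can fail a build or block a release whenever a requirement is violated, which is one way to combine the shift left and continuous performance practices in a delivery pipeline.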
I am happy to help you make performance and reliability a fundamental part of your value stream. From my perspective, every organization should have access to the latest knowledge about IT reliability, performance, and observability. This is why we've developed Gobenchmark, our unique platform to guide everyone interested in fast and reliable business applications.
Keep up the great work! Happy Performance Engineering!