Testing applications for speed, stability, and scalability becomes more important every day. Businesses are realizing that performance is one of the most critical features, because once it fails, none of the other components are accessible. In one of our recent assignments, I was tasked with validating the following performance requirement: "Check whether 25 K users are able to log in to the streaming portal and attend the streaming event during a 10-minute time slot. Response times must be within industry standards."
Initially, I thought that the network and the load injectors would be some of the biggest challenges in this project. After completing the assignment, I learned that neither was a problem at all.
Our load testing approach
Documentation of NFRs
Creation of Performance Testing Strategy
Evaluation of load injection tools
Integration of Performance Monitoring
Execution of load tests
Tuning
Re-Tests
25 K users, fully authenticated, video streaming. What could be the best load testing tool for these requirements? Dealing with such a high number of users would be easy if SaaS-based load testing were an option. I was hoping the customer would open their firewalls so we could inject load from a cloud-based environment. For various reasons, this was declined, and I had to continue my research into other powerful load testing solutions.
Open source load testing tools for high request volumes
The preferred option for my client was to use open source load testing tools. Gatling and JMeter were the first tools that came to mind. We conducted initial benchmark tests and simulated 100 video streaming events using JMeter and Gatling on a single machine. These tests held a big surprise: memory and CPU utilization was 70 % lower in our load tests with Gatling compared to JMeter. It was very clear that Gatling would be the tool of choice for this 25 K user load testing project.
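For context, a resource comparison like this comes from sampling CPU and memory of each tool's process while it runs the same scenario. A minimal sketch of such a sampler — the `sleep` process stands in for the actual `gatling.sh` or `jmeter` run, and the file name and interval are illustrative:

```shell
#!/bin/sh
# Sample CPU and resident memory of a load generator process until it exits.
sleep 3 & PID=$!                       # placeholder for gatling.sh / jmeter
while kill -0 "$PID" 2>/dev/null; do
  # one line per sample: CPU percentage and resident memory (KB)
  ps -o %cpu=,rss= -p "$PID"
  sleep 1
done > benchmark-samples.txt
```

Averaging the samples per tool gives a like-for-like comparison, provided both runs execute the same scenario with the same user count.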
We ordered 14 Linux servers, installed Java and Gatling, and enabled key-based authentication. Gatling does not provide controller-based test execution out of the box, and starting a 25 K user load test on 14 servers by hand is too much manual effort. Therefore, we implemented a shell script to
1. upload the Gatling configuration, scripts, and data files,
2. start the test,
3. collect real-time insights, and
4. collect the test results.
Thanks to this controller-based test execution, we were able to upload the load testing scripts and data to a single server, start the load test, watch real-time insights, and collect the results in a fully automated fashion.
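A minimal sketch of what such a controller script can look like. Host names, paths, and the simulation class below are placeholders, not our project's real values, and `run` is left in dry-run mode so the commands are only printed (`-nr` and `-ro` are Gatling's "no reports" and "reports only" flags):

```shell
#!/bin/sh
HOSTS="injector01 injector02"              # the real setup used 14 servers
GATLING_HOME=/opt/gatling
SIMULATION=StreamingEventSimulation
run() { echo "$@"; }                       # dry-run; change body to '"$@"' to execute

# 1. upload configuration, scripts, and data files to every injector
for h in $HOSTS; do
  run scp -r conf/ user-files/ "$h:$GATLING_HOME/"
done

# 2. start the test on all injectors in parallel
for h in $HOSTS; do
  run ssh "$h" "$GATLING_HOME/bin/gatling.sh -nr -s $SIMULATION" &
done
wait

# 3./4. pull results back from each injector, then build one merged report
for h in $HOSTS; do
  run scp -r "$h:$GATLING_HOME/results/" "results/$h/"
done
run "$GATLING_HOME/bin/gatling.sh" -ro results
```

With key-based SSH authentication in place, the whole fan-out needs no interactive input.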
100 % of all identified issues were related to code and configuration
Within about two months, we completed this project. Neither the network nor the load injectors were ever a bottleneck. During this performance testing project, we identified hotspots in the code and the configuration of the streaming portal. The bad news for those still hoping that sizing improves performance: zero issues were fixed by adding more hardware.
Toolbox
Gatling for load injection of internal streaming events
Dynatrace for AI-powered performance monitoring
LoadView for full browser-based testing of external streaming events
Lessons learned
Run benchmark tests before you select your tool of choice
Developing a controller-based load testing approach is energy well invested
Increase ulimit to 10 K and -Xmx to 5 GB to simulate 2,000 users per Linux server
Speed up script creation time by involving developers
Performance monitoring is a must-have
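The ulimit/-Xmx lesson above boils down to two settings per injector before launching Gatling. A sketch, assuming your Gatling launch script honors the JAVA_OPTS environment variable:

```shell
#!/bin/sh
# Each virtual user holds at least one open socket, so raise the open-file
# limit (may need root or a limits.conf entry on some distributions).
ulimit -n 10000 2>/dev/null || true
# Give the Gatling JVM a 5 GB heap.
export JAVA_OPTS="-Xmx5g"
# "$GATLING_HOME/bin/gatling.sh" -s StreamingEventSimulation   # then launch as usual
```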
Keep doing the great work. Happy Performance Engineering!
Hi Josef. Thank you for writing and sharing this blog. I am currently on a new project using Gatling for the first time, and I am also impressed by what it can do. The one item you shared is the lack of real-time monitoring when using Gatling. I have used InfluxDB with Grafana for JMeter and was wondering whether anyone out there has successfully added this combination with Gatling?
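A pointer for this question: Gatling ships a Graphite data writer that can stream live metrics to an InfluxDB instance with the Graphite listener enabled, which Grafana can then chart. An untested configuration sketch for gatling.conf (Gatling 3.x; host, port, and write period are illustrative):

```
gatling {
  data {
    writers = [console, file, graphite]
    graphite {
      host = "localhost"   # InfluxDB host with the Graphite input enabled
      port = 2003
      writePeriod = 1      # seconds between metric flushes
    }
  }
}
```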