25,000-User Streaming Portal Stress Test Powered by Open Source

Josef Mayrhofer

Updated: Jun 25, 2023

Testing applications for speed, stability, and scalability has become more important these days. Businesses are realizing that performance is one of the most important features, because once it fails, all other components become inaccessible. In one of our recent assignments, I was tasked with validating the following performance requirement: "Check if 25 K users are able to log in to the streaming portal and attend the streaming event during a 10-minute time slot. Response times must be within industry standards".
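To get a feeling for what this requirement means for the load injectors, it helps to translate it into an arrival rate. The short sketch below assumes that the 25 K logins are spread evenly over the 10-minute slot, which is a simplification of mine and not part of the requirement; a real event would likely see a steeper peak near the start.

```python
def arrival_rate(total_users: int, window_seconds: int) -> float:
    """Average logins per second, assuming logins are spread evenly over the window."""
    return total_users / window_seconds


users = 25_000        # from the requirement
window = 10 * 60      # 10-minute time slot, in seconds
rate = arrival_rate(users, window)
print(f"{rate:.1f} logins per second on average")
```

Under that assumption the portal has to accept roughly 42 new logins every second for ten minutes, while all earlier users keep streaming.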


Initially, I thought that the network and the load injectors would be among the biggest challenges in this project. After completing the assignment, I learned that neither was a problem at all.


Our load testing approach

  1. Documentation of NFRs

  2. Creation of Performance Testing Strategy

  3. Evaluation of load injection tools

  4. Integration of Performance Monitoring

  5. Execution of load tests

  6. Tuning

  7. Re-Tests


25 K users, fully authenticated, video streaming. What could be the best load testing tool for these requirements? Dealing with such a high number of users is very easy if SaaS-based load testing is an option. I was hoping that the customer would open their firewalls so that we could simulate load from a cloud-based load injection environment. For various reasons, this was declined, and I had to continue my research for other powerful load testing solutions.

Open source load testing tools for high request volumes


The preferred option for my client was to use open source load testing tools. Gatling and JMeter were the first tools that came to mind. We conducted initial benchmark tests and simulated 100 video streaming events using JMeter and Gatling on a single machine. These tests resulted in big surprises: memory and CPU utilization were 70 % lower in our load tests with Gatling compared to JMeter. It was very clear that Gatling would be the tool of choice for this 25 K user load testing project.


We ordered 14 Linux servers, installed Java and Gatling, and enabled key-based authentication. Gatling does not provide controller-based test execution out of the box, and starting a 25 K user load test on 14 servers by hand is too much manual effort. Therefore, we implemented a shell script to

1. upload the Gatling configuration, scripts, and data files,

2. start the test,

3. collect real-time insights, and

4. collect the test results.


Thanks to this controller-based test execution, we were able to upload the load testing scripts and data to a single server, start the load test, watch real-time insights, and collect the results in a fully automated fashion.
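Our controller was a shell script, which is not reproduced here. The following Python sketch only illustrates the same four steps and is not the script we used. The host names, the install directory, the simulation class name, and the local folder names are all hypothetical. The `-nr` (no reports) and `-s` (simulation) options are those of the Gatling bundle launcher as I know them and may differ between Gatling versions, so please check them against your installation.

```python
import shlex
import subprocess
from typing import List

# Hypothetical values for illustration; none of them come from the project.
HOSTS = [f"injector{i:02d}.example.internal" for i in range(1, 15)]  # 14 servers
REMOTE_DIR = "/opt/gatling"                 # assumed install directory
SIMULATION = "portal.StreamingSimulation"   # assumed simulation class name


def upload_cmd(host: str, local_dir: str) -> List[str]:
    """Step 1: copy configuration, scripts, and data files to one injector."""
    return ["rsync", "-az", f"{local_dir}/", f"{host}:{REMOTE_DIR}/user-files/"]


def start_cmd(host: str) -> List[str]:
    """Step 2: start Gatling in the background; -nr skips the per-node report."""
    remote = (
        f"cd {REMOTE_DIR} && "
        f"nohup bin/gatling.sh -nr -s {SIMULATION} > run.log 2>&1 < /dev/null &"
    )
    return ["ssh", host, remote]


def tail_cmd(host: str, lines: int = 5) -> List[str]:
    """Step 3: peek at the console output Gatling writes while it runs."""
    return ["ssh", host, f"tail -n {lines} {REMOTE_DIR}/run.log"]


def collect_cmd(host: str, local_results: str) -> List[str]:
    """Step 4: pull the results folder back, one subfolder per injector."""
    return ["rsync", "-az", f"{host}:{REMOTE_DIR}/results/", f"{local_results}/{host}/"]


def plan(local_dir: str, local_results: str) -> List[List[str]]:
    """Return every command in step order: all uploads, then all starts, and so on."""
    steps = [
        lambda h: upload_cmd(h, local_dir),
        start_cmd,
        tail_cmd,
        lambda h: collect_cmd(h, local_results),
    ]
    return [step(host) for step in steps for host in HOSTS]


def execute(commands: List[List[str]]) -> None:
    """Run the commands one by one; stops at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in plan("./load-test", "./results"):
        print(shlex.join(cmd))
```

The sketch deliberately leaves out waiting for the test to finish before collecting results, and it relies on the key-based authentication mentioned above so that `ssh` and `rsync` do not prompt for passwords.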


100 % of all identified issues were related to code and configuration


We completed this project within about two months. Neither the network nor the load injectors turned out to be a bottleneck. During this performance testing project, we identified hotspots in the code and the configuration of the streaming portal. The bad news for those who still hope that sizing helps to improve performance: not a single issue was fixed by adding more hardware.


Toolbox


Gatling for load injection of internal streaming events

Dynatrace for AI-powered performance monitoring

LoadView for full browser-based testing of external streaming events


Lessons learned

  • Run benchmark tests before you select your tool of choice

  • Developing a controller-based load testing approach is energy well invested

  • Increase ulimit to 10 K and the JVM maximum heap size (-Xmx) to 5 GB to simulate 2,000 users per Linux server

  • Speed up script creation time by involving developers

  • Performance monitoring is a must-have
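The sizing lesson above can be checked with a few lines of arithmetic. The sketch below uses the figures from this post (25 K users, 2,000 users per server, 14 servers). The limit check assumes that "ulimit" refers to the open-files limit (`ulimit -n`), which is my reading and not stated above, and it only runs on Unix-like systems.

```python
import math
import resource  # Unix only

TARGET_USERS = 25_000      # from the requirement
USERS_PER_SERVER = 2_000   # from the lessons learned
SERVERS_ORDERED = 14       # from the project setup


def servers_needed(target_users: int, users_per_server: int) -> int:
    """Smallest number of injectors that can carry the target load."""
    return math.ceil(target_users / users_per_server)


def open_file_limit_ok(minimum: int = 10_000) -> bool:
    """Assumption: 'ulimit to 10 K' means the soft open-files limit (ulimit -n)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft == resource.RLIM_INFINITY or soft >= minimum


needed = servers_needed(TARGET_USERS, USERS_PER_SERVER)
headroom = SERVERS_ORDERED * USERS_PER_SERVER - TARGET_USERS
print(f"servers needed: {needed}, ordered: {SERVERS_ORDERED}")
print(f"spare capacity: {headroom} users")
print(f"open-files limit of at least 10 K on this machine: {open_file_limit_ok()}")
```

With these numbers, 13 servers would be the minimum, so the 14 servers leave room for about 3,000 additional users or the loss of one injector. The heap size itself is a JVM setting (`-Xmx5g`) and has to be passed to the Java process that runs Gatling; how that is done depends on your Gatling version and launcher.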

Keep doing the great work. Happy Performance Engineering!



1 Comment


lewanpd
Jul 12, 2020

Hi Josef. Thank you for writing and sharing this blog. I am currently on a new project using Gatling for the first time and I am also impressed by what it can do. The one item you shared is the lack of real-time monitoring when using Gatling. I have used InfluxDB with Grafana for JMeter and was wondering if anyone out there has successfully added this combination with Gatling?
