
Testing the cluster with live traffic

My reconfigured Banana Pi cluster has been running for a few weeks, and I have moved several sites to it:

None of these sites gets a lot of traffic, so most of the traffic served by the cluster is for this site, banoffeepiserver.com.

The cluster has been getting a trickle of traffic from Google, but not enough to really test it properly.  At the time of writing, it only serves about 20 hits an hour at peak time. I've done some testing with Siege, but I want to see how the cluster copes with real traffic.

I posted a link to my site in the sysadmin section on reddit.com to generate a surge in traffic. Traffic started ramping up quickly, and peaked at 479 hits per hour.  There were 2000 page views on Sunday evening, and 4507 in the following 24 hours.  
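To put those figures in perspective, the arithmetic can be sketched in a few lines of Python. The numbers are the ones quoted above; the function name is only for illustration, and the sketch treats hits and page views as the same unit, which is an assumption.

```python
def per_second(hits, hours):
    """Convert a hit count over a number of hours into hits per second."""
    return hits / (hours * 3600)

# Figures quoted in the post.
peak_rate = per_second(479, 1)        # busiest hour
average_rate = per_second(4507, 24)   # the 24 hours after Sunday evening
average_per_hour = 4507 / 24

print(f"peak: {peak_rate:.3f} hits/s")
print(f"24-hour average: {average_per_hour:.0f} hits/hour ({average_rate:.3f} hits/s)")
```

Even the busiest hour works out to roughly one hit every 7.5 seconds, which is why this counts as a gentle first test.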

This isn't a lot of traffic (my Raspberry Pi site handles this many hits on a daily basis), but it's best to start small so that I can spot issues and fix them before testing with larger amounts of traffic.

These Ganglia screenshots show the cluster's performance four hours after I posted the link on Reddit.  I posted the link at about 5.15pm on Sunday the 4th of January, and Ganglia shows an increase in traffic at about this time.  These graphs show aggregate statistics for the entire server farm:

ARM server farm statistics

These graphs show statistics for the database cluster and the web server cluster:

Information for the database cluster and the web server cluster

This screenshot shows statistics for the entire cluster during the 24-hour period after traffic started coming in:

Cluster information for 24 hours

And the statistics for the database cluster and web server cluster over 24 hours:

web server cluster statistics over 24 hours

There's an increase in the amount of traffic being served by the web server cluster, but the CPUs appear to be under very little load.  The database cluster wasn't really affected at all. All pages on each site are cached on the web servers, and the CMS only generates pages dynamically when someone requests a page that doesn't exist and gets a 404 response.  
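The cache-first behaviour described here can be sketched roughly as follows. This is not the CMS's real code: the directory layout, the `serve` function and the `cms_render` stand-in are all hypothetical, and the sketch only illustrates why cached pages never touch the database.

```python
import tempfile
from pathlib import Path

def serve(url_path, cache_dir, generate):
    """Return (source, body): the cached file if present, otherwise a generated page."""
    cached = Path(cache_dir) / url_path.strip("/") / "index.html"
    if cached.is_file():
        # Cache hit: served straight from disk, no CMS or database work.
        return "cache", cached.read_text()
    # Cache miss: only now is the (hypothetical) CMS asked to build a page.
    return "cms", generate(url_path)

def cms_render(url_path):
    # Stand-in for the CMS; here it only ever produces a 404 page.
    return f"404: {url_path} not found"

with tempfile.TemporaryDirectory() as cache_dir:
    page = Path(cache_dir) / "about" / "index.html"
    page.parent.mkdir(parents=True)
    page.write_text("<h1>About</h1>")
    hit = serve("/about/", cache_dir, cms_render)
    miss = serve("/missing/", cache_dir, cms_render)

print(hit)
print(miss)
```

With every real page already in the cache, only requests for missing pages reach the dynamic path, which fits the flat database graphs.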

According to Google Analytics, the average page load time was 6.99 seconds on Sunday and 4.30 seconds on Monday. These measurements include the time taken to render pages in a browser and download advertisements and images, not just the time taken to download pages from the cluster.

I use UptimeRobot.com to monitor my servers' performance from the outside.  It gets the head section of my site's home page every 5 minutes and shows a graph of response times. This graph stayed pretty flat:

Response times measured externally with UptimeRobot.com
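A rough idea of what an external check like this does can be sketched in Python. This assumes the check is an HTTP HEAD request; the function name and timeout are illustrative, and it is not UptimeRobot's actual code.

```python
import time
import urllib.request

def time_head(url, timeout=10.0):
    """Send one HTTP HEAD request and return (status_code, seconds_elapsed)."""
    request = urllib.request.Request(url, method="HEAD")
    start = time.perf_counter()
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        status = response.status
    return status, time.perf_counter() - start
```

Calling `time_head("http://banoffeepiserver.com/")` every 300 seconds and plotting the elapsed times would give a similar, if much cruder, response-time graph.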

This was not enough traffic to really test the cluster thoroughly, but there weren't any major problems.  The increased network traffic was visible in Ganglia, and the cluster handled the load the way I expected it would.
