
Set up Ganglia on multiple clusters

Ganglia can be used to monitor groups of clusters.  Ganglia-monitor needs to be installed and configured on each node that's being monitored.  The Ganglia meta daemon (gmetad) runs on a master node and connects to the monitor processes to collect system information.  A group of clusters monitored by a single Ganglia server is called a grid.  

Ganglia grid overview

I recently built a database cluster and a file storage cluster, so I installed Ganglia-monitor on each node in these clusters.  I used two more Banana Pis as control nodes in a cluster of their own.

Set up the Ganglia server

I installed the following packages on the Ganglia master node:

sudo apt-get install apache2 php5 libapache2-mod-php5 php5-json 
sudo apt-get install ganglia-webfrontend gmetad rrdtool ganglia-monitor

The next step is to edit /etc/ganglia/gmetad.conf:

sudo nano /etc/ganglia/gmetad.conf

Ganglia uses different ports to distinguish between clusters.  I'm using port 8650 for the database cluster, port 8655 for the small cluster of control nodes, and port 8656 for the file storage cluster.  I've set up three data_source lines, each of which specifies a cluster name, a polling interval in seconds, and a list of hosts in that cluster:

data_source "cmd_cluster" 60 192.168.0.8:8655 192.168.0.9:8655
data_source "db_cluster" 60 192.168.0.35:8650 192.168.0.36:8650 192.168.0.37:8650 192.168.0.38:8650
data_source "g_cluster" 60 192.168.0.30:8656 192.168.0.31:8656 192.168.0.32:8656 192.168.0.33:8656
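With several clusters on different ports, it can help to print the cluster-to-endpoint mapping back out of the file and compare it with what each node is actually configured to use.  The following is a minimal sketch rather than part of Ganglia itself: list_data_sources is a hypothetical helper name, and it assumes cluster names contain no spaces.

```shell
# Print "cluster host:port" for every active data_source line in a
# gmetad.conf-style file. Hypothetical helper, not a Ganglia tool.
# Assumes cluster names contain no spaces.
list_data_sources() {
  awk '$1 == "data_source" {
    name = $2
    gsub(/"/, "", name)                 # strip the quotes around the name
    # The polling interval is optional; if field 3 is a plain number,
    # the host list starts at field 4, otherwise at field 3.
    start = ($3 ~ /^[0-9]+$/) ? 4 : 3
    for (i = start; i <= NF; i++) print name, $i
  }' "$1"
}

# Usage:
#   list_data_sources /etc/ganglia/gmetad.conf
```

Commented-out data_source lines are skipped because their first field is not exactly data_source.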

I uncommented the gridname directive and set it to "ARM_Farm":

gridname "ARM_Farm"

I also edited the list of trusted servers to include each IP address:

trusted_hosts 127.0.0.1 192.168.0.9  192.168.0.30 192.168.0.31 192.168.0.32 192.168.0.33 192.168.0.35 192.168.0.36 192.168.0.37 192.168.0.38

Next I copied the Ganglia Apache configuration file to Apache's config directories and enabled it:

sudo cp /etc/ganglia-webfrontend/apache.conf /etc/apache2/sites-available/ganglia.conf
sudo a2ensite ganglia.conf

Restart Apache, the monitor, and server processes:

sudo service ganglia-monitor restart ; sudo service gmetad restart ; sudo service apache2 restart

Set up the monitors

Install ganglia-monitor on each client:

sudo apt-get update
sudo apt-get upgrade
sudo apt-get install ganglia-monitor

Edit /etc/ganglia/gmond.conf:

sudo nano /etc/ganglia/gmond.conf

Enter the name of the cluster:

cluster {
  name = "cmd_cluster"
  owner = "unspecified"
  latlong = "unspecified"
  url = "unspecified"
}

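Set the port number in the udp_send_channel, udp_recv_channel, and tcp_accept_channel sections to the port assigned to this node's cluster in gmetad.conf.  The example below uses port 8650, which is the port I assigned to db_cluster: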

/* Feel free to specify as many udp_send_channels as you like.  Gmond
   used to only support having a single channel */
udp_send_channel {
  mcast_join = 239.2.11.71
  port = 8650
  ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  mcast_join = 239.2.11.71
  port = 8650
  bind = 239.2.11.71
}

/* You can specify as many tcp_accept_channels as you like to share
   an xml description of the state of the cluster */
tcp_accept_channel {
  port = 8650
}
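Editing three port lines by hand on every node gets tedious, so the change can be scripted.  This is a rough sketch under a few assumptions: set_gmond_port is a hypothetical helper name, and it rewrites every uncommented "port = N" line it sees, which is only safe if the channel sections are the only places where your gmond.conf sets a port.

```shell
# Rewrite every uncommented "port = N" line to use the given port.
# Hypothetical helper: reads gmond.conf text on stdin and writes the
# edited text to stdout, so the original file is never modified in place.
set_gmond_port() {
  sed -E "s/^([[:space:]]*port[[:space:]]*=[[:space:]]*)[0-9]+/\1$1/"
}

# Usage: write the result to a scratch file, review it, then copy it back.
#   set_gmond_port 8655 < /etc/ganglia/gmond.conf > /tmp/gmond.conf
#   sudo cp /tmp/gmond.conf /etc/ganglia/gmond.conf
```

Writing to a scratch file first makes it easy to diff the result against the original before restarting the monitor.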

Restart the ganglia-monitor process:

sudo service ganglia-monitor restart
Stopping Ganglia Monitor Daemon: gmond.
Starting Ganglia Monitor Daemon: gmond.
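To check that a monitor is answering before looking at the web UI, you can connect to its TCP port: gmond sends an XML description of the cluster state to any client that connects to the tcp_accept_channel port.  The sketch below pulls the host names out of that XML.  list_gmond_hosts is a hypothetical helper name, and the usage line assumes netcat (nc) is installed and that you run it from a machine allowed to connect.

```shell
# Print the NAME attribute of every <HOST> element in gmond's XML dump.
# Hypothetical helper: reads the XML on stdin.
list_gmond_hosts() {
  grep -o '<HOST NAME="[^"]*"' | cut -d'"' -f2
}

# Usage (assumes nc is installed; -w 5 gives up after 5 seconds):
#   nc -w 5 192.168.0.35 8650 | list_gmond_hosts
```

If a node is missing from the list, its send channel is probably pointing at a different port or address than the rest of its cluster.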

Unicast and multicast communication

There are some situations where UDP multicasting won't work.  The most common reason is that some Ethernet switches don't support it.  Unicasting can be used instead.  Comment out the lines containing the mcast_join address, then set host in the send channel and bind in the recv channel to the server's IP address:

/* Feel free to specify as many udp_send_channels as you like.  Gmond
   used to only support having a single channel */
udp_send_channel {
  host = 192.168.0.8
#  mcast_join = 239.2.11.71
  port = 8655
  ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
#  mcast_join = 239.2.11.71
  port = 8655
  bind = 192.168.0.8
}

/* You can specify as many tcp_accept_channels as you like to share
   an xml description of the state of the cluster */
tcp_accept_channel {
  port = 8655
}
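After switching a node to unicast, it is worth confirming that no active mcast_join line is left behind and that host, bind, and port all carry the values you expect.  The sketch below lists only the uncommented channel settings with their line numbers.  show_channel_settings is a hypothetical helper name, and it only recognises lines commented out with a leading #, not settings hidden inside /* */ blocks.

```shell
# List the active (not #-commented) channel settings in a gmond.conf,
# prefixed with their line numbers. Hypothetical helper for review only.
show_channel_settings() {
  grep -nE '^[[:space:]]*(host|mcast_join|bind|port|ttl)[[:space:]]*=' "$1"
}

# Usage:
#   show_channel_settings /etc/ganglia/gmond.conf
```

On a node that has been converted correctly, the output should contain no mcast_join lines at all.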

You may have to wait a minute or two before all nodes show up in the UI.  When you visit Ganglia's URL in your browser (http://<your Banana Pi's IP address>/ganglia/), you should see the Ganglia grid overview shown in the screenshot above.

Note that you can use the links in the top left-hand corner of the page to move between different views and select individual nodes.

Ganglia node overview graphs

Detailed information is available about each node's resources.  These graphs show a node's CPU usage:

These graphs show information about a node's network utilization:

Network usage information
