
Set up GlusterFS on two nodes

What is GlusterFS?

GlusterFS is a distributed file system which can be used to build volumes that span several hosts. It's used in a variety of large-scale cloud and web hosting applications.

A GlusterFS volume is a virtual disk which can be read and written across a network. GlusterFS can be used to build high-performance storage clusters that hold large volumes of data.

The data in GlusterFS volumes is divided into bricks, where each brick is a portion of a physical drive that's used to store volume data.  In practice a brick is just a directory whose contents are shared over the network.

Bricks can be replicated for improved fault tolerance.  If one of the disks in a GlusterFS node dies, the data in that drive's brick would normally be lost, but if the cluster has been configured for replication, a copy of that brick is held on another node.  The broken disk just needs to be replaced, and GlusterFS will automatically rebuild the lost brick from its replica.

Bricks can be distributed across more than one server to improve performance.  This means that individual files may be written to different servers, even if those files are in the same directory.  If a volume contains a lot of files, spreading them across several servers helps to balance the load when they are accessed.

GlusterFS also supports striping, where parts of a single file are written to different disks.  Striping is usually only recommended for very large files.
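To make these layouts concrete, here are example create commands for each type.  The server names and brick paths below are hypothetical, and the real commands used in this guide appear later:

sudo gluster volume create distvol server1:/bricks/b1 server2:/bricks/b1
sudo gluster volume create repvol replica 2 server1:/bricks/b2 server2:/bricks/b2
sudo gluster volume create stripevol stripe 2 server1:/bricks/b3 server2:/bricks/b3

The first command distributes files across the two bricks, the second mirrors them, and the third splits each file across both bricks.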

Setting up two GlusterFS nodes

I installed Raspbian on three Banana Pis - one client, and two nodes for a simple storage cluster.

GlusterFS on Banana Pi servers

The client Banana Pi gets its IP address from the local DHCP server, and the Gluster nodes have static IP addresses - 192.168.0.30 and 192.168.0.31.
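If you need to configure the static addresses yourself, a minimal sketch of /etc/network/interfaces for the first node looks like this (the /24 netmask and the 192.168.0.1 gateway are assumptions; adjust them to match your network):

auto eth0
iface eth0 inet static
    address 192.168.0.30
    netmask 255.255.255.0
    gateway 192.168.0.1

Use 192.168.0.31 on the second node.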

Connect to each node using ssh in different terminal windows:

ssh bananapi@192.168.0.30

And in another terminal:

ssh bananapi@192.168.0.31

I ran the bpi-config program on both Gluster nodes to expand the SD card's partition and change some basic settings like host names and passwords:

sudo bpi-config

Next, I opened /etc/resolv.conf in nano, and entered DNS server IP addresses:

sudo nano /etc/resolv.conf

You can use Google's DNS servers for convenience:

nameserver 8.8.8.8
nameserver 8.8.4.4

Update Linux and install glusterfs-server:

sudo apt-get update && sudo apt-get upgrade -y
sudo apt-get install glusterfs-server -y

Open /etc/hosts in a text editor on each node so that you can add the host names of the nodes:

sudo nano /etc/hosts

In the hosts file on the first node, add these entries:

127.0.1.1  glus1.storage glus1
192.168.0.31  glus2.storage glus2

Add these entries to the hosts file on the second node:

127.0.1.1  glus2.storage glus2
192.168.0.30  glus1.storage glus1

You also need to open the hosts file on the client, and add both names:

192.168.0.30  glus1.storage glus1
192.168.0.31  glus2.storage glus2

On both nodes, you need to create directories to store the volume's contents:

sudo mkdir --parents /srv/brick/glusv0

Create a GlusterFS trusted storage pool

In the ssh session for the first node, run this command to tell Gluster to form a server pool with the other node:

sudo gluster peer probe glus2
Probe successful

sudo gluster peer status
Number of Peers: 1

Hostname: glus2
Uuid: 186a69a4-423a-49bb-9061-b2e695328ffb
State: Peer in Cluster (Connected)

On the second node, use the gluster peer probe command to probe the first node:

sudo gluster peer probe glus1

sudo gluster peer status
Number of Peers: 1

Hostname: glus1
Uuid: 0d6dcbcb-e277-444e-b2df-3187330160e3
State: Peer in Cluster (Connected)

Create the volume

You can run the next commands on either node.  First you need to create the volume:

sudo gluster volume create glusv0 replica 2 glus1:/srv/brick/glusv0 glus2:/srv/brick/glusv0
Creation of volume glusv0 has been successful. Please start the volume to access data.

This tells Gluster to create a volume called glusv0 with two bricks, where the contents of the brick on glus1 are replicated on glus2.  Start the volume:

sudo gluster volume start glusv0
Starting volume glusv0 has been successful

You can get information about the volume with this command:

sudo gluster volume info 

Volume Name: glusv0
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: glus1:/srv/brick/glusv0
Brick2: glus2:/srv/brick/glusv0

Mount the volume locally

In order to check that everything is working, mount the volume from the command line of the first node:

sudo mount -t glusterfs glus1:/glusv0 /mnt

Now create a file in the volume.  Note that sudo tee is used so that the file is written with root privileges; a plain 'sudo echo' with a redirect would fail, because the shell performs the redirection as the unprivileged user:

echo "Hello World" | sudo tee /mnt/hello.txt

If you also mount the volume on the second node, the file should be accessible:

sudo mount -t glusterfs glus2:/glusv0 /mnt
cat /mnt/hello.txt 
Hello World
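
As an extra check (not part of the original walkthrough), you can look inside the brick directory on each node; because the volume is replicated, hello.txt should appear in the brick on both glus1 and glus2:

ls /srv/brick/glusv0

Don't write to the brick directories directly, though; always go through the mounted volume.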

Access the volume from the client

I set up another Banana Pi running Raspbian.  Again I used bpi-config to set the hostname and password:

sudo bpi-config

Update Linux and install glusterfs-client:

sudo apt-get update && sudo apt-get upgrade -y

sudo apt-get install glusterfs-client -y

You may need to restart the client Pi.  Create a mount point, then mount the volume from the client with these commands:

sudo mkdir /gluster
sudo mount -t glusterfs glus1:/glusv0 /gluster -o backupvolfile-server=glus2

The -o option is used to pass the backupvolfile-server parameter, which sets an alternate host name to use for accessing the volume in case the first server goes down.  If glus1 is unavailable, the client will automatically mount the volume via glus2.

In theory, adding the following line to the client's fstab file should make the client mount the GlusterFS share at boot:

glus1:/glusv0 /gluster glusterfs defaults,_netdev 0 0

This didn't work because the GlusterFS client wasn't running when the fstab file was processed. Instead, I opened root's crontab file so that I could add a command to mount the share at reboot. This command opens the crontab file:

sudo crontab -u root -e

Add this line, and press control-o and return to save changes, and control-x to quit from nano:

@reboot sleep 10;mount -t glusterfs glus1:/glusv0 /gluster -o backupvolfile-server=glus2

This will execute two commands when the client boots up: the first is just a 10 second delay to allow the GlusterFS daemon to start, and the second mounts the volume.

You may need to make your Pi wait longer before running mount: if the volume isn't mounted after a reboot, try 'sleep 15' instead.  This isn't an ideal way to fix the problem, but it's fine for most uses.
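
If a fixed delay turns out to be unreliable, another option (my own sketch, not part of the original setup) is to retry the mount a few times instead of sleeping once:

@reboot for i in 1 2 3 4 5 6; do mount -t glusterfs glus1:/glusv0 /gluster -o backupvolfile-server=glus2 && break; sleep 5; done

This tries the mount up to six times, five seconds apart, and stops as soon as it succeeds.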

Change the owner of the volume so that it can be accessed by user bananapi:

sudo chown bananapi -R /gluster/

Now your Banana Pi powered storage cluster should be up and running.

Setting up GlusterFS on four Banana Pi servers

In this post I'm going to set up a GlusterFS volume on four Banana Pi servers, each with its own SATA hard disk.  The volume will use one brick per disk, arranged as two replicated pairs: files will be distributed across the two pairs, and each pair will keep a copy of its data on both of its disks.

4 node Banana Pi GlusterFS volume

I'm using a fifth Banana Pi as a client which I will use to access the volume.  The client has a mouse, keyboard and monitor attached to it.  I'm using ssh to connect from the client to the nodes in the cluster.  All five Banana Pi boards are connected to an ethernet switch.  

I'm using four 1TB hard disks.  Because the volume is replicated, the 4TB of raw disk space gives an actual capacity of 2TB, and file system overheads reduce that to around 1.7TB of space available to store data.  The hard disks are powered with a PC power supply.

I ran bpi-config on all nodes to set basic parameters.  I gave the nodes in the cluster host names glus1 to glus4, and I set static IP addresses ranging from 192.168.0.30 to 192.168.0.33.

Set up the client node

On the client, install updates and install the GlusterFS client:

sudo apt-get update && sudo apt-get upgrade -y
sudo apt-get install glusterfs-client -y

Open the hosts file in a text editor:

sudo nano /etc/hosts

Enter these host names:

192.168.0.30   glus1
192.168.0.31   glus2
192.168.0.32   glus3
192.168.0.33   glus4

Set up each server node

Log into each node from the client:

ssh bananapi@glus1

Install Linux updates and the GlusterFS server: 

sudo apt-get update && sudo apt-get upgrade -y
sudo apt-get install glusterfs-server -y

Enter the host names.  On each server, use the loopback address for that server's host name. The hosts file for glus2 looks like this:

192.168.0.30   glus1
127.0.0.1      glus2
192.168.0.32   glus3
192.168.0.33   glus4

Each SATA drive needs to be partitioned and formatted.  Use fdisk to partition the SATA disk:

sudo fdisk /dev/sda
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel with disk identifier 0x7a48f006.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): p

Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders, total 1953525168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x7a48f006

   Device Boot      Start         End      Blocks   Id  System

Command (m for help): n
Partition type:
   p   primary (0 primary, 0 extended, 4 free)
   e   extended
Select (default p): p
Partition number (1-4, default 1): 1
First sector (2048-1953525167, default 2048): 
Using default value 2048
Last sector, +sectors or +size{K,M,G} (2048-1953525167, default 1953525167): 
Using default value 1953525167

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
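
If you'd rather script this step than answer fdisk's prompts, parted can create the same single partition non-interactively.  This is a sketch that assumes the disk really is /dev/sda and that it's safe to erase:

sudo parted -s /dev/sda mklabel msdos
sudo parted -s /dev/sda mkpart primary ext4 2048s 100%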

Create an ext4 filesystem in the partition:

sudo mkfs -t ext4 /dev/sda1

Create a directory where the drive will be mounted:

sudo mkdir /srv/store/

Edit fstab to make sure the new drive is mounted when the server boots up:

sudo nano /etc/fstab

Add a line for the new disk:

proc            /proc           proc    defaults          0       0
/dev/mmcblk0p1  /boot           vfat    defaults          0       2
/dev/mmcblk0p2  /               ext4    defaults,noatime  0       1
/dev/sda1       /srv/store      ext4    defaults,noatime  0       0

I used the noatime option to reduce the amount of disk access.  Reboot and make sure the new drive is mounted using the df command:

df
Filesystem     1K-blocks    Used Available Use% Mounted on
rootfs           7628360 2968404   4315408  41% /
/dev/root        7628360 2968404   4315408  41% /
devtmpfs          447604       0    447604   0% /dev
tmpfs              89544     256     89288   1% /run
tmpfs               5120       0      5120   0% /run/lock
tmpfs             179080       0    179080   0% /run/shm
/dev/mmcblk0p1     57288   14592     42696  26% /boot
/dev/sda1      961433632  204436 912391120   1% /srv/store

Set up the other three servers the same way. 
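
If you'd rather not repeat the formatting and mounting steps by hand, a rough sketch like this can be run from the client once each node's disk has been partitioned.  It assumes every node sees its disk as /dev/sda and that you log in as bananapi (you'll be prompted for each password):

for node in glus2 glus3 glus4; do
  ssh -t bananapi@$node "sudo mkfs -t ext4 /dev/sda1 && sudo mkdir -p /srv/store && echo '/dev/sda1  /srv/store  ext4  defaults,noatime  0  0' | sudo tee -a /etc/fstab && sudo mount /srv/store"
done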

Set up GlusterFS

At this point, the servers and disks are ready for GlusterFS to be set up.  Log back into glus1, and probe the other peers to set up a trusted pool of Gluster servers:

sudo gluster peer probe glus2
sudo gluster peer probe glus3
sudo gluster peer probe glus4

Check that the peers are connected:

bananapi@glus1 ~ $ sudo gluster peer status
Number of Peers: 3

Hostname: glus2
Uuid: 186a69a4-423a-49bb-9061-b2e695328ffb
State: Peer in Cluster (Connected)

Hostname: glus3
Uuid: 95dc286e-affa-4431-89eb-76babfa5185e
State: Peer in Cluster (Connected)

Hostname: glus4
Uuid: 682b38db-d80a-42fb-a58b-86d31a05f5f8
State: Peer in Cluster (Connected)
bananapi@glus1 ~ $ 

Create a distributed, replicated volume called vol0 with four bricks.  With replica 2, Gluster pairs the bricks in the order they are listed, so glus1 and glus2 form one replicated pair, glus3 and glus4 form the other, and files are distributed across the two pairs:

sudo gluster volume create vol0 replica 2 glus1:/srv/store/vol0 glus2:/srv/store/vol0 glus3:/srv/store/vol0 glus4:/srv/store/vol0
Creation of volume vol0 has been successful. Please start the volume to access data.

Start the volume:

sudo gluster volume start vol0
Starting volume vol0 has been successful

Check the volume's status:

sudo gluster volume info 
Volume Name: vol0
Type: Distributed-Replicate
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: glus1:/srv/store/vol0
Brick2: glus2:/srv/store/vol0
Brick3: glus3:/srv/store/vol0
Brick4: glus4:/srv/store/vol0

Set up the client

The GlusterFS volume should be running now, so you can mount the volume to check that it is accessible on the client:

sudo mount -t glusterfs glus1:/vol0 /mnt

Use the df command to make sure the volume has been mounted:

df -h
Filesystem      Size  Used Avail Use% Mounted on
rootfs          7.3G  2.9G  4.1G  42% /
/dev/root       7.3G  2.9G  4.1G  42% /
devtmpfs        438M     0  438M   0% /dev
tmpfs            88M  284K   88M   1% /run
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs           175M     0  175M   0% /run/shm
/dev/mmcblk0p1   56M   15M   42M  26% /boot
glus1:/vol0     1.8T  400M  1.7T   1% /mnt

Create a directory where this volume will be mounted when the Pi boots up:

sudo mkdir /srv/data

Next I need to configure the client to mount the volume every time it boots.  This would normally be done by adding a line to fstab, but the last time I set up Gluster I found that adding a line to the client's fstab file didn't work: when the Pi boots up, the GlusterFS client isn't available until after the fstab file has been processed, so the client node failed to mount the volume.

A simple workaround is to put a mount command in root's crontab file.  Use this command to open the file for editing:

sudo crontab -u root -e
no crontab for root - using an empty one
crontab: installing new crontab

Add this line:

@reboot sleep 10;mount -t glusterfs glus1:/vol0 /srv/data -o backupvolfile-server=glus2

I used the -o option to pass the backupvolfile-server parameter to the GlusterFS client.  This means that if there's a problem with glus1, the client will fail over to glus2.

Quit from the text editor.  Change the owner of the mounted volume to make it accessible to user bananapi:

sudo chown bananapi /srv/data

Reboot the client node.  Now it should be possible to write a file to the volume on /srv/data, and read it back again:

echo "Hello World" > /srv/data/hello.txt
cat /srv/data/hello.txt
Hello World
