Skip to navigation

DB Cluster Utils

Sometimes it is necessary to do maintenance tasks on a cluster of database servers. Servers may breakdown, disks fail from time to time, and servers need to be added as the load on a cluster grows.

Typing commands in the MySQL command prompt on each server is a cumbersome way to deal with these kinds of tasks.  I'd rather have a Banana Pi set up as a control node where I can use simple commands to manage the entire cluster.

I recently wrote a simple script to simplify setting up a cluster of databases.  I decided to extend this script into a more complete program that I can use to do the following:

  • add new servers to the cluster,
  • demote the master,
  • promote a slave
  • move the database to a new server,
  • start and stop replication,
  • remove a server.

You can download DB Cluster Utils from Github at https://github.com/Pyplate/db-cluster-utils.

If anything goes wrong with my database cluster, I'll use this tool to fix it.  This script also provides a Python API which will probably be useful at some point in the future.

DB Cluster Utils is intended to work with MySQL 5.5.  In MySQL 5.6, Global Transaction IDentifiers (GTIDs) are used to synchronize servers, so there's no need to keep track of the binary logs on each server.

I don't recommend using this tool with databases that contain critical data.  This program was written mainly for demonstration purposes.  If you see any errors or omissions in it, feel free to comment below.

Install

I've set up a cluster of four Banana Pi servers, and a fifth Pi to use as a control node. Each one is running Raspbian and is set up as described on this page.

All nodes have static IP addresses.  The IP address of my control node is 192.168.0.8, and the IP addresses of the server nodes range from 192.168.0.35 to 192.168.0.38.

I logged into each server using ssh and logged into the MySQL shell so that I could run this query:

GRANT ALL ON *.* TO 'root'@'192.168.0.8' IDENTIFIED BY 'rootpassword';

This allows the root user to access each server from the control node.  On the control node, install the MySQL client and the Python bindings:

sudo apt-get install python-mysqldb mysql-client -y

Next install DB Cluster Utils:

cd ~
git clone git://github.com/Pyplate/db-cluster-utils.git
cd db_cluster_utils

Configuration

There is a configuration file containing the following parameters:

  • database name - the name of the database to be managed
  • master ip - the IP address of the master server
  • control host - the IP address of the node used for controlling the cluster
  • slave ip list - a comma separated list of slave servers
  • root password - the password of the MySQL root user's account
  • db user pw - the password for the MySQL user account that's used for queries
  • slave user pw - the password for the MySQL user account that's used for queries
  • host username - the password for the MySQL user account that's used for replication
  • server pool - a list of servers that are available to join the cluster

You can find this configuration file in ~/db-cluster-utils/cluster_utils.conf.  I set it up as follows:

[cluster_utils]

# the name of the database to create
database name = my_db

# IP address of the master server
master ip = 192.168.0.35

# the IP address of the control node
control host = 192.168.0.8

# a list of IP addresses of the slave servers
slave ip list = 192.168.0.36, 192.168.0.37

# root user's pass word 
root password = rootpwd

# database user's password
db user pw = mypassword

# slave user's password
slave user pw = mypassword

# server host username
host username = bananapi

# server pool
server pool = 192.168.0.35,192.168.0.36,192.168.0.37,192.168.0.38

The server pool includes all the servers in the cluster.  The server at 192.168.0.35 is the master server, and the servers at 192.168.0.36 and 192.168.0.37 are slaves.  I left 192.168.0.38 unused to begin with.

Using DB Cluster Utils

On the control node, open a terminal window, change directory to ~/db_cluster utils/ and use this command to display the available options:

./db_cluster_utils.py -h
Usage:
db_cluster_utils.py -option [argument]
 -a    --add <slave ip>        add a new slave to the cluster
 -c    --config                view cluster configuration information
 -d    --demote                demote a master to a slave
 -h    --help                  display this help message
 -i    --init                  initialize the cluster based on settings 
                               in cluster_utils.conf
 -m    --move <destination ip> Move the database to another server
 -p    --promote <slave ip>    promote a slave to master
 -r    --remove <slave ip>     remove this node from the cluster
 -s    --start                 start replication
 -t    --stop                  stop replication
 -w    --wipe [IP address]     wipe a database from the entire cluster,
                               or from a single server if an IP address 
                               is specified

Commands need to be used in the correct sequence.  At the time of writing the use of combinations of options hasn't been tested - just use one option at a time.

The first time you use DB Cluster Utils, you need to use the '-i' option to initialize the cluster:

./db_cluster_utils.py -i
Database name: my_db
Master server IP: 192.168.0.35
Control Host: 192.168.0.8
Slave IP address list:
['192.168.0.36', '192.168.0.37']
Root password: T4nk3r!12
User password: mypassword
Slave user password: mypassword
192.168.0.35 Created database my_db
192.168.0.35 Created user db_user
192.168.0.35 Granted privileges
192.168.0.35 Created user slave_user
192.168.0.35 Granted privileges
192.168.0.35 Got master status: mysql-bin.000007, 64322
192.168.0.35 Granted replication privilege to slave_user@192.168.0.36 id'd by mypassword
192.168.0.35 Granted replication privilege to slave_user@192.168.0.37 id'd by mypassword
Init server 192.168.0.36
192.168.0.36 Created database my_db
192.168.0.36 Created user db_user
192.168.0.36 Granted privileges
192.168.0.36 Created user slave_user
192.168.0.36 Granted privileges
192.168.0.36 Got master status: mysql-bin.000031, 27173
192.168.0.36 Set read only
CHANGE MASTER TO MASTER_HOST='192.168.0.35', 
                    MASTER_USER='slave_user', 
                    MASTER_PASSWORD='mypassword', 
                    MASTER_LOG_FILE='mysql-bin.000007', 
                    MASTER_LOG_POS=64322;
192.168.0.36 Set master
Init server 192.168.0.37
192.168.0.37 Created database my_db
192.168.0.37 Created user db_user
192.168.0.37 Granted privileges
192.168.0.37 Created user slave_user
192.168.0.37 Granted privileges
192.168.0.37 Got master status: mysql-bin.000030, 25414
192.168.0.37 Set read only
CHANGE MASTER TO MASTER_HOST='192.168.0.35', 
                    MASTER_USER='slave_user', 
                    MASTER_PASSWORD='mypassword', 
                    MASTER_LOG_FILE='mysql-bin.000007', 
                    MASTER_LOG_POS=64322;
192.168.0.37 Set master

This command initializes each server, creating the database, creating users and granting privileges.  

On the master server, replication privileges are granted for the replication users on the other servers.  On each slave, the CHANGE MASTER directive is used to tell the slave which node to use as master.  The slaves are then set read only.

This command also creates a file containing information about each node.  The log file and log position for each server are recorded in a file called status_log_records.dat.

ip address:binlog file:position:role

A typical example of the contents of status_log_records.dat might look like this:

192.168.0.35:mysql-bin.000032:8813:master
192.168.0.36:mysql-bin.000031:34486:slave
192.168.0.37:mysql-bin.000030:31273:slave

Insert Data into the cluster

I wrote a script called insert_data.py which creates some tables and stores sample data on the master server.  When I run it, the script connects to the master server, and runs some queries to create tables and insert records:

./insert_data.py
Done
Done
Done
Done
Done
Done
Done
Done
Done
Done
Done
DB Connection closed

Start replication

At this point the master server contains data, but the slaves don't.  When I run db_cluster_utils.py with the '-s' option, the slaves start replicating data on the master server:

./db_cluster_utils -s
192.168.0.36 Started slave thread
192.168.0.37 Started slave thread

Now each node contains the sample data.  I used ssh to log into each node in a different terminal window, and log into the MySQL shell as root:

mysql -u root -p

I can run the following query in the MySQL shell of any of the servers that has been initialized so far (192.168.0.35 to 192.168.0.37), and I will see the same result:

mysql> SELECT * FROM users;
+----------+----------+-----------------+---------------------+
| username | password | email_addr      | joined              |
+----------+----------+-----------------+---------------------+
| Anna     | 123456   | anna@email.com  | 2014-10-20 00:00:00 |
| Bob      | 333456   | bob@addr.com    | 2014-10-20 00:00:00 |
| Claire   | 444456   | claire@addr.com | 2014-10-20 00:00:00 |
| Dave     | 555456   | dave@addr.com   | 2014-10-20 00:00:00 |
| Emma     | 666456   | emma@addr.com   | 2014-10-20 00:00:00 |
| Fred     | 111456   | fred@addr.com   | 2014-10-20 00:00:00 |
| Gwen     | 222456   | gwen@addr.com   | 2014-10-20 00:00:00 |
| Harry    | 123123   | harry@mail.com  | 2014-10-20 00:00:00 |
+----------+----------+-----------------+---------------------+
8 rows in set (0.00 sec)

This shows that the database has been set up, and replication is working.  I can confirm this by adding a record on the master server at IP address 192.168.0.35:

mysql> INSERT INTO users VALUES ('Isabelle', '123456', 'email@addr.com', CURRENT_DATE());
Query OK, 1 row affected (0.02 sec)

>mysql> COMMIT;
Query OK, 0 rows affected (0.01 sec)

Again, when I select all records in the users table, I should see the same result on all servers:

mysql> SELECT * FROM users;
+----------+----------+-----------------+---------------------+
| username | password | email_addr      | joined              |
+----------+----------+-----------------+---------------------+
| Anna     | 123456   | anna@email.com  | 2014-10-20 00:00:00 |
| Bob      | 333456   | bob@addr.com    | 2014-10-20 00:00:00 |
| Claire   | 444456   | claire@addr.com | 2014-10-20 00:00:00 |
| Dave     | 555456   | dave@addr.com   | 2014-10-20 00:00:00 |
| Emma     | 666456   | emma@addr.com   | 2014-10-20 00:00:00 |
| Fred     | 111456   | fred@addr.com   | 2014-10-20 00:00:00 |
| Gwen     | 222456   | gwen@addr.com   | 2014-10-20 00:00:00 |
| Harry    | 123123   | harry@mail.com  | 2014-10-20 00:00:00 |
| Isabelle | 123456   | email@addr.com  | 2014-10-20 00:00:00 |
+----------+----------+-----------------+---------------------+
9 rows in set (0.01 sec)

Command Line Usage Examples

The following examples demonstrate how to use DB Cluster Utils.  I'm using the same database cluster that I set up earlier.

Add a server

If any of the slave servers breakdown and need to be replaced, I need to be able to add servers to the cluster.

For the purposes of these posts, I've set up the DB Cluster Utils configuration file to initialize two slave servers. The server at address 192.168.0.38 hasn't been initialized, so I'm going to add it to the cluster as a slave.

./db_cluster_utils.py -a 192.168.0.38
192.168.0.35 Got master status: mysql-bin.000007, 80022
192.168.0.35 Granted replication privilege to slave_user@192.168.0.38 id'd by mypassword
Initializing new slave
Init server 192.168.0.38
192.168.0.38 Created database my_db
192.168.0.38 Created user db_user
192.168.0.38 Granted privileges
192.168.0.38 Created user slave_user
192.168.0.38 Granted privileges
192.168.0.38 Got master status: mysql-bin.000002, 36483
192.168.0.38 Set read only
CHANGE MASTER TO MASTER_HOST='192.168.0.35', 
                    MASTER_USER='slave_user', 
                    MASTER_PASSWORD='mypassword', 
                    MASTER_LOG_FILE='mysql-bin.000007', 
                    MASTER_LOG_POS=80022;
192.168.0.38 Set master
New slave IP: 192.168.0.38

This command gets the log file and position information from the cluster master, and grants replication privileges to the new slave.  The new slave server is initialized, and the CHANGE MASTER directive is used to tell the slave which node to use as the master server.

At this point, the new slave has joined the cluster, but it doesn't contain any data yet. The database needs to be moved from the master server to the new slave:

./db_cluster_utils.py -m 192.168.0.38
Dump the database on the master server
mysqldump -h 192.168.0.35 -u root -p --opt my_db --result-file=my_db_2014.10.20.133409.sql
Enter password: 
Move the database file
scp my_db_2014.10.20.133409.sql bananapi@192.168.0.38:./my_db_2014.10.20.133409.sql
bananapi@192.168.0.38's password: 
my_db_2014.10.20.133409.sql                                                    100% 3703     3.6KB/s   00:00    
Enter password: 

You will be prompted for the MySQL server's root user password on the master.  Next you'll be prompted for the password of your Linux user account on the master, and finally you need to enter the password of the MySQL server's root user on the new slave.

Start replication across the entire cluster:

./db_cluster_utils.py -s
192.168.0.36 Started slave thread
192.168.0.37 Started slave thread
192.168.0.38 Started slave thread

Make sure replication is working by adding another record to the database on the master server:

mysql> INSERT INTO users VALUES ('James', '123456', 'email@addr.com', CURRENT_DATE());

Query OK, 1 row affected (0.02 sec)
mysql> COMMIT;
Query OK, 0 rows affected (0.00 sec)

Now if I log into the MySQL shell on the server at 192.168.0.38, and run a query to select all users, I should see the sample data plus the newly added record:

mysql> SELECT * FROM users;
+----------+----------+-----------------+---------------------+
| username | password | email_addr      | joined              |
+----------+----------+-----------------+---------------------+
| Anna     | 123456   | anna@email.com  | 2014-10-20 00:00:00 |
| Bob      | 333456   | bob@addr.com    | 2014-10-20 00:00:00 |
| Claire   | 444456   | claire@addr.com | 2014-10-20 00:00:00 |
| Dave     | 555456   | dave@addr.com   | 2014-10-20 00:00:00 |
| Emma     | 666456   | emma@addr.com   | 2014-10-20 00:00:00 |
| Fred     | 111456   | fred@addr.com   | 2014-10-20 00:00:00 |
| Gwen     | 222456   | gwen@addr.com   | 2014-10-20 00:00:00 |
| Harry    | 123123   | harry@mail.com  | 2014-10-20 00:00:00 |
| Isabelle | 123456   | email@addr.com  | 2014-10-20 00:00:00 |
| James    | 123456   | email@addr.com  | 2014-10-20 00:00:00 |
+----------+----------+-----------------+---------------------+
10 rows in set (0.00 sec)

Switch masters

If you need to do any maintenance on the master server, you may need to take it off line, so there needs to be a way to demote a master and promote a slave to take its place.  The '-d' option can be used to stop replication, revoke replication privileges from the master, and make it read only:

./db_cluster_utils.py -d
192.168.0.36 Stopped slave thread
192.168.0.37 Stopped slave thread
192.168.0.38 Stopped slave thread
192.168.0.36 Reset slave
192.168.0.37 Reset slave
192.168.0.38 Reset slave
192.168.0.35 Revoked replication privilege from slave_user@192.168.0.36
192.168.0.35 Revoked replication privilege from slave_user@192.168.0.37
192.168.0.35 Revoked replication privilege from slave_user@192.168.0.38
192.168.0.35 Set read only

Now the cluster doesn't have a master, so I'm going to promote the slave at IP address 192.168.0.36:

bananapi@master ~/db-cluster-utils $ ./db_cluster_utils.py -p 192.168.0.36
./db_cluster_utils:369: Warning: Slave already has been stopped
  self.curs.execute("STOP SLAVE")
192.168.0.35 Stopped slave thread
192.168.0.36 Stopped slave thread
192.168.0.37 Stopped slave thread
192.168.0.38 Stopped slave thread
192.168.0.36 Cleared read only
Slave IP list:
['192.168.0.35', '192.168.0.37', '192.168.0.38']
192.168.0.36 Granted replication privilege to slave_user@192.168.0.35 id'd by mypassword
192.168.0.36 Granted replication privilege to slave_user@192.168.0.37 id'd by mypassword
192.168.0.36 Granted replication privilege to slave_user@192.168.0.38 id'd by mypassword
192.168.0.36 Got master status: mysql-bin.000031, 32279
192.168.0.35
CHANGE MASTER TO MASTER_HOST='192.168.0.36', 
                    MASTER_USER='slave_user', 
                    MASTER_PASSWORD='mypassword', 
                    MASTER_LOG_FILE='mysql-bin.000031', 
                    MASTER_LOG_POS=32279;
192.168.0.35 Set master
192.168.0.37
CHANGE MASTER TO MASTER_HOST='192.168.0.36', 
                    MASTER_USER='slave_user', 
                    MASTER_PASSWORD='mypassword', 
                    MASTER_LOG_FILE='mysql-bin.000031', 
                    MASTER_LOG_POS=32279;
192.168.0.37 Set master
192.168.0.38
CHANGE MASTER TO MASTER_HOST='192.168.0.36', 
                    MASTER_USER='slave_user', 
                    MASTER_PASSWORD='mypassword', 
                    MASTER_LOG_FILE='mysql-bin.000031', 
                    MASTER_LOG_POS=32279;
192.168.0.38 Set master
192.168.0.35 Started slave thread
192.168.0.37 Started slave thread
192.168.0.38 Started slave thread
Database name: my_db
Master server IP: 192.168.0.36
Control Host: 192.168.0.8
Slave IP address list:
['192.168.0.35', '192.168.0.37', '192.168.0.38']
Root password: T4nk3r!12
User password: mypassword
Slave user password: mypassword
New master IP: 192.168.0.36

This command grants replication privileges on the new master, and tells each slave to use the new master.

In the MySQL shell on the server at 192.168.0.36, I added a record to the database:

mysql> INSERT INTO users VALUES ('Kelly', '123456', 'email@addr.com', CURRENT_DATE());
Query OK, 1 row affected (0.02 sec)

mysql> COMMIT;
Query OK, 0 rows affected (0.00 sec)

Now on any of the slaves, if I select all users, I can see the following output:

mysql> SELECT * FROM users;
+----------+----------+-----------------+---------------------+
| username | password | email_addr      | joined              |
+----------+----------+-----------------+---------------------+
| Anna     | 123456   | anna@email.com  | 2014-10-20 00:00:00 |
| Bob      | 333456   | bob@addr.com    | 2014-10-20 00:00:00 |
| Claire   | 444456   | claire@addr.com | 2014-10-20 00:00:00 |
| Dave     | 555456   | dave@addr.com   | 2014-10-20 00:00:00 |
| Emma     | 666456   | emma@addr.com   | 2014-10-20 00:00:00 |
| Fred     | 111456   | fred@addr.com   | 2014-10-20 00:00:00 |
| Gwen     | 222456   | gwen@addr.com   | 2014-10-20 00:00:00 |
| Harry    | 123123   | harry@mail.com  | 2014-10-20 00:00:00 |
| Isabelle | 123456   | email@addr.com  | 2014-10-20 00:00:00 |
| James    | 123456   | email@addr.com  | 2014-10-20 00:00:00 |
| Kelly    | 123456   | email@addr.com  | 2014-10-20 00:00:00 |
+----------+----------+-----------------+---------------------+
11 rows in set (0.00 sec)

Wipe the database

During testing, I often find it useful to remove the databases from each node in the cluster.  The '-w' option allows me to do this:

./db_cluster_utils.py -w
Are you sure you want to wipe database my_db? (y/n): y
192.168.0.36 Stopped slave thread
192.168.0.37 Stopped slave thread
./db_cluster_utils:369: Warning: Slave already has been stopped
  self.curs.execute("STOP SLAVE")
192.168.0.35 Stopped slave thread
192.168.0.35 Dropped user slave_user
192.168.0.35 Dropped user db_user
192.168.0.35 Dropped database my_db
192.168.0.36 Stopped slave thread
192.168.0.36 Dropped user slave_user
192.168.0.36 Dropped user db_user
192.168.0.36 Dropped database my_db
192.168.0.37 Stopped slave thread
192.168.0.37 Dropped user slave_user
192.168.0.37 Dropped user db_user
192.168.0.37 Dropped database my_db
192.168.0.38 Stopped slave thread
192.168.0.38 Dropped user slave_user
192.168.0.38 Dropped user db_user
192.168.0.38 Dropped database my_db

The wipe command effects each server in the server pool, including the master, slaves, and nodes that haven't yet been added to the cluster.  You will be prompted to see if you're sure you want to wipe the databases from the cluster.

Now I can re-initialize the cluster using the '-i' option and start again.

 

Error Recovery

At some point, a server will probably break down and need to be replaced.  When this happens it's important to minimize down-time and avoid losing data.  Error recovery procedures need to be simple so that you can solve problems quickly.

I'm not planning on setting up automated failover.  Automated failover can kick in at the wrong time and cause more problems than it solves, as the engineers at Github discovered.

It's a good idea to rehearse error recovery before you have to do it for real.  I need to plan for either a master failure or a slave failure.  A slave failure is fairly easy to deal with - I just need to remove the failed slave from the cluster and add another one.

A master failure is only slightly harder to deal with.  I need to promote one of the remaining slaves and add another slave to the cluster.

Master Failure

I'm using the same cluster as in the other pages in this section.  I've initialized it and started replication.  I forced a failure by simply unplugging the master server's power cable.

The first thing I need to do is remove the server from the cluster:

./db_cluster_utils.py -r 192.168.0.35
Remove node: 192.168.0.35
 Failed to connect
Line: 74
(2002, "Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)")
 *** Couldn't revoke replication privilege from slave_user
Line: 290
db_server instance has no attribute 'curs'
Removing slave
192.168.0.35 Failed to connect
Line: 74
(2003, "Can't connect to MySQL server on '192.168.0.35' (113)")
192.168.0.35 Failed to stop slave
Line: 375
db_server instance has no attribute 'curs'
192.168.0.35 *** Failed to drop user slave_user
Line: 193
db_server instance has no attribute 'curs'
192.168.0.35 *** Failed to drop user db_user
Line: 193
db_server instance has no attribute 'curs'
192.168.0.35 *** Failed to delete database my_db
Line: 204
db_server instance has no attribute 'curs'
Exception AttributeError: "db_server instance has no attribute 'conn'" in <bound method db_server.__del__ of <__main__.db_server instance at 0xb6aa7da0>> ignored
Exception AttributeError: "db_server instance has no attribute 'conn'" in <bound method db_server.__del__ of <__main__.db_server instance at 0xf5db20>> ignored

There are lots of error messages when this command is executed.  The server that's being removed can't accept a connection, so parts of this command didn't work properly.  The important thing is that the entry for this server was removed from status_log_records.dat.

The master server has been removed from the cluster, so one of the slaves needs to be promoted:

./db_cluster_utils.py -p 192.168.0.36
192.168.0.36 Stopped slave thread
192.168.0.37 Stopped slave thread
192.168.0.36 Cleared read only
Slave IP list:
['192.168.0.37']
192.168.0.36 Granted replication privilege to slave_user@192.168.0.37 id'd by mypassword
192.168.0.36 Got master status: mysql-bin.000031, 34726
192.168.0.37
CHANGE MASTER TO MASTER_HOST='192.168.0.36', 
                    MASTER_USER='slave_user', 
                    MASTER_PASSWORD='mypassword', 
                    MASTER_LOG_FILE='mysql-bin.000031', 
                    MASTER_LOG_POS=34726;
192.168.0.37 Set master
192.168.0.37 Started slave thread
Database name: my_db
Master server IP: 192.168.0.36
Control Host: 192.168.0.8
Slave IP address list:
['192.168.0.37']
Root password: T4nk3r!12
User password: mypassword
Slave user password: mypassword
New master IP: 192.168.0.36

Again, I'm going to test that replication is working by adding a record to the new master:

mysql> INSERT INTO users VALUES ('Isabelle', '123456', 'email@addr.com', CURRENT_DATE());
Query OK, 1 row affected (0.03 sec)

mysql> COMMIT;
Query OK, 0 rows affected (0.00 sec)

At the MySQL prompt on the remaining slave, I should see the new record in the list of all records from the users table:

mysql> SELECT * FROM users;
+----------+----------+-----------------+---------------------+
| username | password | email_addr      | joined              |
+----------+----------+-----------------+---------------------+
| Anna     | 123456   | anna@email.com  | 2014-10-20 00:00:00 |
| Bob      | 333456   | bob@addr.com    | 2014-10-20 00:00:00 |
| Claire   | 444456   | claire@addr.com | 2014-10-20 00:00:00 |
| Dave     | 555456   | dave@addr.com   | 2014-10-20 00:00:00 |
| Emma     | 666456   | emma@addr.com   | 2014-10-20 00:00:00 |
| Fred     | 111456   | fred@addr.com   | 2014-10-20 00:00:00 |
| Gwen     | 222456   | gwen@addr.com   | 2014-10-20 00:00:00 |
| Harry    | 123123   | harry@mail.com  | 2014-10-20 00:00:00 |
| Isabelle | 123456   | email@addr.com  | 2014-10-20 00:00:00 |
+----------+----------+-----------------+---------------------+
9 rows in set (0.00 sec)

Add the recovered master as a slave

Normally I would need to fix whatever went wrong with the server, but in this instance, I can simply boot it up again and drop the database from this server.  Then I can add the server back into the cluster:

./db_cluster_utils.py -a 192.168.0.35
192.168.0.36 Got master status: mysql-bin.000032, 1067
192.168.0.36 Granted replication privilege to slave_user@192.168.0.35 id'd by mypassword
Initializing new slave
Init server 192.168.0.35
192.168.0.35 Created database my_db
192.168.0.35 Created user db_user
192.168.0.35 Granted privileges
192.168.0.35 Created user slave_user
192.168.0.35 Granted privileges
192.168.0.35 Got master status: mysql-bin.000008, 8813
192.168.0.35 Set read only
CHANGE MASTER TO MASTER_HOST='192.168.0.36', 
                    MASTER_USER='slave_user', 
                    MASTER_PASSWORD='mypassword', 
                    MASTER_LOG_FILE='mysql-bin.000032', 
                    MASTER_LOG_POS=1067;
192.168.0.35 Set master
New slave IP: 192.168.0.35

Now I need to copy the database from the new master to the new slave:

./db_cluster_utils.py -m 192.168.0.35
Dump the database on the master server
mysqldump -h 192.168.0.36 -u root -p --opt my_db --result-file=my_db_2014.10.21.175558.sql
Enter password: 
Move the database file
scp my_db_2014.10.21.175558.sql bananapi@192.168.0.35:./my_db_2014.10.21.175558.sql
bananapi@192.168.0.35's password: 
my_db_2014.10.21.175558.sql                   100% 3703     3.6KB/s   00:00    
Enter password: 

This command shows the configuration information for the cluster:

./db_cluster_utils.py -c
Database name: my_db
Master server IP: 192.168.0.36
Control Host: 192.168.0.8
Slave IP address list:
['192.168.0.37', '192.168.0.35']
Root password: T4nk3r!12
User password: mypassword
Slave user password: mypassword

192.168.0.36:mysql-bin.000031:34486:master
192.168.0.37:mysql-bin.000030:31273:slave
192.168.0.35:mysql-bin.000008:8813:slave

I wanted to check that replication is working, so I added a new record to the master at 192.168.0.36:

mysql> INSERT INTO users VALUES ('John', '123456', 'email@addr.com', CURRENT_DATE());
Query OK, 1 row affected (0.02 sec)

mysql> COMMIT;
Query OK, 0 rows affected (0.00 sec)

On other servers, the contents of the users table should look like this:

mysql> SELECT * FROM users;
+----------+----------+-----------------+---------------------+
| username | password | email_addr      | joined              |
+----------+----------+-----------------+---------------------+
| Anna     | 123456   | anna@email.com  | 2014-10-20 00:00:00 |
| Bob      | 333456   | bob@addr.com    | 2014-10-20 00:00:00 |
| Claire   | 444456   | claire@addr.com | 2014-10-20 00:00:00 |
| Dave     | 555456   | dave@addr.com   | 2014-10-20 00:00:00 |
| Emma     | 666456   | emma@addr.com   | 2014-10-20 00:00:00 |
| Fred     | 111456   | fred@addr.com   | 2014-10-20 00:00:00 |
| Gwen     | 222456   | gwen@addr.com   | 2014-10-20 00:00:00 |
| Harry    | 123123   | harry@mail.com  | 2014-10-20 00:00:00 |
| Isabelle | 123456   | email@addr.com  | 2014-10-20 00:00:00 |
| John     | 123456   | email@addr.com  | 2014-10-21 00:00:00 |
+----------+----------+-----------------+---------------------+
10 rows in set (0.00 sec)

DB Cluster Utils Cluster API

The db_cluster class contains methods that are performed on the cluster as a whole.

Attritbutes

database_name - the name of the database to be administered.
master_ip - the IP address of the master server.
control_host - the IP address of the control host
slave_ip_list - a list of slave IP addresses
root_password - the password of the root MySQL user
db_user_pw - the password of the MySQL user 'db_user'
slave_user_pw - the password of the replication MySQL user

 

Methods

__init__ (self)

Load settings from config file and from the log.
Return a cluster object.

Example:

import db_cluster_utils

cluster = db_cluster_utils.db_cluster()


init_cluster(self)

Load settings from config file, ignoring setting in the status log.

Example:

import db_cluster_utils

cluster = db_cluster_utils.db_cluster()
cluster.init_cluster()

 

init_master(self)

Initialize a server, get and save its log file and position, and grant replication privileges for the slaves.

Example:

import db_cluster_utils

cluster = db_cluster_utils.db_cluster()
cluster = cluster.init_master()


init_slave(self, host)

Initialize a server, get and save its log file and position, set it read only, and tell it which master to use.

Argument

  • host - IP address of a slave server expressed as a string in format xxx.xxx.xxx.xxx

Example:

import db_cluster_utils

cluster = db_cluster_utils.db_cluster()
for host in cluster.slave_ip_list:
    cluster.init_slave(host)

 

wipe (self, db_name)

Delete a database from each server listed in the server pool.

Argument

  • db_name - the name of the database to be removed

Example:

import db_cluster_utils

cluster = db_cluster_utils.db_cluster()
cluster.stop_replication ()
cluster.wipe (cluster.database_name)

 

add_slave (self, host)

Add a slave to the cluster.
Get the master server's log file and position.
Grant replication privileges for the new slave on the master.
Initialize the slave.

Argument

  • host - IP address expressed as a string in format xxx.xxx.xxx.xxx

Example:

import db_cluster_utils

cluster = db_cluster_utils.db_cluster()
cluster.add_slave(arg)

 

remove_slave (self, host)

Remove a slave from the cluster.
Remove the host's record from the status log file.
Revoke replicatio privileges for this host on the master server.
Stop the slave.

Argument

  • host - IP address of a slave expressed as a string in format xxx.xxx.xxx.xxx

Example:

import db_cluster_utils

cluster = db_cluster_utils.db_cluster()
cluster.remove_slave (arg)

 

print_config (self)

Load the cluster configuration information and print it.

Example:

import db_cluster_utils

cluster = db_cluster_utils.db_cluster()
cluster.print_config ()


demote_cluster_master (self)

Demote a master to a slave.
Reset all slaves.
Revoke replication privileges for all slaves.
Update role in status file.
Set read only.

Example:

import db_cluster_utils

cluster = db_cluster_utils.db_cluster()
cluster.stop_replication ()
cluster.demote_cluster_master()


promote_cluster_slave (self, host)

Promote a slave to a master.
Stop replication across the whole cluster.
Clear read only on the new master.
Update status log file.
Update slave IP address list.
Set the slave as a master.
Tell all the other slaves to use the new master.

Argument

  • host - IP address of a slave expressed as a string in format xxx.xxx.xxx.xxx

Example:

import db_cluster_utils

cluster = db_cluster_utils.db_cluster()
cluster.promote_cluster_slave (arg)
cluster.start_replication ()


start_replication(self)

Start replication on all slaves in the cluster.

Example:

import db_cluster_utils

cluster = db_cluster_utils.db_cluster()
cluster.start_replication ()


stop_replication(self)

Stop replication on all slaves in the cluster.

Example:

import db_cluster_utils

cluster = db_cluster_utils.db_cluster()
cluster.stop_replication ()


move_db (self, host)

Move the database to a new slave.
Dump the database from the master.
Use the scp command to transfer the database file to the destination host.
Import the database on the destination host.

Argument

  • host - destination IP address expressed as a string in format xxx.xxx.xxx.xxx

Example:

import db_cluster_utils

cluster = db_cluster_utils.db_cluster ()
cluster.move_db (arg)


get_master_ip (self)

Get the master server's IP address from the status log.

Return the IP address of master.

Example:

import db_cluster_utils

cluster = db_cluster_utils.db_cluster ()
master_ip = cluster.get_master_ip ()


update_slave_ip_list (self)

Get an updated slave list from the status log.

Example:

import db_cluster_utils

slave_list = []
cluster = db_cluster_utils.db_cluster ()
slave_list = cluster.update_slave_ip_list()

 

update_bin_log_role (self, host, role)

Update the role field in a status log file.

Arguments

  • host - IP address of a host expressed as a string in format xxx.xxx.xxx.xxx
  • role - the role of the host (master or slave)

Example:

import db_cluster_utils

cluster = db_cluster_utils.db_cluster ()
cluster.update_bin_log_role (host, "master")

 

get_master_bin_log (self, host)

Get the log file and log position information for a host from the status log file.

Return the log file and its postition.

Argument

  • host - IP address expressed as a string in format xxx.xxx.xxx.xxx

Example:

import db_cluster_utils

cluster = db_cluster_utils.db_cluster ()
binlog_file, binlog_position = cluster.get_master_bin_log (host)


save_master_bin_log (self, host, mysqlbin, binlog, role)

Save log file and log position information for a host.

Arguments

  • host - IP address expressed as a string in format xxx.xxx.xxx.xxx
  • mysqlbin - the MySQL binary log file for this host
  • binlog - the position in the MySQL binary log file for this host
  • role - the role of this host (master or slave)

Example:

import db_cluster_utils

cluster = db_cluster_utils.db_cluster ()
mysql_bin_file, bin_log_pos = server.get_master_status ()
cluster.save_master_bin_log (server.ip_addr, mysql_bin_file, bin_log_pos, "slave")


remove_bin_log (self, host)

Remove a host's record from the status log.

Argument

  • host - IP address expressed as a string in format xxx.xxx.xxx.xxx

Example:

import db_cluster_utils

cluster = db_cluster_utils.db_cluster ()
cluster.remove_bin_log(host)

DB Cluster Utils Server API

The db_server class is used to manage individual servers.  When an instance of db_server is created, the control node connects to one of the nodes in the cluster specified by its IP address. The connection is stored as a variable in the db_server class, and it's used to execute SQL queries and commands.

Attributes

conn - database connection object
curs - database cursor object
ip_addr - the IP address of the server

 

Methods

__init__ (self, host, rt_password)

Create a server object and connect to a physical server

Arguments

  • host - the IP address of the server to connect to
  • rt_password - the MySQL root user password

Example:

import db_cluster_utils

server = db_cluster_utils.db_server (host, cluster.root_password)


__del__ (self)

Close the connection to the database.

 

init_server (self, db_name, usr_pass, slv_pass)

Initialize a MySQL server as either a master or slave.

Arguments

  • db_name - database name
  • usr_pass - user password
  • slv_pass - slave user password

Example:

import db_cluster_utils

server = db_cluster_utils.db_server (host, cluster.root_password)
server.init_server ("some_db", "usrpword", "slvpword")

 

create_db (self, db_name)

Create a database.

Argument

  • db_name - database name

Example:

import db_cluster_utils

server = db_cluster_utils.db_server (host, cluster.root_password)

server.create_db (db_name)

 

create_users (self, user_passwd, slave_passwd)

Create users for the database.  Each server has two users, db_user, which is for general data access, and slave_user, which is used for replication.

Arguments

  • user_passwd - the password for the database user
  • slave_passwd - the replication user password

Example:

import db_cluster_utils

server = db_cluster_utils.db_server (host, cluster.root_password)
server.create_users (usr_pass, slv_pass)

 

create_user (self, user_name, pass_word)

Create a database user.

Arguments

  • user_name - user name
  • pass_word - pass word

Example:

import db_cluster_utils

server = db_cluster_utils.db_server (host, cluster.root_password)
server.create_user ('slave_user', slave_passwd)

 

wipe (self, db_name)

Remove the database and users.

Argument

  • db_name - database name

Example:

import db_cluster_utils

server = db_cluster_utils.db_server (slave, self.root_password)
server.stop_slave ()
server.wipe (db_name)
            

 

drop_user (self, username)

Delete a user from this server.

Argument

  • usesrname - user name to be dropped

Example:

import db_cluster_utils

server = db_cluster_utils.db_server (slave, self.root_password)
server.drop_user (user_name)

 

drop_database (self, db_name)

Delete a datbase from this server.

Argument

  • db_name - database name

Example:

import db_cluster_utils

server = db_cluster_utils.db_server (slave, self.root_password)
server.drop_database (db_name)

 

get_master_status (self)

Get the name of the log file and the log position of a server.

Return the name of the log file and the log position of a server.

Example:

import db_cluster_utils

server = db_cluster_utils.db_server (ip_address, root_password)
server.init_server (cluster.database_name, cluster.db_user_pw, cluster.slave_user_pw)
mysqlbin, binlog = server.get_master_status ()

 

set_as_master (self, slave_list, slave_password)

Make a host a master.

Arguments

  • slave_list - a list of IP addresses of slave servers
  • slave_password - password for the replication user

Example:

import db_cluster_utils

server = db_cluster_utils.db_server (ip_address, root_password)
server.set_as_master (cluster.slave_ip_list, cluster.slave_user_pw)

 

grant_replication (self, user_name, ip, password)

Grant replication privileges on the master to a slave.

Arguments

  • user_name - the name of the user that will be granted replication privileges
  • ip - the IP address of the host that will be granted replication privileges
  • password - the password of the replication user

Example:

import db_cluster_utils

master_server = db_cluster_utils.db_server (host, cluster.root_password)
master_server.grant_replication ('slave_user', host, cluster.slave_user_pw)

 

demote_master (self, slave_list, slave_password)

Make a host a slave.

Arguments

  • slave_list - a list of IP addresses of slave servers
  • slave_password - password for the replication user

Example:

import db_cluster_utils

master_server = db_cluster_utils.db_server (cluster.master_ip, cluster.root_password)
master_server.demote_master (cluster.slave_ip_list, cluster.slave_user_pw)

 

revoke_replication (self, user_name, ip)

Revoke slave replication privileges on the master.

Arguments

  • user_name - the user name of the replication user
  • ip - the IP address of the the host whose replication privileges are being revoked

Example:

import db_cluster_utils

master_server = db_cluster_utils.db_server (host, cluster.root_password)
master_server.revoke_replication ('slave_user', host)

 

set_read_only (self)

Make a database server read only.

Example:

import db_cluster_utils

server = db_cluster_utils.db_server (host, cluster.root_password)
server.set_read_only()

 

clear_read_only (self)

Clear read only status from database.

Example:

import db_cluster_utils

server = db_cluster_utils.db_server (host, cluster.root_password)
server.clear_read_only()

 

use_master (self, master_ip_addr, slave_user_pword, mysqlbin_str, binlog_str)

Tell a slave which server to use as a master.

Arguments

  • master_ip_addr - the IP address of the master server
  • slave_user_pword - the password of the replication user
  • mysqlbin_str - the name of the MySQL log file on the master
  • binlog_str - the position of the master's log file

Example:

import db_cluster_utils

slave_server = db_cluster_utils.db_server (host, cluster.root_password)
slave_server.use_master (cluster.master_ip, cluster.slave_user_pw, cluster.master_mysqlbin, cluster.master_binlog)

 

start_slave (self)

Start the slave thread.

Example:

import db_cluster_utils

server = db_cluster_utils.db_server (host, root_password)
server.start_slave ()

 

stop_slave (self)

Stop the slave thread.

Example:

import db_cluster_utils

server = db_cluster_utils.db_server (host, root_password)
server.stop_slave ()

 

reset_slave (self)

Reset the slave.

Example:

import db_cluster_utils

server = db_cluster_utils.db_server (host, cluster.root_password)
server.reset_slave ()

 

import_db (self, file_name, db_name)

Import data from a file.

Arguments

  • file_name - the name of the file to import
  • db_name - the name of the database to import data into.

Example:

import db_cluster_utils

server = db_cluster_utils.db_server (host, cluster.root_password)
server.import_db(dump_file, self.database_name)

Share this page: