[CentOS] mysql replication - problems

Thu Mar 12 13:57:34 UTC 2015
Tim Dunphy <bluethundr at gmail.com>

Hey everybody,

 I'm trying to get mysql master/slave replication to work under SSL. I've
created the certs for both the slave and the master. I've configured the
master and slave my.cnf. And it does appear that replication is actually
working.

Master is actually MariaDB (version 5.5.41-MariaDB-log, and the slave is
MySQL (version 5.5.41-log).

But there are two issues I'd like to resolve. One is that SSL appears to be
disabled.

If I look at both the master and the slave and do a 'show variables'
command, I can see that it's recognizing the certs. But the 'have_openssl'
and 'have_ssl' variables are showing as DISABLED.

Watch, on the master:

MariaDB [(none)]> show variables like '%ssl%';
+---------------+--------------------------------+
| Variable_name | Value                          |
+---------------+--------------------------------+
| have_openssl  | DISABLED                       |
| have_ssl      | DISABLED                       |
| ssl_ca        | /etc/pki/CA/certs/ca.crt       |
| ssl_capath    |                                |
| ssl_cert      | /etc/pki/tls/certs/mysql.crt   |
| ssl_cipher    |                                |
| ssl_key       | /etc/pki/tls/private/mysql.key |
+---------------+--------------------------------+
7 rows in set (0.01 sec)

On the slave:

mysql> show variables like '%ssl%';
+---------------+--------------------------------------+
| Variable_name | Value                                |
+---------------+--------------------------------------+
| have_openssl  | DISABLED                             |
| have_ssl      | DISABLED                             |
| ssl_ca        | /etc/pki/CA/certs/ca.crt             |
| ssl_capath    |                                      |
| ssl_cert      | /etc/pki/tls/certs/mysql-slave.crt   |
| ssl_cipher    |                                      |
| ssl_key       | /etc/pki/tls/private/mysql-slave.key |
+---------------+--------------------------------------+
7 rows in set (0.00 sec)

And yet I clearly have SSL enabled in both configurations.

In the master mysql configuration I have:

[root at web2:~] #cat /etc/my.cnf
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
symbolic-links=0
*ssl*
*ssl-ca=/etc/pki/CA/certs/ca.crt*
*ssl-cert=/etc/pki/tls/certs/mysql.crt*
*ssl-key=/etc/pki/tls/private/mysql.key*
server-id               = 1
log_bin                 = /var/log/mariadb/mysql-bin.log
expire_logs_days        = 10
max_binlog_size         = 100M
binlog_do_db            = jokefire

[mysqld_safe]
log-error=/var/log/mariadb/mariadb.log
pid-file=/var/run/mariadb/mariadb.pid

On the mysql slave:

[root at ops:~] #cat /etc/my.cnf
[mysqld]
# Settings user and group are ignored when systemd is used (fedora >= 15).
# If you need to run mysqld under different user or group,
# customize your systemd unit file for mysqld according to the
# instructions in http://fedoraproject.org/wiki/Systemd
user=mysql
*ssl*
*server-id=2*

*replicate-do-db=jokefire*
*ssl-ca=/etc/pki/CA/certs/ca.crt*
*ssl-cert=/etc/pki/tls/certs/mysql-slave.crt*
*ssl-key=/etc/pki/tls/private/mysql-slave.key*
thread_cache_size = 4

datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
symbolic-links=0
;plugin-load=rpl_semi_sync_master=semisync_master.so
;plugin-load=rpl_semi_sync_slave=semisync_slave.so
;rpl_semi_sync_master_enabled=1
;rpl_semi_sync_master_timeout=10
;rpl_semi_sync_slave_enabled=1
;performance_schema
query_cache_size = 8MB
innodb_buffer_pool_size = 199M
general_log_file=/var/log/mysql/mysql.log
general_log=1
log-error=/var/log/mysql/mysql_error_log
log-slow-queries=/var/log/mysql/mysql_slow_log
wait_timeout = 86400

[mysqld_safe]
general_log_file=/var/log/mysql/mysql.log
general_log=1
log-error=/var/log/mysql/mysql_error_log
log-slow-queries=/var/log/mysql/mysql_slow_log
pid-file=/var/run/mysqld/mysqld.pid
innodb_buffer_pool_size = 199M
wait_timeout = 28800
interactive_timeout = 28800
master-connect-retry=60

So my first question is, why is SSL not enabled in either database? I
restarted the service on both machines before taking a look at the
variables.

The next problem I'm having is that I can't seem to get the replication
user to connect. I had to use an account with more privileges (grant all)
in order to connect from the slave to the master.

I used this grant on the master to try and setup the replication user:

GRANT REPLICATION SLAVE ON *.* TO 'jf_slave'@'ops.somewhere.com' IDENTIFIED
BY 'secret' REQUIRE SSL;

Then back on the slave I used this command to connect the slave to the
master:

mysql> CHANGE MASTER TO MASTER_HOST='web2.somewhere.com',
MASTER_USER='jf_slave', MASTER_PASSWORD='secret',
MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=34697, MASTER_SSL=1,
MASTER_SSL_CA = '/etc/pki/CA/certs/ca.crt', MASTER_SSL_CERT =
'/etc/pki/tls/certs/mysql.crt', MASTER_SSL_KEY =
'/etc/pki/tls/private/mysql.key';

And when I start up the slave I see that there's a problem connecting from
the slave to the master:

mysql> show slave status \G
*************************** 1. row ***************************
               *Slave_IO_State: Connecting to master*
                  Master_Host: web2.somewhere.com
                  Master_User: jf_slave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000002
          Read_Master_Log_Pos: 761404
               Relay_Log_File: mysqld-relay-bin.000001
                Relay_Log_Pos: 4
        Relay_Master_Log_File: mysql-bin.000002
             *Slave_IO_Running: Connecting*
            Slave_SQL_Running: Yes
              Replicate_Do_DB: testdb
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 761404
              Relay_Log_Space: 107
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: Yes
           Master_SSL_CA_File: /etc/pki/CA/certs/ca.crt
           Master_SSL_CA_Path:
              Master_SSL_Cert: /etc/pki/tls/certs/mysql.crt
            Master_SSL_Cipher:
               Master_SSL_Key: /etc/pki/tls/private/mysql.key
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 1045
                Last_IO_Error: error connecting to master
*'jf_slave at web2.somewhere.com:3306
<http://jf_slave@web2.somewhere.com:3306>' - retry-time: 60  retries: 86400*
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
1 row in set (0.00 sec)

And if I go back the command line in bash, and try to connect from the
slave to the master, it seems that I can't:

[root at ops:~] #mysql -ujf_slave -p -h web2.somewhere.com
Enter password:
ERROR 1045 (28000): *Access denied* for user 'jf_slave'@'ops.somewhere.com'
(using password: YES)

So I made sure that I could connect from the slave to the master using an
admin account, that has some more privileges:

[root at ops:~] #mysql -uadmin -p -h web2.somewhere.com
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1062
Server version: 5.5.41-MariaDB-log MariaDB Server

Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input
statement.

mysql>

And then setup that account as the replication user:

mysql> CHANGE MASTER TO MASTER_HOST='web2.somewhere.com',
MASTER_USER='admin', MASTER_PASSWORD='secret',
MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=767030, MASTER_SSL=1,
MASTER_SSL_CA = '/etc/pki/CA/certs/ca.crt', MASTER_SSL_CERT =
'/etc/pki/tls/certs/mysql.crt', MASTER_SSL_KEY =
'/etc/pki/tls/private/mysql.key';
Query OK, 0 rows affected (0.02 sec)


You can see that replication is working:

mysql> show slave status \G
*************************** 1. row ***************************
               *Slave_IO_State: Waiting for master to send event*
                  Master_Host: web2.somewhere.com
                  Master_User: admin
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000002
         * Read_Master_Log_Pos: 771825*
               Relay_Log_File: mysqld-relay-bin.000002
                Relay_Log_Pos: 391
        Relay_Master_Log_File: mysql-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: testdb
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 771825
              Relay_Log_Space: 548
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: Yes
           Master_SSL_CA_File: /etc/pki/CA/certs/ca.crt
           Master_SSL_CA_Path:
              Master_SSL_Cert: /etc/pki/tls/certs/mysql.crt
            Master_SSL_Cipher:
               Master_SSL_Key: /etc/pki/tls/private/mysql.key
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
1 row in set (0.00 sec)

And if you run that command a couple times you can see that the bin log
position changes.

I realize that it can be dangerous to setup a user with elevated privileges
to perform the replication. But I'm using a test database with test data
until I can get this working correctly. Plus I also have the firewall
limiting the connection to only the slave from the master over the database
port.

Ok, so my second question is, why can't the replication user connect from
the slave to the master, using that grant command I showed you a bit
earlier? It seems to me like it should have worked.

And my last question is more of a minor annoyance, and shouldn't affect the
overall operation of the database.

If I put this command: master-connect-retry=60 in the [mysqld] section on
the slave, the mysqld service will not start. If, instead I put it into
the [mysqld_safe] section, I'm able to start up mysql with no issues.
Again, this is something I'm just curious about. The other two questions
are quite a bit more important.

I realize this is more of a mysql question, than it is a CentOS admin
question. But you guys seem really knowledgable on this topic. And I've had
great luck with this list in the past. So I hope you won't mind me tapping
your expertise in this area.

I definitely welcome the advice of the experts in this community.

Thanks!
Tim







-- 
GPG me!!

gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B