User Tools

Site Tools


progetti:cloud-areapd:best_practices:optimize_mysql_db_wan_keystone

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
progetti:cloud-areapd:best_practices:optimize_mysql_db_wan_keystone [2014/05/18 20:31] – [PROBLEM] aiftim@infn.itprogetti:cloud-areapd:best_practices:optimize_mysql_db_wan_keystone [2014/06/18 10:14] (current) – [SOLUTION] aiftim@infn.it
Line 1: Line 1:
 + ====== How to Optimize MySQL DB - WAN Keystone ======
 +
 +===== Reference =====
 +  * [[https://bugs.launchpad.net/fuel/+bug/1316003 | MySQL is not configured with innodb_file_per_table = 1 ]]
 +  * [[https://dev.mysql.com/doc/refman/5.5/en/innodb-parameters.html#sysvar_innodb_file_per_table | nnodb_file_per_table]]
 +  * [[https://bugs.launchpad.net/python-keystoneclient/+bug/1195924 | Keystone client BUG #1195924 - expiration check is using different timezone other than UTC ]]
 +
 +===== PROBLEM =====
 +
 +After one week of activity the ibdata1 file was 8.1G and... growing
 +<code>
 +root@keystone:~# ls -lh /var/lib/mysql/ibdata1 
 +-rw-rw---- 1 mysql mysql 8.1G May 12 13:01 /var/lib/mysql/ibdata1
 +</code>
 +
 +This caused a "disk full problem" in LNGS, followed up by a degradation of the DB and the impossibility to restart mysql on the primary node of the cluster, keystone.pd.infn.it
 +<code>
 +# less  /var/lib/mysql/keystone.err
 +[...]
 +InnoDB: Your database may be corrupt or you may have copied the InnoDB
 +InnoDB: tablespace but not the InnoDB log files. See
 +InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html
 +InnoDB: for more information.
 +140509 12:51:34 [ERROR] Slave SQL: Could not execute Write_rows event on table keystone.token; Duplicate entry '494830e055f015b77457cefe1ff2c9cc' for key 'PRIMARY', Error_co
 +de: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 30457, Error_code: 1062
 +140509 12:51:34 [Warning] WSREP: RBR event 2 Write_rows apply warning: 121, 861859
 +140509 12:51:34 [Warning] WSREP: Failed to apply app buffer: seqno: 861859, status: 1
 +         at galera/src/replicator_smm.cpp:apply_wscoll():57
 +Retrying 2th time
 +[...]
 +140509 12:51:34 [ERROR] Slave SQL: Could not execute Write_rows event on table keystone.token; Duplicate entry '494830e055f015b77457cefe1ff2c9cc' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 30457, Error_code: 1062
 +140509 12:51:34 [Warning] WSREP: RBR event 2 Write_rows apply warning: 121, 861859
 +140509 12:51:34 [ERROR] WSREP: receiving IST failed, node restart required: Failed to apply trx 861859 10 times
 +140509 12:51:34 [Note] WSREP: Closing send monitor...
 +140509 12:51:34 [Note] WSREP: Closed send monitor.
 +140509 12:51:34 [Note] WSREP: gcomm: terminating thread
 +140509 12:51:34 [Note] WSREP: gcomm: joining thread
 +140509 12:51:34 [Note] WSREP: gcomm: closing backend
 +140509 12:51:34 [Note] WSREP: view(view_id(NON_PRIM,157f634e-d5ce-11e3-b07e-265b1d5e5cf0,1461) memb {
 +        dff534f1-d767-11e3-b8c5-ffab4b83e1d2,
 +} joined {
 +} left {
 +} partitioned {
 +        157f634e-d5ce-11e3-b07e-265b1d5e5cf0,
 +        27f583aa-c60a-11e3-af90-6a5bb07181cf,
 +})
 +140509 12:51:34 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
 +140509 12:51:34 [Note] WSREP: view((empty))
 +140509 12:51:34 [Note] WSREP: gcomm: closed
 +140509 12:51:34 [Note] WSREP: Flow-control interval: [16, 16]
 +140509 12:51:34 [Note] WSREP: Received NON-PRIMARY.
 +140509 12:51:34 [Note] WSREP: Shifting JOINER -> OPEN (TO: 862345)
 +140509 12:51:34 [Note] WSREP: Received self-leave message.
 +140509 12:51:34 [Note] WSREP: Flow-control interval: [0, 0]
 +140509 12:51:34 [Note] WSREP: Received SELF-LEAVE. Closing connection.
 +140509 12:51:34 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 862345)
 +140509 12:51:34 [Note] WSREP: RECV thread exiting 0: Success
 +140509 12:51:34 [Note] WSREP: recv_thread() joined.
 +140509 12:51:34 [Note] WSREP: Closing replication queue.
 +140509 12:51:34 [Note] WSREP: Closing slave action queue.
 +140509 12:51:34 [Note] WSREP: /usr/sbin/mysqld: Terminated.
 +Aborted (core dumped)
 +140509 12:51:34 mysqld_safe Number of processes running now: 0
 +140509 12:51:34 mysqld_safe WSREP: not restarting wsrep node automatically
 +140509 12:51:34 mysqld_safe mysqld from pid file /var/lib/mysql/keystone.pid ended
 +</code>
 +
 +The issue was due to the increased number of tokens, ~20000 in 24h, that are registered in the keystone DB, and not flushed.
 +
 +<code>
 +# mysql -u root x xxxxxx
 +mysql> use keystone;
 +mysql> SELECT COUNT(*) FROM token;
 ++----------+
 +| COUNT(*) |
 ++----------+
 +|    18141 |
 ++----------+
 +</code>
 +
 +==== SOLUTION ====
 +
 +In order to reduce the number of tokens registered, to better handle the MySQL DBs it was decided to do the following steps:
 +  * configure mysql with "innodb_file_per_table=1" (see reference)
 +  * set in Keystone the token expiration to 3h
 +  * define a cron, in one of the hosts to flush tokens and optimize the token table in the keystone db
 +
 +
 +**STEPS**:
 +
 +1. Stop mysql and keystone on all nodes of the cluster
 +
 +2. Choose one node and do the following steps:
 +  * start mysql
 +  * mysqldump of all DBs
 +  * stop mysql
 +  * in my.cnf comment all lines containing wsrep* and add innodb_file_per_table in the [mysqld] section
 +    <code bash>
 +    # grep innodb_file_per_table /etc/mysql/my.cnf 
 +    innodb_file_per_table=1</code>
 +  * start mysql
 +  * drop all databases except mysql, information_schema, performance_schema, certs and restore them from the dump previousley done
 +  * stop mysql
 +  * in my.cnf uncomment all wsrep* previously commented and change the wsrep_cluster_address
 +    <code bash>
 +    # grep wsrep_cluster_address /etc/mysql/my.cnf |egrep -v '   #|^#|^$'
 +    wsrep_cluster_address=gcomm://</code>
 +  * due to a bug in Havana (see details bellow and [[expiration check is using different timezone other than UTC  | Ref3]]) - in keystone.conf add "expiration = 10800" in the [token] section
 +    <code bash>
 +    [token]
 +    driver = keystone.token.backends.sql.Token 
 +    # Amount of time a token should remain valid (in seconds)
 +    # expiration = 86400
 +    expiration=10800</code>
 +  * restart mysql
 +  * define a cron to execute the following script every 3h:
 +    <code bash>
 +    #!/bin/bash
 +    /usr/bin/keystone-manage -v -d token_flush
 +    /bin/ls -ltr /var/lib/mysql/keystone/token.ibd
 +    echo "optimize table token" | /usr/bin/mysql -u root -pxxxxxx keystone
 +    /bin/ls -ltr /var/lib/mysql/keystone/token.ibd</code>
 +
 +3. On the other nodes of the cluster:
 +  * stop mysql
 +  * in my.cnf comment all lines containing wsrep*
 +  * start mysql
 +  * drop all databases except mysql, information_schema, performance_schema, certs
 +  * stop mysql
 +  * in /var/lib/mysql remove galera.cache grastate.dat ibdata1 ib_logfile* wsrep_recovery*
 +    <code bash>
 +    # cd /var/lib/mysql
 +    # rm galera.cache grastate.dat ibdata1 ib_logfile* wsrep_recovery* </code>
 +  * add innodb_file_per_table in the [mysqld] section
 +     <code bash>
 +     # grep innodb_file_per_table /etc/mysql/my.cnf 
 +     innodb_file_per_table=1</code>
 +  * start mysql
 +  * stop mysql
 +  * in my.cnf uncomment all wsrep* previously commented and change the wsrep_cluster_address
 +    <code bash>
 +    # grep wsrep_cluster_address /etc/mysql/my.cnf |egrep -v '   #|^#|^$'
 +    wsrep_cluster_address=gcomm://193.206.210.164,90.147.112.3,90.147.102.131</code>
 +  * start mysql
 +
 +4. On the "chosen" server, in my.cnf change:
 +<code bash>
 +    wsrep_cluster_address=gcomm://</code>
 +
 +back to 
 +<code bash>
 +    wsrep_cluster_address=gcomm://193.206.210.164,90.147.112.3,90.147.102.131</code>
 +
 +and restart mysql
 +
 +
 +Now there should be directories for each DB, in particular for keystone:
 +<code bash>
 +    # ls -ltrh /var/lib/mysql/keystone/token.*
 +    -rw-rw---- 1 mysql mysql 8.6K May 18 08:00 /var/lib/mysql/keystone/token.frm
 +    -rw-rw---- 1 mysql mysql 596M May 18 22:26 /var/lib/mysql/keystone/token.ibd</code>
 +
 +
 +=== Details of BUG #1195924 ===
 +
 +[[https://bugs.launchpad.net/python-keystoneclient/+bug/1195924 | BUG # 1195924 | expiration check is using different timezone other than UTC ]]
 +
 +   * the token are generated with start and expiration hour expresed in UTC, but the client libraries do the verification using the local time (in Italy = UTC +2h) during 1 hour.
 +   * the patch was released for/in Icehouse - for which version we can think to leave the tokens duration of 1 h by default and configure the cron to run every hour.
 +
 + 
  

Donate Powered by PHP Valid HTML5 Valid CSS Driven by DokuWiki