Constant hanging of wsrep in pre-commit stage and Preparing for TO isolation Happening Now

Comments

8 comments

  • Jason Mallory

    On node 3:

     

    | 414015 | stsadmin        | 10.1.105.42:34607 | axe_sts    | Query   |  12660 | wsrep waiting on replaying   | LOAD DATA LOCAL INFILE '/var/ose/projects/ericsson_stat/axe_sts/RAW/GW01STS_201509090330_6002/ECPOO |         0 |             0 |
    | 415780 | stsadmin        | 10.1.105.42:37379 | axe_sts    | Query   |  11340 | wsrep waiting on replaying   | LOAD DATA LOCAL INFILE '/var/ose/projects/ericsson_stat/axe_sts/RAW/GW01STS_201509090345_6003/APDIS |         0 |             0 |
    | 416646 | stsadmin        | 10.1.105.42:38700 | axe_sts    | Query   |  10415 | wsrep waiting on replaying   | LOAD DATA LOCAL INFILE '/var/ose/projects/ericsson_stat/axe_sts/RAW/GW01STS_201509090345_6003/APPLA |         0 |             0 |
    | 416683 | stsadmin        | 10.1.105.42:38759 | axe_sts    | Query   |  10327 | wsrep waiting on replaying   | LOAD DATA LOCAL INFILE '/var/ose/projects/ericsson_stat/axe_sts/RAW/GW01STS_201509090345_6003/AUCMA |         0 |             0 |
    | 417596 | stsadmin        | 10.1.105.42:40203 | axe_sts    | Query   |   9323 | wsrep waiting on replaying   | LOAD DATA LOCAL INFILE '/var/ose/projects/ericsson_stat/axe_sts/RAW/GW01STS_201509090345_6003/AUCSU |         0 |             0 |
    | 418513 | stsadmin        | 10.1.105.42:41639 | axe_sts    | Query   |   8321 | wsrep waiting on replaying   | LOAD DATA LOCAL INFILE '/var/ose/projects/ericsson_stat/axe_sts/RAW/GW01STS_201509090345_6003/AUTHE |         0 |             0 |
    | 418702 | stsadmin        | 10.1.105.42:41935 | axe_sts    | Query   |   8075 | wsrep waiting on replaying   | LOAD DATA LOCAL INFILE '/var/ose/projects/ericsson_stat/axe_sts/RAW/GW01STS_201509090345_6003/AUTHS |         0 |             0 |
    | 419245 | stsadmin        | 10.1.105.42:42787 | axe_sts    | Query   |   7458 | wsrep waiting on replaying   | LOAD DATA LOCAL INFILE '/var/ose/projects/ericsson_stat/axe_sts/RAW/GW01STS_201509090345_6003/BSCST |         0 |             0 |
    | 419703 | stsadmin        | 10.1.105.42:43510 | axe_sts    | Query   |   6929 | wsrep waiting on replaying   | LOAD DATA LOCAL INFILE '/var/ose/projects/ericsson_stat/axe_sts/RAW/GW01STS_201509090345_6003/CHASS |         0 |             0 |
    | 420016 | gtowrite        | 10.1.105.42:44010 | gtocdr     | Query   |   6956 | wsrep waiting on replaying   | LOAD DATA  LOCAL INFILE '/opt/www/gtoweb/dataimport/axecdr/med/ecdrsmall' INTO TABLE ecdrsmall FIELD |         0 |             0 |
    | 420108 | stsadmin        | 10.1.105.42:44155 | axe_sts    | Query   |   6452 | wsrep waiting on replaying   | LOAD DATA LOCAL INFILE '/var/ose/projects/ericsson_stat/axe_sts/RAW/GW01STS_201509090345_6003/DIGPA |         0 |             0 |
    | 420959 | stsadmin        | 10.1.105.42:45513 | axe_sts    | Query   |   5504 | wsrep waiting on replaying   | LOAD DATA LOCAL INFILE '/var/ose/projects/ericsson_stat/axe_sts/RAW/GW01STS_201509090345_6003/DTIST |         0 |             0 |
    | 421659 | stsadmin        | 10.1.105.42:46613 | axe_sts    | Query   |   4721 | wsrep waiting on replaying   | LOAD DATA LOCAL INFILE '/var/ose/projects/ericsson_stat/axe_sts/RAW/GW01STS_201509090345_6003/ECPOO |         0 |             0 |
    | 422321 | stsadmin        | 10.1.105.42:47664 | axe_sts    | Query   |   3972 | wsrep waiting on replaying   | LOAD DATA LOCAL INFILE '/var/ose/projects/ericsson_stat/axe_sts/RAW/GW01STS_201509090345_6003/EOS.d |         0 |             0 |
    | 422674 | stsadmin        | 10.1.105.42:48223 | axe_sts    | Query   |   3602 | wsrep waiting on replaying   | LOAD DATA LOCAL INFILE '/var/ose/projects/ericsson_stat/axe_sts/RAW/GW01STS_201509090345_6003/EOS.d |         0 |             0 |
    | 424004 | stsadmin        | 10.1.105.42:50327 | axe_sts    | Query   |   2149 | wsrep waiting on replaying   | LOAD DATA LOCAL INFILE '/var/ose/projects/ericsson_stat/axe_sts/RAW/GW01STS_201509090345_6003/EQIDC |         0 |             0 |
    | 424233 | stsadmin        | 10.1.105.42:50694 | axe_sts    | Query   |   1854 | wsrep waiting on replaying   | LOAD DATA LOCAL INFILE '/var/ose/projects/ericsson_stat/axe_sts/RAW/GW01STS_201509090345_6003/gener |         0 |             0 |
    | 425236 | stsadmin        | 10.1.105.42:52294 | axe_sts    | Query   |   1457 | wsrep waiting on replaying   | LOAD DATA LOCAL INFILE '/var/ose/projects/ericsson_stat/axe_sts/RAW/GW01STS_201509090400_6004/APDIS |         0 |             0 |
    | 425253 | gtowrite        | 10.1.105.42:52322 | gtoauth    | Query   |   1441 | wsrep waiting on replaying   | Insert into loghist(stamp, userid, status) values(now(), 'cwalker', 'S') |         0 |             0 |
    | 426112 | stsadmin        | 10.1.105.42:53708 | axe_sts    | Query   |    467 | wsrep waiting on replaying   | LOAD DATA LOCAL INFILE '/var/ose/projects/ericsson_stat/axe_sts/RAW/GW01STS_201509090400_6004/APPLA |         0 |             0 |
    | 426496 | root            | localhost         | NULL       | Sleep   |    109 |                              | NULL |         0 |             0 |
    | 426602 | root            | localhost         | NULL       | Query   |      0 | init                         | show processlist

  • Jason Mallory

    Killed node 3; nodes 1 and 2 then cleared. We had to restart because clients were waiting. We need answers on this.

  • Jason Mallory

    Node 3 logs

  • Johan

     

    Hi,

    I am not sure how the data files are loaded in, but it seems that they are loaded in parallel on multiple servers.

    E.g., I see multiple LOAD DATA ... statements for the table 'ecdrsmall'.

    Furthermore, I don't know what the table 'ecdrsmall' looks like (DDL).

    Is it an InnoDB table? (MyISAM replication is very limited in Galera.)

    Does it have a PK?

    Do you use TRIGGERS / FOREIGN KEYS?
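
    For reference, a rough sketch of how those questions could be answered on any one node. It assumes the table lives in the 'gtocdr' schema, which is only a guess based on the db column of the ecdrsmall thread in the processlist above; adjust the schema name if that is wrong.

    ```sql
    -- Assumption: 'ecdrsmall' is in schema 'gtocdr' (taken from the processlist, not confirmed).

    -- Full DDL: shows the storage engine, keys and any foreign keys in one go
    SHOW CREATE TABLE gtocdr.ecdrsmall;

    -- Storage engine of the table
    SELECT TABLE_NAME, ENGINE
      FROM information_schema.TABLES
     WHERE TABLE_SCHEMA = 'gtocdr' AND TABLE_NAME = 'ecdrsmall';

    -- Primary key columns (an empty result means the table has no PK)
    SELECT COLUMN_NAME, ORDINAL_POSITION
      FROM information_schema.KEY_COLUMN_USAGE
     WHERE TABLE_SCHEMA = 'gtocdr' AND TABLE_NAME = 'ecdrsmall'
       AND CONSTRAINT_NAME = 'PRIMARY'
     ORDER BY ORDINAL_POSITION;

    -- Triggers defined on the table
    SELECT TRIGGER_NAME, ACTION_TIMING, EVENT_MANIPULATION
      FROM information_schema.TRIGGERS
     WHERE EVENT_OBJECT_SCHEMA = 'gtocdr' AND EVENT_OBJECT_TABLE = 'ecdrsmall';

    -- Foreign keys declared on the table
    SELECT CONSTRAINT_NAME, REFERENCED_TABLE_NAME
      FROM information_schema.REFERENTIAL_CONSTRAINTS
     WHERE CONSTRAINT_SCHEMA = 'gtocdr' AND TABLE_NAME = 'ecdrsmall';
    ```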


    I suggest you try and load the data files onto one server only.

    If you have a load balancer, don't load the files through it. Instead, connect directly to one node, and one node only.

     


    BR
    johan

  • Johan

    You can also try setting the following (if the problem is flow control):

     

    SET GLOBAL wsrep_provider_options='gcs.fc_limit=1024';

    Do this on ALL Galera nodes. The nodes must be in a good state when you set it, i.e., not in the state shown above: setting it once you already have a problem is too late.
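
    To judge whether flow control is actually involved, the standard wsrep status counters can be checked on each node. This is only a sketch; which values count as "too high" depends on your workload.

    ```sql
    -- Fraction of time (0.0 to 1.0) since the last FLUSH STATUS that
    -- replication was paused by flow control
    SHOW GLOBAL STATUS LIKE 'wsrep_flow_control_paused';

    -- Number of flow control pause messages this node has sent
    SHOW GLOBAL STATUS LIKE 'wsrep_flow_control_sent';

    -- Average length of the receive queue; a value that keeps growing
    -- suggests this node cannot apply write-sets fast enough
    SHOW GLOBAL STATUS LIKE 'wsrep_local_recv_queue_avg';

    -- Current provider options, including gcs.fc_limit
    SHOW GLOBAL VARIABLES LIKE 'wsrep_provider_options';
    ```

    A node that sends many flow control messages and has a long receive queue is likely the one slowing down the rest of the cluster.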


    BR
    johan

  • Johan

    1024 is a bit extreme:

    SET GLOBAL wsrep_provider_options='gcs.fc_limit=256';

    is probably more sane. It is hard to say; it depends on disk, network, number of cores, etc., and of course on the structure of the tables you are trying to load.

  • Jason Mallory

    Is there a way to make one node a master and the other two nodes backups?

  • Johan

    Hi,

    Galera itself does not facilitate read/write splitting.

    If you don't use a load balancer, you could simply connect to one Galera node and, if that node fails, fail over to another node. But all connections must fail over to the same Galera node.
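
    Before pointing all connections at a failover node, it may be worth confirming that the node is usable. A minimal sketch using the standard wsrep status variables; the values in the comments are what a healthy node normally reports.

    ```sql
    -- Node belongs to the primary component (normally 'Primary')
    SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status';

    -- Node accepts queries (normally 'ON')
    SHOW GLOBAL STATUS LIKE 'wsrep_ready';

    -- Node is fully caught up with the cluster (normally 'Synced')
    SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';
    ```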

    If you use HAProxy, see: http://www.severalnines.com/blog/avoiding-deadlocks-galera-set-haproxy-single-node-writes-and-multi-node-reads

    BR
    johan

     
