Constant hanging of wsrep in pre-commit stage

Comments

  • Amin Sh

    Hey Jason,

    Did you find any way to get rid of this? I have the same problem.

  • Krzysztof Ksiazek

    Connections hanging in the pre-commit state are the result of a slow writeset certification process on the remaining nodes in the cluster. Certification has to happen before the commit, hence the pre-commit stage. There is no one-size-fits-all solution, as the culprit differs from case to case, but there are a couple of things you can try. For starters, if CPU utilization is within limits and I/O looks fine, try increasing the number of workers (wsrep_slave_threads). This may help to use resources more fully and speed up writeset certification. Keep in mind that there is no point in raising it above the value of the wsrep_cert_deps_distance status counter. Also, if wsrep_slave_threads is already set to more than twice the number of CPU cores, a further increase is unlikely to help.
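
To make the two rules of thumb above concrete, here is a minimal Python sketch that turns them into a ceiling for wsrep_slave_threads. The function name and the way the two limits are combined (taking the smaller one) are my own assumptions for illustration; the inputs are the wsrep_cert_deps_distance status value and the CPU core count, which you would read from your own node.

```python
def suggested_max_worker_threads(cert_deps_distance: float, cpu_cores: int) -> int:
    """Rough ceiling for wsrep_slave_threads (hypothetical helper, not part of Galera).

    Combines the two rules of thumb from the comment above:
      * no point going above wsrep_cert_deps_distance
      * going above 2 x CPU cores is unlikely to help
    """
    if cert_deps_distance < 0:
        raise ValueError("cert_deps_distance must be non-negative")
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")

    # The status counter is an average, so round it down; never suggest fewer than 1.
    by_parallelism = max(1, int(cert_deps_distance))
    by_cpu = 2 * cpu_cores

    # Assumption: whichever limit is lower is the one that binds.
    return min(by_parallelism, by_cpu)


if __name__ == "__main__":
    # Example inputs are made up; substitute values from your own node.
    print(suggested_max_worker_threads(cert_deps_distance=23.9, cpu_cores=8))
```

This only gives an upper bound to experiment under, not a recommended setting; the right value still depends on the workload.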

    Another thing you may try is setting gcs.fc_limit to a higher value. This setting determines how large the queue of pending writesets can grow before writeset replication is paused. Setting it higher may help to absorb temporary spikes in the number of DML statements that would otherwise cause writeset replication to pause.
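
The effect of a higher limit on short spikes can be illustrated with a toy model. The Python sketch below is not how Galera implements flow control; it assumes a single queue that is drained at a fixed rate per tick, pauses new writes whenever the queue is above the limit, and simply counts the ticks spent paused. All names and numbers are hypothetical.

```python
def paused_ticks(arrivals, apply_rate, fc_limit):
    """Count ticks during which writes are paused in a toy flow-control model.

    arrivals   -- writesets arriving in each tick (list of ints)
    apply_rate -- writesets the node can apply per tick
    fc_limit   -- queue length above which replication is paused
    """
    queue = 0
    paused = False
    paused_count = 0
    for incoming in arrivals:
        if paused:
            # Writers are blocked this tick; their writesets never enter the queue.
            paused_count += 1
        else:
            queue += incoming
        # The node applies what it can during the tick.
        queue = max(0, queue - apply_rate)
        # Pause while the backlog exceeds the limit, resume once it is back under.
        paused = queue > fc_limit
    return paused_count


if __name__ == "__main__":
    spike = [10, 10, 60, 60, 10, 10, 10, 10]  # a short burst of DML
    print(paused_ticks(spike, apply_rate=20, fc_limit=16))   # low limit
    print(paused_ticks(spike, apply_rate=20, fc_limit=100))  # higher limit
```

In this model the low limit pauses writers during the burst, while the higher limit lets the backlog build up and drain on its own. The trade-off, which the model does not show, is that the node is allowed to fall further behind.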

    If the problem is caused by too many connections running queries at the same time, or simply by a lack of resources, there is not much you can do on the database side. What you can do is implement some kind of connection pooling and limit the number of connections to the database.
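
As a rough idea of what limiting connections through a pool looks like on the application side, here is a minimal Python sketch. The SimplePool class and its connect argument are hypothetical; in practice you would more likely use the pooling built into your driver or a proxy, and connect would be your driver's own connect call.

```python
import queue
from contextlib import contextmanager


class SimplePool:
    """Fixed-size pool: at most `size` connections ever exist (illustrative only)."""

    def __init__(self, connect, size):
        # `connect` is any zero-argument callable returning a connection object.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks until a connection is free; raises queue.Empty if `timeout` expires.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always hand the connection back, even if the caller raised.
            self._idle.put(conn)


if __name__ == "__main__":
    pool = SimplePool(connect=object, size=2)  # `object` stands in for a real driver
    with pool.connection() as conn:
        pass  # run queries on conn here
```

Callers that cannot get a connection wait in the application instead of piling up as extra sessions on the cluster, which caps the number of queries running at the same time.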

    As I said, it's tricky to say exactly how to solve this problem without detailed knowledge of the workload, but I hope this points you in the right direction for further investigation.
