How to check/calculate the latency of Galera Cluster replication process

Comments

10 comments

  • Avatar
    Krzysztof Ksiazek

    Hi,

    Generally speaking, as long as the flow control is not used, Galera is affected by network latency therefore checking it will give you quite a good picture of how the writeset replication performs. Another method would be to track commit times - this will give you an insight into how fast writeset certification works. Still, there is a delay (which depends mostly on the writeset size) between commit and actual update on all of the nodes. Typically this is a fraction of second although for larger write sets it might be even seconds. Unfortunately there are no metrics to track this, as far as I know. The only thing is to query remaining nodes and see how long it took since commit returned for data to show up on all of them.

    0
    Comment actions Permalink
  • Avatar
    Jaswenny Huang

    Hello,

    Thankyou for your answer, i really appreciate it

    For your information, i'm using centos 7

    How do I track commit times? I've searched about it but i can't get it done

    Could you give me more explanation about this sentence below?

    "The only thing is to query remaining nodes and see how long it took since commit returned for data to show up on all of them"

     

    0
    Comment actions Permalink
  • Avatar
    Jaswenny Huang

    Hello, i found this

    Can this link be used as a reference to see the latency of galera replication process?

    There are 4 kind of times, which is i dont know what should i use.

    Minimum / Average / Maximum / Standard Deviation / Sample Size

    https://jira.percona.com/browse/PMM-586?jql=text%20~%20%22galera%20replication%20latency%22

    0
    Comment actions Permalink
  • Avatar
    Krzysztof Ksiazek

    Hi,

    This particular variable, as far as I can tell, tracks only intra-cluster communication. It can be useful to track the network latency but doesn't seem to tell how long it takes to replicate and apply writesets on all nodes.

    Regarding measuring the commit latency, as long as you don't use autocommit, you can time how long it takes to execute COMMIT, that's all. For autocommit traffic, there's no easy way to do so.

    Regarding that unclear sentence - what I meant is that you can try to track latency in a following way:
    1. Run a query (insert a row, for example)
    2. Wait for COMMIT of wait until query is completed on the host you run it
    3. Start the clock
    4. In the background, query other nodes for that added row
    5. Once you see it replicated on all nodes, stop the clock
    You'll have detailed data how long it took to replicate and execute given transaction on all nodes in the cluster

    0
    Comment actions Permalink
  • Avatar
    Ashraf Sharif

    Hi Jaswenny,

    In addition to what Krzysztof mentioned in the previous reply, the following reading might be helpful for you to understand Galera replication latency:
    https://severalnines.com/blog/how-improve-performance-galera-cluster-mysql-or-mariadb 

    Regards,
    Ashraf

    0
    Comment actions Permalink
  • Avatar
    Jaswenny Huang

    Hi, Krzysztof and Ashraf
    Thankyou so much for your reply and explaination. I'm sorry for my late reply, i didn't see this on email before, until i check it twice.

    Regarding measuring the commit latency, i've used a simple web to do the testing of replication database. So i can insert update delete through website.

    But, using the web, i think it's autocommit.
    So, i tried to write "autocommit=0" in /my.cnf.d/server.cnf and restarting the service of mysql.

    Then, i insert a row in node 1 through web ( i hv 3 node with 1 haproxy). There's no row inserted in that node. Then, back to terminal, going to mysql -u root -p, and type "commit";

    But still, there's no data inserted into node 1.
    Am i doing it wrong?
    I really need your suggestion
    Thankyou

    0
    Comment actions Permalink
  • Avatar
    Paul Namuag

    Hi Jaswenny Huang,

     

    Maybe your node1 wasn't connected to the cluster correctly. Have you check the following:

    SHOW GLOBAL STATUS LIKE 'wsrep_local%'; 

    and check the values if wsrep_local_state_comment and wsrep_local_state mark as synced. Take note that wsrep_local_state = 4 means synced i.e.

    I also wonder why the need to set autocommit=0 in your configuration file if that's what you really need? Anyhow, just let us know how's the progress and how we can be of help.

     

    Thanks.

    0
    Comment actions Permalink
  • Avatar
    Jaswenny Huang

    Hi, Paul

    Thankyou so much for your reply

    I'm doing a testing about calculating the latency of Galera Cluster replication process when replicate from 1 node to another.

    Krzysztof said that :
    "Regarding measuring the commit latency, as long as you don't use autocommit, you can time how long it takes to execute COMMIT, that's all. For autocommit traffic, there's no easy way to do so."

    The problem is i'm not using mysql console for the testing. In this condition, im using a simple web to do the testing ( to insert, update, and delete data to database). Which means it's autocommit. So, i set autocommit=0 and do the commit through mysql console, in order to calculate the latency by Krzysztof. But it didn't work (when set autocommit=0, no data inserted)

    Do you have any suggestion about measuring the replication latency when i do the testing by using web?

    0
    Comment actions Permalink
  • Avatar
    Paul Namuag

    Hi Jaswenny Huang,

    Sorry if I ask you this, but do you have a fair understanding how does Galera works? The replication is different from the vanilla-type MySQL async replication. Though some people do not call Galera as synchronous, but the way it replicates to the nodes differs. So maybe I can point you to http://galeracluster.com/documentation-webpages/monitoringthecluster.html#checking-the-replication-health and http://galeracluster.com/documentation-webpages/detectingaslownode.html if that could help you out? Ideally, you should check variables wsrep_flow_control_paused and wsrep_flow_control_paused_ns

    Also with regards to the commit, technically, that would work though but you can also set it through a session to have SET autocomit=0; INSERT ...; COMMIT;

     

    0
    Comment actions Permalink
  • Avatar
    Jaswenny Huang

    It was really helpful! Thank you so much, Paul.

    0
    Comment actions Permalink

Please sign in to leave a comment.

Powered by Zendesk