How to check/calculate the latency of Galera Cluster replication process

October 05, 2018 09:36

Hello, I configured Galera Cluster to join 3 node. Then, i used HAProxy as a load balancer.
I prepare a simple website to insert, update, or delete data to database. As we know, by using Galera Cluster, the database on 1 node can be replicated to the other nodes.

i'm wondering about how do I check the latency when 1 node replicate the database to the other nodes in the cluster (replication process)

Is it possible to check/calculate it?

Thankyou

Comments

10 comments

Krzysztof Ksiazek

October 05, 2018 09:44
Hi,

Generally speaking, as long as the flow control is not used, Galera is affected by network latency therefore checking it will give you quite a good picture of how the writeset replication performs. Another method would be to track commit times - this will give you an insight into how fast writeset certification works. Still, there is a delay (which depends mostly on the writeset size) between commit and actual update on all of the nodes. Typically this is a fraction of second although for larger write sets it might be even seconds. Unfortunately there are no metrics to track this, as far as I know. The only thing is to query remaining nodes and see how long it took since commit returned for data to show up on all of them.
0

Comment actions Permalink
Jaswenny Huang

October 05, 2018 14:06
Hello,

Thankyou for your answer, i really appreciate it

For your information, i'm using centos 7

How do I track commit times? I've searched about it but i can't get it done

Could you give me more explanation about this sentence below?

"The only thing is to query remaining nodes and see how long it took since commit returned for data to show up on all of them"
0

Comment actions Permalink
Jaswenny Huang

October 05, 2018 15:23
Hello, i found this

Can this link be used as a reference to see the latency of galera replication process?

There are 4 kind of times, which is i dont know what should i use.
```
Minimum / Average / Maximum / Standard Deviation / Sample Size
```
https://jira.percona.com/browse/PMM-586?jql=text%20~%20%22galera%20replication%20latency%22
0

Comment actions Permalink
Krzysztof Ksiazek

October 05, 2018 15:44
Hi,

This particular variable, as far as I can tell, tracks only intra-cluster communication. It can be useful to track the network latency but doesn't seem to tell how long it takes to replicate and apply writesets on all nodes.

Regarding measuring the commit latency, as long as you don't use autocommit, you can time how long it takes to execute COMMIT, that's all. For autocommit traffic, there's no easy way to do so.

Regarding that unclear sentence - what I meant is that you can try to track latency in a following way:
1. Run a query (insert a row, for example)
2. Wait for COMMIT of wait until query is completed on the host you run it
3. Start the clock
4. In the background, query other nodes for that added row
5. Once you see it replicated on all nodes, stop the clock
You'll have detailed data how long it took to replicate and execute given transaction on all nodes in the cluster
0

Comment actions Permalink
Ashraf Sharif

October 08, 2018 03:16
Hi Jaswenny,

In addition to what Krzysztof mentioned in the previous reply, the following reading might be helpful for you to understand Galera replication latency:
https://severalnines.com/blog/how-improve-performance-galera-cluster-mysql-or-mariadb

Regards,
Ashraf
0

Comment actions Permalink
Jaswenny Huang

October 26, 2018 09:41
Hi, Krzysztof and Ashraf
Thankyou so much for your reply and explaination. I'm sorry for my late reply, i didn't see this on email before, until i check it twice.

Regarding measuring the commit latency, i've used a simple web to do the testing of replication database. So i can insert update delete through website.

But, using the web, i think it's autocommit.
So, i tried to write "autocommit=0" in /my.cnf.d/server.cnf and restarting the service of mysql.

Then, i insert a row in node 1 through web ( i hv 3 node with 1 haproxy). There's no row inserted in that node. Then, back to terminal, going to mysql -u root -p, and type "commit";

But still, there's no data inserted into node 1.
Am i doing it wrong?
I really need your suggestion
Thankyou
0

Comment actions Permalink
Paul Namuag

October 29, 2018 15:34
Hi Jaswenny Huang,

Maybe your node1 wasn't connected to the cluster correctly. Have you check the following:

SHOW GLOBAL STATUS LIKE 'wsrep_local%';

and check the values if wsrep_local_state_comment and wsrep_local_state mark as synced. Take note that wsrep_local_state = 4 means synced i.e.

I also wonder why the need to set autocommit=0 in your configuration file if that's what you really need? Anyhow, just let us know how's the progress and how we can be of help.

Thanks.
0

Comment actions Permalink
Jaswenny Huang

October 30, 2018 08:27
Hi, Paul

Thankyou so much for your reply

I'm doing a testing about calculating the latency of Galera Cluster replication process when replicate from 1 node to another.

Krzysztof said that :
"Regarding measuring the commit latency, as long as you don't use autocommit, you can time how long it takes to execute COMMIT, that's all. For autocommit traffic, there's no easy way to do so."

The problem is i'm not using mysql console for the testing. In this condition, im using a simple web to do the testing ( to insert, update, and delete data to database). Which means it's autocommit. So, i set autocommit=0 and do the commit through mysql console, in order to calculate the latency by Krzysztof. But it didn't work (when set autocommit=0, no data inserted)

Do you have any suggestion about measuring the replication latency when i do the testing by using web?
0

Comment actions Permalink
Paul Namuag

October 31, 2018 20:13
Hi Jaswenny Huang,

Sorry if I ask you this, but do you have a fair understanding how does Galera works? The replication is different from the vanilla-type MySQL async replication. Though some people do not call Galera as synchronous, but the way it replicates to the nodes differs. So maybe I can point you to http://galeracluster.com/documentation-webpages/monitoringthecluster.html#checking-the-replication-health and http://galeracluster.com/documentation-webpages/detectingaslownode.html if that could help you out? Ideally, you should check variables wsrep_flow_control_paused and wsrep_flow_control_paused_ns.

Also with regards to the commit, technically, that would work though but you can also set it through a session to have SET autocomit=0; INSERT ...; COMMIT;
0

Comment actions Permalink
Jaswenny Huang

November 12, 2018 07:53
It was really helpful! Thank you so much, Paul.
0

Comment actions Permalink

Please sign in to leave a comment.

Comments

Didn't find what you were looking for?