Garbd cannot connect to cluster?
Hey everyone,
I am trying to create the following out-of-the-box deployment.
Galera cluster: 2-node MariaDB 10.5 + garbd 4.7 on CentOS 7.9 (latest), SELinux/firewall off.
- node 1: 10.10.38.13
- node 2: 10.10.38.14
- garb : 10.10.38.11
It's all deployed via ClusterControl (SSL is enabled, etc.).
Nodes are synced but garb is reporting:
- connection time out and cannot join the cluster
- no nodes coming from prim view, prim not possible.
My knowledge of MySQL clusters is basic, and this is a test/learning environment, so I may be missing something.
I tried pulling the SSL certificates/keys/CAs and launching garbd manually, but the result was the same.
In the node logs (/var/log/mysql/mysqld.log) there is no sign of any connection attempts.
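To separate a plain network problem from an SSL or configuration problem, one option is to test from the garbd host whether TCP port 4567 on each node accepts connections at all. The following is only a minimal sketch: the helper names `can_connect` and `check_peers` are made up for illustration, and a successful TCP connect only shows that the port is reachable, not that the SSL settings on both sides match.

```python
import socket
from typing import Dict, Iterable, Tuple


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_peers(peers: Iterable[Tuple[str, int]], timeout: float = 3.0) -> Dict[str, bool]:
    """Check every (host, port) pair and return {"host:port": reachable}."""
    return {f"{host}:{port}": can_connect(host, port, timeout) for host, port in peers}


# Example call for the two nodes in this post (run on the garbd host):
#   print(check_peers([("10.10.38.13", 4567), ("10.10.38.14", 4567)]))
```

If both nodes show up as reachable, the problem is more likely in the replication settings than in the network path.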
On the witness:
/etc/garb.conf
address = gcomm://10.10.38.13:4567,10.10.38.14:4567
group = TEST-CL
options = gmcast.listen_addr=tcp://0.0.0.0:4567;socket.ssl_cert=/etc/mysql/certs/galera_rep.crt;socket.ssl_key=/etc/mysql/certs/galera_rep.key;socket.ssl_cipher=AES128-SHA
log = /var/log/garbd.log
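Since SSL is enabled, it may help to compare the `socket.ssl*` settings garbd is given with the ones the nodes use (on a node, `SHOW VARIABLES LIKE 'wsrep_provider_options'` returns a similar semicolon-separated string). Below is a rough sketch for doing that comparison; the function names are hypothetical and it assumes both strings use the `key=value;key=value` form shown above.

```python
from typing import Dict, Optional, Tuple


def parse_provider_options(options: str) -> Dict[str, str]:
    """Split a 'key=value;key=value' option string into a dict."""
    parsed = {}
    for item in options.split(";"):
        item = item.strip()
        if not item:
            continue
        # Split on the first '=' only, so values such as tcp://... stay intact.
        key, _, value = item.partition("=")
        parsed[key.strip()] = value.strip()
    return parsed


def ssl_settings(options: str) -> Dict[str, str]:
    """Keep only the socket.ssl* entries of an option string."""
    return {k: v for k, v in parse_provider_options(options).items()
            if k.startswith("socket.ssl")}


def diff_settings(a: Dict[str, str], b: Dict[str, str]) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return the keys whose values differ, as {key: (value_in_a, value_in_b)}."""
    return {k: (a.get(k), b.get(k))
            for k in sorted(a.keys() | b.keys())
            if a.get(k) != b.get(k)}
```

Calling `diff_settings(ssl_settings(garbd_options), ssl_settings(node_options))` with the two strings lists any SSL option that is set differently or is missing on one side.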
Here is the garbd log:
2021-04-09 05:33:35.013 INFO: Read config:
daemon: 1
name: garb
address: gcomm://10.10.38.13:4567,10.10.38.14:4567
group: TEST-CL
sst: trivial
donor:
options: gmcast.listen_addr=tcp://0.0.0.0:4567;socket.ssl_cert=/etc/mysql/certs/galera_rep.crt;socket.ssl_key=/etc/mysql/certs/galera_rep.key;socket.ssl_cipher=AES128-SHA; gcs.fc_limit=9999999; gcs.fc_factor=1.0; gcs.fc_master_slave=yes
cfg: /etc/garbd.cnf
log: /var/log/garbd.log
2021-04-09 05:33:35.017 INFO: protonet asio version 0
2021-04-09 05:33:35.017 INFO: Using CRC-32C for message checksums.
2021-04-09 05:33:35.017 INFO: backend: asio
2021-04-09 05:33:35.018 INFO: gcomm thread scheduling priority set to other:0
2021-04-09 05:33:35.018 WARN: access file(./gvwstate.dat) failed(No such file or directory)
2021-04-09 05:33:35.018 INFO: restore pc from disk failed
2021-04-09 05:33:35.018 INFO: GMCast version 0
2021-04-09 05:33:35.018 INFO: (5d4c57ba-ba6b, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2021-04-09 05:33:35.018 INFO: (5d4c57ba-ba6b, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2021-04-09 05:33:35.018 INFO: EVS version 1
2021-04-09 05:33:35.018 INFO: gcomm: connecting to group 'TEST-CL', peer '10.10.38.13:4567,10.10.38.14:4567'
2021-04-09 05:33:38.020 INFO: (5d4c57ba-ba6b, 'tcp://0.0.0.0:4567') connection to peer 00000000-0000 with addr tcp://10.10.38.13:4567 timed out, no messages seen in PT3S, socket stats: rtt: 618 rttvar: 309 rto: 201000 lost: 0 last_data_recv: 49249008 cwnd: 10 last_queued_since: 3000295346 last_delivered_since: 3000295346 send_queue_length: 0 send_queue_bytes: 0
2021-04-09 05:33:38.020 INFO: (5d4c57ba-ba6b, 'tcp://0.0.0.0:4567') connection to peer 00000000-0000 with addr tcp://10.10.38.14:4567 timed out, no messages seen in PT3S, socket stats: rtt: 414 rttvar: 207 rto: 200000 lost: 0 last_data_recv: 49249009 cwnd: 10 last_queued_since: 3000732789 last_delivered_since: 3000732789 send_queue_length: 0 send_queue_bytes: 0
2021-04-09 05:33:38.020 INFO: EVS version upgrade 0 -> 1
2021-04-09 05:33:38.021 INFO: PC protocol upgrade 0 -> 1
2021-04-09 05:33:38.021 WARN: no nodes coming from prim view, prim not possible
2021-04-09 05:33:38.021 INFO: view(view_id(NON_PRIM,5d4c57ba-ba6b,1) memb {
5d4c57ba-ba6b,0
} joined {
} left {
} partitioned {
})
2021-04-09 05:33:38.521 WARN: last inactive check more than PT1.5S ago (PT3.50243S), skipping check
2021-04-09 05:33:42.021 INFO: (5d4c57ba-ba6b, 'tcp://0.0.0.0:4567') connection to peer 00000000-0000 with addr tcp://10.10.38.13:4567 timed out, no messages seen in PT3S, socket stats: rtt: 654 rttvar: 327 rto: 200000 lost: 0 last_data_recv: 49253009 cwnd: 10 last_queued_since: 2999647151 last_delivered_since: 2999647151 send_queue_length: 0 send_queue_bytes: 0
2021-04-09 05:33:45.022 INFO: (5d4c57ba-ba6b, 'tcp://0.0.0.0:4567') connection to peer 00000000-0000 with addr tcp://10.10.38.14:4567 timed out, no messages seen in PT3S, socket stats: rtt: 569 rttvar: 284 rto: 201000 lost: 0 last_data_recv: 49256010 cwnd: 10 last_queued_since: 2999626360 last_delivered_since: 2999626360 send_queue_length: 0 send_queue_bytes: 0
2021-04-09 05:33:48.027 INFO: (5d4c57ba-ba6b, 'tcp://0.0.0.0:4567') connection to peer 00000000-0000 with addr tcp://10.10.38.13:4567 timed out, no messages seen in PT3S, socket stats: rtt: 567 rttvar: 283 rto: 200000 lost: 0 last_data_recv: 49259016 cwnd: 10 last_queued_since: 3004261735 last_delivered_since: 3004261735 send_queue_length: 0 send_queue_bytes: 0
2021-04-09 05:33:51.526 INFO: (5d4c57ba-ba6b, 'tcp://0.0.0.0:4567') connection to peer 00000000-0000 with addr tcp://10.10.38.14:4567 timed out, no messages seen in PT3S, socket stats: rtt: 428 rttvar: 214 rto: 200000 lost: 0 last_data_recv: 49262515 cwnd: 10 last_queued_since: 3497745923 last_delivered_since: 3497745923 send_queue_length: 0 send_queue_bytes: 0
2021-04-09 05:33:54.527 INFO: (5d4c57ba-ba6b, 'tcp://0.0.0.0:4567') connection to peer 00000000-0000 with addr tcp://10.10.38.13:4567 timed out, no messages seen in PT3S, socket stats: rtt: 607 rttvar: 303 rto: 200000 lost: 0 last_data_recv: 49265516 cwnd: 10 last_queued_since: 2999352449 last_delivered_since: 2999352449 send_queue_length: 0 send_queue_bytes: 0
2021-04-09 05:33:57.528 INFO: (5d4c57ba-ba6b, 'tcp://0.0.0.0:4567') connection to peer 00000000-0000 with addr tcp://10.10.38.14:4567 timed out, no messages seen in PT3S, socket stats: rtt: 422 rttvar: 211 rto: 200000 lost: 0 last_data_recv: 49268516 cwnd: 10 last_queued_since: 2999620548 last_delivered_since: 2999620548 send_queue_length: 0 send_queue_bytes: 0
2021-04-09 05:34:00.529 INFO: (5d4c57ba-ba6b, 'tcp://0.0.0.0:4567') connection to peer 00000000-0000 with addr tcp://10.10.38.13:4567 timed out, no messages seen in PT3S, socket stats: rtt: 301 rttvar: 150 rto: 200000 lost: 0 last_data_recv: 49271518 cwnd: 10 last_queued_since: 3000628968 last_delivered_since: 3000628968 send_queue_length: 0 send_queue_bytes: 0
2021-04-09 05:34:03.530 INFO: (5d4c57ba-ba6b, 'tcp://0.0.0.0:4567') connection to peer 00000000-0000 with addr tcp://10.10.38.14:4567 timed out, no messages seen in PT3S, socket stats: rtt: 356 rttvar: 178 rto: 200000 lost: 0 last_data_recv: 49274519 cwnd: 10 last_queued_since: 2999730135 last_delivered_since: 2999730135 send_queue_length: 0 send_queue_bytes: 0
2021-04-09 05:34:06.531 INFO: (5d4c57ba-ba6b, 'tcp://0.0.0.0:4567') connection to peer 00000000-0000 with addr tcp://10.10.38.13:4567 timed out, no messages seen in PT3S, socket stats: rtt: 502 rttvar: 251 rto: 200000 lost: 0 last_data_recv: 49277520 cwnd: 10 last_queued_since: 2999591262 last_delivered_since: 2999591262 send_queue_length: 0 send_queue_bytes: 0
2021-04-09 05:34:08.037 INFO: PC protocol downgrade 1 -> 0
2021-04-09 05:34:08.037 INFO: view((empty))
2021-04-09 05:34:08.038 ERROR: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
at /home/buildbot/buildbot/build/gcomm/src/pc.cpp:connect():160
2021-04-09 05:34:08.038 ERROR: /home/buildbot/buildbot/build/gcs/src/gcs_core.cpp:gcs_core_open():220: Failed to open backend connection: -110 (Connection timed out)
2021-04-09 05:34:08.038 ERROR: /home/buildbot/buildbot/build/gcs/src/gcs.cpp:gcs_open():1632: Failed to open channel 'TEST-CL' at 'gcomm://10.10.38.13:4567,10.10.38.14:4567': -110 (Connection timed out)
2021-04-09 05:34:08.038 INFO: Shifting CLOSED -> DESTROYED (TO: 0)
2021-04-09 05:34:08.039 FATAL: Exception in creating receive loop: Failed to open connection to group: 110 (Connection timed out)
at /home/buildbot/buildbot/build/garb/garb_gcs.cpp:Gcs():35
Any advice is appreciated.