This article describes how to add an existing cluster to ClusterControl.
Supported clusters are:
- Galera Clusters
- Collection of MySQL Servers
- MongoDB/TokuMX
Assumptions:
- Use a dedicated server for the CMON Controller and UI. Do not co-locate it with the cluster nodes.
- Use the same OS for the Controller/UI server as for the cluster nodes. Do not mix operating systems; it is untested.
- This guide assumes that you install as root. Read more on server requirements.
Outline of steps:
- Setup SSH to the cluster nodes
- Preparation of the monitored MySQL servers (part of the cluster)
- Installation
- Add the existing cluster
Limitations:
- All nodes must be running and reachable. For Galera, all nodes must be SYNCED when adding the existing cluster.
- If the controller is deployed on an IP address (hostname=<ipaddr> in /etc/cmon.cnf) and the MySQL/Galera server nodes are deployed without skip_name_resolve, then GRANTs may fail. http://54.248.83.112/bugzilla/show_bug.cgi?id=141
- GALERA: If wsrep_cluster_address contains a garbd, the installation may fail.
- GALERA: Auto-detection of the Galera nodes is based on wsrep_cluster_address. At least one Galera node must have wsrep_cluster_address=S1,..,Sn set, where S1 to Sn denote the Galera nodes that are part of the Galera cluster (a quick check for this and for skip_name_resolve is shown after this list).
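You can quickly check both settings on each MySQL/Galera node, for example like this (a minimal sketch; adjust the user, password and port to your setup):
mysql -uroot -p -h127.0.0.1 -P3306 -e "show global variables like 'wsrep_cluster_address'"
mysql -uroot -p -h127.0.0.1 -P3306 -e "show global variables like 'skip_name_resolve'"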
Setup SSH from Controller -> Cluster, and Controller -> Controller
On the controller do, as user 'root':
ssh-keygen -t rsa
(press enter on all questions)
For each cluster node do:
ssh-copy-id root@<cluster_node_ip>
On the controller do (controller needs to be able to ssh to itself):
ssh-copy-id root@<controller_ip>
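To verify that the passwordless SSH setup works, the following should print the remote hostname without prompting for a password (<cluster_node_ip> and <controller_ip> are placeholders for your own addresses):
ssh root@<cluster_node_ip> hostname
ssh root@<controller_ip> hostname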
SSH as a Non-root user
For now you need to set up passwordless sudo for that user. Read more on server requirements.
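As an example, to give a non-root user passwordless sudo (here 'cmon' is only an example username; use the user you intend to connect with), something along these lines works on most distributions:
echo "cmon ALL=(ALL) NOPASSWD: ALL" | sudo tee /etc/sudoers.d/cmon
sudo chmod 0440 /etc/sudoers.d/cmon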
Monitored MySQL Servers - GRANTs
This section applies to "Galera clusters" and "Collection of MySQL Servers".
The MySQL 'root' user on the monitored MySQL servers must have its privileges granted WITH GRANT OPTION.
On the monitored MySQL servers you must be able to connect like this:
mysql -uroot -p<root password> -h127.0.0.1 -P3306 #change port from 3306 to the MySQL port you use.
mysql -uroot -p<root password> -P3306 #change port from 3306 to the MySQL port you use.
and NOT LIKE THIS:
mysql -uroot <--- INVALID EXAMPLE
If you cannot connect to the MySQL server(s) as shown in the examples above, you need to run:
mysql> GRANT ALL ON *.* TO 'root'@'127.0.0.1' IDENTIFIED BY '<root password>' WITH GRANT OPTION;
mysql> GRANT ALL ON *.* TO 'root'@'localhost' IDENTIFIED BY '<root password>' WITH GRANT OPTION;
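To verify that the grants are in place you can then run, for example:
mysql> SHOW GRANTS FOR 'root'@'127.0.0.1';
mysql> SHOW GRANTS FOR 'root'@'localhost';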
Installation
Follow the instructions below. The script installs the CMON Controller and the UI with a minimal configuration, and also installs a MySQL server (if none is installed already) to host the monitoring and management data.
$ wget http://severalnines.com/downloads/cmon/install-cc.sh
$ chmod +x install-cc.sh
$ sudo ./install-cc.sh
A longer version of the instructions is located here: http://support.severalnines.com/entries/23737976-ClusterControl-Installation
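Once the script has finished you can check that the controller is running and open the UI (a quick sanity check, assuming the default install locations; <controller_ip> is a placeholder):
$ sudo service cmon status
Then point a browser at http://<controller_ip>/clustercontrol and log in.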
Add Existing Cluster
In the UI, press the Add Existing Cluster button.
- All cluster nodes must run the same OS
- All cluster nodes must be installed in the same location (the same installation and data directory layout)
- All cluster nodes must listen on the same MySQL port
The screenshot below shows an example of adding a Galera cluster with one node running on 10.0.1.10 (the rest of the Galera nodes are auto-detected).
The Galera node on 10.0.1.10 must have wsrep_cluster_address set. To verify this, open a terminal on 10.0.1.10 and run:
mysql -uroot -h127.0.0.1 -p
mysql> show global variables like 'wsrep_cluster_address';
+-----------------------+------------------------------------------------------------+
| Variable_name | Value |
+-----------------------+------------------------------------------------------------+
| wsrep_cluster_address | gcomm://10.0.1.10,10.0.1.11,10.0.1.12 |
+-----------------------+------------------------------------------------------------+
1 row in set (0.01 sec)
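Since all Galera nodes must be SYNCED when the cluster is added (see the limitations above), it is also worth checking the node state on each node, for example:
mysql> show global status like 'wsrep_local_state_comment';
The value should be 'Synced'.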
Click "Add Cluster" and after a while the Cluster will appear, or else a "Job failed" will be shown on the screen and you can investigate what the problem is.
Comments
Hi Cindy,
can you create a support ticket on http://support.severalnines.com/tickets/new and then we could perhaps have a remote session to debug this?
Best regards
Johan
Hi Johan,
I'm all set! Here is Alex's resolution. These were the changes that were made:
Upgraded to the latest frontend + backend builds of ClusterControl 1.2.9
Enabled passwordless sudo for the 'cmon' os user for all db nodes.
Changed wsrep_cluster_address=gcomm://ip1,ip2,ip3?pc.wait_prim=no to wsrep_cluster_address=gcomm://ip1,ip2,ip3. The pc.wait_prim=no option can only be used when the SST method is mysqldump, while we have rsync set.
Also, ClusterControl supports bootstrapping a Galera cluster (and handles cluster/node recovery), so that option is no longer needed either.
Copied /etc/my.cnf.d/cluster.cnf to /etc/my.cnf instead.
Regards,
Cindy
Hi Cindy!
Thanks for updating this forum article with this information!
Much appreciated!
Regarding "/etc/my.cnf.d/cluster.cnf": in version 1.2.10 we will be able to handle configuration files in locations other than /etc/mysql/my.cnf and /etc/my.cnf (however, in 1.2.9 "/etc/my.cnf.d/server.cnf" is also supported).
Have a nice weekend.
BR
johan
I'm using the user 'darkwingduck' to gain access to each of the servers in the cluster, however it doesn't appear to be allowing access via the credentials I added:
4: Could not SSH to controller host (10.1.1.220): libssh auth error: Access denied. Authentication that can continue: publickey,gssapi-keyex,gssapi-with-mic,password (darkwingduck, key=/home/darkwingduck/.ssh/id_rsa)
3: Verifying the SSH access to the controller.
2: Verifying controller host and cmon password.
1: Message sent to controller
My port number is non-standard and I'm scratching my head about what to do next...
Hi Daniel,
It sounds like you haven't set up passwordless SSH to the controller host. On the ClusterControl host, as the darkwingduck user, please do:
ssh-copy-id -i /home/darkwingduck/.ssh/id_rsa darkwingduck@10.1.1.220
Details on this are explained here: http://www.severalnines.com/ccg/28-passwordless-ssh
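If the SSH port is non-standard, most versions of ssh-copy-id accept a -p option (the port below is just a placeholder), so something like this may help:
ssh-copy-id -i /home/darkwingduck/.ssh/id_rsa -p <ssh_port> darkwingduck@10.1.1.220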
Regards,
Ashraf
25 - Message sent to controller
26 - Verifying controller host and cmon password.
27 - Verifying the SSH access to the controller.
28 - Could not SSH to controller host (192.168.0.6): libssh auth error: Access denied. Authentication that can continue: publickey,gssapi-keyex,gssapi-with-mic,password (root, key=/root/.ssh/id_rsa)
Job failed.
I'm also facing the same issue. Following the solution above, same as Cindy, I already allowed port 22 and changed the permissions for the key, but the error still occurs.
May I know what else I am supposed to do?
Thanks.
Ahmad,
The error indicates that it couldn't SSH to the controller host (192.168.0.6). Run the following command on the ClusterControl host:
ssh-copy-id root@192.168.0.6
ClusterControl also needs passwordless SSH set up to all managed nodes, including itself.
Regards,
Ashraf
Hi,
I restarted testing this application on a different test environment. After confirming that all prerequisites are working properly, I am now adding the existing cluster. I created a separate MySQL user instead of root. The problem is that the job fails without a detailed reason. What is the possible cause of this?
Using the cmon user with proper grants.
54 - Message sent to controller
55 - Verifying controller host and cmon password.
56 - Verifying the SSH access to the controller.
57 - Verifying job parameters.
58 - Verifying the SSH connection to 10.40.191.171.
59 - Verifying the MySQL user/password.
60 - monitored_mysql_root_password is not set, please set it later the generated cmon.cnf
61 - Getting node list from the MySQL server.
62 - Found node: '10.40.191.171'
63 - Found node: '10.40.193.69'
64 - Found node: '10.40.192.75'
65 - Found in total 3 nodes.
66 - Checking the nodes that those aren't in another cluster.
67 - Verifying the SSH connection to the nodes.
68 - Check SELinux statuses
69 - Detected that skip_name_resolve is not used on the target server(s).
70 - Granting the controller on the cluster.
71 - Node is Synced : 10.40.191.171
72 - Node is Synced : 10.40.193.69
73 - Node is Synced : 10.40.192.75
Job failed.
And this is the result when I use the root MySQL user.
78 - Message sent to controller
79 - Verifying controller host and cmon password.
80 - Verifying the SSH access to the controller.
81 - Verifying job parameters.
82 - Verifying the SSH connection to 10.40.191.171.
83 - Verifying the MySQL user/password.
84 - Getting node list from the MySQL server.
85 - Found node: '10.40.191.171'
86 - Found node: '10.40.193.69'
87 - Found node: '10.40.192.75'
88 - Found in total 3 nodes.
89 - Checking the nodes that those aren't in another cluster.
90 - Verifying the SSH connection to the nodes.
91 - Check SELinux statuses
92 - Detected that skip_name_resolve is not used on the target server(s).
93 - Granting the controller on the cluster.
94 - Node is Synced : 10.40.191.171
95 - Node is Synced : 10.40.193.69
96 - Node is Synced : 10.40.192.75
97 - Detecting the OS.
Job failed.
Please see the attached files for the Admin -> Cluster Jobs logs.
Regards,
Cindy
Cindy,
Can you please check the details of the failed job under Admin -> Cluster Jobs? It may give more insight into what happened than the messages in the progress window (from what you pasted, it looks to me like those are logs from the progress window). This is especially true when there are issues with credentials.
Thanks,
Krzysztof
Hi Krzysztof,
Here it is. I can successfully SSH from 10.40.123.67 to 10.40.191.171 using the cmon user. The datadir is also defined in the config file. Do I need an outbound SSH connection from the cluster nodes to the controller (i.e. SSH in the opposite direction)?
124: Connection failed from 10.40.123.67 to 10.40.191.171: can't determine the datadir on '10.40.191.171:3306' connect, failed for user 'cmon': Can't connect to MySQL server on '10.40.191.171' (4) (errno: 2003)
123: Checking connectivity and determining the datadir on the MySQL nodes.
122: Detected OS = 'redhat'
121: Detecting the OS.
120: Node is Synced : 10.40.192.75
119: Node is Synced : 10.40.193.69
118: Node is Synced : 10.40.191.171
117: Granting the controller on the cluster.
116: Detected that skip_name_resolve is not used on the target server(s).
115: Check SELinux statuses
114: Verifying the SSH connection to the nodes.
113: Checking the nodes that those aren't in another cluster.
112: Found in total 3 nodes.
111: Found node: '10.40.192.75'
110: Found node: '10.40.193.69'
109: Found node: '10.40.191.171'
108: Getting node list from the MySQL server.
107: monitored_mysql_root_password is not set, please set it later the generated cmon.cnf
106: Verifying the MySQL user/password.
Thanks,
Cindy
I have already run this command.
MariaDB [(none)]> GRANT ALL ON *.* TO 'cmon'@'10.40.123.67' IDENTIFIED BY '<password>' WITH GRANT OPTION;
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> flush privileges;
Query OK, 0 rows affected (0.01 sec)
Cindy,
This error:
124: Connection failed from 10.40.123.67 to 10.40.191.171: can't determine the datadir on '10.40.191.171:3306' connect, failed for user 'cmon': Can't connect to MySQL server on '10.40.191.171' (4) (errno: 2003)
is related to MySQL connectivity. Are you sure that you can connect from 10.40.123.67 to 10.40.191.171 on port 3306? The error seems to indicate it is not possible.
Thanks,
Krzysztof
Krzysztof,
Please see the attached screenshot of connecting to .171 from the controller (.67) using the cmon user.
Thanks,
Cindy
Cindy,
I'm not talking about SSH. The problem, if it exists, is with MySQL connectivity. Can you run "telnet 10.40.191.171 3306" from the 10.40.123.67 host?
Thanks,
Krzysztof
Hi,
Can someone help me please? I'm new to this ClusterControl setup. When I run apt-get install mariadb-client mariadb-galera-server-5.5 percona-toolkit percona-xtrabackup manually it works fine, but as soon as I run it through ClusterControl it fails. I think it overrides my sources.list file on Ubuntu 12.04 LTS.
Here is my error message:
34: Setting up the first server failed, aborting
33: clusterwb1: Setup server failure, see the previous msgs.
32: clusterwb1: failed apt-get install mariadb-client mariadb-galera-server-5.5 percona-toolkit percona-xtrabackup exited with 100: E: Unable to correct problems, you have held broken packages.
31: clusterwb1: Installing MariaDB-5.5 debian
30: clusterwb1: Using External repositories.
29: clusterwb1: Installing the MySQL packages.
28: clusterwb1: Prepare MySQL environment (user/group)
27: clusterwb1: Removing old MySQL packages from the host.
26: clusterwb1: Detected free disk space: 55701 MB
25: clusterwb1: Checking free-disk space of /var/lib/mysql
24: clusterwb1: Detected memory: 1467 MB
23: clusterwb1: Detecting total memory.
22: clusterwb1: Detected CPU cores: 2
21: clusterwb1: Detecting number of CPU cores/threads.
20: Verifying helper packages (checking if 'socat' is installed successfully).
19: clusterwb1: Installation report of helper packages: ok: psmisc ok: rsync ok: libaio1 ok: netcat ok: netcat-openbsd ok: socat ok: lsb-release ok: libssl0.9.8 ok: libssl1.0.0 ok: libdbd-mysql-perl ok: wget ok: curl ok: pigz
18: clusterwb1: Installing helper packages.
17: clusterwb1: Setting vm.swappiness = 1
16: clusterwb1: Tuning OS parameters.
15: clusterwb1: Keeping existing firewall settings.
14: clusterwb1: Detecting OS
13: clusterwb1: Checking SELinux status.
12: Using sudo_password.
11: clusterwb1: Verifying sudo on the server.
10: Verifying the SSH connection to clusterwb1.
9: clusterwb1: Checking if host is already exist in other cluster.
8: Checking job parameters.
7: create_cluster: calling job: setupServer(clusterwb1).
6: Testing SSH to 192.168.11.38.
5: Testing SSH to clusterwb3.
4: Testing SSH to clusterwb2.
3: Testing SSH to clusterwb1.
2: Verifying job parameters.
1: Message sent to controller
Hi,
Can you please paste here the output of:
dpkg --get-selections | grep hold
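If that shows nothing, apt-mark offers another way to list and clear holds on Ubuntu (a sketch; the package name is just an example):
apt-mark showhold
sudo apt-mark unhold mariadb-galera-server-5.5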
Thanks,
Krzysztof
Krzysztof,
I get no output on my testing server, so nothing is holding the packages. I tried the install again, and holding or unholding the package still gives me this error:
456 - Message sent to controller
457 - Verifying job parameters.
458 - Testing SSH to clusterwb1.
459 - Testing SSH to clusterwb2.
460 - Testing SSH to clusterwb3.
461 - Testing SSH to 192.168.11.38.
462 - create_cluster: calling job: setupServer(clusterwb1).
463 - Checking job parameters.
464 - clusterwb1: Checking if host is already exist in other cluster.
465 - Verifying the SSH connection to clusterwb1.
466 - clusterwb1: Verifying sudo on the server.
467 - Using sudo_password.
468 - clusterwb1: Checking SELinux status.
469 - clusterwb1: Detecting OS
470 - clusterwb1: There is a mysqld server running. It must be uninstalled first, or you can also add it to ClusterControl.
471 - Setting up the first server failed, aborting
Thanks,
Chris
Hi,
This error is expected - if you ask CC to provision a host where MySQL is already running, it will stop the process in case it's a mistake - we don't want to cause any harm.
I'm not sure what exactly happened here, though - have you installed the MariaDB node manually? Can it be stopped? If yes, you can try two things. First, stop MariaDB and then restart the job to create a new cluster from the CC UI. If it fails with the previous error (Unable to correct problems, you have held broken packages), you can try to set up the MariaDB node manually and then do "Add existing server/cluster" from the CC UI.
If the problem persists, I'd suggest opening a ticket with us and we can try to look at what's going on using some kind of screen-sharing session (TeamViewer or Join.me).
Thanks,
Krzysztof
Hi,
It's very strange to me as well, but I'll set up one node manually. Then can I add the rest with "Add existing server/cluster" from the CC UI, or should I set up all of them?
Thanks,
Chris
Hi,
Once you set up this single node and add it via "Add existing server/cluster", a new cluster will show up in the CC UI. You can enter this cluster and from within do "Add node". You'll have two options: you can either provision a node from scratch or add an existing one that is already part of the cluster. I'd suggest trying to set up a new one. If you see a problem similar to the one with the first node, please open a ticket with us. While it's possible to set everything up manually and then add those nodes to CC, it's not how we'd like it to work. If you can't provision a node from scratch, it's either a problem in your particular setup or a bug in CC. We'd like to learn what's going on no matter what the culprit is - we'd like to fix the bug or find a workaround, whatever is causing the problems.
Thanks,
Krzysztof
Hi,
I got the same error output (Unable to correct problems, you have held broken packages) with what you suggested. Regarding my previous question: should I set up the complete cluster manually and then add it when I'm done? The reason I ask is that I have limited testing hardware.
Thanks
Chris
Hi,
It's working when I set up MariaDB manually and then add it via "Add existing server/cluster" from the CC UI.
Just a quick question: I already have a load balancer set up, and my plan is to use MariaDB Galera with 3 nodes plus ClusterControl. How do I add the other nodes to the MariaDB Galera cluster - do I just click "Add node"?
Thanks
Chris
Hi,
What load balancer is it?
BR
johan
Hi,
I use HAProxy for my web server. I saw in the UI that you can add an existing load balancer - will it work?
Thanks,
Chris.
Hi,
Yes, HAProxy is supported.
BR
johan
Hi,
I hit a strange error when adding a standalone MySQL server:
77 - Message sent to controller
78 - Verifying controller host and cmon password.
79 - Verifying the SSH access to the controller.
80 - Verifying job parameters.
81 - Verifying the SSH connection to <IP omitted>.
82 - Verifying SUDO on <IP omitted>.
83 - Passwordless sudo for user 'root' is not available on the host <IP omitted>: sudo error, retval: 127, output: 'sh: 1: sudo: not found'
Job failed.
I don't really understand why the script is attempting to test "sudo" when logging in as root... any advice?
Hi,
We'll take a look at this. It may take a while as some internal syncing with developers may be needed. We'll keep you posted as soon as we come up with some solution.
Thanks,
Krzysztof
Hi,
Assuming you are on 1.2.11 already:
Please do (as root/sudo):
service cmon stop
debian/ubuntu:
apt-get update && apt-get install clustercontrol-controller clustercontrol clustercontrol-cmonapi
redhat/centos:
yum clean all; yum install clustercontrol-controller clustercontrol clustercontrol-cmonapi
Then do:
service cmon start
You should now have:
Controller: 1.2.11-998
UI: 1.2.11-842
Many thanks for this report!
BR
Johan
Hi,
It's working fine after the update.
Thanks for the quick assistance.
BR
Petr
Hello,
I don't understand why adding the cluster doesn't work on the installation I've made. All servers run Red Hat 7. It is a Galera cluster with MariaDB 10.1.11 on 3 nodes running galera-3-25. The ClusterControl controller is running cmon version 1.2.12.1111.
The job output is:
1 Message sent to controller
2 Verifying controller host and cmon password.
3 Checking the SSH access to the controller (10.52.202.116)
4 Checking ssh/sudo on 1 hosts.
5 10.52.202.116: access with ssh/sudo granted
6 All 1 hosts are ok.
7 Verifying job parameters.
8 Checking ssh/sudo on 1 hosts.
9 10.52.207.81: access with ssh/sudo granted
10 All 1 hosts are ok.
11 10.52.207.81: Verifying the MySQL user/password.
12 10.52.207.81: Getting node list from the MySQL server.
13 Found node: '10.52.207.81'
14 Found node: '10.52.207.42'
15 Found node: '10.52.207.59'
16 Found in total 3 nodes.
17 Checking that nodes are not in another cluster.
18 Checking ssh/sudo on 3 hosts.
19 10.52.207.81: already checked 1 seconds ago.
20 10.52.207.42: access with ssh/sudo granted
21 10.52.207.59: access with ssh/sudo granted
22 All 3 hosts are ok.
23 Check SELinux statuses
24 Detected that skip_name_resolve is used on the target server(s).
25 Granting the controller on the cluster.
26 granting addresses: 10.52.202.116
27 10.52.207.81: Node is synced.
28 10.52.207.42: Node is synced.
29 10.52.207.59: Node is synced.
30 Connected succesfully: 'cmon@10.52.202.116' -> '10.52.207.81:3306'.
31 MySQL[10.52.207.81] datadir: /var/lib/mysql/
32 Connected succesfully: 'cmon@10.52.202.116' -> '10.52.207.42:3306'.
33 MySQL[10.52.207.42] datadir: /var/lib/mysql/
34 Connected succesfully: 'cmon@10.52.202.116' -> '10.52.207.59:3306'.
35 MySQL[10.52.207.59] datadir: /var/lib/mysql/
36 Detecting the OS.
37
Any ideas why?
Thanks and regards,
Mmin