Why do I get: (ERROR) cmon_coredir not set, set it in /etc/cmon.cnf

Clint Alexander

August 04, 2012 20:06

My control server loads up just fine, but at the end it gives the error:

(ERROR) cmon_coredir not set, set it in /etc/cmon.cnf

Of course its lying to me. I have it set in cmon.cnf

By the way, cmon_coredir or cmon_core_dir?

I had defined them both just in case:

cmon_core_dir=/root/clustercontrol
cmon_coredir=/root/clustercontrol

So why is it not finding this option in the conf file and how can this be fixed?

Comments

14 comments

Johan

August 05, 2012 08:06
Hi Clint,

It should be:

cmon_core_dir=/root/clustercontrol

(take out the cmon_coredir)

Are you installing on a existing cluster or on a cluster deployed with the Severalnines Deployment package from the Configurator?

Thank you,

Johan
0

Comment actions Permalink
Clint Alexander

August 05, 2012 19:57
The "cmon_coredir" is in the message printed by cmon. The config file has the cmon_core_dir"... I just wanted to be sure.

This is an existing cluster setup previously without cmon.
0

Comment actions Permalink
Clint Alexander

August 06, 2012 08:58
FYI --

Server: RedHat Enterprise, CentOS 5.8
Cluster Network:
- 4 data nodes with 2 replicas
- 2 api/mysql nodes
- 1 management node
- The cmon database and controller is on the management node.

- On managment node, while ndb_mgmd is running, the local database is not connected to the cluster in any way.

Local Paths:
/usr/local/cmon/ --- the source (tar.gz)
/var/www/cluster.management/ --- the cmon website/files

I followed the instructions given here:
http://support.severalnines.com/entries/20613923-installation-on-an-existing-cluster

I was using "service cmon start [restart|stop|status]" to manage the process.

-----------------------------------------------------------
I did try the RPM a few times, but I like using src files because I usually can find whatever I'm missing and put things together by hand if necessary.

After re-copying /usr/local/cmon/etc/init.d/cmon to /etc/init.d/cmon (which I've done many times already) for some reason the cmon process began to run. I really don't know why.

-----------------------------------------------------------
There are some new errors, but I'm assuming this is due to a problem I'm now having on an agent (mentioned below). The error is:
Aug 06 00:35:40 : (ERROR) Failed to init - check you have set ssh user, staging dir, and bindir in the Cluster Settings and Global Settings
Aug 06 00:35:40 : (WARNING) Workflow failed to init - proceeding but with limited functionality.
Aug 06 00:35:40 : (ERROR) MYSQL BINDIR not set - set in 'Cluster Setup'
Aug 06 00:35:40 : (WARNING) Could not init Galera sub

... as before, these values are defined in the cmon.conf

-----------------------------------------------------------
In my search for resolution, I looked into the create bin/cmon_rrdcreate code, and I don't have a cmon_rrd or anything related to the rrdtool in /etc/*

As expected, I don't see any graphs being made.

-----------------------------------------------------------
ON MySQL API - AGENT

On an agent which is an API on the cluster network, startup fails with

Aug 06 01:55:59 : (INFO) IPv4 address: 192.168.1.2 (cluster02)
Aug 06 01:55:59 : (INFO) Setting up threads to handle monitoring
Aug 06 01:55:59 : (INFO) Starting MySQLCollectorThread
Aug 06 01:55:59 : (INFO) Starting host collector
Aug 06 01:55:59 : (INFO) Starting Process Manager Thread
Aug 06 01:55:59 : (INFO) Starting Bencher Thread
Aug 06 01:55:59 : (INFO) Connected to MySQL Server @ 127.0.0.1
Aug 06 01:55:59 : (INFO) Hostname=cluster02, ip=192.168.1.2, skip_name_resolve=0
Aug 06 01:55:59 : (INFO) Connected to local mysql server!
Aug 06 01:55:59 : (ERROR) Critical error (mysql error code -1) occured - shutting down

MySQL error log on the Controller has:
Error: Got an error reading communication packet

This is usually a network error, but the internal network is working just fine.

I've used both RPM and SRC in an attempt to fix this. I also used "cmon_install_agent.sh" (or whatever the name was). It actually made me work more on the cmon.cnf, having to change default values that were not asked about during installation.

Other smaller issues I've run into:

When using 127.0.0.1, it fails if you don't have a /tmp/mysql.sock. There is no option to change the mysql.sock path. I have to use hostnames instead to get the controller to run.

The cron scripts are not mentioned in the online installation guide:
http://support.severalnines.com/entries/20613923-installation-on-an-existing-cluster

The pid file /var/run/cmon* and the lock file /var/lock/cmon* is not removed on failures.

-----------------------------------------------------------
I dont see anyone else having these problem which concerns me, like I did something unusual or something. However, I do not have any unusual setups, and a very basic layout inside a normal internal network. So I'm not sure what else it could be...
0

Comment actions Permalink
Clint Alexander

August 06, 2012 09:02
Additional small issue -- RPM xinit.d script does not have "status" option, but the src.tar.gz does.
0

Comment actions Permalink
Clint Alexander

August 06, 2012 09:10
Additinoal small issue - the rrd errors do not reflect the actual file names, etc (much like the error before: cmon_coredir)

[root@clustermgmt ~]# /usr/local/cmon/bin/cmon_rrd_all /usr/local/cmon/bin
/usr/local/cmon/bin
usage: ./create_rrd <path to rrd.conf>

Why does it say "./create_rrd? Isn't it suppose to read $0?

The file that comes close to this is cmon_create_rrd.
0

Comment actions Permalink
Johan

August 06, 2012 12:10
Hi Clint,

I am looking at this.

Thanks for you comments about the cmon_coredir vs cmon_core_dir.

On the controller, did you try to set (if using RPMs) :

mysql_basedir=/usr/

or if using mysql-cluster tar ball install, e.g::

mysql_basedir=/usr/local/mysql

I saw this was missing in the install instructions here http://support.severalnines.com/entries/20613923-installation-on-an-existing-cluster

BR
johan
0

Comment actions Permalink
Clint Alexander

August 06, 2012 12:24
Yes I did. I made sure every single key/val was set correctly.

I went ahead and converted all ndbcluster tables to myisam and rebuilt the cluster through the automatic script.

I had to run the script a few times because it didn't expect to see another cmon installation.

Also, I had to reinstall php53 because the auto-install script installs 5.1 in CentOS. It would probably be best to simply provide links to the admin for them to download and install httpd and php. In fact, the instructions should probably link to the generic download page so you don't have to update it every version and let the sysadmin do the actual pick/select and download.

The autoscript came out successful, but I had to use a different mysql for the api for fear it would destroy the api installation. I was right.. it removed mysql if found prior to reinstalling it. This caused a few issues that I had to fix as well.

I think I'll build an Errata and send it over to you. I like the Cluster management tool. It saves me from writing my own (which is what I do on my network). Unfortunately, we'll never be able to authorize the license purchase as this is a gov. project and getting them to spend anything is almost impossible.I say this because I went to remove that temporary api and got the message about needing a key... I removed it from the config files but I can deal with it if it doesn't remove it completely...
0

Comment actions Permalink
Clint Alexander

August 06, 2012 15:39
I just noticed (or was informed actually) that the management server's mysql (where cmon db is located) has 137 processes in the processlist. It prevented other things from running because max_connections was reached.

Is cmon not closing links or something to create so many connections like this?
0

Comment actions Permalink
Johan

August 06, 2012 17:40
Can you attach your /var/log/cmon.log here?
0

Comment actions Permalink
Clint Alexander

August 06, 2012 20:45
attached.

FYI...

192.168.1.8 | server08 = cluster manager & cmon controller

192.168.1.4 | server04 = cluster data node

192.168.1.5 | server05 = cluster data node

192.168.1.6 | server06 = cluster data node

192.168.1.7 | server07 = cluster data node

192.168.1.2 | server02 = cluster api node

192.168.1.3 | server03 = cluster api node

192.168.1.17 | server17 = cluster api node (temporary)

server02 and server03 was not installed with the configurator script (as I said before, I did not risk the current production databases)

I attempted to add api server02 and server03 by adding them in server08:/etc/cmon.cfg and installing the agent rpm and updating its cnf file and finally starting cmon on those servers.

I also noticed on server02 that cmon is running several cmon processes at one time. Can't figure that one out but shutting it down stopped maxing out the connections.
0

Comment actions Permalink
Johan

August 08, 2012 15:21
Clint,

I have fixed the issues you have seen here. We will build a new version and then we have fixed a lot in the cmon_install_controller.sh script that allows you to setup the correct cmon.cnf files and also shows you want grants that needs to be granted.

I will keep you posted here when you can try the new version and it would be great if you can provide feedback on that part.

Best regards,
J
0

Comment actions Permalink
Clint Alexander

August 08, 2012 16:51
No problem. I'll give it a thorough review.

PS - I sent some email to 'sales@' about some info but haven't received any response. Can you check on that for me?
0

Comment actions Permalink
Johan

August 08, 2012 17:21
Hi Clint, I just checked into our CRM system and could not see any request from you there.. can you resend it to sales@severalnines.com, in case there was a typo.. thanks.

Johan
0

Comment actions Permalink
Clint Alexander

August 08, 2012 17:26
.... sent.
0

Comment actions Permalink

Please sign in to leave a comment.

Comments

Didn't find what you were looking for?