Cluster details cannot be retrieved. Please check the CMON process status (service cmon status). Also, ensure the dcps.apis token matches the rpc_key in /etc/cmon.cnf.
Problem:
- "Cluster details cannot be retrieved. Please check the CMON process status (service cmon status). Also, ensure the dcps.apis token matches the rpc_key in /etc/cmon.cnf." when accessing the Clusters page.
- "Unable to connect to 127.0.0.1:9500" on AJAX refresh.
- The pre-existing cluster is showing with UNKNOWN status.
This occurred after an instance of disk space exhaustion on the controller, which has since been resolved.
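A quick way to reproduce the "Unable to connect to 127.0.0.1:9500" symptom outside the browser is a plain TCP connection test. This is only a minimal sketch: the address and port come from the error message above, the helper name `can_connect` is made up for illustration, and a successful connection shows only that something is listening there, not that the RPC token is accepted.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `can_connect("127.0.0.1", 9500)` on the controller should return True if a process is accepting connections on that port.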
Troubleshooting:
- CMON services are running; no errors in the service status output.
- MySQL service is running and I can connect to the cmon and dcps schemas as root. Both schemas export without error, so I doubt MySQL corruption is the issue.
- Have checked that the token in the cmon database matches the rpc_key in /etc/cmon.cnf
- Have checked that the RPC_TOKEN in /var/www/html/clustercontrol/bootstrap.php matches the rpc_key in /etc/cmon.cnf
- Have explicitly set the cmon bind-address to be 127.0.0.1 and the Internal IP of the server. (Per another thread)
- Have checked time and timezone settings to ensure NTP was running correctly. (Per another thread)
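For anyone repeating the token comparison between the two files, it can be scripted rather than checked by eye. The sketch below rests on two assumptions that should be verified against the actual files: that /etc/cmon.cnf stores the key as a plain `rpc_key=...` line, and that bootstrap.php sets the token with a `define('RPC_TOKEN', '...')` statement. The function names are hypothetical, and the token stored in the database is not covered.

```python
import re
from pathlib import Path
from typing import Optional


def read_rpc_key(cnf_text: str) -> Optional[str]:
    """Extract rpc_key from 'key=value' style text (assumed cmon.cnf layout)."""
    for line in cnf_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "rpc_key":
            return value.strip().strip("'\"")
    return None


def read_rpc_token(php_text: str) -> Optional[str]:
    """Extract RPC_TOKEN, assuming a define('RPC_TOKEN', '...') statement."""
    pattern = r"define\(\s*['\"]RPC_TOKEN['\"]\s*,\s*['\"]([^'\"]*)['\"]\s*\)"
    match = re.search(pattern, php_text)
    return match.group(1) if match else None


def tokens_match(cnf_path: str, php_path: str) -> bool:
    """Return True only if both values are found and are identical."""
    key = read_rpc_key(Path(cnf_path).read_text())
    token = read_rpc_token(Path(php_path).read_text())
    return key is not None and key == token
```

Run as a user that can read both files, `tokens_match("/etc/cmon.cnf", "/var/www/html/clustercontrol/bootstrap.php")` returns False if either value is missing or they differ, which also catches stray whitespace that is easy to miss visually.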
Observations of Interest:
- Browsing the ClusterControl web app shows no Operational Reports or Scheduled Backup jobs, despite there being some in the database. I'm not sure whether this is because components aren't working or because it reads this information from the database directly.
Errors in Logs (Probably unrelated):
From /var/log/cmon.log:
- (ERROR) Skip configuration '/etc/cmon.d/cmon_1.cnf', error: Duplicated clusterid (0) in file '/etc/cmon.d/cmon_1.cnf'
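Since the message reports cluster ID 0 for a file named cmon_1.cnf, one thing worth ruling out is a configuration file that lost its ID, for example by being truncated while the disk was full. The sketch below is a guess at how to check that, not a description of how cmon parses its files: it assumes the ID is stored under a `cluster_id` key and treats a file without a valid one as ID 0. The function names are hypothetical.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable, List


def cluster_id_of(cnf_text: str) -> int:
    """Return cluster_id from config text; 0 if the key is absent or invalid."""
    for line in cnf_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "cluster_id":
            try:
                return int(value.strip())
            except ValueError:
                return 0
    return 0


def find_duplicates(paths: Iterable[Path]) -> Dict[int, List[Path]]:
    """Group config files by cluster ID and keep IDs claimed by several files."""
    seen: Dict[int, List[Path]] = defaultdict(list)
    for path in paths:
        text = path.read_text(errors="replace")
        seen[cluster_id_of(text)].append(path)
    return {cid: files for cid, files in seen.items() if len(files) > 1}
```

Passing `[Path("/etc/cmon.cnf"), *Path("/etc/cmon.d").glob("*.cnf")]` to `find_duplicates` lists any files that end up with the same ID under these assumptions; an empty or truncated cmon_1.cnf would show up under ID 0.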
From /var/lib/mysql/error.log:
- 2020-06-26T06:26:02.636644Z 0 [ERROR] Function 'auth_socket' already exists
- 2020-06-26T06:26:02.636660Z 0 [Warning] Couldn't load plugin named 'auth_socket' with soname 'auth_socket.so'.
Environment:
- Ubuntu 18.04 LTS
- Controller: 1.7.6.3996
- Frontend: 1.7.6.6976-#1bd5f8
Any advice on things to check would be appreciated; Thanks!
Official comment
Hi,
First of all, please double-check that you can access the database using the credentials from /etc/cmon.cnf.
Secondly, how did you check the backup schedule? How many clusters do you have? Are all of them broken, or are some accessible? What's the status of the cluster with the ID of 1? The error in the cmon.log is a bit disturbing.
Thanks,
Krzysztof
Hi Krzysztof,
1. Yes, I can connect to the database using the credentials supplied in /etc/cmon.cnf. SHOW TABLES returns a list of all tables as expected.
2. I used the web interface to view scheduled backup tasks: /clustercontrol/#/cluster:1,g:backup does not show the scheduled entry that I was expecting to be there. But this is a side symptom of the bigger issue, I think.
3. There is only one cluster. That cluster has the ID of 1 and is showing in the web interface as UNKNOWN.
Hi,
Unfortunately, we would need more data to understand what is happening. If you have a support contract, please open a ticket with us at support.severalnines.com. Please make sure that you include an error report. You can try to generate it from the command line:
s9s_error_reporter -i 1
or, if that won't work:
s9s_error_reporter -i 1
Thanks,
Krzysztof