cmon cannot start after power failure on CentOS 7

Comments

  • Official comment
    Ashraf Sharif

    Hi,

    Does /var/run/cmon.pid exist? If yes, what does it contain? Please remove it first, since cmon is not actually running on the host. The file may have been left behind by the unclean shutdown during the power failure.
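    A minimal sketch of that check, assuming a Linux host with /proc mounted. The helper name pid_file_status is hypothetical and not part of cmon; it only reports whether the PID recorded in the file still belongs to a live process, so the file is removed only when it is really stale.

```shell
# Hypothetical helper: classify a PID file as "missing", "stale", or "running".
pid_file_status() {
    pf_path="$1"
    if [ ! -f "$pf_path" ]; then
        echo "missing"
        return 0
    fi
    # Strip whitespace and newlines around the recorded PID.
    pf_pid=$(tr -d '[:space:]' < "$pf_path")
    case "$pf_pid" in
        # Empty or non-numeric content cannot refer to a process.
        ''|*[!0-9]*) echo "stale"; return 0 ;;
    esac
    # On Linux, /proc/<pid> exists only while the process is alive.
    if [ -d "/proc/$pf_pid" ]; then
        echo "running"
    else
        echo "stale"
    fi
}

# Example (run as root):
#   [ "$(pid_file_status /var/run/cmon.pid)" = "stale" ] && rm -f /var/run/cmon.pid
```

    A "running" result can still be a false positive if the PID has been reused by an unrelated process after the reboot, so it is worth checking which process owns that PID before trusting it.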

    Regards,
    Ashraf

  • Tenuun

    There is no /var/run/cmon.pid.

    An ack search finds PID 11733 only in an earlier CentOS audit log.

    A MySQL dump piped through grep and ack also shows that "11733" appears in the mariadb/cmon/cmon_status table.

    But cmon still refuses to start. I tried killing almost all daemons and flushed most of the PID files in /var/run/.

  • Tenuun

    After a forced fsck and a dozen reboots, I was finally able to start ClusterControl. Unfortunately, I don't know what exactly went wrong in the first place.

    Now my paranoia is kicking in, and I am worried that ClusterControl will try to change the settings of the running PostgreSQL hosts. I don't want ClusterControl deleting the PostgreSQL /main directory on a running node or changing the master automatically, since I set the nodes up manually while ClusterControl was unable to start.

    How do I disable all auto-recovery and automatic master selection in ClusterControl before joining it back onto the network? Is there a way to do a dry run or to forbid ClusterControl from making changes to nodes?

    For now I have disconnected ClusterControl from the network so that it cannot automatically change settings on the manually started, running PostgreSQL cluster.

    Best regards,

    Tenuun

  • Tenuun

    If someone else runs into this issue, just add the following line to /etc/cmon.cnf:

    enable_autorecovery=0

    https://severalnines.com/blog/installing-clustercontrol-standby-server
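    One way to apply that setting without ending up with duplicate lines is sketched below. It assumes GNU sed and a simple key=value file; the function name set_cnf_option is hypothetical, and restarting the cmon service afterwards is an assumption about how the change gets picked up.

```shell
# Hypothetical helper: set key=value in a key=value style config file.
# Replaces an existing "key=" line, or appends a new one if none exists.
set_cnf_option() {
    so_file="$1"
    so_key="$2"
    so_value="$3"
    if grep -q "^${so_key}=" "$so_file"; then
        # Key already present: rewrite its value in place (GNU sed -i).
        sed -i "s|^${so_key}=.*|${so_key}=${so_value}|" "$so_file"
    else
        # Key absent: append it as a new line.
        printf '%s=%s\n' "$so_key" "$so_value" >> "$so_file"
    fi
}

# Example (run as root, keeping a backup first):
#   cp /etc/cmon.cnf /etc/cmon.cnf.bak
#   set_cnf_option /etc/cmon.cnf enable_autorecovery 0
#   systemctl restart cmon   # assumption: cmon runs as a systemd service
```

    The sketch does not handle a file whose last line lacks a trailing newline, so it is worth looking at the result before restarting anything.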

  • Tenuun

    Found the error (CentOS 7).

    The culprit was the time settings on MariaDB.

    The timestamps in journalctl -u mariadb were wrong, and checking now() in MariaDB yielded the wrong time, so I updated MariaDB to use the system time zone.

    Make sure you sync the clock with ntpd first and then run hwclock --systohc.

    The symptom is that the cmon service starts but does not work completely: netstat -tulpn | grep 9500 returns nothing, no errors are reported, and nothing is written to /var/log/cmon.log.

    mysql> select now();
    +---------------------+
    | now()               |
    +---------------------+
    | 2017-10-26 15:13:16 |
    +---------------------+
    1 row in set (0.07 sec)
    
    mysql> SET GLOBAL time_zone = 'SYSTEM';
    Query OK, 0 rows affected (0.07 sec)
    
    mysql> show global variables like '%time_zone%';
    +------------------+---------------------+
    | Variable_name    | Value               |
    +------------------+---------------------+
    | system_time_zone | India Standard Time |
    | time_zone        | SYSTEM              |
    +------------------+---------------------+
    2 rows in set (0.18 sec)
    
    mysql>
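
    To see how far MariaDB's now() is from the system clock, the two wall-clock values can be compared. This is a rough sketch, assuming GNU date and a mysql client that can log in without prompting; the function name wallclock_diff_seconds is hypothetical.

```shell
# Hypothetical helper: difference in seconds between two
# 'YYYY-MM-DD HH:MM:SS' wall-clock strings (second minus first).
# Both are parsed as UTC so the result is a pure wall-clock difference.
wallclock_diff_seconds() {
    wd_a=$(TZ=UTC date -d "$1" +%s) || return 1
    wd_b=$(TZ=UTC date -d "$2" +%s) || return 1
    echo $((wd_b - wd_a))
}

# Example:
#   db_now=$(mysql -N -B -e 'SELECT NOW()')
#   sys_now=$(date '+%Y-%m-%d %H:%M:%S')
#   wallclock_diff_seconds "$db_now" "$sys_now"
```

    A result close to zero suggests the two agree. A difference of whole or half hours points to a time zone mismatch rather than a drifting clock.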
  • MOHD HAFRIZ NURAL AZHAN

    Hi,

    I have the same problem, but on Ubuntu.

    2023-12-14T17:00:07.435Z : (INFO) Thread UserManager is running (LWP 5079).
    2023-12-14T17:00:07.436Z : (INFO) CmonRpcServerPrivate::log()
    2023-12-14T17:00:07.436Z : (INFO) Server started at tcp://127.0.0.1:9500
    2023-12-14T17:00:07.437Z : (INFO) CmonRpcServerPrivate::log()
    2023-12-14T17:00:07.437Z : (INFO) Thread RpcServer:9500 is running (LWP 5080).
    2023-12-14T17:00:07.437Z : (INFO) Server started at tls://127.0.0.1:9501
    2023-12-14T17:00:07.437Z : (INFO) Checking command handler.
    2023-12-14T17:00:07.437Z : (INFO) Thread RpcServer:9501 is running (LWP 5081).
    2023-12-14T17:00:07.445Z : (INFO) CDT entry '/.runtime/controller_clock' found.
    2023-12-14T17:00:07.445Z : (ERROR) The controller on host clustercontrol with pid 1268 is already running (seen -24920s ago).
    2023-12-14T17:00:08.445Z : (ERROR) The Cmon Controller is exiting...

    I have tried all the methods above, but it is still not working. Any ideas? I restored my server to a known working restore point and cmon worked, but once I reboot the server, cmon stops working.
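
    The "seen -24920s ago" value in the log is negative, roughly 6.9 hours into the future. One possible reading, and only an assumption, is that the saved controller clock entry is ahead of the current system time, which would fit the clock problems described earlier in this thread. A small sketch for pulling that value out of the log (the function name seen_ago_seconds is hypothetical):

```shell
# Hypothetical helper: read cmon log lines on stdin and print the number
# of seconds from each "(seen <N>s ago)" fragment, one per line.
seen_ago_seconds() {
    sed -n 's/.*(seen \(-\{0,1\}[0-9][0-9]*\)s ago).*/\1/p'
}

# Example:
#   grep 'is already running' /var/log/cmon.log | seen_ago_seconds
```

    A negative number here is a hint to compare the system clock, the hardware clock, and the database time before looking elsewhere.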

  • Paul Namuag

    Hi MOHD HAFRIZ NURAL AZHAN,

    Since you have already filed a ticket, we'll take it from there.
