Cluster Control stability

6 comments

  • Johan

    Hi Laurent,

    1.  Regarding:

    • Jun 20 09:38:11 : (INFO) Checking if there is a MySQL Server running @ 127.0.0.1
    • Jun 20 09:38:20 : (WARNING) Query select p.pidfile, p.exec_cmd, p.process, p.id as pid, p.hid, p.pgrep_expr from processes p where p.hid=2 and p.active=1 and p.cid=1 failed: Error: Unknown column 'p.pgrep_expr' in 'field list'

    The warning is strange. Did you upgrade from an earlier version, or did you roll out 1.1.30 from the Configurator?

    If you upgraded, did you apply the cmon_db-1.1.30.sql schema file as part of the upgrade?

    Download it:

    wget -O cmon_db-1.1.30.sql http://www.severalnines.com/downloads/cmon/cmon_db-1.1.30.sql

    and do:

    mysql -ucmon -p -h127.0.0.1 cmon < cmon_db-1.1.30.sql

    As for the (INFO) log message: agreed, it is a bit redundant.
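
One way to confirm that applying the schema file removed the warning is to count the matching lines in the cmon log before and after. This is only a minimal sketch, not part of ClusterControl: the function name `count_unknown_column_warnings` is hypothetical, and the log file location is an assumption that the caller has to supply.

```shell
#!/bin/sh
# Hypothetical helper: count "(WARNING) ... Unknown column" lines in a log file.
count_unknown_column_warnings() {
    logfile="$1"
    if [ ! -r "$logfile" ]; then
        echo "cannot read log file: $logfile" >&2
        return 2
    fi
    # grep -c prints the number of matching lines. It exits with status 1
    # when that number is 0, so "|| true" keeps a clean log from looking
    # like a failure.
    grep -c "(WARNING).*Unknown column" "$logfile" || true
}
```

Since old warnings stay in the log, the count is only meaningful when compared before and after the schema import. Running `mysql -ucmon -p -h127.0.0.1 cmon -e "SHOW COLUMNS FROM processes LIKE 'pgrep_expr'"` is another way to check that the column now exists.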

    2. Regarding Jun 20 09:42:45 : (WARNING) Could not open /proc/573/stat file

    The agents and the controllers build a list of the pids that are running and then iterate through it. By the time a pid is read, a short-lived process may already have exited. We can remove this printout, as it does not really add anything.
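
The race described here can be illustrated with a short shell sketch. It is not the cmon implementation: `read_stat`, `scan_pids` and the `PROC_ROOT` variable are hypothetical names used only for illustration. The idea is that a missing stat file is treated as "the process has exited" and skipped, rather than reported as a warning.

```shell
#!/bin/sh
# PROC_ROOT is a hypothetical override so the sketch can be tried against
# a scratch directory; by default it points at the real /proc.
PROC_ROOT="${PROC_ROOT:-/proc}"

# Print the stat line for one pid, or return 1 quietly if the file is gone.
read_stat() {
    pid="$1"
    # The file can vanish between listing and reading, so a failed read
    # simply means the process no longer exists.
    cat "$PROC_ROOT/$pid/stat" 2>/dev/null || return 1
}

# Walk a previously collected list of pids and skip the ones that exited.
scan_pids() {
    for pid in "$@"; do
        if line=$(read_stat "$pid"); then
            printf '%s\n' "$line"
        else
            printf 'pid %s exited before it could be read, skipping\n' "$pid" >&2
        fi
    done
}
```

For example, `scan_pids 1 573` prints the stat line of pid 1 and, if pid 573 has already exited, only a short note on stderr for it.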

    3. We are aware of some crashing bugs, and fixes are on the way. We will also be adding more traceability.

    4. Phew, RRD is problematic and we are looking at replacing it.

    I don't recognize the problem you describe here; this is the first time I have seen it.

    If you run

    rm -rf /var/lib/cmon/*

    so that the rrd databases will be initialized again, do you still see the same printouts?
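
If the existing graph history matters, the files can be moved aside instead of deleted, so that they can be put back later. The following is a minimal sketch under stated assumptions: `reset_rrd_dir` is a hypothetical helper, the backup location is chosen by the caller, and whether cmon should be stopped first is not covered here.

```shell
#!/bin/sh
# Hypothetical helper: empty a data directory by moving its contents into a
# backup directory. The backup directory must live outside the data directory.
reset_rrd_dir() {
    data_dir="$1"
    backup_dir="$2"
    [ -d "$data_dir" ] || { echo "no such directory: $data_dir" >&2; return 1; }
    mkdir -p "$backup_dir" || return 1
    # Like "rm -rf dir/*", the glob skips hidden files.
    for entry in "$data_dir"/*; do
        [ -e "$entry" ] || continue   # glob matched nothing
        mv "$entry" "$backup_dir"/ || return 1
    done
}
```

A call could look like `reset_rrd_dir /var/lib/cmon /var/tmp/cmon-rrd-backup`, where the second path is only a placeholder.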

    5. Agree, we need to improve on this.

    6. Regarding disabling Galera recovery in ClusterControl, we can look at adding this option.

    • At your disposal to discuss these points if you need more information, with the aim of improving the product.

    We appreciate all the great feedback. 

    Regarding #1 and #4 above, please feel free to create a support ticket if you want to upload any files or additional information. 

    We have had a great response regarding Galera, and the adoption has gone very fast considering v1.0 came out in October 2011.

    Since people use our tools in pretty much any Linux environment, with different machine and network setups, behind any type of firewall, and so on, there are naturally situations where new problems occur that need to be investigated and resolved by our engineering team. We do want to resolve as many cases as we can, so that new users will not experience the same problems. Thanks for helping us find these issues.

     

    Best regards,

    Johan

  • Laurent

    Hi Johan,

    First, thanks a lot for your fast answer as usual.

    For point 1, you're right: I had not imported this SQL script. After importing it, everything seems to be OK and the warning about the failed query no longer appears.
    I don't think I am using the proper upgrade method. Could you please tell me the recommended way to upgrade from one version to the next? At the moment, I only untar http://www.severalnines.com/downloads/cmon/cmon-1.1.30-32bit-glibc23-mc70.tar.gz, then stop cmon and repoint the symbolic link from the currently installed version to the new version.
    By the way, I saw that the SQL file you mention is not listed at http://www.severalnines.com/downloads/cmon/, so for newer versions, how will I know whether this script exists or is ready?
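
The symlink step of the manual procedure described above can be sketched as follows. This is only an illustration of that step, not an official upgrade procedure: `switch_version` is a hypothetical helper, both paths are supplied by the caller, and stopping and starting cmon as well as applying the matching schema file are left out.

```shell
#!/bin/sh
# Hypothetical helper: repoint a "current version" symlink at a new release
# directory.
switch_version() {
    link="$1"      # the symlink that points at the installed version
    new_dir="$2"   # the directory produced by untarring the new release
    [ -d "$new_dir" ] || { echo "no such directory: $new_dir" >&2; return 1; }
    # -s creates a symbolic link, -f replaces the existing link, and -n stops
    # ln from following the old link and creating the new one inside its target.
    ln -sfn "$new_dir" "$link"
}
```

Refusing to switch when the new directory does not exist avoids leaving a dangling link behind.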

    For points 2 and 3, it is good to know that you are already aware of these problems and working on them.

    About point 4, I removed all files under /var/lib/cmon/ and then waited for the next execution of the cron job. The error "found extra data on update argument" no longer seems to appear, but the log still says:

    ERROR: /var/lib/cmon//cluster_1_stats.rrd: expected 9 data source readings (got 1) from N

    At the moment, I do not need to retain history for graphs because I am still in testing mode, but having to delete all the files, and so lose the history, could become a problem.
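
This rrdtool message generally means that the update argument carried fewer values than the RRD file has data sources. An update argument has the form `timestamp:value1:value2:...`, where `N` stands for the current time. The sketch below only counts the values in such a string so the number can be compared with the expected 9. `count_readings` is a hypothetical helper, and nothing is assumed here about how cmon builds the argument.

```shell
#!/bin/sh
# Hypothetical helper: count the values that follow the timestamp field in an
# rrdtool-style update argument such as "N:1:2:3".
count_readings() {
    arg="$1"
    values="${arg#*:}"            # drop the timestamp field and its colon
    if [ "$values" = "$arg" ]; then
        echo 0                    # no colon at all, so no values were given
        return 0
    fi
    # Count the colon-separated fields in what is left; an empty remainder
    # gives 0.
    printf '%s\n' "$values" | awk -F: '{ print NF }'
}
```

With this, `count_readings "N:1:2:3:4:5:6:7:8:9"` reports nine values, while a bare `N` reports none.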

    Regarding point 6, I think this could be a great improvement.
    Again, thanks for your reply, time and support. People often report when things are not working properly but rarely when things are fine and they are happy, so I have to say that I am very satisfied with your reactivity, responsiveness and the quality of your answers, all the more so for open source software. In my opinion, it needed to be said!

    Regards,

    Laurent 

  • Johan

    Hi Laurent,

    We will be including upgrade instructions with the release changelog, so please use those to upgrade.

    We believe we have fixed all the issues reported above in 1.1.32; please see http://support.severalnines.com/entries/21633407-released-clustercontrol-v1-1-32. It would be great to hear if this works better for you.

    Thanks again for your feedback, let us know if there is anything else.

    Best regards,

    Johan

  • Laurent

    Hi Johan,

    Thanks a lot for this information. This is great, but I cannot test the new release at the moment because I need a 32-bit version. Could you please make one available?

    Thanks again.
    Regards,

    Laurent 

  • Johan

    Hi Laurent,

    32-bit versions are now up:

    http://www.severalnines.com/downloads/cmon/

    Are you planning on upgrading to 64-bit arch anytime soon?

    Thank you,

    Johan

  • Laurent

    Hi Johan,

    Thanks for the build and your time.

    Unfortunately we still have 32-bit servers for the moment, but they are planned to be replaced soon.

    Sorry to bother you with the 32-bit build; I would also prefer to use 64-bit instead ...
    Regards, 

    Laurent
