MD5SUM Failure During Restore

March 03, 2016 18:15

Every time I am trying to restore a backup I get a failure:

Failed to restore backup. Exit Code: 1	530	03 Mar 2016 11:11:17
md5sum : 29c08f6d9223f44c4dc37cf7fd36a6e0 != e170aa531214fd7e11b934f462a76950 Exit Code: 1	530	03 Mar 2016 11:11:17
Copying /var/backups/BACKUP-149/backup-full-2016-02-21_000012.xbstream.gz to 10.1.105.225:/root/s9s_tmp/1/galera/restore Exit Code: 0	530	03 Mar 2016 11:11:09
Checking disk space on 10.1.105.225 Exit Code: 0

Why is this an issue?

Comments

13 comments

Jason Mallory

March 03, 2016 18:16
This is the same error on every backup
0

Comment actions Permalink
Johan

March 03, 2016 18:52
Run:

On the controller:

md5sum /var/backups/BACKUP-149/backup-full-2016-02-21_000012.xbstream.gz

On the node 10.1.105.225:
md5sum /root/s9s_tmp/1/galera/restore/backup-full-2016-02-21_000012.xbstream.gz
0

Comment actions Permalink
Johan

March 03, 2016 18:52
what do you get?
0

Comment actions Permalink
Jason Mallory

March 03, 2016 18:58
Wrong md5sum. I think it's on our end. Our backups got corrupted somehow
0

Comment actions Permalink
Johan

March 03, 2016 19:07
Are the file size even the same after the copy?

You can also try to unpack the xtrabackup on the controller if you want to see that it unpacks okay.

In this case i recommend you copy it to a temp place:

mkdir /root/temp

cp /var/backups/BACKUP-149/backup-full-2016-02-21_000012.xbstream.gz /root/temp

cd /root/temp

gunzip backup-full-2016-02-21_000012.xbstream.gz

xbtstream -x < backup-full-2016-02-21_000012.xbstream

BR

johan
0

Comment actions Permalink
Jason Mallory

March 03, 2016 19:54
Should we rm -rf /var/lib/mysql/* then move the files back?
0

Comment actions Permalink
Johan

March 03, 2016 20:15
Did you figure out the md5 issue?

If so, i recommend you use the UI.

If not,

cd /root/temp
innobackupex --apply-log ./
rm -Rf /var/lib/mysql/*
innobackupex --copy-back ./
chown mysql:mysql -R /var/lib/mysql

BR
johan
0

Comment actions Permalink
Jason Mallory

March 03, 2016 20:35
The UI still fails but expanding the files all worked
0

Comment actions Permalink
Jason Mallory

March 03, 2016 20:36
I expanded the data on a cluster node under /var/tmp
0

Comment actions Permalink
Jason Mallory

March 03, 2016 21:27
2016-03-03 21:02:42 7ef7c8dfa700 InnoDB: Error: page 122 log sequence number 326161282961 InnoDB: is in the future! Current system log sequence number 325435368112. InnoDB: Your database may be corrupt or you may have copied the InnoDB InnoDB: tablespace but not the InnoDB log files. See InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html InnoDB: for more information. 2016-03-03 21:02:42 26427 [Note] WSREP: inited wsrep sidno 1 2016-03-03 21:02:42 26427 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. 2016-03-03 21:02:42 26427 [Note] WSREP: REPL Protocols: 7 (3, 2) 2016-03-03 21:02:42 26427 [Note] WSREP: Service thread queue flushed. 2016-03-03 21:02:42 26427 [Note] WSREP: Assign initial position for certification: 0, protocol version: 3 2016-03-03 21:02:42 26427 [Note] WSREP: Service thread queue flushed. 2016-03-03 21:02:42 26427 [Note] WSREP: Synchronized with group, ready for connections 2016-03-03 21:02:42 26427 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification. 2016-03-03 21:02:42 7ef7c8dfa700 InnoDB: Error: page 783 log sequence number 326160894603 InnoDB: is in the future! Current system log sequence number 325435368112. InnoDB: Your database may be corrupt or you may have copied the InnoDB InnoDB: tablespace but not the InnoDB log files. See InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html InnoDB: for more information. 2016-03-03 21:02:42 7ef7c8dfa700 InnoDB: Error: page 308 log sequence number 326160894609 InnoDB: is in the future! Current system log sequence number 325435368112. InnoDB: Your database may be corrupt or you may have copied the InnoDB InnoDB: tablespace but not the InnoDB log files. See InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html InnoDB: for more information. 2016-03-03 21:02:42 7ef7c8dfa700 InnoDB: Error: page 539 log sequence number 326160894609 InnoDB: is in the future! Current system log sequence number 325435368112. InnoDB: Your database may be corrupt or you may have copied the InnoDB InnoDB: tablespace but not the InnoDB log files. See InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html InnoDB: for more information. 2016-03-03 21:02:42 7ef7c8dfa700 InnoDB: Error: page 307 log sequence number 326160894615 InnoDB: is in the future! Current system log sequence number 325435368112. InnoDB: Your database may be corrupt or you may have copied the InnoDB InnoDB: tablespace but not the InnoDB log files. See InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html InnoDB: for more information. 2016-03-03 21:02:42 7ef7c8dfa700 InnoDB: Error: page 556 log sequence number 326160894615 InnoDB: is in the future! Current system log sequence number 325435368112. InnoDB: Your database may be corrupt or you may have copied the InnoDB InnoDB: tablespace but not the InnoDB log files. See InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html InnoDB: for more information.
0

Comment actions Permalink
Jason Mallory

March 03, 2016 21:43
10.1.105.227: last 100 lines of mysql error log:
2016-03-03 21:39:29 11145 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
2016-03-03 21:39:30 11145 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 10.1.105.227; base_port = 4567; cert.log_conflicts = no; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 65536M; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = fals
2016-03-03 21:39:30 11145 [Note] WSREP: Service thread queue flushed.
2016-03-03 21:39:30 11145 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
2016-03-03 21:39:30 11145 [Note] WSREP: wsrep_sst_grab()
2016-03-03 21:39:30 11145 [Note] WSREP: Start replication
2016-03-03 21:39:30 11145 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
2016-03-03 21:39:30 11145 [Note] WSREP: protonet asio version 0
2016-03-03 21:39:30 11145 [Note] WSREP: Using CRC-32C for message checksums.
2016-03-03 21:39:30 11145 [Note] WSREP: backend: asio
2016-03-03 21:39:30 11145 [Warning] WSREP: access file(/var/lib/mysql//gvwstate.dat) failed(No such file or directory)
2016-03-03 21:39:30 11145 [Note] WSREP: restore pc from disk failed
2016-03-03 21:39:30 11145 [Note] WSREP: GMCast version 0
2016-03-03 21:39:30 11145 [Note] WSREP: (692f6df9, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
2016-03-03 21:39:30 11145 [Note] WSREP: (692f6df9, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
2016-03-03 21:39:30 11145 [Note] WSREP: EVS version 0
2016-03-03 21:39:30 11145 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer '10.1.105.225:,10.1.105.226:,10.1.105.227:'
2016-03-03 21:39:30 11145 [Warning] WSREP: (692f6df9, 'tcp://0.0.0.0:4567') address 'tcp://10.1.105.227:4567' points to own listening address, blacklisting
2016-03-03 21:39:30 11145 [Note] WSREP: (692f6df9, 'tcp://0.0.0.0:4567') address 'tcp://10.1.105.227:4567' pointing to uuid 692f6df9 is blacklisted, skipping
2016-03-03 21:39:30 11145 [Note] WSREP: (692f6df9, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
2016-03-03 21:39:30 11145 [Note] WSREP: declaring ab0a91ee at tcp://10.1.105.225:4567 stable
2016-03-03 21:39:30 11145 [Note] WSREP: Node ab0a91ee state prim
2016-03-03 21:39:30 11145 [Note] WSREP: view(view_id(PRIM,692f6df9,702) memb {
   692f6df9,0
   ab0a91ee,0
} joined {
} left {
} partitioned {
})
2016-03-03 21:39:30 11145 [Note] WSREP: save pc into disk
2016-03-03 21:39:30 11145 [Note] WSREP: discarding pending addr without UUID: tcp://10.1.105.226:4567
2016-03-03 21:39:31 11145 [Note] WSREP: gcomm: connected
2016-03-03 21:39:31 11145 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
2016-03-03 21:39:31 11145 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
2016-03-03 21:39:31 11145 [Note] WSREP: Opened channel 'my_wsrep_cluster'
2016-03-03 21:39:31 11145 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 2
2016-03-03 21:39:31 11145 [Note] WSREP: Waiting for SST to complete.
2016-03-03 21:39:31 11145 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 69c8e4ca-e188-11e5-83ef-fee446918998
2016-03-03 21:39:31 11145 [Note] WSREP: STATE EXCHANGE: sent state msg: 69c8e4ca-e188-11e5-83ef-fee446918998
2016-03-03 21:39:31 11145 [Note] WSREP: STATE EXCHANGE: got state msg: 69c8e4ca-e188-11e5-83ef-fee446918998 from 0 (10.1.105.227)
2016-03-03 21:39:31 11145 [Note] WSREP: STATE EXCHANGE: got state msg: 69c8e4ca-e188-11e5-83ef-fee446918998 from 1 (10.1.105.225)
2016-03-03 21:39:31 11145 [Note] WSREP: Quorum results:
   version    = 3,
   component = PRIMARY,
   conf_id    = 19,
   members    = 1/2 (joined/total),
   act_id     = 0,
   last_appl. = -1,
   protocols = 0/7/3 (gcs/repl/appl),
   group UUID = 3966b21b-e183-11e5-9a02-122b2c60a692
2016-03-03 21:39:31 11145 [Note] WSREP: Flow-control interval: [23, 23]
2016-03-03 21:39:31 11145 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 0)
2016-03-03 21:39:31 11145 [Note] WSREP: State transfer required:
   Group state: 3966b21b-e183-11e5-9a02-122b2c60a692:0
   Local state: 00000000-0000-0000-0000-000000000000:-1
2016-03-03 21:39:31 11145 [Note] WSREP: New cluster view: global state: 3966b21b-e183-11e5-9a02-122b2c60a692:0, view# 20: Primary, number of nodes: 2, my index: 0, protocol version 3
2016-03-03 21:39:31 11145 [Warning] WSREP: Gap in state sequence. Need state transfer.
2016-03-03 21:39:31 11145 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.1.105.227' --auth 'root:PASSWORD' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --parent '11145' '' '
WSREP_SST: [INFO] Streaming with xbstream (20160303 21:39:31.557)
WSREP_SST: [INFO] Using socat as streamer (20160303 21:39:31.559)
WSREP_SST: [INFO] Evaluating timeout -s9 100 socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20160303 21:39:31.579)
2016-03-03 21:39:31 11145 [Note] WSREP: Prepared SST request: xtrabackup-v2|10.1.105.227:4444/xtrabackup_sst//1
2016-03-03 21:39:31 11145 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-03-03 21:39:31 11145 [Note] WSREP: REPL Protocols: 7 (3, 2)
2016-03-03 21:39:31 11145 [Note] WSREP: Service thread queue flushed.
2016-03-03 21:39:31 11145 [Note] WSREP: Assign initial position for certification: 0, protocol version: 3
2016-03-03 21:39:31 11145 [Note] WSREP: Service thread queue flushed.
2016-03-03 21:39:31 11145 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (3966b21b-e183-11e5-9a02-122b2c60a692): 1 (Operation not permitted)
   at galera/src/replicator_str.cpp:prepare_for_IST():456. IST will be unavailable.
2016-03-03 21:39:31 11145 [Note] WSREP: Member 0.0 (10.1.105.227) requested state transfer from '*any*'. Selected 1.0 (10.1.105.225)(SYNCED) as donor.
2016-03-03 21:39:31 11145 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 0)
2016-03-03 21:39:31 11145 [Note] WSREP: Requesting state transfer: success, donor: 1
WSREP_SST: [INFO] Evaluating socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20160303 21:39:32.240)
WSREP_SST: [INFO] Proceeding with SST (20160303 21:39:32.251)
WSREP_SST: [INFO] Cleaning the existing datadir and innodb-data/log directories (20160303 21:39:32.253)
WSREP_SST: [INFO] Waiting for SST streaming to complete! (20160303 21:39:32.264)
2016-03-03 21:39:33 11145 [Note] WSREP: (692f6df9, 'tcp://0.0.0.0:4567') turning message relay requesting off
WSREP_SST: [ERROR] xtrabackup_checkpoints missing, failed innobackupex/SST on donor (20160303 21:39:42.507)
WSREP_SST: [ERROR] Cleanup after exit with status:2 (20160303 21:39:42.509)
2016-03-03 21:39:42 11145 [Warning] WSREP: 1.0 (10.1.105.225): State transfer to 0.0 (10.1.105.227) failed: -22 (Invalid argument)
2016-03-03 21:39:42 11145 [ERROR] WSREP: gcs/src/gcs_group.cpp:int gcs_group_handle_join_msg(gcs_group_t*, const gcs_recv_msg_t*)():731: Will never receive state. Need to abort.
2016-03-03 21:39:42 11145 [Note] WSREP: gcomm: terminating thread
2016-03-03 21:39:42 11145 [Note] WSREP: gcomm: joining thread
2016-03-03 21:39:42 11145 [Note] WSREP: gcomm: closing backend
2016-03-03 21:39:42 11145 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '10.1.105.227' --auth 'root:PASSWORD' --datadir '/var/lib/mysql/' --defaults-f

Exit Code: 0
0

Comment actions Permalink
Johan

March 04, 2016 05:13
You never answered the question about the file size.

Was it the same on the controller and on the node?

CMON executes md5sum on the source file and once copied on the target file and makes a comparison.

I dont understand the above log snippets, since i dont know what you have done or what you want to do.

BR

johan
0

Comment actions Permalink
Jason Mallory

March 04, 2016 21:33
Here is where we are:

We did the restore in the method you suggested to node 1. MySQL will not start. There is corruption. Size of backups were consistent, not sure why MD5SUM was off. However, we still cannot restore a cluster:

Restored /var/backups/BACKUP-152/backup-full-2016-03-03_230526.xbstream.gz on 10.1.105.225 : InnoDB: Your database may be corrupt or you may have copied the InnoDB InnoDB: tablespace but not the InnoDB log files. See InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html InnoDB: for more information. 2016-03-04 19:31:40 7fa6f95e8720 InnoDB: Error: page 0 log sequence number 326070346149 InnoDB: is in the future! Current system log sequence number 325436234011. InnoDB: Your database may be corrupt or you may have copied the InnoDB InnoDB: tablespace but not the InnoDB log files. See InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html InnoDB: for more information. 2016-03-04 19:31:40 7fa6f95e8720 InnoDB: Error: page 0 log sequence number 326070359069 InnoDB: is in the future! Current system log sequence number 325436234011. InnoDB: Your database may be corrupt or you may have copied the InnoDB InnoDB: tablespace but not the InnoDB log files. See InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html InnoDB: for more information. 2016-03-04 19:31:40 7fa6f95e8720 InnoDB: Error: page 0 log sequence number 326070374727 InnoDB: is in the future! Current system log sequence number 325436234011. InnoDB: Your database may be corrupt or you may have copied the InnoDB InnoDB: tablespace but not the InnoDB log files. See InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html InnoDB: for more information. 2016-03-04 19:31:40 7fa6f95e8720 InnoDB: Error: page 0 log sequence number 326070390613 InnoDB: is in the future! Current system log sequence number 325436234011. InnoDB: Your database may be corrupt or you may have copied the InnoDB InnoDB: tablespace but not the InnoDB log files. See InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html InnoDB: for more information. 2016-03-04 19:31:40 7fa6f95e8720 InnoDB: Error: page 0 log sequence number 326070401849 InnoDB: is in the future! Current system log sequence number 325436234011. InnoDB: Your database may be corrupt or you may have copied the InnoDB InnoDB: tablespace but not the InnoDB log files. See InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html InnoDB: for more information. 2016-03-04 19:31:40 7fa6f95e8720 InnoDB: Error: page 0 log sequence number 326070418043 InnoDB: is in the future! Current system log sequence number 325436234011. InnoDB: Your database may be corrupt or you may have copied the InnoDB InnoDB: tablespace but not the InnoDB log files. See InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html InnoDB: for more information.
0

Comment actions Permalink

Please sign in to leave a comment.

Failed to restore backup.

md5sum : 29c08f6d9223f44c4dc37cf7fd36a6e0 != e170aa531214fd7e11b934f462a76950

Copying /var/backups/BACKUP-149/backup-full-2016-02-21_000012.xbstream.gz to 10.1.105.225:/root/s9s_tmp/1/galera/restore

Checking disk space on 10.1.105.225

Comments

Didn't find what you were looking for?