Backup Slave Failed
I have two nodes, one master and one slave, the replica is working correctly. But I have problems to make the backups in the slave node, the error that is generated is:
pg_dump: FATAL: terminating connection due to conflict with recovery
DETAIL: User query might have needed to see row versions that must be removed.
Any ideas?
-
Official comment
Hi Jortiz,
This is a common issue if you are running pg_dump on the hot standby slave, because the table data on the hot standby slave server is modified while a long running query is running. During querying some needed rows might be updated or deleted on primary. As a primary does not know that a query is started on secondary it thinks it can clean up (vacuum) old versions of its rows. Then secondary has to replay this cleanup, and has to forcibly cancel all queries which can use these rows.
There are a number of workarounds for this issue:
1) Set "hot_standby_feedback = on" on the slave node. It prevents VACUUM from removing recently-dead rows and so cleanup conflicts do not occur. However, this could bloat the master.
2) Set the following on the slave:
max_standby_archive_delay = 900s
max_standby_streaming_delay = 900sThis way queries on slaves with a duration less than 900 seconds won't be cancelled. If your workload requires longer queries, just set these options to a higher value.
3) Pause the replication while the backup is ongoing on the slave and resume it after the backup completes:
select pg_xlog_replay_pause(); -- suspend
pg_dump ...; -- your backup query
select pg_xlog_replay_resume(); --resumeRegards,
AshrafComment actions -
Thanks for the reply.
I tried to stop the replication with procedure 3, however an error is reported in the slave host. I investigated this type of error, but until now I can not find the solution, The Postgres version is 10.1.
ERROR: function pg_xlog_replay_pause () does not exist
LINE 1: select pg_xlog_replay_pause ();
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.********** Error **********
ERROR: function pg_xlog_replay_pause () does not exist
SQL state: 42883
Hint: No function matches the given name and argument types. You might need to add explicit type casts.
Character: 8 -
Hi Jortiz,
In PG10, there has been function renames. Check this one https://wiki.postgresql.org/wiki/New_in_postgres_10#Renaming_of_.22xlog.22_to_.22wal.22_Globally_.28and_location.2Flsn.29
So instead, try
select pg_wal_replay_pause(); -- suspend
pg_dump ...; -- your backup query
select pg_wal_replay_resume(); --resume
Please sign in to leave a comment.
Comments
5 comments