Pre-Failover Actions
Failover is the recovery process that runs within the cluster when the primary database fails: the failure is detected and a standby database takes over as primary, which minimizes downtime. Once the failed database becomes available again, it should be rejoined to the cluster.
repmgr does not automatically rejoin a failed node to the cluster. After a failover, you need to rejoin the failed node manually using the following procedure:
- Switch to the root user.
sudo su -
- Stop the PostgreSQL server.
systemctl stop postgresql-10
- Switch to the postgres user.
su - postgres
- Force the failed node to rejoin as a standby using the following command, replacing {repuser_Password_Here} and {NEW_PRIMARY_IP} with your own values:
PGPASSWORD={repuser_Password_Here} /usr/pgsql-10/bin/repmgr -f /etc/repmgr/10/repmgr.conf -h {NEW_PRIMARY_IP} -U repuser -d postgres standby clone --force-rewind --force
exit
- Start PostgreSQL and run the commands below to register the database as a standby.
systemctl start postgresql-10
su - postgres
/usr/pgsql-10/bin/repmgr standby register --force
exit
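The steps above can be collected into one script. The following is a minimal sketch, not a tested production tool: it assumes it is run as root on the failed node, and the names rejoin_standby, run, REPUSER_PASSWORD and DRY_RUN are hypothetical, introduced only for illustration. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
# Sketch of the rejoin procedure; run as root on the failed node.
# rejoin_standby, run, REPUSER_PASSWORD and DRY_RUN are hypothetical names.
REPMGR=/usr/pgsql-10/bin/repmgr
REPMGR_CONF=/etc/repmgr/10/repmgr.conf

# Print a command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

rejoin_standby() {
    local primary_ip="$1"
    if [ -z "$primary_ip" ]; then
        echo "usage: rejoin_standby NEW_PRIMARY_IP" >&2
        return 1
    fi
    local clone="$REPMGR -f $REPMGR_CONF -h $primary_ip -U repuser -d postgres standby clone --force-rewind --force"

    # Stop the local PostgreSQL server.
    run systemctl stop postgresql-10 || return 1

    # Re-clone from the new primary as the postgres user.
    # The password is masked in the printed line; a password containing
    # a single quote would need extra escaping here.
    echo "+ su - postgres -c 'PGPASSWORD=*** $clone'"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        su - postgres -c "PGPASSWORD='${REPUSER_PASSWORD}' $clone" || return 1
    fi

    # Start PostgreSQL again and register the node as a standby.
    run systemctl start postgresql-10 || return 1
    run su - postgres -c "$REPMGR standby register --force" || return 1
}
```

For example, `rejoin_standby {NEW_PRIMARY_IP}` prints the four commands for review, and `DRY_RUN=0 REPUSER_PASSWORD=... rejoin_standby {NEW_PRIMARY_IP}` runs them, stopping at the first command that fails.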
Troubleshooting:
If the failover does not work, use the following commands on the standby machine:
su - postgres
repmgr standby promote
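After a manual promotion it can help to confirm that the node has actually left recovery. The sketch below assumes psql is on the postgres user's PATH and that local connections are allowed; node_role is a hypothetical helper, not part of repmgr. It maps the result of PostgreSQL's pg_is_in_recovery() function to a role name.

```shell
# Map the output of pg_is_in_recovery() ("t" or "f") to a role name.
# node_role is a hypothetical helper, not part of repmgr.
node_role() {
    case "$1" in
        t) echo "standby" ;;
        f) echo "primary" ;;
        *) echo "unknown" ;;
    esac
}

# Usage, as the postgres user on the promoted node:
#   node_role "$(psql -Atc 'SELECT pg_is_in_recovery()')"
#   /usr/pgsql-10/bin/repmgr -f /etc/repmgr/10/repmgr.conf cluster show
```

A successfully promoted node should report "primary"; `repmgr cluster show` then shows how the rest of the cluster sees it.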