Failover is the recovery process that takes over when the primary database in the cluster fails. The failure is detected and a standby database is promoted to primary, which minimizes downtime. When the failed database becomes available again, it should be rejoined to the cluster as a standby.

repmgr does not automatically rejoin a failed node to the cluster. After a failover, you must manually rejoin the failed node by running the following procedure on that node:

  1. Switch to root user.
    sudo su -
  2. Stop the PostgreSQL server.
    systemctl stop postgresql-10
  3. Switch to postgres user.
    su - postgres
  4. Force the node to rejoin the cluster as a standby with the following command, then exit back to the root shell:
    PGPASSWORD={repuser_Password_Here} /usr/pgsql-10/bin/repmgr -f /etc/repmgr/10/repmgr.conf -h {NEW_PRIMARY_IP} -U repuser -d postgres standby clone --force-rewind --force
    exit
  5. Start PostgreSQL, then register the node as a standby.
    systemctl start postgresql-10
    su - postgres
    /usr/pgsql-10/bin/repmgr standby register --force
    exit
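
Steps 2 through 5 can be collected into a small wrapper script, which makes the sequence easier to review before running it. The following is a minimal sketch, not a tested production script. The commands, paths and service name are copied from the procedure above. The `DRY_RUN` switch and the function names are hypothetical, and the sketch assumes the `repuser` password is available to the postgres user (for example through `~/.pgpass`) instead of being passed inline with `PGPASSWORD`.

```shell
#!/usr/bin/env bash
# Sketch only: wraps steps 2-5 above. Run as root on the failed node.
set -euo pipefail

# Paths and service name copied from the procedure; override via environment.
REPMGR_BIN="${REPMGR_BIN:-/usr/pgsql-10/bin/repmgr}"
REPMGR_CONF="${REPMGR_CONF:-/etc/repmgr/10/repmgr.conf}"
PG_SERVICE="${PG_SERVICE:-postgresql-10}"
# Hypothetical switch (not part of repmgr): 1 = print commands, 0 = execute.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Stop PostgreSQL, clone from the new primary, restart, and register.
rejoin_standby() {
  local new_primary_ip="$1"
  run systemctl stop "$PG_SERVICE"
  run su - postgres -c "$REPMGR_BIN -f $REPMGR_CONF -h $new_primary_ip -U repuser -d postgres standby clone --force-rewind --force"
  run systemctl start "$PG_SERVICE"
  run su - postgres -c "$REPMGR_BIN standby register --force"
}

# Run only when the new primary's address is passed as the first argument.
if [ "$#" -ge 1 ]; then
  rejoin_standby "$1"
fi
```

Saved under a name of your choice (for example `rejoin.sh`), the script prints the four commands by default when called with {NEW_PRIMARY_IP} as its argument. Setting `DRY_RUN=0` executes them instead.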

Troubleshooting:
If the failover does not take place, promote the standby manually by running the following commands on the standby machine:
    su - postgres
    /usr/pgsql-10/bin/repmgr standby promote
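
After a manual promotion, one quick way to confirm that the node is now acting as primary is to ask PostgreSQL whether it is still in recovery. The sketch below relies on the built-in `pg_is_in_recovery()` function, which returns `f` on a primary and `t` on a standby. The function names `role_from_recovery_flag` and `node_role` are hypothetical, and the sketch assumes it is run as the postgres user with local access to the `postgres` database.

```shell
#!/usr/bin/env bash
# Sketch: report whether the local PostgreSQL instance is a primary or a standby.

# Map the output of pg_is_in_recovery() ("t" or "f") to a role name.
role_from_recovery_flag() {
  case "$1" in
    f) echo "primary" ;;
    t) echo "standby" ;;
    *) echo "unknown" ;;
  esac
}

# Query the local server; prints "unknown" if the query fails.
node_role() {
  local flag
  flag="$(psql -d postgres -Atc 'SELECT pg_is_in_recovery()' 2>/dev/null || true)"
  role_from_recovery_flag "$flag"
}

# Query the server only when called with the argument "check".
if [ "${1:-}" = "check" ]; then
  node_role
fi
```

Running the script with the argument `check` on the promoted node should print `primary` if the promotion succeeded.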