Sunday, 10 September 2023

Postgresql Check Standby Replication

Monitoring postgresql replication is in progress but in not adequate I think. here is some scripts for checking a standby replication. In some cases replication has terminated because of some problems. this scripts give you a glance and you will see the problem about replication. But do not forget to check the log file, you can miss some problems if you do mot check the log file. As I said before postresql views is not adequate.

check overall information about replication on primary

1
select * from pg_stat_replication;


If you are using replication slots then replication slots information view is

1
select * from pg_replication_slots;


check overall information about replication on standby

1
select * from pg_stat_wal_receiver;


check status of replication on standby (version > 10)

1
2
3
4
5
select  pg_is_in_recovery(),
        pg_is_wal_replay_paused(),
        pg_last_wal_receive_lsn(),
        pg_last_wal_replay_lsn(),
        pg_last_xact_replay_timestamp();


check status of replication on standby (version < 10)

1
2
3
4
select  pg_is_in_recovery(),
        pg_last_xlog_receive_location(),
        pg_last_xlog_replay_location(),
        pg_last_xact_replay_timestamp();

run this query for a few times and check if lsn/xlocation number ad replay timestamp is changing.
if it has not change then check the primary log file there could be an error about replication.


–check lag on standby (version >10)

1
2
3
4
SELECT  CASE WHEN pg_last_wal_receive_lsn() = pg_last_wal_replay_lsn()
        THEN 0
        ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp())
        END AS log_delay;


check lag on standby (version <10)

1
2
3
4
SELECT  CASE WHEN pg_last_xlog_receive_location() = pg_last_xlog_replay_location()
        THEN 0
        ELSE EXTRACT (EPOCH FROM now() - pg_last_xact_replay_timestamp())
        END AS log_delay;


check the primary lsn and compare with standby’s last applied lsn


get current lsn from primary

1
2
3
4
SELECT pg_current_wal_lsn();
 
    -[ RECORD 1 ]------+-----------
    pg_current_wal_lsn | 0/1ADE2F80


get last applied lsn from standby

1
2
3
4
5
6
7
8
9
10
11
12
select  pg_is_in_recovery(),
        pg_is_wal_replay_paused(),
        pg_last_wal_receive_lsn(),
        pg_last_wal_replay_lsn(),
        pg_last_xact_replay_timestamp();
         
    -[ RECORD 1 ]-----------------+------------------------------
        pg_is_in_recovery             | t
        pg_is_wal_replay_paused       | f
        pg_last_wal_receive_lsn       | 0/1005EEEF
        pg_last_wal_replay_lsn        | 0/1005EEEF
        pg_last_xact_replay_timestamp | 2021-05-05 13:01:14.328936+00



calculate difference between two lsn from above queries

1
2
3
4
5
6
pg_wal_lsn_diff(lsn pg_lsn, lsn pg_lsn) (version > 10 )
pg_xlog_location_diff (location pg_lsn, location pg_lsn) (version < 10)
 
select pg_wal_lsn_diff('0/1ADE2F80','0/1005EEEF');
    -[ RECORD 1 ]---+---------
    pg_wal_lsn_diff | 181944465


previous query return result in bytes, we can get gap size in human readable like this

1
2
3
4
5
6
7
select round(181944465/pow(1024,3.0),2) missing_lsn_GiB;
    -[ RECORD 1 ]---+-----
    missing_lsn_gib | 0.17
 
select round(181944465/pow(1024,2.0),2) missing_lsn_GiB;
    -[ RECORD 1 ]---+-------
    missing_lsn_gib | 173.52

No comments:

Post a Comment

Master and Slave - Sync check - PostgreSQL

  1) Run the below Query on Primary:- SELECT     pid,     usename,     application_name,     client_addr,     state,     sync_state,     sen...