Showing posts with label Replication. Show all posts
Showing posts with label Replication. Show all posts

Sunday, 11 August 2024

Switchover and Switchback - PostgreSQL-13

 

Below are setup details and the same will be used in this demonstration.

Sr No.HostnameIPRole
1postgres@rac09-p192.168.1.43Master / Primary Server
2postgres@rac10-p192.168.1.44Standby / Secondary Server


Basic Crosscheck :

postgres=# show listen_addresses;
 listen_addresses
------------------
 *
(1 row)

postgres=# show wal_level;
 wal_level
-----------
 replica
(1 row)

postgres=# show max_wal_senders;
 max_wal_senders
-----------------
 10
(1 row)

postgres=# show max_replication_slots;
 max_replication_slots
-----------------------
 10
(1 row)

postgres=# show max_wal_size;
 max_wal_size
--------------
 1GB
(1 row)


postgres=# show wal_log_hints;
 wal_log_hints
---------------
 off
(1 row)
postgres=#
postgres=# show archive_mode;
 archive_mode
--------------
 on
(1 row)



postgres=# show archive_command;
                                archive_command
-------------------------------------------------------------------------------
 test ! -f /backup/PostgreSQL13_arch/%f && cp %p //backup/PostgreSQL13_arch/%f
(1 row)



1. Check whether replication process and the sync status:

-- Run on master server(192.168.1.43 )

postgres=# \x
Expanded display is on.
postgres=#
postgres=#
postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+---------------------------------
pid              | 16331
usesysid         | 16385
usename          | repuser
application_name | walreceiver
client_addr      | 192.168.1.44
client_hostname  |
client_port      | 23210
backend_start    | 2024-08-11 13:06:18.175803+05:30
backend_xmin     |
state            | streaming
sent_lsn         | 0/4019BC0
write_lsn        | 0/4019BC0
flush_lsn        | 0/4019BC0
replay_lsn       | 0/4019BC0
write_lag        |
flush_lag        |
replay_lag       |
sync_priority    | 0
sync_state       | async
reply_time       | 2024-08-11 18:50:57.943598+05:30




--Check the lag difference


postgres=# SELECT pg_wal_lsn_diff(sent_lsn, replay_lsn) from pg_stat_replication;
-[ RECORD 1 ]---+--
pg_wal_lsn_diff | 0



2. Check Replication Status on the secondary server:-

Now the replication is up and running. You can get the status of your replication from the pg_stat_wal_receiver table on the secondary server.

postgres=# SELECT "status", "last_msg_send_time", "slot_name", "sender_host" FROM pg_stat_wal_receiver;

-[ RECORD 1 ]------+---------------------------------
status             | streaming
last_msg_send_time | 2024-08-11 18:39:26.562249+05:30
slot_name          |
sender_host        | 192.168.1.43





postgres=# select pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn(), pg_last_xact_replay_timestamp();
-[ RECORD 1 ]-----------------+---------------------------------
pg_last_wal_receive_lsn       | 0/4019BC0
pg_last_wal_replay_lsn        | 0/4019BC0
pg_last_xact_replay_timestamp | 2024-08-11 18:14:17.652363+05:30







3. Check the status of Recovery process on Slave database.



postgres=#  select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 t
(1 row)




postgres=#
postgres=# select pg_current_wal_lsn();
ERROR:  recovery is in progress
HINT:  WAL control functions cannot be executed during recovery.
postgres=#
postgres=#
postgres=#


As our database is running in recovery process hence it is not allowing us to see the current WAL lsn number.





Now we will proceed with switchover

2. Shutdown the MASTER: [ ON MASTER SERVER 192.168.1.43 ]

[postgres@rac09-p ~]$ /usr/pgsql-13/bin/pg_ctl stop -D /var/lib/pgsql/13/data
waiting for server to shut down.... done
server stopped



2024-08-11 19:04:13.852 IST [24646] FATAL:  could not connect to the primary server: could not connect to server: Connection refused
                Is the server running on host "192.168.1.43" and accepting
                TCP/IP connections on port 5432?
2024-08-11 19:04:18.854 IST [24647] FATAL:  could not connect to the primary server: could not connect to server: Connection refused
                Is the server running on host "192.168.1.43" and accepting
                TCP/IP connections on port 5432?
2024-08-11 19:04:23.856 IST [24648] FATAL:  could not connect to the primary server: could not connect to server: Connection refused
                Is the server running on host "192.168.1.43" and accepting
                TCP/IP connections on port 5432?
2024-08-11 19:04:28.862 IST [24657] FATAL:  could not connect to the primary server: could not connect to server: Connection refused
                Is the server running on host "192.168.1.43" and accepting
                TCP/IP connections on port 5432?
2024-08-11 19:04:33.865 IST [24658] FATAL:  could not connect to the primary server: could not connect to server: Connection refused
                Is the server running on host "192.168.1.43" and accepting
                TCP/IP connections on port 5432?





3. Promote slave as new master:[ RUN ON SLAVE SERVER 192.168.1.44 ]


[postgres@rac10-p ~]$
[postgres@rac10-p ~]$ /usr/pgsql-13/bin/pg_ctl promote -D /var/lib/pgsql/13/data
waiting for server to promote.... done
server promoted



2024-08-11 19:05:41.325 IST [13005] LOG:  received promote request
2024-08-11 19:05:41.325 IST [13005] LOG:  redo done at 0/5000028
2024-08-11 19:05:41.325 IST [13005] LOG:  last completed transaction was at log time 2024-08-11 18:14:17.652363+05:30
2024-08-11 19:05:41.328 IST [13005] LOG:  selected new timeline ID: 2
2024-08-11 19:05:41.361 IST [13005] LOG:  archive recovery complete
2024-08-11 19:05:41.367 IST [13003] LOG:  database system is ready to accept connections


We can see it has been opened with read , write with a new timeline id. But replication is broken. i.e currently we have only master server, no slave server.



So for the switchover to successful, we need to make old master as slave




4. On OLD MASTER( i.e NEW SLAVE 192.168.1.43)

 Create standby.signal on Old Primary

Create the standby.signal file on the old primary to ensure it starts as a standby when brought back online:


-- Create a standby.signal file(empty)

[postgres@rac09-p ~]$ touch /var/lib/pgsql/13/data/standby.signal


Edit postgresql.auto.conf on Old Primary - 192.168.1.43 -VIMP
[postgres@rac09-p data]$ cat postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
archive_mode = 'on'
archive_command = 'test ! -f /backup/PostgreSQL13_arch/%f && cp %p //backup/PostgreSQL13_arch/%f'
listen_addresses = '*'
primary_conninfo = 'user=repuser host=192.168.1.44 port=5432 sslmode=prefer sslcompression=0 gssencmode=prefer krbsrvname=postgres target_session_attrs=any'




5 . START NEW SLAVE ( 10.20.30.6):


/usr/pgsql-13/bin/pg_ctl start -D /var/lib/pgsql/13/data

[postgres@rac09-p data]$ /usr/pgsql-13/bin/pg_ctl start -D /var/lib/pgsql/13/data
waiting for server to start....2024-08-11 19:12:26.340 IST [32583] LOG:  redirecting log output to logging collector process
2024-08-11 19:12:26.340 IST [32583] HINT:  Future log output will appear in directory "log".
 done
server started


Monitor Logs

Keep an eye on the logs to monitor the status of the database. Use the following command to view the logs:
The logs will have something like “Starting as a standby”


2024-08-11 19:42:46.949 IST [33947] LOG: starting PostgreSQL 13.16 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-22), 64-bit

2024-08-11 19:42:46.950 IST [33947] LOG: listening on IPv4 address "0.0.0.0", port 5432

2024-08-11 19:42:46.950 IST [33947] LOG: listening on IPv6 address "::", port 5432

2024-08-11 19:42:46.953 IST [33947] LOG: listening on Unix socket "/run/postgresql/.s.PGSQL.5432"

2024-08-11 19:42:46.955 IST [33947] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"

2024-08-11 19:42:46.958 IST [33949] LOG: database system was shut down in recovery at 2024-08-11 19:29:38 IST

2024-08-11 19:42:46.958 IST [33949] LOG: entering standby mode

2024-08-11 19:42:46.961 IST [33949] LOG: consistent recovery state reached at 0/50000A0

2024-08-11 19:42:46.961 IST [33949] LOG: invalid record length at 0/50000A0: wanted 24, got 0

2024-08-11 19:42:46.961 IST [33947] LOG: database system is ready to accept read only connections

2024-08-11 19:42:48.164 IST [33953] LOG: fetching timeline history file for timeline 2 from primary server

2024-08-11 19:42:48.167 IST [33953] LOG: started streaming WAL from primary at 0/5000000 on timeline 1

2024-08-11 19:42:48.168 IST [33953] LOG: replication terminated by primary server

2024-08-11 19:42:48.168 IST [33953] DETAIL: End of WAL reached on timeline 1 at 0/50000A0.

2024-08-11 19:42:48.169 IST [33949] LOG: new target timeline is 2

2024-08-11 19:42:48.170 IST [33953] LOG: restarted WAL streaming at 0/5000000 on timeline 2

2024-08-11 19:42:48.202 IST [33949] LOG: redo starts at 0/50000A0



6 . Check whether master-slave replication is working fine or not:

Verify switchover

Once the configurations are completed it’s imported to verify things. We will use few methods to verify the switchover process.

  1. Verify the switchover using the available views.
    First we will check if the server is receiving wals or not run below query on new slave:
    select * from pg_stat_wal_receiver;
    select pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn(), pg_last_xact_replay_timestamp()
    ;



postgres=# select * from pg_stat_wal_receiver;

-[ RECORD 1 ]---------+----------------------------------------------------------------------------------------------------------------------------------------

pid                   | 33953

status                | streaming

receive_start_lsn     | 0/5000000

receive_start_tli     | 2

written_lsn           | 0/50032A0

flushed_lsn           | 0/50032A0

received_tli          | 2

last_msg_send_time    | 2024-08-11 22:38:05.809966+05:30

last_msg_receipt_time | 2024-08-11 22:38:05.950297+05:30

latest_end_lsn        | 0/50032A0

latest_end_time       | 2024-08-11 22:34:35.554242+05:30

slot_name             |

sender_host           | 192.168.1.44

sender_port           | 5432

conninfo              | user=repuser host=192.168.1.44 port=5432 sslmode=prefer sslcompression=0 gssencmode=prefer krbsrvname=postgres target_session_attrs=any



postgres=# select pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn(), pg_last_xact_replay_timestamp();

-[ RECORD 1 ]-----------------+---------------------------------

pg_last_wal_receive_lsn       | 0/50032A0

pg_last_wal_replay_lsn        | 0/50032A0

pg_last_xact_replay_timestamp | 2024-08-11 22:29:28.892188+05:30



Verify by matching the tables in both DB’s
Log into the new live (Primary) database and create a testing schema, tables and insert some records for the verify.

CREATE TABLE test_schema.test_table (
id serial PRIMARY KEY,
name varchar(100),
age int
);

INSERT INTO test_schema.test_table (name, age) VALUES
('Alice', 28),
('Bob', 32),
('Carol', 24);

Now log into the new Slave server and verify the inserted records, the records should be visible.

3. Check the new slave is Read-only or not (It should be read-only)

SELECT pg_is_in_recovery();
This query will return a single boolean value:
true if the server is a standby and false if it's the primary.


postgres=# SELECT pg_is_in_recovery();

-[ RECORD 1 ]-----+--

pg_is_in_recovery | t




Conclusion

By following these steps, you can successfully perform a switchover in PostgreSQL, changing the IP addresses involved and ensuring a seamless transition from the old primary to the new standby.

Remember that a well-managed PostgreSQL environment requires regular monitoring, maintenance, and backups. With this guide, you’re now equipped to build a robust and highly available PostgreSQL setup that can handle your organization’s critical data with confidence.





Configuring Streaming Replication in PostgreSQL 13

 
Table of Contents

1. Introducing Streaming Replication

2. Prepare the environment

3. Execute on Server Master
-- Put PostgreSQL in Archive log mode
-- Parameter wal_level=replica
-- Network configuration between primary and standby

4. Execute on Slave Server
-- Backup and restore 
-- Start database

5. Check the results


================================================

1. Introducing Streaming Replication

Streaming Replication is a feature that helps you build a backup database system for PostgreSQL database. Moreover, it also allows you to query data on the backup database, helping to reduce the load on the main database.

Streaming Replication works by transferring WAL files from the primary database (or master) to the standby database (or slave). Then, apply these WAL files to the standby database. In the documents, this process is often called recovery, apply or replay.


2. Prepare the environment

My simulation environment has 2 servers both with PostgreSQL installed.

1. Server master:

  • IP: 192.168.1.43
  • Operating System: Red Hat Enterprise Linux Server release 8
  • PostgreSQL version: 13

2. Server slave:

  • IP: 192.168.1.44
  • Operating System: Red Hat Enterprise Linux Server release 8
  • PostgreSQL version: 13




Basic Check:-

postgres=# show config_file;
              config_file
----------------------------------------
 /var/lib/pgsql/13/data/postgresql.conf
(1 row)


postgres=# show hba_file ;
              hba_file
------------------------------------
 /var/lib/pgsql/13/data/pg_hba.conf
(1 row)


3. Execute on Server Master

1. PostgreSQL database put in archive mode

Check if the database is in archive mode

# show archive_mode ;
 archive_mode 
--------------
 on
(1 row)

If not, please put the database into Archive mode according to the instructions below.


2. Parameter wal_level=replica

Check the wal_level parameter again to see if the value is already replica.

# show wal_level ;
 wal_level 
-----------
 replica
(1 row)

If not, run the following command to change the parameter:

alter system set wal_level=replica;

And remember to restart the instance for the new value to take effect.

pg_ctl restart
[postgres@rac09-p data]$ pg_ctl -D /var/lib/pgsql/13/data start
waiting for server to start....2024-08-11 11:44:56.305 IST [12573] LOG:  redirecting log output to logging collector process
2024-08-11 11:44:56.305 IST [12573] HINT:  Future log output will appear in directory "log".
 done
server started



3. Network configuration between primary

Create user to sync

postgresql# CREATE USER repuser WITH REPLICATION PASSWORD 'repuser';



Then, you edit the pg_hba.conf file to allow standby to connect to primary via user replication.

vi pg_hba.conf

and add the following line:

host  replication     replication     192.168.1.44/32         trust

[postgres@rac09-p data]$ cat pg_hba.conf
# replication privilege.
local   replication      all                                                   peer
host    replication     all             127.0.0.1/32                   scram-sha-256
host    replication     all             ::1/128                            scram-sha-256
host    replication     repuser     192.168.1.44/32        trust



After editing the pg_hba.conf file , remember to run the following command for the changes to take effect.

pg_ctl reload

Note: The reload option only reloads the parameter values ​​and does not restart the database, so it does not affect any ongoing transactions on the database.


[postgres@rac09-p ~]$ psql -c "alter system set listen_addresses to '*'"

ALTER SYSTEM



4. Execute on Slave Server


Making a Base Backup to Bootstrap the Standby Server (Slave)
- You need to make a base backup of the master server from the standby server, this helps to bootstrap the standby server.

Stop the PostgreSQL severvice 


[postgres@rac10-p data]$ pg_ctl stop -D /var/lib/pgsql/13/data

[postgres@rac10-p ~]$

[postgres@rac10-p 13]$ pwd
/var/lib/pgsql/13
[postgres@rac10-p 13]$ /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data stop
waiting for server to shut down.... done
server stopped


[postgres@rac10-p ~]$ cd $PGDATA
[postgres@rac10-p data]$ ls -lrt
total 132
drwx------. 2 postgres postgres  4096 Aug 11 00:11 pg_twophase
drwx------. 2 postgres postgres  4096 Aug 11 00:11 pg_snapshots
drwx------. 2 postgres postgres  4096 Aug 11 00:11 pg_serial
drwx------. 2 postgres postgres  4096 Aug 11 00:11 pg_notify
drwx------. 2 postgres postgres  4096 Aug 11 00:11 pg_dynshmem
drwx------. 2 postgres postgres  4096 Aug 11 00:11 pg_commit_ts
-rw-------. 1 postgres postgres     3 Aug 11 00:11 PG_VERSION
drwx------. 2 postgres postgres  4096 Aug 11 00:11 pg_tblspc
drwx------. 2 postgres postgres  4096 Aug 11 00:11 pg_replslot
drwx------. 4 postgres postgres  4096 Aug 11 00:11 pg_multixact
-rw-------. 1 postgres postgres 28086 Aug 11 00:11 postgresql.conf
-rw-------. 1 postgres postgres    88 Aug 11 00:11 postgresql.auto.conf
-rw-------. 1 postgres postgres  1636 Aug 11 00:11 pg_ident.conf
-rw-------. 1 postgres postgres  4548 Aug 11 00:11 pg_hba.conf
drwx------. 2 postgres postgres  4096 Aug 11 00:11 pg_xact
drwx------. 3 postgres postgres  4096 Aug 11 00:11 pg_wal
drwx------. 2 postgres postgres  4096 Aug 11 00:11 pg_subtrans
drwx------. 5 postgres postgres  4096 Aug 11 00:11 base
drwx------. 2 postgres postgres  4096 Aug 11 00:12 log
-rw-------. 1 postgres postgres    30 Aug 11 08:52 current_logfiles
-rw-------. 1 postgres postgres    58 Aug 11 08:52 postmaster.opts
-rw-------. 1 postgres postgres    99 Aug 11 08:52 postmaster.pid
drwx------. 2 postgres postgres  4096 Aug 11 08:52 pg_stat
drwx------. 2 postgres postgres  4096 Aug 11 08:53 global
drwx------. 4 postgres postgres  4096 Aug 11 08:57 pg_logical
drwx------. 2 postgres postgres  4096 Aug 11 12:51 pg_stat_tmp
[postgres@rac10-p data]$ cd ..
[postgres@rac10-p 13]$ ls -lrt
total 12
drwx------.  2 postgres postgres 4096 Aug  9 03:25 backups
-rw-------.  1 postgres postgres  920 Aug 11 00:11 initdb.log
drwx------. 20 postgres postgres 4096 Aug 11 08:52 data
[postgres@rac10-p 13]$


I have already "data" folder in my server so just moving existing data folder to data_bkp

[postgres@rac10-p 13]$ mv data data_bkp

Create DATA directory once again :-

[postgres@rac10-p 13]$
[postgres@rac10-p 13]$ mkdir -p data/
[postgres@rac10-p 13]$ chmod 700 data/
[postgres@rac10-p 13]$ chown -R postgres:postgres data/
[postgres@rac10-p 13]$
[postgres@rac10-p 13]$



6) 
Then user the pg_basebackup tool to take the base backup with the right ownership.
(the database system under i.e. postgres, within the postgres user account) and with the right permissions.


Below command is  used to setting up the Replication 

[postgres@rac10-p bin]$ pg_basebackup -h 192.168.1.43 -U repuser -p 5432 -D /var/lib/pgsql/13/data -Fp -Xs -P -R

25254/25254 kB (100%), 1/1 tablespace


With above command, Slave server will connect to the Master (Primary) server and will copy all the data directory from Master server. The connection will be trough a PostgreSQL database user, in our case this user is repuser. And at the end of the command we created the replication slot. What is a replication slot? Basically, the Master server is able to delete the WAL files if it is not needed anymore. But maybe the Slave server did not receive and applied it yet. In this case, the replication will fail. But if we have the replication slot configured, this process will not let the Master server delete the WAL files which did not receive by Slave server yet. At the end the standby.signal file has been created which make it the replica.


Crosscheck the data with is restored or not ..

[postgres@rac10-p bin]$ cd /var/lib/pgsql/13/data
[postgres@rac10-p data]$ ls -lrt
total 268
-rw-------. 1 postgres postgres    224 Aug 11 13:03 backup_label
-rw-------. 1 postgres postgres  28086 Aug 11 13:03 postgresql.conf
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_twophase
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_snapshots
drwx------. 4 postgres postgres   4096 Aug 11 13:03 pg_multixact
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_commit_ts
drwx------. 3 postgres postgres   4096 Aug 11 13:03 pg_wal
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_notify
-rw-------. 1 postgres postgres   4619 Aug 11 13:03 pg_hba.conf
drwx------. 2 postgres postgres   4096 Aug 11 13:03 log
drwx------. 5 postgres postgres   4096 Aug 11 13:03 base
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_xact
-rw-------. 1 postgres postgres      3 Aug 11 13:03 PG_VERSION
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_tblspc
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_subtrans
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_stat_tmp
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_stat
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_serial
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_replslot
drwx------. 4 postgres postgres   4096 Aug 11 13:03 pg_logical
-rw-------. 1 postgres postgres   1636 Aug 11 13:03 pg_ident.conf
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_dynshmem
drwx------. 2 postgres postgres   4096 Aug 11 13:03 global
-rw-------. 1 postgres postgres     30 Aug 11 13:03 current_logfiles
-rw-------. 1 postgres postgres      0 Aug 11 13:03 standby.signal
-rw-------. 1 postgres postgres    478 Aug 11 13:03 postgresql.auto.conf
-rw-------. 1 postgres postgres 135417 Aug 11 13:03 backup_manifest




[postgres@rac10-p data]$ cat postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
archive_mode = 'on'
archive_command = 'test ! -f /backup/PostgreSQL13_arch/%f && cp %p //backup/PostgreSQL13_arch/%f'
listen_addresses = '*'
primary_conninfo = 'user=repuser passfile=''/var/lib/pgsql/.pgpass'' channel_binding=prefer host=192.168.1.43 port=5432 sslmode=prefer sslcompression=0 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres target_session_attrs=any'



Check the PostgreSQL service start or not ..

[postgres@rac10-p data]$ ps -ef| grep postgres
root       10000    9940  0 11:48 pts/0    00:00:00 su - postgres
postgres   10001   10000  0 11:48 pts/0    00:00:00 -bash
postgres   12991   10001  0 13:06 pts/0    00:00:00 ps -ef
postgres   12992   10001  0 13:06 pts/0    00:00:00 grep --color=auto postgres
[postgres@rac10-p data]$


Start the PostgreSQL service as below:-

[postgres@rac10-p data]$ pg_ctl start -D /var/lib/pgsql/13/data
waiting for server to start....2024-08-11 13:06:17.934 IST [13003] LOG:  redirecting log output to logging collector process
2024-08-11 13:06:17.934 IST [13003] HINT:  Future log output will appear in directory "log".
 done
server started



[postgres@rac10-p data]$ ps -ef| grep postgres
root       10000    9940  0 11:48 pts/0    00:00:00 su - postgres
postgres   10001   10000  0 11:48 pts/0    00:00:00 -bash
postgres   13003       1  0 13:06 ?        00:00:00 /usr/pgsql-13/bin/postgres -D /var/lib/pgsql/13/data
postgres   13004   13003  0 13:06 ?        00:00:00 postgres: logger
postgres   13005   13003  0 13:06 ?        00:00:00 postgres: startup recovering 000000010000000000000004
postgres   13006   13003  0 13:06 ?        00:00:00 postgres: checkpointer
postgres   13007   13003  0 13:06 ?        00:00:00 postgres: background writer
postgres   13008   13003  0 13:06 ?        00:00:00 postgres: stats collector
postgres   13009   13003  1 13:06 ?        00:00:00 postgres: walreceiver streaming 0/4000060
postgres   13010   10001  0 13:06 pts/0    00:00:00 ps -ef
postgres   13011   10001  0 13:06 pts/0    00:00:00 grep --color=auto postgres
[postgres@rac10-p data]$

Check receiver background  process started or not :-

[postgres@rac10-p data]$ ps -eaf | grep receiver
postgres   13009   13003  0 13:06 ?        00:00:00 postgres: walreceiver streaming 0/4000060
postgres   13021   10001  0 13:07 pts/0    00:00:00 grep --color=auto receiver
[postgres@rac10-p data]$
[postgres@rac10-p data]$
[postgres@rac10-p data]$ psql
psql (13.16)
Type "help" for help.

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 t
(1 row)



Primary - Master :-

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 f


The result f (false) from select pg_is_in_recovery(); indicates that the PostgreSQL server is not in recovery mode. This means the server is running as the primary (or standalone) instance and is not in standby mode, as it would be if it were replicating from another primary server or undergoing a failover recovery process.






Standby - Slave :-

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 t


The result t (true) from select pg_is_in_recovery(); indicates that the PostgreSQL server is in recovery mode. This typically means the server is running as a standby instance in a replication setup, such as:

  1. Streaming Replication: The server could be continuously replicating data from a primary PostgreSQL instance.
  2. Point-in-Time Recovery: The server might be performing a recovery operation, applying WAL (Write-Ahead Log) files to bring it to a specific state.

While in recovery mode, the server operates in a read-only mode, meaning you won’t be able to perform any write operations. This is normal for standby servers in a replication environment.






Run below query on MASTER server for outgoing replication details:-

postgres=# \x
postgres=# select * from pg_stat_replication;

Run below query on SLAVE server.

postgres=# \x
postgres=# select pg_is_wal_replay_paused();






NOTE: f means, recovery is running fine. t means it is stopped.


Check for last wal received on SLAVE




You can check the Master and Slave databases info by running below command separately on both sides. If your system can not find the pg_controldata program, then you should set the path to this program. Because some of programs include pg_controldata placed under /usr/lib/postgresql/14/bin/ location.



[postgres@rac09-p data]$ pg_controldata -D $PGDATA
pg_control version number:            1300
Catalog version number:               202007201
Database system identifier:           7401588754142404987
Database cluster state:               in production
pg_control last modified:             Sun 11 Aug 2024 01:08:01 PM IST
Latest checkpoint location:           0/4000098
Latest checkpoint's REDO location:    0/4000060
Latest checkpoint's REDO WAL file:    000000010000000000000004
Latest checkpoint's TimeLineID:       1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:489
Latest checkpoint's NextOID:          24576
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        479
Latest checkpoint's oldestXID's DB:   1
Latest checkpoint's oldestActiveXID:  489
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 1
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:            Sun 11 Aug 2024 01:08:01 PM IST
Fake LSN counter for unlogged rels:   0/3E8
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                off
max_connections setting:              100
max_worker_processes setting:         8
max_wal_senders setting:              10
max_prepared_xacts setting:           0
max_locks_per_xact setting:           64
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Size of a large-object chunk:         2048
Date/time type storage:               64-bit integers
Float8 argument passing:              by value
Data page checksum version:           0
Mock authentication nonce:            e45f66c4810dad4d3b1a41c4ca3e1a84d56ea017d71a13e85f3bf365512ac661
[postgres@rac09-p data]$



[postgres@rac10-p data]$ pg_controldata -D $PGDATA
pg_control version number:            1300
Catalog version number:               202007201
Database system identifier:           7401588754142404987
Database cluster state:               in archive recovery
pg_control last modified:             Sun 11 Aug 2024 01:11:18 PM IST
Latest checkpoint location:           0/4000098
Latest checkpoint's REDO location:    0/4000060
Latest checkpoint's REDO WAL file:    000000010000000000000004
Latest checkpoint's TimeLineID:       1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:489
Latest checkpoint's NextOID:          24576
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        479
Latest checkpoint's oldestXID's DB:   1
Latest checkpoint's oldestActiveXID:  489
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 1
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:            Sun 11 Aug 2024 01:08:01 PM IST
Fake LSN counter for unlogged rels:   0/3E8
Minimum recovery ending location:     0/4000148
Min recovery ending loc's timeline:   1
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                off
max_connections setting:              100
max_worker_processes setting:         8
max_wal_senders setting:              10
max_prepared_xacts setting:           0
max_locks_per_xact setting:           64
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Size of a large-object chunk:         2048
Date/time type storage:               64-bit integers
Float8 argument passing:              by value
Data page checksum version:           0
Mock authentication nonce:            e45f66c4810dad4d3b1a41c4ca3e1a84d56ea017d71a13e85f3bf365512ac661


Master and Slave - Sync check - PostgreSQL

  1) Run the below Query on Primary:- SELECT     pid,     usename,     application_name,     client_addr,     state,     sync_state,     sen...