Sunday, 11 August 2024

Configuring Streaming Replication in PostgreSQL 13

 
Table of Contents

1. Introducing Streaming Replication

2. Prepare the environment

3. Execute on Server Master
-- Put PostgreSQL in Archive log mode
-- Parameter wal_level=replica
-- Network configuration between primary and standby

4. Execute on Slave Server
-- Backup and restore 
-- Start database

5. Check the results


================================================

1. Introducing Streaming Replication

Streaming Replication is a feature that helps you build a backup database system for PostgreSQL database. Moreover, it also allows you to query data on the backup database, helping to reduce the load on the main database.

Streaming Replication works by transferring WAL files from the primary database (or master) to the standby database (or slave). Then, apply these WAL files to the standby database. In the documents, this process is often called recovery, apply or replay.


2. Prepare the environment

My simulation environment has 2 servers both with PostgreSQL installed.

1. Server master:

  • IP: 192.168.1.43
  • Operating System: Red Hat Enterprise Linux Server release 8
  • PostgreSQL version: 13

2. Server slave:

  • IP: 192.168.1.44
  • Operating System: Red Hat Enterprise Linux Server release 8
  • PostgreSQL version: 13




Basic Check:-

postgres=# show config_file;
              config_file
----------------------------------------
 /var/lib/pgsql/13/data/postgresql.conf
(1 row)


postgres=# show hba_file ;
              hba_file
------------------------------------
 /var/lib/pgsql/13/data/pg_hba.conf
(1 row)


3. Execute on Server Master

1. PostgreSQL database put in archive mode

Check if the database is in archive mode

# show archive_mode ;
 archive_mode 
--------------
 on
(1 row)

If not, please put the database into Archive mode according to the instructions below.


2. Parameter wal_level=replica

Check the wal_level parameter again to see if the value is already replica.

# show wal_level ;
 wal_level 
-----------
 replica
(1 row)

If not, run the following command to change the parameter:

alter system set wal_level=replica;

And remember to restart the instance for the new value to take effect.

pg_ctl restart
[postgres@rac09-p data]$ pg_ctl -D /var/lib/pgsql/13/data start
waiting for server to start....2024-08-11 11:44:56.305 IST [12573] LOG:  redirecting log output to logging collector process
2024-08-11 11:44:56.305 IST [12573] HINT:  Future log output will appear in directory "log".
 done
server started



3. Network configuration between primary

Create user to sync

postgresql# CREATE USER repuser WITH REPLICATION PASSWORD 'repuser';



Then, you edit the pg_hba.conf file to allow standby to connect to primary via user replication.

vi pg_hba.conf

and add the following line:

host  replication     replication     192.168.1.44/32         trust

[postgres@rac09-p data]$ cat pg_hba.conf
# replication privilege.
local   replication      all                                                   peer
host    replication     all             127.0.0.1/32                   scram-sha-256
host    replication     all             ::1/128                            scram-sha-256
host    replication     repuser     192.168.1.44/32        trust



After editing the pg_hba.conf file , remember to run the following command for the changes to take effect.

pg_ctl reload

Note: The reload option only reloads the parameter values ​​and does not restart the database, so it does not affect any ongoing transactions on the database.


[postgres@rac09-p ~]$ psql -c "alter system set listen_addresses to '*'"

ALTER SYSTEM



4. Execute on Slave Server


Making a Base Backup to Bootstrap the Standby Server (Slave)
- You need to make a base backup of the master server from the standby server, this helps to bootstrap the standby server.

Stop the PostgreSQL severvice 


[postgres@rac10-p data]$ pg_ctl stop -D /var/lib/pgsql/13/data

[postgres@rac10-p ~]$

[postgres@rac10-p 13]$ pwd
/var/lib/pgsql/13
[postgres@rac10-p 13]$ /usr/pgsql-13/bin/pg_ctl -D /var/lib/pgsql/13/data stop
waiting for server to shut down.... done
server stopped


[postgres@rac10-p ~]$ cd $PGDATA
[postgres@rac10-p data]$ ls -lrt
total 132
drwx------. 2 postgres postgres  4096 Aug 11 00:11 pg_twophase
drwx------. 2 postgres postgres  4096 Aug 11 00:11 pg_snapshots
drwx------. 2 postgres postgres  4096 Aug 11 00:11 pg_serial
drwx------. 2 postgres postgres  4096 Aug 11 00:11 pg_notify
drwx------. 2 postgres postgres  4096 Aug 11 00:11 pg_dynshmem
drwx------. 2 postgres postgres  4096 Aug 11 00:11 pg_commit_ts
-rw-------. 1 postgres postgres     3 Aug 11 00:11 PG_VERSION
drwx------. 2 postgres postgres  4096 Aug 11 00:11 pg_tblspc
drwx------. 2 postgres postgres  4096 Aug 11 00:11 pg_replslot
drwx------. 4 postgres postgres  4096 Aug 11 00:11 pg_multixact
-rw-------. 1 postgres postgres 28086 Aug 11 00:11 postgresql.conf
-rw-------. 1 postgres postgres    88 Aug 11 00:11 postgresql.auto.conf
-rw-------. 1 postgres postgres  1636 Aug 11 00:11 pg_ident.conf
-rw-------. 1 postgres postgres  4548 Aug 11 00:11 pg_hba.conf
drwx------. 2 postgres postgres  4096 Aug 11 00:11 pg_xact
drwx------. 3 postgres postgres  4096 Aug 11 00:11 pg_wal
drwx------. 2 postgres postgres  4096 Aug 11 00:11 pg_subtrans
drwx------. 5 postgres postgres  4096 Aug 11 00:11 base
drwx------. 2 postgres postgres  4096 Aug 11 00:12 log
-rw-------. 1 postgres postgres    30 Aug 11 08:52 current_logfiles
-rw-------. 1 postgres postgres    58 Aug 11 08:52 postmaster.opts
-rw-------. 1 postgres postgres    99 Aug 11 08:52 postmaster.pid
drwx------. 2 postgres postgres  4096 Aug 11 08:52 pg_stat
drwx------. 2 postgres postgres  4096 Aug 11 08:53 global
drwx------. 4 postgres postgres  4096 Aug 11 08:57 pg_logical
drwx------. 2 postgres postgres  4096 Aug 11 12:51 pg_stat_tmp
[postgres@rac10-p data]$ cd ..
[postgres@rac10-p 13]$ ls -lrt
total 12
drwx------.  2 postgres postgres 4096 Aug  9 03:25 backups
-rw-------.  1 postgres postgres  920 Aug 11 00:11 initdb.log
drwx------. 20 postgres postgres 4096 Aug 11 08:52 data
[postgres@rac10-p 13]$


I have already "data" folder in my server so just moving existing data folder to data_bkp

[postgres@rac10-p 13]$ mv data data_bkp

Create DATA directory once again :-

[postgres@rac10-p 13]$
[postgres@rac10-p 13]$ mkdir -p data/
[postgres@rac10-p 13]$ chmod 700 data/
[postgres@rac10-p 13]$ chown -R postgres:postgres data/
[postgres@rac10-p 13]$
[postgres@rac10-p 13]$



6) 
Then user the pg_basebackup tool to take the base backup with the right ownership.
(the database system under i.e. postgres, within the postgres user account) and with the right permissions.


Below command is  used to setting up the Replication 

[postgres@rac10-p bin]$ pg_basebackup -h 192.168.1.43 -U repuser -p 5432 -D /var/lib/pgsql/13/data -Fp -Xs -P -R

25254/25254 kB (100%), 1/1 tablespace


With above command, Slave server will connect to the Master (Primary) server and will copy all the data directory from Master server. The connection will be trough a PostgreSQL database user, in our case this user is repuser. And at the end of the command we created the replication slot. What is a replication slot? Basically, the Master server is able to delete the WAL files if it is not needed anymore. But maybe the Slave server did not receive and applied it yet. In this case, the replication will fail. But if we have the replication slot configured, this process will not let the Master server delete the WAL files which did not receive by Slave server yet. At the end the standby.signal file has been created which make it the replica.


Crosscheck the data with is restored or not ..

[postgres@rac10-p bin]$ cd /var/lib/pgsql/13/data
[postgres@rac10-p data]$ ls -lrt
total 268
-rw-------. 1 postgres postgres    224 Aug 11 13:03 backup_label
-rw-------. 1 postgres postgres  28086 Aug 11 13:03 postgresql.conf
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_twophase
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_snapshots
drwx------. 4 postgres postgres   4096 Aug 11 13:03 pg_multixact
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_commit_ts
drwx------. 3 postgres postgres   4096 Aug 11 13:03 pg_wal
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_notify
-rw-------. 1 postgres postgres   4619 Aug 11 13:03 pg_hba.conf
drwx------. 2 postgres postgres   4096 Aug 11 13:03 log
drwx------. 5 postgres postgres   4096 Aug 11 13:03 base
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_xact
-rw-------. 1 postgres postgres      3 Aug 11 13:03 PG_VERSION
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_tblspc
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_subtrans
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_stat_tmp
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_stat
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_serial
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_replslot
drwx------. 4 postgres postgres   4096 Aug 11 13:03 pg_logical
-rw-------. 1 postgres postgres   1636 Aug 11 13:03 pg_ident.conf
drwx------. 2 postgres postgres   4096 Aug 11 13:03 pg_dynshmem
drwx------. 2 postgres postgres   4096 Aug 11 13:03 global
-rw-------. 1 postgres postgres     30 Aug 11 13:03 current_logfiles
-rw-------. 1 postgres postgres      0 Aug 11 13:03 standby.signal
-rw-------. 1 postgres postgres    478 Aug 11 13:03 postgresql.auto.conf
-rw-------. 1 postgres postgres 135417 Aug 11 13:03 backup_manifest




[postgres@rac10-p data]$ cat postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
archive_mode = 'on'
archive_command = 'test ! -f /backup/PostgreSQL13_arch/%f && cp %p //backup/PostgreSQL13_arch/%f'
listen_addresses = '*'
primary_conninfo = 'user=repuser passfile=''/var/lib/pgsql/.pgpass'' channel_binding=prefer host=192.168.1.43 port=5432 sslmode=prefer sslcompression=0 ssl_min_protocol_version=TLSv1.2 gssencmode=prefer krbsrvname=postgres target_session_attrs=any'



Check the PostgreSQL service start or not ..

[postgres@rac10-p data]$ ps -ef| grep postgres
root       10000    9940  0 11:48 pts/0    00:00:00 su - postgres
postgres   10001   10000  0 11:48 pts/0    00:00:00 -bash
postgres   12991   10001  0 13:06 pts/0    00:00:00 ps -ef
postgres   12992   10001  0 13:06 pts/0    00:00:00 grep --color=auto postgres
[postgres@rac10-p data]$


Start the PostgreSQL service as below:-

[postgres@rac10-p data]$ pg_ctl start -D /var/lib/pgsql/13/data
waiting for server to start....2024-08-11 13:06:17.934 IST [13003] LOG:  redirecting log output to logging collector process
2024-08-11 13:06:17.934 IST [13003] HINT:  Future log output will appear in directory "log".
 done
server started



[postgres@rac10-p data]$ ps -ef| grep postgres
root       10000    9940  0 11:48 pts/0    00:00:00 su - postgres
postgres   10001   10000  0 11:48 pts/0    00:00:00 -bash
postgres   13003       1  0 13:06 ?        00:00:00 /usr/pgsql-13/bin/postgres -D /var/lib/pgsql/13/data
postgres   13004   13003  0 13:06 ?        00:00:00 postgres: logger
postgres   13005   13003  0 13:06 ?        00:00:00 postgres: startup recovering 000000010000000000000004
postgres   13006   13003  0 13:06 ?        00:00:00 postgres: checkpointer
postgres   13007   13003  0 13:06 ?        00:00:00 postgres: background writer
postgres   13008   13003  0 13:06 ?        00:00:00 postgres: stats collector
postgres   13009   13003  1 13:06 ?        00:00:00 postgres: walreceiver streaming 0/4000060
postgres   13010   10001  0 13:06 pts/0    00:00:00 ps -ef
postgres   13011   10001  0 13:06 pts/0    00:00:00 grep --color=auto postgres
[postgres@rac10-p data]$

Check receiver background  process started or not :-

[postgres@rac10-p data]$ ps -eaf | grep receiver
postgres   13009   13003  0 13:06 ?        00:00:00 postgres: walreceiver streaming 0/4000060
postgres   13021   10001  0 13:07 pts/0    00:00:00 grep --color=auto receiver
[postgres@rac10-p data]$
[postgres@rac10-p data]$
[postgres@rac10-p data]$ psql
psql (13.16)
Type "help" for help.

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 t
(1 row)



Primary - Master :-

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 f


The result f (false) from select pg_is_in_recovery(); indicates that the PostgreSQL server is not in recovery mode. This means the server is running as the primary (or standalone) instance and is not in standby mode, as it would be if it were replicating from another primary server or undergoing a failover recovery process.






Standby - Slave :-

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 t


The result t (true) from select pg_is_in_recovery(); indicates that the PostgreSQL server is in recovery mode. This typically means the server is running as a standby instance in a replication setup, such as:

  1. Streaming Replication: The server could be continuously replicating data from a primary PostgreSQL instance.
  2. Point-in-Time Recovery: The server might be performing a recovery operation, applying WAL (Write-Ahead Log) files to bring it to a specific state.

While in recovery mode, the server operates in a read-only mode, meaning you won’t be able to perform any write operations. This is normal for standby servers in a replication environment.






Run below query on MASTER server for outgoing replication details:-

postgres=# \x
postgres=# select * from pg_stat_replication;

Run below query on SLAVE server.

postgres=# \x
postgres=# select pg_is_wal_replay_paused();






NOTE: f means, recovery is running fine. t means it is stopped.


Check for last wal received on SLAVE




You can check the Master and Slave databases info by running below command separately on both sides. If your system can not find the pg_controldata program, then you should set the path to this program. Because some of programs include pg_controldata placed under /usr/lib/postgresql/14/bin/ location.



[postgres@rac09-p data]$ pg_controldata -D $PGDATA
pg_control version number:            1300
Catalog version number:               202007201
Database system identifier:           7401588754142404987
Database cluster state:               in production
pg_control last modified:             Sun 11 Aug 2024 01:08:01 PM IST
Latest checkpoint location:           0/4000098
Latest checkpoint's REDO location:    0/4000060
Latest checkpoint's REDO WAL file:    000000010000000000000004
Latest checkpoint's TimeLineID:       1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:489
Latest checkpoint's NextOID:          24576
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        479
Latest checkpoint's oldestXID's DB:   1
Latest checkpoint's oldestActiveXID:  489
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 1
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:            Sun 11 Aug 2024 01:08:01 PM IST
Fake LSN counter for unlogged rels:   0/3E8
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                off
max_connections setting:              100
max_worker_processes setting:         8
max_wal_senders setting:              10
max_prepared_xacts setting:           0
max_locks_per_xact setting:           64
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Size of a large-object chunk:         2048
Date/time type storage:               64-bit integers
Float8 argument passing:              by value
Data page checksum version:           0
Mock authentication nonce:            e45f66c4810dad4d3b1a41c4ca3e1a84d56ea017d71a13e85f3bf365512ac661
[postgres@rac09-p data]$



[postgres@rac10-p data]$ pg_controldata -D $PGDATA
pg_control version number:            1300
Catalog version number:               202007201
Database system identifier:           7401588754142404987
Database cluster state:               in archive recovery
pg_control last modified:             Sun 11 Aug 2024 01:11:18 PM IST
Latest checkpoint location:           0/4000098
Latest checkpoint's REDO location:    0/4000060
Latest checkpoint's REDO WAL file:    000000010000000000000004
Latest checkpoint's TimeLineID:       1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:489
Latest checkpoint's NextOID:          24576
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        479
Latest checkpoint's oldestXID's DB:   1
Latest checkpoint's oldestActiveXID:  489
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 1
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:            Sun 11 Aug 2024 01:08:01 PM IST
Fake LSN counter for unlogged rels:   0/3E8
Minimum recovery ending location:     0/4000148
Min recovery ending loc's timeline:   1
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                off
max_connections setting:              100
max_worker_processes setting:         8
max_wal_senders setting:              10
max_prepared_xacts setting:           0
max_locks_per_xact setting:           64
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Size of a large-object chunk:         2048
Date/time type storage:               64-bit integers
Float8 argument passing:              by value
Data page checksum version:           0
Mock authentication nonce:            e45f66c4810dad4d3b1a41c4ca3e1a84d56ea017d71a13e85f3bf365512ac661


No comments:

Post a Comment

Master and Slave - Sync check - PostgreSQL

  1) Run the below Query on Primary:- SELECT     pid,     usename,     application_name,     client_addr,     state,     sync_state,     sen...