PostgreSQL DBA : Backup

Showing posts with label Backup. Show all posts

Friday, 16 February 2024

pg_basebackup in PostgreSQL

Physical Backup:- pg_basebackup

pg_basebackup is a simple but powerful utility provided by PostgreSQL to take online and consistent file system-level backups. The backups created using this command contain the actual database files so you don’t need to actually restore these kinds of backups. You can just uncompress the backup files and copy them to the database data directory. You can use it to create the base backups of your database cluster which in turn can be used for the point in time recovery. You can also set up a standby replica database server using the streaming option of this command.

You can take complete server file system backup of data directory,

it is online process, no locks will be applied.

If you want, you can perform point in time recovery with using this

physical backup, if you have enough WAL files.

If you have the file system backup, if you want you can bring up the server anywhere if you have proper software (PostgreSQL version)

To archive this, we have an utility called pg_basebackup.

You can take this backup in Plain format , at the same time compressed format as well.

You can bring up in any linux flavour, there will be no change data directory until unless page size and blocks size of your OS is equal, or else there will be problem data distribution.

VIMP:- If you want to take the base backup, that target directory should be empty.

We have to enable below parameters in postgresql.conf file to gather all the requried wal files

for consistency.

Enable the parameters in postgresql.conf file

=============================================

wal_level = replica (upto 9.6 this hot_standby) -- It will keep all the required WAL files for repilcation

Minimal:- it will keep WAL file in pg_xlog only for recovery purpose, so never use this minimal, it will be not suitable for replication/baseback

logical: as part of logical replication we will enable

Even though whether it is replication or backup activity to send all the WAL files to target location we have below parameter

max_wal_senders = 5 ( If you have enough resource(CPU+RAM) in your system then you can increase the value

wal_keep_segments = 32 ( how many number of WAL files you want to store in pg_xlog location)

(If you have enough space you can use as much as you can, some people will keep 1024 files, 16GB space they are blocking for pg_wal directory.

If you are taking archive files you no need to bother about the size.

To reflect

show wal_level;

show max_wal_senders;

show wal_keep_segements;

How to take base backup:-

pg_backbackup -D /var/lib/pgsql/backup -- this will take backup in plain format

pg_backbackup -D /var/lib/pgsql/backup/ -Ft -X fetch -v -P

where

-F - format

-t :- tar

-D :- target backup directory

-X fetch :- it will gather all required the WAL files - checkpoint will be happened.

-v :- versose

-P :- Progress

16429.tar

base.tar

these two files will be created

Here 2 folders are there, by default only one folder , but if you have tablespaces, each tablespace one tar file will be created.

How to Restore Full backup:-

================================

$] tar -xvf base.tar ----> extract this base.tar file

$] rm base_tar ----> in real time keep this base.tar file keep in safe locaiton

$] tar -xvf 16429.tar -------> this is tablespace tar file so extract this also

$] rm 16429.tar ---------> remove this file also

Once you extract both tar files, you will get exact copy of PostgreSQL datafiles./ datadirectory

Accordingly make the changes in the postgresql.conf file.

like port = 5432 to any

Change the permissions of backup directory to promote it as a data directory

chmod 0700 /var/lib/pgsql/backup

and start the postgresql instance

Monday, 5 February 2024

pg_dumpall - PostgreSQL

Monday, 22 January 2024

Backup and Restore - PostgreSQL-1

Backup Table Exercises

Export from emp table to emp file:-

postgres=#
postgres=#
postgres=# \q

-bash-4.2$
-bash-4.2$
-bash-4.2$ rm -rf /tmp/emp.txt
-bash-4.2$
-bash-4.2$
-bash-4.2$ psql -p 5455
psql (10.9, server 15.5)
WARNING: psql major version 10, server major version 15.
Some psql features might not work.
Type "help" for help.

How to take pg_basebackup in PostgreSQL - Examples

How to take pg_baseback ?

1) Created directory in /backup location

2) Changed the ownership of the directory

3) Took the backup in that particular /backup location

[root@rac7 ~]# cd /backup

[root@rac7 backup]# mkdir postgre_backup

[root@rac7 backup]# chown postgres:postgres postgre_backup/

[root@rac7 backup]# su - postgres

Last login: Mon Jan 22 18:53:13 IST 2024 from 192.168.1.102 on pts/6
-bash-4.2$
-bash-4.2$
-bash-4.2$ cd /backup

-bash-4.2$ cd 15_cluster_db1_bkp/

-bash-4.2$ pwd
/backup/postgre_backup/15_cluster_db1_bkp

Run the below command to take base backup :=

-bash-4.2$
-bash-4.2$ /usr/pgsql-15/bin/pg_basebackup -p 5455 -D /backup/postgre_backup/15_cluster_db1_bkp

Below files we will created in the backup location:-

-bash-4.2$ cd /backup/postgre_backup/15_cluster_db1_bkp
-bash-4.2$ ls -lrt
total 272
drwx------ 2 postgres postgres 4096 Jan 22 19:14 pg_stat_tmp
drwx------ 2 postgres postgres 4096 Jan 22 19:14 pg_snapshots
drwx------ 2 postgres postgres 4096 Jan 22 19:14 pg_dynshmem
drwx------ 2 postgres postgres 4096 Jan 22 19:14 log
-rw------- 1 postgres postgres 225 Jan 22 19:14 backup_label
drwx------ 3 postgres postgres 4096 Jan 22 19:14 pg_wal
-rw------- 1 postgres postgres 261 Jan 22 19:14 postgresql.auto.conf
-rw------- 1 postgres postgres 3 Jan 22 19:14 PG_VERSION
drwx------ 2 postgres postgres 4096 Jan 22 19:14 pg_tblspc
drwx------ 2 postgres postgres 4096 Jan 22 19:14 pg_serial
-rw------- 1 postgres postgres 1636 Jan 22 19:14 pg_ident.conf
-rw------- 1 postgres postgres 4789 Jan 22 19:14 pg_hba.conf
-rw------- 1 postgres postgres 374 Jan 22 19:14 logfile
drwx------ 5 postgres postgres 4096 Jan 22 19:14 base
drwx------ 2 postgres postgres 4096 Jan 22 19:14 pg_xact
drwx------ 2 postgres postgres 4096 Jan 22 19:14 pg_twophase
drwx------ 2 postgres postgres 4096 Jan 22 19:14 pg_replslot
drwx------ 2 postgres postgres 4096 Jan 22 19:14 pg_subtrans
drwx------ 2 postgres postgres 4096 Jan 22 19:14 pg_stat
drwx------ 2 postgres postgres 4096 Jan 22 19:14 pg_notify
drwx------ 4 postgres postgres 4096 Jan 22 19:14 pg_logical
drwx------ 2 postgres postgres 4096 Jan 22 19:14 pg_commit_ts
drwx------ 4 postgres postgres 4096 Jan 22 19:14 pg_multixact
-rw------- 1 postgres postgres 29457 Jan 22 19:14 postgresql.conf
drwx------ 2 postgres postgres 4096 Jan 22 19:14 global
-rw------- 1 postgres postgres 30 Jan 22 19:14 current_logfiles
-rw------- 1 postgres postgres 137625 Jan 22 19:14 backup_manifest
-bash-4.2$

PostgreSQL Backups: What is pg_basebackup?

pg_basebackup:-

pg_basebackup is a simple but powerful utility provided by PostgreSQL to take online and consistent file system-level backups.
The backups created using this command contain the actual database files so you don’t need to actually restore these kinds of backups.
You can just uncompress the backup files and copy them to the database data directory.
You can use it to create the base backups of your database cluster which in turn can be used for the point in time recovery.
You can also set up a standby replica database server using the streaming option of this command.

In this article, we will go through the process of creating backups using the pg_basebackup utility and also learn about various options that can be used for this task.

Backup Process for a PostgreSQL Cluster

You can run the following simple command to take the base backup of your database.

1	`$ pg_basebackup -D` `/tmp/backups` `-x`

Let’s see these options:

-D is used to specify the path of the output directory.
-x flag will include the transaction logs in the backup folder. These logs can be used to start the database server from the newly created base backup. This means that we can just extract the backup files anywhere and use it as a fully operational database.

Prerequisites:-

Before we start taking the backup and run the above command, we need to make some changes in the postgresql.conf file so our database can run properly during the backup process. Change the following values in the postgresql.conf file.

wal_level = archive
 
max_wal_senders = 4

Here, WAL stands for write-ahead logs (transaction logs) for PostgreSQL. By changing the value of wal_level to archive, the backup command will generate the logs in the format that are compatible with pg_basebackup and other replication-based backup tools. The value of max_wal_senders will determine how many clients can connect with the database and request the data files.

Apart from this change, we need to make one more change in the pg_hba.conf file adding the following line:

1	`local` `replication postgres peer`

pg_hba.conf file basically contains the access control list of connections. pg_basebackup command internally uses a replication protocol to take backups so we need to add the above line to allow local connections that can request the data files of the database.

Labeling Your Backups

It’s very important to create consistent and periodic base backups to keep your data safe. When you create multiple backups, it's always good to add labels to your database. You can do it using the following command:

$ pg_basebackup -D /tmp/backups -x -l 

"This backup was created at $(hostname) on $(date)"

Viewing the Backup Progress

If you want to know the approximate duration of the backup process then you can run the following command:

1	`$ pg_basebackup -D` `/tmp/backups` `-x -P`

When you provide this -P option, the backup process will start calculating the size of the full database. If your database is having active connections, some new data might be added in the database during the backup process, so this command won’t be able to predict the exact time duration.

Creating Backups from a Remote Server

This command can also be used to take the backups of remote PostgreSQL clusters. You can provide the hostname and port number by specifying the following options in your command.

1	`$ pg_basebackup -D` `/tmp/backups` `-x -h 10.0.2.15 -p 5432 -U postgres`

Creating gzipped tar Backup

If you create the backup in plain text format, the backup folder will occupy a lot of space on your server. Instead of plain text format, this command can generate backup files in tar format. Also, you can provide -z flag to compress the target backup so it will occupy even less space. Here is the command for it:

1	`$ pg_basebackup -D` `/tmp/backups` `-x -Ft -z`

Creating a Backup by Streaming the Transaction Logs

Pg_basebackup command can stream the transaction logs (WAL) in parallel while taking the backup. This is useful when some new data is being added to the database while the backup process is running. So, once the backup process is over, you can just extract the backup and start using it as a fully qualified database.

When you specify this option, the pg_basebackup command will open two connections to the server. One to request the data files and another to stream the WAL logs in parallel. So, make sure that max_wal_senders value is greater than 2 in your PostgreSQL configuration file.

1	`$ pg_basebackup -D` `/tmp/backups` `-X stream -Ft -z`

Also, If you take the backup using streaming enabled, the resulting backup will have all the data along with required WAL logs to start afresh replica server for your production database.

Recovery process of PostgreSQL cluster

First of all, stop your database server to start the recovery process.

1	`$ pg_ctlcluster 10.6 main stop`

Now, move the current database files to some other folder to keep this as a backup.

1	`$` `mv` `main main.backup`

Copy the archived base backup files to the data directory of the database.

1	`$` `cp` `-r` `/tmp/backups` `main`

Start the database server now to perform the recovery.

1	`$ pg_ctlcluster 10.6 main start`

If you want, you can monitor the recovery process by checking the logs file.

1	`$` `tail` `-f` `/var/log/postgresql/postgresql-10.6-main.log`

Recovery process is fairly simple as we copied actual data files during the backup process. So, you just have to extract the backup files and move it to the data directory of the database.

Conclusion:-

Pg_basebackup is a handful and easy to use utility for making PostgreSQL backups. For the large size of databases, you should use pg_basebackup command to take backups as pg_dumpall will take a very long time to take a full backup. Also, pg_basebackup provides some useful options such as streaming of logs, parallel compression, etc. This makes it a more attractive tool than pg_dump or pg_dumpall utilities. Pg_basebackup can only be used to create backups of the entire database cluster, as it doesn’t support single database backups. For that purpose, you can use pg_dump utility.