Sunday, 21 January 2024

LOG MANAGEMENT - PostgreSQL




What Are PostgreSQL Logs?

PostgreSQL logs are text files that record what is happening in your database system: who has access to which components, what errors have occurred, which settings have changed, which queries are in progress, and which transactions are being executed.

To get a bird’s-eye view of these logs, you can ship them to a centralized place and then have a way to search across all of them. Parsing lets you extract important information and metrics, which you can then plot to better visualize as data points.

This article will show you how to tweak logging settings in PostgreSQL using both a config file and a command-line interface. It is recommended to make all these changes exclusively through the config file; otherwise your changes may be lost when you restart your server.


-bash-4.2$ psql -p 5455
psql (10.9, server 15.5)
WARNING: psql major version 10, server major version 15.
         Some psql features might not work.
Type "help" for help.
postgres=# 
postgres=# SELECT pg_current_logfile();
   pg_current_logfile   
------------------------
 log/postgresql-Sun.log
(1 row)

postgres=# 
postgres=# SHOW data_directory;
         data_directory         
--------------------------------
 /u02/PostgreSQL/15_cluster_db1
(1 row)
postgres=# 

postgres=# SHOW config_file;
                  config_file                   
------------------------------------------------
 /u02/PostgreSQL/15_cluster_db1/postgresql.conf
(1 row)

postgres=# show logging_collector();
ERROR:  syntax error at or near "("
LINE 1: show logging_collector();
                              ^
postgres=# 
postgres=# show logging_collector;
 logging_collector 
-------------------
 on
(1 row)

postgres=# 
postgres=# 
postgres=# show log_directory;
 log_directory 
---------------
 log
(1 row)
postgres=# show log_filename;
   log_filename    
-------------------
 postgresql-%a.log
(1 row)
postgres=# 
postgres=# show log_file_mode;
 log_file_mode 
---------------
 0600
(1 row)
postgres=# 
postgres=# show log_rotation_age;
 log_rotation_age 
------------------
 1d
(1 row)
postgres=# 
postgres=# show log_rotation_size;
 log_rotation_size 
-------------------
 0
(1 row)
postgres=# 

postgres=# show log_truncate_on_rotation;
 log_truncate_on_rotation 
--------------------------
 on
(1 row)
postgres=# 
postgres=# show log_line_prefix;
 log_line_prefix 
-----------------
 %m [%p] 
(1 row)
postgres=# 
postgres=# show log_timezone;
 log_timezone 
--------------
 Asia/Kolkata
(1 row)
postgres=# 
postgres=# show log_destination;
 log_destination 
-----------------
 stderr
(1 row)
postgres=# 
postgres=# show log_statement;
 log_statement 
---------------
 none
(1 row)
postgres=# 
postgres=# 
postgres=# 
postgres=# show log_duration;
 log_duration 
--------------
 off
(1 row)
postgres=# 
postgres=# show log_min_duration_statement;
 log_min_duration_statement 
----------------------------
 -1
(1 row)
postgres=# 
postgres=# 

The PostgreSQL server outputs its logs to the standard error stream by default. You can confirm this by executing the following query in psql:

 
SHOW log_destination;
 Output
 log_destination
-----------------
 stderr
(1 row)

It also provides a logging collector process that is responsible for capturing the logs sent to the standard error and routing them to log files. The behavior of this logging collector is controlled by the aptly named logging_collector setting.

Let's go ahead and check the status of the logging_collector config through the psql interface:

 
SHOW logging_collector;
 Output
 logging_collector
-------------------
 off
(1 row)

The logging_collector is off by default, so the PostgreSQL logs (which are sent to the server's standard error stream) are not handled by the logging collector process. This means that the destination of the logs depends on where the server's stderr stream is directed.

On Ubuntu, you may use the pg_lsclusters command to locate the log file for all the PostgreSQL clusters running on your machine:

 
pg_lsclusters

The Log file column indicates the location of the file:

 Output
Ver Cluster Port Status Owner    Data directory              Log file
15  main    5432 online postgres /var/lib/postgresql/15/main /var/log/postgresql/postgresql-15-main.log

If logging_collector is enabled on your machine, you may execute the following query in psql to find out the location of the file:

 
SELECT pg_current_logfile();
 Output
   pg_current_logfile
------------------------
 log/postgresql-Thu.log
(1 row)

The specified file (log/postgresql-Thu.log in this case) is relative to the PostgreSQL data directory, which you can locate by entering the following query:

 
SHOW data_directory;
 Output
   data_directory
---------------------
 /var/lib/pgsql/data
(1 row)

The complete path to the log file in this example is:

 
/var/lib/pgsql/data/log/postgresql-Thu.log

Once you've located the PostgreSQL log file, proceed to the next section where we'll examine the various types of logs that you might encounter when browsing the contents of the file.

Types of PostgreSQL logs

There are various kinds of PostgreSQL logs that you're likely to encounter in the log file depending on your configuration. This section will cover the most important ones and what they look like.

Note that the examples below intentionally omit the log line prefix (everything before <log_message> below) for the sake of brevity. Only the <log_message> part is shown.

 
2023-07-30 08:31:50.628 UTC [2176] postgres@chinook <log_message>

1. Startup and shutdown logs

These logs describe the startup and shutdown processes of the PostgreSQL server. Startup logs include the server version, IP address, port, and UNIX socket. They also report the last time the server was shut down and indicate its readiness to accept new connections:

 
LOG:  starting PostgreSQL 15.3 (Ubuntu 15.3-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0, 64-bit
LOG:  listening on IPv4 address "127.0.0.1", port 5432
LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
LOG:  database system was shut down at 2023-07-27 17:45:08 UTC
LOG:  database system is ready to accept connections

On the other hand, shutdown logs are entries describing why and how the server was shut down, which can come in handy when investigating unexpected database failures. The logs below describe a fast shutdown procedure when a SIGINT signal is sent to the server:

 
LOG:  received fast shutdown request
LOG:  aborting any active transactions
LOG:  background worker "logical replication launcher" (PID 82894) exited with exit code 1
LOG:  shutting down
LOG:  database system is shut down

2. Query logs

These logs represent the various SQL queries that are executed against a PostgreSQL database, including SELECT, INSERT, UPDATE, DELETE, and more.

 
STATEMENT:  select c.firstname, c.lastname, i.invoiceid, i.invoicedate, i.billingcountry
        from customer as c, invoice as i
        where c.country = 'Brazil' and
        c.customerid = i.customerid;
LOG:  statement: select distinct billingcountry from invoice;
LOG:  statement: select albumid, title from album where artistid = 2;

3. Query duration logs

Closely related to query logs are duration logs that track how long a query took to complete:

 
LOG:  duration: 13.020 ms
LOG:  duration: 13.504 ms  statement: UPDATE album SET title = 'Audioslave' WHERE albumid = 10;

4. Error logs

Error logs help you identify queries that lead to error conditions on the server. Such conditions include query-specific mistakes like constraint violations or mismatched data types to more severe problems such as deadlocks and resource exhaustion. Here are examples of such logs:

 
ERROR:  relation "custome" does not exist at character 15
STATEMENT:  select * from custome where country = 'Brazil';
ERROR:  duplicate key value violates unique constraint "pk_album"
DETAIL:  Key (albumid)=(10) already exists.
STATEMENT:  UPDATE album SET albumid = 10 WHERE albumid = 2;

5. Connection and disconnection logs

Information about client connections and disconnections to the database can also be recorded in the logs. These records include the source IP address, username, database name, and connection status. In the case of disconnections, they also include the total session duration.

 
LOG:  connection received: host=[local]
LOG:  connection authenticated: identity="postgres" method=peer (/etc/postgresql/15/main/pg_hba.conf:90)
LOG:  connection authorized: user=postgres database=chinook application_name=psql
LOG:  disconnection: session time: 0:00:08.602 user=postgres database=chinook host=[local]

6. Checkpoint and Write-Ahead Logs (WAL)

A checkpoint is the point at which PostgreSQL flushes modified data from memory (shared buffers) to disk, guaranteeing that the Write-Ahead Log (WAL) written before that point no longer needs to be replayed during crash recovery. These logs look like this:

 
LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.005 s, sync=0.002 s, total=0.025 s; sync files=2, longest=0.001 s, average=0.001 s; distance=0 kB, estimate=0 kB
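Since these statistics are emitted as plain text, they lend themselves to extraction for monitoring. The following Python sketch is purely illustrative (it is not part of PostgreSQL) and pulls the timing figures out of such a line:

```python
import re

# Illustrative sketch: extract the write/sync/total timings from a
# "checkpoint complete" log line for monitoring purposes.
line = (
    "checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, "
    "0 removed, 0 recycled; write=0.005 s, sync=0.002 s, total=0.025 s"
)

timings = {k: float(v) for k, v in re.findall(r"(write|sync|total)=([\d.]+) s", line)}
print(timings)  # {'write': 0.005, 'sync': 0.002, 'total': 0.025}
```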

Turning on the logging collector

As mentioned earlier, the logging collector is a background process that captures log messages sent to the standard error and redirects them into files. When you enable the collector, PostgreSQL logs will no longer be redirected to the /var/log/postgresql/postgresql-15-main.log file (on Ubuntu) but will be stored in a separate directory.

Before you can start modifying the PostgreSQL configuration, you need to locate the configuration file on your machine with the following query:

 
SHOW config_file;
 Output
               config_file
-----------------------------------------
 /etc/postgresql/15/main/postgresql.conf

Open the file in your favourite text editor with root privileges:

 
sudo nano /etc/postgresql/15/main/postgresql.conf

This file contains several configurable parameters; lines starting with the # character are comments.

Now, find the logging_collector option, uncomment the line and turn it on:

/etc/postgresql/15/main/postgresql.conf
logging_collector = on

Save the file, then restart your PostgreSQL server in the terminal (do this after modifying any setting in this file):

 
sudo systemctl restart postgresql

Afterwards, view the final lines in the /var/log/postgresql/postgresql-15-main.log file:

 
tail /var/log/postgresql/postgresql-15-main.log

You should observe two lines confirming that the logging collector is turned on and that future log output is now being placed in a "log" directory:

 Output
. . .
LOG:  redirecting log output to logging collector process
HINT:  Future log output will appear in directory "log".

This "log" directory is controlled by the log_directory configuration:

/etc/postgresql/15/main/postgresql.conf
log_directory = 'log'

The log_directory setting controls where PostgreSQL logs are written. When a relative path is supplied, it is relative to the value of the data_directory setting:

/etc/postgresql/15/main/postgresql.conf
data_directory = '/var/lib/postgresql/15/main'

Hence, we can deduce that the log files should be in the /var/lib/postgresql/15/main/log/ directory. Execute the command below with root privileges to find out:

 
sudo ls /var/lib/postgresql/15/main/log/

You should observe at least one log file in the directory indicating that the logging collector is working correctly:

 Output
postgresql-2023-07-30_172237.log

In the next section, you'll learn how to customize the log file name and related options.

Customizing the log files

PostgreSQL allows the customization of the log filename through the log_filename option. The default is shown below:

/etc/postgresql/15/main/postgresql.conf
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'

This configuration produces filenames containing the date and time of file creation. The supported % escapes are listed in Open Group's strftime specification.

To keep things simple for the purpose of this tutorial you can change the log filename to the following:

/etc/postgresql/15/main/postgresql.conf
log_filename = 'postgresql-%a.log'

This will produce files named after the abbreviated day of the week, like postgresql-Thu.log. Note that although the filename ends with .log, if JSON or CSV formatting is enabled, the files produced will end in .json and .csv respectively (postgresql-Thu.json, for example).
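The strftime escapes can be previewed outside PostgreSQL. This Python sketch is purely illustrative and shows what the two patterns discussed above expand to for a fixed file-creation time:

```python
from datetime import datetime

# Illustration only (not part of PostgreSQL): preview what a
# log_filename pattern would expand to for a given creation time.
created = datetime(2023, 7, 27, 17, 22, 37)  # a Thursday

default_name = created.strftime("postgresql-%Y-%m-%d_%H%M%S.log")
daily_name = created.strftime("postgresql-%a.log")

print(default_name)  # postgresql-2023-07-27_172237.log
print(daily_name)
```

In an English locale, the second pattern yields a name like postgresql-Thu.log.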

If you want to control the permissions with which each log file is created, you can change the log_file_mode option:

/etc/postgresql/15/main/postgresql.conf
log_file_mode = 0600 # the default

The 0600 permission is commonly used for sensitive files that should only be accessible and modifiable by the file's owner. With this setting, you can only access, view or modify this file by switching to the postgres user or using root privileges. This is a good security measure since PostgreSQL logs often contain sensitive information.
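To see what the 0600 mode means in practice, here is a standalone Python sketch (illustrative only, unrelated to PostgreSQL's own file handling) that creates a file with owner-only permissions and inspects them:

```python
import os
import stat
import tempfile

# Illustration only: create a file restricted to its owner, mirroring
# what log_file_mode = 0600 means for PostgreSQL log files.
fd, path = tempfile.mkstemp()
os.close(fd)
os.chmod(path, 0o600)  # read/write for the owner, nothing for group/others

mode = stat.S_IMODE(os.stat(path).st_mode)
print(oct(mode))  # 0o600
os.remove(path)
```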

Setting up log file rotation

As you probably already know, logging to files can quickly eat up all your disk space if an appropriate log rotation policy isn't in place, and this is especially true for PostgreSQL logs when query logging is turned on.

Conveniently, PostgreSQL provides a few configuration options for controlling how its log files are rotated:

  • log_rotation_age: This specifies the maximum age of a log file before it is rotated. The default value is 24 hours (1d). It accepts an integer value with a unit, such as min for minutes, h for hours, or d for days. If no unit is provided, minutes are assumed. You can also disable time-based log rotation by setting it to 0.

  • log_rotation_size: This value determines the maximum size of a log file before it is rotated in favor of a new log file. The default value is 10 megabytes (10MB). Set this to 0 if you want to disable size-based log rotation.

  • log_truncate_on_rotation: This option only applies when time-based rotation is in effect. It causes PostgreSQL to overwrite any existing log file of the same name instead of appending to it. This is useful when you have a log_filename pattern like postgresql-%a.log, which produces files like postgresql-Mon.log, postgresql-Tue.log, etc. Enabling this option (together with setting log_rotation_age to 7d) ensures that each week's logs overwrite the previous week's while the filenames remain the same.

As you can see, the options above provide basic log file rotation capabilities. If you need more customization options like the ability to automatically compress rotated files, turn off PostgreSQL's log file rotation options and use the logrotate utility instead.
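If you take the logrotate route, a minimal sketch might look like this (illustrative only: the path assumes the Ubuntu data directory shown earlier, copytruncate accounts for the logging collector keeping its current file open, and PostgreSQL's own rotation should first be disabled by setting log_rotation_age and log_rotation_size to 0):

```
/var/lib/postgresql/15/main/log/*.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    copytruncate
}
```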

Log formatting

PostgreSQL's log format is controlled by the log_destination setting, which is set to stderr by default. You can also use the jsonlog and csvlog options to output the logs in JSON and CSV formats respectively. More on this soon.

/etc/postgresql/15/main/postgresql.conf
log_destination = 'stderr'

When stderr is enabled, you can modify its output through the log_line_prefix option, which holds a printf-style string determining the information included at the beginning of each record. Type the following in psql to find out its current value:

 
show log_line_prefix;
 Output
 log_line_prefix
------------------
 %m [%p] %q%u@%d
(1 row)

This value ensures that each log record sent to the stderr includes the following details:

  • %m: The time of the event with milliseconds.
  • %p: The process ID of the specific PostgreSQL instance that created the log.
  • %q: This token produces no output but informs background processes to ignore everything after this point. The other tokens (%u@%d) are only available in session processes.
  • %u: The connected user that triggered the event.
  • %d: The database the user is connected to.

A log entry produced using the format described above looks like this:

 
2023-07-30 08:31:50.628 UTC [2176] postgres@chinook LOG:  statement: select albumid, title from album where artistid = 2;

Note that everything after postgres@chinook is not controlled by the log_line_prefix and cannot be customized. It consists of the log level and the log message which is a SELECT statement in this case.

Customizing the log format

The only way to modify the stderr log format is through the log_line_prefix setting.

 The most commonly used escape sequences are listed below; please see the PostgreSQL documentation for the full table of escape sequences:

  • %a: The application name (set by using the application_name parameter in the PostgreSQL connection string).
  • %p: The process ID of the PostgreSQL instance.
  • %u: The connected user.
  • %d: The database name.
  • %m: The time of the event with milliseconds (or %n if you prefer a UNIX epoch).
  • %q: Separates session-only escapes from those that are valid in all contexts.
  • %i: The command tag identifying the SQL query that was executed.
  • %e: The relevant PostgreSQL error code.
  • %c: The ID of the current session.

To make the logs easier to read and parse, you can prefix each variable with a key like this:

/etc/postgresql/15/main/postgresql.conf
log_line_prefix = 'time=%m pid=%p error=%e sess_id=%c %qtag=%i usr=%u db=%d app=%a '

It's advisable to use a space (or some other character) before the closing quote to separate the log prefix from the actual log message. With this configuration in place, you will observe logs in the following format after restarting the PostgreSQL server:

 
time=2023-07-30 22:02:27.929 UTC pid=17678 error=00000 sess_id=64c6ddf3.450e LOG:  database system is ready to accept connections
time=2023-07-30 22:03:33.613 UTC pid=17769 error=00000 sess_id=64c6de20.4569 tag=idle usr=postgres db=chinook app=psql LOG:  statement: UPDATE album SET title = 'Audioslave' WHERE albumid = 10;

The first record above is created by a non-session process (a.k.a. background process). These processes perform maintenance activities, background tasks, and other internal functions to support the database's operation. The inclusion of the %q escape excludes everything after it, since such sequences are invalid in non-session processes. If %q were excluded, the tag, usr, db, and app keys would be present in the log but empty.

On the other hand, session processes (a.k.a. user backend processes) are directly tied to a specific user and database, and they handle the actual queries and commands issued by the client application. In such contexts, all escape sequences are valid, so the %q escape has no effect.
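With key=value pairs in the prefix, each stderr line becomes straightforward to pick apart. The following Python sketch is purely illustrative (it is not part of PostgreSQL and assumes the exact prefix configured above):

```python
import re

# Illustrative parser for the log_line_prefix configured above. The
# tag/usr/db/app group is optional because it only appears for
# session processes (everything after %q is dropped otherwise).
pattern = re.compile(
    r"time=(?P<time>.+? UTC) pid=(?P<pid>\d+) error=(?P<error>\w+) "
    r"sess_id=(?P<sess_id>\S+)"
    r"(?: tag=(?P<tag>\S+) usr=(?P<usr>\S+) db=(?P<db>\S+) app=(?P<app>\S+))?"
    r" (?P<level>\w+):\s+(?P<message>.*)"
)

line = (
    "time=2023-07-30 22:03:33.613 UTC pid=17769 error=00000 "
    "sess_id=64c6de20.4569 tag=idle usr=postgres db=chinook app=psql "
    "LOG:  statement: UPDATE album SET title = 'Audioslave' WHERE albumid = 10;"
)

m = pattern.match(line)
print(m.group("pid"), m.group("db"), m.group("level"))  # 17769 chinook LOG
```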

Customizing the time zone

The log_timezone setting controls the time zone used for recording timestamp information. The default in Ubuntu is Etc/UTC which is UTC time. If you want the time to correspond to your server time, change this value accordingly:

/etc/postgresql/15/main/postgresql.conf
log_timezone = 'Europe/Helsinki'

However, I recommend sticking with UTC time for ease of log correlation and normalization.

Structured logging in JSON

PostgreSQL has supported logging in the CSV format since the v8.3 release way back in February 2008, but JSON logging only arrived in the v15 release. You can now format PostgreSQL logs as structured JSON by adding jsonlog to the log_destination config like this:

/etc/postgresql/15/main/postgresql.conf
log_destination = 'stderr,jsonlog'

When you restart the server and view the log directory, you will observe two new log files with the same name but different extensions (.log and .json). These files contain the same logs but in different formats. The .log file follows the stderr format as before, while the .json file predictably contains JSON logs.

Here's a comparison of the two files with the same logs:

/var/lib/postgresql/15/main/log/postgresql-Sun.log
time=2023-07-30 22:48:01.817 UTC pid=18254 error=00000 sess_id=64c6e836.474e tag=idle usr=postgres db=chinook app=psql LOG:  statement: SHOW data_directory;
time=2023-07-30 22:49:21.808 UTC pid=18254 error=00000 sess_id=64c6e836.474e tag=idle usr=postgres db=chinook app=psql LOG:  statement: TABLE albu limit 10;
time=2023-07-30 22:49:21.808 UTC pid=18254 error=42P01 sess_id=64c6e836.474e tag=SELECT usr=postgres db=chinook app=psql ERROR:  relation "albu" does not exist at character 7
time=2023-07-30 22:49:21.808 UTC pid=18254 error=42P01 sess_id=64c6e836.474e tag=SELECT usr=postgres db=chinook app=psql STATEMENT:  TABLE albu limit 10;

These logs are formatted according to the log_line_prefix setting as demonstrated earlier. On the other hand, the JSON logs contain all the available fields except those with null values: 

/var/lib/postgresql/15/main/log/postgresql-Sun.json
{"timestamp":"2023-07-30 22:48:01.817 UTC","user":"postgres","dbname":"chinook","pid":18254,"remote_host":"[local]","session_id":"64c6e836.474e","line_num":1,"ps":"idle","session_start":"2023-07-30 22:46:14 UTC","vxid":"4/3","txid":0,"error_severity":"LOG","message":"statement: SHOW data_directory;","application_name":"psql","backend_type":"client backend","query_id":0}
{"timestamp":"2023-07-30 22:49:21.808 UTC","user":"postgres","dbname":"chinook","pid":18254,"remote_host":"[local]","session_id":"64c6e836.474e","line_num":2,"ps":"idle","session_start":"2023-07-30 22:46:14 UTC","vxid":"4/4","txid":0,"error_severity":"LOG","message":"statement: TABLE albu limit 10;","application_name":"psql","backend_type":"client backend","query_id":0}
{"timestamp":"2023-07-30 22:49:21.808 UTC","user":"postgres","dbname":"chinook","pid":18254,"remote_host":"[local]","session_id":"64c6e836.474e","line_num":3,"ps":"SELECT","session_start":"2023-07-30 22:46:14 UTC","vxid":"4/4","txid":0,"error_severity":"ERROR","state_code":"42P01","message":"relation \"albu\" does not exist","statement":"TABLE albu limit 10;","cursor_position":7,"application_name":"psql","backend_type":"client backend","query_id":0}

A keen observer will notice that the error=42P01 event is spread over two log entries in the stderr format. However, this same event is captured in a single entry in the JSON format.

The obvious advantage of logging in JSON is that the logs can be automatically parsed and analyzed by log management tools. However, the fields cannot be customized in any way: you can't exclude a field or rename it, unlike the stderr format, which is more flexible in that sense. If you really need such features, you can adopt a log shipper to transform the logs before sending them off to their final destination.
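Because each jsonlog record is a single JSON object per line, standard tooling can consume the file directly. Here is a small illustrative Python sketch (the record is abbreviated from the output shown above) that surfaces error events:

```python
import json

# Sketch: filter jsonlog records by severity. The record below is
# abbreviated from the JSON output shown above.
record = json.loads(
    '{"timestamp":"2023-07-30 22:49:21.808 UTC","user":"postgres",'
    '"dbname":"chinook","error_severity":"ERROR","state_code":"42P01",'
    '"message":"relation \\"albu\\" does not exist"}'
)

if record["error_severity"] == "ERROR":
    print(record["state_code"], record["message"])
```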

The rest of the examples will continue to be displayed in the stderr format but without the log_line_prefix for brevity.

How to log PostgreSQL queries

The log_statement setting controls what SQL queries are recorded in the server log files. Its valid options are as follows:

  • none (default): No SQL queries are recorded.
  • ddl: Logs only Data Definition Language (DDL) statements, such as CREATE, ALTER, and DROP statements that modify the database schema.
  • mod: In addition to DDL statements, this value logs Data Modification Language (DML) statements such as INSERT, UPDATE, and DELETE.
  • all: Capture all SQL queries made to the server except those that fail before the execution phase due to parsing errors (see log_min_error_statement).

For example, to log both DDL and DML statements:

/etc/postgresql/15/main/postgresql.conf
log_statement = mod

Using the all setting in production will generate an enormous amount of data on busy systems which could slow down the database due to the overhead of writing every single query to disk. If your motivation for logging all queries is for auditing purposes, consider using pgAudit instead since it provides better controls for configuring audit logging.

Logging query duration

One of the main use cases for recording query logs is to identify slow-running queries that need optimization. However, the log_statement option does not track query duration in its output. You need to use the log_duration setting for this purpose. It is a boolean, meaning that it can be either off (0) or on (1):

/etc/postgresql/15/main/postgresql.conf
log_duration = 1

Restart the server and view the logs once again. You should start observing entries containing a duration value in milliseconds:

 
LOG:  statement: UPDATE album SET title = 'Audioslave' WHERE albumid = 10;
LOG:  duration: 18.498 ms

When log_duration is turned on, duration logs are produced for all statements executed by the server without exception (even a simple SELECT 1). This will produce a huge amount of logs which could harm performance and consume disk space unnecessarily.

Another problem with log_duration is that depending on the log_statement configuration, you may not know what query produced the duration log. To test this out, set log_statement to ddl, and keep log_duration enabled. Restart the server and enter the following query in psql:

 
UPDATE album SET title = 'Audioslave' WHERE albumid = 10;

You will observe the following log upon successful completion of the query:

 
LOG:  duration: 13.492 ms

Crucially, the statement that produced this entry is not recorded at all, making the entry effectively useless. If you must use log_duration, ensure that log_statement is also set to all so that each duration log can be correlated with the statement that produced it.

An alternative approach here is turning log_duration off and enabling log_min_duration_statement instead:

/etc/postgresql/15/main/postgresql.conf
log_duration = 0
log_min_duration_statement = 0

The log_min_duration_statement setting is not a boolean. It accepts an integer value in milliseconds and produces logs for queries whose execution time exceeds the specified value. When it is set to 0 as above, the duration of all completed statements will be recorded.

Apply the updated configuration above and restart the PostgreSQL server, keeping log_statement in ddl mode. Afterwards, repeat the UPDATE query. You should observe a duration log that looks like this:

 
LOG:  duration: 13.504 ms  statement: UPDATE album SET title = 'Audioslave' WHERE albumid = 10;

This is an improvement on the log_duration output because the query that produced the duration log is visible.

One weakness remains, though: a query does not appear in the log file until it finishes. If the server crashes mid-query, you'll have no idea what queries were running before the crash since they weren't recorded.

However, the logging behavior changes if the query produces its own log. To observe this, change log_statement to all then restart the server. Run SELECT pg_sleep(5) in psql and observe the logs.

You should see the following line immediately:

 
LOG:  statement: SELECT pg_sleep(5);

Five seconds later, the following line will appear:

 
LOG:  duration: 5005.975 ms

With this configuration, if a problem occurs before the five seconds have elapsed, you'll at least know what queries may have played a role in the incident.

The PostgreSQL documentation recommends that you include the process and session IDs in the log_line_prefix so that you can easily correlate both entries when reading and analysing the logs.

Logging slow-running queries

Logging the duration of all queries on your server is guaranteed to flood your log files with numerous trivial entries that don't add much value. It is more useful to log only slow-running queries so that you can investigate and optimize them accordingly. It also reduces the pressure on your PostgreSQL server since the volume of entries that will be logged is drastically reduced.

What constitutes a "slow query" obviously depends on the type of application and its specific workloads. Be sure to use a value that will capture resource-intensive queries that can cause performance bottlenecks. Queries that join large tables, perform complex calculations, or involve recursion are prime candidates for monitoring.

A threshold of a couple hundred milliseconds for interactive applications could be a good starting point, which you can then adjust as you go along:

/etc/postgresql/15/main/postgresql.conf
log_min_duration_statement = 250

With this configuration, only queries whose execution exceeds the specified value (250ms) will be logged. For example:

 
SELECT pg_sleep(1);
 Output
LOG:  statement: SELECT pg_sleep(1);
LOG:  duration: 1013.675 ms

Message severity levels in PostgreSQL

PostgreSQL uses the following log levels to indicate the severity of each log record it produces:

  • DEBUG5, DEBUG4, DEBUG3, DEBUG2, DEBUG1: These levels are meant for logging detailed information useful for troubleshooting, where DEBUG5 is the most verbose and DEBUG1 is the least verbose.
  • INFO: Reports information that was implicitly requested by the user.
  • NOTICE: Reports details that might be helpful to users.
  • WARNING: Warns about potential problems such as using COMMIT outside of a transaction block.
  • ERROR: Reports errors that abort the current command.
  • LOG: Reports general database activity.
  • FATAL: Reports errors that abort the current session while other sessions remain active.
  • PANIC: Reports severe errors that abort all existing database sessions.

Here's an example of what messages tagged with each severity level look like:

 
DEBUG:  server process (PID 13557) exited with exit code 0
INFO:  vacuuming "chinook.public.album"
NOTICE:  identifier "very_very_very_very_very_very_very_very_long_table_name_with_more_than_63_characters" will be truncated to "very_very_very_very_very_very_very_very_long_table_name_with_mo"
WARNING:  SET LOCAL can only be used in transaction blocks
LOG:  statement: UPDATE album SET title = 'Audioslave' WHERE albumid = 10;
ERROR:  relation "albu" does not exist at character 7
FATAL:  role "root" does not exist
PANIC: database system shutdown requested

The default log level is set to WARNING through the log_min_messages setting, and I recommend keeping it that way.

/etc/postgresql/15/main/postgresql.conf
log_min_messages = warning

Reducing log verbosity

PostgreSQL allows you to control the verbosity of event logs through the log_error_verbosity setting which, despite its name, applies to every logged message and not just errors. The possible values are terse, default, and verbose.

With the default setting, you'll observe extra DETAIL, HINT, QUERY, or CONTEXT information for certain queries. For example, when you run DROP TABLE album; against the chinook database in psql, you should observe the following logs:

 
LOG:  statement: DROP table album;
ERROR:  cannot drop table album because other objects depend on it
DETAIL:  constraint fk_trackalbumid on table track depends on table album
HINT:  Use DROP ... CASCADE to drop the dependent objects too.
STATEMENT:  DROP table album;

All five entries here were produced by the same query. If log_error_verbosity is terse, the DETAIL and HINT lines will be dropped. The verbose option includes everything in default, plus a LOCATION line with low-level details about the error such as its source file name, function name, and line number:

 
LOCATION:  reportDependentObjects, dependency.c:1189

Note that if you're logging in JSON, the DETAIL, HINT, and STATEMENT values are included in the same log entry, and LOCATION is split into func_name, file_name, and file_line_num properties:

 Output
{"timestamp":"2023-08-03 15:56:26.988 UTC","user":"postgres","dbname":"chinook","pid":14155,"remote_host":"[local]","session_id":"64cbce27.374b","line_num":5,"ps":"DROP TABLE","session_start":"2023-08-03 15:56:23 UTC","vxid":"3/8","txid":16414,"error_severity":"ERROR","state_code":"2BP01","message":"cannot drop table album because other objects depend on it","detail":"constraint fk_trackalbumid on table track depends on table album","hint":"Use DROP ... CASCADE to drop the dependent objects too.","statement":"DROP table album;","func_name":"reportDependentObjects","file_name":"dependency.c","file_line_num":1189,"application_name":"psql","backend_type":"client backend","query_id":0}

Logging connections and disconnections

The following settings control how PostgreSQL logs client connections and disconnections. Both are turned off by default:

/etc/postgresql/15/main/postgresql.conf
log_connections = off
log_disconnections = off

Turning log_connections on (1) will log each attempted connection to the server, as well as the completion of client authentication and authorization:

/etc/postgresql/15/main/postgresql.conf
log_connections = 1
 
LOG:  connection received: host=[local]
LOG:  connection authenticated: identity="postgres" method=peer (/etc/postgresql/15/main/pg_hba.conf:90)
LOG:  connection authorized: user=postgres database=chinook application_name=psql

This information can be useful from a security perspective since you can see who's connecting to your systems and from where. However, it can also produce a large volume of log entries if there are many short-lived connections and connection pooling isn't being used.
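Because each authorized connection produces a predictable "connection authorized" line, it's straightforward to tally connections per user and database from these logs. A minimal sketch, assuming the default stderr log format shown above:

```python
import re
from collections import Counter

# Matches the "connection authorized" lines produced by log_connections
AUTHORIZED_RE = re.compile(r"connection authorized: user=(\S+) database=(\S+)")

def count_connections(lines):
    """Count authorized connections per (user, database) pair."""
    counts = Counter()
    for line in lines:
        match = AUTHORIZED_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts

sample = [
    "LOG:  connection received: host=[local]",
    "LOG:  connection authorized: user=postgres database=chinook application_name=psql",
    "LOG:  connection authorized: user=postgres database=chinook application_name=psql",
]
print(count_connections(sample))  # Counter({('postgres', 'chinook'): 2})
```

A sudden spike in one pair's count is often the first hint that an application is missing a connection pool.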

The log_disconnections option works similarly, and it includes the session duration:

/etc/postgresql/15/main/postgresql.conf
log_disconnections = 1
 
LOG:  disconnection: session time: 0:00:02.824 user=postgres database=chinook host=[local]
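The session time in disconnection entries uses an H:MM:SS.mmm layout, so turning it into plain seconds (for averaging or histogramming) takes only a small parser. A sketch, assuming the stderr format of the line above:

```python
import re

# Parses "session time: H:MM:SS.mmm" from a disconnection log line
SESSION_RE = re.compile(r"session time: (\d+):(\d+):([\d.]+)")

def session_seconds(line):
    """Return the session duration in seconds, or None if the line has none."""
    match = SESSION_RE.search(line)
    if not match:
        return None
    hours, minutes, seconds = int(match.group(1)), int(match.group(2)), float(match.group(3))
    return hours * 3600 + minutes * 60 + seconds

line = "LOG:  disconnection: session time: 0:00:02.824 user=postgres database=chinook host=[local]"
print(session_seconds(line))  # 2.824
```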

Creating custom log messages

So far, we've discussed the various types of logs produced by the PostgreSQL server and how to customize them. If you want to create your own log messages in your SQL queries or when writing trigger functions using PL/pgSQL, you may use the RAISE statement as follows:

 
CREATE OR REPLACE FUNCTION custom_logs_func()
RETURNS TRIGGER LANGUAGE PLPGSQL AS $$
BEGIN
    RAISE LOG 'This is an informational message';
    RAISE WARNING 'Something unexpected happened';
    RAISE EXCEPTION 'An unexpected error';
END;
$$;

CREATE OR REPLACE TRIGGER UPDATE_LAST_EDITED_TIME BEFORE UPDATE ON ALBUM
FOR EACH ROW EXECUTE FUNCTION custom_logs_func();

Depending on your log_min_messages setting, you will observe the following in the log file when custom_logs_func() is executed:

 
LOG:  This is an informational message
WARNING:  Something unexpected happened
ERROR:  An unexpected error
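Whether each of these levels actually reaches the server log is governed by log_min_messages. A sketch of the relevant config; the ordering in the comment is PostgreSQL's own severity ranking for this setting:

```
/etc/postgresql/15/main/postgresql.conf
# Severity order: DEBUG5 .. DEBUG1, INFO, NOTICE, WARNING, ERROR, LOG, FATAL, PANIC
log_min_messages = warning   # default; LOG, WARNING, and ERROR messages are all written
```

Note that for server logging, LOG ranks above ERROR, so RAISE LOG messages appear even at the default setting.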

Analyzing log files with PgBadger

PgBadger is a command-line log analysis tool for PostgreSQL log files. It is designed to parse and analyze PostgreSQL log files, extracting valuable insights and generating detailed reports on the database's activity, performance, and usage patterns. It can produce comprehensive HTML reports with a host of statistics such as the following:

  • Total number of log entries parsed and processed.
  • Top slowest queries and their execution times.
  • Number of connections over time.
  • Most frequent error messages and their occurrence count.
  • Commit and rollback statistics for transactions.
  • Tables with the highest temporary file usage.
  • Histogram of query and session duration.
  • Users and applications involved in top queries.
  • The most time-consuming prepare/bind queries.
  • Number of autovacuum processes and their execution times.
  • and more!

Once you install pgBadger, check out its recommended logging configuration settings and apply them in your PostgreSQL config file. Note that those settings are tailored to produce as much information as possible so that pgBadger can effectively analyze database activity.

The tool provides a host of options, but the most basic setup to generate HTML reports is to provide the configured log line prefix, the log file, and the desired name/location of the report. At the time of writing, it only works with the stderr and CSV formats, but not JSON.

 
sudo pgbadger --prefix 'time=%m pid=%p error=%e sess_id=%c %qtag=%i usr=%u db=%d app=%a ' /var/lib/postgresql/15/main/log/postgresql-Thu.log -o postgresql.html
 Output
[========================>] Parsed 289019 bytes of 289019 (100.00%), queries: 102, events: 42
LOG: Ok, generating html report...
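For ongoing monitoring, pgBadger also supports an incremental mode (-I) that builds cumulative per-day reports into an output directory (-O). A hypothetical crontab entry (the paths here are assumptions; adjust them for your setup):

```
# Rebuild the incremental pgBadger report every hour, quietly (-q)
0 * * * * pgbadger -I -q -O /var/www/pg_reports /var/lib/postgresql/15/main/log/postgresql-*.log
```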

When you open the HTML file in your browser, you should observe a report that looks like this:

[Screenshot: pgBadger HTML report dashboard]
