1. What is direct_cp?
2. Features of direct_cp
3. Install
    1. Requirement
    2. How to Install direct_cp?
4. Usage
    1. WAL Archive Setting
    2. Command Usage
5. Uninstall
6. Restrictions and Cautions
7. Detail Information
    1. Result of Performance Improvement
8. Related Documents

direct_cp 1.0

What is direct_cp?

direct_cp is a direct I/O copy command that avoids wasting the OS file cache when PostgreSQL copies WAL files to the WAL archive.
direct_cp is a copy command specialized for WAL archiving. The ordinary cp command is generally used for this purpose, but cp uses buffered I/O, which fills the file cache with archived WAL data that will normally never be read again. direct_cp uses direct I/O instead, so the file cache stays available for data that is actually reused. This is good for the performance of PostgreSQL running in archive mode.
The performance effect of direct_cp is shown in 7.1 Result of Performance Improvement; it can greatly improve the performance of your PostgreSQL server. Please download the file from here.

Features of direct_cp

direct_cp uses direct I/O when PostgreSQL copies a WAL file from pg_xlog to a WAL archive file in the archive directory. Because it does not use buffered I/O, the copy does not push useful data out of the file cache, which is good for performance. direct_cp is a copy command intended only for use in archive_command in postgresql.conf. It is specialized for copying archived WAL so that PostgreSQL in archive mode can use the file cache more effectively.

**Caution!** direct_cp does not support Linux kernel 2.4 or older, because those kernels do not support direct I/O.

The features of direct_cp are illustrated in the following image.



Install

This section explains how to install direct_cp.

Requirement

Required OS
RHEL 6.1 or higher

**Caution!** RHEL 4 and RHEL 5 are not supported, because they have bugs when direct I/O is used together with buffered I/O. These bugs might cause data file loss. Be careful!

How to Install direct_cp?

Install from RPM

The following example installs the direct_cp x86_64 package for PostgreSQL 9.2 on RHEL 6.
$ su
# rpm -ivh direct_cp-1.0.0-1.pg92.rhel6.x86_64.rpm

Install from source

If you want to build from source, use the PGXS build option.

$ tar xzvf direct_cp-1.0.0.tar.gz
$ cd direct_cp-1.0.0
$ make USE_PGXS=1
$ su
# make USE_PGXS=1 install

If you do not want to use the PGXS build option, copy the direct_cp directory into the contrib directory of the PostgreSQL source tree, and then run make install there.

That completes the installation.

Usage

This section explains how to use direct_cp.

WAL Archive Setting

Set the WAL archive command in postgresql.conf by specifying direct_cp in archive_command.

The following example archives each WAL file by copying it into the /mnt/server/archivedir directory.

wal_level = archive
archive_mode = on
archive_command = 'direct_cp %p /mnt/server/archivedir/%f'
After setting the parameters correctly, reload or restart PostgreSQL. A change to wal_level or archive_mode takes effect only after a restart.
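
The PostgreSQL documentation recommends that an archive command refuse to overwrite an existing archive file and return a nonzero status on failure. This document does not state how direct_cp behaves when the destination already exists, so the following is only a minimal sketch of a wrapper, under the assumption that such a check is wanted. The names archive_wal and COPY_CMD are hypothetical and are not part of direct_cp.

```shell
#!/bin/sh
# Hypothetical wrapper around direct_cp for use from archive_command.
# archive_wal and COPY_CMD are illustrative names, not part of direct_cp.

COPY_CMD="${COPY_CMD:-direct_cp}"

# archive_wal SOURCE DEST
# Copies SOURCE to DEST, refusing to overwrite an existing DEST.
# Returns 0 only when the copy command itself succeeds.
archive_wal() {
    src=$1
    dest=$2

    if [ ! -f "$src" ]; then
        echo "archive_wal: source not found: $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "archive_wal: refusing to overwrite: $dest" >&2
        return 1
    fi
    # The exit status of the copy command becomes the function's status.
    "$COPY_CMD" "$src" "$dest"
}
```

If this function were saved in a script that ends with archive_wal "$1" "$2", archive_command could call that script with %p and /mnt/server/archivedir/%f as its two arguments. The script location is up to you.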

Command Usage

direct_cp is invoked as follows.

$ direct_cp [OPTION]... SOURCE DEST

An example command follows.
It copies the file /var/pgdata/pg_xlog/000000010000000000000002 into the /mnt/server/archivedir directory.

$ direct_cp /var/pgdata/pg_xlog/000000010000000000000002 /mnt/server/archivedir

Option

-b, --buffer-size=NUM
    Set the buffer size in bytes.
    The buffer size must be a multiple of 4096 in the range 4096 to 16777216. (*1)
    This option can be omitted. The default buffer size is 2 MB, which was the best parameter in our testing.
-v, --version
    Show version information.
-?, --help
    Output help text.

(*1) The limit of the buffer size is the same as the limit of the WAL file size in PostgreSQL.
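
The buffer size rule above (a multiple of 4096 between 4096 and 16777216) can be checked before direct_cp is called. The following is a minimal sketch; the function name is_valid_buffer_size is hypothetical and is not provided by direct_cp.

```shell
# is_valid_buffer_size NUM
# Hypothetical helper: returns 0 when NUM is a multiple of 4096
# in the range 4096 to 16777216, and 1 otherwise.
is_valid_buffer_size() {
    case $1 in
        # Reject empty input, leading zeros, non-digits, and values
        # with more than 8 digits (16777216 has 8 digits).
        ''|0*|*[!0-9]*|?????????*) return 1 ;;
    esac
    [ "$1" -ge 4096 ] && [ "$1" -le 16777216 ] && [ $(( $1 % 4096 )) -eq 0 ]
}
```

For example, is_valid_buffer_size 2097152 succeeds for the 2 MB default, while is_valid_buffer_size 5000 fails because 5000 is not a multiple of 4096.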

Uninstall

First, remove the direct_cp command from archive_command in postgresql.conf, and then reload or restart PostgreSQL.
After that, you can uninstall the direct_cp package.

/* if you installed from the RPM binary */
$ su
# rpm -e direct_cp

/* if you installed from source */
$ su
# make USE_PGXS=1 uninstall
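
Before removing the package, it may be worth confirming that archive_command no longer refers to direct_cp. The following is a minimal sketch that inspects only the given configuration file. It does not follow include directives or check settings made elsewhere, and the function name conf_uses_direct_cp is hypothetical.

```shell
# conf_uses_direct_cp CONF_FILE
# Hypothetical helper: returns 0 when an uncommented archive_command
# line in CONF_FILE mentions direct_cp, and nonzero otherwise.
conf_uses_direct_cp() {
    # Lines starting with '#' do not match the first pattern, so
    # commented-out settings are ignored.
    grep -E '^[[:space:]]*archive_command([[:space:]]|=)' "$1" | grep -q 'direct_cp'
}
```

A possible use is to run conf_uses_direct_cp against your postgresql.conf and continue with the uninstall only when it returns a nonzero status.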

Restrictions and Cautions

direct_cp has the following restrictions and cautions.

Do not use disk space in an NFS environment.
    NFS does not behave like a local physical file system. Data might be lost if the server crashes or loses power.
A file system that supports O_DIRECT is required.
    File systems that do not accept the O_DIRECT flag of the open() system call are not supported. If you use direct_cp on such a file system, an error message is reported.
If the file to be copied is not a WAL file, normal buffered I/O is used.
    If you use direct_cp to copy a file that is not a WAL file, direct_cp uses normal buffered I/O, like the ordinary cp command.
Copying multiple files is not supported.
    direct_cp copies only a single file at a time.
Copying directories is not supported.
    direct_cp cannot copy a directory at all.
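
Two of the restrictions above can be checked in advance from the shell. The following is a minimal sketch, assuming GNU dd and mktemp are available. Both function names are hypothetical. The name check reflects only the standard WAL segment naming (24 uppercase hexadecimal digits) and is not a description of how direct_cp itself recognizes a WAL file.

```shell
# supports_o_direct DIR
# Hypothetical helper: returns 0 when a 4096-byte file can be written
# in DIR with O_DIRECT (via GNU dd's oflag=direct), nonzero otherwise.
# The probe file is removed afterwards.
supports_o_direct() {
    probe=$(mktemp "$1/.o_direct_probe.XXXXXX") || return 1
    dd if=/dev/zero of="$probe" bs=4096 count=1 oflag=direct 2>/dev/null
    status=$?
    rm -f "$probe"
    return "$status"
}

# is_wal_segment_name NAME
# Hypothetical helper: returns 0 when NAME looks like a WAL segment
# file name, that is, exactly 24 uppercase hexadecimal digits.
is_wal_segment_name() {
    printf '%s\n' "$1" | grep -Eq '^[0-9A-F]{24}$'
}
```

For example, supports_o_direct /mnt/server/archivedir probes the archive directory used in this document, and is_wal_segment_name 000000010000000000000002 succeeds. Other files that PostgreSQL passes to archive_command, such as timeline history files, do not match the name check.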

Detail Information

Result of Performance Improvement

When direct_cp is used to copy archived WAL, PostgreSQL performs much better than when the ordinary cp command, which uses buffered I/O, is used as the archive copy command. This is because direct_cp does not fill the file cache with data such as archived WAL files, which are normally never read again, so the file cache can be used more effectively.
Detailed results are available at the following link.

Related Documents

PostgreSQL Documentation

Write Ahead Log (WAL), Continuous Archiving and Point-in-Time Recovery (PITR)