The logs and settings of Docker containers should be backed up periodically, first to guard against data corruption and second to support audits that trace security incidents. In a Docker environment, this can be done by spinning up a second container in parallel that is dedicated to backups, without touching the first (main application) container at all.

Delegating backup duties to a parallel container may be a neater approach than adding periodic backup functions to the main application container, since it keeps the main application code from becoming bloated. Furthermore, when highly compact Docker images are used, there may be no convenient way to add periodic backup functions to them.

The example code to implement a parallel backup container can be found in this repository. The two containers can be spun up with the standard command:

docker-compose up

The secondary container runs multiple cron jobs to perform periodic backups at various intervals (every restart, minute, hour, etc.). The number of versions to keep for each interval can be configured in backup_daemon/entrypoint_with_cronjobs.sh. This is one of the more straightforward retention strategies: take backups frequently, but purge most of them once they grow too old. It may be possible to have a single cron job that performs the backup every minute, and then use some filtering function to decide which backups to keep over the longer term of months or years, but this would be tricky to implement when consecutive retention windows are not multiples of one another. For example, a month is not a whole number of weeks, so it would not be clear whether to keep 1-in-4 or 1-in-5 of the weekly backups to get monthly backups without more calculation. Thus, having separate cron jobs that do the weekly, monthly, etc. backups is cleaner to implement.
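The per-interval retention described above can be sketched in a few lines of POSIX shell. This is only an illustration: the function names, the backup_<timestamp>.tar.gz naming scheme, and the paths and retention counts are assumptions and are not taken from backup_daemon/entrypoint_with_cronjobs.sh.

```shell
#!/bin/sh
# Sketch only: function names, file naming and directory layout are
# assumptions, not the repository's actual implementation.

# Archive the contents of directory $1 into directory $2.
make_backup() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # A sortable timestamp makes lexical order equal chronological order.
    tar -czf "$dest/backup_$(date +%Y%m%dT%H%M%S).tar.gz" -C "$src" .
}

# Keep only the newest $2 archives in directory $1.
purge_old_backups() {
    dir=$1
    keep=$2
    # Globs expand in sorted order, so the oldest archive comes first.
    set -- "$dir"/backup_*.tar.gz
    [ -e "$1" ] || return 0   # nothing to purge yet
    while [ "$#" -gt "$keep" ]; do
        rm -f -- "$1"
        shift
    done
}

# One backup followed by one purge. Each cron job would call this with
# its own destination directory and retention count, for example
# (hypothetical paths and counts):
#   run_backup /data /backups/hourly 24
#   run_backup /data /backups/weekly 8
run_backup() {
    make_backup "$1" "$2" && purge_old_backups "$2" "$3"
}
```

Because every interval has its own directory and its own count, no job needs to know about the others, which is what sidesteps the 1-in-4 versus 1-in-5 question.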

To improve security, the cron tasks can encrypt the tar files with GPG, either symmetrically or asymmetrically. With asymmetric encryption, only the public key needs to be present in the backup container, so even an intruder who has full access to these Docker files will not have access to the full historical records, since the private GPG key is stored safely somewhere else.
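As a rough sketch of the asymmetric case, a cron task could encrypt each archive to a public key and then delete the plaintext. The function name and the recipient address are hypothetical, and the sketch assumes the recipient's public key has already been imported into the container's keyring.

```shell
#!/bin/sh
# Sketch only: the function name and the recipient are placeholders.

# Encrypt archive $1 to the public key of recipient $2, writing
# "$1.gpg", and remove the plaintext only if encryption succeeded.
encrypt_backup() {
    archive=$1
    recipient=$2
    # --batch and --yes keep gpg non-interactive under cron;
    # --trust-model always skips the prompt about an unverified key.
    gpg --batch --yes --trust-model always \
        --recipient "$recipient" \
        --output "$archive.gpg" \
        --encrypt "$archive" &&
        rm -f -- "$archive"
}

# Hypothetical call from a cron task:
#   encrypt_backup /backups/hourly/backup_20240101T000000.tar.gz backup@example.org
```

Decryption then requires the private key, which never has to enter either container.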

The cron tasks can also be augmented to automatically upload the tar files to an external location for additional resiliency.
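One possible shape for such an upload step is shown below, assuming rsync is installed in the backup container and that key-based SSH access to the remote host is already configured. The function name and the destination are placeholders, not part of the repository.

```shell
#!/bin/sh
# Sketch only: the function name and the destination are placeholders.

# Copy archive $1 to the rsync destination $2, for example
# user@host:/path/ (hypothetical).
upload_backup() {
    archive=$1
    remote=$2
    # -a preserves timestamps and permissions; --partial keeps a
    # partially transferred file so that a later attempt can reuse it.
    rsync -a --partial -- "$archive" "$remote"
}

# Hypothetical call from a cron task:
#   upload_backup /backups/weekly/backup_20240101T000000.tar.gz.gpg backup@backup.example.org:/srv/backups/
```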

References

Resilient Docker source code