archon - targeted archiving program

NAME

archon - targeted archiving program

SYNOPSIS

    archon [options] [archive-set]

DESCRIPTION

archon archives a specific set of files and directories. It is intended to be used as a periodic, partial backup. It is not intended to replace full or incremental dumps. The archive will be copied to removable media, such as a USB drive or a removable hard disk.

An archive is built according to the .archon configuration file. This file defines a number of named archive sets (lists of files to be archived) and a list of named devices to use as possible archive locations. Combining the archive sets and the device lists with archon's options and arguments makes it easy to maintain a flexible, diverse set of archives. Archive sets should be reviewed periodically to ensure that the files being archived are still the required set of files.

The .archon file is relatively straightforward, but it is very important that it be defined correctly so that the archives are built as needed. The format of the file changed in archon version 2.0, and it is highly recommended that you read the "ARCHON CONFIGURATION FILES" section below in order to create your configuration file properly.

A small example configuration is given in archon's __DATA__ section at the end of the program file. If the .archon file does not exist, that very small default archive set will be backed up. This is a rudimentary configuration and the devices defined there are unlikely to work anywhere. This configuration is primarily intended as an example.

Relative paths in an archive set are assumed to be relative to the home directory of the user running archon. Absolute paths may be given, but modern tar implementations typically strip off the leading slash.

Several checks will be performed on the file names in the archive sets, the archive-set names, and the device names. The conditions checked are:

existence
readability
searchability (for directories)
valid type (regular file, directory, symlink)
invalid characters in the name

The first four checks will not be run if the -skip option is given.
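The checks listed above can be pictured with a short Python sketch. This is not archon's own code (archon is a separate program and its internals are not shown here); the function name check_entry, the wording of the problem messages, and the order of the checks are all assumptions made for illustration. The forbidden-character list is the one given in the "Archive Set Section" below.

```python
import os
import stat

# Characters the "Archive Set Section" lists as not allowed in filenames.
FORBIDDEN = set(";$<>[]{}()&?'`|!")

def check_entry(path, skip=False):
    """Return a list of problems found for one archive-set entry (hypothetical helper)."""
    problems = []
    if any(ch in FORBIDDEN for ch in path):
        problems.append("invalid characters in the name")
    if skip:
        # Mirrors -skip: only the character check is run.
        return problems
    # Relative paths are taken relative to the user's home directory.
    full = path if os.path.isabs(path) else os.path.join(os.path.expanduser("~"), path)
    if not os.path.lexists(full):
        problems.append("does not exist")
        return problems
    mode = os.lstat(full).st_mode
    if not (stat.S_ISREG(mode) or stat.S_ISDIR(mode) or stat.S_ISLNK(mode)):
        problems.append("not a regular file, directory, or symlink")
    if not os.access(full, os.R_OK):
        problems.append("not readable")
    if stat.S_ISDIR(mode) and not os.access(full, os.X_OK):
        problems.append("directory not searchable")
    return problems
```

An empty list means the entry passed every check that was run.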

The tar command is used to create the tarfile; the --bzip2 option is used to have tar compress the tarfile. Compression is done with the bzip2 compression format. When installing archon on a new system, ensure that its tar program has the --bzip2 option. If not, you'll need to adjust the archit() subroutine to:

not use --bzip2
not append ".bz2" to the archive filename
run bzip2 (or some other file compressor) on the archive file

The $TARARGS variable may also need some adjustment.
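The difference between the one-step and two-step approaches can be sketched in Python as a pair of command lines. This is only an illustration: the function name archive_commands is hypothetical, and the "-c" and "-f" arguments are assumptions, since the actual contents of $TARARGS are not given here.

```python
def archive_commands(archive_base, files, tar_has_bzip2=True):
    """Return the command lines (as argv lists) needed to build a bzip2 archive.

    archive_base is the archive filename without the ".bz2" suffix.
    """
    if tar_has_bzip2:
        # tar compresses the archive itself; ".bz2" is appended to the name.
        return [["tar", "-c", "--bzip2", "-f", archive_base + ".bz2", *files]]
    # Otherwise build a plain tarfile, then compress it as a second step.
    # bzip2 replaces the file with one that has ".bz2" appended.
    return [
        ["tar", "-c", "-f", archive_base, *files],
        ["bzip2", archive_base],
    ]
```

Either branch ends with a file named archive_base plus ".bz2", so the archive name pattern is unchanged.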

Archive Names

The archive files will be created with a specific filename pattern. The pattern includes a timestamp, the user's name, and the name of the archive set.

When spaces are used in archive-set names, they are converted to underscores in the name of the actual archive file.

The following pattern is used:

    YYMMDD-hhmm-username-archiveset.arch.bz2

where:

    YYMMDD	date of archive (YY - year, MM - month, DD - day)
    hhmm	time of archive (hh - hour, mm - minute)
    username	user's login name
    archiveset	name of the archive set
    arch	fixed file suffix (doesn't change)
    bz2		indicates bzip2 compression (doesn't change)

A few example archive names are:

    150309-0836-suzee-inbox.arch.bz2
    150210-2021-bobb-project_data.arch.bz2
    150120-2310-chowder-homedir.arch.bz2

If an archive is started within the same minute as another archive using the same archive set and username, the two archive names will match exactly. The first archive will be overwritten by the second, unless the two archives are saved to different volumes.
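As an illustration of the naming pattern, the following Python sketch builds an archive name from a timestamp, a username, and an archive-set name. The function name archive_name is hypothetical, and this is a sketch of the documented pattern rather than archon's own code.

```python
from datetime import datetime

def archive_name(when, username, archive_set):
    """Build a name of the form YYMMDD-hhmm-username-archiveset.arch.bz2."""
    # Spaces in the archive-set name become underscores in the filename.
    set_part = archive_set.replace(" ", "_")
    return f"{when:%y%m%d-%H%M}-{username}-{set_part}.arch.bz2"

# Two archives started within the same minute produce the same name,
# because the timestamp has no seconds field.
first = archive_name(datetime(2015, 3, 9, 8, 36, 5), "suzee", "inbox")
second = archive_name(datetime(2015, 3, 9, 8, 36, 50), "suzee", "inbox")
```

Here both first and second are "150309-0836-suzee-inbox.arch.bz2", which is why the second archive overwrites the first when both go to the same volume.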

OPTIONS

archon takes the following options:

-archives

-build

-config conffile

-devices

-files

-history count

-logfile file

-keep

-latest

-noeject
-noumount
-nounmount

-select device

-skip

-usearchive archive-file

-usedevice volume

-verbose

-Version

-help

ARCHON LOGGING

archon logs successful archive executions. By default, the log entries are added to the end of the selected .archon file. However, an alternate log file may be specified in the .archon file or with the -logfile option. The '>>>logfile' directive is used to specify an alternate log file in the configuration file.

Log entries may be displayed by using the -history option. The most recent N entries may be displayed by giving a numeric argument with -history. If an archive set or an archive device is specified on the command line, then only entries matching those selection criteria will be displayed.

Each log entry consists of a date and time timestamp, the archive set, the archive device, and a brief message indicating the status of that execution. The fields are separated by semicolons. The '>>>log' directive is used to mark a log entry in the configuration file. This is an example logging entry:

    >>>log 150405 18:29:31 ; daily ; /Volumes/archer ; archived

Only a few status messages are currently defined. This is expected to change in the future.

Logging Examples

The .archon file contains these logging entries:

    >>>log 150415 01:24:49 ; tiny ; /Volumes/archer ; built, not archived
    >>>log 150415 13:10:01 ; daily ; /Volumes/archer ; archived
    >>>log 150416 13:10:01 ; daily ; /Volumes/archer ; archived
    >>>log 150417 13:10:00 ; daily ; /Volumes/thumb1 ; archived
    >>>log 150418 17:30:39 ; tiny ; /Volumes/archer ; archived
    >>>log 150418 13:10:01 ; daily ; /Volumes/thumb1 ; archived
    >>>log 150419 13:10:02 ; daily ; /Volumes/serenity ; archived

If the command "archon -history" is given, then all seven log entries will be displayed.

If the command "archon -history 1" is given, then only the most recent entry will be displayed:

    150419 13:10:02 ; daily ; /Volumes/serenity ; archived

If the command "archon -history tiny" is given, then these two entries will be displayed:

    150415 01:24:49 ; tiny ; /Volumes/archer ; built, not archived
    150418 17:30:39 ; tiny ; /Volumes/archer ; archived

If the command "archon -history 1 tiny" is given, then this entry will be displayed:

    150418 17:30:39 ; tiny ; /Volumes/archer ; archived

If the command "archon -history 1 /Volumes/thumb1" is given, then this entry will be displayed:

    150418 13:10:01 ; daily ; /Volumes/thumb1 ; archived
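The selection behavior in these examples can be reproduced with a small Python sketch. It assumes that "most recent" means last in file order and that a selector matches either the archive-set field or the device field exactly; the function names parse_log_line and history are hypothetical and are not part of archon.

```python
LOG = """\
>>>log 150415 01:24:49 ; tiny ; /Volumes/archer ; built, not archived
>>>log 150415 13:10:01 ; daily ; /Volumes/archer ; archived
>>>log 150416 13:10:01 ; daily ; /Volumes/archer ; archived
>>>log 150417 13:10:00 ; daily ; /Volumes/thumb1 ; archived
>>>log 150418 17:30:39 ; tiny ; /Volumes/archer ; archived
>>>log 150418 13:10:01 ; daily ; /Volumes/thumb1 ; archived
>>>log 150419 13:10:02 ; daily ; /Volumes/serenity ; archived
""".splitlines()

def parse_log_line(line):
    """Split a '>>>log' line into (timestamp, archive set, device, status)."""
    body = line.strip()[len(">>>log"):]
    # Limit the split so a status containing ';' would stay in one field.
    return tuple(field.strip() for field in body.split(";", 3))

def history(lines, count=None, selector=None):
    """Return log entries, optionally filtered by set or device, newest last."""
    # The trailing space keeps '>>>logfile' directives out of the results.
    entries = [parse_log_line(l) for l in lines if l.strip().startswith(">>>log ")]
    if selector is not None:
        entries = [e for e in entries if selector in (e[1], e[2])]
    if count:
        entries = entries[-count:]
    return entries
```

For example, history(LOG, 1, "tiny") corresponds to "archon -history 1 tiny", and history(LOG, selector="tiny") corresponds to "archon -history tiny".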

ARCHON CONFIGURATION FILES

The original version of the .archon configuration file had a simple format; it was just a list of files to be archived. Version 2.0 of archon uses an enhanced, more flexible format. The old .archon files will continue to work, at least as far as archived files are concerned. However, the old files will require the -usedevice option.

The new .archon files are divided into sections of archive devices and archive sets. The device lists provide named devices that may be selected as the destination of archives. Archive sets are named lists of files that will be archived to the archive devices.

The new .archon configuration files provide several benefits. Moving device lists into the configuration file makes archon more flexible for individual users, since there is no longer a need to modify the source code just to change the available devices. Putting the names of files to be archived into named lists provides an easy way to have different archive sets defined for different uses.

The configuration sections are marked by a line starting with ">>>", followed by either the word "devices" (for a device list) or a label (for an archive set). The archive-set label can consist of alphabetic characters, numerals, spaces, and the following punctuation: "," "-" "_" ".". No other characters are allowed.

Comment lines start with a sharp sign (#). Blank lines are allowed anywhere, even within an archive set or a device list. Comment lines and blank lines are ignored.

This is a very simple .archon file. It defines a single device and an archive set with two entries:

    # A simple .archon file.

    >>> devices
    thumb	/Volumes/thumb1

    >>> smallset
    .cshrc
    mail

Device Section

Device entries consist of two fields: a device name and a device path. The device name serves as a simple label for the device. The device path is the absolute path to the device. archon expects the device path to contain a writable directory named backups. In the example above, there should be a directory named /Volumes/thumb1/backups.

An archive device can actually be a directory rather than removable media. Archive devices are discussed here as if they were removable media (thumb drive, hard drive, etc.) because it is generally safer to archive files on something that isn't tied to a specific machine. However, this is not required.

The device name serves several purposes. At heart, the name is a simple label for the device. If all the device names are unique, then the -usedevice option allows a specific device to be used as the archive device.

The device name may also be used to name a group of devices. If multiple devices are given the same name, then the -usedevice option will attempt to use each of the devices in the group until it finds one that is mounted. A device path may be listed with multiple device names, allowing it to be included in multiple groups. There is no inherent or required one-to-one correspondence between device names and device paths, though the configuration file may be written that way.

There are no default devices, except in one situation described below. If no device is named in the options, then the whole device list will be tried until a named device is found to be mounted.

The one time a default device will be used is when the .archon file does not exist. In this case, a rudimentary configuration will be used that is stored at the end of the archon program file. However, the devices in this rudimentary configuration are unlikely to work anywhere. This configuration is primarily intended as an example.

The "default" device name is a special name. The device path for the default device is not an actual path, but rather another device name. When trying to determine the set of archive devices to try, archon will use the default device's "path" -- another device name -- as the first device to try using.

The default device won't necessarily always work; it is just the first device that will be tried. If none of the associated device paths are mounted, then archon will not be able to work.

The example device section below contains all the possibilities discussed:

    >>> devices
    thumb		/Volumes/thumb1
    thumb		/Volumes/thumb2
    thumb		/Volumes/thumb3
    serenity		/Volumes/firefly
    firefly		/Volumes/firefly
    firefly		/Volumes/thumb3
    tmp			/tmp/data-archives
    default		firefly

The thumb device name is used to refer to three devices: /Volumes/thumb1, /Volumes/thumb2, and /Volumes/thumb3.

The serenity device name is used to refer to a single device: /Volumes/firefly.

The firefly device name is used to refer to two devices: /Volumes/firefly and /Volumes/thumb3. Both of those devices are already in other groups.

The tmp device name is used to refer to a single device: /tmp/data-archives. In most Unix systems, this is either a directory on a hard disk or an in-memory virtual disk.

The default device name points back into the device list to refer to the firefly device name. If the default device is specified, then the two firefly devices will be checked.
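The way device names expand into candidate paths can be sketched in Python as follows. This is one reading of the rules above, not archon's implementation: the function name candidate_paths is hypothetical, and the ordering used when no device is named (the default group first, then every remaining path in file order, with duplicates dropped) is an assumption.

```python
DEVICES = [
    ("thumb", "/Volumes/thumb1"),
    ("thumb", "/Volumes/thumb2"),
    ("thumb", "/Volumes/thumb3"),
    ("serenity", "/Volumes/firefly"),
    ("firefly", "/Volumes/firefly"),
    ("firefly", "/Volumes/thumb3"),
    ("tmp", "/tmp/data-archives"),
    ("default", "firefly"),
]

def candidate_paths(devices, name=None):
    """Return the device paths to try, in order, for a device name."""
    groups = {}
    for dev_name, target in devices:
        groups.setdefault(dev_name, []).append(target)

    # The "default" entry's target is another device name, not a path.
    default_paths = [p for alias in groups.get("default", [])
                     for p in groups.get(alias, [])]

    if name == "default":
        ordered = default_paths
    elif name is not None:
        ordered = groups.get(name, [])
    else:
        # Assumed ordering: default group first, then the whole list.
        ordered = default_paths + [p for n, p in devices if n != "default"]

    result = []
    for path in ordered:
        if path not in result:        # keep first occurrence only
            result.append(path)
    return result
```

With the example device section, candidate_paths(DEVICES, "thumb") yields the three thumb paths, and candidate_paths(DEVICES, "default") yields the two firefly paths. A caller would then try each path in turn until it finds one that is mounted.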

Archive Set Section

In addition to a device section, a .archon file has one or more archive sets. An archive set has an identifying name, and contains a list of files and directories to be archived for that set. The first archive set will be used by default if an archive set is not specified on the command line.

Archive sets in .archon are distinct, even though the contents of the sets may overlap. Specifying one set later (or earlier) in a file will not cause any other archive sets to be included.

The filenames listed in the archive set are somewhat restricted in the characters they may contain. The following characters cannot be used in filenames:

    ;  $  <  >  [  ]  {  }  (  )  &  ?  '  `  |  !

The name of an archive set is also unable to use that same set of characters. In addition, archive-set names cannot contain slashes.

An archive-set name can contain spaces and tabs, but they will be converted to underscores. A filename can contain spaces, but in this case the filename must be surrounded by double-quotes.
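The rules for archive-set names can be summarized in a short Python sketch. The function name set_name_for_file is hypothetical, and raising an exception on a bad name is an assumption about how such a check might report failure.

```python
import re

# The forbidden filename characters, plus the slash that set names also exclude.
FORBIDDEN_IN_SET_NAMES = set(";$<>[]{}()&?'`|!/")

def set_name_for_file(name):
    """Validate an archive-set name and return the form used in archive filenames."""
    bad = sorted(ch for ch in set(name) if ch in FORBIDDEN_IN_SET_NAMES)
    if bad:
        raise ValueError(
            f"archive-set name {name!r} contains forbidden characters: {' '.join(bad)}"
        )
    # Spaces and tabs are converted to underscores.
    return re.sub(r"[ \t]", "_", name)
```

For instance, set_name_for_file("project data") returns "project_data", while a name such as "mail/inbox" is rejected because of the slash.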

Example .archon File

This example .archon file contains several archive sets and a pair of device sections.

    # The devices we'll want to use.

    >>> devices
            archer        /Volumes/archer
            archer        /Volumes/centaur
            thumbs        /Volumes/thumb1
            thumbs        /Volumes/thumb2
            thumbs        /Volumes/thumb3

    # Absolute essentials that we must archive.  Everything else is extra.
    >>> essentials
    .cshrc
    .login
    .mailrc
    .ssh
    bin
    data/sciuridae

    # The email folders we can't lose.
    >>> required-mail
    mail/inbox
    mail/projects
    mail/hunnybunny

    # All our email folders.
    >>> all-mail
    mail

    >>> devices
            firefly        /Volumes/serenity
            default        thumbs

    # Data for current projects.
    >>> data
    "data/file index"
    data/sciuridae
    data/ailuradae
    data/procyonidae

    # Everything we want to back up with some frequency.
    >>> everything
    .cshrc
    .login
    .mailrc
    .ssh
    bin
    mail

    "data/file index"
    data/sciuridae
    data/ailuradae
    data/procyonidae
    "source files/modules"

    >>>logfile /opt/logs/archon.log

    >>>log 150418 17:30:39 ; tiny ; /Volumes/archer ; archived
    >>>log 150418 13:10:01 ; daily ; /Volumes/thumb1 ; archived
    >>>log 150419 13:10:02 ; daily ; /Volumes/serenity ; archived

PORTABILITY NOTE

This script was written for Mac OS X, but it should be easily portable to other Unix-like systems. The important thing to watch for is how a non-OS X system names and manages removable media.

FUTURE POSSIBILITIES

The following possibilities may be included in a future version of archon:

specifying multiple archive sets on a command line
allowing an archive set to include another archive set
specifying a default device for a particular archive set

Inclusion of these possible features will depend on time availability and interest.

AUTHOR

Wayne Morrison, wayne@waynemorrison.com

LICENSE

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
