Bacula Network Backup

What is the most important, and most boring, job an administrator has to do? Backup! You may have a super-duper high-availability cluster and redundant storage arrays with multiple RAID levels. But what if you make a mistake? Do you believe mistakes are rare? Count how many typos you make during a typical day working on a Unix/Linux console… It is not that difficult to type “rm -fr . /” instead of “rm -fr ./”. So we need a regular, reliable, manageable and carefully planned backup strategy.

I believe many administrators around the world (including me!) face a situation like this:

  1. Lots of servers (at least 3)
  2. Servers in remote sites
  3. Low or no budget at all (for buying very very very expensive professional backup software)
  4. Various OS platforms.

The best Open Source solution out there is Bacula (http://www.bacula.org). Bacula is an open source set of applications that provides functionality equivalent to that of expensive commercial products. Unfortunately it does not have a GUI for configuration, and it looks really frightening when you first touch it. However, it is more a matter of getting used to its logic than of difficult configuration (on the contrary, the configuration itself is really straightforward!). Below I present a real-case scenario (already working in production) and a Bacula success story.

Get in touch with Bacula terminology

Bacula is based on the following four components: Director, File Daemon, Storage Daemon and Console. Each one is independent and can run on a machine either by itself or together with the others. The Console is pretty obvious: it is the “host” that connects to the Bacula Director and issues commands. The File Daemon is the daemon that actually performs the backup: it reads data from the disk and sends it to the Storage Daemon. The Storage Daemon is responsible for writing data to the backup device: it receives data from a File Daemon and writes it to the specified backup device or file. Finally, the Director is the daemon that manages everything in Bacula. It handles media pools, backup jobs and backup schedules, and it also controls the File and Storage Daemons.
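
To make the roles a bit more concrete, here is a minimal sketch of what the Console side could look like. It assumes the Director name, port and host name used in the configuration later in this How To; the file name bconsole.conf is the usual default, and the password placeholder is mine, not part of the original setup.

```conf
# bconsole.conf (sketch): tells the Console which Director to talk to
Director {
  Name = callisto-dir               # name of the Director
  DIRport = 9101                    # port the Director listens on
  Address = callistoprv.howto.gr    # assumed: the Director runs on callisto
  Password = "CONSOLE-PASSWORD"     # placeholder: must equal the Password in the Director resource of bacula-dir.conf
}
```

The Console itself stores nothing. Every command you type is sent to the Director, which in turn contacts the File and Storage Daemons.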

It is beyond the scope of this How To to explain further what the Director, File and Storage Daemons are, or Media Pools, Volumes and the various other terms generally used when talking about backups. The Bacula Manual covers all of this in detail and will make it clear to you!

Network Topology

We have a main site and two remote sites. Our strategy is the following. On the first Sunday of each month we take a full system backup. On the remaining Sundays (2nd to 5th) we take a differential backup, and on every other day (Monday to Saturday) an incremental system backup. Ouranos, lahesis, okeanos and callisto are backed up on callisto. Atropos is backed up on atropos (on an external disk) and on sivyla. Sivyla is backed up on sivyla (on an external disk) and on atropos. The remote sites are backed up locally as well as to each other for two reasons: a small restore is faster from the local disk, and the external disk survives if the system itself fails. Everything is managed from callisto. So we have the following schema:

Bacula Network Diagram

Overall we have the following servers and roles:

Callisto: Director (dir), File Daemon (fd), Storage Daemon (sd), Console
Ouranos: fd
Lahesis: fd
Okeanos: fd
Atropos: sd, fd
Sivyla: sd, fd

Configuration Logic

The heart of Bacula is the Director. As a consequence it has the most difficult and complicated configuration. Before you start, make sure you have a piece of paper and a pencil handy to keep track of your configuration choices. I will not go deep into the details of every configuration option; rather, I would like to make you familiar with the logic behind the configuration file. We will walk through a complete configuration for Callisto. First of all you must have a Media Pool to write your data to. We choose the name “CallistoPool” for this pool. We also want Bacula to label the media automatically, using “CallistoVol” as the name prefix; with a plain prefix like this, Bacula appends a zero-padded sequence number (CallistoVol0001, CallistoVol0002 and so on). Finally we want a volume retention of 1 year. With these in mind we have the following configuration snippet:

Pool {
  Name = CallistoPool
  LabelFormat = "CallistoVol"
  Pool Type = Backup
  Recycle = yes # Bacula can automatically recycle Volumes
  AutoPrune = yes # Prune expired volumes
  Volume Retention = 365 days # one year
  Accept Any Volume = yes # write on any volume in the pool
}

Now that we have our media pool, it’s time to define our storage device. In general we define how to talk to a specific Storage Daemon: the address to connect to and the port. For security reasons we must also supply the password for connecting to the Storage Daemon. Finally we choose which device we want the Storage Daemon to write to (it must already be configured in the Storage Daemon configuration) and which media type to use. So the configuration snippet is:

Storage {
  Name = FileCallisto
  Address = callistoprv.howto.gr # N.B. Use a fully qualified name here
  SDPort = 9103
  Password = "errgd452345243srqwrdwfcsvmoxvzidfhvpsarhjps"
  Device = FileStorageCallisto
  Media Type = CallistoVol
}
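
The Device named above has to exist on the Storage Daemon side. As a rough sketch, the matching part of bacula-sd.conf on callisto could look like this. The names and the password come from the Storage resource above; the archive directory /backup/callisto is a hypothetical path of mine, and the remaining Device options are the usual ones for file-based storage.

```conf
# bacula-sd.conf on callisto (sketch)
Director {
  Name = callisto-dir                 # the Director that is allowed to connect
  Password = "errgd452345243srqwrdwfcsvmoxvzidfhvpsarhjps"  # must match Password in the Director's Storage resource
}

Device {
  Name = FileStorageCallisto          # must match Device in the Director's Storage resource
  Media Type = CallistoVol            # must match Media Type in the Director's Storage resource
  Archive Device = /backup/callisto   # hypothetical directory where the volume files are written
  LabelMedia = yes                    # allow automatic labeling of new volumes
  Random Access = yes
  AutomaticMount = yes
  RemovableMedia = no
  AlwaysOpen = no
}
```

If the Device name, Media Type or password differ between the two files, the Director will not be able to use this storage.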

Now it’s time to configure the client to back up! First we define a name to refer to this client, its address, the TCP port to connect to and the password. The other options depend heavily on our backup policy.

Client {
  Name = callisto-fd
  Address = callistoprv.howto.gr
  FDPort = 9102
  Catalog = MyCatalog
  Password = "o9h3Asdwadasdq123rsdfV9lzIwnsdasdasDkUpz120bIM" # password for FileDaemon
  File Retention = 30 days # 30 days
  Job Retention = 6 months # six months
  AutoPrune = yes # Prune expired Jobs/Files
}
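
The File Daemon on the client needs the other half of this definition. A minimal sketch of bacula-fd.conf on callisto, assuming the same working and PID directories as the Director uses in the full configuration below, could be:

```conf
# bacula-fd.conf on callisto (sketch)
Director {
  Name = callisto-dir                 # the Director that is allowed to contact this File Daemon
  Password = "o9h3Asdwadasdq123rsdfV9lzIwnsdasdasDkUpz120bIM"  # must match Password in the Director's Client resource
}

FileDaemon {
  Name = callisto-fd                  # by convention the same name as the Client resource
  FDport = 9102                       # must match FDPort in the Client resource
  WorkingDirectory = /var/bacula/working   # assumed path
  Pid Directory = /var/run                 # assumed path
}

Messages {
  Name = Standard
  director = callisto-dir = all, !skipped  # send job messages back to the Director
}
```

The important part is that the password and the port are identical on both sides.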

It should be obvious that our next step is to define what we want to back up on the client we just specified, so we configure a FileSet. Suppose that callisto has only three mounted file systems: root (/), boot (/boot) and var (/var). We name this fileset “Callisto_Full_fs” and enable the option to store an MD5 signature for each file. Of course it is useless to back up /proc, /tmp and probably various other directories, so we exclude them!

FileSet {
  Name = "Callisto_Full_fs"
  Include {
    Options {
      signature = MD5
    }
    File = /
    File = /boot
    File = /var
  }
  Exclude {
    File = /proc
    File = /tmp
    File = /.journal
    File = /.fsck
  }
}

Up until now we know WHERE and WHAT to back up. Let’s define WHEN! We create a schedule named “WeeklyCycle” that runs an incremental backup from Monday to Saturday, a full backup on the first Sunday of each month and a differential backup on the second to fifth Sundays. All of these start at 1:05 am.

Schedule {
  Name = "WeeklyCycle"
  Run = Full 1st sun at 1:05
  Run = Differential 2nd-5th sun at 1:05
  Run = Incremental mon-sat at 1:05
}

So our last step is to take all the information above and create a job. Bacula runs jobs, so once we know where, what and when, we are ready to create one. We name our job “Callisto_Full”. It connects to the client “callisto-fd”, backs up all the files defined in the fileset “Callisto_Full_fs” and writes the backup to the storage “FileCallisto”. The remaining options define the behavior of the job; the schedule and the default level are inherited from the JobDefs resource “FullBackupJob”, which appears in the full configuration below.

Job {
  Name = "Callisto_Full"
  Type = Backup
  Client = callisto-fd
  FileSet = "Callisto_Full_fs"
  Storage = FileCallisto
  Pool = CallistoPool
  Messages = Standard
  Where = /tmp/bacula-restores
  JobDefs = "FullBackupJob"
  Write Bootstrap = "/var/bacula/callisto.bsr"
}
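
Notice that the job above never mentions the schedule directly. It arrives through the JobDefs resource “FullBackupJob” from the full configuration. As an illustration only (this is not an extra resource to add), the job that the Director effectively ends up with should look roughly like this:

```conf
# Effective settings of "Callisto_Full" after merging JobDefs "FullBackupJob" (sketch)
Job {
  Name = "Callisto_Full"
  Type = Backup
  Level = Incremental              # from JobDefs; the Run lines of the Schedule override it
  Schedule = "WeeklyCycle"         # from JobDefs: this is the WHEN
  Priority = 10                    # from JobDefs
  Client = callisto-fd             # which machine to back up
  FileSet = "Callisto_Full_fs"     # WHAT to back up
  Storage = FileCallisto           # WHERE to write it
  Pool = CallistoPool
  Messages = Standard
  Where = /tmp/bacula-restores
  Write Bootstrap = "/var/bacula/callisto.bsr"
}
```

Anything set in the Job itself takes precedence over the value coming from the JobDefs, so common defaults can live in one place.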

Congratulations! You have created your first backup job. Of course there are still more things to do, like configuring the Storage Daemon and the File Daemon, but that should be really straightforward now. Do your homework and keep the Bacula Manual next to you, it is extremely useful! Below I provide the full configuration for the network and the schema described above.

Good Luck!

bacula-dir.conf

### Director ###
Director { # define myself
  Name = callisto-dir
  DIRport = 9101 # where we listen for UA connections
  QueryFile = "/etc/bacula/query.sql"
  WorkingDirectory = "/var/bacula/working"
  PidDirectory = "/var/run"
  Maximum Concurrent Jobs = 1
  Password = "trUtd567tqxHxG9jkldDElX+yr13fx6VBposdflLHl4" # Console password
  Messages = Daemon
}
#########################################
########### Job Definitions #############
#########################################
JobDefs {
  Name = "FullBackupJob"
  Type = Backup
  Level = Incremental
  Schedule = "WeeklyCycle"
  # Storage = File
  Messages = Standard
  # Pool = Default
  Priority = 10
}
JobDefs {
  Name = "WinBackup"
  Type = Backup
  Level = Full
  Schedule = "WeeklyCycle"
  Messages = Standard
  Priority = 10
}
#########################################
################# Jobs ##################
#########################################
#--- Linux Forthnet
Job {
  Name = "Callisto_Full"
  Type = Backup
  Client = callisto-fd
  FileSet = "Callisto_Full_fs"
  Storage = FileCallisto
  Pool = CallistoPool
  Messages = Standard
  Where = /tmp/bacula-restores
  JobDefs = "FullBackupJob"
  Write Bootstrap = "/var/bacula/callisto.bsr"
}
Job {
  Name = "Ouranos_Full"
  Type = Backup
  Client = ouranos-fd
  FileSet = "Ouranos_Full_fs"
  Storage = FileOuranos
  Pool = OuranosPool
  Messages = Standard
  Where = /tmp/bacula-restores
  JobDefs = "FullBackupJob"
  Write Bootstrap = "/var/bacula/ouranos.bsr"
}
Job {
  Name = "Lahesis_Full"
  Type = Backup
  Client = lahesis-fd
  FileSet = "Lahesis_Full_fs"
  Storage = FileLahesis
  Pool = LahesisPool
  Messages = Standard
  Where = /tmp/bacula-restores
  JobDefs = "FullBackupJob"
  Write Bootstrap = "/var/bacula/lahesis.bsr"
}
#--- Linux Foreign
Job {
  Name = "Atropos_Full"
  Type = Backup
  Client = atropos-fd
  FileSet = "Atropos_Full_fs"
  Storage = FileAtropos
  Pool = AtroposPool
  Messages = Standard
  Where = /tmp/bacula-restores
  JobDefs = "FullBackupJob"
  Write Bootstrap = "/var/bacula/atropos.bsr"
}
Job {
  Name = "Atropos_To_Sivyla_Full"
  Type = Backup
  Client = atropos-fd
  FileSet = "Atropos_Full_fs"
  Storage = FileAtropos_Sivyla
  Pool = AtroposPool_Sivyla
  Messages = Standard
  Where = /tmp/bacula-restores
  JobDefs = "FullBackupJob"
  Write Bootstrap = "/var/bacula/atropos_to_sivyla.bsr"
}
Job {
  Name = "Sivyla_Full"
  Type = Backup
  Client = sivyla-fd