Thursday, June 7, 2018

Backup/Sync system

OK, time to organize a backup strategy around the Drobo unit.
The idea is to have all the data scattered around the network copied onto the Drobo unit, i.e. the different machines back up to the Drobo.

Shopping list

Since the majority of my machines run Linux, a few Linux tools will be used for this.
Here is what is involved:

Some notes about the components.
Dropbox is used to store material that is shared among different machines.
Still, I wanted that common material saved on my Drobo unit as well.

The Linux server, with Jenkins, will be the main controller for data management and backup, but some external procedures, triggered by specific machines, could be integrated into the backup system.

Unison is a program that synchronizes files between different directories, even across machines.

Jenkins runs scheduled tasks, like cron but with many more options.
It was not born to handle backups or storage policies, but it can be used for that.
There are some advantages to using Jenkins instead of the usual scripts and cron:

  • access via web to all the functionalities
  • quick status of the jobs
  • easy to set up and maintain
  • history of the backup operations with log
Jenkins is of course not the only possible choice, but since I already have it on my server for other purposes, why not use it?


The most important part is the policies.
In this specific case, a group of policies determines what each set of data is, what gets copied, which location is considered the primary one for that data, etc.
The Linux server acts as the main controller, deciding via Jenkins what is saved on the Drobo unit, when, and how.
In other words, the majority of the backup operations are handled by the server.


On my network the most important machine is of course the main server. Then I have the developer machine, used for almost all activities from development to web browsing, a VPN server, and a plethora of wireless devices: VoIP phones, Echos, Android and Apple devices.
On top of that there is a Dropbox account where I store common material that I want accessible from different machines.

So the main sources of data to be stored on the Drobo unit will be:
  • main server
  • VPN server
  • main Linux machine
  • Dropbox

This block diagram describes the main structure of the system.
The server is connected to the Drobo unit and keeps the material on it synchronized.
Machines on the LAN can interact with the server, adding material that the server will back up automatically.
A Dropbox account is also linked to the server, so the server keeps a backup copy of the Dropbox content on the Drobo.
Machines outside the LAN will usually interact directly only with the Dropbox account.

Here is a screenshot of my Jenkins jobs related to the backup so far:

A quick view shows that everything is OK, even though some jobs had problems in the past (mainly due to testing).


Besides copying material from the server to the Drobo, some activities are needed to detect whether the system has problems.
One main utility used for this purpose is drobom, a Python program running on the main server that can query the Drobo unit and report information.
Another piece of monitoring involves Dropbox.
It is important to know whether the server's copy of the Dropbox data is up to date with the Dropbox account.
For this there is a Dropbox CLI utility for Linux.

See the article Drobo - monitoring it for more information.


Let's talk briefly about unison.
unison is quite a powerful program. Basically it "synchronizes" two directories, no matter where they are.
Since unison is called from Jenkins, it runs as the user jenkins, so it is important to remember that unison's internal archives and preferences are stored in the jenkins user's home area, i.e. /var/lib/jenkins/.unison
Specifically, the file /var/lib/jenkins/.unison/default.prf is used to set some parameters.

For example, many directories contain files called .DS_Store or ._.DS_Store. These files are created by macOS. They don't need to be kept in sync, since they are regenerated any time a Mac accesses the files.
So the default.prf file contains the line:

ignore = Name *DS_Store

which instructs unison to ignore such files.
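
To get a feel for which file names such a pattern catches, here is a small Python sketch. It uses Python's fnmatch module as a stand-in for unison's own pattern matching (an assumption: they are different implementations, although a plain `*` should behave the same way for simple names like these). The sample file names are made up for illustration.

```python
from fnmatch import fnmatchcase

# Pattern taken from the "ignore = Name *DS_Store" line in default.prf.
PATTERN = "*DS_Store"


def is_ignored(filename: str) -> bool:
    """Return True if the bare file name matches the ignore pattern."""
    return fnmatchcase(filename, PATTERN)


# Hypothetical file names, only for illustration.
samples = [".DS_Store", "._.DS_Store", "notes.txt", "DS_Store.bak"]
ignored = [name for name in samples if is_ignored(name)]
print(ignored)
```

Both Mac-generated names end in DS_Store, so a single pattern covers them, while a name that merely contains DS_Store in the middle is left alone.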


Jenkins is the main engine controlling the backup and synchronization of my data.
There are different jobs in place to control the backup/sync procedures.
The policy indicates, for each group of data, where it is copied. The Jenkins jobs run automatically to keep the data in sync among the different sources.
Here are the main jobs in place (some are still under construction at the time of writing).

Drobo sanity

This job is executed every night and performs a check on the Drobo, sending warning emails if the used capacity of the unit is above 80% or if one of the disks needs to be replaced.
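
The decision logic of such a check boils down to a couple of comparisons. The sketch below is not the actual job: the `DroboStatus` structure, its field names and the idea of a list of failed slots are hypothetical, and in practice the values would come from whatever the monitoring utility reports.

```python
from dataclasses import dataclass, field
from typing import List

CAPACITY_THRESHOLD = 0.80  # warn when more than 80% of the unit is used


@dataclass
class DroboStatus:
    """Hypothetical snapshot of the unit; not the output format of any real tool."""
    used_bytes: int
    total_bytes: int
    failed_slots: List[int] = field(default_factory=list)


def sanity_warnings(status: DroboStatus) -> List[str]:
    """Return one warning message per problem found; an empty list means all is fine."""
    warnings = []
    usage = status.used_bytes / status.total_bytes
    if usage > CAPACITY_THRESHOLD:
        warnings.append(f"capacity at {usage:.0%}, above {CAPACITY_THRESHOLD:.0%}")
    for slot in status.failed_slots:
        warnings.append(f"disk in slot {slot} needs to be replaced")
    return warnings
```

One possible design is to let the job exit with a non-zero status whenever the list is not empty, so that Jenkins marks the build as failed and its usual email notification does the rest.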

See the article Drobo - monitoring it for more details.

Backup - Dropbox

This job copies the content of Dropbox onto the Drobo unit.
First it checks that the Dropbox folder on the server is up to date and in sync; if not, it attempts to force the sync first.
Then it copies the latest entries from Dropbox into the medical archive, and finally updates the entire content of Dropbox on the Drobo.
It is executed every day.
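
The first step, checking that Dropbox is in sync before copying anything, could look roughly like the Python sketch below. It assumes the Dropbox CLI is available as `dropbox` and that `dropbox status` prints "Up to date" when the client is idle and fully synced; both are assumptions about the local setup, and the function names are made up for this example.

```python
import subprocess


def is_in_sync(status_output: str) -> bool:
    """Interpret the text printed by the status command.

    Assumption: an idle, fully synced client reports "Up to date".
    """
    return status_output.strip().lower().startswith("up to date")


def dropbox_status(command=("dropbox", "status")) -> str:
    """Run the status command and return what it printed.

    The command name is an assumption; adjust it to the local installation.
    """
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return result.stdout
```

A job would call `is_in_sync(dropbox_status())` and go on to the copy steps only when it returns True. Keeping the parsing separate from the command call makes the parsing easy to test without Dropbox installed.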

Backup - Milo

Milo is the main machine, NOT the server.
This job is still under development, but the idea is to sync archives from Milo to the server, i.e. the data on Milo has higher priority.
Of course it runs ONLY if Milo is on; otherwise it fails (as in the screenshot above).
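
A job like this could test whether the machine is reachable before starting. Here is a minimal Python sketch, assuming the machine answers on the SSH port (22) when it is on. The port, the timeout and the host name used in the note after the code are assumptions, not part of my setup description.

```python
import socket


def host_is_up(host: str, port: int = 22, timeout: float = 3.0,
               connect=socket.create_connection) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    The connect parameter exists only so the check can be exercised
    without a real network; by default it is socket.create_connection.
    """
    try:
        with connect((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False
```

With something like `host_is_up("milo")` at the top of the job, the job could stop early with a clear message instead of failing halfway through the sync.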

Backup - Opus

Opus is the main server, and this is the "main" job that actually keeps the content on the server and on the Drobo in sync.
The job is not executed every day.
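
As an illustration of how a job might call unison non-interactively, here is a small Python sketch that builds the command line. The two directory paths in the example are hypothetical, and `-batch` (unison's option to ask no questions) is the only option used; everything else is left to default.prf.

```python
import subprocess
from typing import List


def unison_command(source_root: str, target_root: str) -> List[str]:
    """Build a unison invocation that syncs two roots without prompting."""
    return ["unison", source_root, target_root, "-batch"]


def run_sync(source_root: str, target_root: str) -> int:
    """Run unison and return its exit code, so the caller can fail the job on errors."""
    return subprocess.run(unison_command(source_root, target_root), check=False).returncode


# Hypothetical paths, only for illustration.
print(unison_command("/srv/data", "/mnt/drobo/data"))
```

Since the command would run as the jenkins user, unison would pick up the preferences and archives stored under /var/lib/jenkins/.unison.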
