Carbonio Backup Architecture

This section presents and describes the fundamental ideas required to understand the architecture of Carbonio Backup; each concept is then explained in its own section.
 
Before delving into the Carbonio Backup architecture, it is worth recalling two key metrics used when defining a backup plan: RPO and RTO.
The Recovery Point Objective (RPO) is the maximum amount of data that a stakeholder is willing to lose in the event of a disaster, whereas the Recovery Time Objective (RTO) is the maximum time that a stakeholder is willing to wait to recover their data.
By these definitions, the ideal value for both is zero; in practice, achievable values are close to zero and depend on the amount of data. Carbonio’s combination of Realtime Scanner and SmartScan keeps RTO and RPO to a minimum: the Realtime Scanner records all metadata changes as soon as they occur, whereas the SmartScan copies all items that have been modified, limiting possible data loss to items that changed between two consecutive SmartScan runs.
Item
Carbonio Backup’s whole design revolves around the concept of an item. An item is the smallest object preserved in the backup, for example:
  • an email message
  • a contact or a group of contacts
  • a folder
  • an appointment
  • a task
  • a Carbonio Files document
  • an account (including its settings)
  • a distribution list
  • a domain
  • a class of service (COS)

Note

The last three items (distribution lists, domains, classes of service) are handled by the SmartScan only, i.e., the Realtime Scanner does not record any change in their state.

There are also objects that are not items; these are never scanned for changes by the Realtime Scanner and are never part of a restore:
  • Server settings, i.e., the configuration of each server
  • Carbonio global configuration
  • Any customisation made to the software (Postfix, Jetty, etc.)

Every change to the metadata of any item managed by Carbonio is recorded and kept, so that the item can be restored to any point in time. In other words, whenever any metadata of an item changes, a “snapshot” of the whole item is taken and stored with a timestamp through a transaction. The following are some examples of metadata changes:
 
  • an email being read, deleted, or moved to a folder
  • a change in a contact’s name, address, or job
  • a file being added to or removed from a folder
  • a change in the status of an object (for example, an account)
 
Technically, an item is stored as a JSON array containing all the changes made over the item’s lifetime. More on this in the section The Structure of an Item.
 
A Deleted Item is an item that has been marked for deletion.

Note

An element in the trash bin is not considered a deleted item: it is a regular item placed in a folder that is special only to users. From Carbonio Backup’s point of view, the item has merely changed its state when moved to the trash bin.

Transaction

A transaction is a change in the state of an item. A change of state occurs whenever any of the metadata associated with the item is modified, for example by a user. A transaction can therefore be viewed as a snapshot of the metadata at a given point in time. Each transaction is assigned a Transaction ID, and an item can be rolled back to any of its previous transactions. More information may be found under Restore Strategies.
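
The relationship between an item and its transactions can be illustrated with a short Python sketch. The field names (`transaction_id`, `timestamp`, `metadata`) and the sample values are hypothetical and chosen only for illustration; they do not reflect the actual on-disk format used by Carbonio Backup.

```python
import json

# Hypothetical item history: one full metadata snapshot per transaction.
# Field names are illustrative, not Carbonio's real schema.
item_history = [
    {"transaction_id": 1, "timestamp": 1700000000,
     "metadata": {"folder": "Inbox", "read": False}},
    {"transaction_id": 2, "timestamp": 1700000600,
     "metadata": {"folder": "Inbox", "read": True}},
    {"transaction_id": 3, "timestamp": 1700003600,
     "metadata": {"folder": "Trash", "read": True}},
]


def metadata_at_transaction(history, transaction_id):
    """Return the metadata snapshot recorded by the given transaction."""
    for transaction in history:
        if transaction["transaction_id"] == transaction_id:
            return transaction["metadata"]
    raise KeyError(f"unknown transaction: {transaction_id}")


if __name__ == "__main__":
    # Moving a message to the trash is just another state change
    # (transaction 3); transaction 2 still describes it in the Inbox.
    print(json.dumps(metadata_at_transaction(item_history, 2)))
```

Because every transaction holds a complete snapshot, going back to an earlier state in this sketch is a simple lookup and needs no replay of intermediate changes.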

Realtime Scanner and SmartScan
The initial structure of the backup is built during the SmartScan’s Initial Scan: the actual content of an AppServer is read and used to populate the backup. If Scan Operation Scheduling is enabled in the Carbonio Admin Panel, the SmartScan then runs every time Carbonio Backup starts and once a day.

Important

SmartScan runs once a day at a fixed, configurable time and is never deferred. This means that if SmartScan cannot run for any reason (for example, the server is turned off or Carbonio is not running), it will not run again until the next day. You may, however, configure the Backup to run the SmartScan every time Carbonio is restarted (although this is discouraged), or run SmartScan manually to make up for the missed run.

SmartScan’s primary function is to check for objects that have been updated since the last run and to update the database with any new information.
 
The Realtime Scanner records every event that occurs on the system in real time, allowing recovery with split-second precision. Because the Realtime Scanner never deletes data from the backup, each item keeps its entire history. Furthermore, it can detect multiple modifications made to the same item at the same time and record them all as a single metadata change.
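
The idea of recording simultaneous modifications as a single metadata change can be sketched as follows. This is only an illustration: the event tuples and the assumption that "at the same time" means an identical timestamp are hypothetical, and the function is not part of Carbonio.

```python
def coalesce_events(events):
    """Merge events that hit the same item at the same timestamp.

    `events` is an iterable of (item_id, timestamp, changes) tuples, where
    `changes` is a dict of modified metadata fields. Returns a list of
    (item_id, timestamp, merged_changes) tuples in first-seen order.
    """
    merged = {}  # dicts preserve insertion order in Python 3.7+
    for item_id, timestamp, changes in events:
        merged.setdefault((item_id, timestamp), {}).update(changes)
    return [(item_id, ts, changes) for (item_id, ts), changes in merged.items()]


if __name__ == "__main__":
    events = [
        (10, 100, {"read": True}),
        (10, 100, {"folder": "Archive"}),  # same item, same instant
        (11, 100, {"read": True}),
    ]
    for update in coalesce_events(events):
        print(update)
```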
 
SmartScan and Realtime Scanner are both enabled by default. While each can be stopped independently, it is recommended to keep both running, because they are designed to complement each other.

Warning

If neither of the two Scan Operations is active, no backup is created.

When Should Scan Operations Be Disabled?
Because backups are written to disk, Scan Operations require disk I/O. There are therefore a number of situations in which either the SmartScan or the Realtime Scanner may (or should) be temporarily disabled. For example:

  • You have a high volume of transactions every day (or frequently work with Carbonio Files documents) and notice a heavy load on the server’s resources. In this scenario, you can temporarily disable the Realtime Scanner.
  • You start a migration. In this scenario, it is recommended to stop the SmartScan, because it would treat every migrated or restored item as a new one, generating a large number of disk I/O operations that may even cause the server to crash.
  • You receive and send a large number of emails each day. In this scenario, you should always keep the Realtime Scanner active; otherwise, all transactions would be backed up by the SmartScan alone, which may not finish in a reasonable time because of the resources required for I/O operations.
 
Backup Path 

The Backup Path is the location on disk where all backup and archive information is stored. Each server has its own Backup Path; no two servers can share the same one. It is organised as a hierarchy of directories whose root is, by default, /opt/zextras/backup/zextras/. This directory contains the following key files and directories:

  • map_[server_ID] are so-called map files, which indicate whether the Backup was imported from an external backup and carry the server’s unique ID in their filename.
  • accounts is a directory that contains the information of all accounts defined in the AppServer. The most relevant files and directories found there are:
      • account_info is a file that contains all of the account’s metadata, such as the password, signature, and preferences.
      • account_stat is a file that contains various statistics about the account, such as the ID of the last item stored by SmartScan.
      • backupstat is a file that stores general backup statistics, such as the timestamp of the first run.
      • drive_items is a directory with up to 256 subfolders (each named with two lowercase hexadecimal characters), in which Carbonio Files items are stored according to the last two characters of their UUID.
      • items is a directory with up to 100 subfolders (each named with two digits), in which items are stored according to the last two digits of their ID.
  • servers is a directory containing archives of each server’s configuration and customisations, the Carbonio configuration, and chat, one per day, up to the configured server retention time.
  • items is a directory that may hold up to 4096 subfolders, whose names consist of two hexadecimal (uppercase and lowercase) characters. Items in the AppServer are saved in the subfolder named after the last two characters of their ID.
  • id_mapper.log is a user object ID mapping file that links the original objects to the restored ones. It may be found in /backup/zextras/accounts/xxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/id_mapper.log. This file is only present in the event of an external restore.
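
The bucketing scheme used inside an account directory can be sketched in Python as follows. This is a minimal illustration, not Carbonio code: the function names are hypothetical, zero-padding of IDs below 10 is an assumption, and the sketch covers only the per-account items and drive_items directories.

```python
from pathlib import PurePosixPath

# Default Backup Path, as stated in this section.
BACKUP_ROOT = PurePosixPath("/opt/zextras/backup/zextras")


def account_item_bucket(item_id: int) -> str:
    """Two-digit bucket from the last two digits of a numeric item ID.

    Zero-padding for IDs below 10 is an assumption of this sketch.
    """
    return f"{item_id % 100:02d}"


def drive_item_bucket(uuid: str) -> str:
    """Two-character lowercase hex bucket from the end of a UUID."""
    return uuid[-2:].lower()


def account_item_dir(account_id: str, item_id: int) -> PurePosixPath:
    """Hypothetical directory in which an account's item would be stored."""
    return (BACKUP_ROOT / "accounts" / account_id / "items"
            / account_item_bucket(item_id))


if __name__ == "__main__":
    print(account_item_dir("xxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx", 1234))
    print(drive_item_bucket("9d16badb-e89e-4dff-b5b9-bd2bddce53e2"))
```

Spreading items over a fixed number of subfolders keeps each directory small, which is the usual reason for this kind of layout.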
Configuring the Backup Path
A Backup Path is a directory in which all items and metadata are stored. Each server must define exactly one Backup Path, which is unique to that server and cannot be reused. In other words, taking a Backup Path from a different server and setting it as the current Backup Path will result in an error. Any attempt to force this situation by tampering with the backup files would corrupt both old and new backup data.
 
The following command shows the current value of the Backup Path:
zextras$ carbonio config get server mail.example.com ZxBackup_DestPath

     server                                              9d16badb-e89e-4dff-b5b9-bd2bddce53e2
     values

             attribute                                                   ZxBackup_DestPath
             value                                                       /opt/zextras/backup/zextras/
             isInherited                                                 false
             modules
                     ZxBackup
 

To change the Backup Path, use the set subcommand instead of get and append the new path.

zextras$ carbonio config set server mail.example.com ZxBackup_DestPath /opt/zextras/new-backup/path
ok

The ok message will be displayed if the procedure is successful.

See also

You can do the same from the Carbonio Admin Panel under Server Config (Admin Panel ‣ Global Server Settings ‣ Server Config).

Retention Policy
The Retention Policy (also known as the retention period) specifies how many days an item marked for deletion is kept in the backup before being removed. The backup retention policies are as follows:
 
  • The data retention policy only applies to single items and is set to 30 days by default.
  • Account retention policy applies to the accounts and is set to 30 days by default.

All retention durations are adjustable; if set to 0 (zero), archives are retained indefinitely (infinite retention), and the Backup Purge is disabled.

To check the current value of each retention policy, use the corresponding command:

zextras$ carbonio config dump global | grep ZxBackup_DataRetentionDays
zextras$ carbonio config dump global | grep backupAccountsRetentionDays

To change either value, set it to the desired number of days (any integer), or to 0 for unlimited retention. For example, to set the retention time for both data and accounts to 15 days, use:

zextras$ carbonio config set global ZxBackup_DataRetentionDays 15
zextras$ carbonio config set global backupAccountsRetentionDays 15

If an account is deleted and needs to be restored after the Data retention time has expired, it will still be possible to recover all items up to the Account retention time, because even if all metadata has been purged, the digest can still contain the information needed to restore the item.

See also

You can set retention policies from the Carbonio Admin Panel under Server Config (Admin Panel ‣ Global Server Settings ‣ Server Config).

Backup Purge
The Backup Purge is a cleanup procedure that removes from the Backup Path any deleted item that has outlived the retention duration specified in the Data Retention Policy and Account Retention Policy.
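
The retention rule can be expressed as a small Python check. This is a sketch under stated assumptions: the function name is hypothetical, and whether an item is purged exactly at the boundary or only after it is not specified in this section (the sketch purges only once the period has been strictly exceeded).

```python
from datetime import datetime, timedelta, timezone


def is_purgeable(deleted_at, retention_days, now=None):
    """Return True if a deleted item has outlived its retention period.

    A retention of 0 means infinite retention: nothing is ever purged.
    """
    if retention_days == 0:
        return False
    now = now or datetime.now(timezone.utc)
    return now - deleted_at > timedelta(days=retention_days)


if __name__ == "__main__":
    now = datetime(2024, 3, 1, tzinfo=timezone.utc)
    deleted = datetime(2024, 1, 15, tzinfo=timezone.utc)
    print(is_purgeable(deleted, 30, now))  # deleted 46 days earlier
    print(is_purgeable(deleted, 0, now))   # infinite retention
```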
Coherency Check
The Coherency Check is specifically designed to detect corrupted metadata and BLOBs, and it examines a Backup Path more thoroughly than the SmartScan does.
 
While the SmartScan checks just objects that have been updated since the last SmartScan run, the Coherency Check checks all metadata and BLOBs in the Backup Path.
 
To start a Coherency Check via the CLI, use the carbonio backup doCoherencyCheck command:
 
zextras$ carbonio backup doCoherencyCheck *backup_path* [param VALUE[,VALUE]]

How Does Carbonio Backup Function?
Carbonio Backup was designed to store every version of every item. Because it is not meant as a system or operating system backup, it can work with different OS architectures and Carbonio versions.
Carbonio Backup enables administrators to generate an atomic backup of every item in the AppServer account and restore various items across accounts or even servers.
 
By default, Carbonio Backup saves all backup files in the local directory /opt/zextras/backup/zextras/. To be usable as the Backup Path, a directory must meet the following requirements:
 
  • The zextras user must be able to read and write to it.
  • It must reside on a case-sensitive filesystem.
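
Both requirements can be checked before pointing Carbonio Backup at a new directory. The following Python sketch is one possible way to do so, using only the standard library; the function names are hypothetical and the script is assumed to run as the zextras user, since access is checked for the current user.

```python
import os
import tempfile


def is_case_sensitive(directory):
    """Return True if `directory` lives on a case-sensitive filesystem."""
    with tempfile.NamedTemporaryFile(prefix="CaseProbe", dir=directory) as probe:
        head, name = os.path.split(probe.name)
        # On a case-insensitive filesystem the lowercased name resolves
        # to the same file, so it appears to exist.
        return not os.path.exists(os.path.join(head, name.lower()))


def is_usable_backup_path(directory):
    """Check read/write access for the current user and case sensitivity."""
    return (
        os.path.isdir(directory)
        and os.access(directory, os.R_OK | os.W_OK | os.X_OK)
        and is_case_sensitive(directory)
    )


if __name__ == "__main__":
    print(is_usable_backup_path("/opt/zextras/backup/zextras/"))
```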

Hint

You can change the default setting using either method shown in section Configuring the Backup Path.

When Carbonio Backup is started for the first time, it runs a SmartScan to fetch all data from the AppServer and build the initial backup structure, in which each item is stored along with all its metadata as a JSON array on a case-sensitive filesystem. After the first start, the Realtime Scanner, the SmartScan, or both can be used to keep the backup up to date and synchronised with the accounts.

The Structure of an Item
The item’s basic structure is a JSON Array that records all changes that occur during the item’s lifetime, such as information about emails (e.g., tags, visibility, email moved to a folder), contacts, tasks, single folders, groups, or Carbonio Files documents, and user preferences (e.g., password hash, general settings).
 
To improve performance, only the changes needed to restore an item are recorded. For example, there is no point in storing the user’s last login time or the IMAP and ActiveSync state, because if the account is restored onto a new one, the values of those attributes would still refer to the old account.
 
Because the timestamp of each transaction is recorded, data can be restored to its state at a specific moment of its lifetime.

During a restore, the engine evaluates the “start-date” and “end-date” attributes of all valid transactions.

The same principle is used to recover deleted items: when an item is deleted, its deletion timestamp is saved, which makes it possible to restore anything deleted within a given time window.
 
Even if the blob connected with the item changes, and therefore its digest changes (as is the case with Carbonio Files Document), the metadata records the validity of both the old and new digests.
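
The selection of the valid transaction for a given moment can be sketched as follows. The keys `start_date`, `end_date`, and `digest` mirror the attributes mentioned above, but their exact names and the use of epoch seconds are assumptions of this example, as is the convention that the current transaction has no end date.

```python
def valid_transaction_at(transactions, moment):
    """Return the transaction whose validity window contains `moment`.

    Each transaction is a dict with `start_date` and `end_date` (epoch
    seconds); `end_date` is None for the transaction that is still current.
    Returns None if no transaction was valid at `moment`.
    """
    for transaction in transactions:
        end = transaction["end_date"]
        if transaction["start_date"] <= moment and (end is None or moment < end):
            return transaction
    return None


if __name__ == "__main__":
    # A document whose blob changed once: both digests keep their validity.
    history = [
        {"start_date": 100, "end_date": 200, "digest": "aaa"},
        {"start_date": 200, "end_date": None, "digest": "bbb"},
    ]
    print(valid_transaction_at(history, 150))
```

In this sketch, restoring the document as it was at time 150 selects the transaction pointing to the old digest, even though a newer blob exists.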
SmartScan
Because the SmartScan only functions on accounts that have been updated since the previous SmartScan, it can enhance system performance and reduce scan time significantly.
 
A SmartScan is run every night by default (if Scan Operation Scheduling is enabled in the Carbonio Backup section of the Carbonio Admin Panel). Once a week, on a day chosen by the user, a Purge is executed together with the SmartScan to clear the volume hosting the Carbonio Backup of any deleted item that has exceeded its retention period.
 
The Carbonio Backup engine scans the Carbonio mailbox for items that have been modified since the last SmartScan. It updates any outdated entry and creates any item not yet present in the backup, while marking as deleted any item found in the backup but not in the Carbonio mailbox.
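
The reconciliation step described above amounts to a three-way comparison between the mailbox and the backup. A minimal Python sketch follows; the function name and the idea of comparing a per-item revision marker are assumptions made for illustration, not Carbonio’s actual implementation.

```python
def plan_smartscan(mailbox, backup):
    """Compare mailbox and backup states and return the actions to take.

    Both arguments map item ID -> last-modified marker (for example a
    revision number). Returns three sorted lists: items to create in the
    backup, items to update, and items to mark as deleted.
    """
    to_create = sorted(i for i in mailbox if i not in backup)
    to_update = sorted(i for i in mailbox
                       if i in backup and mailbox[i] != backup[i])
    to_mark_deleted = sorted(i for i in backup if i not in mailbox)
    return to_create, to_update, to_mark_deleted


if __name__ == "__main__":
    mailbox = {1: 5, 2: 7, 4: 1}
    backup = {1: 5, 2: 6, 3: 2}
    print(plan_smartscan(mailbox, backup))
```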
 
The backup’s configuration metadata is then updated, so that domains, accounts, COSs, and server configurations are saved along with a dump of all configuration.
 
When an LDAP backup is created, SmartScan saves a compressed dump in the Backup Path that may be used to recover a damaged configuration.

Note

If the LDAP backup cannot be executed (e.g., because the access credentials are wrong or invalid), SmartScan will simply skip the backup of the Directory Server configuration, but will nonetheless save a backup of all the remaining configuration.

When the External Restore feature is enabled, SmartScan builds a single (daily) archive for each account that contains all of the account’s metadata and saves it on an external drive. More information may be found in the section Backup on External Storage.
 
SmartScan can be started from the command line or configured through the Admin Panel (Admin Panel ‣ Global Server Settings ‣ Server Config).

Running a SmartScan

To start a SmartScan via the CLI, use the command:

zextras$ carbonio backup doSmartScan *start* [param VALUE[,VALUE]]

Checking the Status of a Running Scan

Before carrying out this check, it is suggested to verify how many operations are running, in order to find the correct UUID. You can do this with the command

zextras$ carbonio backup getAllOperations [param VALUE[,VALUE]]

To check the status of a running scan via the CLI, use the command

zextras$ carbonio backup monitor *operation_uuid* [param VALUE[,VALUE]]