Disk snapshot backup and Restore

This document covers disk snapshot based backup and restoration of a FoundationDB database. This tool leverages disk level snapshots and gets a point-in-time consistent copy of the database. The disk snapshot backup can be used for test and development purposes, for compliance reasons or to provide an additional level of protection in case of hardware or software failures.

Introduction

FoundationDB’s disk snapshot backup tool makes a consistent, point-in-time backup of FoundationDB database without downtime by taking crash consistent snapshot of all the disk stores that have persistent data.

The prerequisite of this feature is to have crash consistent snapshot support on the filesystem (or the disks) on which FoundationDB is running.

The disk snapshot backup tool orchestrates the snapshotting of all the disk images and ensures that they are restorable to a consistent point in time.

Restore is achieved by copying or attaching the disk snapshot images to FoundationDB compute instances. Restore behaves as if the cluster were powered down and restarted.

Backup vs Disk snapshot backup

Backup feature already exists in FoundationDB and is detailed here Backup, Restore, and Replication for Disaster Recovery, any use of fdbbackup will refer to this feature.

Both fdbbackup and Disk snapshot backup tools provide a point-in-time consistent backup of FoundationDB database, but they operate at different levels and there are differences in terms of performance, features and external dependency.

fdbbackup operates at the key-value level. Backup involves copying of all the key-value pairs from the source cluster and restore involves applying all the key-value pairs to the destination database. Performance depends on the amount of data and the throughput with which the data can be read and written. This approach has no external dependency, there is no requirement for any snapshotting feature from the disk system. Additionally, it has an option for continuous backup with the flexibility to pick a restore point.

Disk snapshot backup and restore are generally high performance because it operates at disk level and data is not read or written through the FoundationDB stack. In environments where disk snapshot and restore are highly performant this approach can be very fast. Frequent backups can be done as a substitute to continuous backup if the backups are performant.

Limitations

  • No support for continuous backup
  • Feature is not supported on Windows operating system
  • Data encryption is dependent on the disk system
  • Backup and restore involves tooling which are deployment and environment specific to be developed by operators
  • snapshot command is a hidden fdbcli command in the current release and will be unhidden in a future patch release.

Disk snapshot backup steps

snapshot
This command line tool is used to create the snapshot. It takes a full path to a snapshot create binary and reports the status. Optionally, it can take additional arguments to be passed down to the snapshot create binary. It returns a unique identifier which can be used to identify all the disk snapshots of a backup. Even in case of failures the unique identifier is returned to identify and clear any partially create disk snapshots.

In response to the snapshot request from the user, FoundationDB will run the user specified snapshot create binary on all processes which have persistent data, binary should call filesystem/disk system specific snapshot create API.

Before using the snapshot command the following setup needs to be done

  • Write a program that will snapshot the local disk store when invoked by the fdbserver with the following arguments:

    • UID - 32 byte alpha-numeric unique identifier, the same identifier will be passed to all the nodes in the cluster, can be used to identify the set of disk snapshots associated with this backup
    • Version - version string of the FoundationDB binary
    • Path - path of the FoundationDB datadir to be snapshotted, datadir specified in [fdbserver] section
    • Role - tlog/storage/coordinator, identifies the role of the node on which the snapshot is being invoked
  • Install snapshot create binary on the FoundationDB instance in a secure path that can be invoked by the fdbserver

  • Set a new config parameter whitelist_binpath in [fdbserver] section, whose value is the snapshot create binary absolute path. Running any snapshot command will validate that it is in the whitelist_binpath. This is a security mechanism to stop running a random/insecure command on the cluster by a client using the snapshot command. Example configuration entry will look like:

    whitelist_binpath = "/bin/snap_create.sh"
    
  • snapshot create binary should capture any additional data needed to restore the cluster. Additional data can be stored as tags in cloud environments or it can be stored in an additional file/directory in the datadir and then snapshotted. The section Disk snapshot backup specification describes the recommended specification of the list of things that can be gathered by the binary.

  • Program should return a non-zero status for any failures and zero for success

  • If the snapshot create binary process takes longer than 5 minutes to return a status then it will be killed and snapshot command will fail. Timeout of 5 minutes is configurable and can be set with SNAP_CREATE_MAX_TIMEOUT config parameter in [fdbserver] section. Since the default value is large enough, there should not be a need to modify this configuration.

snapshot is a synchronous command and when it returns successfully backup is considered complete and restorable. The time it takes to finish a backup is a function of the time it takes to snapshot the disk store. For example, if disk snapshot takes 1 second, time to finish backup should be less than < 10 seconds, this is general guidance and in some cases it may take longer. If the command is aborted by the user then the disk snapshots should not be used for restore, because the state of backup is undefined. If the command fails or aborts, operator can retry by issuing another snapshot command.

Example snapshot command usage:

fdb> snapshot /bin/snap_create.sh --param1 param1-value --param2 param2-value
Snapshot command succeeded with UID c50263df28be44ebb596f5c2a849adbb

will invoke the snapshot create binary on tlog role with the following arguments:

--param1 param1-value --param2 param2-value --path /mnt/circus/data/4502 --version 6.2.6 --role tlog --uid c50263df28be44ebb596f5c2a849adbb

Disk snapshot backup specification

Details the list of artifacts the snapshot create binary should gather to aid the restore.

Field Name Description Source of information
UID unique identifier passed with all the snapshot create binary invocations associated with a backup. Disk snapshots could be tagged with this UID. snapshot CLI command output contains the UID
FoundationDB Server Version software version of the fdbserver command line argument to snap create binary
CreationTime current system date and time time obtained by calling the system time
FoundationDB Cluster File cluster file which has cluster-name, magic and the list of coordinators, cluster file is detailed here Cluster files read from the location of the cluster file location mentioned in the command line arguments. Command line arguments of fdbserver can be accessed from /proc/$PPID/cmdline
Config Knobs command line arguments passed to fdbserver available from command line arguments of fdbserver or from foundationdb.conf
IP Address + Port host address and port information of the fdbserver that is invoking the snapshot available from command line arguments of fdbserver
LocalityData machine id, zone id or any other locality information available from command line arguments of fdbserver
Name for the snapshot file recommended name for the disk snapshot cluster-name:ip-addr:port:UID

snapshot create binary will not be invoked on processes which does not have any persistent data (for example, Cluster Controller or Master or CommitProxy). Since these processes are stateless, there is no need for a snapshot. Any specialized configuration knobs used for one of these stateless processes need to be copied and restored externally.

Management of disk snapshots

Unused disk snapshots or disk snapshots that are part of failed backups have to deleted by the operator externally.

Error codes

Error codes returned by snapshot command

Name Code Description Comments
snap_path_not_whitelisted 2505 Snapshot create binary path not whitelisted Whitelist the snap create binary path and retry the operation.
snap_not_fully_recovered_unsupported 2506 Unsupported when the cluster is not fully recovered Wait for the cluster to finish recovery and then retry the operation
snap_log_anti_quorum_unsupported 2507 Unsupported when log anti quorum is configured Feature is not supported when log anti quorum is configured
snap_with_recovery_unsupported 2508 Cluster recovery during snapshot operation not supported Recovery happened while snapshot operation was in progress, retry the operation.
snap_storage_failed 2501 Failed to snapshot storage nodes Verify that the snap create binary is installed and can be executed by the user running fdbserver
snap_tlog_failed 2502 Failed to snapshot TLog nodes ,,
snap_coord_failed 2503 Failed to snapshot coordinator nodes ,,
unknown_error 4000 An unknown error occurred ,,
snap_disable_tlog_pop_failed 2500 Disk Snapshot error No operator action is needed, retry the operation
snap_enable_tlog_pop_failed 2504 Disk Snapshot error ,,

Disk snapshot restore steps

Restore is the process of building up the cluster from the snapshotted disk images. There is no option to specify a restore version because there is no support for continuous backup. Here is the list of steps for the restore process:

  • Identify the snapshot disk images associated with the backup to be restored with the help of UID or creation time
  • Group disk images of a backup by IP address and/or locality information
  • Bring up a new cluster similar to the source cluster with FoundationDB services stopped and either attach the snapshot disk images or copy the snapshot disk images to the cluster in the following manner:
    • Map the old IP address to new IP address in a one to one fashion and use that mapping to guide the restoration of disk images
  • Compute the new fdb.cluster file based on where the new coordinators disk stores are placed and push it to the all the instances in the new cluster
  • Start the FoundationDB service on all the instances
  • NOTE: Process can have multiple roles with persistent data which share the same datadir. snapshot create binary will create multiple snapshots, one per role. In such case, snapshot disk images needs to go through additional processing before restore, if a snapshot image of a role has files that belongs to other roles then they need to be deleted.

Cluster will start and get to healthy state indicating the completion of restore. Applications can optionally do any additional validations and use the cluster.

Example backup and restore steps

Here are the backup and restore steps on an over simplified setup with a single node cluster and cp command to create snapshots and restore. This is purely for illustration, real world backup and restore scripts needs to follow all the steps detailed above.

  • Create a single node cluster by following the steps here Building a Cluster

  • Check the status of the cluster and write a few sample keys:

    fdb> status
    
    Using cluster file `/mnt/source/fdb.cluster'.
    
    Configuration:
      Redundancy mode        - single
      Storage engine         - ssd-2
      Coordinators           - 1
    
    Cluster:
      FoundationDB processes - 1
      Zones                  - 1
      Machines               - 1
      Memory availability    - 30.6 GB per process on machine with least available
      Fault Tolerance        - 0 machines
      Server time            - 12/11/19 04:02:57
    
    Data:
      Replication health     - Healthy
      Moving data            - 0.000 GB
      Sum of key-value sizes - 0 MB
      Disk space used        - 210 MB
    
    Operating space:
      Storage server         - 72.6 GB free on most full server
      Log server             - 72.6 GB free on most full server
    
    Workload:
      Read rate              - 9 Hz
      Write rate             - 0 Hz
      Transactions started   - 5 Hz
      Transactions committed - 0 Hz
      Conflict rate          - 0 Hz
    
    Backup and DR:
      Running backups        - 0
      Running DRs            - 0
    
    Client time: 12/11/19 04:02:57
    
    fdb> writemode on
    fdb> set key1 value1
    Committed (76339236)
    fdb> set key2 value2
    Committed (80235963)
    
  • Write a snap create binary which copies the datadir to a user passed destination directory location:

    #!/bin/sh
    
    while (( "$#" )); do
        case "$1" in
            --uid)
                SNAPUID=$2
                shift 2
                ;;
            --path)
                DATADIR=$2
                shift 2
                ;;
            --role)
                ROLE=$2
                shift 2
                ;;
            --destdir)
                DESTDIR=$2
                shift 2
                ;;
            *)
                shift
                ;;
        esac
    done
    
    mkdir -p "$DESTDIR/$SNAPUID/$ROLE" || exit 1
    cp "$DATADIR/"* "$DESTDIR/$SNAPUID/$ROLE/" || exit 1
    
    exit 0
    
  • Install the snap create binary as /bin/snap_create.sh, add the entry for whitelist_binpath in [fdbserver] section, stop and start the foundationdb service for the configuration change to take effect

  • Issue snapshot command as follows:

    fdb> snapshot /bin/snap_create.sh --destdir /mnt/backup
    Snapshot command succeeded with UID 69a5e0576621892f85f55b4ebfeb4312
    
  • snapshot create binary gets invoked once for each role namely tlog, storage and coordinator in this process with the following arguments:

    --path /mnt/source/datadir --version 6.2.6 --role storage --uid 69a5e0576621892f85f55b4ebfeb4312 --destdir /mnt/backup
    --path /mnt/source/datadir --version 6.2.6 --role tlog --uid 69a5e0576621892f85f55b4ebfeb4312 --destdir /mnt/backup
    --path /mnt/source/datadir --version 6.2.6 --role coord --uid 69a5e0576621892f85f55b4ebfeb4312 --destdir /mnt/backup
    
  • Snapshot is successful and all the snapshot images are in destdir specified by the user in the command line argument to snapshot command, here is a sample directory listing of one of the coordinator backup directory:

    $ ls /mnt/backup/69a5e0576621892f85f55b4ebfeb4312/coord/
    coordination-0.fdq                                     log2-V_3_LS_2-b9990ae9bc00672f07264ad43d9d0792.sqlite-wal  processId
    coordination-1.fdq                                     logqueue-V_3_LS_2-b9990ae9bc00672f07264ad43d9d0792-0.fdq   storage-f0e72cdfed12a233e0e58291150ca597.sqlite
    log2-V_3_LS_2-b9990ae9bc00672f07264ad43d9d0792.sqlite  logqueue-V_3_LS_2-b9990ae9bc00672f07264ad43d9d0792-1.fdq   storage-f0e72cdfed12a233e0e58291150ca597.sqlite-wal
    
  • To restore the coordinator backup image, setup a restore datadir and copy all the coordinator related files to it:

    $ cp /mnt/backup/69a5e0576621892f85f55b4ebfeb4312/coord/coord* /mnt/restore/datadir/
    
  • Repeat the above steps to restore storage and tlog backup images

  • Prepare the fdb.cluster for the restore with new coordinator IP address, example:

    znC1NC5b:iYHJLq7z@10.2.80.40:4500 -> znC1NC5b:iYHJLq7z@10.2.80.41:4500
    
  • foundationdb.conf can be exact same copy as the source cluster for this example

  • Once all the backup images are restored, start a new fdbserver with the datadir pointing to /mnt/restore/datadir and the new fdb.cluster.

  • Verify the cluster is healthy and check the sample keys that we added are there:

    fdb> status
    
    Using cluster file `/mnt/restore/fdb.cluster'.
    
    Configuration:
      Redundancy mode        - single
      Storage engine         - ssd-2
      Coordinators           - 1
    
    Cluster:
      FoundationDB processes - 1
      Zones                  - 1
      Machines               - 1
      Memory availability    - 30.5 GB per process on machine with least available
      Fault Tolerance        - 0 machines
      Server time            - 12/11/19 09:04:53
    
    Data:
      Replication health     - Healthy
      Moving data            - 0.000 GB
      Sum of key-value sizes - 0 MB
      Disk space used        - 210 MB
    
    Operating space:
      Storage server         - 72.5 GB free on most full server
      Log server             - 72.5 GB free on most full server
    
    Workload:
      Read rate              - 7 Hz
      Write rate             - 0 Hz
      Transactions started   - 3 Hz
      Transactions committed - 0 Hz
      Conflict rate          - 0 Hz
    
    Backup and DR:
      Running backups        - 0
      Running DRs            - 0
    
    Client time: 12/11/19 09:04:53
    
    fdb> get key1
    `key1' is `value1'
    fdb> get key2
    `key2' is `value2'