Backblaze B2 backup setup

Introduction

Recently, I’ve been thinking more and more about backups for my small (but growing) homelab. The golden rule is to follow the 3-2-1 method for backups:

  • 3 backups
  • 2 different types of media
  • 1 backup offsite

Current setup

Currently, I keep an encrypted external HDD at home and another at work. Every couple of weeks, I perform a backup to both and rotate the drives (this covers a 2-1-1 backup).

Planned setup

I’d like to add cloud storage for a full 3-2-1 backup. My idea is to centralize all my backups to one location, then send the backups offsite to a cloud storage provider. The setup below is my final goal and will fulfill my 3-2-1 requirement.

Storage providers

For this, I was looking for a raw storage endpoint with some sort of API or command line interface. I was not interested in a file syncing service (e.g., Google Drive or Dropbox) or a cloud backup solution (e.g., Crashplan or Carbonite). While looking for cloud storage providers, I compared the following:

I ended up choosing Backblaze B2 storage. They seemed to be the cheapest, had the most straightforward pricing, and were the easiest to set up with the backup program I was using.

Full disclosure: I was already a Backblaze fanboy, subscribed to their great blog where they post yearly stats on their hard drives. But if that’s not enough, they offer free restores via USB flash drive or external HDD if your data is too big to download. And if you need to upload up to 40TB of data, you can request a Fireball (not free, but still cool).

Backup programs

While looking for backup programs, I compared the following:

I ended up choosing Duplicity. It seemed to be the most popular program; it supports incremental backups and B2 storage, and it supports encryption with GPG.

Set up B2

Sign up and install B2

Sign up for a B2 account if you don’t have one already. You can download the official B2 command line tool from these instructions, but I’m installing the package from the AUR using pacaur. Note – You can create a bucket from the website if you don’t want to install the B2 command line tool.

pacaur -S backblaze-b2

Set up a bucket

Start by authorizing your account (substitute your account ID as needed). You will be prompted for your Application Key, which you can get in the B2 control panel.

backblaze-b2 authorize_account xxxxxxxxxxx

Now, create a bucket (make sure it is allPrivate). The bucket name must be globally unique to all of Backblaze, not just your account. You can have up to 100 buckets per account.

backblaze-b2 create_bucket server-backups allPrivate

Finally, list your available buckets.

backblaze-b2 list_buckets

Set up GPG

I highly recommend you encrypt your backups using GPG. It’s integrated into Duplicity and will protect your files from prying eyes. I won’t be covering it here, but check out my other guide on how to create a GPG key. For this setup, I will be using separate keys for encryption and signing.
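
If you already have keys, you can look up the short key IDs that the script further down expects. This is just a quick check (assuming GnuPG; older versions print a slightly different layout):

gpg --list-secret-keys --keyid-format short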

Disclaimer – Don’t lose the keys or the passphrases to the keys. For example, don’t back up the GPG keys using Duplicity, then have your hard drive crash, which would require the GPG keys to unlock Duplicity. Store the keys in a separate backup by themselves.
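
As a rough sketch of that separate backup (EEEEEEEE and FFFFFFFF are the same placeholder key IDs used in the script below, and the file names are just examples), export the keys to armored files and copy them somewhere safe, such as an external drive:

gpg --export --armor EEEEEEEE FFFFFFFF > gpg-public-keys.asc
gpg --export-secret-keys --armor EEEEEEEE FFFFFFFF > gpg-private-keys.asc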

Set up Duplicity

First, install Duplicity.

sudo pacman -S duplicity

Duplicity basics

The basic syntax for Duplicity is below.

duplicity [SOURCE] [DESTINATION]

To back up directly to a server via SFTP, use a command similar to the one below.

duplicity ~/backups sftp://username@server/directory/

To back up a folder to your B2 bucket, use a command similar to the one below. Substitute your account ID, application key, and bucket name as needed.

duplicity ~/backups b2://[account_id]:[application_key]@[bucket_name]/[directory]

Duplicity also handles rotating backups. Here, I’m removing backups older than 3 months.

duplicity remove-older-than 3M b2://[account_id]:[application_key]@[bucket_name]/[directory]
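
Restores use the same syntax with the source and destination swapped. The commands below are untested sketches (the target paths and the path inside the backup are placeholders): the first restores the whole backup, and the second pulls out a single file with --file-to-restore.

duplicity restore b2://[account_id]:[application_key]@[bucket_name]/[directory] ~/restore

duplicity restore --file-to-restore path/to/file b2://[account_id]:[application_key]@[bucket_name]/[directory] ~/restore/file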

Duplicity script

Because Duplicity has so many command line options, it’s easier to set up a script and run it via cron.

#!/bin/sh

# Backblaze B2 configuration variables
B2_ACCOUNT="AAA"
B2_KEY="BBB"
B2_BUCKET="CCC"
B2_DIR="backups"

# Local directory to backup
LOCAL_DIR="/home/DDD/backups"

# GPG keys (last 8 characters of each key's fingerprint)
ENC_KEY="EEEEEEEE"
SGN_KEY="FFFFFFFF"
export PASSPHRASE="GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG"
export SIGN_PASSPHRASE="HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH" 

# Remove files older than 90 days
duplicity \
 --sign-key $SGN_KEY --encrypt-key $ENC_KEY \
 remove-older-than 90D --force \
 b2://${B2_ACCOUNT}:${B2_KEY}@${B2_BUCKET}/${B2_DIR}

# Perform the backup, make a full backup if it's been over 30 days
duplicity \
 --sign-key $SGN_KEY --encrypt-key $ENC_KEY \
 --full-if-older-than 30D \
 ${LOCAL_DIR} b2://${B2_ACCOUNT}:${B2_KEY}@${B2_BUCKET}/${B2_DIR}

# Cleanup failures
duplicity \
 cleanup --force \
 --sign-key $SGN_KEY --encrypt-key $ENC_KEY \
 b2://${B2_ACCOUNT}:${B2_KEY}@${B2_BUCKET}/${B2_DIR}

# Show collection-status
duplicity collection-status \
 --sign-key $SGN_KEY --encrypt-key $ENC_KEY \
  b2://${B2_ACCOUNT}:${B2_KEY}@${B2_BUCKET}/${B2_DIR}

# Unset variables
unset B2_ACCOUNT
unset B2_KEY
unset B2_BUCKET
unset B2_DIR
unset LOCAL_DIR
unset ENC_KEY
unset SGN_KEY
unset PASSPHRASE
unset SIGN_PASSPHRASE 
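
To schedule the script, a crontab entry like the one below should work (the script path and log file are placeholders for wherever you keep yours). Run crontab -e as the same user that owns the GPG keys.

# Run the B2 backup script every day at 2:30 AM and append the output to a log
30 2 * * * /home/DDD/bin/b2-backup.sh >> /var/log/b2-backup.log 2>&1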

Hope this helps!

Logan

57 thoughts on “Backblaze B2 backup setup”

    • None yet, I’m still looking to get a NAS to run this on. However, I plan on doing it once a week when I do get it set up.

  1. Reviewing the script: I generated PGP keys using gnupg. I located the keys and know the passphrase I used when making them. I am unsure which field I input the public key into, and do I input the passphrase I used to create the keys? I’m not exactly sure about the fields below. I know you mentioned you created 2 sets of keys; I only created 1 set (pub/priv). Thanks for the help

    ENC_KEY, SGN_KEY, PASSPHRASE, SIGN_PASSPHRASE

    • Sorry for the confusion.

      ENC_KEY="EEEEEEEE" and SGN_KEY="FFFFFFFF" are the last eight digits of the public fingerprint of your keys (in your case, it’s the same key).

      export PASSPHRASE="GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG" and export SIGN_PASSPHRASE="HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH" are the passwords to your keys (in your case, it’s the same password because it’s the same key).

  2. Thanks for the clarification.

    Where does the public key need to be located for this script to work? Or does it pull the key from MIT’s servers? Thank you!

    • Keep the public key in your ~/.gnupg directory (the default location). Duplicity knows to look in that directory (assuming you’re running the script as you).

  3. Hi,
    When I followed your script and execute, I get the following error:
    UnsupportedBackendScheme: scheme not supported in url: b2://<>:<>@B2 Bucket/B2 Dir.
    (I’ve removed my info from the error).
    Thanks.

    • Did you remove the brackets, or is that part of your command?

      Are you running this on Linux? If so, what distro? Also, I see there is a bug open for this too.

      • Logan,
        Thanks for the reply. When I couldn’t get this to work on my CentOS 6.9 server, I moved on to rclone.
        I saw the open bug and it’s the same issue I’m running into with my CentOS setup. I’ve been holding off on upgrading to CentOS7.x. I guess I need to start planning the migration.
        Thanks.

  4. Specifically, what GPG files would you need to have stored elsewhere in order to restore files from an encrypted B2 bucket? Do you have a sample restore script?

    • You would need to keep your private key (since your files are encrypted with your public key), but I would back up all your public and private keys from ~/.gnupg (and store the backup somewhere secure).
      cd
      tar -czf gpg.tgz ./.gnupg

      You can use the command below as an example, but I haven’t tested it.
      duplicity --sign-key $SGN_KEY --encrypt-key $ENC_KEY restore --file-to-restore ${B2_DIR} b2://${B2_ACCOUNT}:${B2_KEY}@${B2_BUCKET}

  5. I generated the keys on my desktop, then copied the entire dir onto the remote comp running the script (where the data is located). When I execute the script, I keep running into the below (it seems that it doesn’t think the keys are already there, which they are).

    root@Backblaze # ./test.sh
    Local and Remote metadata are synchronized, no sync needed.
    Last full backup date: none
    No old backup sets found, nothing deleted.
    Local and Remote metadata are synchronized, no sync needed.
    Last full backup date: none
    Last full backup is too old, forcing full backup
    GPGError: GPG Failed, see log below:
    ===== Begin GnuPG log =====
    gpg: keyring `/root/.gnupg/secring.gpg' created
    gpg: keyring `/root/.gnupg/pubring.gpg' created
    gpg: no default secret key: secret key not available
    gpg: [stdin]: sign+encrypt failed: secret key not available
    ===== End GnuPG log =====

    Local and Remote metadata are synchronized, no sync needed.
    Last full backup date: none
    No extraneous files found, nothing deleted in cleanup.
    Local and Remote metadata are synchronized, no sync needed.
    Last full backup date: none
    Collection Status
    -----------------
    Connecting with backend: BackendWrapper
    Archive dir: /root/.cache/duplicity/3c1c4445206909a752edc39362f0fef7

  6. keys are located in the /.gnupg dir

    Also, this is on FreeBSD, but that shouldn’t matter.
    I wasn’t able to generate enough entropy, which is why I needed to create the keys elsewhere and then move the /.gnupg dir over to the server.

  7. root@Backblaze:~/.gnupg # ls -l

    total 41
    drwxr-xr-x 2 root wheel 3 Sep 1 01:13 crls.d
    -rwxr--r-- 1 root wheel 2912 Sep 1 01:13 dirmngr.conf
    -rwxr--r-- 1 root wheel 5191 Sep 1 01:13 gpg.conf
    drwxr-xr-x 2 root wheel 3 Sep 1 01:13 openpgp-revocs.d
    drwxr-xr-x 2 root wheel 4 Sep 1 01:13 private-keys-v1.d
    -rw------- 1 root wheel 0 Sep 1 01:15 pubring.gpg
    -rwxr--r-- 1 root wheel 1390 Sep 1 01:13 pubring.kbx
    -rwxr--r-- 1 root wheel 32 Sep 1 01:13 pubring.kbx~
    -rw------- 1 root wheel 0 Sep 1 01:15 secring.gpg
    -rwxr--r-- 1 root wheel 1280 Sep 1 01:13 trustdb.gpg

    I had an idea that maybe the permissions were incorrect since I had copied the files from my desktop to the server (3 files were created at 01:15 as a result of the script not detecting the keys already in there). What do you think? thanks!

  8. I spent an hour or so bashing my head on a wall because of this error:

    gpg: no default secret key: bad passphrase
    gpg: [stdin]: sign+encrypt failed: bad passphrase

    Turns out my passphrase had a few characters that bash didn’t like, and using single quotes, instead of double, for the passphrase variables in the backup script solved my issue.

    Thanks for the guide

  9. I’ve found today that adding these two parameters increases the likelihood of a successful backup to B2:

    --timeout 90 \
    --tempdir /big-fast-file-system/temp \
    ....

    The default temp file system on my machine was too small and my backups would repeatedly fail with error:

    Attempt 1 failed. SSLError: ('The read operation timed out',)

    After changing the temp directory location, I would still get the above error, but less often. That’s when I tried adding the timeout parameter as 90 seconds. Without the parameter, it takes a default of 30 seconds.

    With these two additions, my 4+GB backups started completing successfully.

    Thanks for the leg-up!

    • Thanks for the info! I’m planning a large backup of 100GB, so if it fails I’ll edit my script to include your recommendations.

  10. Another suggestion for the script is to not delete old backups — even those older than 90 days — until you know that you have a new, good backup.

    It would be a shame if your last good backup was heartlessly deleted before getting a new, good backup.

    My $0.02..

    • I agree with you there. I’m only doing the delete before the backup to free up space, but you could easily swap those two portions around.

  11. Since I’ve been struggling with it for a while I thought I’d share my experiences trying to restore a directory from B2 that I made with your script. I’ve backed up to a bucket with 3 directories a,b and c using a small quantity of test data. So on the backend I have:

    Bucket-name
    /a
    /b
    /c

    The command that successfully restored directory b is this:

    duplicity \
    restore \
    --sign-key $FULL_KEY \
    --encrypt-key $FULL_KEY \
    --file-to-restore ${B2_RESTORE} \
    b2://${B2_ACCOUNT}:${B2_KEY}@${B2_BUCKET}/${B2_DIR} ${LOCAL_DIR}

    Where:

    B2_DIR="/b"
    B2_RESTORE="/"
    LOCAL_DIR="/backup/restore"

    This isn’t how the man page suggests it should work but if I didn’t include the bucket directory I found that I’d get an error “No backup chains with active signatures found”.

  12. Sorry – I should have also mentioned that I’m on Debian 9 running Duplicity 0.7.14 (installed manually from the website tar.gz)

  13. Great article, but two questions: do I need a stage dir?

    In other words, if I have 1TB to backup, is duplicity going to create a temporary compressed copy of that 1TB in /tmp (or wherever) before it uploads?

    Also, what if I’m restoring 1 file from that 1TB full? Can it extract one file out of the .tar.gz on b2?

  14. FYI – there is no need for all of the unsets at the end of the script, since the variables are local to the script (unlike .bat/.cmd scripts from DOS/Windows).

  15. Hey Logan,

    Just wanted to say great blog! I stumbled upon it googling something like “Archer C7 OpenWRT OpenVPN server” and am finding all your articles awesome. I’m going through and bookmarking weekend projects.

    I’ve been kicking myself for quite some time to setup some offsite backups, and this seems like a fun project for this weekend. I’ll report back how things go.

    Spoilers – I’ll probably just sign up for Backblaze B2 and sync from my Synology NAS. I’m impressed that they’ve made it so simple.

    • Thanks, always happy to help out!

      Let me know how it goes! I’m looking at a Synology myself. Which model do you have? Would you recommend it?

      • Hi Logan – I have a Synology DS213J.

        I’m happy with it, and I’m glad it lasted me from the start of 2013, all the way out to now. Synology have been great at releasing regular updates. It’s a great NAS for general users who want access to their files, and access to community packages (torrent clients, music servers, couchpotato/sonarr, nzbget/sabnzbd).

        If you purchase a more recent model, you can even run Docker containers.

        I’d say Synology are the best out of those prebuilt NAS devices. If my main purpose for buying it was to tinker, I may have preferred my own FreeNAS/unRAID box.

        I also like that support for the different backup targets is built right into the device. I did a proof of concept, syncing a few GB of data to Backblaze. The encryption / consistency checks added less overhead than I expected.

        I’m going to do a bit more research before uploading everything though.

        • Very cool! I’m looking at the DS216J.

          Let me know how Backblaze works out for you on a large upload.

  16. hi Logan,
    first of all thank you for your fantastic guide! I used it, working perfectly.

    I uploaded ~15GB so far and I used ~5000 transactions. (total amount will be ~400GB)
    I just realized that Backblaze B2 is charging for Class B and Class C transactions.
    – storage cost is $0.005/GB
    – Class B transaction $0.004/10000 calls (first 2500 is free every day)
    – Class C transaction $0.004/1000 calls (first 2500 is free every day)

    Technical question:
    is there any difference in number of transactions for the following two methods?
    1, your script
    2, duplicity to local drive first then simply upload to cloud

    (reason for asking is that I see the following transaction types/numbers when using your method:
    – delete file / 1063 <- why is it deleting?
    – get upload URL / 1065
    – list file names / 2126
    – upload file / 1058 )

    the cost of transactions are quite low, so it is a theoretical question only 🙂
    thank you
    Gergo

    • I’ll admit, I haven’t looked much at the transaction differences yet. As you mentioned, the costs are so low, it’s a negligible amount, especially once you get past the first large upload.

      As for your question, I don’t think there would be a difference, unless you uploaded your files as one large compressed file.

      Also, to save some transactions, you could:
      -remove the deletion of old files, but your required B2 storage would never stop growing
      -remove the collection status, since you’re basically just listing files out

  17. Logan, thanks for the incredibly helpful guide. With your help, I was able to get some backups moving off to B2. However, I ran into a hiccup, and I’m wondering if you or others have had this issue:

    I’m doing a backup of about 800 GB, but 24 hours in (give or take a little bit), the backup failed after only pushing 60 GB. The initial command was set to run a full backup, and your script was set up in a “nohup … &” command in terminal. Is there a way to resume a failed backup?

    For what it’s worth, the backup failed on this error:
    Writing duplicity-full.20180407T234349Z.vol2091.difftar.gpg
    Giving up after 5 attempts. HTTPError: HTTP Error 401: Unauthorized

    Thanks for any help or insight you can offer!

    • I haven’t run that large of a backup yet, but another commenter added a few flags to assist with large backups. Might be worth checking out.

      • Thanks, and I did see that. I did a little more digging and discovered that Duplicity will fail after 24 hours of being connected to B2. This was solved in version 0.7.08 (https://bugs.launchpad.net/duplicity/+bug/1588503), but the update seems to not have been pushed out to the repositories for Ubuntu 16.04. I will run a manual install and test it out.

          • That did work! And thanks for the recommendation with the PPA.

            For anyone interested, I ran the same backup script again, and Duplicity was able to check what was already backed up. It just picked up where it left off, and it’s been uploading like a charm.

  18. Hey,

    I’ve run into a couple of issues. If I run it as root, I get a GPG key not found error. Where would I call this in the script for root?

    When I run it as my normal user I get

    Traceback (most recent call last):
    File "/usr/bin/duplicity", line 1532, in <module>
    with_tempdir(main)
    File "/usr/bin/duplicity", line 1526, in with_tempdir
    fn()
    File "/usr/bin/duplicity", line 1377, in main
    globals.lockfile.acquire(timeout=0)
    File "/usr/lib/python2.7/dist-packages/lockfile/linklockfile.py", line 21, in acquire
    raise LockFailed("failed to create %s" % self.unique_name)
    LockFailed: failed to create /home/johndoe/.cache/duplicity/xxxxxxxxxxx/xxxxxxxxx

  19. I followed this tutorial and completed a full backup of my machine. It took ~54 hours and was ~500GB. The output showed 0 errors. However, when I try any commands such as collection-status or list-current-files, duplicity finds no backups at all. When I browse the bucket, I can see that it contains ~500GB of files. There are a lot of files of 209.8MB, with different timestamps and volume numbers, like ‘pc/duplicity-full.20181016T214816Z.vol1.difftar.gpg’ or ‘pc/duplicity-full.20181018T114323Z.vol1221.difftar.gpg’. The ‘pc/’ is because I used the filepath [bucket name]/pc; I thought this would create a directory called ‘pc’ in the bucket and place the files there. When looking at the info for the bucket, under ‘Unfinished Large Files’ there is one file, ‘pc%2Fduplicity-full.20181016T214816Z.vol541.difftar.gpg’. Could this be the reason Duplicity can’t find any backups? I wonder if I tried to back up too much at once, and whether the uploads being on different days is an issue. I can’t find much information elsewhere online; any help would be much appreciated.

    • Tristan, I switched from Duplicity to rclone, so I can’t be much help here unfortunately. Yes, it’s possible that Duplicity can’t create a list of all your files if it isn’t done uploading them. Did you check the log files?

      • I haven’t checked the log files yet. I didn’t specify a log file when I ran the command, and I’m having some trouble finding if and where default logs are stored. I am going to try to run another full backup with --log-file. Why did you decide to use rclone instead?

        • Check out this post.

          I ended up choosing rclone for this task, instead of Duplicity. Duplicity is great, but it requires a good bit of memory to run, and it writes temporary files to local storage while it encrypts and uploads them. Because the ODROID-HC2 has limited hardware, I didn’t want this to become a problem. As far as I can tell, rclone doesn’t have these problems or limitations. In addition, this backup is really a backup of a backup, so I’m just interested in pushing large amounts of data offsite as quickly as possible, which rclone seems to be suited for.

          • I will give it a read. I ran a full backup again with a log file and Info level verbosity. I guess I should have used Debug. I grepped the logs for errors but only found matches within file names. Both runs have had this output:
            NOTICE 1
            . --------------[ Backup Statistics ]--------------
            . StartTime 1540735365.25 (Sun Oct 28 10:02:45 2018)
            . EndTime 1540865521.74 (Mon Oct 29 22:12:01 2018)
            . ElapsedTime 130156.49 (36 hours 9 minutes 16.49 seconds)
            . SourceFiles 449398
            . SourceFileSize 440723528263 (410 GB)
            . NewFiles 449398
            . NewFileSize 440723528195 (410 GB)
            . DeletedFiles 0
            . ChangedFiles 0
            . ChangedFileSize 0 (0 bytes)
            . ChangedDeltaSize 0 (0 bytes)
            . DeltaEntries 449398
            . RawDeltaSize 440522009128 (410 GB)
            . TotalDestinationSizeChange 386360735399 (360 GB)
            . Errors 0
            . -------------------------------------------------
            .

            But when using list-current-files b2://[my info] I get this output:
            Last full backup date: none
            Collection Status
            -----------------
            Connecting with backend: BackendWrapper
            Archive dir: /home/tristan/.cache/duplicity/fbbf2e8111dd8bbf8168bbe9c17fd6fe

            Found 0 secondary backup chains.
            No backup chains with active signatures found
            No orphaned or incomplete backup sets found.

            Maybe it would be easier to switch to rclone anyway.

            • Ya sorry, I’m probably not going to be much help here. You could try to search/open a bug on Launchpad if you’re using Ubuntu.
              https://bugs.launchpad.net/duplicity

              I think Duplicity provides more secure encryption with GPG (as compared to rclone), but to me, it seemed that overhead was too much.
