Ready for the ultimate backup solution? Learn how to use Paperless with Rclone to back up your documents in the cloud fully automated and encrypted.
Last updated: Aug 31, 2024
Regularly creating backups in your paperless office, especially when using Paperless-ngx, is indispensable for several reasons:
You may have already heard of the 3-2-1 backup rule. It's a recommended strategy for creating backups and ensuring high data security. The rule is simple but effective.
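The three criteria: keep at least three copies of your data, store them on two different types of storage media, and keep one copy off-site.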
Implementing the 3-2-1 backup rule helps you effectively protect your data by diversifying across multiple storage types and locations.
If you want to follow the 3-2-1 backup rule, the next question is where to store your copies.
I will show you an example of setting up a Paperless backup automation from Linux to Google Drive using the open-source tool rclone.
Most NAS devices and VPSs run a Linux distribution, so the steps should be quite similar regardless of the device.
Rclone is a program for creating backups on cloud storage. It runs on Linux, macOS, and Windows and supports over 70 cloud providers (Google Drive, OneDrive, Dropbox, Amazon S3, etc.).
Rclone mirrors your data from your server to the cloud storage and transfers only new or changed files, so not everything needs to be re-uploaded with each change. To prevent the cloud provider from reading your data, we will also encrypt the backup with rclone.
The prerequisite for this guide is that you have already successfully installed Paperless-ngx and fed it with some documents. You also need an account with a cloud provider. Here I use Google Drive. The following steps are very similar and sometimes simpler for other cloud providers.
There are various ways to make Paperless-ngx backups. I recommend using the native “Paperless Document Exporter” and “Importer” instead of simply copying your entire Paperless folders. Paperless probably runs in Docker on your system, so some of the data may live in Docker volumes that a plain folder copy would miss. You can run the Document Exporter with a single command, and it exports not only all files but also configurations, user accounts, tags, and everything the algorithm has learned.
Log in to your server via SSH and execute the command to install rclone via snap:
sudo snap install rclone
This requires the snap daemon (snapd), which is usually pre-installed on Ubuntu. If the command above fails, install snapd first:
sudo apt update
sudo apt install snapd
For this guide, rclone must also be installed on a device with a web browser, because you need the browser to retrieve a token for connecting to the cloud provider (see Step 3).
Since I am using a VPS without a web browser, I install rclone additionally on my Mac. For this, I use the package manager homebrew:
brew install rclone
You can find downloads for other operating systems on the rclone website (https://rclone.org/downloads/).
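Since the token in Step 3 is created on one machine and pasted into another, it helps to confirm that rclone is installed on both and that the versions match. The following is a minimal sketch; the function name `rclone_version_line` is made up for this example.

```shell
# Print the first line of `rclone version`, which contains the version number.
# Returns a non-zero status if rclone is not on the PATH.
rclone_version_line() {
    if ! command -v rclone >/dev/null 2>&1; then
        echo "rclone not found" >&2
        return 1
    fi
    rclone version | head -n 1
}

# Usage: run this on the server and on the machine with the web browser,
# then compare the two lines.
# rclone_version_line
```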
Log into the Google API Console. It must be the same account that you want to use for Google Drive.
Select a project or create a new project.
Create a Google project
Under APIs & Services, click on + ENABLE APIS AND SERVICES and search for drive, then activate the Google Drive API.
Select the Google Drive API
Activate the Google Drive API
Click on Credentials in the left sidebar, then click on CONFIGURE CONSENT SCREEN.
Click on configure consent screen
Select External and click CREATE. Then enter an app name, the user support email (your own email is okay), and the developer contact information (your own email is okay). Click SAVE AND CONTINUE (all other fields are optional).
Enter the app information
Click on ADD OR REMOVE SCOPES. Add the scopes .../auth/docs, .../auth/drive, and .../auth/drive.metadata.readonly so that you can edit, create, and delete files with rclone. After that, click UPDATE and then SAVE AND CONTINUE.
Add the relevant scopes
Add your own account as a test user. Then click SAVE AND CONTINUE.
Then click on Credentials in the left sidebar again. Click on + CREATE CREDENTIALS and select OAuth client ID.
Choose to create an OAuth client ID
Select Desktop app as the application type and click CREATE.
Create an OAuth client ID
Now you will see the Client ID and Client secret. Copy both! We will need these in Step 3 for rclone.
Copy the client ID and client secret
Click on OAuth consent screen in the left sidebar, then click on PUBLISH APP and confirm.
Go back to your terminal and create a new remote connection with rclone. The command for this is:
rclone config
e) Edit existing remote
n) New remote
d) Delete remote
r) Rename remote
c) Copy remote
s) Set configuration password
q) Quit config
e/n/d/r/c/s/q>
Type n to create a new remote. Then enter a name for the remote (e.g., gdrive).
After that, select the cloud provider by entering its number (17 for Google Drive in my case). The numbering can differ between rclone versions, so check the list that rclone prints.
Option Storage.
Type of storage to configure.
Choose a number from below, or type in your own value.
...
17 / Google Drive
\ (drive)
...
Storage> 17
Next, enter the Client ID from Step 2:
Option client_id.
Google Application Client Id
Setting your own is recommended.
See https://rclone.org/drive/#making-your-own-client-id for how to create your own.
If you leave this blank, it will use an internal key which is low performance.
Enter a value. Press Enter to leave empty.
client_id> my-client-id
Next, enter the Client Secret from Step 2:
Option client_secret.
OAuth Client Secret.
Leave blank normally.
Enter a value. Press Enter to leave empty.
client_secret> my-client-secret
Choose the access permissions for rclone by entering a number (for the backup to work, number 3 is sufficient).
Option scope.
Comma separated list of scopes that rclone should use when requesting access from drive.
Choose a number from below, or type in your own value.
Press Enter to leave empty.
1 / Full access all files, excluding Application Data Folder.
\ (drive)
2 / Read-only access to file metadata and file contents.
\ (drive.readonly)
/ Access to files created by rclone only.
3 | These are visible in the drive website.
| File authorization is revoked when the user deauthorizes the app.
\ (drive.file)
/ Allows read and write access to the Application Data folder.
4 | This is not visible in the drive website.
\ (drive.appfolder)
/ Allows read-only access to file metadata but
5 | does not allow any access to read or download file content.
\ (drive.metadata.readonly)
scope> 3
For the next step, you can leave it blank by pressing ENTER.
Option service_account_file.
Service Account Credentials JSON file path.
Leave blank normally.
Needed only if you want use SA instead of interactive login.
Leading `~` will be expanded in the file name as will environment variables such as `${RCLONE_CONFIG_DIR}`.
Enter a value. Press Enter to leave empty.
service_account_file>
After that, select the default by pressing ENTER.
Edit advanced config?
y) Yes
n) No (default)
y/n>
If your server does not have a web browser, enter n.
Use web browser to automatically authenticate rclone with remote?
* Say Y if the machine running rclone has a web browser you can use
* Say N if running rclone on a (remote) machine without web browser access
If not sure try Y. If Y failed, try N.
y) Yes (default)
n) No
y/n> n
Copy the entire line that starts with rclone authorize "drive" and execute it on your computer. To do this, open a new tab in your terminal. Your web browser should launch, and you can log into your Google account. Then copy the token and paste it back into your server terminal.
Option config_token.
For this to work, you will need rclone available on a machine that has
a web browser available.
For more help and alternate methods see: https://rclone.org/remote_setup/
Execute the following on the machine with the web browser (same rclone
version recommended):
rclone authorize "drive" "very-long-code"
Then paste the result.
Enter a value.
config_token> my-token
Select the default by pressing ENTER:
Configure this as a Shared Drive (Team Drive)?
y) Yes
n) No (default)
y/n> n
You will see a summary of your configuration. Press ENTER.
Keep this "gdrive" remote?
y) Yes this is OK (default)
e) Edit this remote
d) Delete this remote
y/e/d> y
Now the connection to your Google Drive account is established.
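Before moving on, you may want to confirm that the new remote is really usable. The sketch below assumes the remote is called gdrive as in this guide; `verify_remote` is a name chosen only for illustration.

```shell
# Check that a remote exists in the rclone configuration and is reachable.
# $1 is the remote name without the trailing colon, e.g. "gdrive".
verify_remote() {
    local remote_name="$1"
    # `rclone listremotes` prints one configured remote per line, e.g. "gdrive:"
    if ! rclone listremotes | grep -qx "${remote_name}:"; then
        echo "remote ${remote_name} is not configured" >&2
        return 1
    fi
    # Listing the top-level folders fails if the token or scope is wrong
    rclone lsd "${remote_name}:"
}

# Usage:
# verify_remote gdrive
```

With scope 3 (drive.file), rclone only sees files it created itself, so an empty listing right after setup is expected and not an error.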
We will now create another remote. This remote will form an encryption layer over the cloud storage remote that we just created. If you do not wish to encrypt the backup, you can skip this step and proceed to Step 5. The advantage of an unencrypted backup is that you can use Google Drive OCR search to quickly find your documents in Google Drive.
Start the configuration again with rclone config and create a new remote (n). You could name it, for example, gdrive_encrypted.
Next, select crypt from the list of providers by entering its number (13 in my case; the numbering can differ between rclone versions).
Then you specify the name of the Google Drive remote along with the desired path where the backup should be stored. My remote is called gdrive and I want to store the backup in a folder named paperless_backup_encrypted. So, I enter gdrive:paperless_backup_encrypted:
Option remote.
Remote to encrypt/decrypt.
Normally should contain a ':' and a path, e.g. "myremote:path/to/dir",
"myremote:bucket" or maybe "myremote:" (not recommended).
Enter a value.
remote> gdrive:paperless_backup_encrypted
Then you select the encryption of the file names. I choose the default by pressing ENTER:
Option filename_encryption.
How to encrypt the filenames.
Choose a number from below, or type in your own string value.
Press Enter for the default (standard).
/ Encrypt the filenames.
1 | See the docs for the details.
\ (standard)
2 / Very simple filename obfuscation.
\ (obfuscate)
/ Don't encrypt the file names.
3 | Adds a ".bin", or "suffix" extension only.
\ (off)
filename_encryption> 1
Then you select the encryption of the folder names. I choose the default again:
Option directory_name_encryption.
Option to either encrypt directory names or leave them intact.
NB If filename_encryption is "off" then this option will do nothing.
Choose a number from below, or type in your own boolean value (true or false).
Press Enter for the default (true).
1 / Encrypt directory names.
\ (true)
2 / Don't encrypt directory names, leave them intact.
\ (false)
directory_name_encryption> 1
Then you set two passwords for the encryption. I let the passwords be generated with the maximum length.
Option password.
Password or pass phrase for encryption.
Choose an alternative below.
y) Yes, type in my own password
g) Generate random password
y/g> g
Password strength in bits.
64 is just about memorable
128 is secure
1024 is the maximum
Bits> 1024
Your password is: dCGprYdlIOtBGa5ozP7H2VM6uXRp_KlwH5OfEoufl-IOgJKXXSWNVKR92K0vf72u1oj3MM9CoEuEgnKTEyxh2na8LtGn-X6v3EGzNkr-UXrBg38pxrijclE0_jtOz1q0ldJe0X9Z918Fd0ZxDmAwT3IqjyTEXhs9bJMedYhyk9w
Use this password? Please note that an obscured version of this
password (and not the password itself) will be stored under your
configuration file, so keep this generated password in a safe place.
y) Yes (default)
n) No
y/n> y
Option password2.
Password or pass phrase for salt.
Optional but recommended.
Should be different to the previous password.
Choose an alternative below. Press Enter for the default (n).
y) Yes, type in my own password
g) Generate random password
n) No, leave this optional password blank (default)
y/g/n> g
Password strength in bits.
64 is just about memorable
128 is secure
1024 is the maximum
Bits> 1024
Your password is: XuL0KVb8vQWnIodgem7qWTlD6vrZKt18dgUmMAK61v1coMUt7DCc6EMPww4viD7YcQDcE78miAKBg9L9Qm8mx2kXiyquMUXrvND-BC9qcJMp95cJsPrsocVxQ26b1aU7aXa3glZre69phmqICZVb6ijfo_-61KRiOsBNCt-QKkA
Use this password? Please note that an obscured version of this
password (and not the password itself) will be stored under your
configuration file, so keep this generated password in a safe place.
y) Yes (default)
n) No
y/n> y
Edit advanced config?
y) Yes
n) No (default)
y/n> n
Configuration complete.
Copy the passwords and save them in a secure place. This completes the configuration of the crypt remote. Confirm with ENTER and leave the configuration menu with q (or terminate the process with CTRL+C).
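To check that the encryption layer works before relying on it, you could push a small test file through the crypt remote and read it back. This is only a sketch: the function name `crypt_roundtrip` and the folder name `rclone_selftest` are made up for this example.

```shell
# Upload a small test file through the crypt remote and read it back.
# $1 is the crypt remote without the trailing colon, e.g. "gdrive_encrypted".
crypt_roundtrip() {
    local crypt_remote="$1"
    local tmp_dir
    local status
    tmp_dir="$(mktemp -d)" || return 1
    echo "paperless backup test" > "${tmp_dir}/hello.txt"

    # Upload through the encryption layer, then let `rclone cat`
    # decrypt the file and print it to stdout
    if rclone copy "${tmp_dir}/hello.txt" "${crypt_remote}:rclone_selftest" &&
       rclone cat "${crypt_remote}:rclone_selftest/hello.txt"; then
        status=0
    else
        status=1
    fi

    # Clean up on the remote and locally, whether or not the test passed
    rclone purge "${crypt_remote}:rclone_selftest" || true
    rm -rf "${tmp_dir}"
    return "$status"
}

# Usage:
# crypt_roundtrip gdrive_encrypted
```

If the function prints the test sentence, the passwords and the underlying Google Drive remote are working together.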
I'll now show you how to create a script that will create a backup for you when executed. Go to the folder where you want to save the script. In my case, I'm going to the paperless-ngx folder:
cd paperless-ngx
Then create a new file:
nano backup.sh
Then paste the script. My script is deliberately kept very simple. You can copy it, but you need to adjust the absolute path to your docker-compose.yml file and your export folder:
#!/bin/bash
# stop at the first failing command so a failed export is never synced
set -euo pipefail
# change to the folder that contains your docker-compose.yml
cd /home/tobias
# execute the paperless document exporter
# save the backup in the export folder
docker compose exec -T webserver document_exporter ../export
# rclone command to encrypt and sync with the cloud
/snap/bin/rclone sync /home/tobias/paperless-ngx/export gdrive_encrypted:
Save and close the file afterwards with CTRL+X, Y, and ENTER.
To make the script executable, enter this command:
chmod +x backup.sh
Test now if your script works by executing it via the command line:
./backup.sh
Depending on the amount of data, it may take several minutes. The process should terminate automatically.
100%|██████████| 265/265 [00:00<00:00, 332.70it/s]
If you now open Google Drive in your web browser, you should find a new folder in your storage. In this folder, you will find the encrypted data from Paperless.
Google Drive folder with the encrypted backup
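Because the file names in Google Drive are encrypted, it is hard to judge by eye whether the backup is complete. One possible check, sketched below, is to preview the next sync and to compare the local files with the encrypted copies. The function names are hypothetical; `--dry-run` and `cryptcheck` are regular rclone features.

```shell
# Preview what the next sync would change, without transferring anything.
# $1 = local export folder, $2 = crypt remote with trailing colon
preview_sync() {
    rclone sync --dry-run "$1" "$2"
}

# Compare local files against the encrypted copies in the cloud.
# `rclone cryptcheck` is meant for crypt remotes, where a plain
# `rclone check` cannot compare checksums directly.
verify_backup() {
    rclone cryptcheck "$1" "$2"
}

# Usage (paths and remote name as used in backup.sh):
# preview_sync  /home/tobias/paperless-ngx/export gdrive_encrypted:
# verify_backup /home/tobias/paperless-ngx/export gdrive_encrypted:
```

Keep in mind that rclone sync makes the destination identical to the source, so files that disappear from the export folder are also deleted from the cloud on the next run. The dry run shows such deletions in advance.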
Now we will create a new cron job that automatically executes the script at a defined time:
crontab -e
If the script, for example, is to be executed daily at 3:00 a.m., add the following line at the end of the file:
0 3 * * * /home/tobias/paperless-ngx/backup.sh >> /home/tobias/paperless-ngx/backup.log 2>&1
Here you need to specify the absolute path to your backup script and the path where the log file should be created. Save and close the file. When you execute crontab -l, you should see a list of your cron jobs including the added line.
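If you prefer not to edit the crontab by hand, the same entry can be added from the command line. The following is a sketch under the assumption that the job belongs in the current user's crontab; `build_cron_line` and `install_cron_line` are names invented for this example.

```shell
# Build the crontab line for a daily run at a given hour (0-23).
# $1 = hour, $2 = absolute path to the script, $3 = absolute path to the log file
build_cron_line() {
    printf '0 %s * * * %s >> %s 2>&1\n' "$1" "$2" "$3"
}

# Append a line to the current user's crontab unless it is already there.
install_cron_line() {
    local line="$1"
    local current
    # `crontab -l` fails if no crontab exists yet; treat that as empty
    current="$(crontab -l 2>/dev/null || true)"
    if printf '%s\n' "$current" | grep -qxF -- "$line"; then
        return 0
    fi
    {
        if [ -n "$current" ]; then printf '%s\n' "$current"; fi
        printf '%s\n' "$line"
    } | crontab -
}

# Usage:
# install_cron_line "$(build_cron_line 3 /home/tobias/paperless-ngx/backup.sh /home/tobias/paperless-ngx/backup.log)"
```

Running the install function twice leaves only one entry, so it is safe to call it again after changing nothing.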
First and foremost, make sure to securely store the passwords for the backup encryption. If your server becomes unreachable, the backup will be useless if you cannot decrypt it.
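In addition to the passwords, you may want to keep a copy of the rclone configuration file, since it holds both remotes and the obscured passwords. A minimal sketch, with made-up function names and a target path you choose yourself:

```shell
# Print the path of the active rclone configuration file.
# `rclone config file` prints a header line followed by the path.
rclone_config_path() {
    rclone config file | tail -n 1
}

# Copy the configuration to a destination of your choice and restrict
# access to your own user. $1 is a hypothetical target path.
save_rclone_config() {
    local target="$1"
    cp "$(rclone_config_path)" "$target" && chmod 600 "$target"
}

# Usage:
# save_rclone_config /path/to/safe/place/rclone.conf
```

The passwords in this file are only obscured, not encrypted, so treat the copy as carefully as the passwords themselves and do not store it next to the backup.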
If your server has failed and you don't have a snapshot, you'll need to reinstall rclone and Paperless on your server. Follow the same steps, but when you configure the crypt remote, choose to type in your own passwords (y) and enter the previous ones instead of generating new ones. Keep the same file name and directory name encryption settings as before. This way, rclone can decrypt your backup again.
Afterwards, execute the rclone sync command to import the backup:
rclone sync gdrive_encrypted: /home/tobias/paperless-ngx/export
Once your Paperless export folder is filled again, you can restore your Paperless instance with the Document Importer:
docker compose exec webserver document_importer ../export
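A backup is only proven once a restore has worked, so it can be worth rehearsing the download step occasionally. The sketch below wraps the sync and checks for a manifest.json in the restored folder. That file name is my assumption about what the Document Exporter writes, and `restore_export` is a hypothetical name.

```shell
# Download the backup into a local folder and check that it looks complete.
# $1 = crypt remote with trailing colon, $2 = local export folder
restore_export() {
    local remote="$1"
    local export_dir="$2"
    rclone sync "$remote" "$export_dir" || return 1
    # Assumption: the exporter writes a manifest.json into the export folder;
    # if it is missing, the restore is probably incomplete
    [ -f "${export_dir}/manifest.json" ]
}

# Usage:
# restore_export gdrive_encrypted: /home/tobias/paperless-ngx/export \
#     && docker compose exec webserver document_importer ../export
```

For a rehearsal, point the second argument at an empty scratch folder instead of the real export folder, because rclone sync deletes local files that are not part of the backup.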
The Paperless backup automation presented here can be set up in a similar way with other cloud providers. Typically, an API key or client ID with a client secret is sufficient for authentication.
When you run the Document Exporter, the export folder in your Paperless directory contains everything to restore your instance.
If you want to make backups to local hard drives, you can simply copy the contents of the export folder. You can use tools like rsync paired with a cron job for this purpose. Some NAS operating systems already have integrated local backup functions.
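A local copy with rsync could look like the following sketch. The mount point /mnt/backup/paperless_export is a hypothetical example, and `mirror_export` is a name chosen for illustration.

```shell
# Mirror the export folder to a local backup drive.
# -a keeps permissions and timestamps, --delete removes files from the
# copy that no longer exist in the export folder.
# The trailing slash on the source copies the folder's contents
# rather than the folder itself.
mirror_export() {
    rsync -a --delete "${1%/}/" "$2"
}

# Usage:
# mirror_export /home/tobias/paperless-ngx/export /mnt/backup/paperless_export
```

The call can be appended to backup.sh or scheduled as its own cron job, so the local copy and the cloud copy are refreshed from the same export.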