Synology btrfs replication and disk strangeness in DSM6
Well, I've been using Synology for more than six years now, and I have to say that what started as a small shell on top of base Linux has become a really nice, easy-to-use interface. Note that these notes are for the older DSM6. I haven't had time to upgrade to the new DSM7 because, well, my storage is a mess and I need to fix that first 🙂
There are some issues with Synology requiring their own drives for "enterprise" boxes, but typically you can buy a system and then slam your own drives into it. We have a DS1812 and a DS2413+. These names are a little complicated, but here is what the parts of a model name mean:
- DS or RS. DS means a DiskStation, which is not rack mounted and is for consumers or small businesses. RS means a RackStation, which is the enterprise rack-mounted line.
- The next one or two digits are generally how many bays there are, so the DS2xx line has two bays and the DS4xx line has four disk drive bays. The confusing thing is that the DS18xx has 8 bays and the DS24xx has 12. As far as I can tell, on the bigger boxes the number is the maximum drive count once you add expansion units: the 8-bay DS1812+ takes two 5-bay expansion units for 18, the 12-bay DS2413+ takes one 12-bay expansion unit for 24, and the RS3621xs+ is a 12-bay enclosure that expands to 36. It does seem like the leading digits might also indicate the processor class in the system. So the RS3 line, for instance, has real AMD Ryzen quad-core and Xeon D-1541 processors. The DS1812+ has an Intel Atom C27xx, while the DS2413+ has a bigger Atom D2700 processor at 2.1GHz and 4GB of memory. However, this naming has really strayed a lot, so for instance the DS722 is actually a four-drive array, so I guess they just ran out of digits 🙂
- The last two digits are the model generation, which as far as I can tell tracks the release year, so the follow-on to the DS1812 is the DS1813.
- Then there are the suffixes, like the +, which means a slight bump.
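To make the naming pieces above concrete, here is a minimal sketch that splits a model name into those four parts. The helper name `parse_model` and the field names are made up for illustration, and the pattern only covers the simple names discussed here, not every model Synology sells.

```python
import re

# Hypothetical helper: splits a model name into the parts described in the
# list above. The pattern and field names are illustrative assumptions.
MODEL_RE = re.compile(r"^(DS|RS)(\d{1,2})(\d{2})([a-z+]*)$")

def parse_model(name):
    m = MODEL_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognized model name: {name}")
    prefix, leading, last_two, suffix = m.groups()
    return {
        "form": "DiskStation" if prefix == "DS" else "RackStation",
        "leading_digits": int(leading),   # bay or drive count, per the list above
        "last_two_digits": last_two,      # model generation
        "suffix": suffix,                 # e.g. "+" or "xs+"
    }

print(parse_model("DS1812+"))
print(parse_model("RS3621xs+"))
```

Because the last two digits are always taken as the generation, a three-digit name like DS218 comes out as leading digit 2 and generation 18.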
The Confusion that is Storage Pools, Volumes and File Shares
OK, probably the most confusing thing about Synology DSM (Disk Station Manager?), which is their graphical interface on top of Linux, is how many different disk concepts there are and the fact that you use different tools to manage each. You need no fewer than seven different graphical tools to manage your disks (Storage Manager, Control Panel/Shared Folder, File Station, Cloud Sync, Drive Admin, Storage Analyzer and Hyper Backup). Confused yet?
- Storage Pool. The first is the concept of a storage pool. This is a collection of disks, and when you create it you get the option to choose the RAID type (see below, but mainly you are picking RAID1 or RAID10 for today's big disks). The idea behind the storage pool is that you use it to swap disks around. You manage these at Main Menu > Storage Manager > Storage Pool with Create.
- Volumes. You can have many volumes on a Storage Pool, and the idea here is that this is where the actual file system (like btrfs) is set. You manage these at Main Menu > Storage Manager > Volume, and you can see things like how much available capacity there is.
- Shared Folders. OK, this is confusing. You manage the Shared Folders, which are what is exposed to the network, in Control Panel > Shared Folder, where you can create and delete them, and you browse them with Main Menu > File Station. The idea here is that you can have many shared folders on a single volume, so in the end you get a hierarchy which is Storage Pools to Volumes to Shared Folders. I get confused all the time, because in File Station I want to (like on macOS) remove file shares and so forth, and I can't remember that managing shares is in a completely different place from browsing files. One of the very cool things you can do in Shared Folder, if you choose Edit, is change the Location field to move a Shared Folder between Volumes. So you can manage space pretty effectively just by moving Shared Folders around.
Setting up the Storage Pools with RAID10 and the Volumes with Btrfs
Well, the first thing to note is that with small systems it is fine to use the default SHR1, which basically means you get one drive's worth of redundancy and can throw just about any mix of drives in there to make it work. So for a 4-drive system, you get one redundant drive. But drives are typically rated at one unrecoverable read error per 1E14 bits read, which is about 12.5TB, so big drives of 10TB or more are very close to that limit. When you have a bad drive and try to rebuild, you are likely to get errors and it takes foooreeevvveeer. RAID10 is also only guaranteed to survive a single drive failure, but a rebuild just copies from the surviving mirror drive, so it is much faster and less likely to hit errors. Note that once you set this, you can't just convert things, so make the right choice first.
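The rebuild-risk argument can be put into rough numbers. This is a back-of-the-envelope sketch only: it assumes the 1E14 figure means one unrecoverable read error per 1E14 bits read, treats every bit as an independent coin flip, and assumes a four-drive array of 10TB drives with one failed. Real drives usually do better than the spec-sheet worst case, so read the output as a comparison between layouts rather than a prediction.

```python
import math

def p_read_error(terabytes, bit_error_rate=1e-14):
    """Chance of at least one unrecoverable read error while reading `terabytes`."""
    bits = terabytes * 1e12 * 8
    # 1 - (1 - rate)^bits, computed in a numerically stable way
    return -math.expm1(bits * math.log1p(-bit_error_rate))

# Assumed scenario: four 10TB drives, one has failed.
shr1 = p_read_error(3 * 10)   # parity rebuild reads all three surviving drives
raid10 = p_read_error(10)     # mirror rebuild reads only the surviving partner
print(f"SHR1 rebuild: {shr1:.0%}, RAID10 rebuild: {raid10:.0%}")
```

The point of the comparison is that a mirror rebuild reads a third as much data as the parity rebuild in this scenario, so under the same assumptions it has a noticeably lower chance of tripping over a bad read.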
The next thing is that the default is ext4, but you want btrfs instead because it has some amazing features like snapshots. Basically this is a copy-on-write file system, so you can snapshot the disk at a moment in time. It allows very nice undo capabilities. Note that this isn't a backup: when you change a file, the file system writes the new data elsewhere, and the "old" snapshots keep pointing to the old data so you can easily restore it, but the snapshots live on the same disks as the data itself. On Synology, you can't convert ext4 (the old Linux format) to btrfs; you have to recreate the volumes. You get this option when you create a volume and you can't change it afterwards.
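Here is a toy sketch of the copy-on-write idea, just to show why a snapshot is nearly free and why restoring is easy. The `CowVolume` class is invented for illustration and works on whole files in memory; real btrfs does this at the block level on disk.

```python
class CowVolume:
    """Toy copy-on-write volume: file names point at immutable content."""

    def __init__(self):
        self.live = {}        # current view: name -> content
        self.snapshots = {}   # label -> frozen name -> content mapping

    def write(self, name, data):
        # A write points the live name at new data; the old data is untouched.
        self.live[name] = data

    def snapshot(self, label):
        # Copies only the name -> content references, not the content itself.
        self.snapshots[label] = dict(self.live)

    def restore(self, label, name):
        # Point the live name back at whatever the snapshot referenced.
        self.live[name] = self.snapshots[label][name]

vol = CowVolume()
vol.write("notes.txt", "first draft")
vol.snapshot("09:00")
vol.write("notes.txt", "oops, overwritten")
vol.restore("09:00", "notes.txt")
print(vol.live["notes.txt"])  # first draft
```

Note that everything still lives inside the one `vol` object, which is the toy version of why snapshots alone are not a backup.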
Then there is a nice feature of DSM called data scrubbing. This basically goes through the redundant drives and verifies checksums to figure out where there has been bit rot, that is, where data on disk has actually changed. You start this manually in Main Menu > Storage Manager > Storage Pool > Data Scrubbing.
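The scrubbing idea can be sketched in a few lines: keep a checksum next to each copy of the data, re-read everything, and repair any copy that no longer matches from one that does. This is a toy model with made-up helper names, and it uses `zlib.crc32` as a stand-in checksum; it is not how DSM or btrfs lay things out on disk.

```python
import zlib

def make_copy(block):
    # Each copy stores the data plus a checksum taken at write time.
    return {"data": bytearray(block), "crc": zlib.crc32(block)}

def is_intact(copy):
    return zlib.crc32(bytes(copy["data"])) == copy["crc"]

def scrub(copies):
    """Find copies that fail their checksum and repair them from a good one.

    Returns how many bad copies were found.
    """
    good = [c for c in copies if is_intact(c)]
    bad = [c for c in copies if not is_intact(c)]
    if good:
        for c in bad:
            c["data"] = bytearray(good[0]["data"])
    return len(bad)

mirror = [make_copy(b"family photos"), make_copy(b"family photos")]
mirror[1]["data"][0] ^= 0x01      # simulate one flipped bit on the second drive
found = scrub(mirror)
print(found, bytes(mirror[1]["data"]))
```

Without the redundant copy the scrub could still detect the flipped bit, but it would have nothing to repair it from.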
Snapshot and Snapshot Replication
The other advantage of btrfs is snapshot replication. This means that you can have another volume on the same NAS or on another NAS, and it only copies the differences between snapshots using the same snapshot trick. So the first copy takes a long time, but after that you get automatic sync.
However, for big drives (larger than say 4TB), because of the relatively high bit error rates, you really want to have a different strategy and use RAID10. This means that each drive gets a redundant copy and it is very efficient to copy them over.
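A toy sketch of "only copy the differences": compare the previous snapshot with the current one and send just what changed. The `delta` and `apply_delta` helpers are invented for illustration and compare whole files by name, whereas the real thing works on btrfs blocks.

```python
def delta(previous, current):
    """Compare two snapshots (name -> content) and return what must be sent."""
    changed = {name: data for name, data in current.items()
               if name not in previous or previous[name] != data}
    deleted = [name for name in previous if name not in current]
    return changed, deleted

def apply_delta(replica, changed, deleted):
    replica.update(changed)
    for name in deleted:
        replica.pop(name, None)

monday = {"a.jpg": "AAAA", "b.jpg": "BBBB", "c.txt": "old"}
tuesday = {"a.jpg": "AAAA", "c.txt": "new", "d.mov": "DDDD"}

replica = {}
apply_delta(replica, *delta({}, monday))      # first run copies everything
changed, deleted = delta(monday, tuesday)     # later runs send only the differences
apply_delta(replica, changed, deleted)
print(sorted(changed), deleted, replica == tuesday)
```

The unchanged `a.jpg` never crosses the wire on the second run, which is why the first replication is slow and the later ones are quick.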
Setting Snapshot Schedule and Retention
The way you do this is with yet another tool, which is Main Menu > Snapshot Replication, and yes, it is confusing that you use the Snapshot Replication tool to actually take snapshots. In Main Menu > Snapshot Replication > Snapshots > Shared Folders, you will see a list of all the File Shares set in Main Menu > Control Panel > Shared Folder. Click on the shared folder that you like and then Settings, and you can enable a Schedule which tells it when to take the snapshots. I normally say every hour, as snapshotting is basically free, but you might set it more frequently if your data is changing a lot.
Then you can set retention, which is how many snapshots you want to keep. You might want to keep quite a few, and I normally set advanced retention pretty aggressively so you have:
- 24 hourly snapshots, so you have a copy for every hour of the day. Remember that if there are no changes, these don't cost anything to keep.
- 7 daily snapshots. This means that once a day has passed you lose its hourly snapshots, but there is still a snapshot for every day of the last week.
- 4 weekly snapshots, so after a week you keep one snapshot per week for a month.
- 12 monthly snapshots, so you will have a snapshot for every month of the year.
- 10 yearly snapshots, so for 10 years you get a snapshot from the last day of each year.
On a fast-moving file system, these snapshots can hold on to a lot of data, so feel free to turn these down. There is a limit of 1,024 snapshots, but this schedule keeps at most 24 + 7 + 4 + 12 + 10 = 57, which is way below that. For things that don't change much you can obviously turn this way down, but on my 10TB of personal files, all these generated only about 1TB worth of recovery data, so kind of worth it 🙂
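To see how this kind of retention thins snapshots out over time, here is a small simulation. It is a sketch of the general "keep the latest snapshot in each of the newest N hours/days/weeks/months/years" idea using the numbers from the list above, not Synology's actual algorithm, and the dates are made up.

```python
from datetime import datetime, timedelta

# Rules from the list above: (bucket key function, how many buckets to keep)
RULES = [
    (lambda t: (t.year, t.month, t.day, t.hour), 24),  # hourly
    (lambda t: (t.year, t.month, t.day), 7),           # daily
    (lambda t: tuple(t.isocalendar())[:2], 4),         # weekly (ISO year, week)
    (lambda t: (t.year, t.month), 12),                 # monthly
    (lambda t: (t.year,), 10),                         # yearly
]

def keep(snapshots):
    """Return the snapshot times that survive every retention rule combined."""
    kept = set()
    for key, count in RULES:
        latest = {}
        for t in sorted(snapshots):
            latest[key(t)] = t              # later snapshots overwrite earlier ones
        newest_buckets = sorted(latest)[-count:]
        kept.update(latest[b] for b in newest_buckets)
    return sorted(kept)

# Simulate hourly snapshots for two years (hypothetical dates).
start = datetime(2021, 1, 1)
snaps = [start + timedelta(hours=h) for h in range(2 * 365 * 24)]
print(len(snaps), "taken,", len(keep(snaps)), "kept")
```

Because a single snapshot can satisfy several rules at once (the newest one is the latest of its hour, day, week, month and year), the number kept is normally lower than the sum of the five counts.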
OK, now you can replicate things. For critical data like my personal files, photos and videos, I actually have a local backup, that is, there are two complete copies of my personal folders on the same NAS. You do this by creating two volumes and then:
- Go to Main Menu > Snapshot Replication > Replication > Shared Folder and choose Create. Then you can say Local replication, and you will be asked for the Shared Folder and then the target Volume. This will create a new share, so if you locally replicate "Personal" you will get the replicated "Personal-1", which is kind of great. It took two days to do this for 10TB on a DS2413+ and the Synology hung once (I think because I was simultaneously building an array and doing a replication), but I rebooted and it worked. At this point Personal-1
- In the wizard you will be asked to set a schedule for replication. I just set this to daily late at night since it does take cycles to do this, although typically most files aren't going to change and this only does the deltas.
- I also like to go to Main Menu > Snapshot Replication > Shared Folder > Edit > Advanced and choose Replicated schedule local snapshots, which transfers all the backup snapshots as well, so you really have a full copy.
Backups by Snapshot Replication and Hyper Backup
Now that you have your base NAS set up, it's time to think about backups. First there is the local snapshot replication, but if it is critical data, you might want to do one or more of these things:
- Snapshot Remote Replication. You can replicate with btrfs to another Synology NAS that also has btrfs and Snapshot Replication loaded. This is a great solution for quick backups that takes advantage of btrfs and is a win. Note that the target does need to have btrfs and it has to be a Synology server. Then in Main Menu > Snapshot Replication > Shared Folder > Create, you just need to provide the IP address. The nice thing about this replication is that you get a usable file system on the other end, and it is very efficient.
- Cloud Sync to Google Drive or Shared Drive. This is, as it says, a sync, so you can have the same files in your Google Drive as on your system. You just install Cloud Sync from Main Menu > Package Manager and then choose Google Shared Drive. Note that Shared Drives are great if the files are not your personal files but belong to the entire family. It lets everyone see them and you can give each person full access to everything, whereas with a regular Google Drive the files are basically owned by one user. For most folks regular Google Drive is probably the right choice. You authenticate, and then you select the name of the connection and the remote path (that is, where you want to put the files). You can also encrypt it so that it is more like a pure file backup (but see Hyper Backup below, which is a different scheme for doing the same thing, but it is not just sync but real backup with copies etc.). You then select the local and remote path and you are in business.
- Hyper Backup to Google Workspace. This you add from the Synology Store, and you can set up a backup to AWS and a variety of other places. If you have a Google Workspace account, then this is pretty amazing since they provide unlimited storage and you get their email. So you can "for free" copy it up to the cloud. This is probably good enough for home users, but you can use AWS. Note that this is true block-level backup. You won't see the files, they are completely encrypted, so it is very safe and you can stuff it onto Google or anywhere, and then there is a restore process.
Syncing to a different file server
One great piece of advice that Vlad gave me was that your backups should be on machines with different architectures so you don't have one bug kill the whole system.
- USB Attached Backup to Drobo. OK, I have two old Drobos (they went bankrupt I guess), but the drives still work. You can actually attach them to a Synology and they appear as just an array of disks. The disadvantage of this is that you cannot use the Drobo Dashboard to look at the state of your drives, so be aware of that.
- Synology to Mac with Drobo with Synology Drive. If you put a Mac in between, then you can have the equivalent of Google Drive syncing on a Mac. So you hook up your Drobos to your Mac, and then with Package Manager you download and install the Synology Drive Server. Then in Main Menu > Synology Drive Admin > Team Folder, you enable the Shared Folder as a Team Folder and turn versioning on. Now on your Mac, you can brew install synology-drive and create a sync task which syncs from the NAS to your machine.
What is actually on your NAS with Storage Analyzer
Then there is the question of what is actually on your drive. For this, you need the Storage Analyzer. This is a tool that helps, but it is not easy to use. The confusing part is that to see the actual data, you need to go to the bottom to Create a Report and then make sure you set Settings so that it runs regularly. Then there is an arrow to the right where you can actually see the report. Make sure you run it at least weekly and that it automatically includes all existing and future shared folders.
You can also generate reports in Report Profile > Report > Generate