Rsync is a synchronization tool for keeping data in sync across directories and machines. The program preserves file properties, can use SSH to encrypt your data in transit, and is especially efficient for transferring large volumes of data if the target computer already has a copy of a previous version. Developed by the Samba team, rsync uses an efficient checksum-search algorithm to compare the source and target versions and transfers only the differences between the two sides, which saves time and bandwidth.
In Sync
The generic syntax for rsync is rsync [options] source target, where target can be a local target on the same machine or a remote target on another machine. The choice of source and target is critical; decide carefully in which direction you will be synchronizing to avoid loss of data. If you’re not sure that you’re using the correct options or the correct source/target, you can run rsync with the -n flag to tell the program to perform a trial run. Additionally, you can increase the amount of information by specifying -v to switch to verbose output.
To mirror a directory dir1 on a local machine, for example, type:
$ rsync dir1/* dir2/
skipping directory foo
skipping directory bar
skipping non-regular file "text.txt"
As the output shows, rsync transfers normal files but leaves out subdirectories and symbolic links (non-regular file). To transfer directories recursively down to the lowest level, you should specify the -r option. Using the -l flag additionally picks up your symlinks. Of course, a combination of the options is also possible:
rsync -lr dir1/* dir2/
Rsync has an alternative approach to handling symlinks. If you replace -l with -L, the program will resolve the link, and your former symlinks will end up as “normal” files at the target.
Be careful with the slash – appending a slash to a directory name influences the way rsync handles an operation (see the “Common Rsync Traps” box).
Common Rsync Traps |
---|
Some rsync options could cause trouble if you don’t use them with caution. Being aware of these common mistakes can help.
To avoid loss of data in this scenario, you can create a hard link before calling rsync. If the transfer fails now, you won’t lose the ISO image; instead, the partial file will be given a new name without destroying the original. |
As You Were
If you will be using rsync to create backups, it makes sense to keep the attributes of the original files. By attributes I mean permissions (read, write, execute, see the “Access Permissions” article) and timestamps – that is, information on the last access time (atime), the last status change (ctime), and the last modification (mtime).
Additionally, administrators can benefit from parameters that preserve owner and group data and support device files. To retain the permissions, just specify the -p option; -t preserves the modification times, and -g keeps the group membership.
Whereas any normal user can specify these parameters, the -o (keep the owner data) and -D (device and special files) flags take full effect only when rsync runs as root. The complete command line with all these options could look like this:
rsync -rlptgoD /home/huhn backup/
Don’t worry – you don’t have to remember all these options. Rsync offers a practical shortcut that combines these parameters: Instead of -rlptgoD, just type -a.
Exclusive
Rsync has another practical option that allows you to exclude certain files from the synchronization process. To use this feature, specify the --exclude= option followed by a search pattern that defines the files to exclude. The pattern can contain wildcards; quote it so that the shell does not expand it before rsync sees it:
rsync -a --exclude='*.wav' ~/music backup/
This example excludes large WAV files that end in .wav from the backup of a music collection. If you need to exclude MP3s as well, just append another exclude statement and a pattern:
rsync -a --exclude='*.wav' --exclude='*.mp3' ...
To save time, you can store your exclusions in a text file, with a separate line for each search pattern. Specify the --exclude-from=file_with_exclusions parameter to have rsync parse the file.
Tidying Up
Rsync offers various parameters for deleting data that is no longer needed or wanted. To get rid of files in your backup that no longer exist in the source, specify --delete. Rsync’s default behavior is to delete files before the transfer is finished. Alternatively, you can specify --delete-after to delete files on the target only after all the syncing is done.
Additionally, you can tell rsync to delete files that you have excluded (see the previous section). For example, imagine you’ve decided that you no longer want the MP3s in the backup and you’ve started to exclude them with --exclude=*.mp3. Now you can specify --delete-excluded, and rsync will remove those files from the target as well.
All --delete options have basically the same goal: to keep an exact copy of the original. If you don’t use the switch, you will have to clean up manually; otherwise, the files that you’ve decided are useless will remain. Use these options with care (see the “Common Rsync Traps” box).
Tuning Rsync
Several options increase rsync’s performance. Often, I use the -z switch to compress data when I sync data over a network connection. Figure 1 shows this using Grsync, the graphical front end to rsync. If the connection is very slow, you can also define a bandwidth limit. To transfer data at only 20KBps, for example, use:
rsync ... --bwlimit=20
Rsync is perfect for transferring large volumes of data. If you specify the --partial parameter and the transfer is interrupted for some reason, you can pick up the transfer from the point at which you left off. Specifying the --progress option gives you a progress indicator to let you keep track of the transfer operation:
rsync -avz --progress --partial remote.server:/home/huhn/music/folk ~/music/
receiving file list ...
42 files to consider
...
12_Moladh_Uibhist.mp3
1143849 4% 339.84kB/s 0:01:10
At the other end of the connection, the partial file is hidden in the target directory at first. Typing ls -a reveals a file called .12_Moladh_Uibhist.mp3.7rUSSq. The dot at the start of the file name keeps the file hidden, and the arbitrary extension removes the danger of overwriting existing files.
When the transfer completes, the file gets its original name back. If the transfer is interrupted, you can restart by specifying the --partial option again. Alternatively, you have a shortcut: If you want to use a combination of --partial and --progress, simply use -P. For the downside of using the --partial flag, again see the “Common Rsync Traps” box.
Rsync keeps your data up to date and helps you stay on top of confusing version changes. Its options help you manage file properties, and it works well with SSH. When you need to transfer large volumes of data, rsync comes to your rescue.
This article originally appeared in the Linux Shell Handbook and is reprinted here with permission.