Where Robocopy Fails

Fixing Windows Backups

Robocopy is a Windows command-line tool for copying files and directories. I use it in a bash script, in combination with rsync, to back up an NTFS-formatted external drive. Robocopy handles the NTFS-related items such as reparse points, and rsync handles those with Unix properties. However, robocopy doesn’t work well when dealing with symbolic links.

The Problem

     my robocopy command  
robocopy source destination * /E /PURGE /ZB /SL /MT:20 /XO /A-:HS /COPY:DAT /DCOPY:DAT /W:0 /R:1 /ETA

/E Copy subdirectories, including Empty ones
/PURGE Delete dest files/dirs that no longer exist in source
/ZB Use restartable mode; if access denied use Backup mode
/SL Copy symbolic links versus the target
/MT[:n] Do multi-threaded copies with n threads (default 8)
/XO eXclude Older files
/A-:[RASHCNET] Remove the given Attributes from copied files
/COPY:copyflag[s] What to COPY for files (default is /COPY:DAT) - (copyflags : D=Data, A=Attributes, T=Timestamps) - (S=Security=NTFS ACLs, O=Owner info, U=aUditing info)
/DCOPY:copyflag[s] What to COPY for directories (default is /DCOPY:DA) - (copyflags : D=Data, A=Attributes, T=Timestamps)
/R:n Number of Retries on failed copies: default 1 million
/W:n Wait time between retries: default is 30 seconds
/ETA Show Estimated Time of Arrival of copied files
Closely Related
/MIR MIRror a directory tree (equivalent to /E plus /PURGE)
/XJ Exclude Junction points. (normally included by default)
/L List only - don't copy, timestamp or delete any files
  • What it is supposed to do
    • Mirror the source to destination
    • Add missing and delete additional files/directories from the destination without modifying the source
    • Copy the symbolic links as such rather than copy the files/directories that they point to.
    • This implies that any changes to the destination shouldn’t affect the source, right? Indeed it shouldn’t, according to these two threads (1, 2) and the robocopy documentation.
  • What it actually does
    • If you remove or change a symbolic link to a directory in the source, robocopy deletes the directory’s contents from both places when mirroring, rather than only the link in the destination
    • The same happens if you change the parent directory containing the link, if any
    • Strangely enough, applying either of these changes in the destination first instead produces the same result

Testing

Here I’ll try to replicate those issues and see why and where exactly robocopy fails.

source  test/
destination  backup/
symbolic links to files
 test_symblink_file1 ~> /test/documents/cv.pdf
 test_symblink_file2 ~> /test/apple.txt
to directories
 test_symblink_dir ~> /test/documents/

     my test directory tree  
.
├── backup/
│   ├── apple2.txt
│   ├── apple.txt
│   ├── test_symblink_file1 ~> /test/documents/cv.pdf
│   ├── documents/
│   │   └── cv.pdf
│   ├── secret/
│   └── symbfold/
│       ├── test_symblink_dir ~> /test/documents/
│       └── test_symblink_file2 ~> /test/apple.txt
└── test/
    ├── apple2.txt
    ├── apple.txt
    ├── test_symblink_file1 ~> /test/documents/cv.pdf
    ├── documents/
    │   └── cv.pdf
    ├── secret/
    └── symbfold/
        ├── test_symblink_dir ~> /test/documents/
        └── test_symblink_file2 ~> /test/apple.txt
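
The source side of this tree can be recreated with a few lines of bash. The following is only a sketch: make_test_tree is a hypothetical helper, the links are created with ln -s, and whether links created from bash on an NTFS drive behave exactly like the ones tested here is an assumption. The backup/ side is meant to come from running the mirror command, so the helper does not build it.

```shell
# Hypothetical helper: builds the test/ tree from the listing above
# under the directory given as the first argument.
make_test_tree() {
    mkdir -p "$1" || return 1
    # resolve to an absolute path so the links get absolute targets,
    # like the ~> /test/... targets shown in the listing
    root="$(cd "$1" && pwd)"

    mkdir -p "$root/test/documents" "$root/test/secret" "$root/test/symbfold"
    : > "$root/test/apple.txt"
    : > "$root/test/apple2.txt"
    : > "$root/test/documents/cv.pdf"

    # two links to files and one link to a directory
    ln -s "$root/test/documents/cv.pdf" "$root/test/test_symblink_file1"
    ln -s "$root/test/apple.txt" "$root/test/symbfold/test_symblink_file2"
    ln -s "$root/test/documents" "$root/test/symbfold/test_symblink_dir"
}
```

Calling make_test_tree with an empty scratch directory gives a fresh tree for each test run.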

Let’s experiment with those symbolic links. What happens when mirroring if we:

test
 1   change the source symbfold/ name to symbfold_will_it_break/
 2   append _not to the symbolic link names in the source
in the destination
 3   perform test 1
 4   perform test 2
use /XJ as part of our mirroring command after performing
 5   test 1
 6   test 2
delete symbolic links
 7   to a file (test_symblink_file2) from the source
 8   to a folder (test_symblink_dir) from the source
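
The two changes behind tests 1 to 6 are plain renames, so they can be scripted. This is a sketch with hypothetical function names, not the code used for the tests. mv renames a link itself and never touches what it points to, which is exactly the behaviour one would hope for from the mirror as well.

```shell
# Test 1: rename the folder that contains the links.
# $1 is the tree to modify (test/ for the source, backup/ for the destination).
rename_symbfold() {
    mv -- "$1/symbfold" "$1/symbfold_will_it_break"
}

# Test 2: append _not to the name of every symbolic link under $1.
# The ! -name test skips links that were already renamed, so a link
# cannot be picked up twice while find is still walking the directory.
append_not_to_links() {
    find "$1" -type l ! -name '*_not' -exec sh -c '
        for link; do
            mv -- "$link" "${link}_not"
        done
    ' sh {} +
}
```

For tests 3 and 4 the same functions would be pointed at backup/ instead of test/.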

Results


     test 1  
.
├── backup/
│   ├── apple2.txt
│   ├── apple.txt
│   ├── test_symblink_file1 ~> /test/documents/cv.pdf
│   ├── documents/
│   ├── secret/
│   └── symbfold_will_it_break/
│       ├── test_symblink_dir ~> /test/documents/
│       └── test_symblink_file2 ~> /test/apple.txt
└── test/
    ├── apple2.txt
    ├── apple.txt
    ├── test_symblink_file1 ~> /test/documents/cv.pdf
    ├── documents/
    ├── secret/
    └── symbfold_will_it_break/
        ├── test_symblink_dir ~> /test/documents/
        └── test_symblink_file2 ~> /test/apple.txt

Robocopy behaves well if the symbolic link points to a file, but it treats a link that points to a folder as the folder itself rather than as a symbolic link. As a result, all files in the link's target folder are deleted, and mirroring leaves an empty folder in both source and destination. In our example, cv.pdf was deleted from test/ when robocopy followed the symbolic link, and the now-empty documents/ folder was then mirrored to backup/.


     test 2  
.
├── backup/
│   ├── apple2.txt
│   ├── apple.txt
│   ├── test_symblink_file1_not ~> /test/documents/cv.pdf
│   ├── documents/
│   │   └── cv.pdf
│   ├── secret/
│   └── symbfold_will_it_break/
│       ├── test_symblink_dir_not ~> /test/documents/
│       └── test_symblink_file2_not ~> /test/apple.txt
└── test/
    ├── apple2.txt
    ├── apple.txt
    ├── test_symblink_file1_not ~> /test/documents/cv.pdf
    ├── documents/
    ├── secret/
    └── symbfold_will_it_break/
        ├── test_symblink_dir_not ~> /test/documents/
        └── test_symblink_file2_not ~> /test/apple.txt

We get very similar results this time. Every symbolic link was renamed by appending _not to its name. Just as before, the symbolic links to directories are the problematic ones. In this case, the files in the source that the link pointed to were deleted, but the backup retained its copy.


     test 3  
.
├── backup/
│   ├── apple2.txt
│   ├── apple.txt
│   ├── test_symblink_file1 ~> /test/documents/cv.pdf
│   ├── documents/
│   ├── secret/
│   └── symbfold/
│       ├── test_symblink_dir ~> /test/documents/
│       └── test_symblink_file2 ~> /test/apple.txt
└── test/
    ├── apple2.txt
    ├── apple.txt
    ├── test_symblink_file1 ~> /test/documents/cv.pdf
    ├── documents/
    ├── secret/
    └── symbfold/
        ├── test_symblink_dir ~> /test/documents/
        └── test_symblink_file2 ~> /test/apple.txt

Identical results to test 1. Robocopy recurses into the directory symbolic link rather than copying it.


     test 4  
.
├── backup/
│   ├── apple2.txt
│   ├── apple.txt
│   ├── test_symblink_file1 ~> /test/documents/cv.pdf
│   ├── documents/
│   │   └── cv.pdf
│   ├── secret/
│   └── symbfold_will_it_break/
│       ├── test_symblink_dir ~> /test/documents/
│       └── test_symblink_file2 ~> /test/apple.txt
└── test/
    ├── apple2.txt
    ├── apple.txt
    ├── test_symblink_file1 ~> /test/documents/cv.pdf
    ├── documents/
    ├── secret/
    └── symbfold_will_it_break/
        ├── test_symblink_dir ~> /test/documents/
        └── test_symblink_file2 ~> /test/apple.txt

Same results as in test 2.


     test 5  
.
├── backup/
│   ├── apple2.txt
│   ├── apple.txt
│   ├── test_symblink_file1 ~> /test/documents/cv.pdf
│   ├── documents/
│   ├── secret/
│   └── symbfold_will_it_break/
└── test/
    ├── apple2.txt
    ├── apple.txt
    ├── test_symblink_file1 ~> /test/documents/cv.pdf
    ├── documents/
    ├── secret/
    └── symbfold_will_it_break/
        ├── test_symblink_dir ~> /test/documents/
        └── test_symblink_file2 ~> /test/apple.txt

The same results as test 1, except that neither the symbolic link to a file nor the one to a folder in the renamed symbfold/ is mirrored.


     test 6  
.
├── backup/
│   ├── apple2.txt
│   ├── apple.txt
│   ├── documents/
│   ├── secret/
│   └── symbfold/
└── test/
    ├── apple2.txt
    ├── apple.txt
    ├── test_symblink_file1_not ~> /test/documents/cv.pdf
    ├── documents/
    ├── secret/
    └── symbfold/
        ├── test_symblink_dir_not ~> /test/documents/
        └── test_symblink_file2_not ~> /test/apple.txt

None of the symbolic links are mirrored (as expected with /XJ), and we lost our cv.pdf again.


     test 7  
.
├── backup/
│   ├── apple2.txt
│   ├── apple.txt
│   ├── test_symblink_file1 ~> /test/documents/cv.pdf
│   ├── documents/
│   │   └── cv.pdf
│   ├── secret/
│   └── symbfold/
│       └── test_symblink_dir ~> /test/documents/
└── test/
    ├── apple2.txt
    ├── apple.txt
    ├── test_symblink_file1 ~> /test/documents/cv.pdf
    ├── documents/
    │   └── cv.pdf
    ├── secret/
    └── symbfold/
        └── test_symblink_dir ~> /test/documents/

Everything works fine.


     test 8  
.
├── backup/
│   ├── apple2.txt
│   ├── apple.txt
│   ├── test_symblink_file1 ~> /test/documents/cv.pdf
│   ├── documents/
│   │   └── cv.pdf
│   ├── secret/
│   └── symbfold/
│       └── test_symblink_file2 ~> /test/apple.txt
└── test/
    ├── apple2.txt
    ├── apple.txt
    ├── test_symblink_file1 ~> /test/documents/cv.pdf
    ├── documents/
    ├── secret/
    └── symbfold/
        └── test_symblink_file2 ~> /test/apple.txt

The contents of the target are deleted again. The destination still retains its copy.

  • Robocopy doesn’t distinguish between a junction and a symbolic link. It treats both as junctions, and both are ignored when using /XJ.
  • When a symbolic link to a directory (or a junction) is removed or changed, robocopy deletes the contents of the target it points to rather than the link itself.
  • Deleting or changing a symbolic link to a file doesn’t affect the target file.

Furthermore, symbolic links are called junctions even in the official documentation:

/XJD eXclude Junction points for Directories
/XJF eXclude Junction points for Files
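
Since only the links that point to directories caused data loss in the tests above, it can be useful to list them before mirroring. The following is a minimal sketch with a hypothetical function name. It relies on test -d following the link, so a broken link is not reported, and it assumes the links show up as symbolic links to find, which this article has not checked for junctions.

```shell
# List the symbolic links under $1 whose target is a directory.
# -type l matches the link itself; test -d follows it to the target.
list_dir_links() {
    find "$1" -type l -exec test -d {} \; -print
}
```

Running list_dir_links on the destination shows which entries robocopy could recurse into.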

Solution

If a link doesn’t exist in the destination, robocopy has nothing to recurse into and does a clean copy. So deleting all links from the destination before performing the backup ensures that nothing in the source is deleted and that I still get a copy of my links, as desired. I can achieve this fairly easily using Windows Subsystem for Linux with a simple bash function.

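function safe_backup() {
    # directories to mirror and the robocopy options (update before use)
    local source="some_dir/"
    local dest="some_other_dir/"
    local robocopyoptions="some_options"

    # remove all links from the destination; the quotes keep paths with
    # spaces intact, and -r (GNU xargs) skips rm when no links are found
    find "$dest" -type l -print0 | xargs -0 -r rm -v --

    # perform the backup; $robocopyoptions is left unquoted on purpose
    # so that it splits into separate options
    cmd.exe /c robocopy "$source" "$dest" "*" $robocopyoptions
}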

Then all you need is to paste this into your terminal and call safe_backup. Don’t forget to update the directory variables and add your robocopy options beforehand. You can also add it to your .bashrc, or to whichever configuration file your preferred shell loads on startup. Scheduling backups with a cron job is another good idea. You can also add the /L option to your robocopy command to test it before applying; this is the equivalent of --dry-run for rsync. Another way of deleting all links is the built-in del / erase command, but I don’t recommend it, since it may share the same issues as robocopy.

del /a:l *
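
Before deleting anything from the destination, it may be worth previewing which links would go. Here is a sketch of the link-removal step split into a preview and a removal function. Both names are hypothetical, and readlink is assumed to be available, as it is on a typical WSL install. Removing a link with rm unlinks only the link; the directory or file it points to is left alone.

```shell
# Print every symbolic link under $1 together with its target,
# without changing anything.
preview_links() {
    find "$1" -type l -exec sh -c '
        for link; do
            printf "%s -> %s\n" "$link" "$(readlink "$link")"
        done
    ' sh {} +
}

# Remove every symbolic link under $1; the targets are not touched.
remove_links() {
    find "$1" -type l -exec rm -- {} +
}
```

A cautious run would call preview_links on the destination first and remove_links only once the list looks right.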

My recommendation: if you don’t have to manage any NTFS / Windows-specific files, use rsync. It is like robocopy on steroids, and it is open source. If you just need to back up your photos, videos or documents, it is the much better alternative, and it works almost everywhere. On Windows it can be run through Windows Subsystem for Linux or any other terminal emulator.
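
For comparison, a mirror with rsync could look roughly like the sketch below. rsync_mirror is a hypothetical wrapper, not part of the backup script described above. The -a option copies symbolic links as links, --delete removes files from the destination that no longer exist in the source, and --dry-run only reports what would change. NTFS-specific attributes are outside the scope of this example.

```shell
# Mirror $1 into $2 with rsync.
# Pass "dry" as a third argument to preview the changes without applying them.
rsync_mirror() {
    if [ "${3:-}" = "dry" ]; then
        # --itemize-changes prints one line per file that would change
        rsync -a --delete --dry-run --itemize-changes "$1/" "$2/"
    else
        rsync -a --delete "$1/" "$2/"
    fi
}
```

The trailing slashes make rsync copy the contents of the source directory rather than the directory itself.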


Thanks for reading this article. You can find the code used for testing here.