Where Robocopy Fails

Fixing Windows Backups

Have you ever dealt with symbolic links while using robocopy? I recently had the pleasure, and it didn’t go smoothly. I have a bash script that combines robocopy and rsync to back up an NTFS-formatted external drive that I use as a portable workspace. Robocopy handles the NTFS-specific items such as reparse points, and rsync handles the files with Unix properties.

The Problem

My robocopy command:
robocopy source destination * /E /PURGE /ZB /SL /MT:20 /XO /A-:HS /COPY:DAT /DCOPY:DAT /W:0 /R:1 /ETA
  • /E Copy subdirectories, including Empty ones
  • /PURGE Delete dest files/dirs that no longer exist in source
  • /ZB Use restartable mode; if access denied use Backup mode
  • /SL Copy symbolic links versus the target
  • /MT[:n] Do multi-threaded copies with n threads (default 8)
  • /XO eXclude Older files
  • /A-:[RASHCNET] Remove the given Attributes from copied files
  • /COPY:copyflag[s] What to COPY for files (default is /COPY:DAT) - (copyflags : D=Data, A=Attributes, T=Timestamps) - (S=Security=NTFS ACLs, O=Owner info, U=aUditing info)
  • /DCOPY:copyflag[s] What to COPY for directories (default is /DCOPY:DA) - (copyflags : D=Data, A=Attributes, T=Timestamps)
  • /R:n Number of Retries on failed copies: default 1 million
  • /W:n Wait time between retries: default is 30 seconds
  • /ETA Show Estimated Time of Arrival of copied files
Closely related:
  • /MIR MIRror a directory tree (equivalent to /E plus /PURGE)
  • /XJ Exclude Junction points. (normally included by default)
  • /L List only - don't copy, timestamp or delete any files
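
Since /L turns the same command into a listing-only run, one way to keep the dry run and the real run in sync is to build the argument list once. The following is a minimal sketch under that assumption; robocopy_args is a hypothetical helper name, and "source" and "destination" are placeholders exactly as in the command above.

```shell
#!/usr/bin/env bash
# Sketch: build the robocopy argument list in one place so that a dry run
# differs from the real run only by the extra /L flag.
# robocopy_args is a hypothetical helper, not part of the original script.
robocopy_args() {
    local source="$1" dest="$2" dry_run="${3:-no}"
    local args=("$source" "$dest" "*"
                /E /PURGE /ZB /SL /MT:20 /XO /A-:HS
                /COPY:DAT /DCOPY:DAT /W:0 /R:1 /ETA)
    # /L only lists what would be copied or deleted
    if [ "$dry_run" = "yes" ]; then
        args+=(/L)
    fi
    # Print one argument per line
    printf '%s\n' "${args[@]}"
}

# Example: show the arguments of a dry run
robocopy_args "source" "destination" yes
```

From WSL, the printed arguments could then be read back into an array with mapfile and handed to robocopy, but how that is wired up depends on your setup.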

The command is supposed to mirror the source to the destination: add what is missing and delete what is extra in the destination, without modifying the source, and copy symlinks as links rather than copying the folders or files they point to. This implies that changes to the destination shouldn’t affect the source, right? According to these two threads [1][2] and the robocopy documentation, they indeed shouldn’t. My experience proved otherwise. Robocopy mirroring seems to work fine as long as you don’t touch any of the NTFS reparse points; if you do, you can end up with something missing. Consider this test directory tree:

.
├── backup/
│   ├── apple2.txt
│   ├── apple.txt
│   ├── test_symblink_file1 -> ./test/Documents/CV.pdf
│   ├── Documents/
│   │   └── CV.pdf
│   ├── Secret/
│   └── Symbfold/
│       ├── test_symblink_dir -> ./test/Documents/
│       └── test_symblink_file2 -> ./test/apple.txt
└── test/
    ├── apple2.txt
    ├── apple.txt
    ├── test_symblink_file1 -> ./test/Documents/CV.pdf
    ├── Documents/
    │   └── CV.pdf
    ├── Secret/
    └── Symbfold/
        ├── test_symblink_dir -> ./test/Documents/
        └── test_symblink_file2 -> ./test/apple.txt

Our source is test/ and our destination is backup/. The source contains 3 symlinks pointing to 2 files (apple.txt and CV.pdf) and 1 directory (Documents/). To see where things go wrong, let’s experiment with those symlinks. What happens when mirroring if we:

  1. Change the source Symbfold/ name to Symbfold_Will_It_Break/?
  2. Change the names of the symlinks in the source to *_not (* stands for the original name)?
  3. Perform 1 on the destination instead?
  4. Perform 2 on the destination instead?
  5. Use /XJ as part of our mirroring command after performing 1?
  6. Use /XJ as part of our mirroring command after performing 2?
  7. Delete a symlink to a file from the source (test_symblink_file2)?
  8. Delete a symlink to a folder from the source (test_symblink_dir)?
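
For anyone who wants to repeat these experiments, the test tree above can be recreated with a sketch like the following. It is not the original test code: make_test_tree is a hypothetical helper, the file contents are dummies, and the links are created with ln -s using absolute targets so that the links in backup/ still point into test/, as in the listing. Whether such links end up as NTFS symbolic links depends on the WSL setup; on Windows itself, mklink is the native way to create them.

```shell
#!/usr/bin/env bash
# Sketch: recreate the test tree from the listing above in a scratch directory.
# make_test_tree is a hypothetical helper; file contents are dummies.
make_test_tree() {
    mkdir -p "${1:?root directory required}"
    local root
    root="$(cd "$1" && pwd)"   # absolute path, used for the link targets

    mkdir -p "$root/test/Documents" "$root/test/Secret" "$root/test/Symbfold"
    printf 'apple\n'  > "$root/test/apple.txt"
    printf 'apple2\n' > "$root/test/apple2.txt"
    printf 'cv\n'     > "$root/test/Documents/CV.pdf"

    # -n keeps ln from descending into an existing directory link on a re-run
    ln -sfn "$root/test/Documents/CV.pdf" "$root/test/test_symblink_file1"
    ln -sfn "$root/test/Documents"        "$root/test/Symbfold/test_symblink_dir"
    ln -sfn "$root/test/apple.txt"        "$root/test/Symbfold/test_symblink_file2"

    # Start the destination as an exact copy; -a keeps links as links
    rm -rf "$root/backup"
    cp -a "$root/test" "$root/backup"
}

# Example: build the tree in a temporary directory and list its links
scratch="$(mktemp -d)"
make_test_tree "$scratch"
find "$scratch" -type l | sort
```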

Results

Test Case 1
.
├── backup/
│   ├── apple2.txt
│   ├── apple.txt
│   ├── test_symblink_file1 -> ./test/Documents/CV.pdf
│   ├── Documents/
│   ├── Secret/
│   └── Symbfold_Will_It_Break/
│       ├── test_symblink_dir -> ./test/Documents/
│       └── test_symblink_file2 -> ./test/apple.txt
└── test/
    ├── apple2.txt
    ├── apple.txt
    ├── test_symblink_file1 -> ./test/Documents/CV.pdf
    ├── Documents/
    ├── Secret/
    └── Symbfold_Will_It_Break/
        ├── test_symblink_dir -> ./test/Documents/
        └── test_symblink_file2 -> ./test/apple.txt
Robocopy behaves well if the symlink points to a file, but it treats a symlink that points to a folder as the target folder itself rather than as a link. As a result, all files contained in the link’s target folder are deleted, leaving an empty folder in both source and destination after mirroring. In our example, CV.pdf was deleted from test/ when robocopy followed the symlink, and the now empty Documents/ folder was then mirrored to backup/.

Test Case 2
.
├── backup/
│   ├── apple2.txt
│   ├── apple.txt
│   ├── test_symblink_file1_not -> ./test/Documents/CV.pdf
│   ├── Documents/
│   │   └── CV.pdf
│   ├── Secret/
│   └── Symbfold_Will_It_Break/
│       ├── test_symblink_dir_not -> ./test/Documents/
│       └── test_symblink_file2_not -> ./test/apple.txt
└── test/
    ├── apple2.txt
    ├── apple.txt
    ├── test_symblink_file1_not -> ./test/Documents/CV.pdf
    ├── Documents/
    ├── Secret/
    └── Symbfold_Will_It_Break/
        ├── test_symblink_dir_not -> ./test/Documents/
        └── test_symblink_file2_not -> ./test/apple.txt
We get very similar results this time. Every symlink was renamed by appending _not. Just as before, the directory symlinks are the problematic ones. In this case, the files in the source that the directory symlink pointed to were deleted, but the backup retained its copy.

Test Case 3
.
├── backup/
│   ├── apple2.txt
│   ├── apple.txt
│   ├── test_symblink_file1 -> ./test/Documents/CV.pdf
│   ├── Documents/
│   ├── Secret/
│   └── Symbfold/
│       ├── test_symblink_dir -> ./test/Documents/
│       └── test_symblink_file2 -> ./test/apple.txt
└── test/
    ├── apple2.txt
    ├── apple.txt
    ├── test_symblink_file1 -> ./test/Documents/CV.pdf
    ├── Documents/
    ├── Secret/
    └── Symbfold/
        ├── test_symblink_dir -> ./test/Documents/
        └── test_symblink_file2 -> ./test/apple.txt
Identical results to test case 1. Robocopy recurses into the directory symlink rather than copying it.

Test Case 4
.
├── backup/
│   ├── apple2.txt
│   ├── apple.txt
│   ├── test_symblink_file1 -> ./test/Documents/CV.pdf
│   ├── Documents/
│   │   └── CV.pdf
│   ├── Secret/
│   └── Symbfold_Will_It_Break/
│       ├── test_symblink_dir -> ./test/Documents/
│       └── test_symblink_file2 -> ./test/apple.txt
└── test/
    ├── apple2.txt
    ├── apple.txt
    ├── test_symblink_file1 -> ./test/Documents/CV.pdf
    ├── Documents/
    ├── Secret/
    └── Symbfold_Will_It_Break/
        ├── test_symblink_dir -> ./test/Documents/
        └── test_symblink_file2 -> ./test/apple.txt
Same results as in test case 2.

Test Case 5
.
├── backup/
│   ├── apple2.txt
│   ├── apple.txt
│   ├── test_symblink_file1 -> ./test/Documents/CV.pdf
│   ├── Documents/
│   ├── Secret/
│   └── Symbfold_Will_It_Break/
└── test/
    ├── apple2.txt
    ├── apple.txt
    ├── test_symblink_file1 -> ./test/Documents/CV.pdf
    ├── Documents/
    ├── Secret/
    └── Symbfold_Will_It_Break/
        ├── test_symblink_dir -> ./test/Documents/
        └── test_symblink_file2 -> ./test/apple.txt
The same results as test case 1, except that neither the file symlink nor the folder symlink in the renamed Symbfold/ is mirrored.

Test Case 6
.
├── backup/
│   ├── apple2.txt
│   ├── apple.txt
│   ├── Documents/
│   ├── Secret/
│   └── Symbfold/
└── test/
    ├── apple2.txt
    ├── apple.txt
    ├── test_symblink_file1_not -> ./test/Documents/CV.pdf
    ├── Documents/
    ├── Secret/
    └── Symbfold/
        ├── test_symblink_dir_not -> ./test/Documents/
        └── test_symblink_file2_not -> ./test/apple.txt
None of the symlinks were mirrored (as expected with /XJ), and we lost our CV.pdf again.

Test Case 7
.
├── backup/
│   ├── apple2.txt
│   ├── apple.txt
│   ├── test_symblink_file1 -> ./test/Documents/CV.pdf
│   ├── Documents/
│   │   └── CV.pdf
│   ├── Secret/
│   └── Symbfold/
│       └── test_symblink_dir -> ./test/Documents/
└── test/
    ├── apple2.txt
    ├── apple.txt
    ├── test_symblink_file1 -> ./test/Documents/CV.pdf
    ├── Documents/
    │   └── CV.pdf
    ├── Secret/
    └── Symbfold/
        └── test_symblink_dir -> ./test/Documents/
Everything works fine.

Test Case 8
.
├── backup/
│   ├── apple2.txt
│   ├── apple.txt
│   ├── test_symblink_file1 -> ./test/Documents/CV.pdf
│   ├── Documents/
│   │   └── CV.pdf
│   ├── Secret/
│   └── Symbfold/
│       └── test_symblink_file2 -> ./test/apple.txt
└── test/
    ├── apple2.txt
    ├── apple.txt
    ├── test_symblink_file1 -> ./test/Documents/CV.pdf
    ├── Documents/
    ├── Secret/
    └── Symbfold/
        └── test_symblink_file2 -> ./test/apple.txt
The contents of the link’s target in the source were deleted again. The destination still retains its copy.

Conclusion

  • Robocopy doesn’t distinguish between a junction and a symbolic link. It treats both as junctions, and /XJ ignores both.
  • When a directory symlink or junction is removed or changed, robocopy deletes the contents of the target it points to rather than the link itself.
  • Deleting or changing a symlink to a file doesn’t affect the target file.

Furthermore, symlinks are called junctions even in the official documentation:

  • /XJD eXclude Junction points for Directories
  • /XJF eXclude Junction points for Files
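
Because only links to directories turned out to be dangerous in these tests, it can help to see which ones a destination contains before running a mirror. A minimal sketch, assuming GNU find (as shipped with most WSL distributions); list_dir_links is a hypothetical helper name and the demo paths are throwaway.

```shell
#!/usr/bin/env bash
# Sketch: list the links in a tree that resolve to directories,
# i.e. the kind that caused deletions in the tests above.
# list_dir_links is a hypothetical helper; -xtype is a GNU find extension.
list_dir_links() {
    local dest="$1"
    # -type l matches the link itself, -xtype d checks what it resolves to
    find "$dest" -type l -xtype d -print
}

# Example on a throwaway tree: only dirlink is reported
demo="$(mktemp -d)"
mkdir "$demo/real"
touch "$demo/file"
ln -s "$demo/real" "$demo/dirlink"
ln -s "$demo/file" "$demo/filelink"
list_dir_links "$demo"
```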

My Solution

If a link doesn’t exist in the destination, robocopy has nothing to recurse into and will make a clean copy. So if I delete all links from the destination before performing the backup, nothing in the source is deleted and I still get a copy of my links as desired. I can achieve this fairly easily using Windows Subsystem for Linux and a simple bash function.

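function safe_backup() {
    # placeholders: set these before use
    local source="some_dir/"
    local dest="some_other_dir/"
    local robocopyoptions="some_options"

    # remove all links from the destination so robocopy cannot follow them
    # (-r: run nothing if no links are found)
    find "$dest" -type l -print0 | xargs -0 -r rm -v --

    # perform the backup; the options stay unquoted so they split into separate arguments
    cmd.exe /c robocopy "$source" "$dest" "*" $robocopyoptions
}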

Then all you need to do is paste this into your terminal and call safe_backup. Don’t forget to update the directory variables and add your robocopy options beforehand. You can also add it to your .bashrc, or to the startup configuration file of your preferred shell. Scheduling backups with a cron job is another great thing to do. You can also add the /L option to your robocopy command to test it before applying it; this is the equivalent of --dry-run for rsync. Another way of deleting all links is the built-in del / erase command, but I don’t recommend it, as it may share the same issues as robocopy.

del /a:l *
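
Whichever way the links are removed, it is worth confirming that a backup run really leaves the source untouched. One possible check, sketched here with hypothetical helper names (snapshot_tree, source_unchanged), is to record the list of entries in the source before the run and compare it afterwards. It only compares names, not file contents.

```shell
#!/usr/bin/env bash
# Sketch: record the entries of the source tree before a backup and compare
# afterwards. snapshot_tree and source_unchanged are hypothetical helpers.
snapshot_tree() {
    # One line per entry, relative to the tree root, in a stable order.
    # find does not follow links here, so links are listed as entries.
    (cd "$1" && find . -mindepth 1 | LC_ALL=C sort)
}

source_unchanged() {
    local before="$1" source="$2"
    # diff prints the difference and returns non-zero if an entry
    # vanished or appeared since the snapshot was taken
    diff "$before" <(snapshot_tree "$source")
}

# Possible usage around the backup function (paths are placeholders):
#   snapshot_tree "some_dir/" > /tmp/before.txt
#   safe_backup
#   source_unchanged /tmp/before.txt "some_dir/" && echo "source untouched"
```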

My recommendation: if you don’t have to manage any NTFS- or Windows-specific files, use rsync. It is like robocopy on steroids, and open source. If you just need to back up your photos, videos or documents, it is the much better alternative, and it works almost everywhere. On Windows it can be run through Windows Subsystem for Linux.


Thanks for reading this article. I hope this saves you quite a lot of trouble.
You can find the code used for testing here.

 