This program does one nice little thing: it searches for duplicate files on your disks and lets you decide what to do with them.
Let me explain two of my scenarios:
- backups: I have multiple hand-made backups of the same external disk, taken at different times. They contain a large amount of redundancy;
- software development: I work with SVN (the same applies to other VCSs as well). I usually have multiple working copies of the same project in different folders. Each working copy points to a different branch, so I can examine commits and run builds in parallel. This leads to a huge redundancy of source code and other files.
Clonespy lets me detect redundancies in both of these scenarios, and the same approach applies to many others.
The interesting extra, which many other similar products lack, is that this software lets you replace each duplicate with a hard link to the "original" file (a technique similar to data deduplication).
The only requirement is that the filesystem supports hard links, but on Windows you are most often using NTFS, which does, so there is no problem. In fact, you will not notice any difference (nor will your build system or any automated tool notice it).
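To make the idea concrete, here is a minimal sketch of hash-based deduplication via hard links. This is not Clonespy's actual implementation, just an illustration of the technique: hash every file, and when two files share a digest, delete the second and hard-link it to the first. The function names are my own.

```python
import hashlib
import os

def file_hash(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

def dedup_with_hardlinks(root):
    """Replace duplicate files under `root` with hard links to the first copy seen."""
    seen = {}  # digest -> path of the "original" copy
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            digest = file_hash(path)
            original = seen.setdefault(digest, path)
            if original != path and not os.path.samefile(original, path):
                os.remove(path)          # drop the duplicate...
                os.link(original, path)  # ...and hard-link the name to the original
```

A real tool would also compare file sizes first (much cheaper than hashing) and handle permission errors, but the core operation is just `os.remove` followed by `os.link`.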
The advantages of a manual deduplication of files performed this way are:
- first of all, you free up a lot of space;
- you can expect less disk fragmentation;
- the same operation on different folders (with partly overlapping files) can be much quicker: the antivirus scans the two "copies" of the same file only once, and the two names on disk share a single cached copy in RAM;
- it is fully compatible: if you deduplicate files on an NTFS-formatted external hard disk and then use it on another device, all data is read without problems.
There is a potential disadvantage... but it does not apply to my case. If you edit a file in place, you edit both "copies" at once, since they are the same data on disk. However, many editors do not edit in place: they write a new file and then replace the old one, which breaks the hard link on save, so the problem does not arise.
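The difference between the two editing styles can be demonstrated in a few lines. This sketch assumes a POSIX-style `os.link`/`os.replace` (both also work on NTFS under Windows): an in-place write changes the shared data, while a "safe save" (write a temp file, then rename over the original) leaves the other name pointing at the old data.

```python
import os
import tempfile

d = tempfile.mkdtemp()
orig = os.path.join(d, "orig.txt")
link = os.path.join(d, "link.txt")
with open(orig, "w") as f:
    f.write("v1")
os.link(orig, link)  # two names, one piece of data

# In-place edit: both names see the change, since they share the data.
with open(orig, "w") as f:
    f.write("v2")
print(open(link).read())  # -> v2

# "Safe save", as many editors do: write a temp file, then replace the old name.
tmp = orig + ".tmp"
with open(tmp, "w") as f:
    f.write("v3")
os.replace(tmp, orig)          # orig now points to new data; the link is broken
print(open(link).read())       # -> v2 (the other "copy" is untouched)
print(os.stat(link).st_nlink)  # -> 1 (no longer hard-linked)
```

So whether deduplication is safe for files you intend to edit depends entirely on how your tools save them.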