This has been tested on Mac OS X and cygwin, and should also work with Python for Windows.
Without arguments, dups.py checks the current directory, recursively:
$ dups.py
Duplicates found:
./Data/2004/05_4/015_12A.jpg
./Data/2004/2004.09.29 Grandma/015_12A.jpg
Duplicates found:
./Data/2002/19/uvs021219-008.jpg
./Data/2006/01_2/uvs040430-006.jpg
...
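For readers curious how a scan like this can work, here is a minimal sketch. It is not the actual dups.py source: the names file_digest and find_duplicates, the size-then-SHA-256 strategy, and the pruning of hidden directories are all assumptions made for illustration. It skips hidden files, empty files, and symbolic links, mirroring the tool's documented defaults.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root="."):
    """Return sorted lists of paths under root whose contents are identical.

    Hypothetical helper, not the real dups.py implementation.
    """
    by_size = defaultdict(list)
    for dirpath, dirnames, filenames in os.walk(root):
        # Assumption: hidden directories are skipped too. Pruning in place
        # stops os.walk from descending into them.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if name.startswith(".") or os.path.islink(path):
                continue
            size = os.path.getsize(path)
            if size > 0:
                by_size[size].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a file with a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups


if __name__ == "__main__":
    for group in find_duplicates("."):
        print("Duplicates found:")
        for path in group:
            print(path)
```

Grouping by size first means only files that share a size are ever hashed, which keeps the scan cheap on large photo collections where most files are unique.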
There are lots of nerdy options, like filtering by file size and following symbolic links. Try dups.py -h to see them all:
usage: dups.py [options] [<file_or_directory> ...]
Find duplicate files in the given path(s). Defaults to searching files recursively,
except for hidden files (beginning with "."), empty files, and symbolic links.
Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -v, --verbose         verbose

  Exclusion Options:
    -f, --flat          do not scan directories recursively
    -g n, --greater-than=n
                        only scan files of size greater than n bytes
    -l n, --less-than=n
                        only scan files of size less than n bytes

  Inclusion Options:
    -L, --follow-links  follow symbolic links (warning: beware of infinite
                        loops)
    -H, --hidden-files  include hidden files
    -z, --zero-files    include empty files

  Miscellaneous:
    -D, --delete        delete subsequent duplicates (files are scanned in
                        argument-list order)
    -c, --create-rel-links
                        replace subsequent duplicates with relative links
                        (non-Windows only)
    -C, --create-abs-links
                        same as "-c", but links are absolute
    -s, --special-hidden
                        changes meaning of "hidden files" (-H) depending on
                        platform: cygwin - uses Windows file attributes
                        (warning: slow); win32 - files with names starting
                        with "." considered hidden
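The exclusion and inclusion options above amount to a per-file filter. The sketch below shows one way such a filter could be written. The function should_scan and its keyword arguments are hypothetical names chosen to echo the long option names; they are not taken from the dups.py source, and the platform-specific -s behavior is left out.

```python
import os


def should_scan(path, greater_than=None, less_than=None,
                hidden_files=False, zero_files=False, follow_links=False):
    """Decide whether a file passes the filters (hypothetical helper)."""
    # Symbolic links are skipped unless following them was requested (-L).
    if os.path.islink(path) and not follow_links:
        return False
    # Names beginning with "." count as hidden unless included (-H).
    if os.path.basename(path).startswith(".") and not hidden_files:
        return False
    try:
        size = os.path.getsize(path)
    except OSError:
        return False  # unreadable file or broken link
    # Empty files are skipped unless included (-z).
    if size == 0 and not zero_files:
        return False
    # -g n: only files strictly greater than n bytes.
    if greater_than is not None and size <= greater_than:
        return False
    # -l n: only files strictly less than n bytes.
    if less_than is not None and size >= less_than:
        return False
    return True
```

For example, should_scan(path, greater_than=1024) would correspond roughly to running with -g 1024.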
P.S. I hacked together a way to detect Windows hidden files from cygwin, but it's ugly and slow.
4/6/08 update: I added the ability to delete duplicates (-D) or replace them with relative (-c) or absolute (-C) symbolic links.
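To illustrate the difference between relative and absolute links, here is a rough sketch of replacing a duplicate with a symbolic link to the file that was found first. The function replace_with_link and the temporary ".tmp-link" suffix are hypothetical and not taken from dups.py. It assumes a platform where os.symlink is available, which fits the "non-Windows only" note in the help text.

```python
import os


def replace_with_link(original, duplicate, relative=True):
    """Replace duplicate with a symbolic link to original (hypothetical helper).

    Returns the link target that was written.
    """
    target = os.path.abspath(original)
    if relative:
        # A relative target is resolved from the directory holding the link,
        # not from the current working directory.
        link_dir = os.path.dirname(os.path.abspath(duplicate))
        target = os.path.relpath(target, link_dir)
    # Create the link under a temporary name first, then swap it into place,
    # so the duplicate is not lost if link creation fails.
    tmp = duplicate + ".tmp-link"
    os.symlink(target, tmp)
    os.replace(tmp, duplicate)
    return target
```

A relative link keeps working if the whole directory tree is moved or mounted elsewhere, while an absolute link keeps working if only the link itself is moved.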