Sometimes I have two versions of a directory stored on my computer. When cleaning up, it is handy to see which files the two have in common and which files exist in only one of them. To do that, I wrote a small Python script that compares two directories. The script also lists duplicate files within each directory. Files are considered equal when their MD5 checksums match; file names are not compared. The script also recurses into subdirectories.
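To make the checksum idea concrete, here is a minimal sketch of how such an index could be built. It is not the code from compare.py itself; the names `md5_of_file` and `index_directory` and the chunk size are my own assumptions for illustration.

```python
import hashlib
import os
from collections import defaultdict


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_directory(root):
    """Map each checksum to the list of files under root that have that checksum."""
    index = defaultdict(list)
    # os.walk descends into subdirectories, so nested files are included.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            index[md5_of_file(path)].append(path)
    return index
```

Any checksum that maps to more than one path marks a set of duplicate files, regardless of what the files are called. MD5 is fine for spotting accidental copies, but it is not collision-resistant, so it should not be relied on where someone could craft files on purpose.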
The usage is simple:
Usage: compare.py [-d] dir1 dir2
Compares files in two directories, based on their MD5 checksum.
  -d: Debug. Prints the MD5 checksum of every file to stderr.
Sample usage and output:
% python compare.py dir1 dir2
Duplicate files
============================================================
dir1/file 2.pdf dir1/file.pdf

Common files
============================================================
dir1/image.png dir2/image.png
dir1/test.pdf dir2/subdir/foobar.pdf

Files only in dir1
============================================================
dir1/hello.jpg

Files only in dir2
============================================================
dir2/something.jpg
dir2/video.mov