<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/'><id>tag:blogger.com,1999:blog-3062430.post7059158001433788514..comments</id><updated>2008-12-29T11:51:52.755-08:00</updated><title type='text'>Comments on Vic’s Blog: Find Duplicate Files - dups.py</title><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://blog.vicshih.com/feeds/7059158001433788514/comments/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3062430/7059158001433788514/comments/default'/><link rel='alternate' type='text/html' href='http://blog.vicshih.com/2008/03/find-duplicate-files.html'/><author><name>Vic</name><uri>http://www.blogger.com/profile/03074915000996247481</uri><email>noreply@blogger.com</email></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>2</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-3062430.post-8891313013125108493</id><published>2008-12-29T11:51:52.755-08:00</published><updated>2008-12-29T11:51:52.755-08:00</updated><title type='text'>Brendan, are you saying the results were different...</title><content type='html'>Brendan, are you saying the results were different for different runs on the same data?&lt;BR/&gt;&lt;BR/&gt;Would it be possible for you to send me some of the files that were reported as duplicates?&lt;BR/&gt;&lt;BR/&gt;The script actually returns whether the MD5 hashes of the files match.  I suppose with diverse enough data there could be some false positives, but it's pretty unlikely.  
I can add the final comparison check to eliminate these, if this is what is actually happening.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3062430/7059158001433788514/comments/default/8891313013125108493'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3062430/7059158001433788514/comments/default/8891313013125108493'/><link rel='alternate' type='text/html' href='http://blog.vicshih.com/2008/03/find-duplicate-files.html?showComment=1230580312755#c8891313013125108493' title=''/><author><name>Vic</name><uri>http://www.blogger.com/profile/03074915000996247481</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='18248565958669383814'/></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://blog.vicshih.com/2008/03/find-duplicate-files.html' ref='tag:blogger.com,1999:blog-3062430.post-7059158001433788514' source='http://www.blogger.com/feeds/3062430/posts/default/7059158001433788514' type='text/html'/></entry><entry><id>tag:blogger.com,1999:blog-3062430.post-4226960836384633058</id><published>2008-12-29T06:00:48.985-08:00</published><updated>2008-12-29T06:00:48.985-08:00</updated><title type='text'>Hi,I applaud (and appreciate) your efforts, but th...</title><content type='html'>Hi,&lt;BR/&gt;&lt;BR/&gt;I applaud (and appreciate) your efforts, but this did not function as expected on my WinXP system. I ran it under Python 2.4 on a directory full of tiffs (and some other file types) that were band images from IKONOS. It detected many duplicates among bands, i.e., for a given tile, there are 4 bands, and it would detect three as duplicates of the first. Not all the time, but much of the time.&lt;BR/&gt;&lt;BR/&gt;I checked the results with the DOS comp command, and it says they are, in fact, different. 
In short, it doesn't seem to work.</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/3062430/7059158001433788514/comments/default/4226960836384633058'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/3062430/7059158001433788514/comments/default/4226960836384633058'/><link rel='alternate' type='text/html' href='http://blog.vicshih.com/2008/03/find-duplicate-files.html?showComment=1230559248985#c4226960836384633058' title=''/><author><name>Brendan Hemens</name><uri>http://www.blogger.com/profile/00721568107945654992</uri><email>noreply@blogger.com</email></author><thr:in-reply-to xmlns:thr='http://purl.org/syndication/thread/1.0' href='http://blog.vicshih.com/2008/03/find-duplicate-files.html' ref='tag:blogger.com,1999:blog-3062430.post-7059158001433788514' source='http://www.blogger.com/feeds/3062430/posts/default/7059158001433788514' type='text/html'/></entry></feed>