Comparison Methods

Posted: 12 Oct 2009, 23:37
by jackpots
Hello and thanks for your product.

How are the files compared for duplication? That is, is it by size > CRC hash > full byte-by-byte comparison, or by some other means?

I need to know to determine my level of confidence in the output. Thanks again.

Re: Comparison Methods

Posted: 13 Oct 2009, 03:51
by Administrator
The content of a file will be verified byte by byte.

Posted: 13 Oct 2009, 06:08
by jackpots
Thanks.

You should state so in your product descriptions - it makes a world of difference.

Also, I left a small positive review today at Softpedia, where I discovered AllDup. However, it might be held a bit while being moderated.

Posted: 13 Oct 2009, 17:21
by Administrator
Thanks for the hint! I have added "File Content (byte by byte compare)" to the help file and web site.

Posted: 16 Oct 2009, 19:47
by jackpots
A follow-up question...

Does AllDup compare the full content of all files regardless of the file's size? Or is it, as the "Compare Method" setting suggests, that only a maximum of 9,999,999 bytes of each file are compared?

Please tell me otherwise. But if this is so, and if AllDup is still an active project, will you add that option?

Thanks.

Posted: 16 Oct 2009, 20:51
by Administrator
AllDup compares only files with the same size. It then reads both files and keeps going as long as no difference is found, so identical files are always compared in full. Everything else does not make sense (IMHO).
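
For readers who want to see the "same size first" idea in code, here is a minimal Python sketch. It is not AllDup's implementation; the function name `candidate_groups` is made up for illustration. It only shows why the size check is cheap: files whose sizes differ can never be byte-identical, so they are never opened.

```python
import os
from collections import defaultdict

def candidate_groups(paths):
    """Group paths by file size (hypothetical helper, not part of AllDup).

    Only groups with two or more files can contain duplicates, so every
    other file is discarded without reading any of its content.
    """
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)
    return [group for group in by_size.values() if len(group) > 1]
```

Each returned group would then go on to a full content compare.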

Posted: 17 Oct 2009, 00:27
by jackpots
I thought so; however, the read buffer settings under Compare Methods cast some doubt. Thanks.

Posted: 17 Oct 2009, 15:22
by Administrator
The read buffer helps to detect content changes at the beginning of a file without reading a big data packet into memory. Let's say you start with 10,000 bytes and the first 10 bytes of the two files are different. AllDup only has to read 2 x 10,000 bytes to detect the difference.
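
The read-buffer behaviour described above can be sketched roughly as follows in Python. This is an illustration under assumptions, not AllDup's code: `same_content` and `buffer_size` are hypothetical names, and the 10,000-byte default just mirrors the example in this post.

```python
import os

def same_content(path_a, path_b, buffer_size=10_000):
    """Byte-by-byte compare that stops at the first differing block.

    Hypothetical sketch: the read buffer only limits how much is read
    per step; identical files are still read to the end.
    """
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False  # different sizes can never be identical
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            block_a = fa.read(buffer_size)
            block_b = fb.read(buffer_size)
            if block_a != block_b:
                return False  # early exit: nothing further is read
            if not block_a:
                return True   # both files ended without a difference
```

If the first 10 bytes differ, only one block per file is read before the function returns. Python's standard library offers a similar full compare via `filecmp.cmp(a, b, shallow=False)`.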

Posted: 17 Oct 2009, 21:18
by jackpots
Understood.

Side note:
Some suggest that reading the end of the file first may be even faster for detecting a content difference. That is, read a small block at the end first and check for a difference, then start the compare from the top. I have not tested this, but it seems plausible.

Posted: 17 Oct 2009, 21:53
by Administrator
jackpots wrote: Some suggest that reading the end of the file may be even faster in determining content difference.
Yes, this could speed up the file compare. I will note this on the to-do list!
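
A rough Python sketch of the tail-first idea, for illustration only. The name `tails_differ` and the 4096-byte `tail_size` are assumptions, and whether this is actually faster in practice is untested here, as noted in the thread.

```python
import os

def tails_differ(path_a, path_b, tail_size=4096):
    """Return True if the last tail_size bytes of the two files differ.

    Hypothetical pre-check: assumes the caller has already confirmed
    that both files have the same size.
    """
    size = os.path.getsize(path_a)
    offset = max(0, size - tail_size)  # files shorter than tail_size are read whole
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        fa.seek(offset)
        fb.seek(offset)
        return fa.read(tail_size) != fb.read(tail_size)
```

A `False` result proves nothing on its own; the full compare from the start of the file would still have to run afterwards.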

Posted: 19 Apr 2013, 05:16
by happyuser
Hello,

I have a suggestion regarding the comparison methods.

I understand that byte-by-byte comparison is the most reliable, but it is very inefficient for archives.

That's why I suggest including CRC32 as a search criterion, since it can be determined for archived files without extracting them. Although CRC32 has collisions, it is very unlikely for two different files to have both the same size and the same CRC32. For uncompressed files, the CRC32 needs to be calculated only for files whose size matches that of other files. This should be the fastest possible comparison when the "scan the contents of archive files" option is enabled. It would also allow comparing encrypted archives in which the metadata is not encrypted.

You could allow selecting both CRC32 and Content, so that if the CRC32 matches, a byte-by-byte comparison is also done.

Posted: 19 Apr 2013, 23:34
by Administrator
OK, I understand: use the existing CRC32 value stored inside the ZIP and RAR files for comparing. I will note this on the to-do list.
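
To illustrate the suggestion, here is a minimal Python sketch for ZIP archives only (the standard library has no RAR reader). It is an assumption-laden example, not how AllDup does or will do it; `zip_member_crcs` and `file_crc32` are hypothetical names. The stored CRC32 and uncompressed size are read from the archive's directory, so no member is extracted.

```python
import zipfile
import zlib

def zip_member_crcs(zip_path):
    """Map member name -> (uncompressed size, stored CRC32).

    Reads only the ZIP directory; no member data is decompressed.
    """
    with zipfile.ZipFile(zip_path) as zf:
        return {
            info.filename: (info.file_size, info.CRC)
            for info in zf.infolist()
            if not info.is_dir()
        }

def file_crc32(path, buffer_size=65536):
    """Compute the CRC32 of an ordinary file in fixed-size chunks."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(buffer_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc
```

A loose file would be a duplicate candidate for an archive member when both its size and `file_crc32` value match the stored pair; because CRC32 can collide, a byte-by-byte compare can still be run afterwards to confirm.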