Comparison Methods
Hello, and thanks for your product.
How are the files compared for duplication? That is, is it by size > CRC hash > full byte-to-byte comparison, or by some other means?
I need to know this to determine my level of confidence in the output. Thanks again.
Last edited by jackpots on 04 Feb 2011, 07:57, edited 1 time in total.
-
- Site Admin
- Posts: 4049
- Joined: 04 Oct 2004, 18:38
- Location: Thailand
- Contact:
Re: Comparison Methods
The content of a file will be verified byte by byte.
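For readers who want to picture what a byte-by-byte check involves, here is a minimal sketch in Python. It is not AllDup's actual code; the function name `files_identical` and the chunk size are assumptions made up for illustration. It checks the size first (a cheap filter) and then compares the contents block by block.

```python
import os

CHUNK_SIZE = 64 * 1024  # hypothetical read size, chosen only for illustration


def files_identical(path_a, path_b, chunk_size=CHUNK_SIZE):
    """Return True only if both files have the same size and the same bytes."""
    # Cheap first filter: files of different size cannot be duplicates.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            block_a = fa.read(chunk_size)
            block_b = fb.read(chunk_size)
            if block_a != block_b:
                return False  # first differing block ends the comparison
            if not block_a:
                return True  # both files reached the end with no difference
```

Because every byte is read until the first difference, a True result does not depend on any hash being collision-free.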
Thanks.
You should state so in your product descriptions - it makes a world of difference.
Also, I left a small positive review today at Softpedia, where I discovered AllDup. However, it might be held a bit while being moderated.
A follow-up question...
Does AllDup compare the full content of all files regardless of file size? Or is it, as "Compare Method" suggests, that only a maximum of 9,999,999 bytes of each file are compared?
Please tell me otherwise. But if this is so, and if AllDup is still an active project, will you add that option?
Thanks.
Understood.
Side note:
Some suggest that reading the end of the file first may be even faster at detecting a content difference. That is, read a small block at the end first and check it for a difference, then start the compare from the top. I have not tested this, but it seems plausible.
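The tail-first idea could be sketched roughly as follows in Python. This is only an illustration of the suggestion, not something AllDup is known to do; the name `differs_at_tail` and the 4096-byte block size are assumptions. It presumes the two files were already found to have the same size.

```python
import os


def differs_at_tail(path_a, path_b, tail_size=4096):
    """Return True if the last tail_size bytes of the two files differ.

    Assumes both files have the same size (checked by the caller).
    """
    size = os.path.getsize(path_a)
    offset = max(0, size - tail_size)  # files shorter than tail_size are read whole
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        fa.seek(offset)
        fb.seek(offset)
        return fa.read(tail_size) != fb.read(tail_size)
```

A True result settles the question early. A False result proves nothing, so a full compare from the top would still be needed. Whether this actually saves time is untested here as well.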
Last edited by jackpots on 04 Feb 2011, 07:58, edited 1 time in total.
Hello,
I have a suggestion regarding the comparison methods.
I understand that byte-by-byte comparison is the most reliable method, but it is very inefficient for archives.
That's why I suggest including CRC32 as a search criterion, which can be read for archived files without extracting them. Although CRC32 collisions exist, it is very unlikely for two different files to have both the same size and the same CRC32. For uncompressed files, CRC32 needs to be calculated only for files whose size matches that of another file. This should be the fastest possible comparison when the "scan the contents of archive files" option is enabled. It would also allow comparing encrypted archives in which the metadata is not encrypted.
You could allow selecting both CRC32 and Content, so that if the CRC32 matches, a byte-by-byte comparison is also performed.
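To illustrate the suggestion, here is a rough Python sketch for the ZIP format, where the archive directory stores each member's uncompressed size and CRC32. It is only a sketch, not AllDup's code or a statement about other archive formats; the helper names `archive_member_keys` and `file_key` are made up for this example.

```python
import zipfile
import zlib


def archive_member_keys(zip_path):
    """Map each member name to (size, CRC32) as stored in the ZIP directory.

    Nothing is extracted; the values come from the archive metadata.
    """
    with zipfile.ZipFile(zip_path) as zf:
        return {
            info.filename: (info.file_size, info.CRC)
            for info in zf.infolist()
            if not info.is_dir()
        }


def file_key(path, chunk_size=64 * 1024):
    """Compute (size, CRC32) for an ordinary file by reading it once."""
    crc = 0
    size = 0
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(block, crc)  # running CRC over the blocks
            size += len(block)
    return (size, crc)
```

A loose file whose `file_key` equals an entry from `archive_member_keys` is a duplicate candidate. Since CRC32 collisions are possible, a byte-by-byte check on the candidates, as proposed above, would still be needed for certainty.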