For example, if there are two files, a.txt and b.txt, it looks like AllDup is comparing a.txt with b.txt and then b.txt with a.txt, even though those two comparisons are identical. This doubles the number of comparisons.
It’s not so bad with a few hundred files, but when there are thousands, the number of comparisons grows quadratically. For example, right now, I need to compare 150,000 files that are all exactly 8,192 bytes. If every file is compared with every other file twice, then it will mean >22 BILLION comparisons (150,000 × 150,000 = 22.5 billion). If each pair of files is compared only once, then it will take “only” about 11 billion comparisons.
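To double-check the arithmetic, here is a minimal Python sketch. The function names are made up for illustration and have nothing to do with AllDup's internals; the first one counts every ordered pairing (n*n, which also counts a file against itself), the second counts each unordered pair once.

```python
def ordered_comparisons(n: int) -> int:
    """Every file against every file, in both orders (n*n, self-pairs included)."""
    return n * n


def unique_pair_comparisons(n: int) -> int:
    """Each unordered pair of distinct files exactly once: n*(n-1)/2."""
    return n * (n - 1) // 2


n = 150_000
print(f"{ordered_comparisons(n):,}")      # 22,500,000,000
print(f"{unique_pair_comparisons(n):,}")  # 11,249,925,000
```

So comparing each pair only once is slightly less than half the work of the n*n case.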
I recorded a clip of AllDup. The Compare File field should not be repeating the same files over and over again.
Comparisons are commutative, so for:
C:\Dupes\
a.txt
b.txt
c.txt
d.txt
e.txt
Compare:
  a b c d e
a - + + + +
b - - + + +
c - - - + +
d - - - - +
e - - - - - (e.txt is not checked at all; it has already been checked against all other files)
5 files ≠ 25 (n*n) comparisons
5 files = 10 comparisons: (n-1)+(n-2)+...+1 = (n*(n-1))/2
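A rough sketch of the same idea in Python, just to illustrate the triangle above. This is not how AllDup is implemented (I don't know its internals); `unique_pairs` is a hypothetical helper, and `itertools.combinations` is used only because it yields each unordered pair exactly once.

```python
from itertools import combinations

files = ["a.txt", "b.txt", "c.txt", "d.txt", "e.txt"]


def unique_pairs(names):
    """Return each unordered pair once, matching the '+' cells in the table."""
    return list(combinations(names, 2))


pairs = unique_pairs(files)
for left, right in pairs:
    print(left, right)  # a.txt b.txt, a.txt c.txt, ... d.txt e.txt

print(len(pairs))  # 10
```

Note that e.txt never appears on the left side, and (b.txt, a.txt) is never produced because (a.txt, b.txt) already covers it.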