Similar Pictures, explain concept of "100%"

English support for the software AllDup
therube
Posts: 322
Joined: 07 Nov 2012, 00:28

Post by therube »

Similar Pictures, explain concept of "100%"?
"The comparison methods aHash, ... enables you to find similar or almost identical pictures by using a percent match lower than 100%.

If you want to find exactly the same pictures you have to use
a percent match of 100%
or the comparison methods MD5/SHA."

So if I'm looking for "identical", "exactly the same pictures", I can choose to use either a "similar" hash method (like aHash) - set to 100%, OR, I could use MD5/SHA?

And, as I read it (and I may well be misreading it), they should return the same results?


Yet 100% aHash does not necessarily equal 100% SHA.

In fact, a "similar" hash (aHash) may return 100% ("identical") on files that not only appear different, but in fact are not 100% identical.

Whereas MD5/SHA means that the picture "contents" are identical: the nuts & bolts of the picture itself, exclusive of "tags" or other "extraneous" data (like an errant CR/LF after the data content of the picture) [& maybe ? even the "compression options" used to compress the picture ?].
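
To make that distinction concrete, here is a small Python sketch. It is not AllDup's code, and the byte strings are made-up stand-ins for a picture file. It only shows that a hash taken over the raw file bytes changes as soon as a stray CR/LF is appended, so a "picture contents" hash would have to be computed on something other than the whole file.

```python
import hashlib

# Hypothetical stand-in for the bytes of a picture file (not a real JPEG).
picture_bytes = b"\xff\xd8 pretend pixel data \xff\xd9"

# The same bytes with an errant CR/LF appended after the picture data.
with_trailing_crlf = picture_bytes + b"\r\n"


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of the given bytes as a hex string."""
    return hashlib.sha1(data).hexdigest()


# A byte-level hash treats these as two different files.
same_file_hash = sha1_hex(picture_bytes) == sha1_hex(with_trailing_crlf)
print(same_file_hash)
```

A tool that wants to ignore such trailing bytes (or tags) would presumably hash the decoded picture data instead of the file; whether AllDup does exactly that is my assumption.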

So maybe I'm understanding part of this process, but other parts, I'm not getting, yet?

Two pics, "clearly" not "identical" visually.
SHA1 confirms that, says, not identical.
Yet, aHash (default settings) says, "100%"?
1000_F_388032051_nwWr6lX2HPKBjtOp9sbssj4yFUfjpEnT.jpg
xxx.jpg
Administrator
Site Admin
Posts: 4047
Joined: 04 Oct 2004, 18:38
Location: Thailand

Re: Similar Pictures, explain concept of "100%"

Post by Administrator »

For the comparison methods (...)Hash, the percentage match refers to the created hash value and not to the byte content of the image.
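
As a rough illustration of that point, here is a Python sketch of an average hash. It is a simplified textbook version, not AllDup's implementation. The function names and the two tiny 4x4 grayscale "thumbnails" are invented for the example, and a real aHash would first downscale and grayscale the picture. Two thumbnails with different pixel values can still produce the same hash bits, and therefore a 100% hash match.

```python
def average_hash(pixels):
    """Return one bit per pixel: 1 if the pixel is brighter than the mean, else 0."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def percent_match(hash_a, hash_b):
    """Percentage of bit positions where the two hashes agree."""
    same = sum(1 for a, b in zip(hash_a, hash_b) if a == b)
    return 100.0 * same / len(hash_a)


# Two hypothetical thumbnails: dark on the left, bright on the right,
# but with clearly different pixel values.
thumb_a = [[10, 20, 200, 210] for _ in range(4)]
thumb_b = [[40, 60, 150, 255] for _ in range(4)]

pixels_differ = thumb_a != thumb_b
hash_match = percent_match(average_hash(thumb_a), average_hash(thumb_b))
print(pixels_differ, hash_match)
```

The hash only records which pixels are above or below the mean brightness, so any change that keeps that pattern intact is invisible to it.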
therube wrote: 21 Oct 2023, 03:24
Two pics, "clearly" not "identical" visually.
SHA1 confirms that, says, not identical.
Yet, aHash (default settings) says, "100%"?
This is due to the internal rounding up from 99.x to 100%.
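
A small Python sketch of the rounding effect, under assumptions: the 256-bit hash length is picked only for illustration (the actual hash size is not stated here), and round-to-nearest stands in for whatever rounding is used internally.

```python
import math

HASH_BITS = 256      # assumed hash length, for illustration only
differing_bits = 1   # the two hashes disagree in a single bit

# Share of matching bits, as a percentage.
match = 100.0 * (HASH_BITS - differing_bits) / HASH_BITS

rounded = round(match)       # round to nearest whole percent
floored = math.floor(match)  # always round down instead

print(match, rounded, floored)
```

With rounding to nearest, a near-match is displayed as 100%; rounding down would reserve 100% for hashes that agree in every bit.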
therube
Posts: 322
Joined: 07 Nov 2012, 00:28

Re: Similar Pictures, explain concept of "100%"

Post by therube »

rounding up from 99.x to 100%
For the cases of ?Hash, should it always report 99% rather than 100%?
That might make one more aware that "100%" might not be quite 100%.

(More a food for thought question than looking for an answer.)
Last edited by therube on 27 Oct 2023, 22:38, edited 1 time in total.
therube
Posts: 322
Joined: 07 Nov 2012, 00:28

Re: Similar Pictures, explain concept of "100%"

Post by therube »

SHA1 confirms that, says, not identical.
And it seems that even a SHA1 match of 100% potentially might not be 100%, either.

Even in an Image Search, SHA1 "100%" can have some "leniency" (as to the 100% part).
As in, there can be times where the generated .png files (which is what "exactness" is
based upon ?) end up with different computed hashes, indicating that the source files
were not in fact "100% identical".

(Granted, in this situation, this "leniency" is [all but] immaterial [seemingly ?],
& I bring it up only to point out that in the world of "images", 100% need not be
"100%" in the same way that a data hash HAS to be 100% to be equal [collisions aside].)
Administrator
Site Admin
Posts: 4047
Joined: 04 Oct 2004, 18:38
Location: Thailand

Re: Similar Pictures, explain concept of "100%"

Post by Administrator »

The 100% issue with the xHash will be fixed with the next update:

AllDup similar pictures 100% fixed.png