Just a little difference in file size, but no dups (file content)

English support for the software AllDup
prennie4532
Posts: 2
Joined: 14 Feb 2022, 14:50

Post by prennie4532 »

This has cost me some headaches... I just don't understand what I am doing wrong (in my head).

I have some video and image files that look identical at first sight (they differ only in size), but for some reason AllDup sees them as different when doing a content search (byte by byte). Even if I set the byte-by-byte match to 1%, they are still seen as different. These files have the same filename and extension, so I could easily search for duplicates that way, but I am just trying to figure out why AllDup is doing what it does :P


My settings in this specific case.
Screenshot 2022-02-14 091633.png

The files are not seen as duplicates when comparing content; they show up (as in this screenshot) only when comparing filename and extension.
Screenshot 2022-02-14 091522.png

The only thing I can think of is the difference in the Media created date. I changed that.
Screenshot 2022-02-14 092612.png
Administrator
Site Admin
Posts: 4046
Joined: 04 Oct 2004, 18:38
Location: Thailand

Post by Administrator »

The content of 2 files will be compared only if the length of both files is the same OR the size of the file content is equal after excluding the metadata!

Here are some examples for the byte-for-byte compare:

100% match:

Content of file 1: abc123de45
Content of file 2: abc123de45

0% match:

Content of file 1: abc123de45
Content of file 2: Xabc123de4

90% match:

Content of file 1: abc123de45
Content of file 2: Xbc123de45

10% match:

Content of file 1: abc123de45
Content of file 2: aabc123de4
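The percentages in these examples are consistent with a simple position-by-position comparison: byte 1 against byte 1, byte 2 against byte 2, and so on. Here is a minimal Python sketch of that idea. The function name match_percent is made up for illustration, and this is only an assumption about how such a percentage could be computed, not AllDup's actual code.

```python
def match_percent(a: bytes, b: bytes) -> float:
    """Percentage of positions at which two equal-length byte strings agree."""
    if len(a) != len(b):
        # A positional comparison only makes sense for equal lengths.
        raise ValueError("both inputs must have the same length")
    if not a:
        return 100.0  # two empty inputs are trivially identical
    same = sum(x == y for x, y in zip(a, b))
    return 100.0 * same / len(a)


reference = b"abc123de45"
for other in (b"abc123de45", b"Xabc123de4", b"Xbc123de45", b"aabc123de4"):
    print(other.decode(), match_percent(reference, other))
# abc123de45 100.0
# Xabc123de4 0.0
# Xbc123de45 90.0
# aabc123de4 10.0
```

Note how a single byte inserted at the front (Xabc123de4) shifts everything after it, so the positional match drops to 0% even though the two strings look almost the same.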
prennie4532
Posts: 2
Joined: 14 Feb 2022, 14:50

Post by prennie4532 »

Condition 1 is met: I just checked both mp4 files and they have the same length: 00:03:16
Screenshot 2022-02-14 112137.png

Is there a difference between the size of the file and the size of the file content?
Screenshot 2022-02-14 112509.png

The size is a little different, as shown in the picture above. But one condition for comparison is met based on the same length, so why is there a difference in file content? It's exactly the same video.


Also I couldn't find the option to exclude metadata for video files. I have only the options as shown below:
Screenshot 2022-02-14 114308.png
If I can exclude the metadata I think the size will be equal.


And if the file content isn't compared because the conditions aren't met, how can AllDup decide whether the file is a duplicate or not? If the files are not on the result list, one might think that they are not duplicates. But you cannot tell, because the files were never compared... Shouldn't AllDup give you some warning? Does that make sense? :lol:

You must have a good brain to create a program like this... :shock:
Administrator
Site Admin
Posts: 4046
Joined: 04 Oct 2004, 18:38
Location: Thailand

Post by Administrator »

prennie4532 wrote: 14 Feb 2022, 18:03 Condition 1 is met: I just checked both mp4 files and they have the same length: 00:03:16
Condition 1 is not met, because it refers only to the file size in bytes, not to properties of the content such as the video or audio length.
prennie4532 wrote: 14 Feb 2022, 18:03 And if the file content isn't compared because the conditions aren't met, how can AllDup decide if the file is a duplicate or not?
The decision is made by the chosen search method and its options.

In your case it's relatively easy:

You are using the search method "file content", and your files differ in size, so they are not duplicates for this method.

In your case (video files) it will be better to use the search method "Find video & audio files on the basis of the audio length".
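The rule described above (different file size = no duplicate, otherwise compare the content) can be sketched in Python as follows. This is only an illustration of the logic, assuming a plain exact comparison with no metadata exclusion and no match percentage. The function name content_duplicates is hypothetical and this is not how AllDup is actually implemented.

```python
import os


def content_duplicates(path_a: str, path_b: str, chunk_size: int = 1 << 20) -> bool:
    """Return True only if both files have the same size and identical bytes."""
    # Step 1: files of different size are never compared byte by byte.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    # Step 2: same size, so compare the content chunk by chunk.
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            chunk_a = fa.read(chunk_size)
            chunk_b = fb.read(chunk_size)
            if chunk_a != chunk_b:
                return False
            if not chunk_a:  # both files ended at the same point
                return True
```

With two videos whose sizes differ by even a few bytes, step 1 already returns False, which matches what happens in this thread: the files never reach the byte-by-byte stage.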
therube
Posts: 322
Joined: 07 Nov 2012, 00:28

Post by therube »

You could try a file comparison program & see if that might point out the differences between the two files.

Altap Salamander (file manager) contains a File Comparator program (Ctrl+Shift+C).

Throw two files at it & let it do its thing.

You can (manually) select binary or text compare.
(On binary files, such as a movie, binary mode is quicker - though, depending on what the changes are, it may not be an "appropriate" method of comparison.)

*Appropriate.

Sometimes, an odd byte difference here or there can throw off a binary comparison so that [virtually] the entire file appears to be different, where a Text comparison will actually point out those differences. In a case like that, it is often a tag-like situation, perhaps even the version number of the encoder used to encode the video, so something negligible. With "odd" different lengths, like you have, it could be some "dummy" bytes at the end that, while different, are "past" the "end of the file", & so are totally irrelevant. If you came across larger swaths of differences, that would have to be looked at further. If they were something like NUL bytes (0x00), that could point to corruption - where one version is corrupt, & the other is not. (If that were the case, it would be odd for the file size to be different - unless the file, at some point, was recovered by some sort of "unerase" program.) Also note that "corrupt" video may appear to be fine, as in, you may not "see" corruption, necessarily, with your player automatically bypassing corrupt spots.

So throw Salamander (or some other comparison program) at the files & see if it gives you something meaningful back.
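If you don't have a comparison program handy, a few lines of Python can give a rough first look at where two files differ. This is just a sketch under simple assumptions: it compares the two files position by position over their common length and reports the first differing offset, the number of differing bytes, and the size difference. The function name diff_summary is made up for this example.

```python
import os


def diff_summary(path_a: str, path_b: str, chunk_size: int = 1 << 20):
    """Return (first differing offset or None, differing byte count, size_a - size_b)."""
    size_a, size_b = os.path.getsize(path_a), os.path.getsize(path_b)
    common = min(size_a, size_b)  # only the overlapping part can be compared
    first_diff = None
    differing = 0
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while offset < common:
            n = min(chunk_size, common - offset)
            chunk_a, chunk_b = fa.read(n), fb.read(n)
            if chunk_a != chunk_b:
                # Only walk byte by byte through chunks that actually differ.
                for i, (x, y) in enumerate(zip(chunk_a, chunk_b)):
                    if x != y:
                        differing += 1
                        if first_diff is None:
                            first_diff = offset + i
            offset += n
    return first_diff, differing, size_a - size_b
```

A handful of differing bytes plus a small size difference would fit the "tag or trailing dummy bytes" idea. If nearly everything differs from some early offset onward, a few bytes were probably inserted or removed there and shifted the rest, which a purely positional comparison cannot see through.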