This has cost me some headaches... I just don't understand what I am doing wrong (in my head).
I have some video and image files that look identical at first sight (they differ only in size), but for some reason AllDup sees them as different when doing a content search (byte by byte). Even if I set the byte-by-byte match to 1%, they are still seen as different. These files have the same filename and extension, so I could easily search for duplicates that way, but I am just trying to figure out why AllDup is doing what it does.
My settings in this specific case.
The files are not seen as duplicates when comparing content; in this screenshot they are found only when comparing filename and extension.
The only thing I can think of is the difference in the Media created date. I changed that.
Just a little difference in file size, but no dups (file content)
-
- Posts: 2
- Joined: 14 Feb 2022, 14:50
-
- Site Admin
- Posts: 4049
- Joined: 04 Oct 2004, 18:38
- Location: Thailand
- Contact:
Re: Just a little difference in file size, but no dups (file content)
The content of 2 files will be compared only if the length of both files is the same OR the size of the file content is equal after excluding the metadata!
Here are some examples for the byte-for-byte compare:
100% match:
Content of the 1. File: abc123de45
Content of the 2. File: abc123de45
0% match:
Content of the 1. File: abc123de45
Content of the 2. File: Xabc123de4
90% match:
Content of the 1. File: abc123de45
Content of the 2. File: Xbc123de45
10% match:
Content of the 1. File: abc123de45
Content of the 2. File: aabc123de4
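The four examples above are consistent with a simple position-by-position comparison, which can be sketched as follows. This is only an illustration of what the examples imply, not AllDup's actual implementation; the function name `match_percent` is made up for this sketch.

```python
def match_percent(a: bytes, b: bytes) -> float:
    """Percentage of positions at which two equal-length byte strings agree."""
    if len(a) != len(b):
        # Mirrors the rule stated above: different sizes are never compared.
        raise ValueError("contents must have the same length")
    if not a:
        return 100.0  # two empty contents are trivially identical
    same = sum(x == y for x, y in zip(a, b))
    return 100.0 * same / len(a)


first = b"abc123de45"
for second in (b"abc123de45", b"Xabc123de4", b"Xbc123de45", b"aabc123de4"):
    print(second.decode(), match_percent(first, second))
```

Note how a single byte inserted at the front (second example) shifts everything and drops the match to 0%, even though almost all of the content is still there.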
-
- Posts: 2
- Joined: 14 Feb 2022, 14:50
Re: Just a little difference in file size, but no dups (file content)
Condition 1 is met: I just checked both mp4 files and they have the same length: 00:03:16
Is there a difference between the size of the file and the size of the file content?
The size is a little different, as shown in the picture above. But one condition for comparison is met based on the same length, so why is there a difference in file content? It's exactly the same video.
Also I couldn't find the option to exclude metadata for video files. I have only the options as shown below:
If I can exclude the metadata I think the size will be equal.
And if the file content isn't compared because the conditions aren't met, how can AllDup decide if the file is a duplicate or not? Because if the files are not on the result list, one might think that the files are not duplicates. But you cannot tell, because the files were not compared... Shouldn't AllDup give you some warning? Do I make sense?
You must have a good brain to create a program like this...
-
- Site Admin
- Posts: 4049
- Joined: 04 Oct 2004, 18:38
- Location: Thailand
- Contact:
Re: Just a little difference in file size, but no dups (file content)
prennie4532 wrote: ↑14 Feb 2022, 18:03 Condition 1 is met: I just checked both mp4 files and they have the same length: 00:03:16
Condition 1 is not met, because it is only about the file size and not about properties of the content such as the video or audio length.
prennie4532 wrote: ↑14 Feb 2022, 18:03 And if the file content isn't compared because the conditions aren't met, how can AllDup decide if the file is a duplicate or not?
The decision is made by the chosen search method and its options.
In your case it's relatively easy:
You use the search method "file content" and the file size of your files is not equal = no duplicates.
In your case (video files) it will be better to use the search method "Find video & audio files on the basis of the audio length".
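The "file size not equal = no duplicates" rule can be sketched like this. It is a rough model of the first condition only: the metadata-excluding variant is not modelled, this is not AllDup's code, and `is_content_duplicate` is a hypothetical name.

```python
import filecmp
import os


def is_content_duplicate(path_a: str, path_b: str) -> bool:
    """Size gate first, then a full byte-for-byte comparison."""
    # Files of different size are rejected without reading their content.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    # shallow=False forces filecmp to compare the actual bytes,
    # not just the os.stat() signatures.
    return filecmp.cmp(path_a, path_b, shallow=False)
```

With a gate like this, two copies of the same video that differ by a few bytes of metadata are reported as "no duplicates" before any content is looked at, which matches the behaviour described in this thread.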
Re: Just a little difference in file size, but no dups (file content)
You could try a file comparison program & see if that might point out the differences between the two files.
Altap Salamander (file manager) contains a File Comparator program (Ctrl+Shift+C).
Throw two files at it & let it do its thing.
You can (manually) select binary or text compare.
(On binary files, such as a movie, binary mode is quicker - though, depending on what the changes are, it may not be an "appropriate"* method of comparison.)
*Appropriate.
Sometimes, an odd byte difference here or there can throw off a binary comparison so that [virtually] the entire file appears to be different, where a text comparison will actually point out those differences. In a case like that, it is often a tag-like situation, perhaps even the version number of the encoder used to encode the video, so something negligible. With "odd" different lengths, like you have, it could be some "dummy" bytes at the end that, while different, are "past" the "end of the file", & so are totally irrelevant. If you came across larger swaths of differences, that would have to be looked at further. If they were something like nul bytes ($00), that could point to corruption - where one version is corrupt & the other is not. (If that were the case, it would be odd for the file size to be different - unless the file, at some point, was recovered by some sort of "unerase" program.) Also note that "corrupt" video may appear to be fine, as in, you may not "see" the corruption, with your player automatically bypassing corrupt spots.
So throw Salamander (or some other comparison program) at the files & see if it gives you something meaningful back.
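If no comparison program is at hand, a few lines of script can give a first hint of where two files of different size diverge. The sketch below measures how many bytes the two contents share at the start and at the end; a long shared tail with a short differing stretch near the front would fit the "tag-like" case described above. The function name `common_prefix_suffix` is made up for this example, and for large videos you would want to read the files in chunks rather than load them whole.

```python
def common_prefix_suffix(a: bytes, b: bytes) -> tuple[int, int]:
    """Return (shared leading bytes, shared trailing bytes) of two contents."""
    limit = min(len(a), len(b))
    prefix = 0
    while prefix < limit and a[prefix] == b[prefix]:
        prefix += 1
    suffix = 0
    # Stop before the suffix would overlap the prefix in the shorter content.
    while suffix < limit - prefix and a[-1 - suffix] == b[-1 - suffix]:
        suffix += 1
    return prefix, suffix


# Toy data standing in for two files: same payload, different "tag" length.
old = b"HEADER-old" + b"videodata"
new = b"HEADER-newer" + b"videodata"
prefix, suffix = common_prefix_suffix(old, new)
print(f"{prefix} bytes equal at start, {suffix} bytes equal at end")
```

For real files, read them with `open(path, "rb").read()` and pass the two byte strings in.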