Some questions on duplicates

English support for the software AllDup
Optimist
Posts: 9
Joined: 23 Apr 2015, 15:46

Some questions on duplicates

Post by Optimist »

Hello all, I've been using AllDup to sort out a number of duplicated folders across drives and I'm very pleased with it. However, I have some questions:

1. Sometimes in the results I notice seemingly identical files put in different groups. For example, yesterday I had four apparently identical images which I'd expect to be in a single group of four, yet they were listed in two groups of two. Why might this be?

2. I have noticed while looking through these various drives that images that have been copied across drives are sometimes shown by Windows as having changed size slightly - they might be a few tens of kilobytes different on a file of, say, 5 megabytes. There seems no obvious reason for this. Could this be connected with (1) above (although file sizes are reported the same in AllDup)?

3. When comparing files byte-by-byte, can one be sure that the comparison is exact; in other words, if AllDup says it's the same, is it definitely 100% the same? I ask this as I notice Joerg Rosenthal says of his Anti-Twin that byte-by-byte comparison might identify image files that are in fact slightly different, and offers pixel by pixel comparison to counter this. I don't know if this is because of an inherent issue or whether it's just something to do with his program.

...and I've forgotten the fourth question!

Thanks
Administrator
Site Admin
Posts: 4047
Joined: 04 Oct 2004, 18:38
Location: Thailand

Re: Some questions on duplicates

Post by Administrator »

1:
How did you search for duplicate files?
By content, byte by byte?

2:
Sorry, I don't know.

3:
A byte-by-byte comparison checks the files from the first byte to the last byte.
If AllDup reports them as the same, they are definitely 100% the same.

4:
:?: :-)
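
For readers who want to see what point 3 means in practice, here is a minimal Python sketch of a byte-by-byte comparison. It is not AllDup's actual code; the function name `files_identical` and the chunk size are assumptions made for illustration.

```python
import os


def files_identical(path_a, path_b, chunk_size=64 * 1024):
    """Return True only if both files hold exactly the same bytes."""
    # Files of different length can never be byte-identical.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            chunk_a = fa.read(chunk_size)
            chunk_b = fb.read(chunk_size)
            if chunk_a != chunk_b:
                return False  # first mismatch ends the comparison
            if not chunk_a:
                return True  # both files ended with no mismatch
```

Because every byte is checked, a single differing byte anywhere in the file is enough to make the result False.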
Optimist
Posts: 9
Joined: 23 Apr 2015, 15:46

Re: Some questions on duplicates

Post by Optimist »

Thanks, and thanks for the very prompt reply.

1. I'm afraid I can't remember! It is likely to have been byte-by-byte but I can't be certain. I noticed it at the time but it did not occur to me to ask about it until later. If I see the same phenomenon again I will post full details.

2. An example is an image file, shown in AllDup results as two groups each of two files. The images look identical, the only differences being file size (different by 3 KB) and creation date.

3. Thanks. Always good to be certain.

4. Still haven't remembered!

5. I should also have said that I assume if one uses 'byte-by-byte' in conjunction with other characteristics, such as file name or extension, the file comparison is still exact and the program uses the other characteristic for an initial sort?
Administrator
Site Admin
Posts: 4047
Joined: 04 Oct 2004, 18:38
Location: Thailand

Re: Some questions on duplicates

Post by Administrator »

2. If the files have different file sizes and they are in the same group, I'm sure you did not search for duplicates by content.

5. If you select more than one search criterion, all of them must match for files to be marked as duplicates.
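
The "all criteria must match" rule in point 5 can be sketched in Python as follows. This is only an illustration under stated assumptions, not how AllDup is implemented: the function name `find_duplicates` and its `by_name` and `by_content` flags are hypothetical, and the sketch reads whole files into memory, which is only sensible for small files.

```python
from collections import defaultdict
from pathlib import Path


def find_duplicates(paths, by_name=False, by_content=True):
    """Group files so that every selected criterion matches within a group."""
    groups = defaultdict(list)
    for path in map(Path, paths):
        key = []
        if by_name:
            key.append(path.name.lower())  # file name, ignoring case
        if by_content:
            key.append(path.read_bytes())  # exact content, every byte
        # Two files share a group only if the whole key is equal.
        groups[tuple(key)].append(path)
    # A group with a single file has no duplicate.
    return [group for group in groups.values() if len(group) > 1]
```

Adding a criterion can only split groups further, never merge them, so the content comparison stays exact when it is combined with file name or extension.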
Optimist
Posts: 9
Joined: 23 Apr 2015, 15:46

Re: Some questions on duplicates

Post by Optimist »

On point 1, I've come across an example of this. I have 5 .wmv files with the same title. Looking at the file properties, each one has the same file size, same file size on disk, same detailed properties (length, frame size, frame rate, author etc. etc.) They appear the same when played. The only apparent differences are creation and modified dates, and the media created date in the detailed file properties (some have a date set, some don't).

It's very unlikely that these are different files, and very likely - given that they've come from various backups - that they're different copies of the same file. Yet a byte-by-byte comparison has sorted them into two different groups, one of three files and one of two, so it appears the program is seeing some difference between them which is not obvious. The difference isn't the creation and modified dates, as they're different inside the two groups. The only difference across the groups appears to be whether a media creation date is set in detail properties for the file or not.

Any thoughts?
Administrator
Site Admin
Posts: 4047
Joined: 04 Oct 2004, 18:38
Location: Thailand

Re: Some questions on duplicates

Post by Administrator »

It is the media creation date, which is stored inside the file.

You get the same with Office files.
Office changes a time stamp inside the files if you just open and close them.

So they all look different in a byte-by-byte comparison.
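
To illustrate the effect, here is a small Python sketch with a made-up file layout (a 3-byte marker, a 10-byte date field, then the media data). The layout is purely hypothetical and is not the real .wmv or Office format; it only shows that a date stored inside the file changes the bytes even when the media data is the same.

```python
def first_difference(data_a, data_b):
    """Return the offset of the first differing byte, or None if identical."""
    if data_a == data_b:
        return None
    for offset, (a, b) in enumerate(zip(data_a, data_b)):
        if a != b:
            return offset
    # One input is a prefix of the other: they differ where the shorter ends.
    return min(len(data_a), len(data_b))


# Hypothetical layout: marker + embedded date field + media data.
media_data = b"\x00\x01\x02" * 1000
copy_with_date = b"HDR" + b"2015-04-23" + media_data
copy_without_date = b"HDR" + b"\x00" * 10 + media_data

print(len(copy_with_date) == len(copy_without_date))  # prints True
print(first_difference(copy_with_date, copy_without_date))  # prints 3
```

Both copies have the same size and the same media data, yet they differ from offset 3 onwards, so a byte-by-byte search puts them in different groups.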
Optimist
Posts: 9
Joined: 23 Apr 2015, 15:46

Re: Some questions on duplicates

Post by Optimist »

Oh, OK. I'd naively assumed it wouldn't make a difference, same as date created & modified don't. Reassuring that AllDup picks up on (what to me seem) such small differences, though.

Found another set just now - 10 .MOV files, split by AllDup into two groups of 5, but this time the file sizes don't tally - AllDup shows the first group as 144,921.14 KB each, the second group as 144,920.93 KB each; the file properties show the first group as 148,399,247 bytes, the second group as 148,399,036 bytes. Again the only visible difference is the media created date in the detailed properties, so I assume the same issue again despite the slight file size discrepancies?
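
For anyone wanting to reconcile the two size displays, the KB figure appears to be the byte count divided by 1024. A quick Python check using the first group's figures (the helper name `bytes_to_kb` is just for illustration):

```python
def bytes_to_kb(size_in_bytes):
    """Convert a byte count to kilobytes, taking 1 KB as 1024 bytes."""
    return size_in_bytes / 1024


print(round(bytes_to_kb(148_399_247), 2))  # prints 144921.14
```

On the same basis, the 0.21 KB gap between the two groups is only a couple of hundred bytes, but any difference in size at all means the files cannot be byte-identical.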

Thanks very much for your prompt answers on these questions, it's much appreciated - I hope you don't mind, but I like to try to learn / understand what's going on.
Administrator
Site Admin
Posts: 4047
Joined: 04 Oct 2004, 18:38
Location: Thailand

Re: Some questions on duplicates

Post by Administrator »

These Windows system file properties are stored outside the file content:
  • File attributes
  • Creation time
  • Last access time
  • Last write time
  • File size
  • File name
All other properties are individual (not Windows system properties) and are stored inside the file content.
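
A short Python sketch of that distinction: changing the last write time, which the file system keeps outside the content, leaves the bytes of the file untouched. The function name `demo` and the file name `sample.bin` are made up for this example.

```python
import os
import tempfile


def demo():
    """Change a file's modified time and report what else changed."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.bin")
        with open(path, "wb") as f:
            f.write(b"same bytes")

        with open(path, "rb") as f:
            data_before = f.read()
        mtime_before = os.stat(path).st_mtime

        # Set access and modified times to a fixed date in 2001.
        os.utime(path, (1_000_000_000, 1_000_000_000))

        with open(path, "rb") as f:
            data_after = f.read()
        mtime_after = os.stat(path).st_mtime

    same_content = data_before == data_after
    mtime_changed = mtime_before != mtime_after
    return same_content, mtime_changed


print(demo())  # prints (True, True)
```

This is why copies with different created or modified dates can still be grouped together by content, while a date stored inside the file splits them.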
Optimist
Posts: 9
Joined: 23 Apr 2015, 15:46

Re: Some questions on duplicates

Post by Optimist »

Oh right, OK. Thanks.