5 votes

Ensuring file integrity with multiple files inside a zip archive?

I've been involved with 'remastering' and/or improving media for people, and I usually use MD5 to document integrity since the result is typically a single file (for example, an MP4 file).

Recently, I've been tasked with a project that includes several FLAC files spread across separate folders. My MD5 software supports neither folders nor multiple files at once.

Is there another method that I could use to document file integrity?
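
One way to keep using MD5 across several folders is a manifest: one digest per file, stored in a single text file that ships alongside the media. The following is only a minimal sketch of that idea, not a feature of any particular MD5 tool. The function names, the manifest file name, and the folder layout in the usage note are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of one file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root, '/'-separated) to its MD5."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): md5_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def write_manifest(root, manifest_path):
    """Write one 'digest  relative/path' line per file (two spaces, as md5sum does)."""
    manifest = build_manifest(root)
    with open(manifest_path, "w", encoding="utf-8") as out:
        for rel_path, digest in manifest.items():
            out.write(f"{digest}  {rel_path}\n")
    return manifest


def verify_manifest(root, manifest_path):
    """Return the relative paths that are missing or whose MD5 no longer matches."""
    root = Path(root)
    problems = []
    with open(manifest_path, encoding="utf-8") as lines:
        for line in lines:
            digest, rel_path = line.rstrip("\n").split("  ", 1)
            target = root / rel_path
            if not target.is_file() or md5_of_file(target) != digest:
                problems.append(rel_path)
    return problems
```

For example, `write_manifest("album", "album.md5")` would record every file under a hypothetical `album` folder, and `verify_manifest("album", "album.md5")` on the receiving end returns an empty list when nothing has changed. Keeping the manifest outside the folder it describes avoids the manifest listing itself on a later run.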

9 comments

  1. [4]
    cfabbro

    Parchive (PAR2) files might be an option worth looking into. On top of being able to provide checksums for multiple files, they can also be used to rebuild corrupted/missing data in those files when necessary. They have been a standard inclusion with files being distributed via Usenet for ages for exactly that reason.

    6 votes
    1. [3]
      suspended

      That's certainly an interesting option. However, most of the people that will be using these files won't be as savvy. To explain, most of the 'audience' are familiar with MD5 and how it works. I'm not sure if your suggestion would help in this particular scenario, unless I'm misunderstanding something.

      1 vote
      1. [2]
        cfabbro

        There is no real savviness required. MultiPar (or whichever client you prefer, or is available for your OS) works very similarly to most MD5/checksum software, and should take care of pretty much everything for you and anyone you send your files to. On your end, it creates the PAR2 files from the files/archives you specify, and you then just need to distribute them along with the media. When people download those files, they just need to open the PAR2 files in MultiPar (or whichever client they choose), and it will automatically verify the integrity of all the associated files and repair them if needed.

        1 vote
  2. [4]
    umbrae

    I’m probably misunderstanding, but just to ask: wouldn’t a matching md5sum of the zip archive ensure that the files within it are also intact? Because if they weren’t, then the zip archive would have a different MD5, because the bits would be different.

    I guess there’s the case where the files got corrupted at archive time?

    5 votes
    1. [3]
      suspended

      I'll tag @cfabbro since this question is over my head.

      1 vote
      1. [2]
        stu2b50

        Zips use lossless compression, and extraction is deterministic: the same archive bytes always extract to the same files. So if the hash of the zip is the same, then the files inside must also be the same. Furthermore, zip stores a CRC-32 for every file it contains, so if there were any corruption of the zip, it generally couldn't be extracted without error unless you layer some kind of error correction on top of it.
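
The two checks described above can be sketched with Python's standard library. This is an illustration under assumptions, not a recommended tool: `check_archive` and its parameters are hypothetical names, and the expected MD5 is whatever digest was published with the archive.

```python
import hashlib
import zipfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_archive(zip_path, expected_md5):
    """Return (hash_matches, first_bad_member) for a zip archive.

    hash_matches compares the whole archive against the published MD5.
    first_bad_member is None when every entry passes zip's own CRC-32 check,
    otherwise it is the name of the first entry that fails.
    """
    hash_matches = md5_of_file(zip_path) == expected_md5.lower()
    # ZipFile raises zipfile.BadZipFile if the archive's directory is unreadable.
    with zipfile.ZipFile(zip_path) as archive:
        first_bad_member = archive.testzip()
    return hash_matches, first_bad_member
```

A result of `(True, None)` means the archive matches the published hash and every entry inside it reads back cleanly. This does not cover the case where a file was already damaged before it was added to the archive.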

        4 votes
        1. suspended

          That makes complete sense now. Thanks for explaining!

          1 vote
  3. vord

    Torrent files provide this exact functionality.

    You generate a torrent, and it chunks the data into 8MB (configurable) segments and checksums each block. You can then give the user the torrent file along with the media, and they can load both in a torrent client to validate the data.
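
The chunk-and-checksum idea can be illustrated with a short sketch. It is a simplification and does not produce a real torrent file: it only treats the files as one continuous stream, cuts that stream into fixed-size pieces, and hashes each piece with SHA-1, the hash used for pieces in the original BitTorrent format. The function names and the 8MB default are assumptions taken from the description above.

```python
import hashlib


def piece_hashes(paths, piece_size=8 * 1024 * 1024):
    """Hash the files' bytes, treated as one continuous stream, in fixed-size pieces."""
    hashes = []
    buffer = b""
    for path in paths:
        with open(path, "rb") as handle:
            while True:
                # Read only as much as is needed to complete the current piece.
                chunk = handle.read(piece_size - len(buffer))
                if not chunk:
                    break
                buffer += chunk
                if len(buffer) == piece_size:
                    hashes.append(hashlib.sha1(buffer).hexdigest())
                    buffer = b""
    if buffer:
        # The final piece is usually shorter than piece_size.
        hashes.append(hashlib.sha1(buffer).hexdigest())
    return hashes


def damaged_pieces(paths, expected_hashes, piece_size=8 * 1024 * 1024):
    """Return the indices of pieces that are missing, extra, or do not match."""
    actual = piece_hashes(paths, piece_size)
    count = max(len(actual), len(expected_hashes))
    return [
        index
        for index in range(count)
        if index >= len(actual)
        or index >= len(expected_hashes)
        or actual[index] != expected_hashes[index]
    ]
```

Because each piece is hashed separately, a mismatch points at the damaged region instead of only reporting that something, somewhere, is wrong. The files must be passed in the same order on both ends, since a piece can span the boundary between two files.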

    4 votes