
I'm computing an MD5 checksum of video files before uploading them to the server. I'm running into a case where uploading the same file a second time produces a different MD5 checksum.

I use the following code to generate the checksum:

// Requires Foundation and CommonCrypto (`import CommonCrypto`).
static func md5File(url: URL) -> String? {

    let bufferSize = 1024 * 1024

    do {
        // Open file for reading:
        let file = try FileHandle(forReadingFrom: url)
        defer {
            file.closeFile()
        }

        // Create and initialize MD5 context:
        var context = CC_MD5_CTX()
        CC_MD5_Init(&context)

        // Read up to `bufferSize` bytes at a time until EOF, feeding each chunk into the MD5 context:
        while case let data = file.readData(ofLength: bufferSize), data.count > 0 {
            data.withUnsafeBytes { buffer in
                _ = CC_MD5_Update(&context, buffer.baseAddress, CC_LONG(data.count))
            }
        }

        // Compute the 16-byte MD5 digest and format it as a lowercase hex string:
        var digest = [UInt8](repeating: 0, count: Int(CC_MD5_DIGEST_LENGTH))
        _ = CC_MD5_Final(&digest, &context)
        return digest.map { String(format: "%02hhx", $0) }.joined()

    } catch {
        return nil
    }
}

I did notice that iOS compresses the video after it is selected, and I'm then given a URL in the tmp/ directory. The file has the same size each time, but a different filename. My understanding is that MD5 isn't calculated from the filename. Is that correct? What could be causing a different MD5 every time?
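
To check the filename assumption in isolation, here is a minimal sketch (not part of my upload code) that writes the same bytes under two different names and hashes both files. `md5Hex(ofFileAt:)` and the `demo-….mov` names are made up for the test, and it assumes an Apple platform where CommonCrypto is available. If the hash only depends on the content, the first comparison should be true; flipping a single bit of the content should make the second one false.

```swift
import Foundation
import CommonCrypto

// Hypothetical helper for this test only: hex MD5 of a file's contents,
// read in one piece. Large videos should be hashed in chunks instead.
func md5Hex(ofFileAt url: URL) throws -> String {
    let data = try Data(contentsOf: url)
    var digest = [UInt8](repeating: 0, count: Int(CC_MD5_DIGEST_LENGTH))
    data.withUnsafeBytes { buffer in
        // Only the bytes are passed to the hash; the URL never is.
        _ = CC_MD5(buffer.baseAddress, CC_LONG(data.count), &digest)
    }
    return digest.map { String(format: "%02x", $0) }.joined()
}

let tmp = FileManager.default.temporaryDirectory
let first = tmp.appendingPathComponent("demo-\(UUID().uuidString).mov")
let second = tmp.appendingPathComponent("demo-\(UUID().uuidString).mov")

// Same content, two different filenames.
let payload = Data("same bytes, different name".utf8)
try payload.write(to: first)
try payload.write(to: second)
let sameContent = try md5Hex(ofFileAt: first) == md5Hex(ofFileAt: second)

// Same size, but one bit of the content differs.
var altered = payload
altered[0] ^= 0x01
try altered.write(to: second)
let afterChange = try md5Hex(ofFileAt: first) == md5Hex(ofFileAt: second)

print("same content:", sameContent, "after change:", afterChange)
```

So an identical size with a different hash would point at the bytes themselves differing between the two exports, not at the name.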

Mike Walker
  • That code looks [vaguely familiar](http://stackoverflow.com/a/42935601/1187415) ... – Martin R May 08 '17 at 18:59
  • Did you compare the uploaded files on the server if they are identical or not? – Martin R May 08 '17 at 19:09
  • After uploading to the server, the files were the same size, but the checksums differed. Since they are video files, the playback looked identical. – Mike Walker May 08 '17 at 19:13
  • I am not familiar with video compression, but there could be embedded metadata (such as timestamps) which cause a different result. On a Unix server you can use the "cmp" utility to check if they are byte for byte identical or not. – It is very unlikely that two different files produce the same MD5 hash. – Martin R May 08 '17 at 19:18
  • I'm not familiar with it either. I'm wondering if something happens when iOS itself compresses the video file that would cause a different MD5. – Mike Walker May 08 '17 at 19:20
  • But then your question is if compressing the same video twice can produce different files. That question is independent of MD5. – Martin R May 08 '17 at 19:22
  • By compressing twice I mean on separate occasions. I send the MD5 to the server to confirm the user hasn't uploaded the video previously. So on two separate occasions I attempted to upload the video, and since the MD5 differs, the server thinks they are different video files. – Mike Walker May 08 '17 at 19:29
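
The `cmp` check suggested in the comments can also be done on the device, before anything is uploaded. The following is only a sketch under assumptions: `firstDifference(between:and:chunkSize:)` is a hypothetical helper, not an existing API, and it expects both URLs to point at regular files. It returns the offset of the first differing byte, or nil if the files are byte-for-byte identical.

```swift
import Foundation

// Hypothetical helper, loosely modelled on `cmp`: returns the offset of the
// first differing byte, or nil when both files are byte-for-byte identical.
func firstDifference(between a: URL, and b: URL,
                     chunkSize: Int = 1024 * 1024) throws -> UInt64? {
    let fileA = try FileHandle(forReadingFrom: a)
    defer { fileA.closeFile() }
    let fileB = try FileHandle(forReadingFrom: b)
    defer { fileB.closeFile() }

    var offset: UInt64 = 0
    while true {
        let chunkA = fileA.readData(ofLength: chunkSize)
        let chunkB = fileB.readData(ofLength: chunkSize)

        // Both files ended at the same point without a mismatch.
        if chunkA.isEmpty && chunkB.isEmpty { return nil }

        if chunkA != chunkB {
            // Locate the first mismatching byte inside this chunk.
            for (index, pair) in zip(chunkA, chunkB).enumerated() where pair.0 != pair.1 {
                return offset + UInt64(index)
            }
            // No mismatch in the shared part: one file is a prefix of the other.
            return offset + UInt64(min(chunkA.count, chunkB.count))
        }
        offset += UInt64(chunkA.count)
    }
}

// Usage with two small throwaway files standing in for two exports.
let dir = FileManager.default.temporaryDirectory
let urlA = dir.appendingPathComponent("cmp-a-\(UUID().uuidString).bin")
let urlB = dir.appendingPathComponent("cmp-b-\(UUID().uuidString).bin")

let original = Data([0, 1, 2, 3, 4, 5, 6, 7])
var modified = original
modified[5] = 0xFF   // same size, one byte changed

try original.write(to: urlA)
try original.write(to: urlB)
let identicalResult = try firstDifference(between: urlA, and: urlB)

try modified.write(to: urlB)
let differingResult = try firstDifference(between: urlA, and: urlB)

print(identicalResult as Any, differingResult as Any)
```

Running this on two exports of the same source video would show where the first difference sits. A difference near the start of the file might hint at header metadata such as a timestamp, but that is a guess, not something confirmed here.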

0 Answers