I am designing a file-server application in which I want to check whether a cached file on a client computer is the latest version kept on the server.
I don't quite trust the 'changed date' attribute in the file system, so I want to compare the actual bytes in the file.
Since sending all the bytes across the web takes some time, I think the fastest way to do this is to send only the file length and the hash of the file to the server. The server first compares the file length; if the lengths match, it computes a hash of its own copy of the file and checks whether it equals the hash the client computed.
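To make the idea concrete, here is a minimal sketch of the check in Python. The function names `fingerprint` and `server_copy_matches` and the chunk size are placeholders I made up for illustration, not part of any existing API, and the network transfer of the length and hash is left out.

```python
import hashlib
import os


def fingerprint(path, chunk_size=64 * 1024):
    """Return (size_in_bytes, md5_hex) for the file at `path`.

    The file is read in chunks so that large files are not loaded
    into memory all at once. `chunk_size` is an arbitrary choice.
    """
    digest = hashlib.md5()
    size = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return size, digest.hexdigest()


def server_copy_matches(server_path, client_size, client_md5):
    """Server-side check: compare the length first, then the hash."""
    # Cheap test first: files of different length cannot be identical.
    if os.path.getsize(server_path) != client_size:
        return False
    # Lengths agree, so hash the server copy and compare the digests.
    _, server_md5 = fingerprint(server_path)
    return server_md5 == client_md5
```

The client would call `fingerprint` on its cached file and send the two values; the server would call `server_copy_matches` with them. The length comparison lets the server skip hashing entirely whenever the sizes differ.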
Can anybody tell me how probable hash collisions are when the file size is the same? (I am currently using MD5 for its speed.)
Can I assume that if both the file size and the hash match, the content is the same?
Thanks!