
Regrettably, the question "What is the correct way of calculating a large CRC32" is not sufficient for me to understand how to calculate a CRC over a file of size 1 KB <= x <= 128 KB. The mhash library hides the details, so it is suitable and convenient for me; nevertheless, I'd like to ask you to explain how one combines many CRCs into one.

Perhaps this is the wrong question (which would then be the measure of my ignorance), but specifically, how is it legitimate to prepend the CRC calculated in the previous iteration to the next block to be processed? Doesn't that severely slow the overall calculation, and doesn't it potentially introduce new anomalies into otherwise unsullied data? TIA

Shellsunde

1 Answer


There is no prepending. The usual approach is for the CRC routine to take the running CRC from the end of the previous block as the starting CRC for the next block, i.e. crc = crc32(crc, buf, len);. On the first call the initial CRC is (usually) zero, so crc = crc32(0, firstbuf, firstlen);.
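To illustrate the running-CRC idea, here is a minimal sketch in Python, whose standard zlib module exposes crc32 with an optional starting value. The helper name crc32_of_chunks and the 4096-byte block size are assumptions chosen for the example, not anything from the question.

```python
import zlib

def crc32_of_chunks(chunks):
    """Return the CRC-32 of the concatenation of all chunks."""
    crc = 0  # initial running CRC for the first block
    for chunk in chunks:
        # Pass the running CRC in as the starting value; nothing is
        # prepended to the data itself.
        crc = zlib.crc32(chunk, crc)
    return crc

# 128 KiB of sample data, processed in 4096-byte blocks (arbitrary size).
data = bytes(range(256)) * 512
blocks = [data[i:i + 4096] for i in range(0, len(data), 4096)]

# The block-by-block result matches a single pass over the whole buffer.
assert crc32_of_chunks(blocks) == zlib.crc32(data)

print(hex(crc32_of_chunks([b"1234", b"56789"])))  # 0xcbf43926
```

Each block is read exactly once, so the chained calculation costs the same as one pass over the whole file, and the data is never modified.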

If you want to calculate the CRC across multiple cores, a more involved procedure is needed: the per-block CRCs are all calculated in parallel with zero as the starting point, yet the combined result must be the same as if they had been computed in series with the appropriate starting values. zlib provides the crc32_combine() routine for this purpose. See the zlib manual for more information.
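As a rough sketch of what such a combine step does, the following pure-Python code is modeled on the GF(2) matrix method that, as far as I know, older zlib releases used for crc32_combine(). It is not zlib's actual code. The helper names _matrix_times and _matrix_square are hypothetical, and the function is reimplemented here because Python's zlib module may not expose crc32_combine, depending on the version.

```python
import zlib

POLY = 0xEDB88320  # reflected CRC-32 polynomial

def _matrix_times(mat, vec):
    """Multiply a 32x32 GF(2) matrix (list of 32 ints) by a 32-bit vector."""
    total = 0
    i = 0
    while vec:
        if vec & 1:
            total ^= mat[i]
        vec >>= 1
        i += 1
    return total

def _matrix_square(mat):
    """Return mat * mat over GF(2)."""
    return [_matrix_times(mat, row) for row in mat]

def crc32_combine(crc1, crc2, len2):
    """Combine crc1 = CRC(A) and crc2 = CRC(B) into CRC(A + B).

    len2 is the length of B in bytes; both inputs started from zero.
    """
    if len2 <= 0:
        return crc1
    # Operator that advances a CRC register over one zero bit.
    mat = [POLY] + [1 << n for n in range(31)]
    mat = _matrix_square(mat)  # two zero bits
    mat = _matrix_square(mat)  # four zero bits
    while len2:
        # First pass: one zero byte; each later pass doubles the length.
        mat = _matrix_square(mat)
        if len2 & 1:
            crc1 = _matrix_times(mat, crc1)
        len2 >>= 1
    return crc1 ^ crc2

a, b = b"1234", b"56789"
combined = crc32_combine(zlib.crc32(a), zlib.crc32(b), len(b))
assert combined == zlib.crc32(a + b)
```

The work in the combine step grows with the logarithm of len2 rather than with len2 itself, which is what makes computing the block CRCs independently worthwhile.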

Mark Adler