Is it possible to remove padding at the end from CRC checksum

Question

For example, I calculated CRC checksum of a file of size 1024 KB and file includes 22 KB of padding of zeros at the end of the file.

If given checksum of 1024 KB and size of the padding of zeros of given file

Is it possible to calculate the checksum of the file without the passing. That is in above case getting the checksum of 1002 KB of the file. Assuming we don't have to recalculate the checksum again and reuse the checksum already calculated for the entire file with padding.

There is a similar [question and answer](https://stackoverflow.com/questions/53018343/crc32-calculation-for-zero-filled-buffer-file). Note that CRC calculation is fairly fast. To calculate a 32 bit CRC for 1024 KB of data, on my system (Intel 3770K 3.5 ghz, Windows 7 Pro), a C based CRC program using a table to process a byte at a time takes 0.00341 seconds, and 500+ line (see the linked to question and answer) assembly code using xmm registers and pclmulqdq takes 0.00031 seconds. — rcgldr, Feb 14 '19 at 15:29
I updated my answer to show a solution with O(1) time complexity, if a table is pre-generated, or O(log2(n)) if not using a table. — rcgldr, Apr 03 '20 at 07:14

rcgldr · Answer 1 · 2020-04-03T07:22:53.387

After a normal CRC is calculated, a CRC can be "reverse cycled" backwards past the trailing zeroes, but rather than actually reverse cycling the CRC, a carryless multiply can be used:

new crc = (crc · (pow(2,-1-reverse_distance)%poly))%poly

The -1 represents the cyclic period for a CRC. For CRC32, the period is 2^32-1 = 0xffffffff .

By generating a table for pow(2,-1-(i*8))%poly) for i = 1 to n, time complexity can be reduced to O(1), doing a table lookup followed by a carryless multiply mod polynomial (32 iterations).

Example code for a 32 byte message with 14 data bytes, 18 zero bytes, with the new crc to be located at msg[{14,15,16,17}]. After the new bytes are stored in the message, a normal CRC calculation on the shortened message will be zero. The example code doesn't use a table, and the time complexity is O(log2(n)) for the pow(2,-1-(n*8))%poly) calculation.

#include <stdio.h>

typedef unsigned char       uint8_t;
typedef unsigned int       uint32_t;

static uint32_t crctbl[256];

void GenTbl(void)                       /* generate crc table */
{
uint32_t crc;
uint32_t c;
uint32_t i;
    for(c = 0; c < 0x100; c++){
        crc = c<<24;
        for(i = 0; i < 8; i++)
            crc = (crc<<1)^((0-(crc>>31))&0x04c11db7);
        crctbl[c] = crc;
    }
}

uint32_t GenCrc(uint8_t * bfr, size_t size) /* generate crc */
{
uint32_t crc = 0u;
    while(size--)
        crc = (crc<<8)^crctbl[(crc>>24)^*bfr++];
    return(crc);
}

/* carryless multiply modulo crc */
uint32_t MpyModCrc(uint32_t a, uint32_t b) /* (a*b)%crc */
{
uint32_t pd = 0;
uint32_t i;
    for(i = 0; i < 32; i++){
        pd = (pd<<1)^((0-(pd>>31))&0x04c11db7u);
        pd ^= (0-(b>>31))&a;
        b <<= 1;
    }
    return pd;
}

/* exponentiate by repeated squaring modulo crc */
uint32_t PowModCrc(uint32_t p)          /* pow(2,p)%crc */
{
uint32_t prd = 0x1u;                    /* current product */
uint32_t sqr = 0x2u;                    /* current square */
    while(p){
        if(p&1)
            prd = MpyModCrc(prd, sqr);
        sqr = MpyModCrc(sqr, sqr);
        p >>= 1;
    }
    return prd;
}

/*  message 14 data, 18 zeroes */
/*  parities = crc cycled backwards 18 bytes */

int main()
{
uint32_t pmr;
uint32_t crc;
uint32_t par;
uint8_t msg[32] = {0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,
                   0x09,0x0a,0x0b,0x0c,0x0d,0x0e,0x00,0x00,
                   0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
                   0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00};

    GenTbl();                           /* generate crc table */
    pmr = PowModCrc(-1-(18*8));         /* pmr = pow(2,-1-18*8)%crc */
    crc = GenCrc(msg, 32);              /* generate crc including 18 zeroes */
    par = MpyModCrc(crc, pmr);          /* par = (crc*pmr)%crc = new crc */
    crc = GenCrc(msg, 14);              /* generate crc for shortened msg */
    printf("%08x %08x\n", par, crc);    /* crc == par */
    msg[14] = (uint8_t)(par>>24);       /* store parities in msg */
    msg[15] = (uint8_t)(par>>16);
    msg[16] = (uint8_t)(par>> 8);
    msg[17] = (uint8_t)(par>> 0);
    crc = GenCrc(msg, 18);              /* crc == 0 */
    printf("%08x\n", crc);

    return 0;
}

score 0 · Answer 2 · answered Feb 13 '19 at 01:43

0

Sure. Look at this answer for code that undoes the trailing zeros, crc32_remove_zeros().

answered Feb 13 '19 at 01:43

Mark Adler

101,978
13
118
158

I updated my answer to show a solution with time complexity O(1) using a table, or O(log2(n)) if not using a table. – rcgldr Apr 03 '20 at 07:15

Is it possible to remove padding at the end from CRC checksum

2 Answers2

Linked