2

If we have a large file, let 's say 1 Petabyte, what's the best CRC that can detect all the errors? Is 32bits enough?

I also heard that undetected error rate (packet or chunk) is= BitR* BER * 0.5^k which K is the FSC of the CRC. in CRC 32 k is 31

I want to know if we have bigger packets or smaller ones how this will affect the CRC... from this equation, it is not affecting at all.

Arash
  • 225
  • 1
  • 11
  • 3
    A 32 bit CRC can detect a single 32 bit burst in a message up to 2^31-1 bits. However, only a few specific 32 bit CRC's can detect up to 7 random bit errors, and this is limited to messages up to 1024 bits, including the CRC, or 992 data bits + 32 CRC bits. Take a look [32 bit CRC Zoo](https://users.ece.cmu.edu/~koopman/crc/crc32.html) and look at the notes for an explanation. Detecting 9 bit errors is limited to message length 100 data + 32 CRC bits. – rcgldr Jul 04 '20 at 20:32
  • 2
    Link to example of [analysis of crc undetected error](http://www.ieee802.org/3/bv/public/Jan_2015/perezaranda_3bv_2_0115.pdf) – rcgldr Jul 05 '20 at 02:15
  • 1
    Usually as in that link the odds of failure include the probability of an error: 1 - (1-BER)^n, where BER is bit error rate, and n is number of bits (which is calculating 1 minus the probability of zero bits in error). – rcgldr Jul 05 '20 at 03:12
  • I’m voting to close this question because this is a better suited question for [cs.se], which specializes in generic algorithmic questions like this. – Machavity Aug 15 '22 at 17:32

1 Answers1

2

"Enough" depends on your tolerance for false positives. Given a CRC, or any other good hash, what probability would be acceptable to you of not detecting an error in any one message?

If you call that p, then the length of the CRC or hash you need in bits, n is n = ceiling(–log2(p)).

Note that this does not depend on the length of message. Kilobytes, Exabytes, whatever. Except to the extent that the expense of creating, sending, or storing the message impacts the p you would find acceptable.

For particularly expensive data or data sent over untrustworthy channels, you may want to consider error correcting codes, such as Reed-Solomon or BCH codes.

Mark Adler
  • 101,978
  • 13
  • 118
  • 158
  • I wondered, If I do multiple CRCs in different network layers what would be the undetected error probably over the whole? – Arash Oct 05 '20 at 03:57
  • 1
    If they're the same CRC on the same data, then it is same probability as for one. If the CRCs are different (with different polynomials), then the false positive probabilities will multiply. – Mark Adler Oct 05 '20 at 04:06
  • 1
    No. The different CRCs result in independent false positive probabilities. Therefore those probabilities can be multiplied. – Mark Adler Oct 05 '20 at 05:56
  • You mentioned ``` If the CRCs are different (with different polynomials), then the false positive probabilities will multiply``` which makes sense and it means the undetected error probability would decrease, I wondered is there any reference or paper for that? – Arash Apr 01 '21 at 01:05
  • Could you please assist me with my new question? https://stackoverflow.com/questions/67783861/which-one-is-more-complex-calculating-a-64-bits-crc-or-two-32-bits-crcs-with-di – Arash Jun 01 '21 at 06:44