CRC32 Demystified
A.K.A. Why does almost everyone bugger up CRC32 and how to fix it.
Version 2, August 23, 2022 By Michaelangel007
Table of Contents
- Introduction
- Checksum
- CRC32 Implementation
- TL:DR; CRC32 Summary
- CRC32 Hashing Confusion
- CRC32 or CRC33?
- TL:DR; demo
- References
Introduction
How many different ways can one implement a Cyclic redundancy check algorithm? Specifically, where the polynomial is 32-bits, aka CRC32?
Let me count the ways.
The CRC algorithm can be implemented in one of two general methods:
- Formulaic
- Table-Lookup
For each of these methods there are various options we can choose from:
First, for each CRC we can use one of two polynomials:
- a "forward" polynomial constant, or
- a "reverse" polynomial constant, that is, the forward polynomial bit reversed.
Second, when we initialize the table we can shift bits to the left or right.
Third, when we calculate the CRC value we can shift bits to left or right. Ross Williams mentions in his guide: A PAINLESS GUIDE TO CRC ERROR DETECTION ALGORITHMS
"There are really only two forms: normal and reflected. Normal shifts to the left and covers the case of algorithms with Refin=FALSE and Refot=FALSE. Reflected shifts to the right and covers algorithms with both those parameters true."
And gives these two formulas:
Normal: crc = table[ ((crc>>24) ^ *data++) & 0xFF ] ^ (crc << 8);
Reflect: crc = table[ (crc ^ *data++) & 0xFF ] ^ (crc >> 8);
Note: that while the latter "reflected" formula is correct, the former "normal" formula is incorrect -- one needs to reverse both the:
- data bits, and
- final CRC value.
That gives rise to the Forth and Fifth options, respectively. The fourth -- as we read each byte of the data we are calculating the CRC for we optionally bit-reverse the the bytes as they get applied into the CRC algorithm.
The fifth -- we optionally bit-reverse the final CRC value.
What this all means is that depending on:
- which form (Normal or Reflected) and
- which polynomial (Forward or Reverse)
is used a naive user won't realize there at least a total of 4 different outcomes!
Two of them are correct; the other two are incorrect.
- Normal form (Shift Left) with Forward polynomial (valid)
- Reflected form (Shift Right) with Forward polynomial
- Normal form (Shift Left) with Reverse polynomial
- Reflected form (Shift Right) with Reverse polynomial (valid)
Checksum
Ross also mentions a checksum called "CHECK":
CHECK: ... The field contains the checksum obtained when the ASCII string "123456789" is fed through the specified algorithm (i.e. 313233... (hexadecimal)).
For CRC32 a popular forward polynomial is 0x04C11DB7.
Reversing the bits
we get the reference polynomial of 0xEDB88320
.
Polynomial | Binary |
---|---|
0x04C11DB7 | 00000100_11000001_00011101_10110111 |
0xEDB88320 | 11101101_10111000_10000011_00100000 |
(Note: See CRC32 vs CRC33)
They generate this checksum:
Form | Polynomial | CRC32 checksum |
---|---|---|
Normal | 0x04C11DB7 | 0xCBF43926 |
Reflected | 0xEDB88320 | 0xCBF43926 |
Notice how the two checksum values are the same even though the polynomials are different! We'll examine why that is in a bit (groan) but for now we can utilize this knowledge to verify that we have implemented the CRC32 algorithm correctly regardless if we are using:
- a formulaic method,
- a table-lookup method,
- a normal polynomial, or
- a reflected polynomial.
These are not the only CRC32 polynomials used. Wikipedia has a table of popular CRC polynomials For example, if you enter in "123456789" in password searching utilities:
- https://decryptpassword.com/encrypt/
- https://www.integers.co/questions-answers/what-are-the-different-hash-algorithm-outputs-for-123456789.html
- http://php.net/manual/en/function.hash-file.php
They will list:
Encrypting 123456789 with CRC32: 181989fc
Encrypting 123456789 with CRC32B: cbf43926
NOTE: To prevent ambiguity I'll call the former CRC32A
.
The former CRC32A
is the ITU I.363.5 algorithm popularised by BZIP2.
The latter CRC32B
is the ITU V.42 algorithm used in Ethernet and popularised by PKZip)
Why are two different values when both of these are generated with the same CRC32 polynomial 0x04C11DB7 on the same input? We're getting ahead of ourselves but they can be summarize like this:
Polynomial | Shift | Reverse Data | Reverse CRC | Checksum | Name |
---|---|---|---|---|---|
0x04C11DB7 | Left | No | No | 0xFC891918 | crc32a |
0x04C11DB7 | Left | Yes | Yes | 0xCBF43926 | crc32b |
0xEDB88320 | Right | No | No | 0xCBF43926 | crc32b |
0xEDB88320 | Right | Yes | Yes | 0xFC891918 | crc32a |
You can see enum_crc32.cpp for more details.
For this document we will focus mainly on CRC32B.
On *nix machines we can use the built-in cksum
utility.
echo -n "123456789" > crctest.txt
cksum -o 3 crctest.txt
Produces this output:
3421780262 9 crctest.txt
Converting to hex via Basic Calculator
echo "obase=16; 3421780262;" | bc
CBF43926
Which matches the expected checksum value.
CRC32 Implementation
So how do we actually implement CRC32?
- Initialize the CRC value, typically zero, but by convention we usually invert the bits, that is, -1.
- Iterate over each byte we wish to calculate the CRC checksum for
- For each byte we exclusive-or (XOR) it with the current CRC one bit at a time
- By convention for the final CRC value we invert the bits
Sounds simple, right?
It is, except for some minor, but important implementation details:
- When we XOR the data byte into the current CRC value do we start at the top or bottom bits?
- Which direction do we shift the CRC bits?
- How do we convert this formula into a table where we handle all 8-bits at once?
We'll start with the formulaic version.
Formulaic CRC
If we are using a formulaic method, we can generalize the 4 permutations with 2 algorithms:
- a) Formulaic "Normal" CRC32
// ========================================================================
uint32_t crc32_formula_normal( size_t len, const void *data, const uint32_t POLY = 0x04C11DB7 )
{
const unsigned char *buffer = (const unsigned char*) data;
uint32_t crc = -1;
while( len-- )
{
crc = crc ^ (reverse[ *buffer++ ] << 24);
for( int bit = 0; bit < 8; bit++ )
{
if( crc & (1L << 31)) crc = (crc << 1) ^ POLY;
else crc = (crc << 1);
}
}
return reflect32( ~crc );
}
Notes:
- The
reverse
is a table of byte values bit-reversed. - Likewise
reflect32()
bit reverses a 32-bit value.
There are few key things to note:
Normal
= Shift Left- We reverse bytes in the buffer before XOR'ing them into the CRC value
- The data bytes are fed into the top 8 bits of the CRC value
- We test the top-bit of the CRC value
- We invert the final CRC value
- We bit reverse the final CRC value
What were to happen if we didn't reverse any of the bits?
// ========================================================================
uint32_t crc32_formula_normal_noreverse( size_t len, const void *data, const uint32_t POLY = 0x04C11DB7 )
{
const unsigned char *buffer = (const unsigned char*) data;
uint32_t crc = -1;
while( len-- )
{
crc = crc ^ (*buffer++ << 24);
for( int bit = 0; bit < 8; bit++ )
{
if( crc & (1L << 31)) crc = (crc << 1) ^ POLY;
else crc = (crc << 1);
}
}
return ~crc;
}
Running it on "123456789" produces the CRC32 value of 0xFC891918 -- notice
how this is the little endian form of the big endian value 0x181989FC mentioned above!
We would have CRC32A
.
-
b) Formulaic "Reflected" CRC32
If we want to remove all that bit-reversing shenanigans we end up the simpler version:
// ========================================================================
uint32_t crc32_formula_reflect( size_t len, const void *data, const uint32_t POLY = 0xEDB88320 )
{
const unsigned char *buffer = (const unsigned char*) data;
uint32_t crc = -1;
while( len-- )
{
crc = crc ^ *buffer++;
for( int bit = 0; bit < 8; bit++ )
{
if( crc & 1 ) crc = (crc >> 1) ^ POLY;
else crc = (crc >> 1);
}
}
return ~crc;
}
The areas to note are:
Reflected
= Shift Right- The data bytes are fed into the bottom 8 bits of the CRC value
- We test the bottom-bit of the CRC value
- We invert the final CRC value
What would happen if we mis-matched the form and the polynomial? Using the check string this produces the formulaic output of:
Normal ( 0x04C11DB7 ): 0xCBF43926
Reflected( 0x04C11DB7 ): 0xFC4F2BE9 invalid!
Normal ( 0xEDB88320 ): 0xFC4F2BE9 invalid!
Reflected( 0xEDB88320 ): 0xCBF43926
We can now understand why often times you'll find people posting on the internet
that they are using a (forward polynomial) value of 0x04C11DB7
and are asking for
help to understand why they aren't getting the correct CRC calculation.
The typical knee-jerk solution is someone tells the original parent to use the other
(reverse) polynomial, 0x0xEDB88320
,
without both parties understanding that both polynomials are correct!
The real problem is a MISMATCH of the bit shuffling used in the CRC32 table Initialization and CRC32 Calculation.
Table-Lookup CRC
There is one big annoyance with the formulaic approach:
- We need to test the CRC one bit a time
Can we test 8 bits at once? Yes, by using a pre-computed table. The table lookup approach the CRC algorithm is broken down into two phases:
- Initialization of the table
- Calculation of crc using the table
As we saw in the formulaic discussion above there are 4 different permutions.
- Normal form (Shift Left) with Forward polynomial (valid)
- Reflected form (Shift Right) with Forward polynomial
- Normal form (Shift Left) with Reverse polynomial
- Reflected form (Shift Right) with Reverse polynomial (valid)
The calculation phase of table-lookup adds 8 more permutations.
Phase | Permutations |
---|---|
Initialization | 2^2 |
Calculation | 2^3 |
Thus there are a total of 2^2 * 2^3 = 4 * 8 = 32 permutations.
Of these 32 different permutations only 4 are correct, or should I say standardized. The other 28 are broken implementations due to someone not understanding the theory and incorrectly applying it.
No wonder most people are confused about CRC32! It is so easy to go wrong!!
Here are the details:
a) How can the table initialization be implemented in 4 different ways?
Form | Bit Check | Rotate | Polynomial | Valid | Function |
---|---|---|---|---|---|
Normal | Top Bit | Left | 0x04C11DB7 | Yes | crc32_init_normal ( forward poly ) |
Reflected | Bottom Bit | Right | 0x04C11DB7 | no !! | crc32_init_reflect( forward poly ) |
Normal | Top Bit | Left | 0xEDB88320 | no !! | crc32_init_normal ( reverse poly ) |
Reflected | Bottom Bit | Right | 0xEDB88320 | Yes | crc32_init_reflect( reverse poly ) |
CRC32 Tables
Here are the 4 CRC32 tables:
- Normal 0x04C11DB7: &<<31 [ 1] = poly , [ 16] = poly << 8 VALID
00000000, 04C11DB7, 09823B6E, 0D4326D9, 130476DC, 17C56B6B, 1A864DB2, 1E475005, // 0 [0x00 .. 0x07]
2608EDB8, 22C9F00F, 2F8AD6D6, 2B4BCB61, 350C9B64, 31CD86D3, 3C8EA00A, 384FBDBD, // 8 [0x08 .. 0x0F]
4C11DB70, 48D0C6C7, 4593E01E, 4152FDA9, 5F15ADAC, 5BD4B01B, 569796C2, 52568B75, // 16 [0x10 .. 0x17]
6A1936C8, 6ED82B7F, 639B0DA6, 675A1011, 791D4014, 7DDC5DA3, 709F7B7A, 745E66CD, // 24 [0x18 .. 0x1F]
9823B6E0, 9CE2AB57, 91A18D8E, 95609039, 8B27C03C, 8FE6DD8B, 82A5FB52, 8664E6E5, // 32 [0x20 .. 0x27]
BE2B5B58, BAEA46EF, B7A96036, B3687D81, AD2F2D84, A9EE3033, A4AD16EA, A06C0B5D, // 40 [0x28 .. 0x2F]
D4326D90, D0F37027, DDB056FE, D9714B49, C7361B4C, C3F706FB, CEB42022, CA753D95, // 48 [0x30 .. 0x37]
F23A8028, F6FB9D9F, FBB8BB46, FF79A6F1, E13EF6F4, E5FFEB43, E8BCCD9A, EC7DD02D, // 56 [0x38 .. 0x3F]
34867077, 30476DC0, 3D044B19, 39C556AE, 278206AB, 23431B1C, 2E003DC5, 2AC12072, // 64 [0x40 .. 0x47]
128E9DCF, 164F8078, 1B0CA6A1, 1FCDBB16, 018AEB13, 054BF6A4, 0808D07D, 0CC9CDCA, // 72 [0x48 .. 0x4F]
7897AB07, 7C56B6B0, 71159069, 75D48DDE, 6B93DDDB, 6F52C06C, 6211E6B5, 66D0FB02, // 80 [0x50 .. 0x57]
5E9F46BF, 5A5E5B08, 571D7DD1, 53DC6066, 4D9B3063, 495A2DD4, 44190B0D, 40D816BA, // 88 [0x58 .. 0x5F]
ACA5C697, A864DB20, A527FDF9, A1E6E04E, BFA1B04B, BB60ADFC, B6238B25, B2E29692, // 96 [0x60 .. 0x67]
8AAD2B2F, 8E6C3698, 832F1041, 87EE0DF6, 99A95DF3, 9D684044, 902B669D, 94EA7B2A, // 104 [0x68 .. 0x6F]
E0B41DE7, E4750050, E9362689, EDF73B3E, F3B06B3B, F771768C, FA325055, FEF34DE2, // 112 [0x70 .. 0x77]
C6BCF05F, C27DEDE8, CF3ECB31, CBFFD686, D5B88683, D1799B34, DC3ABDED, D8FBA05A, // 120 [0x78 .. 0x7F]
690CE0EE, 6DCDFD59, 608EDB80, 644FC637, 7A089632, 7EC98B85, 738AAD5C, 774BB0EB, // 128 [0x80 .. 0x87]
4F040D56, 4BC510E1, 46863638, 42472B8F, 5C007B8A, 58C1663D, 558240E4, 51435D53, // 136 [0x88 .. 0x8F]
251D3B9E, 21DC2629, 2C9F00F0, 285E1D47, 36194D42, 32D850F5, 3F9B762C, 3B5A6B9B, // 144 [0x90 .. 0x97]
0315D626, 07D4CB91, 0A97ED48, 0E56F0FF, 1011A0FA, 14D0BD4D, 19939B94, 1D528623, // 152 [0x98 .. 0x9F]
F12F560E, F5EE4BB9, F8AD6D60, FC6C70D7, E22B20D2, E6EA3D65, EBA91BBC, EF68060B, // 160 [0xA0 .. 0xA7]
D727BBB6, D3E6A601, DEA580D8, DA649D6F, C423CD6A, C0E2D0DD, CDA1F604, C960EBB3, // 168 [0xA8 .. 0xAF]
BD3E8D7E, B9FF90C9, B4BCB610, B07DABA7, AE3AFBA2, AAFBE615, A7B8C0CC, A379DD7B, // 176 [0xB0 .. 0xB7]
9B3660C6, 9FF77D71, 92B45BA8, 9675461F, 8832161A, 8CF30BAD, 81B02D74, 857130C3, // 184 [0xB8 .. 0xBF]
5D8A9099, 594B8D2E, 5408ABF7, 50C9B640, 4E8EE645, 4A4FFBF2, 470CDD2B, 43CDC09C, // 192 [0xC0 .. 0xC7]
7B827D21, 7F436096, 7200464F, 76C15BF8, 68860BFD, 6C47164A, 61043093, 65C52D24, // 200 [0xC8 .. 0xCF]
119B4BE9, 155A565E, 18197087, 1CD86D30, 029F3D35, 065E2082, 0B1D065B, 0FDC1BEC, // 208 [0xD0 .. 0xD7]
3793A651, 3352BBE6, 3E119D3F, 3AD08088, 2497D08D, 2056CD3A, 2D15EBE3, 29D4F654, // 216 [0xD8 .. 0xDF]
C5A92679, C1683BCE, CC2B1D17, C8EA00A0, D6AD50A5, D26C4D12, DF2F6BCB, DBEE767C, // 224 [0xE0 .. 0xE7]
E3A1CBC1, E760D676, EA23F0AF, EEE2ED18, F0A5BD1D, F464A0AA, F9278673, FDE69BC4, // 232 [0xE8 .. 0xEF]
89B8FD09, 8D79E0BE, 803AC667, 84FBDBD0, 9ABC8BD5, 9E7D9662, 933EB0BB, 97FFAD0C, // 240 [0xF0 .. 0xF7]
AFB010B1, AB710D06, A6322BDF, A2F33668, BCB4666D, B8757BDA, B5365D03, B1F740B4, // 248 [0xF8 .. 0xFF]
- Reflected 0x04C11DB7: &1 >> [ 1] = rev. poly, [ 30] = rev.poly << 8 broken
00000000, 06233697, 05C45641, 03E760D6, 020A97ED, 0429A17A, 07CEC1AC, 01EDF73B, // 0 [0x00 .. 0x07]
04152FDA, 0236194D, 01D1799B, 07F24F0C, 061FB837, 003C8EA0, 03DBEE76, 05F8D8E1, // 8 [0x08 .. 0x0F]
01A864DB, 078B524C, 046C329A, 024F040D, 03A2F336, 0581C5A1, 0666A577, 004593E0, // 16 [0x10 .. 0x17]
05BD4B01, 039E7D96, 00791D40, 065A2BD7, 07B7DCEC, 0194EA7B, 02738AAD, 0450BC3A, // 24 [0x18 .. 0x1F]
0350C9B6, 0573FF21, 06949FF7, 00B7A960, 015A5E5B, 077968CC, 049E081A, 02BD3E8D, // 32 [0x20 .. 0x27]
0745E66C, 0166D0FB, 0281B02D, 04A286BA, 054F7181, 036C4716, 008B27C0, 06A81157, // 40 [0x28 .. 0x2F]
02F8AD6D, 04DB9BFA, 073CFB2C, 011FCDBB, 00F23A80, 06D10C17, 05366CC1, 03155A56, // 48 [0x30 .. 0x37]
06ED82B7, 00CEB420, 0329D4F6, 050AE261, 04E7155A, 02C423CD, 0123431B, 0700758C, // 56 [0x38 .. 0x3F]
06A1936C, 0082A5FB, 0365C52D, 0546F3BA, 04AB0481, 02883216, 016F52C0, 074C6457, // 64 [0x40 .. 0x47]
02B4BCB6, 04978A21, 0770EAF7, 0153DC60, 00BE2B5B, 069D1DCC, 057A7D1A, 03594B8D, // 72 [0x48 .. 0x4F]
0709F7B7, 012AC120, 02CDA1F6, 04EE9761, 0503605A, 032056CD, 00C7361B, 06E4008C, // 80 [0x50 .. 0x57]
031CD86D, 053FEEFA, 06D88E2C, 00FBB8BB, 01164F80, 07357917, 04D219C1, 02F12F56, // 88 [0x58 .. 0x5F]
05F15ADA, 03D26C4D, 00350C9B, 06163A0C, 07FBCD37, 01D8FBA0, 023F9B76, 041CADE1, // 96 [0x60 .. 0x67]
01E47500, 07C74397, 04202341, 020315D6, 03EEE2ED, 05CDD47A, 062AB4AC, 0009823B, // 104 [0x68 .. 0x6F]
04593E01, 027A0896, 019D6840, 07BE5ED7, 0653A9EC, 00709F7B, 0397FFAD, 05B4C93A, // 112 [0x70 .. 0x77]
004C11DB, 066F274C, 0588479A, 03AB710D, 02468636, 0465B0A1, 0782D077, 01A1E6E0, // 120 [0x78 .. 0x7F]
04C11DB7, 02E22B20, 01054BF6, 07267D61, 06CB8A5A, 00E8BCCD, 030FDC1B, 052CEA8C, // 128 [0x80 .. 0x87]
00D4326D, 06F704FA, 0510642C, 033352BB, 02DEA580, 04FD9317, 071AF3C1, 0139C556, // 136 [0x88 .. 0x8F]
0569796C, 034A4FFB, 00AD2F2D, 068E19BA, 0763EE81, 0140D816, 02A7B8C0, 04848E57, // 144 [0x90 .. 0x97]
017C56B6, 075F6021, 04B800F7, 029B3660, 0376C15B, 0555F7CC, 06B2971A, 0091A18D, // 152 [0x98 .. 0x9F]
0791D401, 01B2E296, 02558240, 0476B4D7, 059B43EC, 03B8757B, 005F15AD, 067C233A, // 160 [0xA0 .. 0xA7]
0384FBDB, 05A7CD4C, 0640AD9A, 00639B0D, 018E6C36, 07AD5AA1, 044A3A77, 02690CE0, // 168 [0xA8 .. 0xAF]
0639B0DA, 001A864D, 03FDE69B, 05DED00C, 04332737, 021011A0, 01F77176, 07D447E1, // 176 [0xB0 .. 0xB7]
022C9F00, 040FA997, 07E8C941, 01CBFFD6, 002608ED, 06053E7A, 05E25EAC, 03C1683B, // 184 [0xB8 .. 0xBF]
02608EDB, 0443B84C, 07A4D89A, 0187EE0D, 006A1936, 06492FA1, 05AE4F77, 038D79E0, // 192 [0xC0 .. 0xC7]
0675A101, 00569796, 03B1F740, 0592C1D7, 047F36EC, 025C007B, 01BB60AD, 0798563A, // 200 [0xC8 .. 0xCF]
03C8EA00, 05EBDC97, 060CBC41, 002F8AD6, 01C27DED, 07E14B7A, 04062BAC, 02251D3B, // 208 [0xD0 .. 0xD7]
07DDC5DA, 01FEF34D, 0219939B, 043AA50C, 05D75237, 03F464A0, 00130476, 063032E1, // 216 [0xD8 .. 0xDF]
0130476D, 071371FA, 04F4112C, 02D727BB, 033AD080, 0519E617, 06FE86C1, 00DDB056, // 224 [0xE0 .. 0xE7]
052568B7, 03065E20, 00E13EF6, 06C20861, 072FFF5A, 010CC9CD, 02EBA91B, 04C89F8C, // 232 [0xE8 .. 0xEF]
009823B6, 06BB1521, 055C75F7, 037F4360, 0292B45B, 04B182CC, 0756E21A, 0175D48D, // 240 [0xF0 .. 0xF7]
048D0C6C, 02AE3AFB, 01495A2D, 076A6CBA, 06879B81, 00A4AD16, 0343CDC0, 0560FB57, // 248 [0xF8 .. 0xFF]
- Normal 0xEDB88320: &<<31 [128] = rev. poly, [120] = rev.poly >> 8 broken
00000000, EDB88320, 36C98560, DB710640, 6D930AC0, 802B89E0, 5B5A8FA0, B6E20C80, // 0 [0x00 .. 0x07]
DB261580, 369E96A0, EDEF90E0, 005713C0, B6B51F40, 5B0D9C60, 807C9A20, 6DC41900, // 8 [0x08 .. 0x0F]
5BF4A820, B64C2B00, 6D3D2D40, 8085AE60, 3667A2E0, DBDF21C0, 00AE2780, ED16A4A0, // 16 [0x10 .. 0x17]
80D2BDA0, 6D6A3E80, B61B38C0, 5BA3BBE0, ED41B760, 00F93440, DB883200, 3630B120, // 24 [0x18 .. 0x1F]
B7E95040, 5A51D360, 8120D520, 6C985600, DA7A5A80, 37C2D9A0, ECB3DFE0, 010B5CC0, // 32 [0x20 .. 0x27]
6CCF45C0, 8177C6E0, 5A06C0A0, B7BE4380, 015C4F00, ECE4CC20, 3795CA60, DA2D4940, // 40 [0x28 .. 0x2F]
EC1DF860, 01A57B40, DAD47D00, 376CFE20, 818EF2A0, 6C367180, B74777C0, 5AFFF4E0, // 48 [0x30 .. 0x37]
373BEDE0, DA836EC0, 01F26880, EC4AEBA0, 5AA8E720, B7106400, 6C616240, 81D9E160, // 56 [0x38 .. 0x3F]
826A23A0, 6FD2A080, B4A3A6C0, 591B25E0, EFF92960, 0241AA40, D930AC00, 34882F20, // 64 [0x40 .. 0x47]
594C3620, B4F4B500, 6F85B340, 823D3060, 34DF3CE0, D967BFC0, 0216B980, EFAE3AA0, // 72 [0x48 .. 0x4F]
D99E8B80, 342608A0, EF570EE0, 02EF8DC0, B40D8140, 59B50260, 82C40420, 6F7C8700, // 80 [0x50 .. 0x57]
02B89E00, EF001D20, 34711B60, D9C99840, 6F2B94C0, 829317E0, 59E211A0, B45A9280, // 88 [0x58 .. 0x5F]
358373E0, D83BF0C0, 034AF680, EEF275A0, 58107920, B5A8FA00, 6ED9FC40, 83617F60, // 96 [0x60 .. 0x67]
EEA56660, 031DE540, D86CE300, 35D46020, 83366CA0, 6E8EEF80, B5FFE9C0, 58476AE0, // 104 [0x68 .. 0x6F]
6E77DBC0, 83CF58E0, 58BE5EA0, B506DD80, 03E4D100, EE5C5220, 352D5460, D895D740, // 112 [0x70 .. 0x77]
B551CE40, 58E94D60, 83984B20, 6E20C800, D8C2C480, 357A47A0, EE0B41E0, 03B3C2C0, // 120 [0x78 .. 0x7F]
E96CC460, 04D44740, DFA54100, 321DC220, 84FFCEA0, 69474D80, B2364BC0, 5F8EC8E0, // 128 [0x80 .. 0x87]
324AD1E0, DFF252C0, 04835480, E93BD7A0, 5FD9DB20, B2615800, 69105E40, 84A8DD60, // 136 [0x88 .. 0x8F]
B2986C40, 5F20EF60, 8451E920, 69E96A00, DF0B6680, 32B3E5A0, E9C2E3E0, 047A60C0, // 144 [0x90 .. 0x97]
69BE79C0, 8406FAE0, 5F77FCA0, B2CF7F80, 042D7300, E995F020, 32E4F660, DF5C7540, // 152 [0x98 .. 0x9F]
5E859420, B33D1700, 684C1140, 85F49260, 33169EE0, DEAE1DC0, 05DF1B80, E86798A0, // 160 [0xA0 .. 0xA7]
85A381A0, 681B0280, B36A04C0, 5ED287E0, E8308B60, 05880840, DEF90E00, 33418D20, // 168 [0xA8 .. 0xAF]
05713C00, E8C9BF20, 33B8B960, DE003A40, 68E236C0, 855AB5E0, 5E2BB3A0, B3933080, // 176 [0xB0 .. 0xB7]
DE572980, 33EFAAA0, E89EACE0, 05262FC0, B3C42340, 5E7CA060, 850DA620, 68B52500, // 184 [0xB8 .. 0xBF]
6B06E7C0, 86BE64E0, 5DCF62A0, B077E180, 0695ED00, EB2D6E20, 305C6860, DDE4EB40, // 192 [0xC0 .. 0xC7]
B020F240, 5D987160, 86E97720, 6B51F400, DDB3F880, 300B7BA0, EB7A7DE0, 06C2FEC0, // 200 [0xC8 .. 0xCF]
30F24FE0, DD4ACCC0, 063BCA80, EB8349A0, 5D614520, B0D9C600, 6BA8C040, 86104360, // 208 [0xD0 .. 0xD7]
EBD45A60, 066CD940, DD1DDF00, 30A55C20, 864750A0, 6BFFD380, B08ED5C0, 5D3656E0, // 216 [0xD8 .. 0xDF]
DCEFB780, 315734A0, EA2632E0, 079EB1C0, B17CBD40, 5CC43E60, 87B53820, 6A0DBB00, // 224 [0xE0 .. 0xE7]
07C9A200, EA712120, 31002760, DCB8A440, 6A5AA8C0, 87E22BE0, 5C932DA0, B12BAE80, // 232 [0xE8 .. 0xEF]
871B1FA0, 6AA39C80, B1D29AC0, 5C6A19E0, EA881560, 07309640, DC419000, 31F91320, // 240 [0xF0 .. 0xF7]
5C3D0A20, B1858900, 6AF48F40, 874C0C60, 31AE00E0, DC1683C0, 07678580, EADF06A0, // 248 [0xF8 .. 0xFF]
- Reflected 0xEDB88320: &1 >> [128] = poly , [ 8] = poly >> 8 VALID
00000000, 77073096, EE0E612C, 990951BA, 076DC419, 706AF48F, E963A535, 9E6495A3, // 0 [0x00 .. 0x07]
0EDB8832, 79DCB8A4, E0D5E91E, 97D2D988, 09B64C2B, 7EB17CBD, E7B82D07, 90BF1D91, // 8 [0x08 .. 0x0F]
1DB71064, 6AB020F2, F3B97148, 84BE41DE, 1ADAD47D, 6DDDE4EB, F4D4B551, 83D385C7, // 16 [0x10 .. 0x17]
136C9856, 646BA8C0, FD62F97A, 8A65C9EC, 14015C4F, 63066CD9, FA0F3D63, 8D080DF5, // 24 [0x18 .. 0x1F]
3B6E20C8, 4C69105E, D56041E4, A2677172, 3C03E4D1, 4B04D447, D20D85FD, A50AB56B, // 32 [0x20 .. 0x27]
35B5A8FA, 42B2986C, DBBBC9D6, ACBCF940, 32D86CE3, 45DF5C75, DCD60DCF, ABD13D59, // 40 [0x28 .. 0x2F]
26D930AC, 51DE003A, C8D75180, BFD06116, 21B4F4B5, 56B3C423, CFBA9599, B8BDA50F, // 48 [0x30 .. 0x37]
2802B89E, 5F058808, C60CD9B2, B10BE924, 2F6F7C87, 58684C11, C1611DAB, B6662D3D, // 56 [0x38 .. 0x3F]
76DC4190, 01DB7106, 98D220BC, EFD5102A, 71B18589, 06B6B51F, 9FBFE4A5, E8B8D433, // 64 [0x40 .. 0x47]
7807C9A2, 0F00F934, 9609A88E, E10E9818, 7F6A0DBB, 086D3D2D, 91646C97, E6635C01, // 72 [0x48 .. 0x4F]
6B6B51F4, 1C6C6162, 856530D8, F262004E, 6C0695ED, 1B01A57B, 8208F4C1, F50FC457, // 80 [0x50 .. 0x57]
65B0D9C6, 12B7E950, 8BBEB8EA, FCB9887C, 62DD1DDF, 15DA2D49, 8CD37CF3, FBD44C65, // 88 [0x58 .. 0x5F]
4DB26158, 3AB551CE, A3BC0074, D4BB30E2, 4ADFA541, 3DD895D7, A4D1C46D, D3D6F4FB, // 96 [0x60 .. 0x67]
4369E96A, 346ED9FC, AD678846, DA60B8D0, 44042D73, 33031DE5, AA0A4C5F, DD0D7CC9, // 104 [0x68 .. 0x6F]
5005713C, 270241AA, BE0B1010, C90C2086, 5768B525, 206F85B3, B966D409, CE61E49F, // 112 [0x70 .. 0x77]
5EDEF90E, 29D9C998, B0D09822, C7D7A8B4, 59B33D17, 2EB40D81, B7BD5C3B, C0BA6CAD, // 120 [0x78 .. 0x7F]
EDB88320, 9ABFB3B6, 03B6E20C, 74B1D29A, EAD54739, 9DD277AF, 04DB2615, 73DC1683, // 128 [0x80 .. 0x87]
E3630B12, 94643B84, 0D6D6A3E, 7A6A5AA8, E40ECF0B, 9309FF9D, 0A00AE27, 7D079EB1, // 136 [0x88 .. 0x8F]
F00F9344, 8708A3D2, 1E01F268, 6906C2FE, F762575D, 806567CB, 196C3671, 6E6B06E7, // 144 [0x90 .. 0x97]
FED41B76, 89D32BE0, 10DA7A5A, 67DD4ACC, F9B9DF6F, 8EBEEFF9, 17B7BE43, 60B08ED5, // 152 [0x98 .. 0x9F]
D6D6A3E8, A1D1937E, 38D8C2C4, 4FDFF252, D1BB67F1, A6BC5767, 3FB506DD, 48B2364B, // 160 [0xA0 .. 0xA7]
D80D2BDA, AF0A1B4C, 36034AF6, 41047A60, DF60EFC3, A867DF55, 316E8EEF, 4669BE79, // 168 [0xA8 .. 0xAF]
CB61B38C, BC66831A, 256FD2A0, 5268E236, CC0C7795, BB0B4703, 220216B9, 5505262F, // 176 [0xB0 .. 0xB7]
C5BA3BBE, B2BD0B28, 2BB45A92, 5CB36A04, C2D7FFA7, B5D0CF31, 2CD99E8B, 5BDEAE1D, // 184 [0xB8 .. 0xBF]
9B64C2B0, EC63F226, 756AA39C, 026D930A, 9C0906A9, EB0E363F, 72076785, 05005713, // 192 [0xC0 .. 0xC7]
95BF4A82, E2B87A14, 7BB12BAE, 0CB61B38, 92D28E9B, E5D5BE0D, 7CDCEFB7, 0BDBDF21, // 200 [0xC8 .. 0xCF]
86D3D2D4, F1D4E242, 68DDB3F8, 1FDA836E, 81BE16CD, F6B9265B, 6FB077E1, 18B74777, // 208 [0xD0 .. 0xD7]
88085AE6, FF0F6A70, 66063BCA, 11010B5C, 8F659EFF, F862AE69, 616BFFD3, 166CCF45, // 216 [0xD8 .. 0xDF]
A00AE278, D70DD2EE, 4E048354, 3903B3C2, A7672661, D06016F7, 4969474D, 3E6E77DB, // 224 [0xE0 .. 0xE7]
AED16A4A, D9D65ADC, 40DF0B66, 37D83BF0, A9BCAE53, DEBB9EC5, 47B2CF7F, 30B5FFE9, // 232 [0xE8 .. 0xEF]
BDBDF21C, CABAC28A, 53B39330, 24B4A3A6, BAD03605, CDD70693, 54DE5729, 23D967BF, // 240 [0xF0 .. 0xF7]
B3667A2E, C4614AB8, 5D681B02, 2A6F2B94, B40BBE37, C30C8EA1, 5A05DF1B, 2D02EF8D, // 248 [0xF8 .. 0xFF]
Thus by inspecting table[1] and table[128] entries we can tell:
- Which form you used,
- Which polynomial you used, and
- Which shift direction you should be using.
b) How can the CRC calculation be implemented in 8 different ways?
Shift CRC | Data Bits reversed? | Final CRC Reversed? |
---|---|---|
Left | No | No |
Right | Yes | Yes |
The data bits are reversed depending on which form we are in:
Form | Data bits |
---|---|
Normal | crc32 = table[ (crc32 >> 24) ^ reverse[ *data ] ^ (crc << 8); |
Reflected | crc32 = table[ (crc32 ) ^ *data ] ^ (crc >> 8); |
The final CRC value is reversed depending on which form we are in:
Form | Final CRC32 |
---|---|
Normal | crc32 = ~crc32 ; |
Reflected | crc32 = reflect32( ~crc32 ); |
Enumerating the 8 calculation permutations:
Right Shift | Reverse Data |
Reverse CRC |
Function |
---|---|---|---|
0 | 0 | 0 | crc32_000() |
0 | 0 | 1 | crc32_001() |
0 | 1 | 0 | crc32_010() |
0 | 1 | 1 | crc32_011() |
1 | 0 | 0 | crc32_100() |
1 | 0 | 1 | crc32_101() |
1 | 1 | 0 | crc32_110() |
1 | 1 | 1 | crc32_111() |
All CRC32 Permutations
Enumerating all 32 initialization and calculation permutations:
- See enum_crc32.cpp
Polynomial | Bit | Init | Shift CRC> |
Reverse Data |
Reverse CRC |
Function | Valid | Result |
---|---|---|---|---|---|---|---|---|
0x04C11DB7 | 31 | crc32_init_normal |
Left | 0 | 0 | crc32_000() |
crc32a | 0xFC891918 |
0x04C11DB7 | 31 | crc32_init_normal |
Left | 0 | 1 | crc32_001() |
no | 0x1898913F |
0x04C11DB7 | 31 | crc32_init_normal |
Left | 1 | 0 | crc32_010() |
no | 0x649C2FD3 |
0x04C11DB7 | 31 | crc32_init_normal |
Left | 1 | 1 | crc32_011() |
crc32b | 0xCBF43926 |
0x04C11DB7 | 31 | crc32_init_normal |
Right | 0 | 0 | crc32_100() |
no | 0x7EAD5C77 |
0x04C11DB7 | 31 | crc32_init_normal |
Right | 0 | 1 | crc32_101() |
no | 0xEE3AB57E |
0x04C11DB7 | 31 | crc32_init_normal |
Right | 1 | 0 | crc32_110() |
no | 0x0D0B7023 |
0x04C11DB7 | 31 | crc32_init_normal |
Right | 1 | 1 | crc32_111() |
no | 0xC40ED0B0 |
0x04C11DB7 | 1 | crc32_init_reflect |
Left | 0 | 0 | crc32_000() |
no | 0xC9A0B7E5 |
0x04C11DB7 | 1 | crc32_init_reflect |
Left | 0 | 1 | crc32_001() |
no | 0xA7ED0593 |
0x04C11DB7 | 1 | crc32_init_reflect |
Left | 1 | 0 | crc32_010() |
no | 0x9D594C04 |
0x04C11DB7 | 1 | crc32_init_reflect |
Left | 1 | 1 | crc32_011() |
no | 0x20329AB9 |
0x04C11DB7 | 1 | crc32_init_reflect |
Right | 0 | 0 | crc32_100() |
no | 0xFC4F2BE9 |
0x04C11DB7 | 1 | crc32_init_reflect |
Right | 0 | 1 | crc32_101() |
no | 0x97D4F23F |
0x04C11DB7 | 1 | crc32_init_reflect |
Right | 1 | 0 | crc32_110() |
no | 0xFDEFB72E |
0x04C11DB7 | 1 | crc32_init_reflect |
Right | 1 | 1 | crc32_111() |
no | 0x74EDF7BF |
0xEDB88320 | 31 | crc32_init_normal |
Left | 0 | 0 | crc32_000() |
no | 0x74EDF7BF |
0xEDB88320 | 31 | crc32_init_normal |
Left | 0 | 1 | crc32_001() |
no | 0xFDEFB72E |
0xEDB88320 | 31 | crc32_init_normal |
Left | 1 | 0 | crc32_010() |
no | 0x97D4F23F |
0xEDB88320 | 31 | crc32_init_normal |
Left | 1 | 1 | crc32_011() |
no | 0xFC4F2BE9 |
0xEDB88320 | 31 | crc32_init_normal |
Right | 0 | 0 | crc32_100() |
no | 0x20329AB9 |
0xEDB88320 | 31 | crc32_init_normal |
Right | 0 | 1 | crc32_101() |
no | 0x9D594C04 |
0xEDB88320 | 31 | crc32_init_normal |
Right | 1 | 0 | crc32_110() |
no | 0xA7ED0593 |
0xEDB88320 | 31 | crc32_init_normal |
Right | 1 | 1 | crc32_111() |
no | 0xC9A0B7E5 |
0xEDB88320 | 1 | crc32_init_reflect |
Left | 0 | 0 | crc32_000() |
no | 0xC40ED0B0 |
0xEDB88320 | 1 | crc32_init_reflect |
Left | 0 | 1 | crc32_001() |
no | 0x0D0B7023 |
0xEDB88320 | 1 | crc32_init_reflect |
Left | 1 | 0 | crc32_010() |
no | 0xEE3AB57E |
0xEDB88320 | 1 | crc32_init_reflect |
Left | 1 | 1 | crc32_011() |
no | 0x7EAD5C77 |
0xEDB88320 | 1 | crc32_init_reflect |
Right | 0 | 0 | crc32_100() |
crc32b | 0xCBF43926 |
0xEDB88320 | 1 | crc32_init_reflect |
Right | 0 | 1 | crc32_101() |
no | 0x649C2FD3 |
0xEDB88320 | 1 | crc32_init_reflect |
Right | 1 | 0 | crc32_110() |
no | 0x1898913F |
0xEDB88320 | 1 | crc32_init_reflect |
Right | 1 | 1 | crc32_111() |
crc32a | 0xFC891918 |
Legend:
- Bit = Which bit is tested during table initialization
- Shift CRC = Which direction the CRC value is shifted during calculation. Note that this is independent of which shift direction was used during initialization!
Looking at the Results
column from the table above we now have an alternative means to identity which CRC32 was used.
Source for crc32id.cpp:
#include "common.h"
const uint32_t CRC32Family[32]
{
// Polynomial reversed?
// Reflected?
// Shift Right?
// Data Reversed?
// CRC Reversed?
// Hash Id PRSDC
0xFC891918 // 0 [00000] 0x04C11DB7
,0x1898913F // 1 [00001]
,0x649C2FD3 // 2 [00010]
,0xCBF43926 // 3 [00011]
,0x7EAD5C77 // 4 [00100]
,0xEE3AB57E // 5 [00101]
,0x0D0B7023 // 6 [00110]
,0xC40ED0B0 // 7 [00111]
,0xC9A0B7E5 // 8 [01000]
,0xA7ED0593 // 9 [01001]
,0x9D594C04 // 10 [01010]
,0x20329AB9 // 11 [01011]
,0xFC4F2BE9 // 12 [01100]
,0x97D4F23F // 13 [01101]
,0xFDEFB72E // 14 [01110]
,0x74EDF7BF // 15 [01111]
,0x74EDF7BF // 16 [01000] 0xEDB88320
,0xFDEFB72E // 17 [01001]
,0x97D4F23F // 18 [01010]
,0xFC4F2BE9 // 19 [01011]
,0x20329AB9 // 20 [01100]
,0x9D594C04 // 21 [01101]
,0xA7ED0593 // 22 [01110]
,0xC9A0B7E5 // 23 [01111]
,0xC40ED0B0 // 24 [11000]
,0x0D0B7023 // 25 [11001]
,0xEE3AB57E // 26 [11010]
,0x7EAD5C77 // 27 [11011]
,0xCBF43926 // 28 [11100]
,0x649C2FD3 // 29 [11101]
,0x1898913F // 30 [11110]
,0xFC891918 // 31 [11111]
// 0 = 0x04C11DB7
// 1 = 0xEDB88320
// 0 = Normal
// 1 = Reflected
// 0 = Left
// 1 = Right
};
int CRC32Id(const uint32_t poly, const uint32_t hash)
{
const int nHash = (int) (sizeof(CRC32Family) / sizeof(CRC32Family[0]));
for( int iHash = 0; iHash < nHash; ++iHash )
{
// The last 16 hashes are reflected from the first 16
// thus we need to check if we have the first polynomial as well.
// The 4 forms have the polynomial stored in aPoly
if ((CRC32Family[iHash] == hash) && (poly == aPoly[iHash/8]))
return iHash;
}
return -1;
}
void CRC32Analysis(const uint32_t crc32id)
{
if (crc32id < 0)
{
printf( "Not a known CRC32 algorithm.\n" );
return;
}
int isNormal = (crc32id >> 4) & 1;
int isReflect = (crc32id >> 3) & 1;
int isShiftR = (crc32id >> 2) & 1;
int isRevData = (crc32id >> 1) & 1;
int isRevCRC = (crc32id >> 0) & 1;
const int iFunc = (crc32id >> 3) & 3;
const uint32_t *pData = aData[ iFunc ];
const char *pDesc = aDesc[ iFunc ];
const char *text = CRC32_CHECK_TXT;
const size_t length = strlen( text );
const char *aNoYes[2] = { "No ", "Yes" };
uint32_t crc = aFunc[ iFunc ]( pData, length, text );
printf( "Poly: %08X Reflect: %s Normal: %s Shift: %s, Rev. Data: %d, Rev. CRC: %d, 0x%08X\n"
, aPoly[ iFunc ]
, aNoYes[ isReflect ]
, aNoYes[ isNormal ]
, isShiftR
? "Right"
: "Left "
, isRevData
, isRevCRC
, crc
);
}
int main(int nArg, char *aArg[])
{
char buffer[16];
uint32_t poly = (nArg > 1) ? strtoul( aArg[1], 0, 16 ) : 0;
uint32_t hash = (nArg > 2) ? strtoul( aArg[2], 0, 16 ) : 0;
int id;
common_init();
if( !poly )
{
printf( "Using your favorite crc32()\n" );
printf( "1. Enter the 32-bit Polynomial (in hex) used:\n" );
if (fgets(buffer, sizeof(buffer), stdin))
sscanf( buffer, "%X", &poly );
}
if (!hash)
{
printf( "2. Enter the 32-bit Hash value (in hex) it generates for the string: %s\n", CRC32_CHECK_TXT );
if (fgets(buffer, sizeof(buffer), stdin))
sscanf(buffer, "%X", &hash);
}
id = CRC32Id( poly, hash );
printf( "CRCid: %d\n", id);
CRC32Analysis( id );
return 0;
}
TL:DR; CRC32 Summary
Here is a summary of the two forms of CRC32:
Description | Normal | Reflected |
---|---|---|
Polynomial bits reversed? | No | Yes |
Polynomial value | 0x04C11DB7 | 0xEDB88320 |
Polynomial nomenclature | Forward | Reverse |
Table initialization bit | Top-bit | Low-bit |
Table initialization bit test | 31 | 1 |
Table initialization bit shift | Left | Right |
Data bits reversed? | Yes | No |
CRC calculation shift | Left | Right |
Final CRC bits reversed? | Yes | No |
Correct Initialization Function | crc32_init_normal |
crc32_init_reflect |
Correct Calculation Function | crc32_011() |
crc32_100() |
CRC32 Checksum "123456789" | 0xCBF43926 |
CBF43926 |
CRC32 Hashing Confusion
Due to one person completely failing to understand CRC32 this misled other people to falsely conclude that CRC32 wouldn't make a good hash.
CRC32 was never intended for hash table use. There is really no good reason to use it for this purpose, and I recommend that you avoid doing so. If you decide to use CRC32, it's critical that you use the hash bits from the end opposite to that in which the key octets are fed in. Which end this is depends on the specific CRC32 implementation. Do not treat CRC32 as a "black box" hash function, and do not use it as a general purpose hash. Be sure to test each application of it for suitability.
Ironically, Bret Mulvey implemented a broken version of CRC32!
Original code:
public class CRC32 : HashFunction
{
uint[] tab;
public CRC32()
{
Init(0x04c11db7);
}
public CRC32(uint poly)
{
Init(poly);
}
void Init(uint poly)
{
tab = new uint[256];
for (uint i=0; i<256; i++)
{
uint t = i;
for (int j=0; j<8; j++)
if ((t & 1) == 0)
t >>= 1;
else
t = (t >> 1) ^ poly;
tab[i] = t;
}
}
public override uint ComputeHash(byte[] data)
{
uint hash = 0xFFFFFFFF;
foreach (byte b in data)
hash = (hash << 8)
^ tab[b ^ (hash >> 24)];
return ~hash;
}
}
While Bret's original hashing page is now dead ...
... Fortunately there are mirrors available:
NOTE: Bret has a new page but he siliently omits CRC32 due to his misunderstanding of CRC32.
This Stack Overflow question:
mentions Bret didn't implement CRC32 correctly, but no one actually says WHAT the bug(s) are!
We can inspect the:
- code
- data
to determine where Bret made his bugs.
Bad Code
Bret mismatched the table initialization with the CRC calculation.
- Incorrect table initialization.
Pardon the pun, but to re-hash:
CRC32 has two polynomials:
- The forward polynomial,
0x04C11DB7
, - The reverse polynomial,
0xEDB88320
, where the bits are reversed.
The CRC algorith comes in two forms:
- Normal initialization checks the top bit and shifts left,
- Reflected initialiation checks the bottom bit and shifts right.
Bret's Init()
used a forward polynomial mismatched with a 'reflected' bottom bit initialization.
- Incorrect CRC calculation
Bret's ComputeHash()
shifts left HOWEVER he didn't reverse the bits in both the data and final CRC value!
If we enumerate the possible ways someone could initialize the table with the forward polynomial 0x04C11DB7 we discover there are 8 crc values on the text "123456789". NONE of these are correct!
Reflected 0x04C11DB7: &1 >> [ 1] = rev. poly, [ 30] = rev.poly << 8 broken
Shift: Left , Rev. Data: 0, Rev. CRC: 0, 0xC9A0B7E5 no
Shift: Left , Rev. Data: 0, Rev. CRC: 1, 0xA7ED0593 no
Shift: Left , Rev. Data: 1, Rev. CRC: 0, 0x9D594C04 no
Shift: Left , Rev. Data: 1, Rev. CRC: 1, 0x20329AB9 no
Shift: Right, Rev. Data: 0, Rev. CRC: 0, 0xFC4F2BE9 no
Shift: Right, Rev. Data: 0, Rev. CRC: 1, 0x97D4F23F no
Shift: Right, Rev. Data: 1, Rev. CRC: 0, 0xFDEFB72E no
Shift: Right, Rev. Data: 1, Rev. CRC: 1, 0x74EDF7BF no
Why is that?
Why two forms?
You may be wondering why are there even two forms in the first place?
Back in the day when CRC was being first designed / implemented the hardware guys would shift bits out right-to-left using a barrel shifter.
When CRC was implemented in software someone noticed all this bit-reversal was unneccessary if they:
- reversed the polynomial
- reversed the reversed bytes
- reversed the reversed final CRC value
If we output a trace of how the crc32 is calculated using the tables ...
- Proper table for 0x04C11DB7
- Proper table for 0xEDB88320
... we get this output:
========== Bytes: 9 ==========
[ 31, 32, 33, 34, 35, 36, 37, 38, 39, ]
---------- Normal ----------
crc32=FFFFFFFF
^buf[0000]: 31 -> 8C bits reversed
= crc32[ 73 ]: 07BE5ED7 ^ FFFFFF__
crc32=1208C43E
^buf[0001]: 32 -> 4C bits reversed
= crc32[ 5E ]: 04D219C1 ^ 08C43E__
crc32=4CDD350D
^buf[0002]: 33 -> CC bits reversed
= crc32[ 80 ]: 04C11DB7 ^ DD350D__
crc32=B439EDEE
^buf[0003]: 34 -> 2C bits reversed
= crc32[ 98 ]: 017C56B6 ^ 39EDEE__
crc32=3AF83826
^buf[0004]: 35 -> AC bits reversed
= crc32[ 96 ]: 02A7B8C0 ^ F83826__
crc32=C7A3502C
^buf[0005]: 36 -> 6C bits reversed
= crc32[ AB ]: 00639B0D ^ A3502C__
crc32=7934B16F
^buf[0006]: 37 -> EC bits reversed
= crc32[ 95 ]: 0140D816 ^ 34B16F__
crc32=06693FF5
^buf[0007]: 38 -> 1C bits reversed
= crc32[ 1A ]: 00791D40 ^ 693FF5__
crc32=0AA4F8A6
^buf[0008]: 39 -> 9C bits reversed
= crc32[ 96 ]: 02A7B8C0 ^ A4F8A6__
crc32=9B63D02C
~=649C2FD3
=CBF43926 bits reversed
pass
---------- Reflect ==========
crc32=FFFFFFFF
^buf[0000]: 31
= crc32[ CE ]: 61043093 ^ __FFFFFF
crc32=7C231048
^buf[0001]: 32
= crc32[ 7A ]: CF3ECB31 ^ __7C2310
crc32=B0ACBB32
^buf[0002]: 33
= crc32[ 01 ]: 04C11DB7 ^ __B0ACBB
crc32=77B79C2D
^buf[0003]: 34
= crc32[ 19 ]: 6ED82B7F ^ __77B79C
crc32=641C1F5C
^buf[0004]: 35
= crc32[ 69 ]: 8E6C3698 ^ __641C1F
crc32=340AC5E3
^buf[0005]: 36
= crc32[ D5 ]: 065E2082 ^ __340AC5
crc32=F68D2C9E
^buf[0006]: 37
= crc32[ A9 ]: D3E6A601 ^ __F68D2C
crc32=AFFC9660
^buf[0007]: 38
= crc32[ 58 ]: 5E9F46BF ^ __AFFC96
crc32=651F2550
^buf[0008]: 39
= crc32[ 69 ]: 8E6C3698 ^ __651F25
crc32=340BC6D9
~=CBF43926
pass
Bad Data
Bret's code generates this bogus CRC table:
00000000, 06233697, 05C45641, 03E760D6, 020A97ED, 0429A17A, 07CEC1AC, 01EDF73B,
04152FDA, 0236194D, 01D1799B, 07F24F0C, 061FB837, 003C8EA0, 03DBEE76, 05F8D8E1,
01A864DB, 078B524C, 046C329A, 024F040D, 03A2F336, 0581C5A1, 0666A577, 004593E0,
:
052568B7, 03065E20, 00E13EF6, 06C20861, 072FFF5A, 010CC9CD, 02EBA91B, 04C89F8C,
009823B6, 06BB1521, 055C75F7, 037F4360, 0292B45B, 04B182CC, 0756E21A, 0175D48D,
048D0C6C, 02AE3AFB, 01495A2D, 076A6CBA, 06879B81, 00A4AD16, 0343CDC0, 0560FB57,
Which generates this "bad checksum" using the bogus CRC table:
---------- Mismatched Polynomial and Calculation ----------
crc32=FFFFFFFF
^buf[0000]: 31
= crc32[ 73 ]: 07BE5ED7 ^ FFFFFF__
crc32=FE449FAD
^buf[0001]: 32
= crc32[ B2 ]: 03FDE69B ^ 449FAD__
crc32=40E09BEC
^buf[0002]: 33
= crc32[ 8C ]: 02DEA580 ^ E09BEC__
crc32=E725B2D7
^buf[0003]: 34
= crc32[ CB ]: 0592C1D7 ^ 25B2D7__
crc32=259D5DD6
^buf[0004]: 35
= crc32[ 89 ]: 06F704FA ^ 9D5DD6__
crc32=9CF5B2DB
^buf[0005]: 36
= crc32[ F0 ]: 009823B6 ^ F5B2DB__
crc32=F3F2769A
^buf[0006]: 37
= crc32[ 1F ]: 0450BC3A ^ F2769A__
crc32=F21C8336
^buf[0007]: 38
= crc32[ EE ]: 02EBA91B ^ 1C8336__
crc32=1F32C140
^buf[0008]: 39
= crc32[ 83 ]: 07267D61 ^ 32C140__
crc32=365F481A
~=C9A0B7E5
FAIL
a) If he had actually followed his own advice ...
Do not treat CRC32 as a "black box" hash function,
... and inspected
- the checksum, and
- the table
to verify they were correct he would have noticed that all the high nibbles were set to zero! This is dilluting the effectiveness of the CRC32 hash!
Compare and constrast when the tables are correctly initialized:
- CRC Polynomial 0x04C11DB7:
00000000, 04C11DB7, 09823B6E, 0D4326D9, 130476DC, 17C56B6B, 1A864DB2, 1E475005,
2608EDB8, 22C9F00F, 2F8AD6D6, 2B4BCB61, 350C9B64, 31CD86D3, 3C8EA00A, 384FBDBD,
4C11DB70, 48D0C6C7, 4593E01E, 4152FDA9, 5F15ADAC, 5BD4B01B, 569796C2, 52568B75,
:
E3A1CBC1, E760D676, EA23F0AF, EEE2ED18, F0A5BD1D, F464A0AA, F9278673, FDE69BC4,
89B8FD09, 8D79E0BE, 803AC667, 84FBDBD0, 9ABC8BD5, 9E7D9662, 933EB0BB, 97FFAD0C,
AFB010B1, AB710D06, A6322BDF, A2F33668, BCB4666D, B8757BDA, B5365D03, B1F740B4,
- CRC32 Polynomial 0xEDB88320:
00000000, 77073096, EE0E612C, 990951BA, 076DC419, 706AF48F, E963A535, 9E6495A3,
0EDB8832, 79DCB8A4, E0D5E91E, 97D2D988, 09B64C2B, 7EB17CBD, E7B82D07, 90BF1D91,
1DB71064, 6AB020F2, F3B97148, 84BE41DE, 1ADAD47D, 6DDDE4EB, F4D4B551, 83D385C7,
:
AED16A4A, D9D65ADC, 40DF0B66, 37D83BF0, A9BCAE53, DEBB9EC5, 47B2CF7F, 30B5FFE9,
BDBDF21C, CABAC28A, 53B39330, 24B4A3A6, BAD03605, CDD70693, 54DE5729, 23D967BF,
B3667A2E, C4614AB8, 5D681B02, 2A6F2B94, B40BBE37, C30C8EA1, 5A05DF1B, 2D02EF8D,
Fixing Bret's Code
There two ways we could fix this depending on which polynomial we want to use.
-
Fix using the reverse polynomial
a) Table Initialization
This involves:
- Use the reverse polynomial
Init( 0xEDB88320 );
b) CRC Calculation
* Change the incorrect left-shift to right-shift:
hash = tab[ (b ^ hash) & 0xFF ] ^ ((hash >> 8) & 0xFFFFFF); // Clamp 'hash >> 8' to 24-bit
- Fix using the forward polynomial
a) Table initialization
This involves fixing the CRC initialization to
* set the initial crc value
* if the high-bit is set then shift-left and XOR the polynomial
for( int bite = 0; bite < 256; bite++ )
{
int crc = bite << 24;
for( int bit = 0; bit < 8; bit++ )
{
// optimized: if (crc & (1 << 31))
if (crc < 0)
{
crc <<= 1;
crc ^= POLY;
}
else
{
crc <<= 1;
}
}
tab[ bite ] = crc;
}
b) CRC Calculation
The data bytes must be bit-reversed
From:
hash = (hash << 8) ^ tab[ b ^ (hash >> 24)];
To:
hash = tab[ (reverse8[ b ] ^ (hash >> 24)) & 0xFF ] ^ (hash << 8);
Thus along with verifing the calculation was correct he would have come to a different conclusion about the validity of CRC32 as a hashing function.
TL:DR; Don't assume! VERIFY your code + data.
CRC32 or CRC33?
Technically, the CRC 32-bit constant 0x04C11DB7
is really a 33-bit constant 0x104C11DB7
which is classified as an IEEE-802 CRC.
See RFC 3385
Polynomial | Binary |
---|---|
0x04C11DB7 | 00000100_11000001_00011101_10110111 |
0x104C11DB7 | 1_00000100_11000001_00011101_10110111 |
Since 64-bit computing was hideously expensive at the time of when the CRC33 polynomial got truncated down to 32-bits without any loss in generality even though this gives different results then a pure 33-bit implementation:
Polynomial | 64-Bit reversed | 33-Bit reversed |
---|---|---|
0x104C11DB7 | 0xEDB8832080000000 | 0x1DB710641 |
TL:DR; Demo
#include <stdio.h>
#include <string.h>
#include <stdint.h>
uint32_t REVERSE[ 256 ]; // 8-bit reverse bit look-up table
unsigned int reflect32( const unsigned int x )
{
unsigned int bits = 0;
unsigned int mask = x;
for( int i = 0; i < sizeof(x)*8; i++ )
{
bits <<= 1;
if( mask & 1 )
bits |= 1;
mask >>= 1;
}
return bits;
}
uint32_t reverse32( const uint32_t x )
{
return 0
| REVERSE[ (x >> 24) & 0xFF ] << 0L
| REVERSE[ (x >> 16) & 0xFF ] << 8L
| REVERSE[ (x >> 8) & 0xFF ] << 16L
| REVERSE[ (x >> 0) & 0xFF ] << 24L;
}
void Reverse_Init()
{
for( int byte = 0; byte < 256; byte++ )
REVERSE[ byte ] = (reflect32( byte ) >> 24) & 0xFF;
}
const unsigned int CRC32_POLY = 0x04C11DB7; // normal = shift left
const unsigned int CRC32_REVERSE = 0xEDB88320; // reverse = shift right
/* */ unsigned int CRC32_FORWARD[256]; // init with poly = 0x04C11DB7
/* */ unsigned int CRC32_REFLECT[256]; // init with poly = 0xEDB88320
void CRC32_Init()
{
for( unsigned short byte = 0; byte < 256; byte++ )
{
unsigned int crc = (unsigned int) byte;
for( char bit = 0; bit < 8; bit++ )
if( crc & 1 ) crc = (crc >> 1) ^ CRC32_REVERSE; // reverse/reflected Form
else crc = (crc >> 1);
CRC32_REFLECT[ byte ] = crc;
}
for( unsigned short byte = 0; byte < 256; byte++ )
{
unsigned int crc = (unsigned int) byte << 24;
for( char bit = 0; bit < 8; bit++ )
if( crc & (1L << 31)) crc = (crc << 1) ^ CRC32_POLY; // forward Form
else crc = (crc << 1);
CRC32_FORWARD[ byte ] = crc;
}
if (CRC32_REFLECT[8] != (CRC32_REVERSE >> 4))
printf("ERROR: CRC32 Reverse Table not initialized properly! 0x%08X != 0x%08X\n", CRC32_REFLECT[8], CRC32_REVERSE );
if (CRC32_FORWARD[1] != CRC32_POLY)
printf("ERROR: CRC32 Forward Table not initialized properly! 0x%08x != 0x%08X\n", CRC32_FORWARD[1], CRC32_POLY );
if (CRC32_FORWARD[16] != (CRC32_POLY << 4))
printf("ERROR: CRC32 Forward Table not initialized properly! 0x%08x != 0x%08X\n", CRC32_FORWARD[16], CRC32_POLY );
}
unsigned int crc32_reverse( const unsigned char *pData, int nLength )
{
unsigned int crc = -1; // CRC32_INIT
while( nLength --> 0 )
crc = CRC32_REFLECT[ (crc ^ *pData++) & 0xFF ] ^ (crc >> 8); // reverse/reflected form
return ~crc; // ^CRC32_DONE
}
unsigned int crc32_forward( const unsigned char *pData, int nLength )
{
unsigned int crc = -1; // CRC32_INIT
while( nLength --> 0 )
crc = CRC32_FORWARD[ ((crc >> 24) ^ REVERSE[*pData++]) & 0xFF ] ^ (crc << 8); // normal form
return reflect32(~crc); // ^CRC32_DONE
}
int main()
{
CRC32_Init();
Reverse_Init();
const char *STATUS[2] = { "FAIL", "pass" };
const char *data = "123456789";
const int len = strlen(data);
unsigned int crc = crc32_reverse( (unsigned char*) data, len );
unsigned int nor = crc32_forward( (unsigned char*) data, len );
printf( "Input: %s\n", data );
printf( "Len: %d\n", len );
printf( "\n" );
int pass1 = (crc == 0xCBF43926); // "123456789" -> 0xCBF43926
printf( "Reflect CRC: 0x%08X\n", crc );
printf( "Status: %s\n", STATUS[ pass1 ] );
printf( "\n" );
int pass2 = (nor == 0xCBF43926);
printf( "Normal CRC: 0x%08X\n", nor );
printf( "Status: %s\n", STATUS[ pass2 ] );
printf( "\n" );
return 0;
}
References
-
"A PAINLESS GUIDE TO CRC ERROR DETECTION ALGORITHMS" http://www.ross.net/crc/download/crc_v3.txt
-
"Reversing CRC β Theory and Practice." http://stigge.org/martin/pub/SAR-PR-2006-05.pdf
-
"Hackers Delight", Chapter 14 http://www.hackersdelight.org/crc.pdf
-
Bret Mulvey https://web.archive.org/web/20130420172816/http://home.comcast.net/~bretm/hash/8.html http://archive.is/6rBZF#selection-251.11-251.21 http://home.comcast.net/~bretm/hash/8.html
-
http://stackoverflow.com/questions/26049150/calculate-a-32-bit-crc-lookup-table-in-c-c
-
http://stackoverflow.com/questions/10953958/can-crc32-be-used-as-a-hash-function
-
"Fastest CRC32 for x86, Intel and AMD, + comprehensive derivation and discussion of various approaches" https://github.com/komrad36/CRC