Performance of Checksums and CRCs over Real Data


Author(s) : Craig Partridge, Michael Greenwald, Jonathan Stone, Jim Hughes
Publisher : N/A
Publication Date : 1998
ISSN : N/A
Abstract : Checksum and CRC algorithms have historically been studied under the assumption that the data fed to the algorithms was uniformly distributed. This paper examines the behavior of checksums and CRCs over real data from various UNIX file systems. We show that, when given real data in small to modest pieces (e.g., 48 bytes), all the checksum algorithms have skewed distributions. These results have implications for CRCs and checksums when applied to real data. They can also cause a spectacular failure rate for both the TCP and ones-complement Fletcher checksums when trying to detect certain types of packet splices. When measured over several large file systems, the 16-bit TCP checksum performed about as well as a 10-bit CRC. We show that for fragmentation-and-reassembly error models, the checksum contributions of each fragment are, in effect, colored by the fragment's offset in the splice. This coloring explains the performance of Fletcher's sum on non-uniform data, and shows that placing checksum fields in a packet trailer is theoretically no worse than a header checksum field. In practice, TCP trailer sums outperform even Fletcher header sums.
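
For readers unfamiliar with the two checksums named in the abstract, the following is a minimal sketch, not the authors' code. It assumes the TCP checksum is the 16-bit ones-complement sum described in RFC 1071 and that the ones-complement Fletcher checksum keeps two running byte sums modulo 255. The function names and the two 48-byte test cells are hypothetical, and swapping two whole cells is a deliberately simplified stand-in for the packet splices the paper studies.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones-complement sum of 16-bit words, complemented (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def fletcher16_ones_complement(data: bytes) -> tuple[int, int]:
    """Return the two running sums (a, b) of a byte-wise Fletcher sum mod 255."""
    a = b = 0
    for byte in data:
        a = (a + byte) % 255  # plain sum of the bytes
        b = (b + a) % 255     # sum of the running sums, so position matters
    return a, b


# Hypothetical 48-byte cells (assumed data, chosen only for illustration).
cell_a = bytes(range(48))
cell_b = bytes(range(100, 148))
in_order = cell_a + cell_b
swapped = cell_b + cell_a

print(internet_checksum(in_order) == internet_checksum(swapped))
print(fletcher16_ones_complement(in_order), fletcher16_ones_complement(swapped))
```

Because ones-complement addition is commutative and the cells have even length, the first sketch cannot distinguish the two orderings, whereas the second running sum in the Fletcher sketch weights each byte by its position and so changes when the cells are swapped. This is only meant to illustrate why a fragment's offset matters for Fletcher's sum; it does not reproduce the paper's measurements on real file-system data.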