Error Detection and Control in Data Transfer
22. 11. 1998
Erno Alhoniemi
Ti
Helsinki University of Technology
eal@cc.hut.fi
Abstract
Today's data is transfered over reliable communications. This is achieved
by using error detection and control. There are different communications
media with different quality of service (QOS). Consequently, there are
several error detection and control schemes for different signal conditions.
Some fundamental methods have existed for decades. In addition, there are
also new methods that can take advantage of the nature of the signal type.
These methods can be used to ensure reliable or adequate communications
and faster data transfer rates which are used in today's innovative applications.
1. Error sources
No errors can occur in the ideal transmission medium. However, none of
the transmission media is ideal. The signal representing the data is always
subject to various error sources. As the signal propagates along the transmission
media its amplitude decreases. This phenomenon is called as the signal
attenuation. The signal cannot be detected if it is too weak. In addition,
as the length of the medium increases the waveform also changes during
the transmission. This phenomenon is called as the delay distortion. The
signal cannot be recognized if it is too distorted. Furthermore, the transmission
media can also be a subject to interference resulting from other cables
or signals caused by electromagnetic radiation. The medium itself may also
cause constant white noise. All transmission errors increase as the length
of the transmission medium inreases. [6]
2 Error bursts
In practice, data communications systems are designed so that the transmission
errors are within acceptable rate. Under normal circumstanced there are
only few errors. However, it is possible that the signal conditions can
be sometimes so weak that sometimes the signal cannot be received at all.
It is also possible that sometimes the interference signal is stronger
than the signal to be transmitted. Consequently, the data sent during the
break is lost. The contiguous blocks of data corrupted by the error signal
are called error bursts. The length and frequency of the error bursts depend
on the quality of the data link which in turn depends on the transmission
medium and the signal conditions. Therefore, the error detection and control
should be able to handle as many errors that possible. However, the applications
limit which error detection and control schemes are suitable. All applications
benefit from the effeciency of the sheme to be used. In addition, some
applications may require short codeword length or low latency whereas some
other applications might prefer extremere low error rate, for example.
3 Error detection
Error detection is a method that allows some sommunications erros to be
detected. The data is encoded so that the encoded data contains additional
redundant information about the data. The data is decoded so that the additional
redundant information must match the original information. This allows
some errors to be detected. Unfortunately, some error bursts may cause
incorrectly received blocks which pass the error detection test. Therefore,
good error detection schemes are designed so that this should occur as
infrequently as possible.
3.1 Parity checking
Parity checking is a primitive character-based error detection method.
The characters are encoded so that an additional bit is added to each character.
The additional bit is to 0 or 1 according to the number of bits set in
the character. The resulting number is either even or odd. There extra
bit is set according to this result and according to which parity setting,
either even or odd, is being used. If even parity is used, the extra bit
is always set so that the codeword always contains an even number of
bits set. In even parity, the number of bits set in the codeword is always
odd. The decoding is done simply by checking the codeword and removing
the extra bit. The parity checking will only detect one bit error bursts
in each codeword. Parity checking has been used in character-based terminals
but it is not useful for today's reliable communications. However, it is
being used in memory chips to ensure correct operation.
3.2 Block check
Block check is a block-based error detection method. The data is divided
in blocks in the encoding process. An additional block check is added to
each block of data. The check is calculated from the current block. The
receiver also performs the same calculation on the block and compares the
calculated result with the received result. If these checks are equal the
blocks are likely to be valid. Unfortunately, the problem with all block
checks is that the block check is shorter than than the block. Therefore,
there are several different blocks that all have the same checksum. It
is possible that the date is corrupted by a random error burst that modifies
the block contents so that the block check in the corrupted frame also
matches the corrupted data. In this case the error is not detected. Even
best block checks cannot detect all error bursts but good block checks
mimimize this propability. However, the reliability increases as the length
of the block check increases.
3.2.1 Block check sum
Block check sum is a primitive block check sum that is the sum of all characters
in the block. The result is a character that is equally long as the characters
in the block. Therefore, the result is sometimes referred as the block
check character (BCC). Unfortunately, even a long BCC may allow relatively
simple errors. In other words, it is easy to find different blocks that
generate the same block check sum. Calculating check sum is certainly fast
and easy but the reliability of the check sum is not adequate for today's
reliable communications. However, due to its speed it is used in some applications
which require that the calculation is done by the software. [6]
3.2.2 Cyclic redundancy check
The Cyclic redundancy check (CRC) is an intelligent alternative for block
check sum. It is calculated by dividing the bit string of the block by
a generator polynomial. The value of the cyclic redundancy check is the
reminder of the calculation which is one bit shorter than the generator
polynomial. This value is also sometimes referred as the frame check sequence
(FCS). However, the generator polynomial must be chosen carefully. CRC
is a stronger check than the block check sum and it is being used in today's
reliable communication. Calculating the CRC requires slighty more processing
than the check sum. It can be easily implemented by using shift registers
and xor-operations both in hardware and software implementations. The most
common CRC is the CCITT CRC-16. The CRC is able to detect all single error
bursts up to the number of bits in the CRC and most random errors [6].
Unfortunately, the error bursts in many applications are random and the
CRC can only find only most of them. Therefore, a random error burst on
the block might sometimes match both the original and the corrupted data.
This is the case especially when CRC-16 is being used. Therefore, the number
of bits in the checksum should be as high as possible or at least high
enough. CRC-32 is enough for many todays applications. It has been tested
that as the number of bits in the CRC increases the propability that two
different blocks with the same CRC are encountered during the data transmission
approaches zero [4]. The test results show that the even CRC-48 can be
enough to ensure than no two different blocks having the same CRC can be
found in practice [4].
4 Error Control
Error control is a method that can be used to recover the corrupted data
whenever possible. There are two basic types of error control which
are backward error control and forward error control. In backward error
control, the data is encoded so that the encoded data contains additional
redundant information which is used to detect the corrupted blocks of data
that must be resent. On the contrary, in forward error control (FEQ), the
data is encoded so that it contains enough redundant information to recover
from some communications errors. [6]
4.1 Backward error control
Using backward error correction requires a two-way communiction channel.
The sender divides the data in blocks encodes the data with redundan additional
information that is used to detect communications errors. The receiver
applies error detection and if the receiver detects errors in the incoming
blocks it requests the sender to resend the block. This mechanism is also
called as the automatic repeat request (ARQ). The ARQ can always repair
any errors it can detect but it causes a variable delay on the data transfer.
The basic types of ARQ are idle RQ and continuous RQ. The backward error
control is used in many data tranfer protocols. In addition, in order to
use data compression, the communications errors must always be corrected.
[6]
4.1.1 Idle RQ
Idle RQ is a fundamental backward correction scheme used in many protocols.
The data is transfered in packets by using error detection. The receiver
checks the incoming packets and sends an acknowledgement (ACK) to the sender
if the packet was valid. If the sender receives an acknowledgement in the
specified time it sends the next packet to the receiver. Otherwise, the
sender must resend the packet. Idle RQ is very simple but it is often too
inefficient. It can only send data to one direction at a time. In addition,
the delay on the data transfer may result in situation where only a small
fraction of the capapacity of the communications link is used. [6]
4.1.2 Continuous RQ
Continuous RQ is an improvement over Idle RQ when there is a delay on the
data transfer. It allows several packets to be sent continuously. Therefore,
the sender must use packet numbering. The receiver also receives packets
continuously and sends an acknowledgement containing a packet number after
receiveing a valid packet. In case the sender cannot get an acknowledgement
it starts resending packets in either of the two ways. In selective repeat,
only each block that was corrupted is resent. Selective repeat is complex
but it is useful when error are common. In go-back-n, once a corrupted
block is detected, the transmission continues from the corrupted block
and all blocks after the corrupted blocks are discarded. Go-back-n is less
effective than selective repeat but it is also very simple and it is almost
equally effective if the errors are infrequent. [6]
4.1 Forward error control
Using FEQ requires a one-directional channel only. The data is encoded
to contain enough additional redundant information to receover from some
communication errors. Unfortunately, the FEQ cannot recover from errors
only when enough information has been succesfully received. There is no
way to recover from errors when this is not the case. However, the FEQ
operates continuously without any interrupts and it ensures constant delay
on the data transfer which is useful for real-time applications. [6]
4.1.1 Hamming single-bit code
Hamming single-bit is a block code in which each block is separate from
each other. The input block size can be made as small as necessary. The
number of bit errors to be corrected can be specified by using a adding
enough redundant information in the encoding process. The minimum number
of bits that differ on all possible codewords is called the Hamming distance
[8]. This yields that the error bursts shorter than the Hamming distance
can be detected. Therefore, to detect communications errors, the Hamming
distance of the line code must be longer than the length of the error bursts.
A N-bit error requires an encoding with a Hamming distance of N+1
for detection and 2*N+1 for recovery. [6]
4.1.2 Convolutional forward error correction
In block codes, each block is independent of other blocks. On the contrary,
in the convolutional forward error correction, the encoded data depends
on both the current data and the previous data. The convolutinal encoder
contains a shift register that is shifted each time a new bit is added.
The length of the shift register is called as the constraint length and
it contains the memory of the encoder. Each new input bit is then encoded
with each bit in the shift register by using modulo-2 adders. The decoding
is more difficult than the encoding. The data is decoded by using the Viterbi
algorithm that tries to find the best solution for the decoding. Of course,
all errors cannot be corrected but the error rate can be decreased. The
convolutional error correction has an advantage of using all previous correctly
received bits for error correction. [1,6]
4.1.3 Golay forward error correction
Golay codes are block codes that allow short codewords. The perfect Golay
code is an encoding that encodes 12 bits into 23 bits, denoted by (23,
12). It allows the correction of three or fewer single bit errors. The
extended Golay code contains an additional parity bit which allows up to
four errors to be detected. The resulting code is (24,12) which is also
known as half-rate Golay code. The decoding can be performed by using either
soft or hard decisions. The soft decisions provide better error correction
but requiere more processing. Golay codes are useful in applications that
require low latency and short codeword length. Therefore, Golay codes are
used in real-time applications and radio communications. [2,5,7]
4.1.4 Reed-Solomon forward error correction with interleaving
The Reed-Solomong forward error correction with interleaving is a forward
error correction scheme that is intended to be used with high-quality video
communications. The encoding is performed by filling a two dimensional
array of 128*47 octets, that is, 128 octets in each row containing 124
octets of data and 4 octets of redundant check data. The encoding is done
by filling the buffer columnwise 47 octets at a time. After this has been
repeated 124 times the buffer is full and it is encoded by writing each
row at a time. This encoding allows two cells to be corrected or 4 cells
to be reconstructed. Two interleave buffers are required because a single
buffer can only be either read or written at a time. The decoder also needs
two buffers for the same reason. The encoder writes a row at a time, then
performs the possible recovery and reconstruction of defective cells and
reads the array columnwise. Unfortunately, the encoding and decoding both
cause an additional delay on the data transfer that is equal to the transmission
time of a single buffer. This type of encoding does not completely repair
all error but it ensures hight-quality throughput in real time. [3]
5 Conclusion
The conventional error detection and correction methods are still being
used is conventional application. Today's services can be implemented by
using conventional methods. However, new methods have been discovered to
increase the data transfer rate on some applications. These methods can
certainly take advantage of the nature of the signal in a way that could
not be achieved by using conventional methods. These new techniques are
necessary for tomorrow's multimedia applications especially as the amount
of the IP-based and mobile network traffic increases.
6 References
[1] 4i2i Communications Ltd, 4i2i: Convolutional Coding, 15. 9. 1998 [ref.
19. 11. 1998]
<http://www.4i2i.com/viterbi.htm>
[2] 4i2i Communications Ltd, 4i2i: Golay Code, 14. 9. 1998 [ref. 19.
11. 1998]
<http://www.4i2i.com/golay.htm>
[3] Cellware Broadband, AAL1 Reed-Solomon FEC submodule specification,
21. 8. 1996 [ref. 19. 11. 1998]
<http://www.cellware.de/boards/aal1fec.html>
[4] Dillon, Matt, CRC16 - CRC64 test results on 18.2M dataset, w/program
source, 14. 11. 1997 [ref. 27. 9. 1998]
<http://www.backplane.com/diablo/crc64.html>
[5] EDN Magazine, Communication-systems error simulation resolves trade-offs,
18. 6. 1998 [ref. 19. 11. 1998]
<http://www.ednmag.com/reg/1998/061898/13df_03.cfm>
[6] Halsal, Fred, Data Communication, Computer Networks and Open Systems,
Fourth Edition, Addison-Wesley Publishhing Company Inc., United States
of America, 1996, 997 sivua, [ref. 27. 9. 1998]
[7] Kantronics Co., Inc., G-TOR, 2. 11. 1998, [ref. 19. 11. 1998]
<http://kantronics.com/gtor.htm>
[8] Togneri, Roberto, Hamming Distance, 8. 5. 1995 [ref. 27. 9. 1998]
<http://ciips.ee.uwa.edu.au/~roberto/units/itc314/help/node21.html>
7 Further information
Overview
of the Global System for Mobile Communications
Overview of GSM