teknik informatika: DATA ENCODING

DATA ENCODING

A distinction was made between analog and digital data, and analog

and digital signals. Figure 2.13 suggested that either form of data could be

encoded into either form of signal.

Figure 4.1 is another depiction that emphasizes the process involved. For digital

signaling, a data source g(t), which may be either digital or analog, is encoded

into a digital signal x(t). The actual form of x(t) depends on the encoding technique,

and is chosen to optimize use of the transmission medium. For example, the encoding

may be chosen to either conserve bandwidth or to minimize errors.

The basis for analog signaling is a continuous, constant-frequency signal

known as the carrier signal. The frequency of the carrier signal is chosen to be compatible

with the transmission medium being used. Data may be transmitted using a

carrier signal by modulation. Modulation is the process of encoding source data

onto a carrier signal with frequency fc. All modulation techniques involve operation

on one or more of the three fundamental frequency-domain parameters:

* Amplitude

* Frequency

* Phase

The input signal m(t) may be analog or digital and is called the modulating

signal, or baseband signal. The result of modulating the carrier signal is called the

modulated signal s(t). As Figure 4.1b indicates, s(t) is a bandlimited (bandpass) signal.

The location of the bandwidth on the spectrum is related to fc and is often

centered on fc. Again, the actual form of the encoding is chosen to optimize some

characteristic of the transmission.

Each of the four possible combinations depicted in Figure 4.1 is in widespread

use. The reasons for choosing a particular combination for any given communication

task vary. We list here some representative reasons:

* Digital data, digital signal. In general, the equipment for encoding digital data

into a digital signal is less complex and less expensive than digital-to-analog

modulation equipment.

* Analog data, digital signal. Conversion of analog data to digital form permits

the use of modern digital transmission and switching equipment. The advantages

of the digital approach were outlined in Section 2.2.

* Digital data, analog signal. Some transmission media, such as optical fiber and

the unguided media, will only propagate analog signals.

* Analog data, analog signal. Analog data in electrical form can be transmitted

as baseband signals easily and cheaply; this is done with voice transmission

over voice-grade lines. One common use of modulation is to shift the bandwidth

of a baseband signal to another portion of the spectrum. In this way,

multiple signals, each at a different position on the spectrum, can share the

same transmission medium; this is known as frequency-division multiplexing.

We now examine the techniques involved in each of these four combinations

and then look at spread spectrum, which fits into several categories.

DIGITAL DATA, DIGITAL SIGNAL

A digital signal is a sequence of discrete, discontinuous voltage pulses. Each pulse

is a signal element. Binary data are transmitted by encoding each data bit into signal

elements. In the simplest case, there is a one-to-one correspondence between

bits and signal elements. An example is shown in Figure 2.15, in which binary 0 is

represented by a lower voltage level and binary 1 by a higher voltage level. As we

shall see in this section, a variety of other encoding schemes are also used.

First, we define some terms. If the signal elements all have the same algebraic

sign, that is, all positive or negative, then the signal is unipolar. In polar signaling,

one logic state is represented by a positive voltage level, and the other by a negative

voltage level. The data signaling rate, or just data rate, of a signal is the rate, in

bits per second, that data are transmitted. The duration or length of a bit is the

amount of time it takes for the transmitter to emit the bit; for a data rate R, the bit

duration is 1/R. The modulation rate, in contrast, is the rate at which signal level is

changed; this will depend on the nature of the digital encoding, as explained below.

The modulation rate is expressed in bauds, which means signal elements per second.

Finally, the terms mark and space, for historical reasons, refer to the binary digits 1

and 0, respectively. Table 4.1 summarizes key terms; these should be clearer when

we see an example later in this section.

The tasks involved in interpreting digital signals at the receiver can be summarized

by again referring to Figure 2.15. First, the receiver must know the timing

of each bit. That is, the receiver must know with some accuracy when a bit begins

and ends. Second, the receiver must determine whether the signal level for each bit

position is high (1) or low (0). In Figure 2.15, these tasks are performed by sampling

each bit position in the middle of the interval and comparing the value to a threshold.

Because of noise and other impairments, there will be errors, as shown.
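The two receiver tasks just described (locating each bit interval, then deciding high or low) can be sketched in a few lines. This is only an illustration under assumed values: the function name `decide_bits`, the 0 V / 5 V levels, the 2.5 V threshold, and the four samples per bit are all hypothetical and do not come from Figure 2.15.

```python
def decide_bits(samples, samples_per_bit, threshold):
    """Sample the middle of each bit interval and compare it to a threshold."""
    bits = []
    for start in range(0, len(samples) - samples_per_bit + 1, samples_per_bit):
        mid_sample = samples[start + samples_per_bit // 2]
        bits.append(1 if mid_sample > threshold else 0)
    return bits


# Hypothetical received waveform for the transmitted bits 1, 0, 1.
# A noise dip (2.0 V) lands on the mid-bit sample of the third bit.
received = [5.1, 4.8, 5.0, 4.9,
            0.2, -0.1, 0.1, 0.0,
            4.7, 5.2, 2.0, 5.0]

print(decide_bits(received, samples_per_bit=4, threshold=2.5))
```

In this made-up waveform the third bit is decided as 0 although a 1 was sent, which is the kind of noise-induced error the paragraph refers to.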

What factors determine how successful the receiver will be in interpreting the

incoming signal? We saw in Lesson 2 that three factors are important: the signal-to-noise

ratio (or, better, Eb/N0), the data rate, and the bandwidth. With other factors

held constant, the following statements are true:

* An increase in data rate increases bit error rate (the probability that a bit is

received in error).

* An increase in S/N decreases bit error rate.

* An increase in bandwidth allows an increase in data rate.

There is another factor that can be used to improve performance, and that is

the encoding scheme: the mapping from data bits to signal elements. A variety of

encoding schemes are in use. In what follows, we describe some of the more common

ones; they are defined in Table 4.2 and depicted in Figure 4.2.

Before describing these techniques, let us consider the following ways of evaluating

or comparing the various techniques.

* Signal spectrum. Several aspects of the signal spectrum are important. A lack

of high-frequency components means that less bandwidth is required for

transmission. In addition, lack of a direct-current (dc) component is also

desirable. With a dc component to the signal, there must be direct physical

attachment of transmission components; with no dc component, ac-coupling

via transformer is possible; this provides excellent electrical isolation, reducing

interference. Finally, the magnitude of the effects of signal distortion and

interference depends on the spectral properties of the transmitted signal. In

practice, it usually happens that the transfer function of a channel is worse

near the band edges. Therefore, a good signal design should concentrate the

transmitted power in the middle of the transmission bandwidth. In such a

case, a smaller distortion should be present in the received signal. To meet this

objective, codes can be designed with the aim of shaping the spectrum of the

transmitted signal.

* Clocking. We mentioned the need to determine the beginning and end of each

bit position. This is no easy task. One rather expensive approach is to provide

a separate clock-lead to synchronize the transmitter and receiver. The alternative

is to provide some synchronization mechanism that is based on the

transmitted signal; this can be achieved with suitable encoding.

* Error detection. We will discuss various error-detection techniques in Lesson

6 and show that these are the responsibility of a layer of logic above the

signaling level known as data link control. However, it is useful to have some

error-detection capability built into the physical signaling-encoding scheme;

this permits errors to be detected more quickly.

* Signal interference and noise immunity. Certain codes exhibit superior performance

in the presence of noise. This ability is usually expressed in terms of

a bit error rate.

* Cost and complexity. Although digital logic continues to drop in price,

expense should not be ignored. In particular, the higher the signaling rate to

achieve a given data rate, the greater the cost. We will see that some codes

require a signaling rate that is, in fact, greater than the actual data rate.

We now turn to a discussion of various techniques.

Nonreturn to Zero (NRZ)

The most common, and easiest, way to transmit digital signals is to use two different

voltage levels for the two binary digits. Codes that follow this strategy share the

property that the voltage level is constant during a bit interval; there is no transition

(no return to a zero voltage level). For example, the absence of voltage can be used

to represent binary 0, with a constant positive voltage used to represent binary 1.

More commonly, a negative voltage is used to represent one binary value and a positive

voltage is used to represent the other. This latter code, known as Nonreturn-to-Zero-Level

(NRZ-L), is illustrated in Figure 4.2. NRZ-L is generally the code

used to generate or interpret digital data by terminals and other devices. If a different

code is to be used for transmission, it is typically generated from an NRZ-L

signal by the transmission system. (In terms of Figure 1.2, NRZ-L is g(t) and the

encoded signal is s(t).)

A variation of NRZ is known as NRZI (Nonreturn to zero, invert on ones).

As with NRZ-L, NRZI maintains a constant voltage pulse for the duration of a bit

time. The data themselves are encoded as the presence or absence of a signal transition

at the beginning of the bit time. A transition (low-to-high or high-to-low) at

the beginning of a bit time denotes a binary 1 for that bit time; no transition indicates

a binary 0.
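The two NRZ variants can be sketched as follows. This is a minimal illustration, not a reference implementation: the levels +1 and -1, the function names, the choice that NRZ-L maps 1 to the higher level, and the NRZI starting level are all assumptions made for the example (real systems may use the opposite polarity).

```python
def nrz_l(bits, one_level=1, zero_level=-1):
    """NRZ-L: each bit is sent as one constant level for the whole bit time.

    The polarity (1 -> +1, 0 -> -1) is an assumption for this sketch.
    """
    return [one_level if b else zero_level for b in bits]


def nrzi(bits, start_level=-1):
    """NRZI: a transition at the start of a bit time means 1, no transition means 0.

    start_level is the (assumed) line level just before the first bit.
    """
    level = start_level
    out = []
    for b in bits:
        if b:
            level = -level  # invert on ones
        out.append(level)
    return out


data = [0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1]
print("NRZ-L:", nrz_l(data))
print("NRZI: ", nrzi(data))
```

Note that in the NRZI output the level itself carries no meaning; only whether it changed from the previous bit time does.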

NRZI is an example of differential encoding. In differential encoding, the signal

is decoded by comparing the polarity of adjacent signal elements rather than

determining the absolute value of a signal element. One benefit of this scheme is

that it may be more reliable to detect a transition in the presence of noise than to

compare a value to a threshold. Another benefit is that with a complex transmission

layout, it is easy to lose the sense of the polarity of the signal. For example, on a

multidrop twisted-pair line, if the leads from an attached device to the twisted pair

are accidentally inverted, all 1s and 0s for NRZ-L will be inverted; this cannot happen

with differential encoding.
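The polarity argument can be checked with a small experiment. The sketch below is an illustration under assumed conventions: levels are +1 and -1, the NRZI line signal is stored together with the idle level that precedes the first bit, and all function names are hypothetical.

```python
def nrzi_encode(bits, idle_level=-1):
    """Return the line levels, starting with the idle level before the first bit."""
    levels = [idle_level]
    for b in bits:
        levels.append(-levels[-1] if b else levels[-1])
    return levels


def nrzi_decode(levels):
    """Differential decoding: compare each signal element with the previous one."""
    return [1 if cur != prev else 0 for prev, cur in zip(levels, levels[1:])]


def nrz_l_encode(bits):
    return [1 if b else -1 for b in bits]  # assumed polarity: 1 -> +1


def nrz_l_decode(levels):
    return [1 if lv > 0 else 0 for lv in levels]


data = [1, 0, 1, 1, 0, 0, 1]

# Simulate accidentally swapped leads: every level on the line is inverted.
swapped_nrzi = [-lv for lv in nrzi_encode(data)]
swapped_nrz_l = [-lv for lv in nrz_l_encode(data)]

print("NRZI  with swapped leads:", nrzi_decode(swapped_nrzi))
print("NRZ-L with swapped leads:", nrz_l_decode(swapped_nrz_l))
```

With the leads swapped, the NRZI decoder still recovers the original bits, while the NRZ-L decoder returns every bit inverted.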

The NRZ codes are the easiest to engineer and, in addition, make efficient use

of bandwidth. This latter property is illustrated in Figure 4.3, which compares the

spectral density of various encoding schemes. In the figure, frequency is normalized

to the data rate. As can be seen, most of the energy in NRZ and NRZI signals is

between dc and half the bit rate. For example, if an NRZ code is used to generate

a signal with a data rate of 9600 bps, most of the energy in the signal is concentrated

between dc and 4800 Hz.

The main limitations of NRZ signals are the presence of a dc component and

the lack of synchronization capability. To picture the latter problem, consider that

with a long string of 1s or 0s for NRZ-L, or a long string of 0s for NRZI, the output

is a constant voltage over a long period of time. Under these circumstances, any

drift between the timing of transmitter and receiver will result in a loss of synchronization

between the two.

Because of their simplicity and relatively low frequency response characteristics,

NRZ codes are commonly used for digital magnetic recording. However, their

limitations make these codes unattractive for signal transmission applications.

Multilevel Binary

A category of encoding techniques known as multilevel binary addresses some of the

deficiencies of the NRZ codes. These codes use more than two signal levels. Two

examples of this scheme are illustrated in Figure 4.2: bipolar-AMI (alternate mark

inversion) and pseudoternary.

In the case of the bipolar-AMI scheme, a binary 0 is represented by no line

signal, and a binary 1 is represented by a positive or negative pulse. The binary 1

pulses must alternate in polarity. There are several advantages to this approach.

First, there will be no loss of synchronization if a long string of 1s occurs. Each 1

introduces a transition, and the receiver can resynchronize on that transition.

A long string of 0s would still be a problem. Second, because the 1 signals alternate

in voltage from positive to negative, there is no net dc component. Also, the

bandwidth of the resulting signal is considerably less than the bandwidth for NRZ

(Figure 4.3). Finally, the pulse-alternation property provides a simple means of

error detection. Any isolated error, whether it deletes a pulse or adds a pulse,

causes a violation of this property.

The comments of the previous paragraph also apply to pseudoternary. In this

case, it is the binary 1 that is represented by the absence of a line signal, and the

binary 0 by alternating positive and negative pulses. There is no particular advantage

of one technique over the other, and each is the basis of some applications.
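Both multilevel binary codes, and the error check that the pulse-alternation property makes possible, can be sketched briefly. The names, the use of +1 / 0 / -1 for the three line levels, and the assumed polarity of the pulse preceding the data are illustrative choices, not part of any standard.

```python
def bipolar_ami(bits, last_pulse=-1):
    """Bipolar-AMI: 0 -> no line signal, 1 -> pulse of alternating polarity.

    last_pulse is the assumed polarity of the most recent pulse before the data.
    """
    out = []
    for b in bits:
        if b:
            last_pulse = -last_pulse
            out.append(last_pulse)
        else:
            out.append(0)
    return out


def pseudoternary(bits, last_pulse=-1):
    """Pseudoternary: the same rule with the roles of 0 and 1 exchanged."""
    return bipolar_ami([1 - b for b in bits], last_pulse)


def has_violation(levels):
    """True if two consecutive nonzero pulses have the same polarity."""
    previous = 0
    for lv in levels:
        if lv != 0:
            if lv == previous:
                return True
            previous = lv
    return False


data = [0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1]
line = bipolar_ami(data)
print("AMI:          ", line)
print("Pseudoternary:", pseudoternary(data))

# An isolated error that deletes one pulse breaks the alternation.
damaged = list(line)
damaged[4] = 0
print("violation after deleting a pulse:", has_violation(damaged))
```

The check only sees errors that leave at least one later pulse in the observed window, so it is a quick physical-layer hint rather than a replacement for the error detection performed by data link control.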

Although a degree of synchronization is provided with these codes, a long

string of 0s in the case of AMI or 1s in the case of pseudoternary still presents a

problem. Several techniques have been used to address this deficiency. One

approach is to insert additional bits that force transitions. This technique is used in

ISDN for relatively low data-rate transmission. Of course, at a high data rate, this

scheme is expensive, as it results in an increase in an already high signal-transmission

rate. To cope with this problem at high data rates, a technique that involves

scrambling the data is used; we will look at two examples of the technique later in

this section.

Thus, with suitable modification, multilevel binary schemes overcome the

problems of NRZ codes. Of course, as with any engineering design decision, there

is a tradeoff. With multilevel binary coding, the line signal may take on one of three

levels, but each signal element, which could represent log2 3 = 1.58 bits of information,

bears only one bit of information, making multilevel binary not as efficient as

NRZ coding. Another way to state this is that the receiver of multilevel binary signals

has to distinguish between three levels (+A, -A, 0) instead of just two levels

in the other signaling formats previously discussed. Because of this, the multilevel

binary signal requires approximately 3 dB more signal power than a two-valued signal

for the same probability of bit error; this is illustrated in Figure 4.4. Put another

way, the bit error rate for NRZ codes, at a given signal-to-noise ratio, is significantly

less than that for multilevel binary.

Biphase

There is another set of alternative coding techniques, grouped under the term

biphase, which overcomes the limitations of NRZ codes. Two of these techniques,

Manchester and Differential Manchester, are in common use.

In the Manchester code, there is a transition at the middle of each bit period.

The mid-bit transition serves as a clocking mechanism and also as data: a low-to-high

transition represents a 1, and a high-to-low transition represents a 0. In Differential

Manchester, the mid-bit transition is used only to provide clocking. The

encoding of a 0 is represented by the presence of a transition at the beginning of a

bit period, and a 1 is represented by the absence of a transition at the beginning of

a bit period. Differential Manchester has the added advantage of employing differential

encoding.
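The two biphase rules can be sketched by representing each bit as two half-bit levels. This is an illustrative sketch: the +1 / -1 levels, the function names, and the assumed line level before the first bit of Differential Manchester are not taken from any standard.

```python
def manchester(bits):
    """Manchester: 1 -> low-to-high at mid-bit, 0 -> high-to-low at mid-bit."""
    out = []
    for b in bits:
        out += [-1, 1] if b else [1, -1]
    return out


def differential_manchester(bits, level=-1):
    """Differential Manchester, two half-bit levels per bit.

    0 -> transition at the start of the bit, 1 -> no transition at the start.
    The mid-bit transition is always present and only provides clocking.
    level is the assumed line level just before the first bit.
    """
    out = []
    for b in bits:
        if b == 0:
            level = -level  # transition at the beginning of the bit period
        out.append(level)
        level = -level      # mandatory mid-bit transition
        out.append(level)
    return out


data = [0, 1, 0, 0, 1, 1]
print("Manchester:             ", manchester(data))
print("Differential Manchester:", differential_manchester(data))
```

In both outputs the two halves of every bit differ, which is the guaranteed mid-bit transition the receiver can lock onto.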

All of the biphase techniques require at least one transition per bit time and

may have as many as two transitions. Thus, the maximum modulation rate is twice

that for NRZ; this means that the bandwidth required is correspondingly greater.

On the other hand, the biphase schemes have several advantages:

Synchronization. Because there is a predictable transition during each bit

time, the receiver can synchronize on that transition. For this reason, the

biphase codes are known as self-clocking codes.

No dc component. Biphase codes have no dc component, yielding the benefits

described earlier.

Error detection. The absence of an expected transition can be used to detect

errors. Noise on the line would have to invert both the signal before and after

the expected transition to cause an undetected error.

As can be seen from Figure 4.3, the bulk of the energy in biphase codes is

between one-half and one times the bit rate. Thus, the bandwidth is reasonably narrow

and contains no dc component; however, it is wider than the bandwidth for the

multilevel binary codes.

Biphase codes are popular techniques for data transmission. The more common

Manchester code has been specified for the IEEE 802.3 standard for baseband

coaxial cable and twisted-pair CSMA/CD bus LANs. Differential Manchester has

been specified for the IEEE 802.5 token ring LAN, using shielded twisted pair.

Modulation Rate

When signal encoding techniques are used, a distinction needs to be made between

data rate (expressed in bits per second), and modulation rate (expressed in baud).

The data rate, or bit rate, is 1/tB, where tB = bit duration. The modulation rate is

the rate at which signal elements are generated. Consider, for example, Manchester

encoding. The minimum size signal element is a pulse of one-half the duration of a

bit interval. For a string of all binary zeroes or all binary ones, a continuous stream

of such pulses is generated. Hence, the maximum modulation rate for Manchester

is 2/tB. This situation is illustrated in Figure 4.5, which shows the transmission of a

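stream of 1 bits at a data rate of 1 Mbps using NRZI and Manchester. In general, the modulation rate D (in baud) is related to the data rate R (in bps) by D = R/b, where b is the number of bits per signal element.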

One way of characterizing the modulation rate is to determine the average

number of transitions that occur per bit time. In general, this will depend on the

exact sequence of bits being transmitted. Table 4.3 compares transition rates for

various techniques. It indicates the signal transition rate in the case of a data stream

of alternating 1s and 0s, and for the data stream that produces the minimum and

maximum modulation rate.
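The 1-Mbps example can be reproduced numerically by finding the shortest signal element in each waveform. The sketch below is an illustration with assumed conventions: waveforms are sampled at two points per bit, levels are +1 and -1, and the function names are hypothetical.

```python
from itertools import groupby


def nrzi_half_bits(bits, level=-1):
    """NRZI at two samples per bit; level is the assumed level before the first bit."""
    out = []
    for b in bits:
        if b:
            level = -level
        out += [level, level]
    return out


def manchester_half_bits(bits):
    """Manchester at two samples per bit (1 -> low-to-high, 0 -> high-to-low)."""
    out = []
    for b in bits:
        out += [-1, 1] if b else [1, -1]
    return out


def modulation_rate(half_bits, data_rate_bps):
    """Signal elements per second, based on the shortest run of constant level.

    Each sample lasts half a bit time, so a run of n samples lasts n / (2 R)
    seconds and corresponds to a rate of 2 R / n baud.
    """
    shortest_run = min(len(list(run)) for _, run in groupby(half_bits))
    return 2 * data_rate_bps / shortest_run


ones = [1] * 8
rate = 1_000_000  # 1 Mbps
print("NRZI:      ", modulation_rate(nrzi_half_bits(ones), rate), "baud")
print("Manchester:", modulation_rate(manchester_half_bits(ones), rate), "baud")
```

For the all-ones stream, NRZI yields signal elements one bit time long (1/tB), while Manchester yields elements half a bit time long (2/tB), in line with the discussion of Figure 4.5.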

Although the biphase techniques have achieved widespread use in local-area-network

applications at relatively high data rates (up to 10 Mbps), they have not been

widely used in long-distance applications. The principal reason for this is that they

require a high signaling rate relative to the data rate. This sort of inefficiency is

more costly in a long-distance application.

Another approach is to make use of some sort of scrambling scheme. The idea

behind this approach is simple: sequences that would result in a constant voltage

level on the line are replaced by filling sequences that will provide sufficient transitions

for the receiver's clock to maintain synchronization. The filling sequence must

be recognized by the receiver and replaced with the original data sequence. The filling

sequence is the same length as the original sequence, so there is no data-rate

increase. The design goals for this approach can be summarized as follows:

* No dc component

* No long sequences of zero-level line signals

* No reduction in data rate

* Error-detection capability

Two techniques are commonly used in long-distance transmission services;

these are illustrated in Figure 4.6.

A coding scheme that is commonly used in North America is known as bipolar

with 8-zeros substitution (B8ZS). The coding scheme is based on bipolar-AMI.

We have seen that the drawback of the AMI code is that a long string of zeros may

result in loss of synchronization. To overcome this problem, the encoding is

amended with the following rules:

* If an octet of all zeros occurs and the last voltage pulse preceding this octet

was positive, then the eight zeros of the octet are encoded as 000+-0-+.

* If an octet of all zeros occurs and the last voltage pulse preceding this octet

was negative, then the eight zeros of the octet are encoded as 000-+0+-.

This technique forces two code violations (signal patterns not allowed in

AMI) of the AMI code, an event unlikely to be caused by noise or other transmission

impairment. The receiver recognizes the pattern and interprets the octet as

consisting of all zeros.
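The substitution rule can be sketched on top of a plain AMI encoder. This is a minimal illustration, assuming line levels of +1 / 0 / -1, a list of 0/1 integers as input, and an assumed polarity for the last pulse before the data; the function name is hypothetical.

```python
def b8zs(bits, last_pulse=-1):
    """Bipolar-AMI with 8-zeros substitution.

    bits must be a list of 0/1 integers. last_pulse is the assumed polarity
    of the most recent pulse before the data begins.
    """
    out = []
    i = 0
    while i < len(bits):
        if bits[i:i + 8] == [0] * 8:
            p = last_pulse
            # 000+-0-+ after a positive pulse, 000-+0+- after a negative one.
            out += [0, 0, 0, p, -p, 0, -p, p]
            # The filling sequence ends on polarity p, so last_pulse is unchanged.
            i += 8
        elif bits[i]:
            last_pulse = -last_pulse
            out.append(last_pulse)
            i += 1
        else:
            out.append(0)
            i += 1
    return out


data = [1] + [0] * 8 + [1]
print(b8zs(data))
```

The output has the same number of signal elements as input bits, so the substitution does not raise the data rate, and the pulse following the filling sequence continues the normal alternation.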

A coding scheme that is commonly used in Europe and Japan is known as the

high-density bipolar-3 zeros (HDB3) code (Table 4.4). As before, it is based on the

use of AMI encoding. In this case, the scheme replaces strings of four zeros with

sequences containing one or two pulses. In each case, the fourth zero is replaced

with a code violation. In addition, a rule is needed to ensure that successive violations

are of alternate polarity so that no dc component is introduced. Thus, if the

last violation was positive, this violation must be negative, and vice versa. The table

shows that this condition is tested for by knowing whether the number of pulses

since the last violation is even or odd and the polarity of the last pulse before the

occurrence of the four zeros.
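The even/odd rule can be sketched as follows. This is an illustration of the rule as described above, not a copy of Table 4.4: the +1 / 0 / -1 levels, the function name, the assumed polarity of the pulse preceding the data, and the assumption that the pulse count starts at zero (even) are all choices made for the example.

```python
def hdb3(bits, last_pulse=-1):
    """HDB3 substitution on top of bipolar-AMI.

    bits must be a list of 0/1 integers. last_pulse is the assumed polarity of
    the pulse preceding the data; the pulse count is assumed to start at zero.
    """
    out = []
    ones_since_substitution = 0
    i = 0
    while i < len(bits):
        if bits[i:i + 4] == [0, 0, 0, 0]:
            if ones_since_substitution % 2 == 1:
                # Odd count: 000V, V repeats the polarity of the last pulse.
                out += [0, 0, 0, last_pulse]
            else:
                # Even count: B00V, B is a normal alternating pulse and V repeats it.
                last_pulse = -last_pulse
                out += [last_pulse, 0, 0, last_pulse]
            ones_since_substitution = 0
            i += 4
        elif bits[i]:
            last_pulse = -last_pulse
            out.append(last_pulse)
            ones_since_substitution += 1
            i += 1
        else:
            out.append(0)
            i += 1
    return out


data = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
print(hdb3(data))
```

In this example the two violations come out with opposite polarity (one positive, one negative), which is how the rule avoids introducing a dc component.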

Figure 4.3 shows the spectral properties of these two codes. As can be seen,

neither has a dc component. Most of the energy is concentrated in a relatively sharp

spectrum around a frequency equal to one-half the data rate. Thus, these codes are

well suited to high data-rate transmission.


Wednesday, September 28, 2016
