Ethernet Data Encoding


Specifications for Data Encoding

Because of its faster rate of data transmission, and therefore higher frequency on the wire, data encoding in Fast Ethernet is slightly more complex than it is in 10 Mbps Ethernet. This essay will discuss the different data encoding methods used in Fast Ethernet and the new sublayers added below the MAC sub-layer, in the physical layer, in order to support those data encoding methods.

This page discusses two general types of data encoding: the scheme shared by 100BASE-TX and 100BASE-FX, and the scheme used by 100BASE-T4.

Data Representation: 100BASE-TX and 100BASE-FX

When Fast Ethernet was being developed, many of the problems of moving information at 100Mbps had already been solved in FDDI. Instead of spending a lot of time to find a different solution, many of the techniques used in FDDI on fiber and UTP (copper) were adopted.

When discussing in detail the inner workings of Ethernet, it is often useful to discuss the IEEE Data Link layer sublayers. In an IEEE network, the upper layer protocols, such as TCP/IP, Novell NetWare, etc., link into the Data Link layer via the LLC sub-layer (Logical Link Control). The LLC sub-layer is specified by IEEE 802.2 and spans all IEEE networks, including 802.5 Token Ring. In 802.3 Ethernet, the LLC sub-layer passes data to the MAC sub-layer (Medium Access Control), which prepares the data for transmission as a frame. The MAC sub-layer also performs the familiar CSMA/CD media access functions (as described in the Ethernet MEDIA ACCESS TOPIC). It is only below the MAC sub-layer that Fast Ethernet differs from classic Ethernet. This difference was explained, along with a diagram, in the LAYERS TOPIC. Fast Ethernet introduces a number of new functional layers to the physical layer. This discussion will center around the PCS (Physical Coding Sub-layer), PMA (Physical Medium Attachment sub-layer), and PMD (Physical Medium Dependent sub-layer). There is a diagram of these layers in the LAYERS TOPIC as well.

The primary difficulty with 100 Mbps transmission of data is that high-frequency signals don't propagate well over either twisted pair or fiber. 10 Mbps Ethernet uses Manchester encoding to include a clock signal with every data bit. However, the clocking almost doubles the rate of transmission, so a worst-case scenario would transmit 10 Mbps of data with a 20 MHz waveform. For 100 Mbps, the waveform frequency would peak at 200 MHz. Category 5 UTP is only rated at 100 MHz, so Fast Ethernet would be impossible to implement over it with Manchester encoding. Even fiber has difficulties with a 200 MHz waveform.
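As a rough illustration of why Manchester encoding is expensive, the sketch below encodes a few bits into half-bit line levels. It assumes the IEEE 802.3 convention (a 1 is a low-to-high transition in the middle of the bit, a 0 is high-to-low); the function names are hypothetical and chosen only for this example.

```python
def manchester_encode(bits):
    """Return two half-bit line levels for every data bit.

    Assumed convention (IEEE 802.3 style): 1 -> low then high,
    0 -> high then low. Every bit carries a mid-bit transition,
    which is what lets the receiver recover the clock.
    """
    levels = []
    for bit in bits:
        levels.extend((0, 1) if bit else (1, 0))
    return levels


def count_transitions(levels):
    """Count level changes between consecutive half-bit intervals."""
    return sum(1 for a, b in zip(levels, levels[1:]) if a != b)


data = [1, 1, 1, 1, 0, 0, 0, 0]
line = manchester_encode(data)
print(len(data), "data bits ->", len(line), "signal elements")
print("transitions on the line:", count_transitions(line))
```

Every data bit costs two signal elements, so the signalling rate on the wire is twice the bit rate. That doubling is the overhead the Fast Ethernet encodings are designed to avoid.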

Two forms of waveform encoding have been implemented as alternatives to Manchester encoding at the PMA sub-layer. The Compendium has a complete discussion of these alternative encoding formats in the DATA ENCODING TOPIC, later in this Fast Ethernet section. These formats are briefly discussed here.

100BASE-FX uses NRZI (Non-Return-to-Zero, Invert-on-one). To decrease the frequency even further on UTP, 100BASE-TX adds a variation of NRZI at the PMD sub-layer called either MLT-3 (Multiple Level Transition - 3 levels) or NRZI-3.
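The two line codes can be sketched in a few lines of Python. This is a minimal illustration, not an implementation of the standard: the starting levels and function names are assumptions made for the example. In both codes a 1 causes a transition and a 0 leaves the line unchanged; MLT-3 steps through the levels 0, +1, 0, -1 instead of toggling between two levels.

```python
def nrzi_encode(bits, start_level=0):
    """NRZI: toggle the line level on every 1, hold it on every 0."""
    level = start_level
    out = []
    for bit in bits:
        if bit:
            level ^= 1
        out.append(level)
    return out


# MLT-3 walks through this fixed cycle, advancing one step per 1 bit.
MLT3_CYCLE = (0, 1, 0, -1)


def mlt3_encode(bits, start_index=0):
    """MLT-3: advance to the next level in the cycle on every 1."""
    index = start_index
    out = []
    for bit in bits:
        if bit:
            index = (index + 1) % len(MLT3_CYCLE)
        out.append(MLT3_CYCLE[index])
    return out


ones = [1] * 8
print(nrzi_encode(ones))  # [1, 0, 1, 0, 1, 0, 1, 0]
print(mlt3_encode(ones))  # [1, 0, -1, 0, 1, 0, -1, 0]
```

With a stream of all ones (the fastest-changing input), the NRZI output repeats every two bits while the MLT-3 output repeats every four bits, which is why MLT-3 halves the worst-case waveform frequency again.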

While NRZI and MLT-3 solve the problem of slowing down the data-carrier frequency, they run the risk of losing the clock signal. In both schemes a 1 is sent as a transition and a 0 as no transition, so a steady stream of zeros, not uncommon in data, would be represented as a total lack of transitions. With no transitions, the receiving station has no clear incoming signal, and the phase-locked loop it uses to recover the clock signal can drift. If enough drift is introduced into the perceived clock, the station can read false data from the data stream.

To combat this problem, data is first encoded at the PCS sub-layer using 4B5B translation, as specified in 802.3. Every possible 4-bit pattern is assigned a 5-bit code, and the 5-bit code is transmitted across the wire instead of the actual 4 bits. This is referred to as 4B/5B encoding, and more information will be presented later in the DATA ENCODING TOPIC in this Fast Ethernet section. Since there are 16 possible 4-bit patterns and 32 possible 5-bit patterns, it is possible to pick codes so that every valid data code contains at least two transitions, enough to ensure proper clocking. Four of the 16 codes left over have been defined for use as starting and ending delimiters for each packet. In addition, the 4B5B pattern of all transitions has been defined for use as an idle signal.

In 100BASE-TX and 100BASE-FX, stations are continually synched to each other via an idle signal whenever there is no data being transmitted. Unlike other implementations of Ethernet, 100BASE-TX and 100BASE-FX are never quiet: there is always at least an idle signal being transmitted.
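The table lookup can be sketched as follows. The data codes and the idle code below are the commonly published 4B/5B values; treat them as an assumption to be checked against 802.3 before relying on them. The function name is hypothetical, and the order in which a byte's two nibbles are sent is deliberately left out of the sketch.

```python
# Commonly published 4B/5B data codes, indexed by nibble value 0x0..0xF.
# Assumption: verify against the 802.3 code-group table before use.
FOUR_B_FIVE_B = [
    "11110", "01001", "10100", "10101",
    "01010", "01011", "01110", "01111",
    "10010", "10011", "10110", "10111",
    "11010", "11011", "11100", "11101",
]

IDLE = "11111"  # all ones: under NRZI, a transition in every bit time


def encode_nibbles(nibbles):
    """Replace each 4-bit value with its 5-bit code and concatenate."""
    return "".join(FOUR_B_FIVE_B[n] for n in nibbles)


# A 1 becomes a transition under NRZI, so counting ones counts transitions.
min_transitions = min(code.count("1") for code in FOUR_B_FIVE_B)
print("fewest transitions in any data code:", min_transitions)

# Sixteen zero bits in the data no longer produce a silent line.
print(encode_nibbles([0x0, 0x0, 0x0, 0x0]))
```

Even an all-zero payload comes out as a stream that is mostly ones, so the receiver's clock recovery always has transitions to lock onto.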

The combination of 4B5B and either NRZI or MLT-3 yields a compact waveform for 100BASE-FX and 100BASE-TX. The signal is slow enough to be transmitted across fiber or UTP-5, but dense enough to encode 100 Mbps. In 100BASE-FX, 4B5B encoding increases the line rate from 100 Mbps to 125 Mbps (five code bits for every four data bits), and NRZI then cuts the worst-case waveform frequency to half of that, a maximum of 62.5 MHz. 100BASE-TX cuts that figure in half again with MLT-3, down to a maximum of 31.25 MHz. If the resulting 100BASE-TX signal does not meet FCC emissions requirements, an optional stream cipher has been defined to allow "scrambling" of the signal in the PMD sub-layer, after PCS's 4B5B encoding and PMA's NRZI encoding, but before PMD's MLT-3 encoding. The idea of the stream cipher is primarily to randomize the signal to reduce the resulting EMI emissions. From a security standpoint the cipher is worthless.
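The arithmetic behind those figures can be written out as a short calculation. The helper name and its parameters are hypothetical; the only inputs are numbers stated above: 5 code bits per 4 data bits, one full waveform cycle per two symbols for NRZI, and one full cycle per four symbols for MLT-3.

```python
def worst_case_mhz(data_rate_mbps, code_bits, data_bits, symbols_per_cycle):
    """Worst-case waveform frequency for a block code plus a line code."""
    line_rate_mbaud = data_rate_mbps * code_bits / data_bits
    return line_rate_mbaud / symbols_per_cycle


line_rate = 100 * 5 / 4                # 4B5B raises 100 Mbps to 125 Mbaud
fx_mhz = worst_case_mhz(100, 5, 4, 2)  # NRZI: one cycle spans 2 symbols
tx_mhz = worst_case_mhz(100, 5, 4, 4)  # MLT-3: one cycle spans 4 symbols

print(line_rate)  # 125.0
print(fx_mhz)     # 62.5
print(tx_mhz)     # 31.25
```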

Data Representation: 100BASE-T4

100BASE-T4 uses a slight modification of the physical sublayers from 100BASE-X. 31.25 MHz is slow enough to travel well over UTP-5, but still far too fast for UTP-3, which is only certified for 16 MHz. 100BASE-T4 was developed for Fast Ethernet over Category 3 UTP. Using the techniques of 100BASE-TX as a starting point, 100BASE-T4 combines and optimizes 4B5B and MLT-3 into a new encoding scheme called 8B6T encoding, which will be discussed in the SIGNAL ENCODING TOPIC later in this Fast Ethernet section. 8B6T replaces each 8-bit byte with a code of only 6 tri-state symbols. Six tri-state symbols allow 729 possible codes (3 to the 6th power), far more than the 256 needed to represent every byte. Unlike MLT-3, no progression from 1 to 0 to -1 is required: 8B6T allows an arbitrary use of these three states. 256 codes have been chosen as a one-to-one remapping of every possible byte, similar to 4B5B. The remapping table is listed in IEEE 802.3 Annex 23A, with nine codes used for starting and ending delimiters and control characters, also listed in IEEE 802.3.

To see how the signal is slowed down in T4, let's work through the math. The fastest waveform required in 8B6T is alternating extreme states, +1 to -1, encoding two tri-state symbols in a single cycle. Unlike 4B5B, which raises the symbol rate above the bit rate, 8B6T needs a symbol rate of only 3/4 the speed of the bit stream, as only 6 symbols are used to communicate 8 bits. That is 75 million symbols per second for 100 Mbps, so at two symbols per cycle the fastest possible waveform frequency is 37.5 MHz, which is still too fast for UTP-3, so one more technique is needed.
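The counting and the rate calculation can be checked with a short sketch. It does not reproduce the Annex 23A mapping table; it only enumerates every possible group of six tri-state symbols and repeats the 3/4 arithmetic. Variable names are hypothetical.

```python
from itertools import product

LEVELS = (-1, 0, 1)  # the three line states used by 8B6T

# Every possible group of six tri-state symbols.
all_6t_codes = list(product(LEVELS, repeat=6))
spare_codes = len(all_6t_codes) - 256  # left over after mapping all bytes

# 6 symbols carry 8 bits, so the symbol rate is 6/8 of the bit rate.
symbol_rate_mbaud = 100 * 6 / 8

# Alternating +1, -1 fits two symbols into one waveform cycle.
max_waveform_mhz = symbol_rate_mbaud / 2

print(len(all_6t_codes))   # 729
print(spare_codes)         # 473
print(symbol_rate_mbaud)   # 75.0
print(max_waveform_mhz)    # 37.5
```

The large pool of spare codes is what leaves room to pick the 256 data codes and the delimiter and control codes freely.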

The final slowdown in 100BASE-T4 comes from fanning the transmitted signal out to three cable pairs instead of a single pair. This is called "T4 Multiplexing" and will be discussed in detail in the SIGNAL ENCODING TOPIC later in this Fast Ethernet section. The maximum waveform frequency required is now only 12.5 MHz (37.5 MHz divided across three pairs), easily slow enough for even Category 3 twisted pair. Of the four pairs in Category 3 cabling, three pairs are used to send data while the remaining pair listens for collisions. The same two pairs used in 10BASE-T are defined as "dedicated" direction pairs: a station always transmits on one and always listens on the other. The two pairs not used in any other flavor of Ethernet are available for either direction to "borrow", carrying information in the same direction as the data flow. Because pairs are appropriated for transmission in this way, it is not possible to do full-duplex in 100BASE-T4. However, 802.3 does not actually define any full-duplex Ethernet, other than to mention its existence in Annex 28B.

The manner in which bytes are fanned out in 100BASE-T4 is to send an entire 6T byte down a single pair, then send the next 6T byte down the next pair, the following 6T byte down the third pair, and the next 6T byte down the original pair, which by now has finished transmitting the first 6T byte. The traditional Ethernet preamble has been modified to allow the receiving station not only to sync up to the transmit clock on each pair, but also to notify the receiving station of the pair transmit order, so each frame can be re-assembled in the proper order at the receive end. It's easy to sequence packets by adding fields to the protocols, but difficult to sequence individual bytes in a single packet. Similar to an Ethernet preamble or a 100BASE-X idle signal, the first several T4 codes are designed to allow the receive end to sync to the transmit clock in the easiest possible manner, alternating between extreme signals, +1, -1, +1, etc. As an added sanity-check, five end-of-symbol codes are used to inform the receiving station when to calculate the CRC.
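The round-robin fan-out and the reassembly at the receiver can be sketched as below. This is a simplified model under stated assumptions: the code groups are placeholder strings ("G0", "G1", ...) rather than real 6T codes, the function names are hypothetical, and the timing offsets between pairs and the modified preamble are ignored.

```python
def fan_out(code_groups, pairs=3):
    """Deal whole 6T code groups onto the pairs in round-robin order."""
    lanes = [[] for _ in range(pairs)]
    for i, group in enumerate(code_groups):
        lanes[i % pairs].append(group)
    return lanes


def reassemble(lanes):
    """Rebuild the original order by reading the pairs in the same rotation."""
    out = []
    longest = max(len(lane) for lane in lanes)
    for i in range(longest):
        for lane in lanes:
            if i < len(lane):
                out.append(lane[i])
    return out


groups = [f"G{i}" for i in range(7)]  # placeholder code groups
lanes = fan_out(groups)
print(lanes)  # [['G0', 'G3', 'G6'], ['G1', 'G4'], ['G2', 'G5']]
print(reassemble(lanes) == groups)  # True

# Each pair carries a third of the 75 Mbaud stream, two symbols per cycle.
per_pair_mhz = (100 * 6 / 8) / 3 / 2
print(per_pair_mhz)  # 12.5
```

Reassembly only works if the receiver knows which pair carried the first code group, which is the job the text above assigns to the modified preamble.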
