## **High Speed Adaptive Equalizer**

## **Using Phase Detector Output**

**Ki-Hyuk Lee** 

The Graduate School

**Yonsei University** 

**Department of Electrical and Electronic Engineering** 

# **High Speed Adaptive Equalizer**

### **Using Phase Detector Output**

A Dissertation

Submitted to the Department of Electrical and Electronic Engineering

and the Graduate School of Yonsei University

in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

**Ki-Hyuk Lee** 

June 2007

This certifies that the dissertation of Ki-Hyuk Lee is approved.

**Thesis Supervisor: Woo-Young Choi** 

**Gun-Hee Han** 

Hong-Il Yun

Seung-Woo Lee

Jae-Wook Lee

**The Graduate School** 

Yonsei University

June 2007

### Contents

- Figure Index
- Table Index
- Abstract
- 1. Introduction
  - 1-1. High Speed Serial Links
  - 1-2. Channel Characteristics
  - 1-3. Equalizing Filters
  - 1-4. Outline
- 2. Feed-Forward Equalizer using Phase Detector Output
  - 2-1. Adaptation Methods
  - 2-2. Behavioral simulation
  - 2-3. Circuit Design
    - 2-3-1. System Structure
    - 2-3-2. Equalizing Filter
    - 2-3-3. PLL (Phase Locked Loop)
    - 2-3-4. CDR (Clock and Data Recovery)
    - 2-3-5. Adaptation circuit and phase detector
  - 2-4. Simulation Results
  - 2-5. Measurement Results
  - 2-6. Conclusion
- 3. Decision Feedback Equalizer with Data Pattern Filtering
  - 3-1. Stringent requirements for the first post-cursor cancellation
  - 3-2. Data pattern filtering
  - 3-3. Behavioral simulation
  - 3-4. Circuit design
    - 3-4-1. System configuration
    - 3-4-2. Decision feedback equalizer
    - 3-4-3. Adaptation circuit
    - 3-4-4. Phase detector in CDR circuit
  - 3-5. Simulation results
  - 3-6. Summary
  - 3-7. Conclusion
- Appendix: Debugging Review on Design
- Reference
- Abstract in Korean

## **Figure Index**

- Figure 1-1 Parallel Bus
- Figure 1-2 Parallel Bus Layout
- Figure 1-3 Serial Link
- Figure 1-4 Backplanes
- Figure 1-5 Frequency response of the backplane
- Figure 1-6 Input and output pulse after channel transmission
- Figure 1-7 Eye-diagrams of (a) input and (b) output data after transmission
- Figure 1-8 Tap delay line filter
- Figure 1-9 Split-path equalizer
- Figure 1-10 Decision feedback equalizer
- Figure 2-1 Adaptation methods with spectral filtering
- Figure 2-2 Frequency spectrum comparison
- Figure 2-3 Adaptation with LMS algorithm
- Figure 2-4 Adaptation with phase detector output
- Figure 2-5 Equalizer gain error criteria
- Figure 2-6 Linearized model with continuous time approximation
- Figure 2-7 Behavioral model with CPPSIM
- Figure 2-8 Modified phase detector in CDR
- Figure 2-9 Convergence of filter coefficient
- Figure 2-10 Recovered clock
- Figure 2-11 Eye-diagram of input data with ISI
- Figure 2-12 Eye-diagram of equalized output
- Figure 2-13 System block diagram
- Figure 2-14 Equalizer filter using split path
- Figure 2-15 Equalizer gain curve for path gain
- Figure 2-16 High pass filter
- Figure 2-17 Combiner
- Figure 2-18 Equalizer control circuit
- Figure 2-19 Equalizer gain curve for input code
- Figure 2-20 PLL block diagram
- Figure 2-21 VCO delay stage and controller
- Figure 2-22 VCO gain partitioning (a) Coarse control (b) Fine control
- Figure 2-23 Schematic of charge pump
- Figure 2-24 Charge pump current vs control voltage
- Figure 2-25 Differential control voltage generator
- Figure 2-26 Clock and data recovery circuit
- Figure 2-27 Phase interpolator
- Figure 2-28 Delay buffer
- Figure 2-29 Control code for the PI & delay buffer
- Figure 2-30 Recovered I/Q clock
- Figure 2-31 Modified phase detector
- Figure 2-32 Clock tree
- Figure 2-33 Control code for equalizing filter
- Figure 2-34 Channel model
- Figure 2-35 PCB stripline
- Figure 2-36 Package model
- Figure 2-37 Frequency responses of channel model
- Figure 2-38 (a) Input signal after 40cm transmission, (b) equalizer output signal, and (c) equalizer control code
- Figure 2-39 (a) Input signal after 80cm transmission, (b) equalizer output signal, and (c) equalizer control code
- Figure 2-40 (a) Input signal after 120cm transmission, (b) equalizer output signal, and (c) equalizer control code
- Figure 2-41 ISI jitter simulation for various transmission lengths
- Figure 2-42 Chip layout
- Figure 2-43 Experimental setup
- Figure 2-44 Frequency responses of PCB trace
- Figure 2-45 PLL output clock
- Figure 2-46 Recovered clock from CDR
- Figure 2-47 (a) Eye diagram of input data with ISI after 80cm PCB trace (b) Eye diagram of equalizer output
- Figure 2-48 (a) Eye diagram of input data with ISI after 120cm PCB trace (b) Eye diagram of equalizer output
- Figure 2-49 (a) Eye diagram of input data with ISI after 160cm PCB trace (b) Eye diagram of equalizer output
- Figure 2-50 Equalizer output jitters with various PCB lengths
- Figure 2-51 Equalizer output jitters with various input data swings after adaptation
- Figure 3-1 DFE with first feedback tap
- Figure 3-2 Timing requirement of the first feedback tap
- Figure 3-3 Pattern dependent jitter induced from the feedback delay
- Figure 3-4 Analysis of the feedback data pattern
- Figure 3-5 Behavioral simulation results for the pattern analysis
- Figure 3-6 Phase detection table
- Figure 3-7 Behavioral model for DFE and CDR circuit
- Figure 3-8 Input signal with ISI
- Figure 3-9 Feedback tap coefficients
- Figure 3-10 Behavioral simulation results without pattern filtering
- Figure 3-11 Behavioral simulation results with pattern filtering
- Figure 3-12 Decision margin for decision feedback delay
- Figure 3-13 System block diagram
- Figure 3-14 Decision Feedback Equalizer
- Figure 3-15 Voltage combiner
- Figure 3-16 (a) Differential latch and (b) flip-flop
- Figure 3-17 LMS circuit for adaptation
- Figure 3-18 Differential offset buffer
- Figure 3-19 Phase detector with pattern filtering
- Figure 3-20 Differential (a) AND and (b) OR circuit
- Figure 3-21 Clock trees for clock distribution
- Figure 3-22 Eye-diagram of input data signal with ISI
- Figure 3-23 Eye-diagram of equalizer output and recovered clock
- Figure 3-24 Chip layout
- Figure A-1 Parasitic resistance option (a) REDUCTION : NO (b) REDUCTION : YES
- Figure A-2 Schematic of VCO
- Figure A-3 Layout of VCO
- Figure A-4 VCO curve
- Figure A-5 Input data eye-diagram after 120cm PCB trace
- Figure A-6 Schematic of the data path
- Figure A-7 Layout of the data path
- Figure A-8 Simulated eye-diagram after the equalizing filter with (a) ideal netlist, (b) netlist with default RC extraction (c) netlist with full RC extraction
- Figure A-9 Simulated eye-diagram after the output buffer with (a) ideal netlist, (b) netlist with default RC extraction (c) netlist with full RC extraction
- Figure A-10 AC response of the data path when the equalizer control code is (a) 1111 and (b) 0000

### **Table Index**

- Table 1-1 Comparison of parallel bus and serial link
- Table 3-1 Performance summary

### Abstract

### **High Speed Adaptive Equalizer**

### **Using Phase Detector Output**

By

Ki-Hyuk Lee

Department of Electrical and Electronic Engineering

The Graduate School

Yonsei University

Adaptive equalizers are widely used to cancel inter-symbol interference in high-speed serial links. This thesis focuses on adaptive equalizers on the receiver side. For the feed-forward equalizer, a new adaptation method for the split-path equalizer is proposed. Because the proposed configuration needs no additional analog blocks for adaptation and uses only the phase detector output available in the clock and data recovery circuit, the equalizer and its control block are simple and consume very little area and power. The equalizer prototype is designed in a 0.13 µm CMOS process and its operation is simulated with HSPICE. The prototype operates at 2.5 Gbps over PCB traces from 40 cm to 160 cm long. The equalizer and control circuit occupy only $0.1 \times 0.25\,\mathrm{mm}^2$ of die area and consume about 6 mW from a 1.2 V power supply.

In addition, for the decision feedback equalizer, a data pattern filtering method for the clock and data recovery circuit is proposed to relieve the pattern-dependent jitter induced by the feedback delay. The filtering is applied to the phase detector in the clock and data recovery circuit. It makes the recovered clock follow the transition edges of a certain data pattern, which keeps the clock from being delayed further during the clock recovery process. The decision feedback equalizer circuits with data pattern filtering are implemented in a 0.13 µm CMOS process with a target speed of 5 Gbps. The performance is simulated with HSPICE and compared with that of the same equalizer without data pattern filtering. The equalizer with data pattern filtering shows better performance in both the recovered clock and the data decision margin. The entire prototype circuit occupies $0.5 \times 1.1\,\mathrm{mm}^2$ of die area and consumes 72 mW from a 1.2 V power supply.

Keywords: Serial link, Adaptive equalizer, Clock and data recovery circuit, Feed forward equalizer, Decision feedback equalizer.

### 1. Introduction

### 1-1. High Speed Serial Links

As silicon technology advances and data-rate requirements increase, higher clock rates and I/O data rates are demanded. However, chip I/O is becoming a major bottleneck for system performance because the I/O count is limited compared with the large gate count.

The parallel bus system is widely used for chip I/O in PCs and other applications. As shown in Fig. 1-1, it is a source-synchronous system that comprises a group of connections for data and a single connection for the clock. The parallel bus therefore requires trace matching for each data line, as shown in Fig. 1-2, or a DLL (Delay Locked Loop) to relieve the skew between the data lines. In addition, because it typically uses single-ended signaling, it is susceptible to noise from the channel environment.



Figure 1-1 Parallel Bus



Figure 1-2 Parallel Bus Layout

Compared with the parallel bus, a serial link uses a single independent connection, which requires a low pin count and makes the board traces easy to lay out, as shown in Fig. 1-3. Because it uses differential signaling, it can transmit at higher data rates over longer distances. However, because the clock is embedded in the data, it requires a CDR (Clock and Data Recovery) circuit and incurs encoding overhead.



Figure 1-3 Serial Link

Table 1-1 compares the parallel bus and the serial link. Because serial links offer higher speeds and longer distances, many industry standards based on them are being developed, both for short-range inter-chip links such as CPU-memory applications and for long-range backplane or coax links that arise in systems such as scalable multiprocessor servers and high-speed routers and switches.

|               | Parallel Buses | Serial Links |
|---------------|----------------|--------------|
| Examples      | HyperTransport; RapidIO; SPI-4; PCI-X | PCI Express (2.5, 5 Gbps); Serial ATA (1.5, 3, (6) Gbps); Serial RapidIO (3.125 Gbps); XAUI (4×3.125 Gbps) |
| Advantages    | Lower power, area, and latency | Higher speeds and longer distances; differential signaling; easy board layout |
| Disadvantages | Trace matching requirement; skew problem; high pin count; single-ended signaling | Overhead of CDR and encoding |

Table 1-1 Comparison of parallel bus and serial link

### 1-2. Channel Characteristics

Although the need for higher backplane bandwidth keeps increasing, system vendors are reluctant to deploy new backplanes because of the high development cost and the large installed base of legacy systems. These backplanes include long copper traces on an FR-4 (flame retardant 4) dielectric, connectors, and chip package parasitics, as shown in Fig. 1-4.



Figure 1-4 Backplanes

As the data rate increases, dielectric loss, the skin effect, and reflections in the channel cause frequency-dependent loss, as shown in Fig. 1-5. Equivalently, in the time domain, they disperse the transmitted symbols, causing them to interfere with adjacent symbols, as shown in Fig. 1-6.



Figure 1-5 Frequency response of the backplane



Figure 1-6 Input and output pulse after channel transmission
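The time-domain dispersion described above can be illustrated with a small numerical sketch. It assumes a single-pole low-pass channel, which is a deliberate simplification of the measured backplane response, and the bit period and time constant are hypothetical values chosen only for illustration.

```python
import math


def pulse_response_samples(bit_period, tau, n_samples=4):
    """Sample the response of a single-pole low-pass channel to one
    rectangular bit of width bit_period at t = T, 2T, 3T, ...

    The first sample is the main cursor; the others are post-cursor ISI.
    """
    main = 1.0 - math.exp(-bit_period / tau)  # value reached at the end of the bit
    return [main * math.exp(-k * bit_period / tau) for k in range(n_samples)]


# Hypothetical numbers: 2.5 Gbps (T = 400 ps) and a 300 ps channel time constant.
T = 400e-12
TAU = 300e-12
samples = pulse_response_samples(T, TAU)
main_cursor, post_cursors = samples[0], samples[1:]

for k, value in enumerate(samples):
    print(f"t = {k + 1}T: {value:.3f}")
print(f"post-cursor sum / main cursor = {sum(post_cursors) / main_cursor:.3f}")
```

In this simplified model, the energy that leaks into the samples after the main cursor is what appears as interference on the following bits.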

When PRBS (Pseudo-Random Bit Sequence) data is transmitted through the channel, the ISI (Inter-Symbol Interference) closes the eye-diagram of the channel output signal, which makes clock and data recovery difficult and results in a high BER (Bit Error Rate) [1, 2]. Fig. 1-7 shows the eye-diagrams of the input data and the channel output data.







Figure 1-7 Eye-diagrams of (a) input and (b) output data after transmission

### 1-3. Equalizing Filters

To fully utilize the limited bandwidth of the channel, multi-level signaling such as four-level pulse amplitude modulation (PAM-4) can be used [3-7]. However, it incurs large reflection and crosstalk penalties, resulting in a small signal-to-noise ratio (SNR). Instead, many equalizing techniques have been used in chip I/O to remove the ISI or to compensate for the frequency-dependent loss [8-28]. Two types of analog equalizing filter are widely used in high-speed data links: the FFE (Feed-Forward Equalizer) and the DFE (Decision Feedback Equalizer).

One type of FFE is the tap delay line filter shown in Fig. 1-8. It can be implemented in either the transmitter or the receiver. In the transmitter, it is called pre-emphasis, and it is easier to implement there than at the receiver because the parallel data bus naturally supplies the data inputs for the filter [9]. However, it aggravates crosstalk, and it needs channel information from the receiver to adapt to varying channel characteristics.

On the receiver side, it can be implemented with discrete-time or continuous-time filters. Discrete-time filters realize the tap delay line with sample-and-hold (S&H) circuits [15, 17]. However, S&H circuits have drawbacks such as limited settling time, distortion, and nonlinearity. Continuous-time filters realize the tap delay line with an artificial transmission line made of on-chip planar inductors [13, 29], which takes a large chip area.



Figure 1-8 Tap delay line filter
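As a rough illustration of how a tap delay line filter removes ISI, the sketch below applies a two-tap FIR filter to a sampled channel pulse response. The pulse response values and tap weights are hypothetical and are not taken from the dissertation.

```python
def fir_filter(x, taps):
    """Tap delay line: y[n] = sum_k taps[k] * x[n - k], with zero initial state."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, c in enumerate(taps):
            if n - k >= 0:
                acc += c * x[n - k]
        y.append(acc)
    return y


# Hypothetical bit-spaced channel pulse response: main cursor 1.0,
# followed by post-cursors 0.5 and 0.25.
channel_pulse = [1.0, 0.5, 0.25, 0.0, 0.0]

# Hypothetical two-tap equalizer: the second tap subtracts half of the previous sample.
taps = [1.0, -0.5]

equalized_pulse = fir_filter(channel_pulse, taps)
print(equalized_pulse)
```

With these numbers the first two post-cursors are cancelled and only a smaller residual remains further out, which is the usual trade-off of a short tap delay line.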

Another FFE structure is the continuous-time equalizer using a split-path amplifier [14, 16, 21, 22]. The split-path amplifier divides the signal into two paths, as shown in Fig. 1-9. One path contains a high pass filter or a peak response filter that amplifies the high-frequency components. The other path is an all-pass filter that matches the time delay of the first path. The weighted sum of the two paths is equivalent to a variable-gain high pass filter whose gain factor can be varied by controlling the weights of the two paths.



Figure 1-9 Split-path equalizer

Although the equalizing filters mentioned above compensate for the frequency-dependent loss when they are implemented in the receiver, they also boost crosstalk and noise components at those frequencies. The DFE, in contrast, does not boost noise [19, 20, 24-26]. It has a structure similar to that of a tap delay line filter, as shown in Fig. 1-10. Because it uses the decided data values to cancel the ISI components at the decision time, the noise components are not amplified.



Figure 1-10 Decision feedback equalizer
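The idea of cancelling ISI with already-decided data can be sketched with a one-tap DFE. The example below is a minimal behavioral model, assuming a noiseless channel with a single hypothetical post-cursor of 0.5; the function and variable names are illustrative only.

```python
import random


def run_dfe(received, tap):
    """One-tap DFE: subtract tap * previous decision, then slice to +/-1."""
    decisions, equalized = [], []
    prev = 0  # no decision is available before the first bit
    for r in received:
        y = r - tap * prev          # cancel the first post-cursor
        d = 1 if y >= 0 else -1     # decision (slicer)
        equalized.append(y)
        decisions.append(d)
        prev = d
    return decisions, equalized


random.seed(0)
bits = [random.choice((-1, 1)) for _ in range(200)]

H1 = 0.5  # hypothetical first post-cursor of the channel
received = [b + H1 * (bits[i - 1] if i > 0 else 0) for i, b in enumerate(bits)]

decisions, equalized = run_dfe(received, tap=H1)
print("worst-case margin without DFE:", min(abs(r) for r in received))
print("worst-case margin with DFE:   ", min(abs(y) for y in equalized))
```

Because the correction term is built from sliced decisions rather than from the noisy input, it adds no noise of its own, which is the property the text relies on.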

### 1-4. Outline

This dissertation focuses on two types of equalizer: the FFE using a split-path amplifier and the DFE. Section 2 presents a new adaptation circuit for the FFE using split-path amplifiers. It is a very simple, digitally controlled adaptation method that uses the phase detector outputs of the clock and data recovery circuit. Section 3 describes a DFE with data pattern filtering, which modifies the phase detector in the clock and data recovery circuit to relieve the additional pattern-dependent jitter that results from the stringent timing requirement of the first post-cursor cancellation.

# 2. Feed-Forward Equalizer using Phase Detector Output

#### 2-1. Adaptation Methods

An adaptation circuit that controls an equalizer using split-path amplifiers is typically configured as in Fig. 2-1. Using passive filters, it compares the power ratio at two specific frequency points, as shown in Fig. 2-2, either to set the weighting factors of the two paths as in Fig. 2-1 (a) [16] or to set the weighting factor of one path and the reference signal swing as in Fig. 2-1 (b) [21]. A regenerative circuit generates the reference data spectrum at point B. The feedback circuits with spectral filtering then compare the equalizer output data spectrum at point A with the reference data spectrum at point B and control the equalizer or the regenerative circuit to compensate for the high-frequency loss. Two control paths exist because optimum signal integrity requires compensation of the high-frequency gain relative to the low-frequency gain: ISI results from high-frequency loss relative to the low-frequency loss, and over-compensation of the high frequencies also results in ISI. However, these spectral filtering methods have some disadvantages. The regenerative circuits have a speed limit in generating sharp-edged reference data, and the adaptation circuits take a large area because of the passive components in the filters, the AC coupling, and the integrator.



(a)



(b)

Figure 2-1 Adaptation methods with spectral filtering



Figure 2-2 Frequency spectrum comparison

Digital control with the LMS (Least Mean Square) algorithm can also be considered for the split-path equalizer, as shown in Fig. 2-3 [25]. However, amplitude information alone is not sufficient to extract an error signal for controlling the two paths because, unlike the taps of a tap delay line filter, the two paths are correlated. As a result, the adaptation converges to different filter weightings depending on the loop conditions and parameters.



Figure 2-3 Adaptation with LMS algorithm

Another adaptation method, which uses data edge information, was introduced in [26, 30] and has been applied to tap delay line filters and DFEs to optimize each tap coefficient. It can also be applied to a split-path equalizer with two correlated paths, as shown in Fig. 2-4. Because it extracts the error information from the data edges and does not need a low-frequency reference, it only needs to increase or decrease the high-frequency gain relative to the low-frequency gain. This exploits the property that minimizing the jitter at the differential crossing point also maximizes the eye height.



Figure 2-4 Adaptation with phase detector output

Fig. 2-5 explains the adaptation criteria. If the leading edge is late while the trailing edge is early, the eye is too wide and the signal is over-equalized. If the leading edge is early and the trailing edge is late, the eye is too narrow and the signal is under-equalized. This information is used to control the equalizing filter gain: a digital counter counts up or down according to these criteria, and its value sets the equalizer filter gain.

Because this information is obtained from the CDR (Clock and Data Recovery) circuit, no extra analog circuits are needed for adaptation. Instead, the adaptation circuit consists of digital blocks: an equalizer gain error decision block, a binary counter, and an equalizer control circuit. The resulting structure is simple and occupies a small area.



Figure 2-5 Equalizer gain error criteria
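The gain error criteria can be written as a small decision function followed by a saturating up/down counter. The sketch below follows the criteria as stated in the text. Three things are assumptions made for illustration and are not described in this section: a larger code is taken to mean more high-frequency boost, the counter is 4 bits wide, and the counter holds its value when both edges are early or both are late (treated here as a pure phase error left to the CDR loop).

```python
def gain_error(leading_edge, trailing_edge):
    """Return -1 (reduce boost), +1 (increase boost) or 0 (hold).

    Each argument is the string "early" or "late", as reported by the
    phase detector for that data edge.
    """
    if leading_edge == "late" and trailing_edge == "early":
        return -1  # over-equalized according to the stated criteria
    if leading_edge == "early" and trailing_edge == "late":
        return +1  # under-equalized according to the stated criteria
    return 0       # assumption: both edges agree, so only the clock phase is off


def update_code(code, error, n_bits=4):
    """Saturating up/down counter holding the equalizer control code."""
    max_code = (1 << n_bits) - 1
    return max(0, min(max_code, code + error))


code = 7
for edges in [("early", "late"), ("early", "late"), ("late", "early"), ("late", "late")]:
    code = update_code(code, gain_error(*edges))
    print(edges, "->", code)
```

Saturating rather than wrapping the counter avoids a jump from maximum to minimum boost when the loop reaches the end of its range.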

The loop dynamics can be analyzed with a continuous-time approximation. Fig. 2-6 shows the linearized model of the feedback system. Gr(s), Gs(s) and Ge(s) represent the required equalizer gain, the supplied equalizer gain and the equalizer gain error, respectively. Using these loop variables, the output transfer function can be derived as

$$\frac{Ge(s)}{Gr(s)} = \frac{1}{1 + s/p} \tag{2-1}$$

where p is the pole of the loop. Because the loop is a first-order system, its response is always stable. The equalizer gain error decays exponentially with a time constant of 1/p for any required equalizer gain. In the actual digital implementation, however, the "bang-bang" nature of the gain error decision circuit and the feedback control results in dithering around the zero-gain-error point.



Figure 2-6 Linearized model with continuous time approximation
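The dithering of the digital loop can be reproduced with a toy model in which the detector reports only the sign of the gain error and a counter moves one code per update. This is a sketch under simplifying assumptions (a hypothetical 4-bit code, a noiseless error decision, and an optimum that lies between two codes); it is not the behavioral model used in the dissertation.

```python
def simulate_bang_bang(required_code, start_code=0, steps=40, n_bits=4):
    """Track required_code with a +/-1 up/down counter (bang-bang control)."""
    max_code = (1 << n_bits) - 1
    code = start_code
    history = [code]
    for _ in range(steps):
        # The detector only reports whether the supplied gain is too low or too high.
        step = 1 if code < required_code else -1
        code = max(0, min(max_code, code + step))
        history.append(code)
    return history


# Hypothetical optimum between codes 9 and 10.
history = simulate_bang_bang(required_code=9.5)
print(history)
```

Unlike the linearized model, the counter approaches the optimum at a fixed rate of one code per update and then toggles between the two neighbouring codes instead of settling.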

#### 2-2. Behavioral simulation

The operation of the proposed adaptation method is first verified by behavioral simulation. Fig. 2-7 shows the behavioral model built with CPPSIM. The transmission channel is modeled by a simple low pass filter with three poles. The high-frequency path of the split-path equalizing filter is realized with a high pass filter, and its path weighting is controlled by a multiplying factor. The half-rate CDR circuit is modified from the default bang-bang CDR in CPPSIM. The phase detector in the CDR is also modified at the gate level to output the control signal for the equalizing filter, as shown in Fig. 2-8.



Figure 2-7 Behavioral model with CPPSIM



Figure 2-8 Modified phase detector in CDR

Fig. 2-9 shows the simulation result: the filter coefficient converges to and then dithers around the optimum value. The recovered half-rate quadrature clocks are shown in Fig. 2-10. The data rate is assumed to be 5 Gbps, and the eye-diagram of the input data with ISI is shown in Fig. 2-11. The eye-diagram of the equalized data after the filter coefficient has converged is shown in Fig. 2-12. The equalizer is verified to operate successfully under various conditions and parameters.



Figure 2-9 Convergence of filter coefficient



Figure 2-10 Recovered clock



Figure 2-11 Eye-diagram of input data with ISI



Figure 2-12 Eye-diagram of equalized output

#### 2-3. Circuit Design

#### 2-3-1. System Structure

The system block diagram is shown in Fig. 2-13. The received input data with ISI first goes through the equalizing filter. The output of the equalizing filter is then fed into the phase detector in the CDR and into the output buffer for monitoring the signal quality. A dual-loop CDR is used for clock recovery. The PLL supplies the reference quadrature clocks to the phase interpolator. The phase detector in the CDR outputs the C\_UP/C\_DN signals to control the equalizing filter gain in addition to the UP/DN signals that control the generated clock phase. All digital control circuits operate at the PLL clock frequency divided by 8.



Figure 2-13 System block diagram

#### 2-3-2. Equalizing Filter

For the equalizing filter, a feed-forward equalizer with a split-path amplifier is used because it is simpler and easier to realize in the receiver than the tap delay line filter. A passive high pass filter is used in the high-frequency boosting path. If the equalizing filter is simplified as in Fig. 2-14, its transfer function is given by Eq. (2-2). The low-frequency gain depends on the value of $\alpha$. If $\omega_{HPF}$ and $\alpha$ are constant, the zero frequency, and hence the high-frequency gain, depends on the value of $\beta$, as shown in Eq. (2-3). The gain curves of the equalizer filter are shown in Fig. 2-15 for different values of $\beta$.

$$H(s) = \frac{(\alpha + \beta)s + \alpha \cdot \omega_{HPF}}{s + \omega_{HPF}} \tag{2-2}$$

$$s_{zero} = \frac{\alpha}{\alpha + \beta} \omega_{HPF} \tag{2-3}$$
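Eqs. (2-2) and (2-3) can be checked numerically. The sketch below evaluates the magnitude of H(s) on the imaginary axis and the zero frequency for a few values of β. The values of α and β are hypothetical; the 1 GHz high pass corner is the approximate value mentioned in this section.

```python
import math


def split_path_gain(freq_hz, alpha, beta, f_hpf_hz):
    """|H(j*2*pi*f)| for H(s) = ((alpha + beta) * s + alpha * w_hpf) / (s + w_hpf)."""
    s = 1j * 2 * math.pi * freq_hz
    w_hpf = 2 * math.pi * f_hpf_hz
    return abs(((alpha + beta) * s + alpha * w_hpf) / (s + w_hpf))


def zero_frequency_hz(alpha, beta, f_hpf_hz):
    """Zero location from Eq. (2-3), expressed in Hz."""
    return alpha / (alpha + beta) * f_hpf_hz


ALPHA = 1.0    # hypothetical all-pass path weight
F_HPF = 1e9    # high pass corner of about 1 GHz

for beta in (0.5, 1.0, 2.0):  # hypothetical high-frequency path weights
    dc = split_path_gain(0.0, ALPHA, beta, F_HPF)
    hf = split_path_gain(100e9, ALPHA, beta, F_HPF)
    fz = zero_frequency_hz(ALPHA, beta, F_HPF)
    print(f"beta={beta}: DC gain={dc:.2f}, gain at 100 GHz={hf:.2f}, zero at {fz / 1e9:.2f} GHz")
```

The low-frequency gain stays at α, while the high-frequency gain approaches α + β and the zero moves to lower frequencies as β grows.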



Figure 2-14 Equalizer filter using split path



Figure 2-15 Equalizer gain curve for path gain

In the high-frequency path, the passive high pass filter shown in Fig. 2-16 is implemented without any additional amplifier, and the all-pass path contains no amplifier either. The cutoff frequency of the high pass filter is determined by its R and C values and is set to around 1 GHz so that the filter gain peaks near a frequency of half the data rate. The 50 Ohm resistance in the figure is the termination resistance at the input stage. The weightings of the two paths are controlled by the current combining ratio at the combiner in Fig. 2-17. Because there is no amplifier in either path, the delay mismatch between the two paths is negligible.



Figure 2-16 High pass filter



Figure 2-17 Combiner
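For a first-order RC high pass filter the cutoff frequency is 1/(2πRC), so a corner near 1 GHz fixes the RC product. The sketch below computes the capacitance for a given resistance. The 1 kΩ resistance is a hypothetical value chosen only for illustration; the actual component values are not given in this section.

```python
import math


def rc_cutoff_hz(r_ohm, c_farad):
    """Cutoff frequency of a first-order RC filter."""
    return 1.0 / (2 * math.pi * r_ohm * c_farad)


def capacitance_for_cutoff(r_ohm, f_c_hz):
    """Capacitance that places the RC cutoff at f_c_hz."""
    return 1.0 / (2 * math.pi * r_ohm * f_c_hz)


R = 1_000.0      # hypothetical filter resistance in ohms
F_C = 1e9        # target cutoff of about 1 GHz
C = capacitance_for_cutoff(R, F_C)

print(f"C = {C * 1e15:.0f} fF for R = {R:.0f} ohm")
print(f"check: cutoff = {rc_cutoff_hz(R, C) / 1e9:.3f} GHz")
```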

The current controller of the combiner is shown in Fig. 2-18. Because this adaptation method has no reference for the amplitude or the low-frequency gain, the gains of the two paths can be controlled simultaneously in opposite directions by the control code. The sum of the bias currents of the two differential pairs in the combiner is then constant for all control codes, so the common-mode voltage of the equalizer output is also constant. Fig. 2-19 shows the gain curves of the equalizer filter for the 4-bit input control codes. Because the weightings of the two paths are controlled differentially, the gain curve has a wider dynamic range within the limited voltage swing than when only one path is controlled.



Figure 2-18 Equalizer control circuit



Figure 2-19 Equalizer gain curve for input code
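The complementary control of the two path weights can be sketched as a fixed tail current split by the 4-bit code. The linear code-to-current mapping and the convention that a larger code steers more current to the high pass path are assumptions made for illustration; the actual control circuit is the one in Fig. 2-18.

```python
def combiner_currents(code, i_total=1.0, n_bits=4):
    """Split a fixed total bias current between the two differential pairs.

    Returns (high_pass_current, all_pass_current). Their sum is always
    i_total, which keeps the output common-mode level constant.
    """
    max_code = (1 << n_bits) - 1
    if not 0 <= code <= max_code:
        raise ValueError("control code out of range")
    i_high = i_total * code / max_code
    i_all = i_total - i_high
    return i_high, i_all


for code in (0, 5, 10, 15):
    i_high, i_all = combiner_currents(code)
    print(f"code={code:2d}: high-pass={i_high:.3f}, all-pass={i_all:.3f}, sum={i_high + i_all:.3f}")
```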

#### 2-3-3. PLL (Phase Locked Loop)

To supply the reference half-rate quadrature clocks to the CDR circuit, a PLL is constructed as shown in Fig. 2-20. An external reference clock is provided to the PFD (Phase and Frequency Detector). The PFD compares the phase and frequency of the reference clock with those of the oscillator clock divided by 16 and provides UP/DN pulses to the charge pump. The charge pump then provides a control voltage to the VCO controller through the external loop filter.



Figure 2-20 PLL block diagram
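The clock frequencies implied by these divider ratios can be worked out with a few lines of arithmetic. The sketch assumes the prototype data rate of 2.5 Gbps quoted in the abstract and assumes that the "PLL clock" used by the digital control circuits is the half-rate oscillator clock; both are interpretations rather than values stated in this section.

```python
def clock_plan(data_rate_bps, ref_divider=16, digital_divider=8):
    """Derive clock frequencies for a half-rate architecture."""
    vco_hz = data_rate_bps / 2  # half-rate clocking: one clock period spans two bits
    return {
        "vco_hz": vco_hz,
        "reference_hz": vco_hz / ref_divider,    # PFD compares the reference with VCO / 16
        "digital_hz": vco_hz / digital_divider,  # digital control runs at PLL clock / 8
    }


plan = clock_plan(2.5e9)
for name, value in plan.items():
    print(f"{name}: {value / 1e6:.3f} MHz")
```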

For the VCO (Voltage Controlled Oscillator), four stage ring oscillator is chosen to provide a wide tuning range to encompass process and temperature variations. Each delay stage of the VCO consists of a fast and a slow path whose outputs are summed together as shown in Fig. 2– 21 [31]. By steering the current between the fast and the slow paths, the amount of the delay achieved through each stage and hence the VCO frequency can be adjusted. Since the low supply voltage makes it difficult to stack transistors, the current variation is performed through mirror arrangement driven by PMOS differential pairs.

To alleviate the high VCO gain, the control of the VCO is split between a coarse control and a fine control. The fine control is provided from the charge pump and loop filter in the phase locked loop and the coarse control is provided externally to cover a large frequency tuning range.

The simulated VCO frequency under the separate coarse and fine controls is shown in Fig. 2-22. With this partitioning, the VCO gain in the PLL loop is about 860MHz/V.



Figure 2-21 VCO delay stage and controller



Figure 2-22 VCO gain partitioning (a) Coarse control (b) Fine control

If the charge pump output voltage is near the supply voltage or ground, the sourcing and sinking currents of the charge pump can become mismatched because of the difference in the drain-source voltages of the biasing PMOS and NMOS transistors. This current mismatch degrades the locking range and the jitter characteristics. To avoid it, the charge pump shown in Fig. 2-23 is used. The OTA in the charge pump keeps the sourcing current equal to the sinking current by varying the PMOS bias according to the output voltage. A rail-to-rail input stage is therefore used for the OTA to handle the wide range of the charge pump output voltage. Fig. 2-24 shows the simulated sourcing and sinking currents of the charge pump as the control voltage, V<sub>LPF</sub>, is varied. The current mismatch is negligible over a wide range of the control voltage.



Figure 2-23 Schematic of charge pump



Figure 2-24 Charge pump current vs control voltage

Since the charge pump delivers a single-ended output voltage, pseudo-differential control voltages are generated from the circuit shown in Fig. 2-25 to control the VCO controller differentially. The control voltage of the loop filter drives ctr+ and the positive port of the VCO controller, and the generated ctr- drives the negative port of the VCO controller.



Figure 2-25 Differential control voltage generator

### 2-3-4. CDR (Clock and Data Recovery)

Since many transceivers are often integrated on a single chip in multi-channel applications, the dual-loop CDR has several advantages. Because the dual-loop CDR generates the clock with a PI (Phase Interpolator) instead of a VCO, jitter is not accumulated in the phase-tracking process and several transceivers can share the reference PLL. It is also robust to noise and can be realized in a relatively small area without physical capacitors, because the PI is controlled digitally.

The block diagram of the CDR is shown in Fig. 2-26 [32]. The CDR receives two differential quadrature clocks from the reference PLL, and two adjacent clocks are selected to generate the desired clock phase. To keep the structure simple while achieving a high phase resolution, the target phase is generated from the quadrature reference clocks using the 16-level phase interpolator and the 4-level delay buffer, both of which are controlled by thermometer codes. The generated clock thus has a total resolution of 256 phases. The PD compares the phase of the generated clock with that of the data so that the controller can produce the control code for the next output phase. It generates an "UP" or "DN" pulse when the recovered clock is later or earlier than the data, respectively. The controller then decides the next phase that the phase selection circuit, PI, and delay buffer should produce by counting its state up or down. As a whole, the CDR forms a negative feedback loop and aligns the recovered clock to the input data.
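The way one controller state maps onto the phase selection circuit, the PI, and the delay buffer can be sketched as follows. This is an illustrative model only: it assumes that the 256 phases come from 4 selectable pairs of adjacent quadrature clocks, 16 PI levels, and 4 delay levels, with the phase selection as the coarsest field and the delay buffer as the finest. The names `decode_phase` and `next_state` and the counting direction for UP/DN are hypothetical.

```python
N_QUADRANTS, N_PI, N_DELAY = 4, 16, 4
N_PHASES = N_QUADRANTS * N_PI * N_DELAY   # 256 phase steps per clock period


def decode_phase(state):
    """Split a controller state into (quadrant, PI step, delay step)."""
    state %= N_PHASES
    quadrant, rest = divmod(state, N_PI * N_DELAY)
    pi_step, delay_step = divmod(rest, N_DELAY)
    return quadrant, pi_step, delay_step


def next_state(state, up, dn):
    """Count the state up or down by one step, wrapping around the clock period."""
    return (state + int(up) - int(dn)) % N_PHASES


state = 255
print(decode_phase(state))                 # last step of the last quadrant
state = next_state(state, up=True, dn=False)
print(decode_phase(state))                 # wraps to the first step
```

The wrap-around is what lets the loop track a frequency offset: the phase can keep rotating in one direction without reaching an end stop.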



Figure 2-26 Clock and data recovery circuit

The structure of the digitally controlled PI is shown in Fig. 2-27. It performs a weighted sum of two quadrature clock signals. The current DACs, whose currents set the weighting coefficients of the two clocks, are thermometer-coded. Although a binary-weighted DAC is simpler and more efficient, it has the drawback of causing current overshoot and dynamic phase jumps when switching.
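As a rough illustration of the weighted sum, the sketch below computes the output phase for ideal sinusoidal quadrature clocks. The sinusoidal assumption, the complementary weights, and the name `interpolated_phase_deg` are simplifications introduced here; the clocks in the actual circuit are not ideal sinusoids, so the real phase steps will differ.

```python
import math


def interpolated_phase_deg(k, n_levels=16):
    """Phase of w_i*cos(wt) + w_q*sin(wt) when k of n_levels unit currents
    are steered to the Q clock and the rest to the I clock (ideal sinusoids)."""
    if not 0 <= k <= n_levels:
        raise ValueError("k out of range")
    w_q = k / n_levels
    w_i = 1.0 - w_q
    return math.degrees(math.atan2(w_q, w_i))


# Phase moves from the I clock (0 degrees) toward the Q clock (90 degrees).
for k in (0, 4, 8, 12, 16):
    print(k, round(interpolated_phase_deg(k), 2))
```

With thermometer coding, only one unit current changes per step, so the weights move monotonically from one setting to the next.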



Figure 2-27 Phase interpolator

The delay buffer is realized with a current-starved inverter as shown in Fig. 2-28. Under the control code Dcode[0-3], which is also a thermometer code, it adds four additional phase steps between the phases generated by the PI.



Figure 2-28 Delay buffer

Fig. 2-29 shows the variation of the control codes for the PI and delay buffer during the clock recovery process, and Fig. 2-30 shows the jitter of the recovered clocks. In the simulation, the input data rate is 5Gbps and has a frequency offset of 50 ppm from the reference clock of the PLL.



Figure 2-29 Control code for the PI & delay buffer



Figure 2-30 Recovered I/Q clock

### 2-3-5. Adaptation circuit and phase detector

The half-rate bang-bang phase detector is modified as shown in Fig. 2-31 to provide the control information for the equalizing filter. Instead of obtaining early or late information from a single data edge, both edges of the data are used to obtain the phase information and to determine whether the data is over-equalized or under-equalized. Compared to the classical half-rate bang-bang phase detector, three more D flip-flops, two XOR gates, and two AND gates are added.

As shown in the data diagram of Fig. 2-31, the data and the edge information are sampled sequentially by the two quadrature clocks at both the rising and falling edges. They are then aligned at the rising edge of CLKi, and the two pairs UP1/DN1 and UP2/DN2 are calculated at the same time through the XOR gates. If the first edge outputs UP1 and the second edge outputs DN2, the data is over-equalized and the equalizer gain must be decreased. If the first edge outputs DN1 and the second edge outputs UP2, the data is under-equalized and the equalizer gain must be increased. By counting the equalizer gain state up and down with the C\_UP and C\_DN pulses, the equalizer gain is adaptively controlled through negative feedback.
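The decision rule described above can be written as a short behavioral sketch. The function names `eq_gain_decision` and `update_code` are hypothetical, and the saturation of the 4-bit code at its end values is an assumption; the text does not describe the counter implementation.

```python
def eq_gain_decision(up1, dn1, up2, dn2):
    """Return -1 for C_DN (over-equalized), +1 for C_UP (under-equalized), else 0."""
    if up1 and dn2:
        return -1      # first edge UP1, second edge DN2: decrease the gain
    if dn1 and up2:
        return +1      # first edge DN1, second edge UP2: increase the gain
    return 0           # both edges agree: ordinary phase error, no gain change


def update_code(code, decision, n_bits=4):
    """Count the gain state up or down, saturating at the code limits (assumed)."""
    return max(0, min((1 << n_bits) - 1, code + decision))


code = 7
code = update_code(code, eq_gain_decision(up1=0, dn1=1, up2=1, dn2=0))
print(code)   # gain state after one under-equalized observation
```

When both edges report the same direction, the information is a plain early or late indication for the CDR loop, which is why the sketch leaves the gain unchanged in that case.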

Since CLKi drives six D flip-flops and CLKq drives two D flip-flops, a clock tree is used to match the clock loading and the delay, as shown in Fig. 2-32.



Figure 2-31 Modified phase detector



Figure 2-32 Clock tree

Fig. 2-33 shows the simulated equalizer control code when the input data has ISI. The control code converges to a value that gives the optimum signal integrity.



Figure 2-33 Control code for equalizing filter

## 2-4. Simulation Results

The backplane channel model for the ISI simulation includes an output buffer, a transmission line, and a package, as shown in Fig. 2-34.



Figure 2-34 Channel model

The transmission line is designed as a stripline with the dimensions shown in Fig. 2-35, which give a differential impedance of 100 $\Omega$, and is simulated with the HSPICE w-model. The package parasitics are modeled as a series-connected R-L-C network as shown in Fig. 2-36, which represents the bondwire resistance, bondwire inductance, bondwire capacitance, and die pad capacitance.



Figure 2-35 PCB stripline



Figure 2-36 Package model

The frequency responses of the channel model for various transmission lengths are shown in Fig. 2-37. At 2.5GHz, the channel loss is about 25dB after 120cm of transmission.



Figure 2-37 Frequency responses of channel model

Fig. 2-38 shows the simulation results of the equalizer after the control code for the equalizing filter has converged through the adaptation process. The first eye diagram represents the input signal to the equalizing filter after transmission over a 40cm PCB trace, and the second eye diagram represents the equalizer output signal after the adaptation process. As the transmission length increases, the improvement in signal integrity provided by the equalizer becomes more notable. Fig. 2-39 and Fig. 2-40 show the results for 80cm and 120cm transmission. Although the equalizer control code reaches its maximum value at 120cm, the equalizer output still shows large decision and timing margins.



Figure 2-38 (a) Input signal after 40cm transmission, (b) equalizer output signal, and (c) equalizer control code









Figure 2-39 (a) Input signal after 80cm transmission, (b) equalizer output signal, and (c) equalizer control code









Figure 2-40 (a) Input signal after 120cm transmission, (b) equalizer output signal, and (c) equalizer control code

Fig. 2-41 shows the ISI jitter simulation results for the various transmission lengths. The peak-to-peak jitter induced by ISI increases as the PCB transmission length increases. However, the jitter of the equalizer output signal is dramatically lower because the equalizer adaptively compensates for the high-frequency loss. The equalizer output shows ISI jitter similar to that of the input at 10cm because the control code of the equalizing filter dithers around the first equalizer gain step. If the number of bits of the control code is increased, this dithering effect will be reduced.



Figure 2-41 ISI jitter simulation for various transmission lengths

## 2-5. Measurement Results

The chip is designed and fabricated in a 0.13µm CMOS process. Fig. 2-42 shows the photomicrograph of the prototype chip. The entire circuit occupies a die area of 500µm x 500µm excluding the output buffers. The equalizing filter and the controller occupy only about 100µm x 200µm. The power consumption of the entire circuit is approximately 48mW at a 1.2V supply voltage excluding the output buffers, of which the equalizing filter and the controller consume only about 6mW.

The experimental setup is configured as shown in Fig. 2-43. The die is bonded directly to the test board. A $2^7-1$ pseudorandom bit sequence (PRBS) pattern with a 500mV peak-to-peak swing is applied to the input of the PCB trace board, which has traces of various lengths. Fig. 2-44 shows the frequency responses of the PCB traces with transmission lengths from 40cm to 160cm. The dip in the frequency response is due to the impedance mismatch between the PCB trace and the connector cable. The channel loss of the 160cm PCB trace is more than 20dB at 1.25GHz.

The output of the PCB trace, which contains ISI, is applied to the equalizer input, and the equalizer output is monitored with the signal integrity analyzer. A reference clock from an RF signal generator is also applied to the PLL in the prototype. The PLL output clock is monitored with the spectrum analyzer, and the recovered clock is monitored with the signal integrity analyzer.



Figure 2-42 Chip layout



Figure 2-43 Experimental setup



Figure 2-44 Frequency responses of PCB trace

Fig. 2-45 shows the PLL output clock when a reference clock of 78.125MHz is applied to the PLL circuit in the prototype. Because the input reference clock is compared with the VCO clock divided by 16, the PLL outputs a 1.25GHz clock, which feeds the CDR circuit. Fig. 2-46 shows the recovered clock from the CDR circuit when 2.5Gbps input data is applied to the equalizer circuit; it has an rms jitter of 21.2ps. Because the CDR circuit operates at the half-rate frequency, the recovered clock is displayed like an eye on the signal integrity analyzer.
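The clock arithmetic in this paragraph can be checked with a few lines. The reference frequency, divider ratio, and rms jitter are taken from the text; expressing the jitter in unit intervals (UI) is an added convenience, not a figure quoted in the measurement.

```python
f_ref_hz = 78.125e6                  # reference clock applied to the PLL
divider = 16                         # feedback divider ratio
f_pll_hz = f_ref_hz * divider        # locked PLL output frequency

# Half-rate clocking: two bits are sampled in each clock period.
data_rate_bps = 2 * f_pll_hz
ui_s = 1.0 / data_rate_bps           # one unit interval in seconds

jitter_rms_s = 21.2e-12              # measured rms jitter of the recovered clock
jitter_rms_ui = jitter_rms_s / ui_s  # same jitter as a fraction of one bit time

print(f_pll_hz / 1e9, "GHz clock")
print(data_rate_bps / 1e9, "Gbps data")
print(round(jitter_rms_ui, 3), "UI rms")
```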

Fig. 2-47 (a) shows the eye diagram of the 2.5Gbps input signal after the 80cm PCB trace, and Fig. 2-47 (b) shows the eye diagram of the equalized output signal of the prototype chip. Because the measured output includes the frequency-dependent loss of the output buffer, the equalized signal inside the chip is expected to show less jitter. Fig. 2-48 and Fig. 2-49 show the test results of the prototype chip for PCB trace lengths of 120cm and 160cm, respectively. Because of the limited gain of the equalizer and the limited bandwidth of the output buffer, the equalizer output has residual ISI jitter.

The measured data input jitter and equalizer output jitter at 2.5Gbps are summarized in Fig. 2-50 for various PCB trace lengths and compared with the minimum jitter obtained when the equalizer is controlled with an external code setting. Although the equalizer gain is too limited to completely compensate for the frequency-dependent loss at long transmission lengths, the results show that the equalizer circuit operates adaptively to minimize the ISI jitter for different channel lengths and conditions. The equalizer output jitter is also measured for various input voltage swings in Fig. 2-51. Because the gain error is determined digitally from the phase detector, the adaptation circuit operates robustly at low input voltage swings.

One likely reason why the operating speed is lowered to 2.5Gbps is the residual parasitics that are not included under the parasitic extraction options used. This can be observed from the lowered maximum VCO frequency. In addition, the small number of power pads in the whole circuit might limit the current supply to the core and the output buffer circuits, which results in a low output voltage swing and large jitter in the recovered clock. As a result, the equalizer circuit operates at a lower speed than the target data rate and provides limited high-frequency gain around 1.25GHz, half the data rate.

The circuit performance is summarized in Table 2-1 and compared in Table 2-2 with prior works that used spectrum filtering.



Figure 2-45 PLL output clock



Figure 2-46 Recovered clock from CDR







Figure 2-47 (a) Eye diagram of input data with ISI after 80cm PCB trace (b) Eye diagram of equalizer output






Figure 2-48 (a) Eye diagram of input data with ISI after 120cm PCB trace (b) Eye diagram of equalizer output







Figure 2-49 (a) Eye diagram of input data with ISI after 160cm PCB trace (b) Eye diagram of equalizer output



Figure 2-50 Equalizer output jitters with various PCB lengths



Figure 2-51 Equalizer output jitters with various input data swings after adaptation

| Parameter         | Value                           |
|-------------------|---------------------------------|
| Process           | 0.13µm CMOS                     |
| Data rate         | 2.5Gbps                         |
| Jitter (rms)      | < 23ps (PCB trace up to 160cm)  |
| Power consumption | PLL: 16mW (@1.2V)               |
|                   | CDR: 26mW (@1.2V)               |
|                   | Eq.+Ctrl.: 6mW (@1.2V)          |
| Chip size         | PLL: 200µm x 200µm              |
|                   | CDR: 500µm x 200µm              |
|                   | Eq.+Ctrl.: 100µm x 250µm        |

Table 2-1 Performance summary

|                   | [16]                                         | [21]                                         | [27]                                        | This Work                   |
|-------------------|----------------------------------------------|----------------------------------------------|---------------------------------------------|-----------------------------|
| Data rate         | 3.5Gbps                                      | 10Gbps                                       | 20Gbps                                      | 2.5Gbps                     |
| Technology        | 0.18µm CMOS                                  | 0.13µm CMOS                                  | 0.13µm CMOS                                 | 0.13µm CMOS                 |
| Power dissipation | 80mW                                         | 25mW                                         | 60mW                                        | 6mW                         |
| Power supply      | 1.8V                                         | 1.2V                                         | 1.5V                                        | 1.2V                        |
| Area              | 0.48×0.73 $\mathrm{mm}^2$ (active area only) | 0.45×0.36 $\mathrm{mm}^2$ (active area only) | 0.8×0.25 $\mathrm{mm}^2$ (active area only) | 0.1×0.25 $\mathrm{mm}^2$    |

Table 2-2 Performance comparisons

## 2-6. Conclusion

A simple, new adaptation method is proposed for the split-path equalizer. Instead of the classical spectral filtering method, the phase detector outputs in the CDR circuit are used to adaptively control the equalizing filter. The adaptation circuit block is digitally controlled and occupies a very small area. Its robust operation for various channel lengths is verified through simulations and measurements. The prototype is implemented in a 0.13µm CMOS process and operates adaptively at 2.5Gbps for PCB traces up to 160cm. The equalizer and the control circuit occupy only 0.1 x 0.25 mm<sup>2</sup> and consume about 6mW at a 1.2V power supply.

# 3. Decision Feedback Equalizer with Data Pattern Filtering

## 3-1. Stringent requirements for the first post-cursor cancellation

At multi-gigabit-per-second rates, the primary challenge in the DFE (Decision Feedback Equalizer) is feeding back the decisions quickly enough to implement the first filter tap, as shown in Fig. 3-1. Because of this speed limitation, some multi-gigabit-per-second DFEs have employed speculative or loop-unfolding techniques [19, 20, 24]. These approaches relax the timing requirement of the first-tap feedback by precomputing the equalized eye for either polarity of the prior input data, sampling both results, and choosing the proper result once the previous bit is known. However, this also introduces unwanted loading in the critical signal and clock paths and complicates the associated CDR circuit design [26].

This work presents a DFE in which the first tap is fed back directly from the input slicers without speculation. The direct DFE architecture avoids the extra loading and provides a single straightforward equalization approach for all taps. However, as mentioned before, the first-tap feedback delay causes additional pattern-dependent jitter in the equalized data eye diagram if the timing requirement expressed in Eq. (3-1) is not satisfied. The timing diagram is shown in Fig. 3-2. Since the input signal 'In' and the decided signal 'Do' are combined, 'Do' should settle to its final value before the next symbol transition.

Fig. 3-3 shows the behavioral simulation results with a large feedback delay at a constant clock phase aligned to the data center. The pattern-dependent jitter increases as the feedback delay increases. The degradation becomes more severe when the DFE operates with a CDR circuit, because the decision clock acquires more jitter by following the pattern-dependent jitter of the data, and the decided feedback signal is then delayed even more.

$$t_{comb} + t_{clk-q} + t_{settle} < \frac{T}{2} \qquad (3\text{-}1)$$
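As a quick numerical illustration of the timing requirement in Eq. (3-1), the sketch below checks whether a set of delays fits within half of the period $T$. The function name `first_tap_timing_met` and all numeric values are hypothetical examples, not design values from this work.

```python
def first_tap_timing_met(t_comb, t_clk_q, t_settle, period):
    """Check Eq. (3-1): combiner delay + clock-to-Q delay + settling time < T/2."""
    return t_comb + t_clk_q + t_settle < period / 2


T = 400e-12   # hypothetical period, so T/2 = 200ps

# 50ps + 60ps + 70ps = 180ps, which fits within 200ps.
print(first_tap_timing_met(50e-12, 60e-12, 70e-12, T))

# 50ps + 60ps + 100ps = 210ps, which violates the requirement.
print(first_tap_timing_met(50e-12, 60e-12, 100e-12, T))
```

When the check fails, the feedback arrives after the next transition has begun, which is the situation that produces the pattern-dependent jitter discussed in this section.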



Figure 3-1 DFE with first feedback tap



Figure 3-2 Timing requirement of the first feedback tap



Figure 3-3 Pattern dependent jitter induced from the feedback delay

## 3-2. Data pattern filtering

The data-pattern-dependent jitter induced by the decision feedback delay has two distinct transitions, which can be divided into two cases as shown in Fig. 3-4. In the first case, the feedback signal has a transition, so the input data around the transition edge receives a different feedback value from that at the center of the data eye, because the classical LMS algorithm for tap coefficient adaptation samples the center of the data eye for ISI cancellation. In the other case, the feedback signal has no transition, so the transition edge and the data center receive the same feedback compensation. These two cases therefore result in two distinct pattern-dependent jitters in the DFE.



Figure 3-4 Analysis of the feedback data pattern

To confirm the data pattern analysis, a behavioral simulation is performed with CPPSIM. First, when the '101010' pattern, which has a transition at every edge, is equalized through the DFE with the LMS algorithm, the equalized output is the thick black line in Fig. 3-5 (a). Second, when the '001001' pattern, which has a transition at every other edge, goes through the DFE, the output is the thick black pattern in Fig. 3-5 (b). The first transition results in an earlier edge and the second transition results in a later edge. Thus, the pattern can be separated into the two cases.



Figure 3-5 Behavioral simulation results for the pattern analysis

When the CDR circuit operates with a DFE that has a decision feedback delay, the decision clock is delayed because it follows the later edges, which delays the decision feedback even more. To alleviate this effect, data pattern filtering can be applied to the CDR circuit. Because the pattern-induced jitter in the DFE has two distinct patterns, the later pattern edge can be filtered out in the CDR circuit, so that the recovered clock follows only the earlier edge. Since the earlier edges result from decision feedback with no transition, those edges can be selected to determine the clock phase, as shown in Fig. 3-6.



| D <sub>0</sub> | D <sub>1</sub> | D <sub>2</sub> | E <sub>12</sub> | phase |
|----------------|----------------|----------------|-----------------|-------|
| 0              | 0              | 1              | 0               | early |
| 1              | 1              | 0              | 1               | early |
| 0              | 0              | 1              | 1               | late  |
| 1              | 1              | 0              | 0               | late  |

Figure 3-6 Phase detection table
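The phase detection table can be expressed as a small function. This sketch assumes that D<sub>0</sub>, D<sub>1</sub>, and D<sub>2</sub> are three consecutive data samples and that E<sub>12</sub> is the edge sample taken between D<sub>1</sub> and D<sub>2</sub>; the function name `filtered_phase_detect` is hypothetical. Any pattern not listed in the table produces no phase information.

```python
def filtered_phase_detect(d0, d1, d2, e12):
    """Return 'early', 'late', or None according to the phase detection table."""
    if d0 != d1:
        return None   # previous bit changed: edge shifted by feedback delay, filtered out
    if d1 == d2:
        return None   # no data transition between D1 and D2, nothing to measure
    # Edge sample still equal to the old bit: the clock sampled before the transition.
    return "early" if e12 == d1 else "late"


for pattern in [(0, 0, 1, 0), (1, 1, 0, 1), (0, 0, 1, 1), (1, 1, 0, 0), (1, 0, 1, 0)]:
    print(pattern, filtered_phase_detect(*pattern))
```

Only transitions preceded by two equal bits are used, which corresponds to the case where the feedback signal has no transition.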

## 3-3. Behavioral simulation

To verify the operation of the data pattern filtering in the DFE and CDR circuit, a behavioral model is constructed with CPPSIM as shown in Fig. 3-7. The simulated data rate is 5Gbps, and the transmission channel is modeled as a low-pass filter with three poles. The DFE has three feedback taps and the CDR circuit operates with a full-rate clock. To account for the feedback delay, delay units are inserted after the combiner and the DFF (D flip-flop).
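A three-pole low-pass channel of the kind used in the behavioral model can be approximated in a few lines. This Python sketch is not the CPPSIM model itself: the pole frequencies, the oversampling ratio, and the name `three_pole_lowpass` are hypothetical choices, and each pole is realized as a simple first-order discrete-time section.

```python
import math


def three_pole_lowpass(samples, pole_freqs_hz, sample_rate_hz):
    """Pass samples through a cascade of first-order low-pass sections."""
    out = list(samples)
    for f_pole in pole_freqs_hz:
        alpha = 1.0 - math.exp(-2.0 * math.pi * f_pole / sample_rate_hz)
        state = 0.0
        for i, x in enumerate(out):
            state += alpha * (x - state)   # one-pole IIR update
            out[i] = state
    return out


# Hypothetical setup: 5Gbps data, 16 samples per bit, three poles near 2GHz.
bit_rate = 5e9
samples_per_bit = 16
bits = [1, -1, 1, 1, -1, -1, -1, 1]
wave = [float(b) for b in bits for _ in range(samples_per_bit)]
rx = three_pole_lowpass(wave, [2e9, 2.5e9, 3e9], bit_rate * samples_per_bit)
print(len(rx), round(max(rx), 3), round(min(rx), 3))
```

An isolated bit in the output does not reach the full level within one bit time, which is the ISI that the DFE is meant to cancel.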



Figure 3-7 Behavioral model for DFE and CDR circuit

Fig. 3-8 shows the input signal with ISI after it has passed through the channel model.



Figure 3-8 Input signal with ISI

As the clock phase follows the data phase, the LMS algorithm drives the feedback tap coefficients to the optimum values that maximize the data margin at the sampling clock in the middle of the data eye. Fig. 3-9 shows the tap coefficients converging through the adaptation process.



Figure 3-9 Feedback tap coefficients

When the classical CDR circuit without pattern filtering is used, the DFE shows the results of Fig. 3-10 after adaptation. The equalized data signal and the recovered clock have a large amount of jitter. This is because the recovered clock follows the pattern-dependent jitter caused by the decision feedback delay, so the decision clock is delayed further, which causes even more pattern-dependent jitter.

However, when pattern filtering is applied to the CDR circuit, the jitter of the recovered clock is much smaller than that of the CDR without filtering. The eye diagram of the equalized data and the recovered clock is shown in Fig. 3-11. Because the CDR circuit filters out the later transition edge, which is caused by the delayed decision feedback, the recovered clock aligns with the earlier data transition edges on the left.



Figure 3-10 Behavioral simulation results without pattern filtering



Figure 3-11 Behavioral simulation results with pattern filtering

The additional jitter in the equalized data eye and the recovered clock means that the decision voltage margin is reduced, because decision errors become more probable as the clock gets closer to the data edges. This performance degradation becomes more severe as the decision feedback delay increases. The voltage margin is measured through behavioral simulations as the total feedback delay is increased. As shown in Fig. 3-12, the decision margin of the DFE without pattern filtering is dramatically reduced as the delay increases, because the recovered clock and the equalized eye have more jitter. However, the margin of the DFE with the pattern-filtered CDR is almost constant even for a large feedback delay.



Figure 3-12 Decision margin for decision feedback delay

## 3-4. Circuit design

### 3-4-1. System configuration

The system block diagram is shown in Fig. 3-13. To relieve the clock loading and distribution, a half-rate clock is used for the system. In addition to the DFE circuit, the FFE of Section 2 is placed in front of the DFE to cancel the pre-cursor and residual post-cursor ISI. In this prototype, the FFE is controlled externally, without the adaptation circuit. The input data signal with ISI passes through the FFE and the DFE in sequence. The equalized output then goes into the phase detector in the CDR circuit and into the adaptation circuit for the tap weight control. As explained in Section 2, the dual-loop CDR is used. The reference PLL supplies the differential quadrature clocks to the CDR circuit, and the recovered clocks from the CDR circuit are distributed to each block of the DFE and the adaptation circuit. The recovered clock and the decided data output are monitored at the chip output through the output buffer.



Figure 3-13 System block diagram

### 3-4-2. Decision feedback equalizer

The DFE circuit is constructed as shown in Fig. 3-14. It has a 2-tap decision feedback circuit and operates with the half-rate clock. A buffer between the combiner and the D flip-flop limits the equalized signal and aids regeneration in the decision circuit. Buffers between the combiner and the multiplexer also help sharpen the edges of the feedback signal and shorten the settling time.



Figure 3-14 Decision Feedback Equalizer

The schematic of the voltage combiner is shown in Fig. 3-15. The combining ratio, and hence the coefficient of each feedback tap, is controlled through the current DAC. The polarity of the first and second tap inputs is inverted to cancel the ISI components in the input signal. The schematics of the differential latch and the flip-flop are shown in Fig. 3-16.



Figure 3-15 Voltage combiner



Figure 3-16 (a) Differential latch and (b) flip-flop

### 3-4-3. Adaptation circuit

For the adaptation of the DFE circuit, the classical LMS (Least Mean Square) algorithm is used. To reduce the hardware complexity, the sign-sign LMS algorithm, which uses only the sign information of the error and the data, is implemented as shown in Fig. 3-17. The error signals are generated by slicing the equalizer output at the target ±1 levels. A long-term correlation of the sign error with the data polarity at a given delayed bit position indicates that ISI from that bit position is present at the sampling time. Integrating the feedback tap weights in the direction opposite to the sign of the long-term correlation leads to eventual convergence of the tap weights, which gives the minimum residual ISI and the maximum eye opening at the data sampling point. The algorithm runs at 1/8 of the clock rate to save power. To generate the sign error relative to the target swing, the differential offset buffers shown in Fig. 3-18 are used, and the swing level is controlled by the bias current.
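The sign-sign LMS update can be sketched behaviorally as below. This is a simplified model under stated assumptions, not the circuit of this work: the channel taps, step size `mu`, and the names `sign_sign_lms` and `sign` are hypothetical, the update runs on every bit rather than at 1/8 of the clock rate, and the tap weights are defined as values subtracted from the input, so a positive correlation between the error sign and a past decision increases the subtracted weight.

```python
import random


def sign(x):
    return 1 if x >= 0 else -1


def sign_sign_lms(channel, n_taps=2, mu=0.005, n_bits=5000, target=1.0, seed=1):
    """Adapt DFE taps using only the signs of the error and of past decisions."""
    rng = random.Random(seed)
    taps = [0.0] * n_taps
    tx = [1] * len(channel)      # transmitted symbols, newest first
    decided = [1] * n_taps       # past decisions, newest first
    for _ in range(n_bits):
        tx = [rng.choice((-1, 1))] + tx[:-1]
        x = sum(h * d for h, d in zip(channel, tx))         # received sample with ISI
        y = x - sum(w * d for w, d in zip(taps, decided))   # subtract fed-back ISI
        d_hat = sign(y)                                     # data decision
        err = sign(y - target * d_hat)                      # sign of error vs. target level
        taps = [w + mu * err * d for w, d in zip(taps, decided)]
        decided = [d_hat] + decided[:-1]
    return taps


# Hypothetical channel: main cursor 1.0 with post-cursors 0.4 and 0.2.
taps = sign_sign_lms([1.0, 0.4, 0.2])
print([round(w, 2) for w in taps])
```

In this model the taps settle near the post-cursor values and then dither by about one step size, which is the usual behavior of a sign-sign update.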



Figure 3-17 LMS circuit for adaptation



Figure 3-18 Differential offset buffer

### 3-4-4. Phase detector in CDR circuit

The dual-loop CDR structure explained in Section 2 is also applied to this DFE circuit, except for the phase detector. The phase detector with pattern filtering is implemented with differential logic circuits as shown in Fig. 3-19. The quadrature clocks, CLKi and CLKq, sequentially sample the data and the edges of the equalizer output signal, and the sampled values are aligned with the falling edge of CLKi at the last D flip-flops. They are then used to filter out the transition edges induced by the decision feedback delay, as explained in Section 3-1. The schematics of the differential logic circuits are drawn in Fig. 3-20.





Figure 3-19 Phase detector with pattern filtering



Figure 3-20 Differential (a) AND and (b) OR circuit

In addition to the seven D flip-flops in the phase detector, many D flip-flops and other blocks in the DFE and the adaptation circuit need the clock. Therefore, to drive the circuit blocks that need the synchronous clock, the clock trees for CLKi and CLKq are constructed as shown in Fig. 3-21. CLKi1, CLKi2 and CLKi3 are for the phase detector, dCLK1 and dCLK2 are for the DFE, and lCLK1 and lCLK2 are for the adaptation circuit. The transistor sizes of the D flip-flops, multiplexers, and dividers are determined in consideration of the clock loading. Also, to match the phase and delay between CLKi and CLKq, the same clock depth and loading are applied to the two clock trees.



Figure 3-21 Clock trees for clock distribution

## 3-5. Simulation results

To verify the performance improvement of the DFE circuit with pattern filtering, the DFE circuit without pattern filtering is also simulated and compared. Fig. 3-22 shows the eye diagram of the input data signal after transmission over a 100cm PCB trace. The data rate is 5Gbps and the equalizer gain setting of the FFE in the system is set to 0000.

Fig. 3-23 (a) shows the eye diagram of the DFE output when data pattern filtering is not applied, whereas Fig. 3-23 (b) shows the eye diagram of the DFE output when data pattern filtering is applied. With pattern filtering, the jitter of the recovered clock is reduced and larger data and timing margins are obtained.



Figure 3-22 Eye-diagram of input data signal with ISI



Figure 3-23 Eye-diagram of equalizer output and recovered clock

## 3-6. Summary

The chip is designed and fabricated in a 0.13µm CMOS process, and the layout of the circuit is shown in Fig. 3-24. The entire circuit occupies a die area of 500µm x 1100µm excluding the output buffers. The power consumption of the entire circuit is about 72mW at a 1.2V supply voltage excluding the output buffers. The circuit performance is summarized in Table 3-1.



Figure 3-24 Chip layout

| Parameter                                       | Value                    |
|-------------------------------------------------|--------------------------|
| Process                                         | 0.13µm CMOS              |
| Data rate                                       | 5Gbps                    |
| Simulated jitter (pk-pk) of the recovered clock | 20ps                     |
| Power consumption                               | PLL: 16mW (@1.2V)        |
|                                                 | CDR: 26mW (@1.2V)        |
|                                                 | DFE+Ctrl.: 30mW (@1.2V)  |
| Chip size                                       | PLL: 200µm x 200µm       |
|                                                 | CDR: 200µm x 500µm       |
|                                                 | DFE+Ctrl.: 200µm x 800µm |

Table 3-1 Performance summary

## 3-7. Conclusion

For the decision feedback equalizer, a data pattern filtering method for the clock and data recovery circuit is proposed to relieve the pattern-dependent jitter induced by the feedback delay. The filtering method is applied to the phase detector in the clock and data recovery circuit and makes the recovered clock follow only the transition edges of a certain data pattern, which prevents the clock from being delayed further through the clock recovery process. The decision feedback equalizer circuits with data pattern filtering are implemented in a 0.13µm CMOS process at a target speed of 5Gbps and simulated with HSPICE. The equalizer with data pattern filtering shows better performance in the recovered clock and the data decision margin than the equalizer without data pattern filtering. The performance improvement grows as the feedback delay and the data speed increase. The entire circuit in this prototype occupies a die area of $0.5 \times 1.1\ \mathrm{mm}^2$ and consumes 72mW at a 1.2V power supply.

### Appendix: Debugging Review on Design

A debugging review of the design is performed to find the reason why the operating speed is lowered. First, the performance degradation can be attributed to the gap between the ideal circuit and the real circuit implemented in a sub-micron CMOS process. To include the parasitics in the implementation, Star-RCXT from Synopsys is used for the parasitic extraction. To avoid excessive simulation runtime, the default options were used for the parasitic extraction in the design process. With the REDUCTION option, the non-reduced netlist of Fig. A-1 (a) was reduced, as shown in Fig. A-1 (b), to shorten the simulation runtime. The COUPLE\_TO\_GROUND option determines whether or not parasitic coupling capacitances are lumped to ground. In this review, the parasitic coupling capacitances are not lumped to ground. With these options changed to include all parasitic components, the resulting complex netlist is simulated and the results are compared with the measured ones.



Figure A-1 Parasitic resistance option (a) REDUCTION : NO (b) REDUCTION : YES

To compare the effects of the extraction options, the VCO used in the PLL circuit is simulated with each option. A four-stage ring oscillator with clock buffers is used; its detailed structure is explained in Chapter 2. The schematic and the layout of the VCO are shown in Fig. A-2 and Fig. A-3. The oscillation frequency is monitored while the coarse control voltage of the VCO is varied. Fig. A-4 shows the results for the ideal netlist, the default option, and the fully extracted option. As the netlist includes more parasitic components, the VCO frequency decreases and gets closer to the measured results of the prototype. The measured maximum oscillation frequency was about 2GHz.
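The trend in Fig. A-4 agrees with a first-order ring oscillator model. The following is a minimal sketch, assuming each stage behaves as a single RC delay of $t_d \approx \ln 2 \cdot R C$ and that the oscillation frequency is $f = 1/(2 N t_d)$; the resistance and capacitance values are hypothetical and are not taken from the extracted netlists.

```python
import math


def ring_osc_freq_hz(n_stages: int, r_ohm: float, c_farad: float) -> float:
    """First-order ring oscillator frequency, f = 1 / (2 * N * t_d).

    The stage delay t_d = ln(2) * R * C is an RC approximation assumed here.
    """
    t_d = math.log(2) * r_ohm * c_farad
    return 1.0 / (2 * n_stages * t_d)


if __name__ == "__main__":
    N = 4          # four-stage ring oscillator, as in the text
    R = 1.0e3      # hypothetical equivalent stage resistance [ohm]
    C_IDEAL = 60e-15      # hypothetical load without parasitics [F]
    C_PARASITIC = 30e-15  # hypothetical extracted wiring capacitance [F]

    for label, c in (("ideal", C_IDEAL), ("extracted", C_IDEAL + C_PARASITIC)):
        f = ring_osc_freq_hz(N, R, c)
        print(f"{label:>9}: {f / 1e9:.2f} GHz")
```

Under this model the frequency scales inversely with the total node capacitance, so every parasitic component that a reduced extraction omits makes the simulated VCO appear faster than the fabricated one.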



Figure A-2 Schematic of VCO

Figure A-3 Layout of VCO

Figure A-4 VCO curve

Also, the data path of the FFE in Chapter 2, excluding the other circuit blocks, is simulated with the different extraction options. A 3.125Gbps input data signal that has passed through 120cm of PCB trace, shown in Fig. A-5, is applied to the data path. Fig. A-6 and Fig. A-7 show the schematic and the layout of the critical data path, including the equalizer and the output buffer. The eye diagrams of the equalizer output with the equalizer set to its maximum gain are shown in Fig. A-8. Fig. A-8 (a), (b) and (c) show the results with the ideal netlist, the reduced netlist with the default option, and the fully extracted netlist, respectively. Fig. A-9 shows the output of the output buffer. The extracted RC parasitics severely degrade the equalizer performance and the output results. The AC response of the data path for the minimum and maximum equalizer control codes is shown in Fig. A-10.

Thus, the performance degradation of the equalizer prototype can be attributed to the limited equalizer gain in the data path as well as the lowered clock frequency, both of which are caused by residual parasitics that were not included in the design process. This performance gap between the ideal netlist and the fully extracted netlist is expected to widen as deeper submicron CMOS processes are used for implementation.
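The loss of equalizer gain can be illustrated with a simple transfer-function model. The following is a minimal sketch, assuming the data path behaves as one zero and one pole, $H(f) = (1 + jf/f_z)/(1 + jf/f_p)$, and that the unextracted parasitics add one more pole; the zero and pole frequencies are hypothetical and are not taken from the designed circuit.

```python
import math
from typing import Sequence


def gain_db(f_hz: float, f_zero_hz: float, pole_freqs_hz: Sequence[float]) -> float:
    """Magnitude in dB of H(f) = (1 + j f/f_z) / prod(1 + j f/f_p), DC gain 0 dB."""
    h = complex(1.0, f_hz / f_zero_hz)
    for f_p in pole_freqs_hz:
        h /= complex(1.0, f_hz / f_p)
    return 20.0 * math.log10(abs(h))


if __name__ == "__main__":
    F_NYQUIST = 3.125e9 / 2    # Nyquist frequency of the 3.125Gbps input
    F_ZERO = 300e6             # hypothetical equalizer zero
    F_POLE = 3e9               # hypothetical designed pole
    F_PARASITIC = 2e9          # hypothetical pole from unextracted RC parasitics

    ideal = gain_db(F_NYQUIST, F_ZERO, [F_POLE])
    extracted = gain_db(F_NYQUIST, F_ZERO, [F_POLE, F_PARASITIC])
    print(f"boost at Nyquist, ideal netlist    : {ideal:.2f} dB")
    print(f"boost at Nyquist, with extra pole  : {extracted:.2f} dB")
```

Any additional pole near the Nyquist frequency lowers the available high-frequency boost, which is the qualitative effect observed when moving from the ideal netlist to the fully extracted one.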



Figure A-5 Input data eye-diagram after 120cm PCB trace

Figure A-6 Schematic of the data path

Figure A-7 Layout of the data path

Figure A-8 Simulated eye-diagram after the equalizing filter with (a) ideal netlist, (b) netlist with default RC extraction, (c) netlist with full RC extraction

Figure A-9 Simulated eye-diagram after the output buffer with (a) ideal netlist, (b) netlist with default RC extraction, (c) netlist with full RC extraction

Figure A-10 AC response of the data path when the equalizer control code is (a) 1111 and (b) 0000

## Reference

- B. Analui, J. F. Buckwalter, and A. Hajimiri, "Data-dependent jitter in serial communications," *Microwave Theory and Techniques, IEEE Transactions on,* vol. 53, pp. 3388–3397, 2005.
- J. F. Buckwalter and A. Hajimiri, "Analysis and equalization of data-dependent jitter," *Solid-State Circuits, IEEE Journal of*, vol. 41, pp. 607-620, 2006.
- [3] R. Farjad-Rad, C. K. K. Yang, and M. A. Horowitz, "A 0.3-μm
   CMOS 8-Gb/s 4-PAM serial link transceiver," *Solid-State Circuits, IEEE Journal of*, vol. 35, pp. 757-764, 2000.
- [4] J. L. Zerbe, P. S. Chau, C. W. Werner, T. P. Thrush, H. J. Liaw, B.
  W. Garlepp, and K. S. Donnelly, "1.6 Gb/s/pin 4-PAM signaling and circuits for a multidrop bus," *Solid-State Circuits, IEEE Journal of,* vol. 36, pp. 752-760, 2001.
- J. T. Stonick, W. Gu-Yeon, J. L. Sonntag, and D. K. Weinlader, "An adaptive PAM-4 5-Gb/s backplane transceiver in 0.25-/spl mu/m CMOS," *Solid-State Circuits, IEEE Journal of,* vol. 38, pp. 436-443, 2003.
- [6] J. L. Zerbe, C. W. Werner, V. Stojanovic, F. Chen, J. Wei, G. Tsang, D. Kim, W. F. Stonecypher, A. Ho, T. P. Thrush, R. T. Kollipara, M. A. Horowitz, and K. S. Donnelly, "Equalization and clock recovery for a 2.5-10-Gb/s 2-PAM/4-PAM backplane transceiver cell," *Solid-State Circuits, IEEE Journal of,* vol. 38, pp. 2121-2130, 2003.
- [7] V. Stojanovic, A. Ho, B. W. Garlepp, F. Chen, J. Wei, G. Tsang, E. Alon, R. T. Kollipara, C. W. Werner, J. L. Zerbe, and M. A. Horowitz, "Autonomous dual-mode (PAM2/4) serial link transceiver with adaptive equalization and data recovery," *Solid-State Circuits, IEEE Journal of,* vol. 40, pp. 1012–1026, 2005.

- [8] A. J. Baker, "An adaptive cable equalizer for serial digital video rates to 400 Mb/s," 1996, pp. 174–175, 439.
- W. J. Dally and J. Poulton, "Transmitter equalization for 4-Gbps signaling," *IEEE Micro*, vol. 17, pp. 48-56, 1997.
- [10] J. N. Babanezhad, "A 3.3 V analog adaptive line-equalizer for fast Ethernet data communication," 1998, pp. 343-346.
- [11] G. P. Hartman, K. W. Martin, and A. McLaren, "Continuous-time adaptive-analog coaxial cable equalizer in 0.5 μm CMOS," 1999, pp. 97-100 vol.2.
- [12] M. H. Shakiba, "A 2.5 Gb/s adaptive cable equalizer," 1999, pp. 396-397.
- [13] W. Hui, J. A. Tierno, P. Pepeljugoski, J. Schaub, S. Gowda, J. A. Kash, and A. Hajimiri, "Integrated transversal equalizers in high-speed fiber-optic systems," *Solid-State Circuits, IEEE Journal of,* vol. 38, pp. 2131–2137, 2003.
- [14] Y. Kudoh, M. Fukaishi, and M. Mizuno, "A 0.13-/spl mu/m CMOS 5-Gb/s 10-m 28AWG cable transceiver with no-feedback-loop continuous-time post-equalizer," *Solid-State Circuits, IEEE Journal of,* vol. 38, pp. 741-746, 2003.
- [15] W. Hui, J. Xicheng, D. Tam, F. Cheung, D. Cheung, W. Tong, M. Le, M. Wakayama, J. Van Engelen, V. Parthasarathy, H. Baumer, and A. Buchwald, "A quad multi-speed serializer/deserializer with analog adaptive equalization," in *Symposium on VLSI Circuits Digest of Technical Papers*, 2004, pp. 340-343.
- [16] C. Jong-Sang, H. Moon-Sang, and J. Deog-Kyoon, "A 0.18-/spl mu/m CMOS 3.5-gb/s continuous-time adaptive cable equalizer using enhanced low-frequency gain control method," *Solid-State Circuits, IEEE Journal of*, vol. 39, pp. 419-425, 2004.
- [17] M. Q. Le, J. Van Engelen, W. Hui, A. Madisetti, H. Baumer, and A. Buchwald, "A 3.125Gbps timing and data recovery front-end with adaptive equalization," in *Symposium on VLSI Circuits Digest of Technical Papers*, 2004, pp. 344–347.

- [18] C. Pelard, E. Gebara, A. J. Kim, M. G. Vrazel, F. Bien, Y. Hur, M. Moonkyun, S. Chandramouli, C. Chun, S. Bajekal, S. E. Ralph, B. Schmukler, V. M. Hietala, and J. Laskar, "Realization of multigigabit channel equalization and crosstalk cancellation integrated circuits," *Solid-State Circuits, IEEE Journal of*, vol. 39, pp. 1659–1670, 2004.
- [19] V. Balan, J. Caroselli, J. G. Chern, C. Chow, R. Dadi, C. Desai, L. Fang, D. Hsu, P. Joshi, H. Kimura, C. Y. Liu, P. Tzu-Wang, R. Park, C. You, Z. Yi, E. Zhang, and F. Zhong, "A 4.8-6.4-Gb/s serial link for backplane applications using decision feedback equalization," *Solid-State Circuits, IEEE Journal of,* vol. 40, pp. 1957–1967, 2005.
- [20] T. Beukema, M. Sorna, K. Selander, S. Zier, B. L. Ji, P. Murfet, J. Mason, W. Rhee, H. Ainspan, B. Parker, and M. Beakes, "A 6.4-Gb/s CMOS SerDes core with feed-forward and decision-feedback equalization," *Solid-State Circuits, IEEE Journal of,* vol. 40, pp. 2633–2645, 2005.
- [21] S. Gondi, L. Jri, D. Takeuchi, and B. Razavi, "A 10Gb/s CMOS adaptive equalizer for backplane applications," in *IEEE Int. solid-State Circuits Conf. Dig. Tech. Papers*, San Francisco, CA, 2005, pp. 328-601 Vol. 1.
- [22] Z. Guangyu Evelina and M. M. Green, "A 10 Gb/s BiCMOS adaptive cable equalizer," *Solid-State Circuits, IEEE Journal of,* vol. 40, pp. 2132-2140, 2005.
- [23] J. E. Jaussi, G. Balamurugan, D. R. Johnson, B. Casper, A. Martin, J. Kennedy, N. Shanbhag, and R. Mooney, "8-Gb/s sourcesynchronous I/O link with adaptive receiver equalization, offset cancellation, and clock de-skew," *Solid-State Circuits, IEEE Journal of,* vol. 40, pp. 80-88, 2005.
- [24] K. Krishna, D. A. Yokoyama-Martin, A. Caffee, C. Jones, M. Loikkanen, J. Parker, R. Segelken, J. L. Sonntag, J. Stonick, S. Titus, D. Weinlader, and S. Wolfer, "A multigigabit backplane

transceiver core in 0.13-/spl mu/m CMOS with a power-efficient equalization architecture," *Solid-State Circuits, IEEE Journal of,* vol. 40, pp. 2658-2666, 2005.

- [25] N. Krishnapura, M. Barazande-Pour, Q. Chaudhry, J. Khoury, K. Lakshmikumar, and A. Aggarwal, "A 5Gb/s NRZ transceiver with adaptive equalization for backplane transmission," in *IEEE Int. solid-State Circuits Conf. Dig. Tech. Papers*, San Francisco, CA, 2005, pp. 60–585 Vol. 1.
- [26] R. Payne, P. Landman, B. Bhakta, S. Ramaswamy, W. Song, J. D. Powers, M. U. Erdogan, A. L. Yee, R. Gu, W. Lin, X. Yiqun, B. Parthasarathy, K. Brouse, W. Mohammed, K. Heragu, V. Gupta, L. Dyson, and L. Wai, "A 6.25–Gb/s binary transceiver in 0.13–/spl mu/m CMOS for serial data transmission across high loss legacy backplane channels," *Solid–State Circuits, IEEE Journal of,* vol. 40, pp. 2646–2657, 2005.
- [27] L. Jri, "A 20Gb/s Adaptive Equalizer in 0.13/spl mu/m CMOS Technology," in *IEEE Int. solid-State Circuits Conf. Dig. Tech. Papers*, San Francisco, CA, 2006, pp. 273-282.
- [28] K. H. Lee, J. W. Lee, and W. Y. Choi, "A 0.18um CMOS 3.125-Gb/s Digitally Controlled Adaptive Line Equalizer with Feed-Forward Swing Control for Backpane Serial Link," *IEICE Trans. Electron.*, vol. 18, pp. 1389–1391, June 2006.
- [29] M. Maeng, F. Bien, Y. Hur, S. Chandramouli, H. Kim, Y. Kumar, C. Chun, E. Gebara, and J. Laskar, "A 0.18/spl mu/m CMOS equalizer with an improved multiplier for 4-PAM/20Gbps throughput over 20 inch FR-4 backplane channels," in *Microwave Symposium Digest, 2004 IEEE MTT-S International*, 2004, pp. 105-108 Vol.1.
- [30] A. C. Carusone, "An Equalizer Adaptation Algorithm to Reduce Jitter in Binary Receivers," *Circuits and Systems II: Express Briefs, IEEE Transactions on* vol. 53, pp. 807–811, 2006.
- [31] J. Savoj and B. Razavi, "A 10-Gb/s CMOS clock and data

recovery circuit with a half-rate linear phase detector," *Solid-State Circuits, IEEE Journal of,* vol. 36, pp. 761-768, 2001.

[32] C. K. Seong, S. W. Lee, and W. Y. Choi, "A 1.25-Gb/s Digitally-Controlled Dual-Loop Clock and Data Recovery Circuit with Enhanced Phase Resolution," *IEICE Trans. Electron.*, vol. E90-C, pp. 165-170, 2007.

#### Abstract in Korean

# High Speed Adaptive Equalizer

# Using Phase Detector Output

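This work focuses on adaptive equalizers placed at the receiver to remove the inter-symbol interference that arises in cables and on PCB boards when high-speed data signals are transmitted. First, the following method is applied to adaptively control a split-path filter-type equalizer, which is easy to implement. The adaptation method based on the phase information of the data, previously used only for tapped-delay filters, is applied to the split-path filter. This removes the analog blocks used in conventional approaches and replaces them with digital control, enabling stable operation with a simpler structure, smaller area, and lower power consumption. The circuit was fabricated in a 0.13µm CMOS process, and measurements confirmed adaptive operation at 2.5Gbps over PCB boards of various lengths.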

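In addition, a data pattern filtering method is used for the decision feedback equalizer, which does not amplify noise. The biggest problem of this equalizer is the additional inter-symbol interference caused by the time delay of the tap that removes the first inter-symbol interference component, and data pattern filtering mitigates it. The method detects the phase of a specific pattern at the phase detector output of the clock and data recovery circuit, and thereby prevents further unnecessary phase delay of the clock and time delay of the feedback tap. The performance degradation caused by time delay in high-speed equalizer design can therefore be reduced. To support this, a 5Gbps-class decision feedback equalizer was designed and fabricated in a 0.13µm CMOS process.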