



## 2006 IEEE International Symposium on Circuits and Systems

May 21-24, 2006, Island of Kos, Greece

# CIRCUITS AND SYSTEMS: AT CROSSROADS OF LIFE AND TECHNOLOGY



**IEEE** 

IEEE CAS Society

IEEE Catalog Number: 06CH37717C

ISBN: 0-7803-9390-2

Library of Congress: 80-646530

© 2006 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

### Tuesday, May 23, 2006

| Wireline Communications Circuits I (Lecture) |
|----------------------------------------------|
| Tuesday, May 23, 2006, 10:00 - 11:30         |
| Galen (0.9)                                  |
| Calvin Plett                                 |
| Calvin Plett                                 |

| Paper   | Title | Authors |
|---------|-------|---------|
| B1L-K.1 | Modeling and Verification of High-Speed Wired Links with Verilog-AMS | Ming-Ta Hsieh, Gerald Sobelman, University of Minnesota |
| B1L-K.2 | Analysis and Modeling of Jitter and Frequency Tolerance in Gated Oscillator Based CDRs | Armin Tajalli, Sharif University of Technology, Paul Muller, LSM, EPFL, Mojtaba Atarodi, Sharif University of Technology, Yusuf Leblebici, EPFL-LSM |
| B1L-K.3 | A 1.25-Gb/s Digitally-Controlled Dual-Loop Clock and Data Recovery Circuit with Enhanced Phase Resolution | Chang-Kyung Seong, Yonsei University, Seung-Woo Lee, ETRI, Woo-Young Choi, Yonsei University |
| B1L-K.4 | A Reconfigurable Fully-Integrated 0.18-µm CMOS Feed-Forward Equalizer IC for 10-Gb/sec Backplane Links | Franklin Bien, Youngsik Hur, M. Maeng, H. Kim, Georgia Institute of Technology, E. Gebara, J. Laskar, Georgia Institute of Technology, Quellan, Inc. |
| B1L-K.5 | Automatic Within-Pair-Skew Compensation for 6.25Gbps Differential Links Using Wide-Bandwidth Delay Units | Yuxiang Zheng, University of Texas at Dallas, Jiang Li, Hewlett-Packard, Jin Liu, University of Texas at Dallas, Qian Yu, Chinese Academy of Sciences |

# A 1.25-Gb/s Digitally-Controlled Dual-Loop Clock and Data Recovery Circuit with Enhanced Phase Resolution

Chang-Kyung Seong<sup>1</sup>, Seung-Woo Lee<sup>2</sup> and Woo-Young Choi<sup>1</sup>

<sup>1)</sup>Department of Electrical and Electronic Engineering, Yonsei University, Seoul, Korea

<sup>2)</sup>Switching Technology Team, Electronics and Telecommunications Research Institute, Daejeon, Korea

Abstract—This paper describes a 1.25-Gb/s digitally-controlled dual-loop clock and data recovery circuit with 256-level phase resolution using only a 4-phase reference clock. A novel scheme is proposed to enhance the phase resolution with little additional power consumption and chip area. A digitally-controlled delay buffer with a variable delay finely tunes the output phase for higher resolution. A prototype chip was fabricated in 0.18-$\mu$m CMOS technology. In measurement, the CDR shows ±400-ppm frequency offset tolerance and flat jitter performance over wide variations of the delay-buffer delay. The CDR core consumes 17.8 mW from a 1.8-V supply and occupies 255 $\mu$m × 165 $\mu$m.

#### I. INTRODUCTION

As the demand for wideband networks and high-speed ICs grows, high-speed serial I/O systems have become one of the most important building blocks. In many cases, such as switch applications, dozens of transceivers have to be integrated on a single chip. Therefore, low power consumption and small chip area are crucial for data recovery circuits used in such applications.

A general phase-locked loop (PLL)-based clock and data recovery circuit (CDR) is not preferred in multi-channel environments because of noise coupling between multiple transceiver modules [1]. Instead, dual-loop CDRs with a shared reference PLL and a phase alignment block for each channel have been widely used. In the dual-loop CDR, the phase alignment block uses a phase interpolator (PI) instead of a voltage-controlled delay line (VCDL) so that the phase can be tuned continuously over the full 360° range.

Dual-loop CDRs using a PI can be classified into two categories: analog and digitally-controlled. Because of its inherently continuous phase generation, the CDR using an analog PI generates less jitter. However, it is much more sensitive to supply and substrate noise. In a noisy environment subject to switching noise from adjacent digital logic cores, the analog-controlled dual-loop CDR is not suitable. On the other hand, the digitally-controlled type is robust to noise and easily controllable. However, it suffers from jitter performance degradation caused by self-dithering [2], which results from the inherently discrete phase generation of the digitally-controlled PI. The phase resolution of the digitally-controlled dual-loop CDR is therefore a critical design parameter for jitter performance.

This paper presents a novel configuration with a digitally-controlled delay buffer (DCDB) to increase the effective phase resolution of the digitally-controlled dual-loop CDR. Section II gives an overview of a conventional digitally-controlled dual-loop CDR and its problems. Section III describes the proposed CDR. Section IV shows experimental results of the prototype chip. Finally, conclusions are given in Section V.

#### II. CONVENTIONAL DIGITALLY-CONTROLLED DUAL-LOOP CDR AND ITS PROBLEMS

#### A. Structure

Fig. 1 shows a block diagram of a conventional digitally-controlled dual-loop CDR. It consists of a bang-bang phase detector (BBPD), a controller, a phase selection circuit and a PI. The CDR receives several equally spaced reference phases from a reference PLL. The phase selection circuit selects the two adjacent reference phases that bracket the desired output phase, and the PI generates the target phase by interpolating between the two selected phases. The BBPD compares the phases of the interpolated clock and the data so that the controller can produce the control code for the next output phase. Through this negative feedback, the loop aligns the clock to the input data.

Figure 1. Block diagram of a conventional digitally-controlled dual-loop CDR

#### B. Phase Resolution

The phase resolution of the CDR is related to three issues: jitter generation, jitter suppression and frequency offset tracking. Unlike an analog-controlled CDR, a digitally-controlled CDR inherently generates non-zero jitter. Since it generates quantized phases, the edge of the recovered clock dithers around the edge of the input data, resulting in quantization errors even in the locked state. Moreover, clock latencies in the loop degrade the jitter generation performance [3]. With regard to jitter suppression, the phase resolution is directly related to the open-loop gain, or loop bandwidth, of the CDR. A higher resolution means a smaller phase step per clock cycle and therefore a narrower loop bandwidth, so the CDR follows less of the input jitter. Consequently, a CDR with a narrow bandwidth cannot track a large frequency offset, yet many applications require the CDR to operate with a frequency offset of hundreds of ppm.

Thus, the phase step of the CDR, which shrinks as the resolution increases, is lower-bounded by the frequency offset tracking ability and upper-bounded by the jitter generation and jitter suppression performance. The digitally-controlled CDR should have a resolution high enough to generate little jitter while still covering the specified frequency offset range.
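The trade-off between resolution and frequency offset tracking can be illustrated with a rough bound. The sketch below assumes an idealized loop that moves the phase by at most one step per update; the function name and the `cycles_per_update` parameter are hypothetical and are not taken from the paper. A real loop with a bang-bang detector, an up/down filter and limited data transition density would track less than this bound.

```python
def max_trackable_offset_ppm(levels: int, cycles_per_update: int) -> float:
    """Upper bound on the trackable frequency offset, in ppm.

    Assumption: the loop changes its phase by at most one step (1/levels UI)
    once every `cycles_per_update` bit periods, so it can follow a phase
    ramp of at most that slope.
    """
    return 1e6 / (levels * cycles_per_update)


if __name__ == "__main__":
    # cycles_per_update=4 is an arbitrary illustrative value.
    for bits in (6, 7, 8):
        bound = max_trackable_offset_ppm(2 ** bits, cycles_per_update=4)
        print(f"{bits}-bit resolution: up to {bound:.1f} ppm")
```

Under this simplified model, each additional bit of resolution halves the frequency offset that the loop can follow, which is why the resolution cannot be raised arbitrarily.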

The PI generates the target phase by performing a weighted summation of two input signals. Since the weight coefficients are set by the two bias currents of the digitally-controlled current bias circuits in the differential pairs of the PI, the resolution of the current DAC directly determines the phase resolution of the PI. In practice, the resolution of the DAC is limited in both the binary-weighted and the thermometer-coded types. The binary-weighted DAC has a simple structure with only a few control bits, but it suffers from dynamic phase overshoot because it switches large current sources. The thermometer-coded DAC is free from this problem, but it requires a large chip area to achieve high resolution. Therefore, with either type it is very difficult to realize a PI that combines small area, good dynamic performance and a phase resolution higher than 4 bits, i.e. 16 levels. With 4-phase reference clocks, the total phase resolution of the CDR is four times the PI resolution. A 6-bit CDR using a 4-bit PI has a minimum phase step of 5.63°. Considering clock latencies of more than two cycles in the BBPD and controller, this step is not small enough, since the peak-to-peak self-dithering becomes at least $\pm 3$ phase steps, or 33.75°.

Figure 3. Schematic of the digitally-controlled delay buffer
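The numbers quoted for the conventional 6-bit CDR follow from simple arithmetic, sketched below. The function names are illustrative only; the inputs (4 reference phases, a 16-level PI and dithering of 3 steps on each side) are the values given in the text.

```python
def phase_step_deg(ref_phases: int, pi_levels: int) -> float:
    """Minimum phase step when each reference-phase interval is split by the PI."""
    return 360.0 / (ref_phases * pi_levels)


def self_dither_pp_deg(step_deg: float, steps_each_side: int) -> float:
    """Peak-to-peak dithering for an excursion of +/- `steps_each_side` steps."""
    return 2 * steps_each_side * step_deg


if __name__ == "__main__":
    step = phase_step_deg(ref_phases=4, pi_levels=16)
    print(step)                          # 5.625
    print(self_dither_pp_deg(step, 3))   # 33.75
```

The 5.63° quoted in the text is 5.625° rounded to two decimal places.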

#### III. PROPOSED CDR

#### A. Structure

The block diagram of the proposed 1.25-Gb/s dual-loop CDR is shown in Fig. 2. The CDR receives two differential quadrature-phase clocks from the reference PLL. Two 2:1 MUXs produce the two adjacent phases that bracket the desired phase by selecting an inverted or non-inverted version of each reference clock. The target phase is then generated by the PI and the DCDB. Fig. 3 shows the schematic of the DCDB. It is a current-starved CMOS inverter with a 4-level variable propagation delay, controlled by a 2-bit binary-weighted digital word. By finely tuning the output phase, the DCDB multiplies the total resolution by the DCDB resolution without requiring additional reference clock phases. In the prototype chip, the tuning voltage, V<sub>tuning</sub>, is used as a bias voltage to control the DCDB delay error for testing purposes. To avoid dynamic phase overshoot, the PI contains thermometer-coded current DACs with 16-level resolution. The additional blocks in the proposed structure are only simple CMOS logic gates. Therefore, the structure requires little additional power and chip area, while the overall phase resolution is increased from 64 levels to 256 levels, or 8 bits. The up/down filter after the BBPD reduces unwanted phase dithering by generating an output pulse only after two consecutive UP or DOWN pulses [4].
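The way the 256 levels arise from the three stages can be sketched as follows. The bit allocation used here (the two most significant bits select the quadrant, the next four drive the PI and the two least significant bits drive the DCDB) is an assumption made for illustration; the paper does not specify the controller's code format, and the function names are hypothetical.

```python
from typing import Tuple

QUADRANTS = 4      # adjacent-phase pairs selected by the two 2:1 MUXs
PI_LEVELS = 16     # thermometer-coded PI resolution
DCDB_LEVELS = 4    # 2-bit DCDB resolution
TOTAL_LEVELS = QUADRANTS * PI_LEVELS * DCDB_LEVELS  # 256


def split_code(code: int) -> Tuple[int, int, int]:
    """Split an 8-bit phase code into (quadrant, PI code, DCDB code).

    Assumed layout: code = quadrant * 64 + pi * 4 + dcdb.
    """
    if not 0 <= code < TOTAL_LEVELS:
        raise ValueError("code must be in the range 0..255")
    quadrant, rest = divmod(code, PI_LEVELS * DCDB_LEVELS)
    pi, dcdb = divmod(rest, DCDB_LEVELS)
    return quadrant, pi, dcdb


def ideal_phase_deg(code: int) -> float:
    """Ideal output phase, assuming the DCDB has exactly its nominal delay."""
    quadrant, pi, dcdb = split_code(code)
    pi_step = 90.0 / PI_LEVELS          # 5.625 degrees
    dcdb_step = pi_step / DCDB_LEVELS   # 1.40625 degrees
    return quadrant * 90.0 + pi * pi_step + dcdb * dcdb_step


if __name__ == "__main__":
    for code in (0, 1, 4, 64, 255):
        print(code, split_code(code), ideal_phase_deg(code))
```

With a nominal DCDB, consecutive codes are spaced by 360°/256 = 1.40625°, one quarter of the PI step.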



#### B. Effect of DCDB Delay Error

Because of PVT variations, the DCDB is not guaranteed to provide exactly the desired delay. When the DCDB delay differs from the desired value, the combined phase transfer curve can become non-monotonic and nonlinear, as illustrated in Fig. 4. In the figure, the large and small black circles correspond to the normal output phases of the PI and the DCDB, and the crosses and diamonds correspond to slipped output phases of the DCDB with +50% and -50% error, respectively. The shadowed regions are where the phase transfer curve changes abruptly. The delay error of the DCDB is defined as follows.

$$Err_{DCDB}(\%) = \frac{\Delta\phi_{Slip} - \Delta\phi_{Nor}}{\Delta\phi_{Nor}} \times 100 \tag{1}$$

where $\Delta\phi_{Nor}$ is the desired delay of the DCDB and $\Delta\phi_{Slip}$ is the slipped delay of the DCDB.
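The effect of the delay error on the combined transfer curve can be sketched numerically. The example below applies definition (1) and builds a short segment of the transfer curve, assuming that the error scales every DCDB step by the same factor and that the PI itself is ideal; both assumptions, and all names in the code, are illustrative rather than taken from the paper.

```python
PI_STEP_DEG = 5.625                                # 16-level PI over a 90-degree quadrant
DCDB_LEVELS = 4
DCDB_NOMINAL_DEG = PI_STEP_DEG / DCDB_LEVELS       # 1.40625 degrees


def dcdb_error_percent(slipped_deg: float, nominal_deg: float) -> float:
    """Delay error of the DCDB as defined in (1)."""
    return (slipped_deg - nominal_deg) / nominal_deg * 100.0


def transfer_curve(err_percent: float, pi_codes: int = 3) -> list:
    """Output phases for consecutive codes over a few PI steps.

    Assumption: every DCDB step is scaled by (1 + err_percent / 100).
    """
    dcdb_step = DCDB_NOMINAL_DEG * (1.0 + err_percent / 100.0)
    return [pi * PI_STEP_DEG + d * dcdb_step
            for pi in range(pi_codes)
            for d in range(DCDB_LEVELS)]


def is_monotonic(phases: list) -> bool:
    """True if every code increment increases the output phase."""
    return all(b > a for a, b in zip(phases, phases[1:]))


if __name__ == "__main__":
    for err in (-50.0, 0.0, 50.0):
        curve = transfer_curve(err)
        steps = [b - a for a, b in zip(curve, curve[1:])]
        print(err, is_monotonic(curve), min(steps), max(steps))
```

Under these assumptions, a -50% error keeps the curve monotonic but leaves a large step at each PI boundary, while a +50% error makes the last DCDB level overshoot the next PI phase, so the curve steps backwards there. In this simplified model the curve stays monotonic as long as the error is below about +33%.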

To evaluate the degradation of the jitter generation performance, behavioral simulations were performed using CPPSIM, a C++-based time-step simulator [5]. Degradation factors such as latency in the controller module, frequency offset and the delay error of the DCDB were included in the simulation. The output RMS and peak-to-peak jitter were measured for DCDB delay errors from -50% to 100% with ideal input data and a fixed 200-ppm frequency offset. Three conventional CDR models using only PIs, with total resolutions of 6, 7 and 8 bits, were also simulated for comparison.
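The kind of comparison described above can be approximated with a much simpler model than the CPPSIM simulation used by the authors. The following Python sketch is not the authors' model: it assumes an ideal bang-bang detector, one phase step per bit period, a fixed decision latency, no up/down filter and no DCDB delay error. Only the 200-ppm offset and the 6-bit and 8-bit resolutions come from the text; the latency and the number of cycles are assumed values.

```python
from collections import deque


def simulate_bang_bang(levels: int, latency: int = 2,
                       offset_ppm: float = 200.0, cycles: int = 20000):
    """Return (peak-to-peak, RMS) phase error in degrees for an idealized loop."""
    step = 360.0 / levels                     # quantized phase step
    drift = 360.0 * offset_ppm * 1e-6         # data phase drift per bit period
    pipeline = deque([0] * latency)           # decisions not yet applied
    data_phase = 0.0
    code = 0                                  # unwrapped phase code
    errors = []
    for _ in range(cycles):
        error = data_phase - code * step
        errors.append(error)
        # Bang-bang decision: only the sign of the phase error is used.
        pipeline.append(1 if error > 0 else -1)
        # Apply the decision made `latency` cycles ago.
        code += pipeline.popleft()
        data_phase += drift
    settled = errors[cycles // 10:]           # discard the start-up portion
    mean = sum(settled) / len(settled)
    rms = (sum((e - mean) ** 2 for e in settled) / len(settled)) ** 0.5
    return max(settled) - min(settled), rms


if __name__ == "__main__":
    for bits in (6, 7, 8):
        pp, rms = simulate_bang_bang(2 ** bits)
        print(f"{bits}-bit: {pp:.2f} deg p-p, {rms:.2f} deg RMS")
```

Even in this reduced model, the limit cycle caused by loop latency spans several phase steps, so the dithering amplitude scales with the step size and the 8-bit loop dithers less than the 6-bit loop.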

Fig. 5 shows the jitter generation of the proposed CDR. The three horizontal lines are the simulated jitter generation levels of the conventional CDR models with different resolutions. As the delay error of the DCDB increases, the jitter generation performance degrades and the effective phase resolution of the CDR decreases toward the 7-bit level. However, the jitter generation of the proposed model is similar to that of the 8-bit model over a very wide range. Although there can be sudden phase variations at the boundary between two interpolated phases when the DCDB has delay errors, as shown in the shadowed regions in Fig. 4, the overall effective phase resolution is increased. Where the slope is inverted because of large positive errors, the phase jumps in the direction opposite to the input data phase. However, since the effect of the increased phase resolution dominates that of the local phase fluctuation, the total jitter generation performance is improved.

Figure 5. Jitter generation degradation vs. delay error of DCDB in both behavioral and circuit-level simulation (a) peak-to-peak jitter generation (b) RMS jitter generation

In circuit-level simulations, the jitter generation of the CDR was measured for various delay errors by changing $V_{tuning}$. The jitter generation level was again observed to be flat over a wide range of delay errors. Metastability of the D flip-flops in the BBPD creates dead zones, which cause the recovered clock to dither less in the circuit-level simulation than in the behavioral simulation, as shown in Fig. 5 (b).

#### IV. EXPERIMENTAL RESULTS

The prototype chip was fabricated in 0.18-$\mu$m CMOS technology. The power consumption of the CDR core is about 17.82 mW with a 1.8-V supply voltage. The chip area of the CDR core is about 255 × 165 $\mu$m<sup>2</sup>. The die photo is shown in Fig. 6.

The output jitter was measured for various delay errors by tuning $V_{tuning}$. Three tuning voltages, 0 V, 0.2 V and 0.4 V, correspond to -50%, 0% and 50% DCDB error, respectively. Because the input jitter was 11.2 ps<sub>RMS</sub> and 42 ps<sub>P-P</sub>, owing to a differential signal mismatch in the pattern generator used for the measurement, the measured output jitter level in Fig. 8 was higher than in simulation. However, flat jitter performance was verified for DCDB errors from -50% to 50%. To evaluate the jitter rejection capability, input jitter was added by transmitting input data with a 200-ppm frequency offset through a 2-m PCB trace and a 3.5-m cable. As shown in Fig. 9, the CDR recovered clean data and clock waveforms from eye-closed data.

#### V. CONCLUSION

This paper presents a novel configuration of a digitally-controlled dual-loop CDR that increases the effective phase resolution. The phase resolution can be easily increased by inserting a DCDB, with little additional power and chip-area cost. It is verified that the effect of DCDB delay error is not critical over a very wide range. A prototype chip was fabricated in 0.18-$\mu$m CMOS technology. The CDR achieves 256-level, or 8-bit, effective phase resolution and can cover a $\pm 400$-ppm frequency offset. The chip consumes 17.8 mW at 1.8 V and occupies an area of $255 \times 165$ $\mu$m<sup>2</sup>.

Figure 8. Measured output jitter vs. delay error of DCDB

Figure 9. Measured eye diagram at 200ppm frequency offset (a) Input data: after 2m PCB trace and 3.5m cable, 0.53 UI<sub>P-P</sub> eye opening (b) Recovered data: 0.265 UI<sub>P-P</sub> eye opening

#### ACKNOWLEDGMENT

This work was sponsored in part by the Ministry of Science and Technology of Korea and the Ministry of Commerce, Industry and Energy through the System IC 2010 program. We also acknowledge that EDA software used in this work was supported by IDEC (IC Design Education Center).

#### REFERENCES

- [1] P. Larsson, "Measurements and analysis of PLL jitter caused by digital switching noise," *IEEE J. Solid-State Circuits*, vol. 36, pp. 1113-1119, July 2001.
- [2] Stefanos Sidiropoulos, Mark A. Horowitz, "A semidigital dual delay-locked loop," *IEEE J. Solid-State Circuits*, vol. 32, pp. 1683-1692, November 1997.
- [3] Jeongsik Yang, et al., "A quad-channel 3.125Gb/s/ch serial-link transceiver with mixed-mode adaptive equalizer in 0.18um CMOS," *ISSCC Dig. Tech. Papers*, 2004.
- [4] Muneo Fukaishi, et al., "A 20-Gb/s CMOS multichannel transmitter and receiver chip set for ultra-high-resolution digital displays," *IEEE J. Solid-State Circuits*, vol. 35, pp. 1611-1618, November 2000.
- [5] M.H. Perrott, "Fast and accurate behavioral simulation of fractional-N synthesizers and other PLL/DLL circuits," *Design Automation Conference*, pp. 498-503, June 2002.