# Clock and Data Recovery Circuit Using Digital Phase Aligner and Phase Interpolator

Seung-Woo Lee<sup>1)</sup>, Chang-Kyung Seong<sup>2)</sup>, Woo-Young Choi<sup>2)</sup> and Bhum-Cheol Lee<sup>1)</sup>

<sup>1)</sup> Switching Technology Team, Electronics and Telecommunications Research Institute, Daejon, Korea, beewoo@etri.re.kr

<sup>2)</sup> Dept. of Electronic Engineering, Yonsei University, Seoul, Korea, ck@tera.yonsei.ac.kr

Abstract: A clock and data recovery circuit using digital phase aligners and a phase interpolator is proposed for multi-channel link applications. The proposed circuit reduces recovered-clock jitter and alleviates the problem of distorted clock duty cycle. It is realized in 0.13um CMOS technology. It dissipates 9.7mW from a 1.2V power supply and occupies 290x230um<sup>2</sup>, including the multi-phase clock generation block. Experimental results show that the proposed circuit recovers a 1Gb/s $2^7$-1 PRBS with no error.

#### I. INTRODUCTION

In broadband networks, the demand for wide upstream and downstream bandwidth in switch applications is increasing, and many high-speed I/O interfaces are integrated in switch chips. To reduce the number of I/O pins, the interface between the physical links and the switch is moving toward high-speed serial interfaces. Consequently, the need for clock and data recovery circuits with low power and small area is growing.

In general, clock and data recovery using a phase-locked loop [1] includes an analog loop filter with large area and a voltage-controlled oscillator with high noise sensitivity. Multi-channel I/O interfaces therefore require many phase-locked loops, whose area is dominated by the loop filters. Another design approach uses digital logic, which is far less sensitive to noise. The oversampling scheme [2] detects data edges using oversampled clocks and picks the correct data by majority voting. Its main disadvantages are higher power dissipation and increased algorithm complexity.

In [3] and [4], clock and data recovery with a digital phase aligner (DPA) is introduced. It has an all-digital structure, which is less sensitive to PVT variations and can be designed simply with standard-cell logic. However, it has the following problems. Clock and data recovery using a DPA cannot reject the jitter of the incoming data, so the recovered clock carries the same amount of jitter as the input data. In addition, it composes the recovered clock from the selected clocks, which degrades the duty cycle of the recovered clock. Therefore, it is difficult to design an elastic buffer with good timing margin for the system clock.

In this paper, clock and data recovery using DPAs and a phase interpolator is proposed to reduce the jitter of the recovered clock and alleviate the distorted duty cycle. Section II presents the overall architecture and describes the operation of each block. Section III shows the experimental results of the chip. Finally, Section IV summarizes the work with conclusions.

#### II. OVERALL ARCHITECTURE

Figure 1 shows the overall architecture of the proposed clock and data recovery circuit, which consists of two digital phase aligners, a phase interpolator, and decision logic. The two DPAs align their selected clocks to the positive and negative transitions of the input data, respectively. The aligned clocks are fed into the interpolator, which generates the recovered clock.



Figure 1. The proposed clock and data recovery circuit

The circuit uses eight multi-phase clocks (C[n], where $1 \le n \le 8$) provided by phase interpolators and a phase-locked loop, which are not shown in this figure. Since source-synchronous clocking is adopted in our applications, it can be assumed that the multi-phase clocks are frequency-locked to the incoming data.

## A. Digital Phase Aligner

As shown in Figure 2, the DPA consists of a multi-phase comparator and a phase selector, which include flip-flops and OR logic. It corresponds to the upper DPA of the overall circuit in Figure 1. The multi-phase comparator compares the multi-phase clocks at the data transition time and provides select signals to the phase selector, which selects the optimum clock among the multi-phase clocks and generates the composed clock CCP.

When a transition of the input data occurs, the first column of flip-flops (FF1) in the phase comparator samples the multi-phase clocks C[n] at the rising edge of DIN and holds the latched values until the next rising edge of DIN. The second column of flip-flops (FF2) samples the latched values of C[n] at the falling edge of DIN. The circuit compares the outputs of FF2 with the multi-phase clocks and generates the select signals SP[n]. Two consecutive flip-flops are used to prevent malfunction when metastability occurs.

Figure 3 shows the timing diagram of the DPA. For example, if C[n-1] and C[n] are '1' and '0', respectively, at the rising edge of DIN, the select signal SP[n] changes from '1' to '0' at the falling edge of DIN, which means that C[n] is chosen as the optimum clock, the one closest to the center of the data bit. In Figure 3 this is shown as SP[k]: the kth select signal remains '0' and all others remain '1' until the selection changes.
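The selection rule described above (SP[n] goes to '0' where C[n-1] is '1' and C[n] is '0' at the rising edge of DIN) can be illustrated with a small behavioral sketch. It is not the fabricated logic: the clock model (50% duty cycle, C[n] delayed by (n-1)/8 UI), the wrap-around from C[8] to C[1], and the names `clock_level` and `select_signals` are assumptions made for illustration, and the FF2 retiming stage is omitted.

```python
N_PHASES = 8  # eight multi-phase clocks C[1]..C[8], as in the text


def clock_level(n, t_ui):
    """Logic level of C[n] at time t_ui (in unit intervals).

    Assumption: each clock is a 50% duty-cycle square wave with a period
    of 1 UI, and C[n] is delayed by (n - 1)/8 UI relative to C[1].
    """
    phase = (t_ui - (n - 1) / N_PHASES) % 1.0
    return 1 if phase < 0.5 else 0


def select_signals(t_rise_ui):
    """Return [SP[1], ..., SP[8]] (active low) for a DIN rising edge."""
    # FF1 behavior: sample every clock phase at the data rising edge.
    sampled = [clock_level(n, t_rise_ui) for n in range(1, N_PHASES + 1)]
    sp = []
    for i in range(N_PHASES):
        prev = sampled[i - 1]  # C[n-1]; index -1 wraps C[1] back to C[8]
        cur = sampled[i]       # C[n]
        # SP[n] is '0' only where the sampled pattern goes from '1' to '0'.
        sp.append(0 if (prev == 1 and cur == 0) else 1)
    return sp


if __name__ == "__main__":
    print(select_signals(0.05))
```

Under these assumptions the sampled clock levels form a circular run of ones followed by a run of zeros, so exactly one select signal is low for any edge position.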



Figure 2. Multi-phase comparator and phase selector

In the phase selector, the gated clocks GCP[n] are generated from the multi-phase clocks and the select signals SP[n] from the phase comparator. The circuit merges the gated clocks into CCP, the optimum sampling clock for the positive transition of the input data. When SP[n] is '0', meaning that C[n] is selected as the optimum clock for the data bit, C[n] is passed by the OR logic and GCP[n] follows C[n]. When SP[n] is '1', meaning that C[n] is not suitable for sampling the input data, C[n] is not selected and GCP[n] remains '1'. The N-input OR logic then merges the GCP[n] signals to provide CCP for the positive input data transition.
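The gating step can be sketched as follows, assuming GCP[n] = C[n] OR SP[n], which matches the behavior stated in the text (GCP[n] follows C[n] when SP[n] is '0' and stays at '1' otherwise). The text describes the merge as N-input OR logic; because the unselected gated clocks idle at '1' in this model, the sketch uses an AND reduction so that the selected clock passes through. That polarity choice, and the function names, are assumptions of the sketch rather than a description of the fabricated gates.

```python
def gated_clocks(c, sp):
    """GCP[n] = C[n] OR SP[n] for each phase (lists of 0/1 levels)."""
    return [cn | spn for cn, spn in zip(c, sp)]


def merge(gcp):
    """Merge idle-high gated clocks into one composed clock level.

    Assumption: an AND reduction is used because unselected GCP[n]
    sit at '1'; only the selected phase can pull the output low.
    """
    out = 1
    for g in gcp:
        out &= g
    return out


if __name__ == "__main__":
    sp = [1, 0, 1, 1, 1, 1, 1, 1]  # C[2] selected (active-low select)
    for c in ([1, 0, 0, 0, 0, 1, 1, 1], [0, 1, 1, 1, 1, 0, 0, 0]):
        # The composed clock level equals the level of the selected C[2].
        print(c[1], merge(gated_clocks(c, sp)))
```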



Figure 3. A timing diagram of DPA

### B. Phase Interpolator

As shown in Figure 1, the proposed circuit has two DPAs, one for the positive and one for the negative transition of DIN. The optimum sampling clock for the positive transition of DIN is driven by the upper DPA, and the optimum sampling clock for the negative transition comes from the lower one. The phase interpolator takes the two DPA outputs and mixes them to produce an output whose phase lies between the two input phases. In Figure 1, CCP from the positive digital phase aligner and CCN from the negative one are fed into the phase interpolator. When data with jitter is applied to the circuit, the phase of the sampling clock from the positive digital phase aligner can differ from that of the negative digital phase aligner.



Figure 4. A timing diagram for bit sequence with input data jitter

Figure 4 shows the timing diagram when a bit sequence with input data jitter is applied to the proposed circuit. As illustrated in Figure 4, when jitter occurs in a data bit, the transition time of DIN shifts by the amount of the incoming jitter, which results in a phase difference between the two merged clocks CCP and CCN. The interpolator averages them to reduce the jitter of the retiming clock. In Figure 4, SN[n] and CCN denote the select signals and the composed clock of the lower DPA, respectively.

To evaluate the jitter performance, the conventional circuit of [4] and the proposed circuit are modeled with the behavioral simulation described in [5]. The conventional circuit samples the multi-phase clocks only at the rising edge of the data, whereas our circuit samples them at both the rising and falling edges. Input data with an eye closure of 3/8 UI peak-to-peak (UI: unit interval) and uniformly distributed jitter is applied, and the output jitter of the recovered clocks is compared in Figure 5. In this figure, the phase variations of the input data and the recovered clock are expressed in UI. Simulation results show that the peak-to-peak jitter of both circuits is similar, about 0.4 UI, because the peak-to-peak jitter is limited by the quantized resolution of the clock phases. However, the rms jitter of the proposed circuit is reduced by 30%. The proposed clock and data recovery circuit can filter the input jitter using the phase interpolator, whereas the conventional circuit has no ability to reject it. The proposed circuit also alleviates the duty-cycle degradation of the recovered clock that results from abrupt changes of clock phase. Therefore, it overcomes the difficulty of designing an elastic buffer with sufficient timing margin.
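The averaging effect can be illustrated with a much simpler Monte Carlo sketch than the behavioral model of [5]. The assumptions here are not from the paper: rising-edge and falling-edge jitter are drawn independently from a uniform distribution, each DPA snaps to the nearest of the eight clock phases, and the interpolator output is the exact midpoint of the two selected phases. The function names and parameters are hypothetical.

```python
import random
import statistics

N_PHASES = 8
STEP_UI = 1.0 / N_PHASES  # phase resolution of the multi-phase clocks


def quantize(phase_ui):
    """Snap a phase (in UI) to the nearest multi-phase clock step."""
    return round(phase_ui / STEP_UI) * STEP_UI


def simulate(n_bits=20000, jitter_pp_ui=3 / 8, seed=1):
    """Return (rms of single-edge phase, rms of two-edge average), in UI."""
    rng = random.Random(seed)
    half = jitter_pp_ui / 2
    single, averaged = [], []
    for _ in range(n_bits):
        # Assumption: independent uniform jitter on each data edge.
        p = quantize(rng.uniform(-half, half))  # phase picked on rising edge
        q = quantize(rng.uniform(-half, half))  # phase picked on falling edge
        single.append(p)               # rising-edge-only selection
        averaged.append((p + q) / 2)   # ideal midpoint interpolation
    return statistics.pstdev(single), statistics.pstdev(averaged)


if __name__ == "__main__":
    rms_single, rms_avg = simulate()
    print(f"rms ratio (averaged / single): {rms_avg / rms_single:.2f}")
```

For independent errors of equal variance, averaging two samples lowers the rms by a factor of $1/\sqrt{2}$, while the peak-to-peak range stays set by the phase quantization. This idealized model only indicates the trend; the figures reported in the text come from the authors' behavioral simulation.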



Figure 5. Comparison of jitter performance when input data jitter occurs: (a) input data jitter, (b) the conventional circuit, and (c) the proposed circuit

## III. EXPERIMENTAL RESULTS

The proposed clock and data recovery circuit is fabricated in a 0.13um 1-poly, 8-metal n-well CMOS process. It is designed to operate at a 1Gb/s data rate and dissipates 9.7mW from a 1.2V power supply. The DPAs and the phase interpolator consume only 3mW. Figure 6 shows the circuit layout, which occupies 290um x 230um including the multi-phase clock generator block. In our realization, the DPA is designed with single-ended logic for low power dissipation and the phase interpolator with differential circuits for noise immunity. Therefore, single-ended-to-differential and differential-to-single-ended circuits are used for logic conversion. A dummy logic cell is also included to match the delay between the input data and the recovered clock.



Figure 6. The layout of the proposed circuit

A PLL was designed with second-order loop filters so that the loop bandwidth is 1/100 of the external frequency. It occupies 120um x 250um without the loop filters and provides four-phase clocks to the proposed circuit, whose multi-phase clock generator produces eight equally spaced clock phases. Figure 7 shows the measured jitter histogram of the 1GHz PLL output clock. As shown in the figure, the rms and peak-to-peak jitter of the PLL output clock are 5.9ps (0.006UI) and 34.4ps (0.034UI), respectively.

Figure 8 shows the measured eye diagram of the data recovered by the proposed circuit. The eye opening is 780ps (0.78UI) wide and 300mV high. For the measurement, a 1Gb/s $2^7$-1 PRBS was applied, and the circuit operated with no error.
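The UI figures quoted above follow directly from the 1Gb/s data rate, where one unit interval is 1000ps. The short sketch below reproduces the conversions and also derives the spacing of eight equally spaced clock phases; the helper name `ps_to_ui` is introduced here for illustration only.

```python
BIT_RATE_HZ = 1e9               # 1 Gb/s data rate from the text
UI_PS = 1e12 / BIT_RATE_HZ      # one unit interval in picoseconds (1000 ps)
N_PHASES = 8                    # eight equally spaced clock phases


def ps_to_ui(t_ps):
    """Convert a time in picoseconds to unit intervals at BIT_RATE_HZ."""
    return t_ps / UI_PS


if __name__ == "__main__":
    for label, t_ps in [("PLL rms jitter", 5.9),
                        ("PLL peak-to-peak jitter", 34.4),
                        ("eye width", 780.0)]:
        print(f"{label}: {t_ps} ps = {ps_to_ui(t_ps):.3f} UI")
    # Spacing between adjacent multi-phase clocks.
    print(f"phase step: {UI_PS / N_PHASES} ps")
```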



Figure 7. Jitter histogram of PLL output



Figure 8. Output eye diagram at 1 Gb/s

The circuit performance is summarized in Table I.

| Parameter          | Value                                                                  |
|--------------------|------------------------------------------------------------------------|
| Process Technology | 0.13um 1-poly 8-metal CMOS                                             |
| Supply Voltage     | +1.2V single supply                                                    |
| Layout Size        | 290x230um<sup>2</sup> including multi-phase clock<br>generation block  |
| Power Dissipation  | 9.7mW @ 1Gb/s                                                          |
| PLL output jitter  | rms jitter : 5.9 ps<br>p-p jitter : 34.4 ps                            |
| Bit Error Ratio    | No error with 2<sup>7</sup>-1 PRBS at 1Gb/s                            |

TABLE I. SUMMARY OF THE CIRCUIT PERFORMANCE

# IV. CONCLUSION

A clock and data recovery circuit operating at 1Gb/s is proposed and fabricated in a 0.13um CMOS process. The circuit combines digital phase aligners and a phase interpolator to reduce the jitter of the recovered clock and alleviate the problem of distorted duty cycle. Since it consumes only 9.7mW and occupies a small area per channel, it is suitable for multi-link applications. It is robust to PVT variation and operates stably owing to the digital logic and the phase interpolator. Experimental results show that it operates at 1Gb/s with a $2^7$-1 PRBS with no error.

### REFERENCES

- [1] H. Djahanshahi and C.A.T. Salama, "Differential CMOS circuits for 622-MHz/933-MHz clock and data recovery applications," IEEE J. Solid-State Circuits, vol. 35, pp. 847–855, June 2000.
- [2] C.-K. Yang and M. Horowitz, "A 0.8um CMOS 2.5Gb/s oversampling receiver and transmitter for serial links," IEEE J. Solid-State Circuits, vol. 31, pp. 2015–2023, December 1996.
- [3] R.R. Cordell, "A 45-Mbit/s CMOS VLSI digital phase aligner," IEEE J. Solid-State Circuits, vol. 23, pp. 323–328, April 1988.
- [4] H.Y. Jung, B.C. Lee, and K.C. Park, "High speed digital data retiming apparatus," U.S. Patent 5887040, March 23, 1999.
- [5] M.H. Perrott, "Fast and accurate behavioral simulation of fractional-N frequency synthesizers and other PLL/DLL circuits," Design Automation Conference 2002, Proceedings 39th, pp. 498-503, June 2002.