# A 1Gb/s CMOS Data Retiming Circuit using Synchronous Digital Phase Aligners

T. S. Cheung, B. C. Lee, and W. Y. Choi

IP Switching Technology Team, ETRI, 161 Gajong, Yusong, Daejon, 305-350, Korea

Dept. of Electrical and Electronic Eng., YONSEI University, 134 Shinchon, Seodaemun, 120-749, Seoul, Korea

*Abstract* - A 1Gb/s data retiming circuit using a newly proposed synchronous digital phase aligner is realized for multi-link applications. The data retiming circuit is implemented in a 0.35$\mu$m CMOS process technology. The experimental results show that the proposed retiming circuit recovers incoming $2^{31}-1$ pseudo-random data at rates from 580Mb/s to 1.08Gb/s.

## I. INTRODUCTION

In multi-link applications, a digital data retiming circuit is widely used to design a serial link since it is difficult to integrate multiple analog PLL-based data retiming circuits with digital circuits [1-5].

There are many types of digital data retiming circuits, such as the oversampler, the gated oscillator, and the digital phase aligner. The oversampler is cost-effective but has the disadvantage of requiring a quite large chip area because of its complexity [1]. The gated oscillator needs a very small chip area and provides good high-frequency jitter tolerance. However, it is also not suitable for multi-link applications because it is difficult to match the characteristics of the voltage-controlled oscillators (VCOs) in each serial link [2]. The digital phase aligner is the most suitable for these applications since it needs a small chip area and uses a small number of multiphase clocks compared to the oversampling technique [3,4].

The conventional data retiming circuit using a digital phase aligner, however, is limited in its operating frequency because it retimes the incoming data using a bit clock synthesized by merging two or three neighboring multiphase clocks to prevent metastability, which results in a poor timing margin. In addition, since the phase relationship between the merged bit clock and the reference bit clock is unknown, it is difficult to design an elastic buffer that changes the clock domain of the retimed data from the merged bit clock to the reference bit clock [4].

We introduce a new data retiming circuit using a newly proposed synchronous digital phase aligner to overcome these problems [5].

## II. CIRCUIT ARCHITECTURE

Fig. 1 shows a block diagram of the proposed data retiming circuit. It consists of one phase comparator, two synchronous digital phase aligners, and one selector. It uses seven multiphase bit clocks (CP[n], where $-3 \le n \le +3$) that are frequency-locked to the incoming data stream (DIN), and CP[0] is used as the system bit clock. Since CP[n] can easily be generated by applying a bit clock to a simple delay line, its generation is not described in this paper.



Fig. 1. Proposed data retiming circuit.



Fig. 2. Phase comparator and its timing diagram.

### A. Phase Comparator

Fig. 2 shows the n-th slice of the phase comparator and its timing diagram. In this figure, C[n] is the value of CP[n] latched at the rising transition of DIN, and $\overline{C[n-1]}$ is the inverted value of C[n-1] from the (n-1)-th slice. (At the first slice of the phase comparator, $\overline{C[3]}$ is used as an input of the two-input AND gate instead of $\overline{C[-4]}$.)

This chip was fabricated through the MPW program at IC Design Education Center (IDEC), Korea.

When C[n-1] is "0" and C[n] is "1", the rising edge of CP[n] is positioned in the range T/2 ~ T/2+$\Delta$, which is the optimum timing to retime the corresponding DIN bit, where T is the bit clock period and $\Delta$ is T/7 (the delay between CP[n-1] and CP[n]). The phase comparison result (P[n]) becomes "1" when CP[n] is in the optimum timing region and returns to "0" otherwise. Using two D-type flip-flops and one OR gate, P[n] is retimed to CP[n] and stretched to overlap for one bit clock period, which prevents malfunction when a metastable condition occurs.
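The decision rule of the phase comparator can be illustrated with a small behavioral sketch. It is not the authors' circuit: it assumes ideal 50%-duty clocks in which CP[n] rises at n·Δ (modulo T), ignores the flip-flop retiming and one-bit overlap, and uses hypothetical names (`phase_compare`, `time_to_next_rising_edge`). Exact fractions are used so that the region boundaries are not blurred by rounding.

```python
from fractions import Fraction

PHASES = range(-3, 4)   # n = -3 .. +3, as in the text
NUM_PHASES = 7          # seven multiphase clocks, so delta = T/7


def phase_compare(t_edge, period=Fraction(1)):
    """Return {n: P[n]} for a DIN rising edge at time t_edge.

    Assumption: CP[n] is an ideal 50%-duty clock rising at n*delta (mod period).
    """
    delta = period / NUM_PHASES

    def latched(n):
        # C[n]: level of CP[n] at the DIN edge (high during its first half period).
        return ((t_edge - n * delta) % period) < period / 2

    result = {}
    for n in PHASES:
        prev = 3 if n == -3 else n - 1  # the first slice uses C[3] instead of C[-4]
        result[n] = int(latched(n) and not latched(prev))
    return result


def time_to_next_rising_edge(n, t_edge, period=Fraction(1)):
    """Delay from the DIN edge to the next rising edge of CP[n]."""
    delta = period / NUM_PHASES
    since_last_rise = (t_edge - n * delta) % period
    return period - since_last_rise


if __name__ == "__main__":
    t = Fraction(3, 10)  # DIN rising edge at 0.3 T
    p = phase_compare(t)
    chosen = [n for n, flag in p.items() if flag]
    print("selected phase:", chosen)
    print("delay to its next rising edge:", time_to_next_rising_edge(chosen[0], t))
```

Under these assumptions exactly one P[n] is "1" for any DIN edge position, and the next rising edge of the selected CP[n] falls between T/2 and T/2+Δ after the DIN transition.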



Fig. 3. Synchronous digital phase aligner and its timing diagram.



Fig. 4. Result extending algorithm of the synchronous digital phase aligner.

### B. Synchronous Digital Phase Aligner

Fig. 3 shows the synchronous digital phase aligner and its timing diagram. For n less than 0, the P[n]'s are first aligned to $\overline{CP[0]}$ and then to CP[0]; the remaining P[n]'s are aligned directly to CP[0], as shown in the timing diagram of Fig. 3.

After all the P[n]'s are aligned to CP[0], they are extended to two-bit-wide results (P'[m]), where $-6 \le m \le +6$, by delaying their timing from 0 to 2T as shown in Fig. 4. As a result, the proposed data retiming circuit can always provide a minimum jitter or wander tolerance of up to 0.85T.

Fig. 4 shows two different signal groups in P'[m] that are spaced and delayed by one bit period from each other. The selector keeps the group closer to the center of the two-bit-wide window as the valid signal group and discards the other as the invalid signal group.

The synchronous digital phase aligner shown in Fig. 3 is also used to sample DIN and to extend the sampled signals to two-bit-wide sampled signals (DIN'[m]). When the circuit is used for sampling DIN, DIN is applied to the first D-type flip-flop stage instead of the P[n]'s.

### C. Selector

The selector receives the DIN'[m]'s and P'[m]'s and selects one valid data bit (DOUT) as the retimed data output by combining these inputs according to the following equation:

$$\mathrm{DOUT} = \sum_{m=-6}^{0} \left( DIN'[m] \times P'[m] \times \sum_{n=-3}^{m+3} S[n] \right) + \sum_{m=1}^{6} \left( DIN'[m] \times P'[m] \times \sum_{n=m-3}^{3} S[n] \right) \qquad (1)$$

where S[n] is the initial value of P'[n], used to discard the invalid P'[m] values. The products and sums in (1) correspond to logical AND and OR operations, so the selector can be easily implemented using thirteen 7-input OR gates, thirteen 3-input AND gates, and one 13-input OR gate. Although the summation of the S[n]'s does not require high-speed operation, the 13-input OR gate must operate at the bit clock frequency; hence it should be implemented as a combination of several high-speed 3-input NOR and NAND gates.
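Equation (1) can be read as a purely combinational function, and the following sketch evaluates it bit by bit. It is a behavioral illustration only, assuming that the products are logical ANDs and the sums logical ORs; the names `s_indices` and `select_dout` and the dictionary-based signal representation are hypothetical, not taken from the paper.

```python
def s_indices(m):
    """Indices n of the S[n] terms that gate P'[m] in equation (1)."""
    if m <= 0:
        return range(-3, m + 4)   # n = -3 .. m+3
    return range(m - 3, 4)        # n = m-3 .. 3


def select_dout(din_ext, p_ext, s):
    """Evaluate equation (1) with AND for products and OR for sums.

    din_ext, p_ext: dicts mapping m in -6..6 to 0/1 (DIN'[m], P'[m]).
    s: dict mapping n in -3..3 to 0/1 (S[n]).
    """
    dout = 0
    for m in range(-6, 7):
        enable = int(any(s[n] for n in s_indices(m)))  # up to 7-input OR
        dout |= din_ext[m] & p_ext[m] & enable         # 3-input AND, then 13-input OR
    return dout


if __name__ == "__main__":
    din = {m: 0 for m in range(-6, 7)}
    p = {m: 0 for m in range(-6, 7)}
    s = {n: 0 for n in range(-3, 4)}
    s[0] = 1                # only S[0] set, so m = -3 .. 3 are enabled
    din[2] = p[2] = 1
    print(select_dout(din, p, s))
```

In this reading, each S[n] enables the P'[m] terms within three positions of n, which is one way to see how a P'[m] from the other signal group is masked out.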



Fig. 5. Chip layout. (4mm x 4mm)

## III. EXPERIMENTAL RESULTS

The proposed data retiming circuit was designed to operate at data rates up to 1Gb/s and was implemented in a $0.35\mu$m CMOS process technology. Four serial link channels are included in one chip to provide an aggregate bandwidth of 4Gb/s. The chip operates from a single +3.3V power supply and consumes 2W when all the channels are active. Fig. 5 shows the chip layout. One channel occupies 0.36mm x 0.68mm.

The PLL was designed to be fully differential and includes a 1/8 frequency divider to synthesize the 1GHz bit clock from a 125MHz external reference clock source. The on-chip loop filter was designed to give the PLL a loop bandwidth of 5MHz.

The PLL is fully functional with a lock range of 580MHz ~ 1.08GHz with the input reference signal ranging from 72.5 to 126MHz. The VCO used in the PLL is described in [6].



Fig. 6. Bit clock output jitter histogram.

Fig. 6 shows the measured jitter histogram of the 1GHz output clock signal of the PLL. As shown in this figure, the rms and peak-to-peak jitter of the PLL output are 4.5ps (0.005UI) and 34.2ps (0.034UI), respectively.
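As a quick check of the unit-interval figures, assuming that one UI equals the 1000ps period of the 1GHz bit clock:

$$\frac{4.5\,\mathrm{ps}}{1000\,\mathrm{ps}} = 0.0045 \approx 0.005\,\mathrm{UI}, \qquad \frac{34.2\,\mathrm{ps}}{1000\,\mathrm{ps}} = 0.0342 \approx 0.034\,\mathrm{UI}$$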



Fig. 7. Input data eye diagram at 1Gb/s.

Fig. 7 shows the measured eye diagram of the incoming data stream. For the measurement, a 1Gb/s $2^{31}-1$ pseudo-random bit sequence was applied through a 10.5m transmission path consisting of three 1.5m coaxial cables and two 3m twin-ax cables. The measured eye width and height of the incoming signal were 620ps and 120mV, respectively.



Fig. 8. Output data eye diagram of the proposed circuit at 1Gb/s.

Fig. 8 shows the measured eye diagram of the retimed output signal of the proposed circuit. The measured eye width and height of the recovered output are 920ps and 700mV, respectively.

The proposed circuit shows unstable bit error ratio (BER) characteristics when it operates at data rates of about 950Mb/s or higher. Repeated experiments suggest that this instability depends on the phase relationship between the incoming data stream and the multiphase clocks, but further analysis is required. The proposed circuit operates error-free up to 900Mb/s.

## IV. CONCLUSION

A 580Mb/s ~ 1.08Gb/s data retiming circuit was realized in a $0.35\mu$m CMOS process technology. The circuit includes a newly proposed synchronous digital phase aligner that provides a minimum jitter or wander tolerance of up to 0.85T, and a phase comparator that is immune to occasional metastable conditions. The experimental results show that the data retiming circuit operates error-free up to 900Mb/s but becomes unstable at data rates higher than 950Mb/s, which requires further analysis.

## REFERENCES

- [1] C.-K. Yang and M. Horowitz, "A 0.8µm CMOS 2.5Gb/s oversampling receiver and transmitter for serial links," IEEE J. Solid-State Circuits, vol.31, no.12, December 1996, pp.2015-2023.
- [2] A.E. Dunlop, W.C. Fischer, M. Banu, and T. Gabara, "150/30Mb/s CMOS non-oversampled clock and data recovery circuits with instantaneous locking and jitter rejection," in ISSCC Dig. Tech. Papers, February 1995, p.44.
- [3] R.R. Cordell, "A 45-Mbit/s CMOS VLSI digital phase aligner," IEEE J. Solid-State Circuits, vol.23, no.2, April 1988, pp.323-328.
- [4] H.Y. Jung, B.C. Lee, and K.C. Park, "High speed digital data retiming apparatus," U.S. Patent 5887040, March 23, 1999.
- [5] T.S. Cheung, B.C. Lee, and E.C. Choi, "A data recovery and retiming unit for multi-link using multi-phase clocks," Korean Patent Pending, 10-2002-0019167, 2002.
- [6] T.S. Cheung, B.C. Lee, E.C. Choi, and W.Y. Choi, "A 1.8-3.2-GHz fully differential GaAs MESFET PLL," IEEE J. Solid-State Circuits, vol.36, no.4, April 2001, pp.605-610.