# A Low-Power 40-Gb/s Pre-Emphasis PAM-4 Transmitter With Toggling Serializers

Dae-Hyun Kwon, Minkyu Kim, Sung-Geun Kim, and Woo-Young Choi

*Abstract*—We demonstrate a 40-Gb/s PAM-4 transmitter with 2-tap pre-emphasis, whose power consumption is significantly reduced by the use of toggling serializers. In addition, a new type of PAM-4 source-series terminated (SST) driver is proposed, which enables easy control of the pre-emphasis gain while maintaining impedance matching. A prototype 40-Gb/s PAM-4 transmitter containing these circuit ideas is successfully realized in 28-nm CMOS technology. It occupies 0.006 mm<sup>2</sup> and achieves energy efficiencies of 1.2 pJ/bit for 40-Gb/s operation without any pre-emphasis and 1.68 pJ/bit with 9.54-dB pre-emphasis gain.

*Index Terms*—PAM-4 transmitter, toggling serializer, preemphasized PAM-4, SST driver, high speed serial link, multiphase, PAM-4 receiver.

## I. INTRODUCTION

AS THE data rates of SerDes systems for various interconnect applications continuously increase, 4-level Pulse Amplitude Modulation (PAM-4) has become the required technique for achieving high data rates with reduced bandwidth. There have been many research and development activities in realizing PAM-4 transceivers [1], [2]. However, with multiple signal levels, realizing high-performance transceivers with low power consumption is a great design challenge. In particular, new circuit ideas are needed for implementing PAM-4 transmitters with pre-emphasis capability.

Fig. 1(a) shows the architecture of the conventional PAM-4 transmitter with 2-tap pre-emphasis [3]. Here, the MSB and LSB serializers produce $D_{\rm M}$ and $D_{\rm L}$, respectively; for pre-emphasis, these signals are delayed by one UI in high-speed DFFs and combined in the driver. To generate the desired PAM-4 signal levels, the driver should produce signals twice as large for MSB as for LSB. The desired amount of pre-emphasis is determined by $\alpha$, which represents the ratio of the one-UI-delayed LSB signals to those without delay.

D.-H. Kwon, M. Kim, and W.-Y. Choi are with the High-Speed Circuits and Systems Laboratory, Department of Electrical and Electronic Engineering, Yonsei University, Seoul 120-749, South Korea (e-mail: wchoi@yonsei.ac.kr).

S.-G. Kim was with the High-Speed Circuits and Systems Laboratory, Yonsei University, Seoul 120-749, South Korea. He is now with Foundry Division, Samsung Electronics, Seoul 18448, South Korea.


Digital Object Identifier 10.1109/TCSII.2019.2922040

Although the structure shown in Fig. 1(a) is conceptually straightforward, its implementation requires many buffers and DFFs for processing high-speed signals. CML-type combiners are used in [3], resulting in very large power consumption. In [4], Source-Series Terminated (SST) drivers are used to reduce power consumption, but many sets of resistors are needed for impedance matching, resulting in a very complex configuration.

In this brief, we demonstrate a new architecture for a PAM-4 transmitter with pre-emphasis that has a much simpler configuration and higher energy efficiency than previously reported PAM-4 transmitters. The architecture is based on the toggling serializer [5], which extracts input data transition information and uses it for both serialization and pre-emphasis, resulting in much lower power consumption and a smaller chip area. In addition, a new SST driver is demonstrated with which the pre-emphasis gain can be easily controlled while maintaining the required output impedance.

This brief is organized as follows. In Section II, the structure of the proposed PAM-4 transmitter with pre-emphasis is described, and the structure of the new SST driver is explained. Section III gives details of the circuit implementation for key building blocks. Section IV discusses measurement results of the prototype chip. Section V gives the conclusion.

## II. PAM-4 TRANSMITTER ARCHITECTURE

### A. PAM-4 Transmitter Based on Toggling Serializers

Fig. 1(b) shows the structure of the proposed PAM-4 transmitter. Each of the MSB and LSB toggling serializers generates serialized data as well as two sets of data transition signals. For MSB, $T_{\rm MR}$ is logic high only when there is a rising data transition, and $T_{\rm MF}$ is logic high only when there is a falling data transition. $T_{\rm LR}$ and $T_{\rm LF}$ represent the same transition information for LSB. The proposed toggling serializer does not require high-speed DFFs and clock buffers, which reduces power consumption. Fig. 2 shows timing diagrams of these transition signals for sample input NRZ data, $D_{\rm M}$ for MSB and $D_{\rm L}$ for LSB. In the figure, $T_{\rm MR}$ and $T_{\rm LR}$ are logic high in the second interval, as both $D_{\rm M}$ and $D_{\rm L}$ make a transition from logical zero to one. On the other hand, $T_{\rm MF}$ and $T_{\rm LF}$ are logic high in the sixth interval, as both $D_{\rm M}$ and $D_{\rm L}$ make a transition from logical one to zero. These transition signals can be used for pre-emphasis implementation as well as serialization, as demonstrated for a low-power NRZ transmitter in [5]. As shown in Fig. 2 for the sample MSB and LSB data, the desired pre-emphasized PAM-4 signals, $D_{\rm O}$, can be obtained

1549-7747 © 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications\_standards/publications/rights/index.html for more information.

Manuscript received April 17, 2019; accepted June 7, 2019. Date of publication June 10, 2019; date of current version March 4, 2020. This work was supported in part by the Samsung Electronics, Materials and Parts Technology Research and Development Program funded by the Korean Ministry of Trade, Industry and Energy under Project 10065666, and in part by the Graduate School of Yonsei University Research Scholarship Grants. This brief was recommended by Associate Editor W. Namgoong. (*Corresponding author: Woo-Young Choi.*)



Fig. 1. (a) Conventional PAM-4 transmitter and (b) proposed PAM-4 transmitter.



Fig. 2. Timing diagram for producing pre-emphasized PAM-4 signals.



Fig. 3. Source-series terminated driver and clipper circuit with its feedback circuits for PAM-4 transmitter.

with the following simple logic operation,

$$D_{\rm O} = 2 \cdot D_{\rm M} + 2\alpha \cdot (T_{\rm MR} - T_{\rm MF}) + D_{\rm L} + \alpha \cdot (T_{\rm LR} - T_{\rm LF}). \tag{1}$$

Note that the above operation does not require any delay elements, which would otherwise significantly increase power consumption.
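The relation in Eq. (1) can be checked numerically with a small behavioral sketch. The code below is only an illustration in normalized units, not the circuit itself: the function names and the sample bit patterns are hypothetical, and the bit preceding the first sample is assumed to be 0. It evaluates Eq. (1) from the transition flags and compares the result with the delay-based 2-tap form $(1+\alpha)S[n] - \alpha S[n-1]$, where $S = 2D_{\rm M} + D_{\rm L}$.

```python
def transitions(bits):
    """Return (rising, falling) flags for an NRZ bit sequence.

    The bit before the first sample is assumed to be 0 (an assumption of
    this sketch, not something the text specifies).
    """
    rising, falling = [], []
    previous = 0
    for bit in bits:
        rising.append(1 if (previous == 0 and bit == 1) else 0)
        falling.append(1 if (previous == 1 and bit == 0) else 0)
        previous = bit
    return rising, falling


def pre_emphasized_output(d_m, d_l, alpha):
    """Evaluate Eq. (1) for every unit interval, in normalized units."""
    t_mr, t_mf = transitions(d_m)
    t_lr, t_lf = transitions(d_l)
    return [
        2 * m + 2 * alpha * (mr - mf) + l + alpha * (lr - lf)
        for m, mr, mf, l, lr, lf in zip(d_m, t_mr, t_mf, d_l, t_lr, t_lf)
    ]


def two_tap_reference(d_m, d_l, alpha):
    """Delay-based form: (1 + alpha) * S[n] - alpha * S[n-1], S = 2*D_M + D_L."""
    symbols = [2 * m + l for m, l in zip(d_m, d_l)]
    delayed = [0] + symbols[:-1]  # one-UI delay, previous symbol assumed 0
    return [(1 + alpha) * s - alpha * p for s, p in zip(symbols, delayed)]


# Hypothetical sample data, chosen only to exercise both kinds of transition.
d_m = [0, 1, 1, 1, 1, 0, 0, 1]
d_l = [0, 1, 1, 0, 1, 0, 1, 1]
print(pre_emphasized_output(d_m, d_l, alpha=1))
print(two_tap_reference(d_m, d_l, alpha=1))
```

The two forms agree because $T_{\rm R} - T_{\rm F}$ equals the difference between the current and previous bit, so the transition flags carry the same information that a one-UI delay element would provide.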

### B. SST Driver for PAM-4 Signals

Fig. 3 shows the half circuit of the differential SST driver used in this brief. $D_{\rm M}$, $D_{\rm L}$, $T_{\rm MR}$, $\overline{T_{\rm MF}}$, $T_{\rm LR}$, and $\overline{T_{\rm LF}}$ first pass through the clippers so that the desired MOSFET turn-on resistance values ($R_{\rm ON,M}$, $R_{\rm ON,L}$, $R_{\rm ON,TM}$, $R_{\rm ON,TL}$) can be realized [6]. In addition, the values of the series resistances ($R_{\rm S,M}$, $R_{\rm S,L}$, $R_{\rm S,TM}$, $R_{\rm S,TL}$) are selected so that impedance matching is achieved. To provide the output voltage swing levels required for the desired pre-emphasis gain, the CMOS drivers for $T_{\rm MR}$, $\overline{T_{\rm MF}}$, $T_{\rm LR}$, and $\overline{T_{\rm LF}}$ are implemented with variable supply voltages $V_{\rm A}$ and $V_{\rm B}$. The clippers are designed with feedback circuits (shown on the right side of Fig. 3) that produce $V_1$ and $V_2$, which are used within each clipper to achieve an output impedance of $R/\alpha$ regardless of $V_{\rm A}$ and $V_{\rm B}$ in the case of the $T_{\rm LR}$ and $\overline{T_{\rm LF}}$ signals. In our implementation, the tuning range is limited to 0.8 V to 1.2 V for $V_{\rm A}$ and to 0 V to 0.4 V for $V_{\rm B}$, so that any mismatch in driving speed and timing is minimized between the toggling signals ($T_{\rm MR}$, $\overline{T_{\rm MF}}$, $T_{\rm LR}$, $\overline{T_{\rm LF}}$), which have varying supply voltages, and the data signals ($D_{\rm M}$, $D_{\rm L}$), which have a fixed supply voltage.

The operation of the proposed SST driver can be more easily explained with the equivalent circuit shown in Fig. 4. Here, $R$, the total series resistance for $D_{\rm L}$, represents the sum of $R_{\rm ON,L}$ and $R_{\rm S,L}$. $D_{\rm M}$ has a total series resistance of $R/2$, since the output voltage for MSB should be twice as large as for LSB. The desired LSB output voltage swing with pre-emphasis can be achieved by providing a total series resistance of $R/\alpha$ to $T_{\rm LR}$ and $\overline{T_{\rm LF}}$. Likewise, the MSB output voltage swing with pre-emphasis is achieved with a total series resistance of $R/(2\alpha)$ for $T_{\rm MR}$ and $\overline{T_{\rm MF}}$.

From the equivalent circuit shown in Fig. 4, the total output impedance of the proposed SST driver, $Z_{\rm O}$, can be determined as

$$Z_{\rm O} = \left(\frac{\rm R}{2} \| \frac{\rm R}{2\alpha} \| \frac{\rm R}{2\alpha}\right) \| \left(\rm R \| \frac{\rm R}{\alpha} \| \frac{\rm R}{\alpha}\right). \tag{2}$$

With this, we can determine the resistance values that satisfy Eq. (2). For the implementation, $\alpha = 1$ and $R = 450\ \Omega$ are used, which provide $Z_{\rm O} = 50\ \Omega$. In order to realize $R = 450\ \Omega$, the



Fig. 4. Equivalent circuit for PAM-4 source-series terminated driver.



Fig. 5. Output voltage levels for pre-emphasized PAM-4 signals.

SST driver is designed so that  $R_{ON,M} = 75 \Omega$ ,  $R_{S,M} = 150 \Omega$ ,  $R_{ON,TM} = 75 \Omega$ ,  $R_{S,TM} = 150 \Omega$ ,  $R_{ON,L} = 65 \Omega$ ,  $R_{S,L} = 385 \Omega$ ,  $R_{ON,TL} = 65 \Omega$ , and  $R_{S,TL} = 385 \Omega$ .
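As a quick sanity check of Eq. (2) and the resistor values quoted above, the following sketch evaluates the parallel combination numerically. The helper names (`parallel`, `output_impedance`, `branch_totals`) are hypothetical and introduced only for illustration; the sketch assumes ideal resistors and $\alpha > 0$.

```python
def parallel(*resistances):
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistances)


def output_impedance(r, alpha):
    """Evaluate Eq. (2) for total LSB series resistance r and alpha > 0."""
    msb_side = parallel(r / 2, r / (2 * alpha), r / (2 * alpha))
    lsb_side = parallel(r, r / alpha, r / alpha)
    return parallel(msb_side, lsb_side)


R = 450.0     # total LSB series resistance in ohms, from the text
ALPHA = 1.0   # value used for the implementation, from the text

z_o = output_impedance(R, ALPHA)

# Branch totals R_ON + R_S built from the quoted design values (ohms).
branch_totals = {
    "D_M": 75 + 150,   # expected R / 2
    "T_M": 75 + 150,   # expected R / (2 * alpha)
    "D_L": 65 + 385,   # expected R
    "T_L": 65 + 385,   # expected R / alpha
}

print(f"Z_O = {z_o:.1f} ohm")
print(branch_totals)
```

Summing the branch conductances in Eq. (2) gives the closed form $Z_{\rm O} = R/(3 + 6\alpha)$, which reduces to $R/9 = 50\ \Omega$ for $\alpha = 1$ and $R = 450\ \Omega$.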

There are a total of 16 different output voltage levels for all the possible PAM-4 data transitions with pre-emphasis, as shown in Fig. 5. The black lines in the figure represent the output voltage levels when there is no pre-emphasis. The gray lines represent the output voltages with pre-emphasis. The exact values of these 16 output voltage levels can be calculated from the equivalent circuit shown in Fig. 4. In addition, the pre-emphasis gain, $G_{\rm PRE}$, can be defined as shown in Fig. 5 and is given by

$$G_{\text{PRE}} = 20 \cdot \log \left( 1 + \frac{2 \cdot \alpha \cdot (V_{\text{A}} - V_{\text{B}})}{V_{\text{DD}}} \right). \tag{3}$$

From Eq. (1) and Fig. 2, $G_{\rm PRE}$ would originally be defined as $20 \cdot \log(1 + 2\alpha)$; however, in the proposed structure, $V_{\rm A}$ and $V_{\rm B}$ are



Fig. 6. Post-layout simulation results for PAM-4 SST driver output signals: (a) without any pre-emphasis, (b)  $\alpha \cdot (V_A - V_B)/V_{DD} = 1/3$ , (c)  $\alpha \cdot (V_A - V_B)/V_{DD} = 2/3$ , and (d)  $\alpha \cdot (V_A - V_B)/V_{DD} = 1$ .

changed to control the pre-emphasis gain, so that $V_{\rm A} - V_{\rm B}$ enters as a variable, normalized by the constant $V_{\rm DD}$.
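Eq. (3) can be evaluated directly for the pre-emphasis settings listed in the caption of Fig. 6. The sketch below is a plain numerical evaluation of that formula, taking the logarithm as base 10; the function name is hypothetical, and the $(V_{\rm A}, V_{\rm B})$ pairs are example points chosen inside the tuning range given earlier (0.8 V to 1.2 V for $V_{\rm A}$, 0 V to 0.4 V for $V_{\rm B}$).

```python
import math


def pre_emphasis_gain_db(alpha, v_a, v_b, v_dd):
    """Pre-emphasis gain in dB according to Eq. (3)."""
    return 20.0 * math.log10(1.0 + 2.0 * alpha * (v_a - v_b) / v_dd)


V_DD = 1.2   # supply voltage in volts
ALPHA = 1.0  # value used for the implementation

# Example (V_A, V_B) pairs giving alpha * (V_A - V_B) / V_DD = 1/3, 2/3, 1.
settings = [(0.8, 0.4), (1.0, 0.2), (1.2, 0.0)]

for v_a, v_b in settings:
    gain = pre_emphasis_gain_db(ALPHA, v_a, v_b, V_DD)
    print(f"V_A = {v_a:.1f} V, V_B = {v_b:.1f} V -> G_PRE = {gain:.2f} dB")
```

With $V_{\rm A} - V_{\rm B} = V_{\rm DD}$, the expression reduces to $20 \cdot \log(1 + 2\alpha)$, the form obtained from Eq. (1) alone.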

Fig. 6 shows the post-layout simulation results in 28-nm CMOS technology for pre-emphasized 40-Gb/s PAM-4 signals produced by the SST driver. For these simulations, $\alpha = 1$, $V_{\rm DD} = 1.2$ V, and $Z_{\rm L} = 50\ \Omega$ are used, and $V_{\rm A} - V_{\rm B}$ is changed from 0.4 V to 1.2 V while the common-mode voltage of $V_{\rm A}$ and $V_{\rm B}$ is maintained at 0.6 V. The simulation result without any pre-emphasis is shown in Fig. 6(a). The Ratio of Level Mismatch (RLM) of this signal, taken here as the minimum of the three eye openings divided by their average, is 0.93. Even though this RLM value satisfies the IEEE 802.3bs standard [7], the result can be further improved by optimizing the layout and increasing the gain of the clipper feedback circuits. Because $V_{\rm A} - V_{\rm B}$ is changed symmetrically around the common-mode voltage, $V_{\rm A}$ and $V_{\rm B}$ can be controlled with one external voltage. With these conditions, $G_{\rm PRE}$ ranges from 4.44 dB to 9.54 dB. As can be seen in the figure, impedance-matched 40-Gb/s PAM-4 signals having different pre-emphasis gains are produced with one external control voltage.
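The RLM figure quoted above follows the simple ratio described in the text: the smallest of the three eye openings divided by their average. A minimal sketch of that calculation is given below. The function name is hypothetical, and the three eye-opening values are made-up numbers chosen only to illustrate the arithmetic; they are not simulation or measurement data from this brief, and the sketch does not reproduce the formal definition in the standard.

```python
def ratio_of_level_mismatch(eye_openings):
    """RLM as described in the text: min eye opening / mean eye opening."""
    if len(eye_openings) != 3:
        raise ValueError("PAM-4 has exactly three eye openings")
    if min(eye_openings) <= 0:
        raise ValueError("eye openings must be positive")
    average = sum(eye_openings) / len(eye_openings)
    return min(eye_openings) / average


# Hypothetical, normalized eye openings (lower, middle, upper eye).
example_openings = [0.31, 0.33, 0.36]
print(f"RLM = {ratio_of_level_mismatch(example_openings):.2f}")
```

Perfectly uniform eye openings give a ratio of exactly 1, so the ratio measures how far the smallest eye falls below the average.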

## III. CIRCUIT IMPLEMENTATION

Fig. 7 shows the block diagram of the transmitter in this brief. It includes a PRBS generator, two NRZ serializers, clock buffers, frequency dividers, single-to-differential converters (S2Ds), SR latches, and an SST driver. The PRBS generator produces 8 × 5-Gb/s PRBS $2^7-1$ parallel NRZ signals, which are first converted into RZ signals with synchronized quadrature clock signals (*CK*<sub>0</sub>, *CK*<sub>90</sub>, *CK*<sub>180</sub>, and *CK*<sub>270</sub>). This conversion is necessary for achieving toggling serialization with resettable DFFs [8]. Although the RZ format requires twice the bandwidth of NRZ, the RZ-format signals are used only for generating the toggling signals (*T*<sub>MR</sub>, *T*<sub>MF</sub>, *T*<sub>LR</sub>, *T*<sub>LF</sub>), which themselves have the NRZ format. The increased power consumption due to the RZ-format signals is compensated by the elimination of power-hungry high-frequency buffers with our toggling serializer [5], [8]. Quadrature clock signals are



Fig. 7. Block diagram of PAM-4 transmitter.



Fig. 8. Measurement setup, and summary of area and power for each block.

generated with a frequency divider and externally supplied half-rate clock signals ($CK_{\rm P}$ and $CK_{\rm N}$) having $f_{\rm CK} = 10$ GHz. Two toggling serializers produce the transition signals, $T_{\rm MR}$ and $T_{\rm MF}$ for MSB and $T_{\rm LR}$ and $T_{\rm LF}$ for LSB. The S2Ds convert $T_{\rm MR}$, $T_{\rm MF}$, $T_{\rm LR}$, and $T_{\rm LF}$ into differential signals. SR latches [8] produce serialized NRZ data for MSB and LSB from these transition signals. At the same time, they buffer the transition signals so that they have the same timing delay. The transition signals and serialized data are supplied directly to the SST driver, where they are summed to produce PAM-4 output signals with pre-emphasis.
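The way an SR latch recovers serialized NRZ data from the rising and falling transition signals can be illustrated with a short behavioral model. This is a sketch under stated assumptions, not the circuit described in [8]: the function names are hypothetical, the latch is modeled as one ideal update per unit interval, and the initial state is assumed to be 0.

```python
def transition_flags(bits, initial=0):
    """Derive rising and falling transition flags from an NRZ bit sequence."""
    rising, falling = [], []
    previous = initial
    for bit in bits:
        rising.append(1 if (previous == 0 and bit == 1) else 0)
        falling.append(1 if (previous == 1 and bit == 0) else 0)
        previous = bit
    return rising, falling


def sr_latch(set_flags, reset_flags, initial=0):
    """Behavioral SR latch: set on a rising flag, reset on a falling flag."""
    state = initial
    output = []
    for s, r in zip(set_flags, reset_flags):
        if s and r:
            raise ValueError("set and reset asserted in the same interval")
        if s:
            state = 1
        elif r:
            state = 0
        output.append(state)  # state is held when neither flag is high
    return output


# Hypothetical bit pattern used only to exercise the model.
data = [0, 1, 1, 0, 0, 1, 0, 1]
t_rise, t_fall = transition_flags(data)
recovered = sr_latch(t_rise, t_fall)
print(recovered == data)
```

Because a rising and a falling transition cannot occur in the same unit interval, the set and reset inputs are never asserted together, which is why a simple SR latch is sufficient in this model.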

## IV. MEASUREMENT RESULTS

A prototype 40-Gb/s PAM-4 transmitter is implemented in 28-nm CMOS technology. The chip microphotograph and the measurement setup are shown in Fig. 8. The chip is mounted on an FR-4 printed circuit board with wire bonding for supply voltages and low-frequency control signals. High-frequency signals are measured with on-chip probes. Also shown in Fig. 8 is a table showing the power consumption



Fig. 9. Measured eye-diagram of 40-Gb/s PAM-4 transmitter output.

and the chip area of each block. Excluding the PRBS generator, the prototype chip consumes 47.9 mW without any pre-emphasis and 67.3 mW with the highest pre-emphasis gain of 9.54 dB at a 1.2-V supply voltage. It occupies 0.006 mm<sup>2</sup>. Fig. 9 shows the measured eye diagram for the 40-Gb/s PAM-4 output without any pre-emphasis.
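The energy-efficiency figures quoted in the abstract follow from the power numbers above divided by the data rate. The short calculation below makes the unit conversion explicit; the function name is hypothetical.

```python
def energy_per_bit_pj(power_mw, data_rate_gbps):
    """Energy per bit in pJ/bit.

    1 mW / (1 Gb/s) = 1e-3 W / 1e9 bit/s = 1e-12 J/bit = 1 pJ/bit,
    so the ratio of mW to Gb/s is already in pJ/bit.
    """
    return power_mw / data_rate_gbps


DATA_RATE_GBPS = 40.0

for label, power_mw in [("no pre-emphasis", 47.9), ("9.54-dB pre-emphasis", 67.3)]:
    efficiency = energy_per_bit_pj(power_mw, DATA_RATE_GBPS)
    print(f"{label}: {efficiency:.2f} pJ/bit")
```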

For measuring the performance of pre-emphasized data, two FR4 traces of different lengths are used, whose measured channel characteristics are shown in Fig. 10. Also shown are the channel losses at the Nyquist frequency of 10 GHz. The measured eye diagrams after transmission through these traces are shown in Fig. 11. The center of each eye is slightly misaligned in Fig. 11(b) and (d) due to the skew of the quadrature clocks from the frequency dividers used for the RZ data aligners. This can be improved by more precise quadrature clock generation using phase-locked loops (PLLs) or by phase calibration techniques.

For each measurement,  $V_{\rm A}$ - $V_{\rm B}$  is controlled so that the optimum eye diagram is achieved. For this, Channel 1 requires



Fig. 10. Measured channel characteristics.



Fig. 11. Measured eye-diagrams with/without pre-emphasis for different channels.



Fig. 12. Measured return loss of SST driver.

 $G_{\text{PRE}}$ of 4.44 dB, and Channel 2 requires 9.54 dB. For Channel 1, the optimal value is larger than the corresponding channel loss at the Nyquist frequency shown in Fig. 10. This is due to additional losses caused by I/O pads and connectors. For Channel 2, the channel loss (10.42 dB) is larger than the maximum $G_{\text{PRE}}$ available with the present implementation (9.54 dB), resulting in a degraded eye opening.
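The limitation for Channel 2 can be illustrated by inverting Eq. (3) to find the $V_{\rm A} - V_{\rm B}$ that a given gain would require. The sketch below assumes, as a simplification, that the target pre-emphasis gain equals the channel loss at the Nyquist frequency; the function name is hypothetical, and the 1.2-V limit is taken from the tuning range stated for $V_{\rm A}$ and $V_{\rm B}$.

```python
def required_va_minus_vb(gain_db, alpha, v_dd):
    """Invert Eq. (3): V_A - V_B needed for a target pre-emphasis gain."""
    linear_gain = 10.0 ** (gain_db / 20.0)
    return (linear_gain - 1.0) * v_dd / (2.0 * alpha)


V_DD = 1.2           # supply voltage in volts
ALPHA = 1.0          # value used for the implementation
MAX_DIFFERENCE = 1.2  # V_A = 1.2 V and V_B = 0 V, edge of the tuning range

for target_db in (4.44, 9.54, 10.42):
    needed = required_va_minus_vb(target_db, ALPHA, V_DD)
    reachable = needed <= MAX_DIFFERENCE
    print(f"{target_db:5.2f} dB -> V_A - V_B = {needed:.3f} V, reachable: {reachable}")
```

Under this simplification, a 10.42-dB target would need a voltage difference larger than the supply, which is consistent with the observation that the Channel 2 loss exceeds the maximum available gain.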

Fig. 12 shows the measured return loss for different values of  $V_{\rm A}$ - $V_{\rm B}$ . As shown in the figure, the driver can satisfy the return loss mask of CEI-56G-VSR [9] regardless of the change in the pre-emphasis gain. The performance of this

TABLE I

Performance Comparison With PAM-4 Transmitters

|                                | [10]   | [11]  | This Work (with FFE) | This Work (without FFE) |
|--------------------------------|--------|-------|----------------------|-------------------------|
| Data rate (Gb/s)               | 25     | 40    | 40                   | 40                      |
| Tx FFE                         | 2 taps | No    | 2 taps               | No                      |
| Active area (mm<sup>2</sup>)   | 0.083  | 0.028 | 0.006                | 0.006                   |
| Technology                     | 65 nm  | 14 nm | 28 nm                | 28 nm                   |
| Eye opening (mV)               | 38     | 61    | 51                   | 51                      |
| FOM (pJ/bit)                   | 2      | 4.2   | 1.68                 | 1.2                     |

brief is compared with previously reported 2-tap PAM-4 transmitters in Table I. The proposed PAM-4 transmitter is much smaller than those reported in [10] and [11]. This is because a driver based on a capacitor DAC was used in [10] and inductors were used in [11], both of which take up large chip areas. This brief achieves the best energy efficiency (the lowest energy per bit), although a fair comparison is not possible because different technologies were used for implementation.

## V. CONCLUSION

A 40-Gb/s PAM-4 transmitter with pre-emphasis based on the toggling serializer is demonstrated. It produces pre-emphasized PAM-4 signals with reduced power consumption and chip area. In addition, a new SST driver is proposed, which allows easy control of the pre-emphasis gain. A prototype PAM-4 transmitter realized in 28-nm CMOS technology confirms that the proposed transmitter operates properly.

## ACKNOWLEDGMENT

The authors are thankful to IDEC for EDA software support.

## REFERENCES

- [1] B. Song, K. Kim, J. Lee, J. Chung, Y. Choi, and J. Burm, "A 13.5-mW 10-Gb/s 4-PAM serial link transmitter in 0.13-µm CMOS technology," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 61, no. 9, pp. 646–650, Sep. 2014.
- [2] P.-J. Peng, J.-F. Li, L.-Y. Chen, and J. Lee, "6.1 A 56Gb/s PAM-4/NRZ transceiver in 40nm CMOS," *Dig. Tech. Papers IEEE Int. Solid-State Circuits Conf.*, vol. 60, 2017, pp. 110–111.
- [3] J. Lee, P.-C. Chiang, P.-J. Peng, L.-Y. Chen, and C.-C. Weng, "Design of 56 Gb/s NRZ and PAM4 SerDes transceivers in CMOS technologies," *IEEE J. Solid-State Circuits*, vol. 50, no. 9, pp. 2061–2073, Sep. 2015.
- [4] G.-S. Byun and M. M. Navidi, "A low-power 4-PAM transceiver using a dual-sampling technique for heterogeneous latency-sensitive network-on-chip," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 62, no. 6, pp. 613–617, Jun. 2015.
- [5] S.-G. Kim, T. Kim, D.-H. Kwon, and W.-Y. Choi, "A 5–8 Gb/s low-power transmitter with 2-tap pre-emphasis based on toggling serialization," in *Proc. IEEE Asian Solid-State Circuits Conf. (A-SSCC)*, 2017, pp. 249–252.
- [6] C. Menolfi *et al.*, "A 28Gb/s source-series terminated TX in 32nm CMOS SOI," *Dig. Tech. Papers IEEE Int. Solid-State Circuits Conf.*, vol. 55, 2012, pp. 334–335.
- [7] IEEE P802.3bs 400 Gb/s Ethernet Task Force. Accessed: Aug. 10, 2018. [Online]. Available: http://www.ieee802.org/3/bs/
- [8] S. Kim, "Low-power transmitter based on data transition information," Ph.D. dissertation, Grad. School, Yonsei Univ., Seoul, South Korea, Aug. 2017.
- [9] (2017). OIF CEI-56G Implementation Agreements. [Online]. Available: https://www.oiforum.com/wp-content/uploads/2019/01/OIF-CEI-04.0.pdf
- [10] B. Hu, Y. Du, R. Huang, J. Lee, Y. K. Chen, and M. C. F. Chang, "A capacitor-DAC-based technique for pre-emphasis-enabled multilevel transmitters," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 64, no. 9, pp. 1012–1016, Sep. 2017.
- [11] J. Kim *et al.*, "3.5 A 16-to-40Gb/s quarter-rate NRZ/PAM4 dual-mode transmitter in 14nm CMOS," *Dig. Tech. Papers IEEE Int. Solid-State Circuits Conf.*, vol. 58, 2015, pp. 60–61.