# A Wide Tracking Range 0.2-4Gbps Clock and Data Recovery Circuit

Pavan Kumar Hanumolu, Gu-Yeon Wei<sup>1</sup>, and Un-Ku Moon

School of the EECS, Oregon State University, OR 97331

<sup>1</sup>Division of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138

## **Abstract**

A hybrid analog and digital quarter-rate clock and data recovery circuit employs a second-order digital loop filter with delta-sigma truncation to achieve sub-ps phase resolution and better than 2ppm frequency resolution. A test chip fabricated in a $0.18\mu m$ CMOS process achieves BER $<10^{-12}$ and consumes 14mW while operating at 2Gbps. The tracking range is greater than $\pm 5000$ppm and $\pm 2500$ppm at 10kHz and 20kHz modulation frequencies, respectively, making this CDR suitable for systems with spread spectrum clocking.

## Introduction

The clock and data recovery (CDR) circuit is one of the most important building blocks in any serial communication system. A dual-loop structure consisting of a cascade of frequency and phase acquisition loops [1] is a widely used CDR architecture since it allows independent optimization of bandwidths in the two loops. However, this architecture relies on accurate digital-to-phase conversion to achieve low jitter, and switching between clock phases can introduce large phase jumps. These phase jumps increase the peak-to-peak jitter of the recovered clock. Additionally, many phase interpolators are required for multi-phase clock recovery, making this architecture expensive in terms of both power and area. Averaging phase interpolation by fixed feedback phase selection [2] suppresses large phase jumps and facilitates multi-phase clock recovery without any power overhead. However, this architecture suffers from two major drawbacks. First, it imposes conflicting bandwidth requirements for jitter optimization. The phase-locked loop (PLL) requires high bandwidth to suppress the oscillator noise, while the CDR loop requires low PLL bandwidth to attenuate the jitter due to phase selection. Second, the simple first-order CDR loop used in [2] has limited frequency tracking range and limited input jitter filtering. In this paper, we present a multi-phase CDR architecture that alleviates the bandwidth tradeoff and extends the frequency tracking range to several thousand parts per million (ppm), making it suitable even for applications employing spread spectrum clocking.

## **Proposed Architecture**

A simplified block diagram of the proposed second-order quarter-rate CDR architecture is shown in Fig. 1. It comprises a ring-oscillator-based PLL that generates eight equally spaced phases. These eight phases clock eight samplers to generate four data samples and four edge samples from the incoming data. The sampler outputs are further de-multiplexed by four (the de-multiplex-by-4 blocks in Fig. 1) to ease the speed requirements of the digital circuitry. The resulting thirty-two samples are first used to calculate multiple early-late decisions, which are then reduced to a 3-level (early/late/no transition) bang-bang phase detector (!!PD) output by a simple majority vote. The phase error is filtered by a second-order digital loop filter (DLF), which is implemented as the sum of proportional and integral paths. The resulting 14-bit DLF output is quantized to 3 levels $(\pm 1, 0)$ by a second-order delta-sigma modulator (DSM). The DSM drives the phase rotator $(\Phi_{ROT})$ to achieve the unlimited phase shifting required for plesiochronous clocking. The phase rotator can be thought of as a simple 8-bit circular shift register that is loaded with a single one and seven zeros at start-up. The 3-level DSM output shifts the lone one left (+1), shifts it right (-1), or holds the current state (0). The one active phase rotator output then selects one of the eight phases of the VCO and feeds it to the divider. Glitch-free phase switching is achieved with a combination of digital control logic and a retimed analog multiplexer. Because the PLL loop filters the high-frequency DSM noise, the 14-bit information in the DLF output is preserved despite using only 3-level phase selection.
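The digital signal path described above can be sketched behaviorally in Python. This is an illustrative model only, under several assumptions that are not taken from the test chip: the Alexander-style early/late logic, the sign conventions, the mapping of a "left" shift to an increasing phase index, the signed 14-bit clamp on the integrator, and all function and class names are hypothetical. The default gains reuse the example values $K_P = 128$ and $K_I = 1$ quoted in the text.

```python
def early_late(d_prev, edge, d_next):
    """One early/late decision from a data/edge/data sample triplet.

    Assumed Alexander-style logic: +1 = clock early, -1 = clock late,
    0 = no data transition.
    """
    if d_prev == d_next:
        return 0
    return 1 if edge == d_prev else -1


def majority_vote(decisions):
    """Reduce many early/late decisions to a single 3-level output."""
    total = sum(decisions)
    return (total > 0) - (total < 0)


class PIFilter:
    """Digital loop filter: sum of proportional and integral paths."""

    def __init__(self, kp=128, ki=1, bits=14):
        self.kp, self.ki = kp, ki
        # Assumed signed saturation range for the frequency integrator.
        self.lo, self.hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
        self.acc = 0

    def step(self, pd_out):
        self.acc = max(self.lo, min(self.hi, self.acc + self.ki * pd_out))
        return self.kp * pd_out + self.acc


class PhaseRotator:
    """One-hot circular shift register selecting one of n VCO phases."""

    def __init__(self, n_phases=8):
        self.n = n_phases
        self.index = 0  # position of the lone one at start-up

    def step(self, cmd):
        """cmd is the 3-level DSM output: +1, -1, or 0 (hold)."""
        self.index = (self.index + cmd) % self.n
        return self.index

    def one_hot(self):
        return [1 if i == self.index else 0 for i in range(self.n)]
```

Because the rotator index wraps modulo eight, a sustained stream of +1 or -1 commands produces an unbounded phase ramp, which is the property needed for plesiochronous operation.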

Since the DSM shapes the quantization error to high frequencies and thereby reduces the noise within the bandwidth of the PLL, higher PLL bandwidths can be used to suppress oscillator noise. This can be better understood by considering the properties of the quantization error in the frequency domain. In the case of fixed phase selection [2], the quantization error can be approximated as having a white spectrum, and hence a very low loop bandwidth is required to minimize excess jitter due to quantization error leakage. However, in the case of phase selection with the DSM, the quantization error is *shaped* to high frequency, which enables the PLL to have a relatively higher bandwidth. For example, with $K_P = 128$ and $K_I = 1$, and the PLL operating at 1GHz with a bandwidth of 4MHz, simulations show that a phase resolution of 0.1ps and a frequency resolution of 2ppm can be achieved. This fine frequency resolution also improves the CDR's immunity to long strings of consecutive identical digits. A 14-bit frequency integrator is used, which allows more than $\pm 5000$ppm frequency tracking range. The ability to track large frequency deviations with fine resolution makes this architecture suitable for applications with spread-spectrum clocks.
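A small Python sketch can illustrate how a 3-level second-order DSM preserves a fine-resolution input in its average. It assumes an error-feedback topology with noise transfer function $(1 - z^{-1})^2$ and a normalized fractional input; the modulator topology used on the chip is not specified in the text, and the function names here are hypothetical.

```python
def quantize3(v):
    """3-level quantizer with decision thresholds at +/-0.5."""
    if v >= 0.5:
        return 1
    if v <= -0.5:
        return -1
    return 0


def dsm2(u_seq):
    """Second-order error-feedback delta-sigma modulator (assumed topology).

    The quantizer input is v[n] = u[n] + 2*e[n-1] - e[n-2] with
    e[n] = v[n] - y[n], so the quantization error appears at the output
    shaped by (1 - z^-1)^2.
    """
    e1 = e2 = 0.0
    out = []
    for u in u_seq:
        v = u + 2.0 * e1 - e2
        y = quantize3(v)
        e2, e1 = e1, v - y
        out.append(y)
    return out


if __name__ == "__main__":
    y = dsm2([0.25] * 800)
    print(y[:8])            # first eight 3-level outputs
    print(sum(y) / len(y))  # long-run average of the output stream
```

For a constant input of 0.25, the output settles into a repeating pattern of +1, -1, and 0 values whose average equals the input. A low-pass filter such as the PLL loop can therefore recover the fine-resolution value from the coarse 3-level stream.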

## **Circuit Design**

The block diagram of the split-tuned PLL is shown in Fig. 2. Split tuning offers the benefits of wide operating range and low VCO gain. The PLL consists of a phase frequency detector (PFD) whose output is level shifted to minimize glitching on the differential charge-pump output caused by the rail-to-rail UP/DN signals. A separate frequency-tracking loop (referred to as the coarse loop hereafter) integrates the voltage across the loop filter capacitor $C_1$ and drives the VCO toward frequency lock [3]. The integrator is implemented as a first-order $G_m$-C filter. Additionally, the coarse loop biases the charge-pump output to a fixed reference voltage $(V_{REF})$. This minimizes the folding of high-frequency noise due to PFD/CP nonlinearity. Large variation in both the coarse and fine gains of the VCO is reduced by using the resistor-based voltage-to-current (V2I) converters shown in Fig. 3. The coarse loop V2I shown in Fig. 3(a) enables a wide tuning range of the coarse control input. On the other hand, the fine loop V2I shown in Fig. 3(b) has wide bandwidth so as not to influence the PLL loop dynamics. The wide bandwidth is achieved at the expense of a narrower linear range. The simulated coarse and fine VCO gains are 1200MHz/V and 200MHz/V, respectively, and the gain variation is less than $\pm 10\%$. It can be shown that the sensitivity of the PLL loop dynamics to variation in the loop filter resistor $R_z$ is suppressed by matching $R_z$ with the V2I resistors $R_C$ and $R_F$. The VCO is composed of a 4-stage ring oscillator consisting of split-tuned current-starved differential inverters. A buffer circuit that maintains 50% duty cycle is used to drive the samplers.
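One way to see why resistor matching helps is the following sketch for a textbook charge-pump PLL, considering the fine path only. It is not the authors' derivation. The symbols $I_{CP}$ (charge-pump current), $N$ (feedback divide ratio), $K_F$ (fine VCO gain in Hz/V), and $K_{CCO}$ (gain of the current-controlled oscillator core in Hz/A) are assumptions introduced here, as is the model that the fine V2I converts voltage to current through $R_F$.

```latex
% Assumed textbook model, fine path only; not taken from the paper.
\begin{align}
  L(s) &= \frac{I_{CP}}{2\pi}\left(R_z + \frac{1}{sC_1}\right)
          \frac{2\pi K_F}{s}\cdot\frac{1}{N} \\
  \omega_c &\approx \frac{I_{CP}\,R_z\,K_F}{N}
          \qquad \text{for } \omega_c \gg \frac{1}{R_z C_1} \\
  K_F = \frac{K_{CCO}}{R_F}
       \;&\Rightarrow\;
  \omega_c \approx \frac{I_{CP}\,K_{CCO}}{N}\cdot\frac{R_z}{R_F}
\end{align}
```

Under these assumptions the crossover frequency depends on the ratio $R_z/R_F$, which stays constant when the two resistors track each other over process and temperature. The role of $R_C$ in the coarse loop is not covered by this sketch.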

## **Experimental Results**

A test chip fabricated in a $0.18\mu m$ CMOS process operates from a nominal 1.4V power supply. The PLL clock jitter at 500MHz (2Gbps) is 6.5ps rms (Fig. 4). The recovered quarter-rate data and the recovered clock for 2Gbps operation with $2^7-1$ PRBS data are shown in Fig. 5. The measured bit error rate (BER) is less than $10^{-12}$. The measured tracking range of the CDR with BER less than $10^{-12}$ is greater than $\pm 5000$ppm and $\pm 2500$ppm when the reference clock is modulated with 10kHz and 20kHz triangular signals, respectively. The CDR, operating at 2Gbps, dissipates 14mW. The maximum data rate was limited by the test equipment used to generate the random data sequence. CDR operation at 4Gbps was verified with a simple alternating data sequence. The die photo and the performance summary are shown in Fig. 6.
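As a rough cross-check on the two tracking-range figures, the peak frequency slew rate of a triangular modulation can be computed from its peak deviation and modulation frequency. The sketch below assumes a symmetric triangular profile that sweeps from the negative peak to the positive peak in half a modulation period; the function name is hypothetical, and reading the result as a slew-rate limit is an interpretation rather than a claim made in the measurements.

```python
def triangular_slew_ppm_per_us(peak_ppm, f_mod_hz):
    """Frequency slew rate of a symmetric triangular modulation.

    The deviation sweeps 2*peak_ppm in half a period, so the slope is
    4 * peak_ppm * f_mod_hz in ppm/s, converted here to ppm per microsecond.
    """
    return 4.0 * peak_ppm * f_mod_hz / 1e6


if __name__ == "__main__":
    print(triangular_slew_ppm_per_us(5000, 10e3))  # 200.0
    print(triangular_slew_ppm_per_us(2500, 20e3))  # 200.0
```

Both measured operating points correspond to the same slew rate of 200ppm/µs, which would be consistent with the tracking range being set by how quickly the loop can follow a frequency ramp.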

## Acknowledgements

The authors would like to thank National Semiconductor for providing IC fabrication. We would also like to thank Dr. Ceballos, Dr. Kook, Dr. Ahn, M. Kim, and all the members of the signalling team at Intel research labs for useful discussions and critical feedback. This work was supported by Intel Corporation.

## References

- [1] S. Sidiropoulos and M. Horowitz, "A semidigital dual delay-locked loop," *IEEE J. Solid-State Circuits*, vol. 32, no. 11, pp. 1683-1692, Nov. 1997.
- [2] P. Larsson, "A 2-1600-MHz CMOS clock recovery PLL with low-Vdd capability," *IEEE J. Solid-State Circuits*, vol. 34, no. 12, pp. 1951-1960, Dec. 1999.
- [3] S. Williams, H. Thompson, M. Hufford, and E. Naviasky, "An improved CMOS ring oscillator PLL with less than 4ps RMS accumulated jitter," *Proc. of IEEE CICC*, pp. 151-154, Oct. 2004.



Figure 1: Mixed analog-digital multi-phase CDR.



Figure 2: Split-tuned phase locked loop.



Figure 3: (a) Coarse loop V2I. (b) Fine loop V2I.



Figure 4: PLL clock jitter at 500MHz.



Figure 5: (a) Recovered data. (b) Recovered clock.



Figure 6: Die photo and performance summary table.