# A 3.2Gb/s Oversampling CDR with Improved Jitter Tolerance

Merrick Brownlee<sup>1</sup>, Pavan Kumar Hanumolu, and Un-Ku Moon

School of EECS, Oregon State University, Corvallis, OR 97331

<sup>1</sup>Now with Mindspeed Technologies

Abstract—A 3.2Gbps CDR circuit employs an oversampling architecture to decouple the tradeoff between jitter generation and jitter tolerance. The test chip fabricated in a  $0.13\mu m$  CMOS process achieves a 30x increase in the jitter tolerance corner without increasing recovered clock jitter. Power consumption is 19.5mW from a 1.4V supply at 3.2Gbps and die area is  $0.081mm^2$ .

## I. INTRODUCTION

In order for the exponential advances in digital IC processing technology to benefit overall digital system performance, chip-to-chip communication bandwidth must keep up with on-chip bandwidth. Off-chip bandwidth has scaled more slowly than on-chip bandwidth, however, so the aggregate bandwidth must be achieved by 1) employing link architectures which push speed limitations and 2) increasing the number of links per chip. Both of these trends present challenges in the context of the critical clock and data recovery (CDR) circuit. First, speed-critical CDRs typically use bang-bang phase detectors (!!PDs), which are limited by a tradeoff between jitter tolerance (JTOL) and jitter generation (JGEN) performance, as explained below. Second, integrating many links magnifies the importance of CDR power consumption and die area. This paper describes a CDR circuit that decouples the JTOL/JGEN tradeoff through a novel 2x oversampling architecture while consuming minimal die area and power.

Section II reviews background material that is central to the description of the proposed architecture in Section III. Section IV covers circuit design and Section V shows results of measurements taken on the prototype chip. Finally, conclusions are drawn in Section VI.

## II. BACKGROUND

In contrast to a linear phase detector, which produces an output proportional to the detected phase error, a !!PD gives only the sign of the phase error and can be implemented with a flip-flop sampling the incoming data stream by the recovered clock. Therefore, the nonlinear !!PD can operate as fast as a flip-flop is able in the given process, whereas a linear PD is limited by the ability to produce and process narrow pulses. Fig. 1(a) shows a bang-bang CDR phase-domain model with a typical proportional plus integral (PI) filter. The main drawback to CDRs employing !!PDs is that the nonlinear nature of the PD causes the CDR to dither when in lock. This dithering is the major component of the jitter generation (JGEN) of the CDR. It is easily shown that dithering-related JGEN is dictated by the proportional path frequency step  $(\Delta f_P = K_P K_{VCO})$  [1].

Fig. 1. Typical bang-bang phase detector based CDR (a) phase-domain model and (b) JTOL behavior.
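The dependence of dithering on the proportional step can be illustrated with a small behavioral model. The sketch below is not the authors' model: it assumes a first-order loop (no integral path), a jitter-free input, and a hypothetical loop latency of one update. The parameter `step_ui` stands for the phase change per update caused by  $\Delta f_P$ , and its numerical values are chosen only for illustration.

```python
def simulate_bang_bang(step_ui, n_updates=2000, latency=1, phase0=0.303):
    """Behavioral first-order bang-bang loop locked to a jitter-free input.

    step_ui : phase change per update caused by the proportional frequency
              step, in UI (assumed value, not taken from the paper).
    latency : loop delay in updates (assumed); a decision acts this many
              updates after it was taken.
    phase0  : initial recovered-clock phase error in UI (assumed).
    Returns the recovered-clock phase after every update, in UI.
    """
    phase = phase0
    pending = [0] * latency          # decisions still travelling through the loop
    trace = []
    for _ in range(n_updates):
        # The !!PD reports only the sign of the phase error, never its size.
        decision = 1 if phase < 0.0 else -1
        pending.append(decision)
        phase += step_ui * pending.pop(0)
        trace.append(phase)
    return trace


def dither_pp(trace, settle=1000):
    """Peak-to-peak phase excursion after the loop has locked."""
    locked = trace[settle:]
    return max(locked) - min(locked)


if __name__ == "__main__":
    for step in (0.01, 0.02):
        print(f"step = {step} UI -> dither p-p = {dither_pp(simulate_bang_bang(step)):.3f} UI")
```

In this simplified model the peak-to-peak dither scales linearly with the step, which is the reason a small  $\Delta f_P$  is desirable from a JGEN standpoint.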

Another important metric of CDR performance is jitter tolerance (JTOL), which is the amount of input jitter the CDR is able to tolerate without causing a bit error. Fig. 1(b) shows the JTOL plot for the architecture of Fig. 1(a). There are three regions in the plot. At high frequencies, the loop is unable to track the jitter, so jitter of magnitude larger than the sampling margin (ideally 0.5UI) will cause a bit error regardless of jitter frequency. When the input jitter is large enough to cause bit errors, the nonlinear bang-bang CDR must slew in an attempt to track the jitter [2]. Noting this, we see that at medium frequencies, the integral path output changes negligibly and the output phase slews linearly at a rate proportional to  $\Delta f_P$ . The corner between the medium frequency and high frequency regions ( $f_1$ ) is proportional to the medium frequency slew rate and, therefore, to  $\Delta f_P$ . From this we see that since JTOL and JGEN are both proportional to  $\Delta f_P$ , the designer is conflicted in the choice of this parameter.

Fig. 2. Proposed bang-bang CDR (a) phase-domain model and (b) jitter tracking.

The low frequency region of the JTOL plot corresponds to frequencies where the integral path is able to respond to the input jitter. At these frequencies, the integral path attempts to force the phase error to zero, causing the slewing to be quadratic as shown in Fig. 1(b). It can be shown that this quadratic slewing leads to a -40dB/dec slope in the low frequency region [2]. The corner between the low frequency and medium frequency regions  $(f_2)$ , however, is limited by stability constraints to very low frequencies. Clearly,  $f_1$  is the critical JTOL metric since increasing  $f_1$  shifts the entire curve up, thereby improving both high frequency and low frequency JTOL.
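The slew-limited behavior behind the high and medium frequency regions can be reproduced numerically. The following sketch is an assumption-laden illustration rather than the analysis of [2]: it uses a first-order loop with no latency, updates once per bit, declares a bit error when the phase error exceeds an ideal 0.5UI margin, and finds the largest tolerable sinusoidal jitter amplitude by bisection. The jitter frequency `f_norm` is normalized to the update rate, and all numerical values are hypothetical.

```python
import math


def survives(amp_ui, f_norm, step_ui, margin_ui=0.5):
    """True if sinusoidal jitter of amplitude amp_ui (UI) never pushes the
    phase error beyond margin_ui in a first-order bang-bang loop."""
    n_bits = int(max(4.0 / f_norm, 2000))     # at least four jitter periods
    phase_out = 0.0
    for n in range(n_bits):
        phase_in = amp_ui * math.sin(2.0 * math.pi * f_norm * n)
        err = phase_in - phase_out
        if abs(err) > margin_ui:
            return False                      # sampling margin exhausted
        # The output can only slew by one fixed step per update.
        phase_out += step_ui if err >= 0.0 else -step_ui
    return True


def jitter_tolerance(f_norm, step_ui, hi=64.0, iters=30):
    """Largest tolerated jitter amplitude in UI, found by bisection."""
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if survives(mid, f_norm, step_ui):
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    for step in (0.01, 0.02):
        for f_norm in (1e-3, 1e-2, 1e-1):
            tol = jitter_tolerance(f_norm, step)
            print(f"step = {step} UI, f = {f_norm:g} -> JTOL = {tol:.2f} UI")
```

At high jitter frequencies the tolerance stays close to the sampling margin regardless of the step, while in the slewing region a larger step tolerates proportionally more jitter, which is equivalent to moving  $f_1$  to a higher frequency.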

## III. PROPOSED ARCHITECTURE

In this work, the JGEN/JTOL tradeoff is decoupled through the novel architecture shown in Fig. 2(a). Whereas a typical !!PD gives only early/late information by sampling at the edge of the data eye, the proposed PD takes extra samples an equal distance before and after the edge aligned clock. These samples give "very early/late" information which indicates that the CDR loop is failing to track the input jitter. When the PD detects very early/late, a secondary high gain VCO input is activated to increase the slew rate accordingly. The resulting jitter tracking behavior is shown in Fig. 2(b). When tracking large jitter, the effective slew rate is dictated by the high gain frequency step ( $\Delta f_{HIGH} = K_{HIGH}K_V$ ) and the JTOL corner frequency is increased. When there is no large jitter to be tracked, however, the high gain path is never activated, and the dithering jitter is dictated by the low gain frequency step  $(\Delta f_{LOW} = K_{LOW} K_V).$ 
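The decoupling described above can be sketched with a behavioral model in which the step size depends on whether the phase error exceeds the spacing of the extra samples. This is only an illustration under stated assumptions: the loop is first order with no latency, the high gain step is added on top of the low gain step, and the values of `step_low`, `step_high`, and `delta_ui` are hypothetical rather than taken from the test chip. The phase error is not wrapped, so errors above 0.5UI simply indicate that tracking has failed.

```python
import math


def run_loop(amp_ui, f_norm, step_low, step_high=None, delta_ui=0.125, n_bits=4000):
    """Track sinusoidal jitter with an optional very early/late high gain path.

    step_low  : phase step per update of the always-active path, in UI (assumed).
    step_high : extra step applied only when |error| > delta_ui (assumed);
                None disables the high gain path.
    delta_ui  : spacing of the extra samples from the edge clock, in UI (assumed).
    Returns (peak |phase error|, output p-p over the last 1000 updates), in UI.
    """
    phase_out = 0.0
    peak_err = 0.0
    outputs = []
    for n in range(n_bits):
        phase_in = amp_ui * math.sin(2.0 * math.pi * f_norm * n)
        err = phase_in - phase_out
        peak_err = max(peak_err, abs(err))
        direction = 1.0 if err >= 0.0 else -1.0                 # early/late
        very = step_high is not None and abs(err) > delta_ui    # very early/late
        phase_out += direction * (step_low + (step_high if very else 0.0))
        outputs.append(phase_out)
    tail = outputs[-1000:]
    return peak_err, max(tail) - min(tail)


if __name__ == "__main__":
    # Large jitter: compare tracking error with and without the high gain path.
    print(run_loop(3.0, 1e-3, 0.005))
    print(run_loop(3.0, 1e-3, 0.005, step_high=0.05))
    # No jitter: compare dithering with and without the high gain path.
    print(run_loop(0.0, 1e-3, 0.005)[1], run_loop(0.0, 1e-3, 0.005, step_high=0.05)[1])
```

With these assumed values the single-gain loop loses track of the large jitter, whereas the dual-gain loop keeps the error near `delta_ui`. With no input jitter the high gain path never fires, so the dithering is identical in both cases. Note that in this simple model `step_high` must stay below roughly twice `delta_ui`, otherwise the loop would bounce between very early and very late.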

In Section II we showed that the low frequency JTOL improvement afforded by the integral path is marginal. A more important function of the integral path is frequency pull-in range improvement [3]. Therefore, if an alternate means of setting the frequency within the limited pull-in range of the first order CDR is available in conjunction with the improved low frequency JTOL afforded by the oversampling architecture, the integral path can be avoided. Eliminating the integral path saves the die area required for the integrating capacitor and assures no peaking in the jitter transfer function. In this work, manual tuning was used to bring the VCO within the pull-in range of the CDR in lieu of a frequency acquisition loop. In cases where frequency drift or the need to track a spread spectrum clock mandates an integral path, one can easily be added to the proposed architecture in either analog or digital form.

Fig. 3 shows the complete CDR. The oversampling !!PD operates at quarter rate in order to achieve higher speeds. Therefore, the PD generates 4 early/late and 4 very early/late signals which are combined by a digital majority vote circuit (MV) before driving the VCO. The multiphase VCO generates the eight phases (4 data, 4 edge) needed for the quarter rate PD operation. The extra edges needed for oversampling are generated within the PD by tunable delay elements.
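One plausible form of the per-phase decision logic and the majority vote is sketched below. The actual gate-level logic of Fig. 4 is not reproduced here, so everything in this example is an assumption: the function names, the sign convention (+1 for "up", -1 for "down", 0 for no information), the Alexander-style comparison of edge samples against the neighboring data samples, and the treatment of ties as "hold".

```python
UP, HOLD, DOWN = 1, 0, -1   # assumed sign convention for the 3-level outputs


def decode_phase(d_prev, s_before, s_edge, s_after, d_next):
    """Decode one data transition into (early/late, very early/late).

    Samples are listed in time order: previous data sample, extra sample
    before the edge clock, edge sample, extra sample after the edge clock,
    next data sample. Each is 0 or 1.
    """
    if d_prev == d_next:
        return HOLD, HOLD                     # no transition, no timing information
    # Edge sample still equals the old bit: the clock edge came before the
    # data transition (clock early), so the clock should slow down.
    early_late = DOWN if s_edge == d_prev else UP
    if s_after == d_prev:
        very = DOWN                           # even the late extra sample saw the old bit
    elif s_before == d_next:
        very = UP                             # even the early extra sample saw the new bit
    else:
        very = HOLD                           # transition lies between the extra samples
    return early_late, very


def majority_vote(votes):
    """Combine the four quarter-rate decisions into one 3-level output."""
    total = sum(votes)
    return (total > 0) - (total < 0)          # ties resolve to HOLD (assumed)


if __name__ == "__main__":
    # Four hypothetical sample groups, one per quarter-rate phase.
    groups = [(0, 0, 0, 0, 1), (1, 1, 1, 0, 0), (0, 0, 0, 1, 1), (1, 1, 1, 1, 1)]
    decisions = [decode_phase(*g) for g in groups]
    print(majority_vote(d[0] for d in decisions), majority_vote(d[1] for d in decisions))
```

A three-level vote of this kind maps naturally onto 3-level DACs driving the VCO, with one voter per gain path.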

## IV. CIRCUIT DESIGN

One of the advantages of the proposed architecture is that it simplifies circuit design. The only analog blocks needed are those that perform the fundamental functions of sampling (sense amps within the phase detector) and phase adjustment (VCO). All processing of the signals between the samplers and VCO is done in the digital domain.

The quarter-rate oversampling !!PD is shown in Fig. 4. Tunable delay cells take the even phases from the VCO and generate the very early, edge, and very late sampling clocks. The odd phases go through a single identical delay to generate a data sampling clock that is half a UI from the edge sample as desired. The samples are synchronized before simple logic converts them to up/down signals.
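The timing relationships implied by the quarter-rate operation can be worked out numerically. In the sketch below, the unit interval, VCO period, and phase spacing follow directly from the 3.2Gbps data rate and the eight VCO phases. The assignment of zero, one, and two delays to the very early, edge, and very late clocks is our reading of the description above, and the 40ps delay value is purely hypothetical.

```python
BIT_RATE = 3.2e9                           # b/s, from the paper
UI_PS = 1e12 / BIT_RATE                    # unit interval in ps
VCO_PERIOD_PS = 4 * UI_PS                  # quarter-rate clock period in ps
N_PHASES = 8                               # 4 data + 4 edge phases
PHASE_STEP_PS = VCO_PERIOD_PS / N_PHASES   # spacing between adjacent VCO phases


def sampling_instants(tau_ps):
    """Sampling instants in ps within one VCO period for delay tau_ps.

    Assumes even phases feed the edge samplers through 0, 1, and 2 delay
    cells, and odd phases feed the data samplers through 1 delay cell.
    """
    rows = []
    for k in range(N_PHASES):
        t = k * PHASE_STEP_PS
        if k % 2 == 0:
            rows.append({"very_early": t, "edge": t + tau_ps, "very_late": t + 2 * tau_ps})
        else:
            rows.append({"data": t + tau_ps})
    return rows


if __name__ == "__main__":
    print(f"UI = {UI_PS} ps, VCO frequency = {1e6 / VCO_PERIOD_PS} MHz, "
          f"phase spacing = {PHASE_STEP_PS} ps")
    for row in sampling_instants(40.0):    # 40 ps is a hypothetical delay setting
        print(row)
```

Because the data clock passes through the same single delay as the edge clock, the data sample stays half a UI from the edge sample for any delay setting, while the delay itself sets the spacing of the very early and very late samples.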

The delay range of the tunable delay elements must be large to accommodate the input data rate range, and the minimum delay must be very small to accommodate the maximum data rate. Most delay tuning methods restrict the minimum delay since the tuning element loads the delay element. To overcome this, supply-tuned delay cells were used. The supply voltage was controlled differentially so that the clock swing is centered at midrail independent of the tuning voltage.



Fig. 3. Proposed oversampling CDR schematic.



Fig. 4. Quarter-rate oversampling bang-bang phase detector implementation.

The sampling flip-flops are sense amplifier based for good sensitivity and small aperture time. Two sense amps are cascaded to reduce metastability.

Two 3-level current DACs as in [4] are used to convert and sum the outputs of the two majority voters to generate a suitable VCO control voltage. The delay cell used in the four-stage VCO is a full-swing, fast-switching latch [5] chosen for its low intrinsic jitter [6].

## V. EXPERIMENTAL RESULTS

The CDR was fabricated in a  $0.13\mu m$  CMOS technology and occupies only  $0.081mm^2$ . Fig. 5 shows the measured JTOL at 3.2Gbps with and without the high gain path enabled. The JTOL corner ( $f_1$ ) is improved by 30x with the proposed architecture. As expected, the recovered clock jitter is unchanged when the high gain path is enabled (84.4ps p-p with vs. 85.6ps p-p without). The recovered quarter-rate data eye and the recovered clock jitter histogram are shown in Fig. 6. For both measurements a 3.2Gbps  $2^7 - 1$  PRBS input sequence is applied. The chip micrograph is shown in Fig. 7 and the performance of the CDR is summarized in Table I.

Fig. 5. Measured jitter tolerance at 3.2Gb/s.

TABLE I Performance Summary

| Parameter                                  | Value                                                 |
|--------------------------------------------|-------------------------------------------------------|
| Technology                                 | $0.13 \mu m$ CMOS                                     |
| Supply Voltage                             | 1.4V                                                  |
| Power Consumption @ 3.2Gbps                | 19.5mW                                                |
| Max Bit Rate (BER < $10^{-12}$ )           | 3.6Gbps                                               |
| JTOL Corner $(f_1)$                        | 1MHz w/o high gain<br>30MHz w/ high gain              |
| Recovered Clock Jitter @ 3.2Gbps (rms/p-p) | 13.2/85.6ps w/o high gain<br>13.0/84.4ps w/ high gain |
| Active Die Area                            | $0.081mm^{2}$                                         |
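A few derived figures follow directly from the measured values in Table I. The short calculation below only restates those numbers in other units; the choice of derived metrics (energy per bit, jitter as a fraction of the 3.2Gbps unit interval) is ours and does not appear in the measurements themselves.

```python
# Measured values taken from Table I.
bit_rate = 3.2e9                 # b/s
power_w = 19.5e-3                # W at 3.2Gbps
f1_without_hz = 1e6              # JTOL corner without the high gain path
f1_with_hz = 30e6                # JTOL corner with the high gain path
jitter_pp_without_ps = 85.6      # recovered clock jitter, p-p, high gain disabled
jitter_pp_with_ps = 84.4         # recovered clock jitter, p-p, high gain enabled

# Derived quantities.
ui_ps = 1e12 / bit_rate                              # unit interval in ps
energy_pj_per_bit = power_w / bit_rate * 1e12        # energy per received bit in pJ
corner_improvement = f1_with_hz / f1_without_hz      # JTOL corner ratio
jitter_pp_without_ui = jitter_pp_without_ps / ui_ps  # p-p jitter in UI
jitter_pp_with_ui = jitter_pp_with_ps / ui_ps

if __name__ == "__main__":
    print(f"UI = {ui_ps:.1f} ps")
    print(f"Energy per bit = {energy_pj_per_bit:.2f} pJ/bit")
    print(f"JTOL corner improvement = {corner_improvement:.0f}x")
    print(f"Clock jitter p-p = {jitter_pp_without_ui:.3f} UI without, "
          f"{jitter_pp_with_ui:.3f} UI with high gain")
```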

## VI. CONCLUSION

A novel oversampling CDR architecture that decouples the tradeoff between jitter generation and jitter tolerance has been presented. A prototype implemented in  $0.13\mu m$  CMOS process operates at 3.2Gbps with a recovered clock jitter of 13.2ps rms and achieves a BER less than  $10^{-12}$ . Power consumption is 19.5mW from a 1.4V supply and die area consumption is  $0.081mm^2$ . The prototype increases the jitter tolerance corner by 30x without increasing recovered clock jitter.

Fig. 6. Recovered quarter rate (a) data eye (b) clock jitter.

Fig. 7. Chip micrograph.

## VII. ACKNOWLEDGEMENTS

The authors thank Samsung Electronics for providing IC fabrication, Dr. V. Kratyuk and Dr. T. Wu for useful discussions, and the signaling group at Intel Circuits Research Labs for access to test equipment. This work is supported by Intel Corporation.

## REFERENCES

- [1] Y. Greshishchev et al., "A fully integrated SiGe receiver IC for 10-Gb/s data rate," *IEEE J. Solid-State Circuits*, vol. 35, pp. 1949-1957, Dec. 2000.
- [2] J. Lee, K. Kundert, and B. Razavi, "Analysis and modeling of bang-bang clock and data recovery circuits," *IEEE J. Solid-State Circuits*, vol. 39, pp. 1571-1580, Sep. 2004.
- [3] R. Walker, "Designing bang-bang PLLs for clock and data recovery in serial data transmission systems," in *Phase-Locking in High-Performance Systems: From Devices to Architectures*, B. Razavi, Ed., Wiley-IEEE Press, 2003, pp. 34-45.
- [4] P. Hanumolu, M. Kim, G. Wei, and U. Moon, "A 1.6Gbps Digital Clock and Data Recovery Circuit," in *Proc. IEEE Custom Int. Circuits Conf.*, pp. 603-606, Jun. 2006.
- [5] J. Lee and B. Kim, "A low-noise fast-lock phase-locked loop with adaptive bandwidth control," *IEEE J. Solid-State Circuits*, vol. 35, pp. 1137-1145, Aug. 2000.
- [6] A. Hajimiri and T. Lee, "A General Theory of Phase Noise in Electrical Oscillators," *IEEE J. Solid-State Circuits*, vol. 33, pp. 179-194, Feb. 1998.