# Power Efficient Tree-Based Crosslinks for Skew Reduction

Inna Vaisband, Ran Ginosar, Avinoam Kolodny Dept. of Electrical Engineering Technion – Israel Institute of Technology Haifa 32000, Israel inna.vaisband@gmail.com [ran, kolodny]@ee.technion.ac.il

## ABSTRACT

Clock distribution networks are an important design issue that is highly dependent on delay variations and load imbalances, while requiring power efficiency. Existing mesh solutions significantly increase the dissipated power, whereas existing link based methods only address skew caused by variations and do not consider power consumption. The power dissipated by the inserted crosslinks within a buffered clock tree is investigated in this paper, and is shown to be a strong function of the resistance and capacitance of the crosslink. A crosslink may be power efficient despite the presence of short-circuit currents caused by multiple drivers in a non-tree clock network. The power characteristics of crosslink size and placement are also discussed, showing that the crosslink is best placed as close as possible to the target leaves of the tree. Crosslink insertion as both an alternative and complement to buffer sizing for low power skew reduction is also considered.

## **Categories and Subject Descriptors**

B.7.m [Integrated Circuits]: Miscellaneous

**General Terms:** Design

**Keywords:** Non-tree clock distribution network, clock tree, mesh, crosslink, skew, power

## 1. INTRODUCTION

Power is a primary concern in modern circuits. Clock distribution networks, in particular, are an essential part of a synchronous digital circuit and a significant power consumer. Clock distribution networks are subject to skew due to process, voltage, and temperature (PVT) variations and load imbalances.

Existing skew mitigation techniques include buffer insertion and sizing [1]-[3], wire sizing [2]-[4], and non-tree clock networks [5]-[12], which provide alternative paths for the clock signal to maintain balance. Non-tree topologies vary from a tree with a limited number of additional crosslinks [9]-[12] to a complete mesh structure [5]-[8], where a crosslink is a wire segment that connects two tree nodes and a mesh is a set of crosslinks that connects all or a significant group of adjacent nodes within a specific level of a clock tree (see Figure 1). Mesh structures are designed to balance each of the clock delays at the leaves or at some intermediate level of the tree [5]-[8]. These topologies, however, increase the total wire length, resulting in higher capacitance and, consequently, significantly increased dynamic power consumption. Thus, power is traded off for skew.

*GLSVLSI'09*, May 10–12, 2009, Boston, Massachusetts, USA. Copyright 2009 ACM 978-1-60558-522-2/09/05...\$5.00.

Eby G. Friedman Dept. of Electrical and Computer Engineering University of Rochester Rochester, New York 14627, USA friedman@ece.rochester.edu



Figure 1: Non-tree clock network topologies. (a) Crosslink insertion (b) Leaf-level and intermediate-level meshes.

An ideal crosslink X, modeled as a lumped RC wire (see Figure 2), shorts the two connected nodes, minimizing the skew between these nodes. In practice, crosslink X exhibits a non-zero resistance $R_X$ and capacitance $C_X$, so dynamic power is dissipated in charging the crosslink capacitance. Consider measuring skew between $Clk_{Out1}$ and $Clk_{Out2}$ in the following two cases:

- *Aligned inputs:* Clk<sub>In1</sub> and Clk<sub>In2</sub> toggle simultaneously.
- *Misaligned inputs:* Clk<sub>In1</sub> and Clk<sub>In2</sub> are skewed.

For aligned inputs, the skew between  $Clk_{Out1}$  and  $Clk_{Out2}$  may be caused by an unbalanced clock tree and loads and/or by PVT variations. In the case of misaligned inputs, an additional issue should be considered: short-circuit current may flow through the crosslink and the two buffers, as illustrated in Figure 2 by the dotted line, dissipating additional power.




Figure 2: Two clock tree segments connected with a crosslink. The dotted line illustrates the short-circuit current path for  $Clk_{In1} = '0'$  and  $Clk_{In2} = '1'$ .

Crosslink insertion has been proposed for reducing skew variations only between those pairs of nodes that target zero nominal skew [9]-[12]. However, power aspects have not been presented in these works; notably, inserting crosslinks, which may introduce short-circuit current, has not been discussed [10]. In reality, a crosslink may be effective despite the presence of short-circuit currents. A deeper understanding of the tradeoff between dynamic and short-circuit power and skew is the main objective of this paper and is essential for providing an effective and robust algorithm for reducing skew by inserting a crosslink.

In this paper, the power dissipated by inserting a crosslink in a buffered clock tree is analyzed, particularly due to the presence of short-circuit current. The power dissipated from inserting a crosslink at different locations within a tree is determined. Crosslink insertion is also compared with buffer sizing. Note that inserting a crosslink between two nodes may increase the skew between other pairs of nodes. It can be shown, however, that this increase in skew is bounded and may be reduced by inserting additional crosslinks. In the limit, a complete mesh minimizes the overall skew, albeit at the expense of higher power.

Similarly, the authors in [12] show that given two nodes in an unbuffered clock tree, u and w, with zero nominal skew and nonzero skew variation $q_{u,w}$, inserting a crosslink between u and w may increase the skew variation between two other nodes by $\frac{1}{2}q_{u,w}$. Despite the possibility of increasing the skew variation by inserting a crosslink, the authors in [12] propose an effective algorithm that significantly reduces the skew variation. Additional research [9], [10] has extended the algorithm in [12] to buffered clock trees, also exhibiting a reduction in skew variation.

The rest of the paper is organized as follows. The tradeoff between power and skew is presented in Sections 2 and 3 for the case of aligned and misaligned inputs, respectively. Crosslink placement and comparison with buffer sizing are discussed in Section 4. The paper is summarized in Section 5.

## 2. ALIGNED INPUT CLOCK TREE SEGMENTS

A simplified unbalanced non-tree clock network with aligned inputs is considered in this section. An ideal step input driving CMOS inverters is assumed in the analytic expressions. Under this assumption, a large portion of the circuit operation occurs within the linear region [14], permitting the drivers to be modeled as resistors  $R_{O,1}$  and  $R_{O,2}$ , yielding the simplified models shown in Figure 3.



Figure 3: Circuit model (a) with and (b) without a crosslink

The skew  $\Delta$  between the output of the two basic tree sections shown in Figure 3(b) is based on the propagation delay of a CMOS inverter driving an *RC* load, determined at  $V_{OUT} = \frac{1}{2}V_{DD}$ [14],

$$\Delta = \left| \ln 2 \cdot R_1 C_1 - \ln 2 \cdot R_2 C_2 \right| = \ln 2 \left| R_1 C_1 - R_2 C_2 \right| = 0.693 \left| R_1 C_1 - R_2 C_2 \right|. \tag{1}$$

The method described in [13] for computing skew within a clock tree with a crosslink between two nodes can be applied to the simplified clock tree model (CF is the crosslink factor), yielding

$$\Delta^{X} = \frac{R_{X}}{R_{X} + R_{1} + R_{2}} \left[ \Delta + \frac{C_{X}}{2} |R_{1} - R_{2}| \right] = \frac{R_{X}}{R_{X} + R_{1} + R_{2}} \left[ 1 + \frac{C_{X} |R_{1} - R_{2}|}{2 \ln 2 \cdot |R_{1}C_{1} - R_{2}C_{2}|} \right] \cdot \Delta = CF \cdot \Delta. \tag{2}$$

Equation (2) expresses the effect of the crosslink resistance  $R_X$  and capacitance  $C_X$  on the clock skew. Regardless of the link resistance, the first factor in (2) always decreases the skew  $\Delta$ ,

$$\frac{R_{X}}{R_{X}+R_1+R_2} < 1, \quad \forall R_{X}. \tag{3}$$

The effect of the link capacitance is expressed by the second factor in (2),

$$1 + \frac{C_X |R_1 - R_2|}{2 \ln 2 |R_1 C_1 - R_2 C_2|} = 1 + \frac{C_X |R_1 - R_2|}{1.386 |R_1 C_1 - R_2 C_2|} > 1, \quad \forall C_X, \tag{4}$$

which always increases the original skew  $\Delta$ . To ensure that inserting a crosslink decreases the original skew  $\Delta$ ,

$$CF < 1 \Leftrightarrow R_{X}C_{X} < \frac{R_1 + R_2}{|R_1 - R_2|} \times 1.386 |R_1C_1 - R_2C_2|. \tag{5}$$

To maximize the reduction in skew, a crosslink with the lowest resistance  $R_X$  and capacitance  $C_X$  is used. SPICE simulation and analytic results for skew reduction for different values of  $R_X$  and  $C_X$  are shown in Figure 4.
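As an illustration of Equations (1), (2), and (5), the following minimal Python sketch evaluates the skew without a crosslink, the crosslink factor CF, and the bound on $R_X C_X$. The function names and the numerical values of $R_1$, $C_1$, $R_2$, $C_2$, $R_X$, and $C_X$ are hypothetical and are chosen only for illustration; they are not taken from the simulations reported here.

```python
import math

LN2 = math.log(2.0)

def skew_no_link(r1, c1, r2, c2):
    """Equation (1): skew between the two outputs without a crosslink."""
    return LN2 * abs(r1 * c1 - r2 * c2)

def crosslink_factor(r1, c1, r2, c2, rx, cx):
    """Equation (2): CF, the product of the factors in (3) and (4).

    Requires r1*c1 != r2*c2 (a nonzero original skew).
    """
    resistive = rx / (rx + r1 + r2)
    capacitive = 1.0 + cx * abs(r1 - r2) / (2.0 * LN2 * abs(r1 * c1 - r2 * c2))
    return resistive * capacitive

def link_reduces_skew(r1, c1, r2, c2, rx, cx):
    """Equation (5): True when Rx*Cx is below the bound (requires r1 != r2)."""
    bound = (r1 + r2) / abs(r1 - r2) * 2.0 * LN2 * abs(r1 * c1 - r2 * c2)
    return rx * cx < bound

# Hypothetical example values (ohms and farads), for illustration only.
r1, c1 = 100.0, 200e-15
r2, c2 = 120.0, 300e-15
rx, cx = 50.0, 20e-15

delta = skew_no_link(r1, c1, r2, c2)
cf = crosslink_factor(r1, c1, r2, c2, rx, cx)
print(f"skew without crosslink: {delta * 1e12:.2f} ps")
print(f"crosslink factor CF:    {cf:.3f}")
print(f"skew with crosslink:    {cf * delta * 1e12:.2f} ps")
print(f"condition (5) holds:    {link_reduces_skew(r1, c1, r2, c2, rx, cx)}")
```

For these assumed values the resistive factor dominates, so the crosslink reduces the skew even though the capacitive factor is slightly greater than one.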

The primary concern in inserting a crosslink is the increase in power dissipation. The dynamic energy consumed by the clock tree sections, shown in Figure 3(b), is

$$E = (C_1 + C_2) V_{DD}^2. \tag{6}$$

Once a link is inserted (see Figure 3(a)), however, the energy consumption increases. Since the energy depends linearly on the total load, the dynamic energy consumed by the load and crosslink is

$$E^X = E + C_X V_{DD}^2. \tag{7}$$

Naturally, a crosslink with the lowest possible capacitance  $C_X$  should be used.



Figure 4: Skew vs.  $R_X$  and  $C_X$ . The 25 ps skew without a crosslink caused by an unbalanced tree is reduced by inserting a crosslink.

## 3. MISALIGNED INPUT CLOCK TREE SEGMENTS

In this section, a simplified balanced non-tree clock network is analyzed assuming misaligned inputs. The clock signal reaches $Clk_{In1}$ T time units later than $Clk_{In2}$, so that during this interval Clk<sub>In1</sub> is still high while Clk<sub>In2</sub> has already fallen. It is further assumed that the clock tree sections are balanced and that the misalignment of the inputs is the only factor that affects the skew. Therefore, without a crosslink, the outputs are also skewed by T time units (skew = T). The effect of inserting a crosslink on the skew and power consumption in the presence of short-circuit current is analyzed here. An ideal crosslink ($R_X = C_X = 0$) would produce zero skew and consume no additional dynamic power. In reality, however, the crosslink consumes power and does not fully cancel the skew. While the inputs differ $(Clk_{In2} = 0 \text{ and } Clk_{In1} = 1)$, the crosslink enables a short-circuit current to flow through $R_{W2}$, $R_X$, and $R_{W1}$ (the dotted line shown in Figure 2). The total current flowing through $R_{W2}$ is therefore divided into two sub-currents: one charges the capacitors and the other is shorted to ground. The current through $R_{W1}$ for a step input and a slow input ramp, and for different values of $R_X$, is illustrated in Figure 5.



Figure 5: Current through $R_{W1}$. The negative currents prior to T = 500 ps are undesired.

The negative currents are the short-circuit currents. Once  $Clk_{In1}$  also switches (T = 500 ps), the current reverses direction.

Circuit models in the case of misaligned inputs for  $t \le T$  and t > T are shown in Figure 6.



Figure 6: Circuit models for (a)  $t \le T$ , (b) t > T (misaligned inputs)

Since in this model the misaligned inputs are the only factor that affects the skew, $R_1 = R_2 = R$ and $C_1 + \frac{1}{2}C_X = C_2 + \frac{1}{2}C_X = C$. The solution of the differential equations derived from Figure 6(a) $(Clk_{In1} \neq Clk_{In2}, t \leq T)$ is

$$V_{1}(t) = -\frac{V_{DD}}{2} e^{\frac{-t}{RC}} \left[ 1 - \frac{R_{X}}{R_{X} + 2R} e^{\frac{-2t}{R_{X}C}} \right] + \frac{R}{R_{X} + 2R} V_{DD}, \tag{8}$$

$$V_{2}(t) = -\frac{V_{DD}}{2} e^{\frac{-t}{RC}} \left[ 1 + \frac{R_{X}}{R_{X} + 2R} e^{\frac{-2t}{R_{X}C}} \right] + \frac{R + R_{X}}{R_{X} + 2R} V_{DD}, \tag{9}$$

where  $V_1(t) = V_{ClkOut1}(t)$  and  $V_2(t) = V_{ClkOut2}(t)$ . Similar to the results described in Section 2, lower values of skew are noted for smaller values of  $R_X$ .
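The closed-form expression in (8) can be cross-checked numerically. The sketch below integrates a two-node RC model with forward Euler and compares the result for $V_1$ with (8). The model is an assumed reading of Figure 6(a): output 1 is pulled to ground through R, output 2 is pulled to $V_{DD}$ through R, the two outputs are connected by $R_X$, each output is loaded by C, and both outputs start at 0 V. The function names and parameter values are hypothetical.

```python
import math

def v1_closed_form(t, vdd, r, c, rx):
    """Equation (8): V1(t) for t <= T."""
    k = rx / (rx + 2.0 * r)
    transient = -0.5 * vdd * math.exp(-t / (r * c)) * (1.0 - k * math.exp(-2.0 * t / (rx * c)))
    return transient + r / (rx + 2.0 * r) * vdd

def simulate(t_end, vdd, r, c, rx, steps=20000):
    """Forward-Euler integration of the assumed two-node model.

    Node 1 is driven toward ground and node 2 toward vdd, each through r;
    rx connects the two nodes. Both nodes start at 0 V.
    """
    dt = t_end / steps
    v1 = v2 = 0.0
    for _ in range(steps):
        i_link = (v2 - v1) / rx              # current from node 2 into node 1
        dv1 = ((0.0 - v1) / r + i_link) / c
        dv2 = ((vdd - v2) / r - i_link) / c
        v1 += dt * dv1
        v2 += dt * dv2
    return v1, v2

# Hypothetical values: RC = 10 ps, Rx*C = 5 ps, evaluated at t = 20 ps.
vdd, r, c, rx = 1.8, 100.0, 100e-15, 50.0
t = 20e-12
v1_num, v2_num = simulate(t, vdd, r, c, rx)
print(f"V1 simulated:   {v1_num:.4f} V")
print(f"V1 from (8):    {v1_closed_form(t, vdd, r, c, rx):.4f} V")
print(f"V2 simulated:   {v2_num:.4f} V")
```

Under these assumptions the simulated $V_2$ stays above $V_1$ while the inputs differ, which is the voltage difference that drives the short-circuit current through $R_X$.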



Figure 7: Output voltage waveforms  $V_{ClkOut1}(t)$  and  $V_{ClkOut2}(t)$ with and without a crosslink

This behavior is confirmed by the skew expressions described in [6]. Based on [6], the skew with a crosslink is

$$\Delta^{X} = \Delta \exp\left(-2\ln 2\frac{R}{R_{X}}\right) = \Delta\left(\frac{1}{2}\right)^{2R/R_{X}}. \tag{10}$$
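To give a sense of how strongly the skew in (10) depends on the ratio $R/R_X$, the following short sketch evaluates the exponential form of (10) for several crosslink resistances. The value of R and the list of $R_X$ values are hypothetical; the 40 ps original skew is used only as an example.

```python
import math

def skew_with_link(delta, r, rx):
    """Exponential form of Equation (10): delta * exp(-2 ln2 * R / Rx)."""
    return delta * math.exp(-2.0 * math.log(2.0) * r / rx)

delta = 40e-12   # example skew without a crosslink (s)
r = 100.0        # hypothetical branch resistance (ohm)
for rx in (50.0, 100.0, 200.0, 400.0):
    print(f"Rx = {rx:5.0f} ohm -> skew = {skew_with_link(delta, r, rx) * 1e12:.2f} ps")
```

With these assumed values, halving $R_X$ squares the reduction factor, so the skew falls quickly once $R_X$ drops below R.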

The output voltage waveforms with and without a crosslink are presented in Figure 7. The total energy consumed during the time interval [0, T] is

$$E_{t \le T} = \left(\frac{V_{DD}^2}{R_X + 2R}\right) T + \frac{1}{2} C V_{DD}^2 \left[ \left(1 - e^{-\frac{1}{RC}T}\right) + \left(\frac{R_X}{R_X + 2R}\right)^2 \left(1 - e^{-(\frac{1}{RC} + \frac{2}{R_X C})T}\right) \right], \quad t \le T.$$
(11)

The first term of $E_{t\leq T}$ in (11) represents the dissipated short-circuit energy. The second term represents the dynamic energy used to charge the output capacitance. Note that not all of this dynamic energy is useful: the total current in the model (shown in Figure 8) comprises the currents that charge the output capacitors (the solid arrows in the figure), the currents that discharge the output capacitors (the dashed arrows), and the short-circuit current (the crossed arrows).



The target value of both output voltages is $V_{ClkOut1} = V_{ClkOut2} = '1'$. Only the charging currents are therefore significant. To determine the additional energy consumed for t > T, the differential equations derived from Figure 6(b) $(Clk_{In1} = Clk_{In2}, t > T)$ are solved, yielding

$$V_{1}(t) = V_{DD} - \frac{V_{DD}}{2} \left[ 1 + e^{\frac{-T}{RC}} \right] e^{\frac{-t}{RC}} - \frac{R_{X}}{R_{X} + 2R} \cdot \frac{V_{DD}}{2} \left[ 1 - e^{\left(\frac{-T}{RC} + \frac{-2T}{R_{X}C}\right)} \right] e^{\left(\frac{-t}{RC} + \frac{-2t}{R_{X}C}\right)}, \tag{12}$$

$$V_{2}(t) = V_{DD} - \frac{V_{DD}}{2} \left[ 1 + e^{\frac{-T}{RC}} \right] e^{\frac{-t}{RC}} + \frac{R_{X}}{R_{X} + 2R} \cdot \frac{V_{DD}}{2} \left[ 1 - e^{\left(\frac{-T}{RC} + \frac{-2T}{R_{X}C}\right)} \right] e^{\left(\frac{-t}{RC} + \frac{-2t}{R_{X}C}\right)}, \tag{13}$$

where $V_1(t) = V_{ClkOut1}(t)$ and $V_2(t) = V_{ClkOut2}(t)$. The total energy consumed for t > T converges as follows:

$$E_{t>T} = CV_{DD}^{2} \left( 1 + e^{\frac{-T}{RC}} \right) \left( 1 - e^{\frac{-t}{RC}} \right) \xrightarrow{t \to \infty} CV_{DD}^{2} \left( 1 + e^{\frac{-T}{RC}} \right), \quad t > T. \tag{14}$$

The total energy consumption once the first input $(Clk_{In2})$ switches and until the output capacitors are charged is

$$E = \left(\frac{V_{DD}^2}{R_X + 2R}\right)T + \frac{1}{2}CV_{DD}^2 \left[3 + e^{\frac{-T}{RC}} + \left(\frac{R_X}{R_X + 2R}\right)^2 \left(1 - e^{(\frac{-T}{RC} + \frac{-2T}{R_X C})}\right)\right]. \tag{15}$$

The first term in (15) represents the short-circuit energy, which increases as $R_X$ is reduced. The second term is the dynamic energy, which increases with $R_X$. Note that the derivative $\partial E/\partial R_X$ is negative. The higher $R_X$ is, therefore, the lower the energy. This characteristic agrees with SPICE simulations (see Figure 9). The portion of short-circuit energy as a component of the total energy decreases as well.
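The dependence of (15) on $R_X$ can be illustrated with a small sweep. The sketch below evaluates the short-circuit and dynamic terms of (15) separately and reports the total energy and the short-circuit share. All parameter values (including $V_{DD}$ = 1.8 V, R = 100 Ω, and C = 100 fF) are hypothetical; T = 500 ps follows the input misalignment used in Figure 5.

```python
import math

def energy_terms(t_skew, vdd, r, c, rx):
    """Return (short_circuit, dynamic) energy terms of Equation (15)."""
    short_circuit = vdd ** 2 * t_skew / (rx + 2.0 * r)
    k = rx / (rx + 2.0 * r)
    decay = math.exp(-t_skew / (r * c) - 2.0 * t_skew / (rx * c))
    dynamic = 0.5 * c * vdd ** 2 * (3.0 + math.exp(-t_skew / (r * c)) + k ** 2 * (1.0 - decay))
    return short_circuit, dynamic

# Hypothetical parameters, for illustration only.
t_skew, vdd, r, c = 500e-12, 1.8, 100.0, 100e-15
rx_values = [25.0, 50.0, 100.0, 200.0, 400.0, 800.0]

totals, fractions = [], []
for rx in rx_values:
    e_sc, e_dyn = energy_terms(t_skew, vdd, r, c, rx)
    totals.append(e_sc + e_dyn)
    fractions.append(e_sc / (e_sc + e_dyn))
    print(f"Rx = {rx:5.0f} ohm: total = {totals[-1] * 1e12:.3f} pJ, "
          f"short-circuit share = {fractions[-1]:.1%}")
```

For these assumed values, where T is much larger than RC, the short-circuit term dominates, and both the total energy and the short-circuit share fall as $R_X$ grows.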



Figure 9: Total and short-circuit energy vs. crosslink resistance

## 4. DISCUSSION

In this section, the effects of either  $R_X$  or  $C_X$  on the skew and energy for both aligned inputs and misaligned inputs are discussed. The relation between  $R_X$  and  $C_X$  is also considered. Alternatives for placing the crosslink are presented, and the crosslinks are compared with buffer sizing in terms of power dissipation.

As described in Section 2, in the case of aligned inputs, a lower  $R_X$  achieves enhanced skew reduction and has no effect on the power consumption. Similarly, a lower  $C_X$  decreases the skew and also dissipates less power. In the case of misaligned inputs, the following conclusions are noted: a lower  $R_X$  achieves enhanced skew reduction at higher power consumption, while a lower  $C_X$  does not affect the skew and dissipates less power.

Given the length of a crosslink X, increasing either the width or thickness results in increased capacitance $C_X$ and reduced resistance $R_X$. Since $R_X$ and $C_X$ are not independent for a particular wire, they cannot be reduced concurrently. In the case of aligned inputs, only dynamic power is dissipated. Therefore, to reduce power, a lower $C_X$ and a higher $R_X$ should be used since dynamic power depends only on $C_X$. In contrast, for enhanced skew reduction, a lower $R_X$ should be used since the sensitivity of the skew to $R_X$ is higher than to $C_X$ (in the relevant range, as shown in Figure 4). In the case of misaligned inputs, a higher $R_X$ and lower $C_X$ should be used to reduce the short-circuit and total power consumption. For skew reduction, however, lower values of $R_X$ should be used, since the skew in this case does not depend on $C_X$. Alternatively, a lower $R_X$ and higher $C_X$ should be used in both cases for enhanced skew reduction at higher power.



Figure 10: Effect of crosslink on energy vs. skew for different values of  $R_X$  and  $C_X$ . The 40 ps skew without a crosslink, caused by either an unbalanced tree or due to misaligned inputs, is reduced by the crosslink.

The tradeoff between skew reduction and energy consumption in a 180 nm technology is demonstrated in Figure 10 for an unbalanced tree with aligned inputs and a balanced tree with misaligned inputs. Both trees incur a 40 ps skew without a crosslink. Simulations are based on typical technology models [15]. $R_X$ and $C_X$ values are calculated using closed-form formulas [16] for crosslinks with widths ranging from 0.18 µm to 5.4 µm. The curves exhibit a 'knee' where a desirable combination of low skew and low power, close to the origin, may be sought.
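The coupling between $R_X$ and $C_X$ through the wire width can be sketched for the aligned-input case using (2) and (7). The example below does not use the closed-form formulas of [16]; instead it assumes a much simpler wire model (sheet resistance for $R_X$, area plus fringe capacitance for $C_X$). All coefficients, the wire length, and the tree parameters are hypothetical and serve only to show the direction of the tradeoff.

```python
import math

LN2 = math.log(2.0)

# Hypothetical wire and tree parameters (not taken from [15] or [16]).
SHEET_RES = 0.08        # ohm per square
CAP_AREA = 0.04e-15     # farad per square micrometer
CAP_FRINGE = 0.08e-15   # farad per micrometer of wire length
LENGTH_UM = 500.0       # crosslink length in micrometers
VDD = 1.8
R1, C1 = 100.0, 200e-15
R2, C2 = 120.0, 300e-15

def wire_rc(width_um):
    """Simplified wire model: Rx falls and Cx rises as the wire is widened."""
    rx = SHEET_RES * LENGTH_UM / width_um
    cx = (CAP_AREA * width_um + CAP_FRINGE) * LENGTH_UM
    return rx, cx

def crosslink_factor(rx, cx):
    """Equation (2) for the assumed tree parameters."""
    resistive = rx / (rx + R1 + R2)
    capacitive = 1.0 + cx * abs(R1 - R2) / (2.0 * LN2 * abs(R1 * C1 - R2 * C2))
    return resistive * capacitive

def sweep(widths_um):
    """Return (width, CF, extra dynamic energy from Equation (7)) per width."""
    rows = []
    for w in widths_um:
        rx, cx = wire_rc(w)
        rows.append((w, crosslink_factor(rx, cx), cx * VDD ** 2))
    return rows

for w, cf, e_extra in sweep([0.18, 0.36, 0.9, 1.8, 3.6, 5.4]):
    print(f"W = {w:4.2f} um: CF = {cf:.3f}, extra energy = {e_extra * 1e15:.0f} fJ")
```

Under this assumed model, widening the crosslink lowers CF (less residual skew) while raising the added dynamic energy, which is the qualitative shape of the energy-skew curves.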



Figure 11: Crosslink insertion at different tree levels

An additional issue is where to place the crosslink within a tree. Equation (2) shows that, given aligned inputs, the skew with a crosslink is inversely proportional to  $R_X + R_{av}$ . Therefore, for enhanced skew reduction and a specific  $R_X$ , the crosslink should be inserted as far as possible from the input buffers driving a section. Placing the crosslink close to the section outputs naturally increases the values of  $R_X + R_{av}$ , reducing the skew. Since the crosslink energy for aligned inputs is not affected by the resistance, the physical placement has no effect on the power consumption. Furthermore, as shown in (10) and (15), for misaligned inputs, both skew reduction and energy consumption are inversely proportional to R ( $R = R_{av}$  where  $R_1 = R_2$ ). Inserting a crosslink as far as possible from the input drivers of a section, therefore, increases the value of R and, as a result, reduces the skew while preventing any unnecessary loss of energy.

Another issue is where to insert a crosslink along the branches of a clock tree. Consider reducing the skew between  $Clk_{Out1}$  and  $Clk_{Out2}$  (denoted as *skew*) by inserting either *Crosslink1* or *Crosslink2*, as shown in Figure 11.



Figure 12: Energy vs. skew for crosslink insertion at different tree levels. Inserting a crosslink at a higher level will usually achieve less skew reduction at the same energy due to the additional imbalance of loads, resistors, and drivers at the lower levels, affecting the delay.

SPICE simulations (see Figure 12) indicate that Crosslink1 results in a higher skew than Crosslink2, although with similar power consumption characteristics. The skew between Clk<sub>Out1</sub>' and Clk<sub>Out2</sub>' with Crosslink1 is denoted as skew'. Variations in the propagation delay from Clk<sub>Out1</sub>' to Clk<sub>Out1</sub> as compared to the delay from Clk<sub>Out2</sub>' to Clk<sub>Out2</sub> may partially cancel the reduction in skew achieved by Crosslink1, causing skew > skew'. In contrast, inserting Crosslink2 reduces the skew directly between the destination nodes (*Clk<sub>Out1</sub>* and *Clk<sub>Out2</sub>*), further reducing the skew.

Several buffer sizing techniques have been proposed to reduce the skew [1]-[3], and buffer sizing can serve as an alternative to inserting a crosslink. SPICE simulations (Figure 13) have been used to compare crosslink insertion with sizing the driver at one of the branches to compensate for the skew. The simulations show that inserting a crosslink may dissipate less energy than buffer sizing for the same reduction in skew. Given misaligned inputs (see Figure 13(b)), crosslink insertion results in lower energy consumption than buffer sizing for a specific skew. Note that speeding up a slow path by strengthening the driving buffer can only compensate for a portion of the skew in the model with misaligned inputs. The reduction in skew due to buffer sizing is therefore limited, making zero skew unlikely.

For aligned inputs (see Figure 13(a)), the skew is caused by an unbalanced load $(C_1 \neq C_2)$, as well as by different interconnect resistances $(R_1 \neq R_2)$ and drivers. In this case, buffer sizing incurs less energy for the same low skew. Note that the energy-skew curve for crosslink insertion depends strongly on the crosslink capacitance, which in turn depends on the physical locations of $Clk_{Out1}$ and $Clk_{Out2}$. The closer $Clk_{Out1}$ and $Clk_{Out2}$ are to each other, the smaller $R_X$ and $C_X$ are, resulting in a lower crosslink factor and less energy dissipation.






Figure 13: Energy vs. skew for driver size and crosslink insertion for (a) aligned inputs (b) misaligned inputs

## 5. SUMMARY

Inserting a crosslink is a power efficient method for coping with skew within a buffered clock tree. The tradeoff between energy consumption and skew reduction is investigated based on analytic expressions and simulations, demonstrating that crosslinks with minimal resistance should be used for maximum skew reduction, albeit at a high power. Alternatively, crosslinks with low capacitance should be used for minimal power overhead, traded off for lower skew reduction. Analytic expressions and simulation results can be used to determine the best tradeoff between power and skew based on specific design requirements and constraints. Analytic expressions for the short-circuit current, caused by multiple drivers in buffered non-tree clock networks, have also been developed, showing that inserting a crosslink may be power efficient despite the presence of a short-circuit current. Regarding placement, it is shown that a crosslink should be inserted as close to the clock tree leaves as possible for lower energy consumption. Crosslink insertion should be considered as an alternative or complement to buffer sizing for skew reduction at lower power.

## 6. REFERENCES

- [1] J. G. Xi and W. W.-M. Dai, "Buffer Insertion and Sizing Under Process Variations for Low Power Clock Distribution," Proceedings of the ACM/IEEE Design Automation Conference, pp. 491-496, June 1995.
- [2] J.-L. Tsai, T.-H. Chen, and C.C.-P. Chen, "Zero Skew Clock-Tree Optimization with Buffer Insertion/Sizing and Wire Sizing," IEEE Transactions on Computer-Aided-Design, vol. 23, no. 4, pp. 565-572, April 2004.
- [3] S. Pullela, N. Menezes, J. Omar, and L. T. Pillage, "Skew and Delay Optimization for Reliable Buffered Clock Trees," Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, pp. 556-562, November 1993.
- [4] Z. Li, Y. Zhou, and W. Shi, "Wire Sizing for Non-Tree Topology," IEEE Transactions on Computer-Aided-Design, vol. 26, no. 5, pp. 872-880, May 2007.
- [5] A. L. Sobczyk, A. W. Łuczyk, and W. A. Pleskacz, "Power Dissipation in Basic Global Clock Distribution Networks," Proceedings of the IEEE Workshop Design and Diagnostics of Electronic Circuits and Systems, pp. 1-4, April 2007.
- [6] M. Mori, H. Chen, B. Yao, and C.-K. Cheng, "A Multiple Level Network Approach for Clock Skew Minimization with Process Variations," Proceedings of the Asia and South Pacific Design Automation Conference, pp. 263-268, January 2004.
- [7] E. G. Friedman, "Clock Distribution Networks in Synchronous Digital Integrated Circuits," Proceedings of the IEEE, vol. 89, no. 5, pp. 665-692, May 2001.
- [8] N. A. Kurd et al., "A Multigigahertz Clocking Scheme for the Pentium 4 Microprocessor," IEEE Journal of Solid-State Circuits, vol. 36, no. 11, pp. 1647-1653, November 2001.
- [9] A. Rajaram and D. Z. Pan, "Variation Tolerant Buffered Clock Network Synthesis with Cross Links," Proceedings of the ACM International Symposium on Physical Design, pp. 157-164, April 2006.
- [10] G. Venkataraman et al., "Practical Techniques to Reduce Skew and its Variations in Buffered Clock Networks," Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, pp. 592-596, November 2005.
- [11] A. Rajaram, D. Pan, and J. Hu, "Improved Algorithms for Link Based Non-Tree Clock Network for Skew Variability Reduction," Proceedings of the ACM International Symposium on Physical Design, pp. 55-62, April 2005.
- [12] A. Rajaram, J. Hu, and R. Mahapatra, "Reducing Clock Skew Variability via Cross Links," Proceedings of the ACM/IEEE Design Automation Conference, pp. 18-23, June 2004.
- [13] P. K. Chan and K. Karplus, "Computing Signal Delay in General RC Networks by Tree/Link Partitioning," Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, pp. 898-902, August 1990.
- [14] V. Adler and E. G. Friedman, "Delay and Power Expressions for a CMOS Inverter Driving a Resistive-Capacitive Load," Proceedings of the IEEE International Symposium on Circuits and Systems, vol. 4, pp. 101-104, May 1996.
- [15] URL: http://www.eas.asu.edu/~ptm/interconnect.html.
- [16] S.-C. Wong, G.-Y. Lee, and D.-J. Ma, "Modeling of Interconnect Capacitance, Delay, and Crosstalk in VLSI," IEEE Transactions on Semiconductor Manufacturing, vol. 13, no. 1, pp. 108-111, May 2000.