# Analysis of Power Consumption in VLSI Global Interconnects

Youngsoo Shin, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Korea

Hyung-Ock Kim, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Korea

*Abstract*: The analysis of effects induced by interconnects becomes increasingly important as the scale of process technologies steadily shrinks. While most analyses focus on the timing aspects of interconnects, power consumption is also important. We study the trends of interconnect power consumption based on current and future technology node parameters. We show that 20%-30% of power is consumed by interconnect resistance in an optimally buffered global interconnect system. We also study an analysis method based on a reduced-order model. The relation between power consumption and the poles and residues of a transfer function is addressed. The theoretical results can be used for any kind of linear circuit, including RLC circuits.

## I. INTRODUCTION

As the scale of process technologies steadily shrinks and the size of designs increases, interconnects have an increasing impact on the area, delay, and power consumption of circuits. Reduction in scale causes a continual increase in interconnect delays, although overall circuit performance continues to increase. The situation for power is similar: the portion of power associated with interconnects is increasing. This matters because the conventional design, analysis, and synthesis of VLSI circuits are based on the assumption that gates are the main sources of on-chip power consumption. In this paper, we study the trends of interconnect power distribution based on technology node parameters extrapolated from BPTM [1]. The study shows that, for optimally buffered interconnect systems, 70%-80% of the total power is consumed in the transistors while the remaining power is dissipated as heat in the interconnects.
For the analysis of interconnects, model order reduction has been studied extensively over the last few years, following the introduction of Asymptotic Waveform Evaluation [2]. Model order reduction approximates the Laplace-domain transfer function of a linear (or linearized) network by a relatively small number of dominant poles and zeros. Such reduced-order models can be used to predict the time-domain or frequency-domain response of the linear network. Although there has been significant progress in the analysis and simulation of performance-related aspects of VLSI interconnects, less work has been devoted to the analysis of power consumption (or distribution) of interconnects. Furthermore, the analysis of power-related aspects of interconnects has been limited to power distribution networks, and deals with quantities such as IR drop, ground bounce, and electromigration. In this paper, we introduce a method based on a reduced-order model that allows the power distribution of interconnects to be analyzed. We show that the power, which inherently involves improper integration, can be derived from the poles and residues of the transfer function, which requires only algebraic computation.

The remainder of the paper is organized as follows. In the next section, we revisit the model of CMOS power consumption and point out that the traditional lumped capacitance model for the charging and discharging power component is not valid for global interconnect systems. In Section III, we discuss trends of interconnect power distribution, which confirm this observation. In Section IV, we address the problem of interconnect power analysis, and we conclude the paper in Section V.

## II. POWER CONSUMPTION OF CMOS CIRCUITS

It is well known that power consumption in CMOS circuits has three components [3]:

$$P_{tot} = P_{dp} + P_{stat} + P_{dyn}, \tag{1}$$

where the first component is due to a direct current path from $V_{DD}$ to ground when the pull-up and pull-down networks are both on for a short period during rising and falling transitions. The second term is due to the leakage current that flows between the supply rails in the absence of switching activity. $P_{dyn}$ represents *capacitive dissipation* and is the dominant factor in typical CMOS circuits.

For the cascaded inverters shown in Fig. 1(a), for example, the load seen by a driver is usually modeled as a lumped capacitance $C_L$ as shown in Fig. 1(b), where $C_g$ denotes the gate capacitance of the receiver. During the rising transition, a total energy of $C_L V_{DD}^2$ is delivered by the source; half of it is stored on $C_L$ and the other half is dissipated by the PMOS. The energy stored on the capacitor, $\frac{1}{2} C_L V_{DD}^2$, is dissipated by the NMOS during the falling transition. Thus, the energy of $C_L V_{DD}^2$ is entirely dissipated by MOSFETs.

Fig. 1. (a) Two cascaded inverters, (b) lumped capacitance model for power estimation, and (c) RC tree model for interconnect.

Fig. 2. Buffered global interconnects.

The basic assumption of this model is that the interconnect resistance, denoted as $R_I$ in Fig. 1(a), is *negligible* compared to the *drain effective resistance* of the MOSFETs. This is generally true in local interconnects, where small MOSFETs, which have large drain effective resistance, are connected through short wires, which have small wire resistance. However, the situation is different in a global interconnect system, where large MOSFETs drive long global interconnects (frequently on the order of millimeters), which implies small drain effective resistance and large wire resistance.
This is very common in System-on-a-Chip (SoC) style integration, where many bus interconnects are implemented through long global wires. Another example is global clock distribution [4], where the global clock is built with very large clock buffers to simplify the clock network. The implication is that the traditional model for power analysis, such as the one in Fig. 1(b), is not valid for an interconnect system where wire resistance is significant. Thus, we need an RC (or even RLC) network such as the one in Fig. 1(c).

## III. TRENDS OF INTERCONNECT POWER DISTRIBUTION

In order to investigate the power consumption of interconnects, we consider a buffered global interconnect system as shown in Fig. 2. In particular, we consider an *optimally buffered interconnect*, where the buffer size and the interconnect length are determined in such a way that delay is minimized. The optimal buffer size and the optimal interconnect length can be derived from the delay equation of a buffered interconnect [5], where the delay is from the input of the first buffer to the output of the last buffer:

$$t = k \left[ p_1 \frac{r_i L}{k} \frac{c_i L}{k} + \frac{r_t}{h} h c_t + \frac{r_t}{h} \frac{c_i L}{k} + p_2 \frac{r_i L}{k} h c_t \right], \tag{2}$$

where $k$ is the number of interconnect sections (driven through $k+1$ buffers), $L$ is the total length of the interconnect, $h$ is the buffer size relative to a minimum-size MOSFET, $c_i$ and $r_i$ are the capacitance and resistance of the wire per unit length, respectively, and $c_t$ and $r_t$ are the gate capacitance and drain effective resistance of a minimum-size MOSFET, respectively. The constants $p_1$ and $p_2$ depend on the switching model of the buffer, and are about 0.377 and 0.693, respectively, when 50% of the swing at the receiver side is of interest.
By differentiating (2) in terms of $h$, setting the derivative equal to zero, and solving the equation for $h$, we obtain the optimal buffer size:

$$h_o = \frac{1}{\sqrt{p_2}} \sqrt{\frac{c_i r_t}{r_i c_t}}. \tag{3}$$

Differentiating (2) again in terms of $k$, setting the derivative equal to zero, and solving the equation for $k$ gives us the optimal number of stages for the interconnect of length $L$, denoted by $k_o$. Dividing $L$ by $k_o$ can be shown to give us the optimal interconnect length of each section:

$$l_o = \frac{1}{\sqrt{p_1}} \sqrt{\frac{r_t c_t}{r_i c_i}}. \tag{4}$$

In order to project the optimal buffer size and the optimal interconnect length for the current and future technology generations, we use technology parameters extrapolated from BPTM [1], which are summarized in TABLE I. For each technology node, we obtain the optimum size of buffer and the optimum length of interconnect via (3) and (4), which are tabulated in TABLE II. The total resistance ($R_i$) and capacitance ($C_i$) of the interconnect are also shown in the last two columns of the table.

TABLE I. TECHNOLOGY PARAMETERS

| Technology (nm) | $V_{DD}$ (V) | $T_{ox}$ (Å) | $r_t$ (k$\Omega$) | $c_t$ (fF) |
|-----------------|--------------|--------------|-------------------|------------|
| 180 | 1.8 | 40 | 3.3 | 1.18 |
| 130 | 1.5 | 33 | 3.5 | 0.79 |
| 100 | 1.2 | 25 | 3.6 | 0.59 |
| 70 | 1.0 | 16 | 4.3 | 0.44 |

TABLE II. PARAMETERS OF OPTIMALLY BUFFERED INTERCONNECT SYSTEMS

| Technology (nm) | $h_o$ | $l_o$ (mm) | $R_i$ ($\Omega$) | $C_i$ (fF) |
|-----------------|-------|------------|------------------|------------|
| 180 | 169 | 1.76 | 38.7 | 271 |
| 130 | 182 | 1.23 | 38.1 | 194 |
| 100 | 203 | 0.94 | 34.8 | 161 |
| 70 | 259 | 0.80 | 32.8 | 154 |

For each technology node, we configure one section of the circuit (two buffers and interconnects between them) shown in Fig. 2 with parameters in TABLE II.
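Equations (3) and (4) are simple enough to check numerically. The sketch below is an illustration only, not the flow used in the paper: `optimal_buffering` and `delay_per_length` are hypothetical helper names, and the per-unit-length wire values `r_i` and `c_i` are assumptions back-computed from the 180 nm row of TABLE II as $R_i / l_o$ and $C_i / l_o$, since the text does not list them. `delay_per_length` is (2) divided by $L$ and rewritten with the section length $l = L/k$.

```python
import math

P1, P2 = 0.377, 0.693  # switching-model constants quoted for the 50% swing point


def optimal_buffering(r_i, c_i, r_t, c_t):
    """Return (h_o, l_o) from Eqs. (3) and (4).

    r_i, c_i: wire resistance and capacitance per unit length
    r_t, c_t: drain effective resistance and gate capacitance of a
              minimum-size MOSFET
    """
    h_o = math.sqrt(c_i * r_t / (r_i * c_t)) / math.sqrt(P2)
    l_o = math.sqrt(r_t * c_t / (r_i * c_i)) / math.sqrt(P1)
    return h_o, l_o


def delay_per_length(h, l, r_i, c_i, r_t, c_t):
    """Eq. (2) divided by L, with section length l = L / k."""
    return (P1 * r_i * c_i * l + r_t * c_t / l
            + r_t * c_i / h + P2 * r_i * c_t * h)


# 180 nm row of TABLE I (ohm, farad).
r_t, c_t = 3.3e3, 1.18e-15
# ASSUMPTION: r_i and c_i are back-computed from TABLE II (ohm/mm, farad/mm).
r_i, c_i = 38.7 / 1.76, 271e-15 / 1.76

h_o, l_o = optimal_buffering(r_i, c_i, r_t, c_t)
print(f"h_o = {h_o:.1f}, l_o = {l_o:.3f} mm")

best = delay_per_length(h_o, l_o, r_i, c_i, r_t, c_t)
for scale in (0.8, 1.25):
    # Moving either variable away from the optimum should increase delay.
    assert delay_per_length(scale * h_o, l_o, r_i, c_i, r_t, c_t) > best
    assert delay_per_length(h_o, scale * l_o, r_i, c_i, r_t, c_t) > best
```

With these inputs the computed values fall within about 1% of the tabulated $h_o = 169$ and $l_o = 1.76$ mm. That agreement is partly circular, because `r_i` and `c_i` were taken from the same table, so the more meaningful check is the perturbation loop, which confirms that (3) and (4) sit at a minimum of the delay expression.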
The interconnect is approximated by 5 sections of $\pi$-ladder circuits [6]. We obtain the power consumption of the buffer, denoted as $P_b$, and that of the interconnect (the sum of the power consumption of the 5 resistors in the $\pi$-ladders), denoted as $P_i$, through SPICE simulation. The ratio of power consumed by the buffer to the total power consumption is defined by

$$\eta_b = \frac{P_b}{P_b + P_i}, \tag{5}$$

and similarly for $\eta_i$. When a step is applied at the input of the buffer, the ratios of power are as shown in Fig. 3(a). Since we use a falling-edge step, most of the power consumed by the buffer is due to the PMOS operating in the linear region; we therefore neglect the power consumed by the NMOS, which consists mostly of leakage power. The trends show that about 80% of the total power is consumed in the transistor and the remaining power is dissipated as heat in the interconnect. The trends do not change significantly when a clock is applied instead of a step and all power components of the buffer are taken into account, as shown in Fig. 3(b). Since clock trees consume significant power, and large buffers and long interconnects (even larger and longer than those of the optimally buffered interconnect presented in this paper) are frequently used to build clock trees [4], the power consumption of interconnects should be considered an important factor in the design of clock distribution networks.

Fig. 3. Trends of power distribution of optimally buffered interconnect system when (a) step and (b) clock is applied at the input of the buffer.

## IV. ANALYSIS OF INTERCONNECT POWER DISTRIBUTION

For a given linear or linearized circuit, the total power consumption is readily obtained. However, this does not reveal how the power consumption is distributed over the circuit elements.
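The distinction can be made concrete with a brute-force baseline. For a supply step, the total energy dissipated in the resistors of an RC network is $\frac{1}{2} C_{tot} V_{DD}^2$ regardless of topology, but the share of each resistor depends on the branch current waveforms. The sketch below integrates $R\,i^2(t)$ for every resistor of a 5-section $\pi$-ladder by forward-Euler time stepping. It rests on several assumptions that are not in the paper: the driver is replaced by a linear resistor `r_drv`, the element values are hypothetical and chosen only for illustration, and `ladder_energies` is a made-up helper name. It is not a substitute for the SPICE runs with real transistor models described in Section III.

```python
def ladder_energies(v_dd, r_drv, r_wire, c_wire, c_load,
                    n_sec=5, dt=1e-14, t_end=1e-9):
    """Energy dissipated in each resistor after a 0 -> v_dd supply step.

    Returns a list: index 0 is the (linear) driver resistor, indices
    1..n_sec are the wire resistors ordered from driver to receiver.
    """
    r_seg = r_wire / n_sec
    # pi-ladder: each section puts half of its capacitance at either end,
    # so interior nodes see c_wire / n_sec and the two end nodes half of that.
    caps = [c_wire / n_sec] * (n_sec + 1)
    caps[0] = c_wire / (2 * n_sec)
    caps[-1] = c_wire / (2 * n_sec) + c_load
    res = [r_drv] + [r_seg] * n_sec   # res[k] feeds node k from its upstream node
    volts = [0.0] * (n_sec + 1)
    energy = [0.0] * (n_sec + 1)
    for _ in range(round(t_end / dt)):
        upstream = [v_dd] + volts[:-1]
        cur = [(u - v) / r for u, v, r in zip(upstream, volts, res)]
        for k in range(n_sec + 1):
            energy[k] += res[k] * cur[k] ** 2 * dt        # R * i^2 * dt
            i_out = cur[k + 1] if k < n_sec else 0.0
            volts[k] += (cur[k] - i_out) * dt / caps[k]   # forward-Euler update
    return energy


# Hypothetical values, chosen only for illustration (not from the tables).
V_DD, R_DRV, R_WIRE, C_WIRE, C_LOAD = 1.0, 50.0, 100.0, 200e-15, 100e-15
energy = ladder_energies(V_DD, R_DRV, R_WIRE, C_WIRE, C_LOAD)
total = sum(energy)
expected_total = 0.5 * (C_WIRE + C_LOAD) * V_DD ** 2
print(f"total dissipated / (C V^2 / 2) = {total / expected_total:.3f}")
print(f"driver share = {energy[0] / total:.3f}, "
      f"wire share = {sum(energy[1:]) / total:.3f}")
```

The total is fixed by the capacitance alone, while the per-resistor entries of `energy` require the full transient. Time stepping of this kind is what the pole and residue formulation in the rest of this section is meant to avoid.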
In order to find the power consumption (or energy dissipation)<sup>1</sup> of a particular resistor element, we first obtain the reduced-order model of the current flowing through the resistor, denoted by $\hat{J}(s)$ (with the corresponding time-domain function $\hat{j}(t)$), using a model order reduction technique [2], [7]. The approximate energy dissipated by $R_i$, denoted by $\hat{E}_i$, during the time period $[\tau_1, \tau_2]$ is then given by

$$\hat{E}_i = R_i \int_{\tau_1}^{\tau_2} \hat{j}^2(t) \, dt. \tag{6}$$

<sup>1</sup> Power consumption and energy dissipation are used interchangeably. More precisely, power consumption in this paper means average power consumption, which is equal to energy dissipation divided by the time period of interest.

If we are interested in the total energy dissipated by a specific resistor element during a signal transition, we can consider a semi-infinite interval without loss of generality: we take $\tau_1$ as the time origin and let $\tau_2$ go to infinity. Then $\hat{j}(t)$ reaches a steady state, provided that $\hat{j}(t)$ corresponds to the reduced-order model of an individual transition. This leads us to the improper integral

$$\hat{E}_i = R_i \int_0^\infty \hat{j}^2(t) \, dt. \tag{7}$$

The direct computation of the improper integral in (7) is difficult, especially if there are multiple-order poles or if $\hat{j}^2(t)$ is expressed as a combination of functions other than exponentials. Fortunately, we can avoid this by using a general relation between improper integration in the time domain and algebraic computation in the $s$-plane, as expressed by the following theorem and proved in [8].

*Theorem 1:* If the Laplace transform of a time-domain signal $h(t)$, denoted by $H(s)$, has $q$ singularities in the left half of the $s$-plane, then

$$\int_{0}^{\infty} h^{2}(t) \, dt = \sum_{i=1}^{q} \tilde{r}_{i}, \tag{8}$$

where $\tilde{r}_i$ is the residue of $H(-s)H(s)$ at the $i$-th singularity of $H(s)$.
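Theorem 1 can be spot-checked numerically on a case with a multiple-order pole. The sketch below uses $H(s) = 1/(s+1)^2$, i.e. $h(t) = t e^{-t}$, which is an example chosen here and does not appear in the paper. The residue is evaluated by a numerical contour integral on a small circle, which is an assumption made for brevity and not the matrix computation referred to in the text; `residue` and `energy_integral` are hypothetical helper names.

```python
import cmath
import math


def residue(f, s0, radius=0.5, n=2000):
    """Residue of f at s0 via (1 / (2*pi*j)) * contour integral of f on a
    circle of the given radius around s0 (uniform sampling). The circle
    must not enclose or touch any other singularity of f."""
    total = 0j
    for k in range(n):
        theta = 2 * math.pi * k / n
        offset = radius * cmath.exp(1j * theta)
        ds = 1j * offset * (2 * math.pi / n)   # ds = j * r * e^{j theta} d(theta)
        total += f(s0 + offset) * ds
    return total / (2j * math.pi)


def energy_integral(H, singularities, radius=0.5):
    """Right-hand side of Eq. (8): sum of the residues of H(-s) H(s)
    taken at the singularities of H(s)."""
    def F(s):
        return H(-s) * H(s)
    return sum(residue(F, p, radius) for p in singularities).real


def H(s):
    # Double pole at s = -1; time-domain signal h(t) = t * exp(-t).
    return 1.0 / (s + 1) ** 2


via_residues = energy_integral(H, [-1.0])

# Direct midpoint-rule evaluation of the improper integral, truncated at t = 40.
dt = 1e-3
direct = sum((t * math.exp(-t)) ** 2 * dt
             for t in (dt * (k + 0.5) for k in range(40000)))

print(via_residues, direct)
```

Both numbers should agree with the closed form $\int_0^\infty t^2 e^{-2t} \, dt = 1/4$. The circle radius must stay below the distance to the mirrored singularity of $H(-s)$ at $s = +1$, which is why 0.5 is a safe choice here.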
Note that the only constraint imposed by Theorem 1 is that the transfer function has its singularities in the left half of the $s$-plane, which is the typical situation because we are concerned mostly with stable systems. Thus, we can apply the same relation even to RLC circuits. If we have a reduced-order model of $H(s)$, then $\tilde{r}_i$ can be obtained by a matrix computation involving the moments of $\hat{H}(-s)\hat{H}(s)$ and the singularities of $\hat{H}(s)$ [2]. When all the singularities are simple poles, we obtain the simpler relation expressed by the following theorem.

*Theorem 2:* If the Laplace transform of a time-domain signal $h(t)$, denoted by $H(s)$, has $q$ simple poles in the left half of the $s$-plane, then

$$\int_0^\infty h^2(t) \, dt = \sum_{i=1}^q r_i H(-p_i), \tag{9}$$

where $r_i$ is the residue of $H(s)$ at the pole $p_i$ of $H(s)$.

As an example, suppose we have

$$H(s) = \frac{3s+5}{(s+1)(s+2)} = \frac{2}{s+1} + \frac{1}{s+2} = \frac{r_1}{s-p_1} + \frac{r_2}{s-p_2},$$

so that $r_1 = 2$, $p_1 = -1$, $r_2 = 1$, and $p_2 = -2$. We apply the above theorem:

$$r_1 H(-p_1) + r_2 H(-p_2) = 2\left(\frac{2}{1+1} + \frac{1}{1+2}\right) + 1\left(\frac{2}{2+1} + \frac{1}{2+2}\right) = \frac{43}{12}.$$

This is equal to the result of direct computation of the improper integral:

$$\int_0^\infty h^2(t) \, dt = \int_0^\infty \left(2e^{-t} + e^{-2t}\right)^2 dt = \int_0^\infty \left(4e^{-2t} + 4e^{-3t} + e^{-4t}\right) dt = 2 + \frac{4}{3} + \frac{1}{4} = \frac{43}{12}.$$

To summarize, the overall procedure to analyze the power distribution of a linear circuit (either an RC or an RLC circuit) is as follows: for each resistor element, we obtain the reduced-order model of the current flowing through the resistor, $\hat{J}(s)$; we then apply either Theorem 1 or Theorem 2 to compute the improper integral, which is multiplied by the resistance to obtain the energy dissipation.

In order to verify the validity of the proposed analysis method, we implemented a prototype tool in C++, based on the results presented in this section and on moment-matching-based model order reduction [2].
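Once the poles and residues of $\hat{J}(s)$ are available, the last step of the procedure is a few lines of arithmetic. The following is a minimal sketch of Theorem 2 under the assumption that a model order reduction step has already produced the pole and residue lists; it is not the C++ prototype, and `energy_from_poles` is a hypothetical name. The second call uses a complex-conjugate pole pair, as would arise in an underdamped RLC circuit, with values chosen here for illustration ($j(t) = e^{-t} \sin 2t$).

```python
def energy_from_poles(poles, residues, resistance=1.0):
    """Evaluate R * integral_0^inf j(t)^2 dt for
    J(s) = sum_i r_i / (s - p_i), following Eq. (9).

    All poles must be simple and lie in the left half of the s-plane.
    """
    if any(complex(p).real >= 0 for p in poles):
        raise ValueError("all poles must lie in the left half of the s-plane")

    def J(s):
        return sum(r / (s - p) for p, r in zip(poles, residues))

    total = sum(r * J(-p) for p, r in zip(poles, residues))
    # For real signals the imaginary parts of conjugate terms cancel.
    return resistance * complex(total).real


# Example from the text: H(s) = 2/(s+1) + 1/(s+2), expected 43/12.
e_text = energy_from_poles([-1.0, -2.0], [2.0, 1.0])

# Illustrative conjugate pair: J(s) = 2 / ((s+1)^2 + 4), j(t) = exp(-t) sin(2t).
e_rlc = energy_from_poles([-1 + 2j, -1 - 2j], [-0.5j, 0.5j])

print(e_text, 43 / 12)
print(e_rlc)
```

For the conjugate pair, direct integration gives $\int_0^\infty e^{-2t} \sin^2(2t) \, dt = 0.2$, which the function reproduces. For a single-pole model $\hat{J}(s) = r/(s-p)$ the sum collapses to $-r^2/(2p)$.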
The program reads in a circuit in a SPICE-like format and outputs the power distribution of the interconnect. For the experiments, we randomly generate RC tree networks while varying the number of nodes from 100 to 500, and compare the energy distribution obtained by SPICE with that obtained by our method. As an example, Fig. 4 shows the result for a circuit with 300 nodes. The result shows that a single-pole approximation is quite accurate in most cases. To understand this, first note that the area under the current waveform when it is approximated by a single pole is equal to that under the exact waveform. Since the exact waveform is bell-shaped (except at the driving end) while the approximated one decays monotonically, the accuracy of the energy approximation with a single pole depends on the peakedness of the curve, because we are interested in the area under the square of the waveform. If the peak is strongly skewed toward early times, meaning that the waveform has high-frequency components, the approximated current waveform has a large error around its peak (although the area underneath is correct), and this error becomes more significant when we compute the square of the current waveform, as we must. However, the number of waveforms with such dominant high-frequency components is not large in most practical circuits. The approximation with two poles gives accurate results, which is consistently observed with other circuits.

Fig. 4. Comparison of the energy distribution for a randomly generated circuit with 300 nodes.

## V. CONCLUSION

We study interconnect power consumption based on current and future technology node parameters. The study shows that, for optimally buffered global interconnect systems, about 20%-30% of the power is dissipated as heat in the interconnects, which is in sharp contrast to the traditional CMOS power consumption model.
We describe a method for the power distribution analysis of an interconnect based on a reduced-order model. We show that power consumption can be computed efficiently in the *s*-domain using an algebraic formulation, instead of improper integration in the time domain. The theoretical results rely on the poles and residues of a transfer function, and can thus be used with any kind of model order reduction technique.

## REFERENCES

- [1] Device Group at UC Berkeley. (2004) Berkeley predictive technology model. [Online]. Available: http://www-device.eecs.berkeley.edu/~ptm/
- [2] L. T. Pillage and R. A. Rohrer, "Asymptotic waveform evaluation for timing analysis," *IEEE Trans. on Computer-Aided Design*, vol. 9, no. 4, pp. 352–366, Apr. 1990.
- [3] J. M. Rabaey, A. Chandrakasan, and B. Nikolic, *Digital Integrated Circuits: A Design Perspective*. Prentice Hall, 2003.
- [4] K. M. Carrig, N. T. Gargiulo, R. P. Gregor, D. R. Menard, and H. E. Reindel, "A new direction in ASIC high-performance clock methodology," in *Proc. IEEE Custom Integrated Circuits Conf.*, 1998.
- [5] T. Sakurai, "Superconnect technology," *IEICE Trans. on Electron.*, vol. E84-C, no. 12, pp. 1709–1716, Dec. 2001.
- [6] T. Sakurai, "Approximation of wiring delay in MOSFET LSI," *IEEE Journal of Solid-State Circuits*, vol. SC-18, no. 4, pp. 418–426, Aug. 1983.
- [7] P. Feldmann and R. Freund, "Efficient linear circuit analysis by Padé approximation via the Lanczos process," *IEEE Trans. on Computer-Aided Design*, vol. 14, no. 5, pp. 639–649, May 1995.
- [8] Y. Shin and T. Sakurai, "Estimation of power distribution in VLSI interconnects," in *Proc. Int'l Symposium on Low Power Electronics and Design*, Aug. 2001, pp. 370–375.