Technical Report CS0835

Authors: G. Even and A. Litman
Abstract: In general, mapping a circuit onto several chips introduces a physical setting that differs from the setting within a single chip. Specifically, the delay of chip-to-chip interconnections is much longer than the on-chip delays of wires and gates. This delay affects the bandwidth as well. In addition, the clock skew between chips is larger than the clock skew within a chip. One may mistakenly conclude that the feasible clock period of a systolic array cannot be smaller than the maximal delay of an interconnection in a realization of the circuit. This paper proposes a technique for mapping large systolic linear arrays and systolic two-dimensional arrays onto several chips while nearly maintaining the clock rates that are obtainable when these circuits are small enough to fit on a single chip. Our solution does not rely on special analogue techniques. It is described as a sequence of transformations (logic duplication and retiming), reductions, and an implementation of interconnections that exhibit the required behavior in a given physical setting. We show that each step preserves functionality; consequently, the correctness of the proposed solution follows.
