Loop Partitioning versus Tiling for Cache based
Multiprocessors
Fabrice Rastello
LIP, Ecole Normale Supérieur, Lyon
In this paper, an efficient algorithm to implement loop partitioning
is introduced and evaluated. We improve recent results of Agarwal, Kranz and
Natarajan~\cite{AgarwalKN95} in several directions. We derive a new
formulation of the cumulative footprint, which enables us to deal with
arbitrary parallelepiped-shaped tiles, as opposed to rectangular tiles in
~\cite{AgarwalKN95}. We design an efficient heuristic to determine the
optimal tile shape. We illustrate the superiority of our algorithm on the same
examples as in~\cite{AgarwalKN95} to ensure the fairness of the comparisons.
Key words: compilation technique, hierarchical memory systems, loop
partitioning, tiling, cache, data locality, footprint.
Further info and related paper(s):