Loop Partitioning versus Tiling for Cache based Multiprocessors

Fabrice Rastello
LIP, Ecole Normale Supérieur, Lyon


In this paper, an efficient algorithm to implement loop partitioning is introduced and evaluated. We improve recent results of Agarwal, Kranz and Natarajan~\cite{AgarwalKN95} in several directions. We derive a new formulation of the cumulative footprint, which enables us to deal with arbitrary parallelepiped-shaped tiles, as opposed to rectangular tiles in ~\cite{AgarwalKN95}. We design an efficient heuristic to determine the optimal tile shape. We illustrate the superiority of our algorithm on the same examples as in~\cite{AgarwalKN95} to ensure the fairness of the comparisons.

Key words: compilation technique, hierarchical memory systems, loop partitioning, tiling, cache, data locality, footprint.


Further info and related paper(s):