Our system is built on three components: the data structures for representing the program, the machine description used to represent resource usage, and finally, on the user interface that enables the writing of instrumentation or scheduling algorithms.
A program written in assembly language can be viewed as consisting of several procedures, each of which is a list of instructions. Within a procedure, instructions are grouped into basic blocks.
While parsing the code, SALTO builds the list of the procedures it encounters. Each procedure has a list of basic blocks, and each block ``knows'' its list of instructions. Figure 1.2 illustrates this object structure. The internal representation of these objects is hidden by the user interface, as shown in section 1.2.2. Markers are added by the analyzer to give useful information about the code being processed: basic blocks frontiers (shaded instructions in figure 1.2), or procedures frontiers, name of current segment, etc.
After parsing the code, a control flow graph (CFG) is built for each procedure. The vertices of a control flow graph are basic blocks and its edges denote the execution order of the blocks. Edges are labeled to indicate if they correspond to the taken or not-taken branch (see figure 1.3 for an example of the graph corresponding to a simple procedure written in C language.) A known limitation of this scheme is its conservative nature, due to the static analysis of the assembly-code. If the target address of a branch instruction is a computed register value, then the SALTO parser leaves the determination of the actual branch target to a user-supplied tool.
/* * Computes the GCD of two integers * using Euclide's method */ int gcd(a,b) int a, b; { int result; if (a < b) a^=b, b^=a, a^=b; if (a%b) result = gcd(a-b,b); else result=b; return result; } |
The second part of the data structures provided by SALTO gives information about the resources needed by an instruction to complete its execution. A resource is usually a register, a functional unit or the memory, but it could be any piece of hardware needed to describe the behavior of the machine (see section 1.2.3 for an explanation on how to define resources.) Each instruction needs a resource during a number of cycles with a particular access mode: either read, write or use.
Each instruction is described by a reservation table, which indicates the list of resources it needs and the mode and cycle a resource is accessed. This information is used when determining the type of data flow between two instructions: RAW (read after write), WAW (write after write), WAR (write after read).
The memory is seen as a unique resource and all memory accesses are considered to be to the same memory location. SALTO is conservative when checking data dependences and thus two memory accesses, one of which is a write, always lead to a dependence. However, the functions of the user interface make it possible for the user to write his own alias analysis algorithm to detect such situations and to implement a link to the data dependence analysis subsystem of a compiler.
This goal is achieved through the use of a Lisp-like language based on the reservation tables formalism. The description file is parsed by SALTO and an internal representation is built using RTL (Register Transfer Language) [29]. The target system description covers:
The remainder of this report is organized as follows. Section 2 provides the specification of target description files. Section 3 contains the detailed specification of the user interface, covering the types, classes, functions, and methods we intend to provide in SALTO. Finally, in section 4 we describe two applications developed using a limited prototype of the tool. These examples are further developed in the appendix. We conclude with an outline of intended development and experimentation work.