Next: Operand Abstraction
Up: SALTO User Interface Specification
Previous: SALTO-Specific Types
The primitives of SALTO are divided into two groups: global functions,
operating at top level in the target code, and class methods, which
manipulate the contents and properties of specific SALTO objects.
All indices in lists (position of CFG in the program, of a basic block in a
CFG, etc.) start from 0.
Failure of a function returning a pointer is indicated by returning
NULL.
The first and last instruction in a basic block are special internal markers
and can neither be moved nor extracted.
An instruction can belong to at most one basic block. To be moved from one
block to another, an instruction must be first removed from its original
block, then inserted into the destination one.
Global functions provide the means of manipulating the list of procedures
appearing in a program, extracting the name of the target
architecture, and finding the position of an object (CFG, basic block,
instruction) in its container (program, CFG, or basic block).
- void loadFile(char *fileName)
- reads and parses the
assembly file fileName. N.B.: this function should only be used if
a new file has to be processed. In normal operation, an input
file has already been read and parsed before the user application started
executing.
- char *getTargetName(void)
- returns the name of the target
architecture as specified in the target description being used, for
example "sparc" or "mips".
- unsigned int numberOfCFG(void)
- returns the number of distinct
control flow graphs in the program, that is, the number of procedures.
- CFG *getCFG(unsigned int pos)
- returns the control flow
graph of the pos-th procedure in the current program. pos must be
a value between 0 and numberOfCFG() - 1. Otherwise, an error
message is generated and NULL is returned.
- unsigned int numberOfInstructions(void)
- returns the number of
instructions (including labels and directives) in the program seen as a
flat list of instructions. N.B.: the expansion of macros was
performed beforehand, when the input program was parsed.
- INST *getInstruction(unsigned int pos)
- returns the
pos-th instruction in the program seen as a flat list of
instructions. pos must be a value between 0 and
numberOfInstructions() - 1. Otherwise, an error
message is generated and NULL is returned.
- void removeCFG(int pos)
- suppresses the pos-th
procedure from the program. pos must be a value between 0 and
numberOfCFG() - 1. Otherwise, an error message is generated and
the call has no effect.
- unsigned int getPositionInPgm(CFG *cfg)
- returns the
position of the specified procedure in the current program. If the
procedure has not been found in the abstract representation of the
program, an error message is generated.
- unsigned int getPositionInCFG(BB *b)
- returns the
position of the specified basic block within its enclosing procedure. If
the basic block has not been found in the abstract representation of the
program, an error message is generated.
- unsigned int getPositionInBB(INST *st)
- returns the position of the
specified instruction within its enclosing basic block. If the
instruction has not been found in the abstract representation of the
program, an error message is generated.
- void produceCode(FILE *outFile)
- unparses the internal
representation of the current assembly program to the specified file.
outFile must already be open for writing. The
default value of outFile is stdout.
- void producePrologue(FILE *outFile)
- unparses the internal
representation of the prologue of the current assembly program to the
specified file. The prologue consists of all instructions (directives,
data labels etc.) from the beginning of the program up to (but not
including) the first CFG of the program. outFile must already be
open for writing. The default value of outFile is stdout.
- void produceEpilogue(FILE *outFile)
- unparses the internal
representation of the epilogue of the current assembly program to the
specified file. The epilogue consists of all instructions located past
the end of the last CFG of the program. outFile must already be
open for writing. The default value of outFile is stdout.
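The indexing and failure conventions above can be sketched as follows. This is only an illustration: the CFG structure and the two global functions are hypothetical stand-ins written so that the fragment compiles on its own, and the helper countReachableCFGs() is not part of SALTO. In a real tool the SALTO declarations would be used instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the SALTO declarations (assumption, not the real API).
struct CFG { int id; };
static std::vector<CFG> g_program(3);   // pretend the program holds three procedures

unsigned int numberOfCFG(void) {
    return static_cast<unsigned int>(g_program.size());
}

CFG *getCFG(unsigned int pos) {
    // Documented behaviour: NULL when pos is out of range (error message omitted here).
    return pos < g_program.size() ? &g_program[pos] : NULL;
}

// Hypothetical helper: visit every procedure; indices run from 0 to numberOfCFG() - 1.
unsigned int countReachableCFGs(void) {
    unsigned int visited = 0;
    for (unsigned int i = 0; i < numberOfCFG(); ++i) {
        CFG *cfg = getCFG(i);
        if (cfg == NULL) continue;   // failure of a pointer-returning call is signalled by NULL
        ++visited;
    }
    return visited;
}
```

The loop bound and the NULL test mirror the two general rules stated at the top of this section: indices start from 0, and a failed lookup yields NULL.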
The data in CFG objects is entirely private: all modifications of
their values are performed through the methods listed below:
- char * CFG::getName(void)
- returns the name of the procedure
corresponding to this CFG, i.e., the first label of its first basic
block. A NULL pointer is returned if the CFG is empty.
- unsigned int CFG::numberOfBB(void)
- returns the number of basic
blocks in the procedure.
- BB *CFG::getBB(unsigned int pos)
- returns the
pos-th basic block of the procedure. pos must be a value
between 0 and this->numberOfBB() - 1. Otherwise, an error message is
generated and NULL is returned.
- void CFG::deleteBB(unsigned int pos)
- deletes the
pos-th basic block and its instructions from the control flow
graph and updates the edges of the graph. Prints an error message and
does not modify the graph if pos is out of bounds.
- BB *CFG::createNewBB(void)
- creates a new basic block with no
instructions in it.
- BB *CFG::extractBB(unsigned int pos)
- extracts a basic
block from the procedure without destroying its contents or modifying the
dependences (edges) of the graph. Allows the basic block to be inserted
elsewhere. pos must be a value between 0 and
this->numberOfBB() - 1. See also methods CFG::linkBB()
and CFG::unlinkBB().
- void CFG::insertBB(unsigned int pos, BB *b)
-
inserts a previously extracted or newly created basic block into the
control flow graph at the position specified by pos. No edges are
added to the graph. pos must be a value between 0 and
this->numberOfBB() - 1. See also methods CFG::linkBB()
and CFG::unlinkBB().
- void CFG::linkBB(BB *source, BB *sink, enum cft_type
t)
- adds an edge between basic blocks source and
sink. The parameter t indicates whether the edge corresponds to the
branch being taken (test condition satisfied, = TAKEN) or not
(test condition failed, = NOT_TAKEN).
- void CFG::unLinkBB(BB *source, BB *sink)
- suppresses
the edge between basic blocks source and sink.
- void CFG::producePrologue(FILE *outFile)
- writes to file
outFile the totality of the (pseudo-)code preceding the first
basic block of the current procedure. This may include directives,
data labels, comments etc.
- void CFG::produceEpilogue(FILE *outFile)
- writes to file
outFile the (pseudo-)code following the last basic block of
the current procedure.
- void CFG::produceCode(FILE *outFile)
- writes to file
outFile the complete code of the current procedure, including its
prologue and epilogue.
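As a sketch of how CFG::extractBB() and CFG::insertBB() combine to relocate a basic block, consider the fragment below. The classes are hypothetical stand-ins reduced to the methods used here; append() and moveBB() are illustration helpers, and the assumption that the block ends up at index pos after insertBB() is ours, not a statement from this specification.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; the real classes are provided by SALTO.
struct BB { int tag; };

class CFG {
    std::vector<BB *> blocks;
public:
    void append(BB *b) { blocks.push_back(b); }   // test helper, not part of SALTO
    unsigned int numberOfBB(void) { return static_cast<unsigned int>(blocks.size()); }
    BB *getBB(unsigned int pos) { return pos < blocks.size() ? blocks[pos] : NULL; }
    BB *extractBB(unsigned int pos) {
        if (pos >= blocks.size()) return NULL;
        BB *b = blocks[pos];
        blocks.erase(blocks.begin() + pos);
        return b;
    }
    void insertBB(unsigned int pos, BB *b) {
        // Assumption: the block sits at index pos after the call.
        if (pos >= blocks.size()) return;         // documented range: 0 .. numberOfBB() - 1
        blocks.insert(blocks.begin() + pos, b);
    }
};

// Hypothetical helper: move the block at index from so that it sits at index to.
bool moveBB(CFG &cfg, unsigned int from, unsigned int to) {
    unsigned int n = cfg.numberOfBB();
    // After extraction the CFG holds n - 1 blocks, so to must be at most n - 2.
    if (n < 2 || from >= n || to > n - 2) return false;
    BB *b = cfg.extractBB(from);
    cfg.insertBB(to, b);
    return true;
}
```

Neither call touches the edges of the graph, so a real tool would still have to repair them with CFG::linkBB() and the matching unlink method.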
The objects of class BB represent the basic blocks of the target code,
that is, lists of instructions containing neither branches nor jumps, except
at the end. As for class CFG, there is no direct access to the data of
class BB. All accesses are made through the methods listed below:
- unsigned int BB::numberOfInstructions(void)
- returns the number of
instructions in the current basic block. NOTE: as macro expansion
is performed beforehand, the count returned will be that corresponding to
the expanded code.
- unsigned int BB::numberOfAsm(void)
- returns the number of actual
assembler mnemonics in the current basic block. NOTE: as macro expansion
is performed beforehand, the count returned will be that corresponding to
the expanded code.
- INST *BB::getInstruction(unsigned int pos)
- returns the
pos-th instruction in the current basic block. pos must be a
value between 0 and this -> numberOfInstructions() -
1. If pos is out of bounds, NULL is returned and an error
message is generated.
- INST *BB::getAsm(unsigned int pos)
- returns the
pos-th assembler mnemonic of the current basic block. pos
must be a value between 0 and this -> numberOfAsm() - 1, otherwise
NULL is returned and an error message is generated.
- void BB::extractInstruction(unsigned int pos)
-
suppresses the pos-th instruction from the current basic
block. pos must be a value between 1 and
this -> numberOfInstructions() - 2, otherwise an error
message is generated and the call has no effect. Reminder: the
first and the last instruction of the basic block are SALTO markers and
cannot be removed.
- void BB::extractInstruction(INST *st)
- suppresses the
specified instruction from the current basic block. Instruction st
must belong to the current basic block, otherwise an error message is
generated and the call has no effect. Reminder: the first and the
last instruction of the basic block are SALTO markers and cannot be
removed.
- void BB::extractAsm(unsigned int pos)
- suppresses the
pos-th assembly instruction from the current basic block.
pos must be a value between 0 and this ->
numberOfAsm() - 1, otherwise an error message is issued and the call
has no effect.
- void BB::insertInstruction(unsigned int pos, INST
*st)
- inserts a new instruction before the instruction at
position pos in the current basic block. pos must be a
value between 1 and this -> numberOfInstructions() - 1,
otherwise an error message is generated and the call has no effect. The
position given is that at which the inserted instruction should appear
after the call. N.B.: an instruction can only belong to one basic
block: if instructions are moved between blocks, they must first be
extracted from the original block, then inserted into the destination one.
- void BB::insertAsm(unsigned int pos, INST *st)
-
inserts a new assembly instruction before the assembly
instruction at position pos in the current basic block.
pos must be a value between 0 and this ->
numberOfAsm(), otherwise an error message is generated and the call
has no effect. The position given is that at which the inserted
instruction should appear in the assembly instruction list after
the call. If the position given is 0 and the block contains no
assembly instructions, or if the position given is this ->
numberOfAsm(), the assembly instruction st is inserted as the
last instruction of the block. N.B.: an instruction can only belong to
one basic block: if instructions are moved between blocks, they must first
be extracted from the original block, then inserted into the destination
one.
- void BB::swapInstruction(unsigned int pos1, unsigned int
pos2)
- exchanges the
instructions located at positions
pos1 and pos2 in the current basic block. pos1
and pos2 must both lie between 1 and
this -> numberOfInstructions() - 2.
- void BB::orderAccordingToCycles(void)
- reorders the instructions
of the basic block according to the schedule attributed beforehand to each
instruction using calls to INST::setCycle().
- void BB::addNecessaryNops(void)
- inserts all necessary NOP
pseudo-instructions corresponding to the cycles for which no instructions
are scheduled. See also orderAccordingToCycles() and
INST::setCycle().
- unsigned int BB::numberOfSuc(void)
- returns the number of
successors of the current basic block in its enclosing control flow graph.
- unsigned int BB::numberOfPred(void)
- returns the number of
predecessors of the current basic block in its enclosing control flow
graph.
- BB *BB::getSuc(unsigned int pos)
- returns the
pos-th successor of the current basic block in the enclosing
control flow graph. pos must be between 0 and this ->
numberOfSuc() - 1, otherwise NULL is returned and an error
message is generated.
- BB *BB::getPred(unsigned int pos)
- returns the
pos-th predecessor of the current basic block in the enclosing
control flow graph. pos must be between 0 and this ->
numberOfPred() - 1, otherwise NULL is returned and an error
message is generated.
- enum cft_type BB::getSucType(unsigned int pos)
- returns
the type of the pos-th successor of the current basic block in the
enclosing control flow graph. pos must be between 0 and
this -> numberOfSuc() - 1, otherwise the call returns
NOT_TAKEN and an error message is generated.
- enum cft_type BB::getPredType(unsigned int pos)
-
returns the type of the pos-th predecessor of the current basic
block in the enclosing control flow graph. pos must be
between 0 and this -> numberOfPred() - 1, otherwise the
call returns NOT_TAKEN and an error message is generated.
- void BB::addSuc(BB *b, enum cft_type t)
- adds a
successor of the current basic block and updates the predecessor list of
the basic block being added. The parameter t indicates whether the
edge corresponds to the branch being taken (test condition satisfied, t
== TAKEN) or not (test condition failed, t == NOT_TAKEN).
- void BB::addPred(BB *b, enum cft_type t)
- adds a
predecessor of the current basic block and updates the successor list of
the basic block being added. The parameter t indicates whether the
edge corresponds to the branch being taken (test condition satisfied,
t == TAKEN) or not (test condition failed, t == NOT_TAKEN).
- void BB::notPredAnymore(unsigned int pos)
- suppresses the
edge between the current basic block and its pos-th predecessor.
pos must be a value between 0 and this ->
numberOfPred() - 1, otherwise an error message is generated.
- void BB::notSucAnymore(int pos)
- suppresses the edge
between the current basic block and its pos-th successor. pos
must be a value between 0 and this -> numberOfSuc() - 1,
otherwise an error message is generated.
- unsigned int BB::contains(INST *st)
- checks whether or
not instruction st belongs to the current basic block.
Returns 0 if the instruction was not found or was a marker
pseudo-instruction. A non-zero return value is the position of the
instruction in the basic block.
- void BB::produceCode(FILE *fg)
- writes the external
representation of the current basic block to the file fg.
- INST *BB::firstInstruction(void)
- returns the first instruction
of the current basic block. It is necessarily a marker pseudo-instruction
BEGIN_BASIC_BLOCK (type
X_INFO_TYPE).
- INST *BB::lastInstruction(void)
- returns the last instruction of
the current basic block. It is necessarily a marker pseudo-instruction
END_BASIC_BLOCK (type X_INFO_TYPE).
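The rule that an instruction must be extracted from its block before being inserted into another one, together with the marker positions, can be sketched as below. INST and BB here are hypothetical stand-ins that only model the position checks documented above, and moveToEnd() is an illustration helper, not a SALTO function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for SALTO's INST and BB, reduced to what the sketch needs.
struct INST { bool marker; };

class BB {
    std::vector<INST *> insts;
    INST beginMark, endMark;             // model the BEGIN/END marker pseudo-instructions
public:
    BB() {
        beginMark.marker = true;
        endMark.marker = true;
        insts.push_back(&beginMark);
        insts.push_back(&endMark);
    }
    unsigned int numberOfInstructions(void) { return static_cast<unsigned int>(insts.size()); }
    INST *getInstruction(unsigned int pos) { return pos < insts.size() ? insts[pos] : NULL; }
    void extractInstruction(unsigned int pos) {
        if (pos < 1 || pos + 2 > insts.size()) return;   // valid range: 1 .. n - 2
        insts.erase(insts.begin() + pos);
    }
    void insertInstruction(unsigned int pos, INST *st) {
        if (pos < 1 || pos + 1 > insts.size()) return;   // valid range: 1 .. n - 1
        insts.insert(insts.begin() + pos, st);
    }
};

// Hypothetical helper: move instruction pos of src to the end of dst,
// i.e. just before the END marker of dst.
bool moveToEnd(BB &src, unsigned int pos, BB &dst) {
    unsigned int n = src.numberOfInstructions();
    if (pos < 1 || pos + 2 > n) return false;            // markers at 0 and n - 1 stay put
    INST *st = src.getInstruction(pos);
    src.extractInstruction(pos);                         // first remove from the original block
    dst.insertInstruction(dst.numberOfInstructions() - 1, st);   // then insert into the destination
    return true;
}
```

Inserting at position numberOfInstructions() - 1 places the instruction immediately before the closing marker, which is the last position the documented range allows.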
The class INST implements a representation of target code
instructions. As for the classes CFG and BB, all data of
class INST objects are private and can only be manipulated using the
methods listed below.
The type and several semantic properties of an instruction can be checked
by calling the following predicates:
- xNode_Type INST::getType(void)
- returns the type of the current
instruction (see section 3.2 above.)
- bool INST::isLabel(void)
- returns true if the current
instruction is a label.
- bool INST::isPseudo(void)
- returns true if the current
instruction is a ``pseudo-instruction'', i.e., an assembler directive.
- bool INST::isAsm(void)
- returns true if the current
instruction is an actual assembler instruction.
- bool INST::isBranch(void)
- returns true if the current
instruction is a conditional branch. NOTE: applies only to actual
assembler instructions; otherwise, returns false and generates an
error message.
- bool INST::isJump(void)
- returns true if the current
instruction is an unconditional jump. NOTE: applies only to actual
assembler instructions; otherwise, fails with an error message.
- bool INST::isCall(void)
- returns true if the current
instruction is a subroutine call. NOTE: applies only to actual assembler
instructions; otherwise, returns false and generates an
error message.
- bool INST::isReturn(void)
- returns true if the current
instruction is a return from subroutine. NOTE: applies only to actual
assembler instructions; otherwise, returns false and generates an
error message.
- bool INST::isNop(void)
- returns true if the current
instruction is a NOP. NOTE: applies only to actual assembler instructions
and macros; otherwise, returns false and generates an
error message.
- bool INST::isCTI(void)
- returns true if the current
instruction is a control transfer instruction (branch, jump, call or
return). NOTE: applies only to actual assembler instructions and macros;
otherwise, returns false and generates an
error message.
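Since most of these predicates report an error when applied to labels or directives, a tool will typically test isAsm() first. The fragment below sketches that guard. The INST class is a hypothetical stand-in modelling two predicates only, and countCTI() is an illustration helper; whether macros also satisfy isAsm() in SALTO is not stated here, so the sketch ignores them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for SALTO's INST, modelling only two predicates.
class INST {
public:
    enum Kind { LABEL, PSEUDO, ASM_PLAIN, ASM_CTI };
    explicit INST(Kind k) : kind(k) {}
    bool isAsm(void) { return kind == ASM_PLAIN || kind == ASM_CTI; }
    // The real method reports an error on non-assembler input; the mock just returns false.
    bool isCTI(void) { return kind == ASM_CTI; }
private:
    Kind kind;
};

// Hypothetical helper: count the control transfer instructions in a list.
unsigned int countCTI(const std::vector<INST *> &list) {
    unsigned int n = 0;
    for (std::size_t i = 0; i < list.size(); ++i) {
        INST *st = list[i];
        // Check isAsm() first so that isCTI() is never called on a label or directive.
        if (st->isAsm() && st->isCTI()) ++n;
    }
    return n;
}
```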
New instructions can be created using the following set of functions:
- INST *newAsm(char *opcode, unsigned int numOps = 0,
...)
- returns a new assembler instruction with mnemonic opcode
and numOps operands, built using the first instruction format
specified for that mnemonic in the machine description file. Instruction
operands are passed in the optional argument part as C++ references to
class OperandInfo objects. A reservation table matching the operands
of the instruction is also created and attached to the instruction. NOTE:
each operand is passed as a separate argument.
- INST *newAsm(char *opcode, char *format, unsigned int
numOps = 0, ...)
-
returns a new assembler instruction with
mnemonic opcode and numOps operands, built using the
specified instruction format. A matching instruction declaration must
exist in the machine description file. Operands are passed in the
optional argument part as C++ references to class OperandInfo
objects. A reservation table matching the operands of the instruction is
also created and attached to the instruction. NOTE: each operand is passed
as a separate argument.
- INST *newLabel(char *name)
- returns a new label with the
specified name. NOTE: name should not contain the trailing
colon character (`:').
- INST *newPseudo(char *text)
- returns a new
pseudo-instruction whose textual representation (including the leading
dot) is text.
The duplication of an instruction is implemented through the method
`copy()':
- INST *INST::copy(void)
- returns a copy (a clone) of the current
instruction.
The cycle at which the instruction is to be issued can be directly
manipulated through the following two methods:
- int INST::getCycle(void)
- extracts the cycle at which the
instruction will be issued. By convention, a negative value indicates
that the instruction has not been scheduled yet.
- void INST::setCycle(int c)
- sets the cycle at which the
instruction will be issued.
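The convention that a negative cycle means "not yet scheduled" can be used as in the sketch below. The INST class is a hypothetical stand-in; its initial cycle of -1 is an assumption made for the example, and scheduleRemaining() is an illustration helper rather than a SALTO primitive.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for SALTO's INST; starting at -1 is an assumption of this sketch.
class INST {
    int cycle;
public:
    INST() : cycle(-1) {}
    int getCycle(void) { return cycle; }
    void setCycle(int c) { cycle = c; }
};

// Hypothetical helper: give each unscheduled instruction the next free cycle
// after the latest cycle already in use. Returns how many were assigned.
int scheduleRemaining(std::vector<INST *> &list) {
    int next = 0;
    for (std::size_t i = 0; i < list.size(); ++i) {
        int c = list[i]->getCycle();
        if (c >= next) next = c + 1;          // skip past cycles already taken
    }
    int assigned = 0;
    for (std::size_t i = 0; i < list.size(); ++i) {
        if (list[i]->getCycle() < 0) {        // negative value: not scheduled yet
            list[i]->setCycle(next++);
            ++assigned;
        }
    }
    return assigned;
}
```

In a real tool, BB::orderAccordingToCycles() would then be called to reorder the block according to the cycles set this way.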
The INST interface provides access to the textual representation of
instructions through the following methods:
- char *INST::getName(void)
- called on an assembly instruction or
a macro, returns the mnemonic without the arguments. On a label,
returns the textual representation of its symbol, without the trailing
colon. On a pseudo-instruction (assembler directive), returns the name
of the directive, including the leading dot (`.').
- char *INST::getAsmInfo(void)
- returns the contents of the textual
information field attached to the assembler instruction or macro-operation
definition matching the current instruction. Note: this method should
only be called on assembler instructions and macros.
- char *INST::unparse(void)
- returns the unparsed (external)
representation of the current instruction irrespective of attributes
attached to that instruction. The memory space for the string is allocated
through a call to new char[] and should be released after use.
- char *INST::unparse(char *st)
- stores in st the
unparsed (external) representation of the current instruction,
irrespective of attributes attached to that instruction. The size of the
memory area pointed to by st must be sufficient to hold the
unparsed text.
- void INST::produceCode(FILE *outFile)
- writes the
unparsed representation of the current instruction to the file
outFile. If an attribute of type UNPARSE_ATT containing a
pointer to a character string is attached to the instruction, only
that string is written, instead of the textual representation of the
instruction. Comments specified through attributes of type
COMMENT_ATT attached to the instruction are printed in their order of
attachment after the textual representation of the instruction.
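Because the argument-less INST::unparse() hands back memory obtained with new char[], the caller is responsible for releasing it with delete[]. One way to contain that obligation is sketched below. The INST class is a hypothetical stand-in that mimics only the allocation behaviour described above, and unparseToString() is an illustration helper, not part of SALTO.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in: mimics only the documented allocation behaviour of unparse().
class INST {
    std::string text;
public:
    explicit INST(const char *t) : text(t) {}
    char *unparse(void) {
        char *buf = new char[text.size() + 1];   // caller must release with delete[]
        std::strcpy(buf, text.c_str());
        return buf;
    }
};

// Hypothetical helper: copy the unparsed text into a std::string and free the buffer.
std::string unparseToString(INST &st) {
    char *raw = st.unparse();
    std::string result(raw);
    delete[] raw;                                // matches the new char[] in unparse()
    return result;
}
```

Wrapping the call this way keeps the new[]/delete[] pairing in one place, so the rest of a tool never has to remember to free the buffer.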
The following methods provide the means of extracting and replacing
instruction operands. They should only be called on actual assembly
instructions. (Operand abstractions (class OperandInfo) are further
discussed in section 3.4.)
- unsigned int INST::numberOfOperands(void)
- returns the number of
operands attached to the instruction.
- operand *INST::getRawOperand(unsigned int pos)
- returns
the low-level representation of the pos-th operand of the
instruction. pos must be in the range 0..this ->
numberOfOperands() - 1, otherwise an error message is generated and the
call returns NULL.
- void INST::setRawOperand(unsigned int pos, operand
*op)
- sets the pos-th low-level operand of the
instruction to op. pos must be in the range 0..this
-> numberOfOperands() - 1, otherwise an error message is
generated and the call has no effect.
- OperandInfo &INST::getOperand(unsigned int pos)
- returns
the abstraction of the pos-th operand of the
instruction. pos must be in the range 0..this ->
numberOfOperands() - 1, otherwise an error message is
generated and the call returns a reference to an operand of type
unknownOpdT.
- void INST::setOperand(unsigned int pos, OperandInfo
&op)
- sets the pos-th operand of the instruction from
the operand abstraction op. pos must be in the range
0..this -> numberOfOperands() - 1, otherwise an error
message is generated and the call has no effect.
- int INST::numberOfInput(void)
- returns the number of resources
read by the current instruction. NOTE: applies only to actual
assembler instructions.
- int INST::numberOfOutput(void)
- returns the number of resources
written by the current instruction. NOTE: applies only to actual
assembler instructions.
- int INST::numberOfUse(void)
- returns the number of resources
used by the current instruction. NOTE: applies only to actual
assembler instructions.
- res_ref *INST::getInput(int pos)
- returns the
pos-th resource read by the current instruction; pos must be
in the range 0..numberOfInput() - 1. NOTES: 1) the order of
resources returned by getInput() does not necessarily match the
chronological order in which they are accessed by the instruction; 2) this
method applies only to actual assembler instructions.
- res_ref *INST::getOutput(int pos)
- returns the
pos-th resource written by the current instruction; pos must
be in the range 0..numberOfOutput() - 1. NOTES: 1) the order of
resources returned by getOutput() does not necessarily match the
chronological order in which they are accessed by the instruction; 2) this
method applies only to actual assembler instructions.
- res_ref *INST::getUse(int pos)
- returns the pos-th
resource used by the current instruction; pos must be in the range
0..numberOfUse() - 1. NOTES: 1) the order of resources returned by
getUse() does not necessarily match the chronological order in which
they are used by the instruction; 2) this method applies only to actual
assembler instructions.
- void INST::setInput(int pos, res_ref *r)
- updates
the description of the pos-th resource read by the
instruction; pos must be in the range 0..numberOfInput() - 1.
NOTE: applies only to actual assembler instructions.
- void INST::setOutput(int pos, res_ref *r)
-
updates the description of the pos-th resource written by
the instruction; pos must be in the range 0..numberOfOutput() -
1. NOTE: applies only to actual assembler instructions.
- void INST::setUse(int pos, res_ref *r)
- updates
the description of the pos-th resource used by the
instruction; pos must be in the range 0..numberOfUse() - 1.
NOTE: applies only to actual assembler instructions.
- void INST::getResUsageMode(res_ref *r, int *tab, int
len)
- fills the integer array tab of size len
with the markers indicating the nature of references made by the current
instruction to the resource r at each cycle of its execution.
Non-zero entries in the array correspond to the cycles at which the
resource is referenced by the instruction. NOTE: applies only to actual
assembler instructions; otherwise, fails with an error message.
- void INST::setResUsageMode(res_ref *r, int *tab, int
len)
- sets the use information of resource *r from
the integer array *tab of size len containing the markers
indicating the nature of references made by the current instruction to the
resource *r at each cycle of its execution. NOTE: applies only to
actual assembler instructions; otherwise, fails with an error message.
- int INST::noReorder(void)
- returns TRUE (non-zero) if the
instruction cannot be moved, e.g., if it lies in a delay slot.
NOTE: applies only to actual assembler instructions; otherwise, fails with
an error message.
- enum dependence INST::dependsOn(INST *ii, bool
noCtrlFlow, bool noMem)
-
returns the type of the data
dependence between instruction ii and current instruction,
assuming that ii is executed before the current
instruction. By default, both noCtrlFlow and noMem are
not set (value false). The value returned is one of NONE,
RAW, WAW and WAR (see section 3.2 above.)
If the flag noCtrlFlow is set, INST::dependsOn(...) does not
check whether instruction ii follows the current instruction in the
control flow; when the flag is not set and that check fails, NONE is
returned. If the flag noMem
is set, no tests are made for memory dependences. N.B.: both instructions
(this and ii) should belong to the same basic block.
- int INST::getDelay(INST *ii)
- determines the minimum
delay between current instruction and instruction *ii that will
solve all data aliasing conflicts. If the function int
updateDelay(int delay, INST *first, INST *last, enum
dependence dep) is defined, it is called with first == this
and last == ii to account for the write-back bypass, if any.
- int INST::getResDelay(INST *ii)
- determines the minimum
delay between current instruction and instruction *ii that will
solve all resource conflicts, assuming that there is exactly one
instance of every functional unit. If the function int
updateDelay(int delay, INST *first, INST *last, enum
dependence dep) is defined either in
the user's tool, or in the
target-specific module of SALTO (searched in that order), it is called
with first == this and last == ii to account for the
write-back bypass, if any.
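A minimal sketch of how INST::getDelay() and INST::setCycle() might be combined to derive issue cycles for the instructions of one basic block is given below. Everything in it is an assumption made for illustration: the INST class is a stand-in whose getDelay() applies a toy rule based on register numbers, and assignEarliestCycles() is a hypothetical helper. Resource conflicts, which INST::getResDelay() would cover, are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for SALTO's INST with a toy dependence rule.
class INST {
    int dst, src, latency, cycle;
public:
    INST(int d, int s, int lat) : dst(d), src(s), latency(lat), cycle(-1) {}
    int getCycle(void) { return cycle; }
    void setCycle(int c) { cycle = c; }
    // Mock rule: ii must wait latency cycles if it reads the register this one writes.
    int getDelay(INST *ii) { return ii->src == dst ? latency : 0; }
};

// Hypothetical helper: walk the block in program order and give each instruction
// the earliest cycle that respects the delay from every earlier instruction.
void assignEarliestCycles(std::vector<INST *> &block) {
    for (std::size_t i = 0; i < block.size(); ++i) {
        int earliest = 0;
        for (std::size_t j = 0; j < i; ++j) {
            // block[j] plays the role of "this", block[i] the role of ii.
            int ready = block[j]->getCycle() + block[j]->getDelay(block[i]);
            if (ready > earliest) earliest = ready;
        }
        block[i]->setCycle(earliest);
    }
}
```

A delay of zero still keeps an instruction from being issued before its predecessors, so program order is preserved. A real scheduler would also take the maximum with getResDelay() and could then call BB::orderAccordingToCycles() and BB::addNecessaryNops().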
Erven Rohou
Fri Oct 17 09:15:29 MET DST 1997