Architecture/Compilation

 

PACCMAN

A compiler/simulator suite for cryptography ASIPs

IATO, the IA64 Toolkit

IATO, the IA64 Toolkit, is a flexible environment for analyzing, emulating, or simulating binary executables of the IA64 Instruction Set Architecture (ISA).


Out-of-order execution on IA64 microarchitectures

We are investigating a novel register management policy designed to operate smoothly with a fully predicated ISA. The policy is based on an intermediate representation called the Translation Register Buffer (TRB). The TRB mechanism, which translates a logical register into a physical register, is shown to remain effective when an instruction is canceled by a predicate.

Related publications:

 

Decoupled Architectures

The performance demands of embedded applications will lead to the use of dynamic execution in embedded processors in the next few years. However, complete out-of-order superscalar cores are still expensive in terms of silicon area and power dissipation. Decoupled architectures provide a more limited form of dynamic execution that is simpler to implement. We have studied how well decoupled architectures suit embedded applications.

Related publications:

 

Power / Performance Tradeoffs

Power consumption is becoming a major issue on most processors. We are exploring the impact of compiler optimizations on power consumption. We have shown that there is a threshold above which ILP-enhancing optimizations yield diminishing returns in energy reduction. Our analysis attributes this mainly to the limited instruction-level parallelism available in applications.
We are also exploring the use of reconfigurable hardware to decrease power consumption without impacting performance. The cache hierarchy is a typical example of such a power/performance tradeoff. On some processors, the cache accounts for up to 50% of the total chip area and about 80% of the total transistor count, making the cache hierarchy a critical source of power dissipation. One way to tackle this problem is to use reconfigurable caches whose size and associativity can adapt to the workload characteristics. We are exploring fine-grain reconfiguration strategies that identify phases during program execution and reconfigure the cache on a per-phase basis.
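
As an illustration only, per-phase reconfiguration can be sketched in C as two small decisions: detecting a phase change and picking a cache configuration for the new phase. Everything below is an assumption made for the example and is not taken from the project: the 8-way limit, the use of miss rate as the phase signal, the threshold and slack parameters, and the function names phase_changed and choose_ways.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WAYS 8  /* hypothetical: the cache can enable 1..8 ways */

/* Assumed phase detector: a new phase starts when the miss rate of the
 * current sampling interval differs from the previous interval by more
 * than `threshold`. */
bool phase_changed(double prev_miss_rate, double cur_miss_rate, double threshold)
{
    double delta = cur_miss_rate - prev_miss_rate;
    if (delta < 0.0)
        delta = -delta;
    return delta > threshold;
}

/* Assumed selection policy: return the smallest number of enabled ways
 * whose miss rate stays within `slack` of the full-size cache.
 * miss_rate[w - 1] is the miss rate measured with w ways enabled. */
int choose_ways(const double miss_rate[MAX_WAYS], double slack)
{
    assert(miss_rate != NULL && slack >= 0.0);
    double full = miss_rate[MAX_WAYS - 1];
    for (int w = 1; w < MAX_WAYS; w++) {
        if (miss_rate[w - 1] <= full + slack)
            return w;   /* smaller cache, nearly the same miss rate */
    }
    return MAX_WAYS;    /* nothing smaller is good enough */
}
```

In such a scheme, choose_ways would be called only when phase_changed reports a new phase, so the cost of reconfiguring the cache is paid once per phase rather than once per interval.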

Related publications:

 

Exploitation of special-purpose instruction sets in C programs

Many modern processors provide extensions to their instruction set specifically designed for computation-intensive multimedia applications. These multimedia extensions are usually provided as intrinsics that can be inserted in C code. Inserting the instructions directly in assembly code is possible but requires good knowledge of both processor architecture and compilation techniques. Moreover, such an approach does not lead to portable code. Using intrinsics in C source code still requires code transformations such as vectorization to expose code regions where data parallelism can be exploited. We have developed a retargetable C-to-C preprocessor called SWARP that searches for portions of code suited to multimedia instructions and automatically inserts their intrinsic equivalents. SWARP is based on modern code analysis and code transformation techniques (dependence analysis, alias analysis, loop transformation, vectorization, and so on) and on pattern matching for recognizing and replacing suitable code patterns.
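
To make the kind of rewriting described above concrete, here is a hand-written sketch in C of a scalar loop and an intrinsic-based equivalent. It is not SWARP output. The choice of the SSE2 extension, the saturating-add kernel, and the function names add_sat_scalar and add_sat_simd are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* Original scalar loop: element-wise saturating add of 16-bit samples. */
void add_sat_scalar(int16_t *dst, const int16_t *a, const int16_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int32_t s = (int32_t)a[i] + (int32_t)b[i];
        if (s > INT16_MAX) s = INT16_MAX;
        if (s < INT16_MIN) s = INT16_MIN;
        dst[i] = (int16_t)s;
    }
}

/* Hand-vectorized equivalent: 8 samples per iteration with SSE2,
 * scalar code for the remainder. */
void add_sat_simd(int16_t *dst, const int16_t *a, const int16_t *b, size_t n)
{
    assert(n == 0 || (dst != NULL && a != NULL && b != NULL));
    size_t i = 0;
#if defined(__SSE2__)
    for (; i + 8 <= n; i += 8) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        /* One instruction replaces the add, compare, and clamp sequence. */
        _mm_storeu_si128((__m128i *)(dst + i), _mm_adds_epi16(va, vb));
    }
#endif
    /* Remainder, or the whole array when SSE2 is unavailable. */
    add_sat_scalar(dst + i, a + i, b + i, n - i);
}
```

The preprocessor guard shows the portability problem in miniature: the intrinsic version is tied to one instruction-set extension, which is why recognizing the pattern in plain C and retargeting it automatically is attractive.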

Related publications:

 

High speed instruction-set simulation

Instruction-set simulation can be used to evaluate different instruction-set architectures during architecture exploration, to validate a compiler back-end, or to test, tune, and debug programs on a user-friendly PC or workstation rather than on actual silicon, which might not even exist yet. The increasing size and complexity of embedded software require extremely fast instruction-set simulation. Compiled instruction-set simulation is potentially much faster than interpretation, but it has a start-up cost due to the generation and compilation of the simulator. This start-up cost is often seen as a major drawback and has limited the adoption of compiled instruction-set simulation. We have designed a new approach to compiled instruction-set simulation that aims to reconcile flexibility, retargetability, high simulation speed, and a small start-up cost. This approach was implemented in ABSCISS, a generator of compiled instruction-set simulators that works at the assembler level.
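
The difference between the two styles can be sketched in C on a toy machine. This is not how ABSCISS works internally. The one-accumulator ISA, its three opcodes, and the function names interpret and generate_simulator are all hypothetical, chosen only to show where the decoding work happens.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical toy ISA: one accumulator, three opcodes. */
enum opcode { OP_LOADI, OP_ADDI, OP_MULI };
struct insn { enum opcode op; int imm; };

/* Interpretive simulation: each instruction is decoded every time
 * it is executed. */
int interpret(const struct insn *prog, size_t n)
{
    int acc = 0;
    for (size_t pc = 0; pc < n; pc++) {
        switch (prog[pc].op) {
        case OP_LOADI: acc = prog[pc].imm;  break;
        case OP_ADDI:  acc += prog[pc].imm; break;
        case OP_MULI:  acc *= prog[pc].imm; break;
        }
    }
    return acc;
}

/* Compiled simulation, generation step: decode the program once and emit
 * C source with one statement per target instruction. Returns the number
 * of characters written, or -1 if the buffer is too small. */
int generate_simulator(const struct insn *prog, size_t n, char *out, size_t cap)
{
    static const char *const assign[] = { "=", "+=", "*=" };
    assert(prog != NULL && out != NULL);
    int len = snprintf(out, cap, "int simulate(void)\n{\n    int acc = 0;\n");
    for (size_t pc = 0; pc < n && len >= 0 && (size_t)len < cap; pc++)
        len += snprintf(out + len, cap - (size_t)len, "    acc %s %d;\n",
                        assign[prog[pc].op], prog[pc].imm);
    if (len >= 0 && (size_t)len < cap)
        len += snprintf(out + len, cap - (size_t)len, "    return acc;\n}\n");
    return (len >= 0 && (size_t)len < cap) ? len : -1;
}
```

The generated source still has to be compiled by a host C compiler before it can run. Generation plus compilation is the start-up cost mentioned above, and the absence of a decoding switch in the generated code is where the speed advantage comes from.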

Related publications: