Context
Computing platforms are becoming increasingly complex, with growing heterogeneity at different scales: various processing elements (PEs) coexist in the same package of complex Multi-Processor Systems-on-Chip (MPSoCs), inside single-node server cards, or in server racks integrating several heterogeneous servers. In all cases, different and evolving combinations of CPUs, GPUs, FPGAs, and other domain-specific accelerators like Google TPUs are progressively being deployed in the data center. Complexity and heterogeneity also come from communication infrastructures, which include AXI buses and Networks-on-Chip (NoCs) inside SoCs, PCIe links at the server level to connect different PEs, either cache-coherent or not, and top-of-rack switches to connect different servers together [1]. In this context, information crosses more system layers than ever and produces logical and electrical interactions among them. In multi-tenant cloud scenarios with high rates of hardware resource sharing, this opens the door to new hardware vulnerabilities that can be exploited remotely from software.
In this PhD, we focus on the security of reconfigurable cloud computing [1,2], as Field Programmable Gate Arrays (FPGAs) are increasingly deployed in the data center. In this scenario, independent users can share the same FPGA fabric as a reconfigurable accelerator board deployed in server racks and connected through PCIe interfaces with high-end CPUs and other accelerators like GPUs. As a result, the attack surface of the more standard cloud, composed of, e.g., CPUs and GPUs, is now extended by the new attack surface brought by the adoption of FPGAs. A specific focus in this work will be on software-accessible electrical-level attacks through embedded sensors (e.g., power or temperature), while studying original side-channel sources arising from the complexity (heterogeneity) of cloud platforms, multi-tenancy, resource sharing, and FPGA reconfigurability.
State of the art
Attacks on hardware no longer require complex setups and short-range access to victims, as used to be the case for side-channel attacks (SCA): new vulnerabilities at the interface between software and different hardware layers already allow attacks to be mounted remotely from software.
At the electrical level, adversaries sharing an FPGA fabric with the victim can exploit their electrical coupling with the victim by abusing standard on-chip logic to mount SCA, covert-channel, and fault-injection (FI) attacks [2]. Remote SCA vulnerabilities arise from leakage sources that include shared on-chip resources like the Power Distribution Network (PDN) [3], routing elements like long wires [4], as well as medium wires, multiplexers, and logic resources [5]. In these attacks, the attackers deploy malicious constructs in the FPGA that either behave as voltage-delay sensors, capturing dynamic power supply variations through the PDN that correlate with the victim computations [6], or exploit coupling among adjacent logic resources [4,5]. Malicious constructs include Ring Oscillators (ROs) and Time-to-Digital Converters (TDCs) [1,2,6]. Reported attacks include: (i) breaking cryptographic implementations running in the same FPGA fabric [3] or on CPUs in the same SoC [7], using TDC sensors and leakage from wires [8]; (ii) recovering model secrets and inference inputs from Deep Neural Network (DNN) accelerators [9,10,11]; and (iii) uniquely identifying FPGAs by generating PDN-based signatures through PDN impedance measurement with on-chip digital resources (ROs to create noise and TDCs to measure power consumption) [12]. Power measurements are also directly obtainable from platform management resources through standard APIs: the Intel Running Average Power Limit (RAPL) interface has been used as a power side channel to breach Intel SGX [13].
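To make the voltage-delay sensing principle mentioned above more concrete, the following is a toy first-order model of a TDC-style sensor: a pulse travels along a delay line, and the number of stages it crosses in one clock period is the reading. The linear delay model and every parameter value (stage delay, clock period, nominal voltage, sensitivity, line length) are illustrative assumptions, not figures taken from any device or from the cited works.

```python
def tdc_reading(v_supply,
                clock_period_ps=4000.0,   # hypothetical sampling clock period
                stage_delay_ps=25.0,      # hypothetical per-stage delay at nominal voltage
                v_nominal=0.85,           # hypothetical nominal core voltage (V)
                sensitivity=2.0,          # hypothetical relative delay change per volt
                n_stages=256):            # hypothetical delay-line length
    """Return how many delay-line stages a pulse crosses in one clock period.

    Toy assumption: the stage delay grows linearly as the supply voltage
    drops below its nominal value, so a voltage drop yields a lower reading.
    """
    delay = stage_delay_ps * (1.0 + sensitivity * (v_nominal - v_supply))
    if delay <= 0:
        raise ValueError("supply voltage outside the range of this toy model")
    return min(n_stages, int(clock_period_ps // delay))


if __name__ == "__main__":
    # A victim drawing more current lowers the shared supply voltage,
    # which the co-located sensor observes as a smaller stage count.
    for v in (0.90, 0.85, 0.80):
        print(f"V = {v:.2f} V -> reading = {tdc_reading(v)} stages")
```

In this simplified picture, a sequence of such readings sampled at every clock cycle plays the role of the power trace used by the attacker.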
The aforementioned issues pushed FPGA tool vendors and cloud service providers to detect and prevent the use of certain circuits like ROs. However, recent preliminary work shows that regular, apparently harmless circuits can behave as leakage sensors under specific configurations. For instance, benign circuits like adders can detect voltage drops through their carry propagation logic when operated at timing-critical regions, which is enough to perform Correlation Power Analysis (CPA) attacks that recover AES keys [14].
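As an illustration of the correlation step behind a CPA attack, the sketch below recovers one AES key byte from simulated traces. It is not the attack of [14]: the traces are synthetic (Hamming weight of the first-round S-box output plus Gaussian noise), and the function names, noise level, and trace count are assumptions chosen for the example. On real hardware, the samples would come from an on-chip sensor instead.

```python
import random


def aes_sbox():
    """Build the AES S-box (GF(2^8) inverse followed by the affine map)."""
    def rotl8(x, s):
        return ((x << s) | (x >> (8 - s))) & 0xFF

    sbox = [0] * 256
    p = q = 1
    while True:
        # p <- p * 3 in GF(2^8)
        p = (p ^ (p << 1) ^ (0x1B if p & 0x80 else 0)) & 0xFF
        # q <- q / 3, which keeps the invariant p * q == 1
        q ^= (q << 1) & 0xFF
        q ^= (q << 2) & 0xFF
        q ^= (q << 4) & 0xFF
        if q & 0x80:
            q ^= 0x09
        sbox[p] = q ^ rotl8(q, 1) ^ rotl8(q, 2) ^ rotl8(q, 3) ^ rotl8(q, 4) ^ 0x63
        if p == 1:
            break
    sbox[0] = 0x63  # 0 has no multiplicative inverse
    return sbox


HW = [bin(x).count("1") for x in range(256)]  # Hamming-weight lookup table


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return 0.0 if vx == 0 or vy == 0 else cov / (vx * vy) ** 0.5


def simulate_traces(key_byte, n_traces, noise_sigma, rng, sbox):
    """Synthetic leakage: HW(S-box(plaintext XOR key)) plus Gaussian noise."""
    plaintexts = [rng.randrange(256) for _ in range(n_traces)]
    samples = [HW[sbox[p ^ key_byte]] + rng.gauss(0.0, noise_sigma)
               for p in plaintexts]
    return plaintexts, samples


def cpa_recover_byte(plaintexts, samples, sbox):
    """Return the key guess whose predicted leakage correlates best."""
    def score(guess):
        model = [HW[sbox[p ^ guess]] for p in plaintexts]
        # Absolute value: a real sensor may respond with inverted polarity.
        return abs(pearson(model, samples))
    return max(range(256), key=score)


if __name__ == "__main__":
    sbox = aes_sbox()
    rng = random.Random(1)
    pts, samples = simulate_traces(0x2B, 1000, 2.0, rng, sbox)
    print(hex(cpa_recover_byte(pts, samples, sbox)))
```

A full attack repeats this per key byte and per time sample of the trace; the sketch keeps a single sample per trace to isolate the correlation step.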
The typical threat model in attacks on cloud FPGAs assumes spatial and temporal multi-tenancy. This means that attackers successfully co-locate with victims in the same FPGA at the same time, and are therefore aware of the presence of victims in that FPGA when they launch their attacks. However, achieving this co-location is non-trivial. To fingerprint cloud FPGA infrastructures, researchers have started exploring methods that include: (i) exploiting the on-board DRAM to generate Physical Unclonable Functions (PUFs) to extract unique and stable FPGA fingerprints [15]; (ii) analyzing PCIe contention among FPGA boards in a server during simultaneous use of the PCIe bus to gain insights on how FPGA instances are physically co-located within a server [16]; (iii) generating PDN-based signatures through impedance measurement to uniquely distinguish different FPGAs due to process variation [12]; and (iv) using a communication side channel by crafting non-malicious FPGA accelerators able to sniff FPGA-host communication links, collect performance traces, and use different Machine Learning (ML) classifiers to reveal the unique communication fingerprints of co-located accelerators [17].
Scientific challenges and objectives of the thesis
This thesis will focus on remote SCA and data extraction in multi-tenant reconfigurable cloud computing scenarios. It will tackle scientific challenges in: (i) the remote identification and extraction of valuable leakage information through malicious constructs like embedded power sensors and through existing built-in counters and sensors for platform management; (ii) the analysis of the multi-source information coming from these sensors and counters; and (iii) the runtime detection of suspicious attacker activities.
In the context described, several open questions remain to fully unlock remote side-channel attacks on FPGAs. These include how to efficiently and reliably exfiltrate leaked data while circumventing the increasing countermeasures imposed by cloud providers: e.g., which stealthy, benign circuits can be used, how to operate them, and how to exploit the captured data.
Regarding data exploitation, although embedded power sensors have been proven to be a powerful attack vector, important challenges remain, such as the impact of the sensor location and its proximity to the victim design. Also, the cooperation among different instances of the same sensor and among different types of sensors has not yet been thoroughly studied. For example, we are not aware of any work that combines information from sources such as CPU performance counters and the energy management interface exposed to software (e.g., temperature sensors) with information from custom sensors built in the FPGA logic.
As for the analysis, ML-based techniques offer important means to process and correlate a large amount of information from different sources. This information can be exploited to develop more sophisticated techniques and detect possible leakage vulnerabilities. We will evaluate which sources of information can be obtained and exploited using analysis techniques including but not limited to ML, e.g., power traces from the embedded sensors, performance counter values and their thresholds, execution times, and temperature information from the platform and energy management infrastructure.
Finally, with the insights gained, we will be able to identify potentially vulnerable information sources and features that can be exploited to extract secret data, as well as patterns used by attackers to access this information. This can lay the groundwork for runtime detection of ongoing attacks.
Methodology
- We start this work by studying the specificities of the reconfigurable cloud computing scenario, including its architecture and multi-tenancy support. The objective is the identification and extraction of potential information leakage sources from embedded power or temperature sensors, as well as other sources that may remain hidden beneath the complexity and heterogeneity of the reconfigurable cloud computing scenario.
- The information gathered in the previous step will come from different sources, thus capturing different side channels that can represent different physical or logical features. The next step of the work addresses the study of advanced analysis techniques, likely based on machine learning, to exploit these different leakage sources to extract confidential information about the users and/or the datacenter structure or resource allocation mechanisms.
- In the third step, we will analyze the various data sources from sensors and counters to determine features indicative of malicious activity. A possible line to investigate is a combined SW/HW approach that gathers runtime information from different platform sensors and counters (from the CPU, FPGA, etc.) to be analyzed in real time. The approach will consider integrating runtime monitoring SW in the server CPU and HW in the FPGA shell, which acts as the interface with the PCIe link and orchestrates user access to the reconfigurable logic of the FPGA fabric.
- With this SW/HW runtime monitoring infrastructure integrated, we will revisit the previous steps to examine whether there are new dependencies and relationships between other platform features and the attacks, so that the monitoring infrastructure can be updated to detect and respond to evolving threats. This iterative process will allow us to continuously refine our detection methods and ensure robust protection against various types of attacks, both existing and emerging.
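One possible starting point for the runtime analysis described in the steps above is a rolling z-score check on a single counter or sensor stream. The class name, window size, and threshold below are hypothetical, and this is only a simple baseline sketch, not the detection method the thesis will develop.

```python
import math
from collections import deque


class RollingZScoreDetector:
    """Flag samples that deviate strongly from a rolling baseline."""

    def __init__(self, window=50, threshold=4.0, min_samples=10):
        self.history = deque(maxlen=window)  # recent non-flagged samples
        self.threshold = threshold           # |z| above this is suspicious
        self.min_samples = min_samples       # warm-up before any decision

    def update(self, value):
        """Process one sample; return True if it looks anomalous."""
        flagged = False
        if len(self.history) >= self.min_samples:
            n = len(self.history)
            mean = sum(self.history) / n
            std = math.sqrt(sum((x - mean) ** 2 for x in self.history) / n)
            # A perfectly constant baseline (std == 0) never flags here.
            if std > 0 and abs(value - mean) / std > self.threshold:
                flagged = True
        if not flagged:
            # Keep flagged samples out of the baseline so that an ongoing
            # anomaly does not gradually become the new "normal".
            self.history.append(value)
        return flagged


if __name__ == "__main__":
    detector = RollingZScoreDetector()
    # Hypothetical counter stream: steady activity, then a sudden burst.
    stream = [100 + (i % 5) for i in range(40)] + [500]
    for t, sample in enumerate(stream):
        if detector.update(sample):
            print(f"sample {t}: value {sample} flagged")
```

In a combined SW/HW setting, one such detector per source (CPU counters, FPGA shell sensors, and so on) could feed a higher-level decision stage; how to fuse the sources is precisely one of the open questions stated above.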
[1] M. Stojilović, K. Rasmussen, F. Regazzoni, M. B. Tahoori, and R. Tessier, “A Visionary Look at the Security of Reconfigurable Cloud Computing,” Proceedings of the IEEE, pp. 1–24, 2023.
[2] J. Szefer and R. Tessier, Eds., Security of FPGA-Accelerated Cloud Computing Environments. Springer, 2024.
[3] M. Zhao and G. E. Suh, “FPGA-Based Remote Power Side-Channel Attacks,” in IEEE Symp. S&P, 2018, pp. 229–244.
[4] I. Giechaskiel, K. Eguro, and K. B. Rasmussen, “Leakier Wires: Exploiting FPGA Long Wires for Covert- and Side-channel Attacks,” ACM Transactions on Reconfigurable Technology and Systems, vol. 12, no. 3, ACM, 2019, pp. 1–29.
[5] I. Giechaskiel and J. Szefer, “Information Leakage from FPGA Routing and Logic Elements,” in 2020 IEEE/ACM International Conference On Computer Aided Design (ICCAD), Nov. 2020, pp. 1–9.
[6] S. Moini et al., “Voltage Sensor Implementations for Remote Power Attacks on FPGAs,” ACM Transactions on Reconfigurable Technology and Systems, July 2022.
[7] J. Gravellier, J.-M. Dutertre, Y. Teglia, P. L. Moundi, and F. Olivier, “Remote Side-Channel Attacks on Heterogeneous SoC,” in Smart Card Research and Advanced Applications, Springer, 2020, pp. 109–125.
[8] C. Ramesh et al., “FPGA Side Channel Attacks without Physical Access,” in 2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), IEEE, Apr. 2018, pp. 45–52.
[9] Zhang, R. Yasaei, H. Chen, Z. Li, and M. A. A. Faruque, “Stealing Neural Network Structure Through Remote FPGA Side-Channel Analysis,” IEEE Trans. Inf. Forensics Secur., vol. 16, 2021, pp. 4377–4388.
[10] May Myat Thu, Maria Méndez Real, Maxime Pelcat, and Philippe Besnier, “You Only Get One-Shot: Eavesdropping Input Images to Neural Network by Spying SoC-FPGA Internal Bus,” in Proceedings of the 18th International Conference on Availability, Reliability and Security (ARES), 2023, pp. 1–7.
[11] May Myat Thu, Maria Méndez Real, Maxime Pelcat, and Philippe Besnier, “Bus Electrocardiogram: Vulnerability of SoC-FPGA Internal AXI Bus to Electromagnetic Side-Channel Analysis,” in 2023 International Symposium on Electromagnetic Compatibility (EMC Europe), IEEE, 2023, pp. 1–6.
[12] H. Zhu, W. Cao, and X. Zhang, “PDNSig: Identifying Multi-Tenant Cloud FPGAs with Power Distribution Network-Based Signatures,” in 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD), Oct. 2023, pp. 1–8.
[13] M. Lipp et al., “PLATYPUS: Software-based Power Side-Channel Attacks on X86,” in 2021 IEEE Symposium on Security and Privacy (SP), IEEE, 2021, p. 17.
[14] D. R. E. Gnad et al., “Stealthy Logic Misuse for Power Analysis Attacks in Multi-Tenant FPGAs,” in 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE), Feb. 2021, pp. 1012–1015.
[15] S. Tian, W. Xiong, I. Giechaskiel, K. Rasmussen, and J. Szefer, “Fingerprinting Cloud FPGA Infrastructures,” in The 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, in FPGA ’20, ACM, 2020, pp. 58–64.
[16] S. Tian, I. Giechaskiel, W. Xiong, and J. Szefer, “Cloud FPGA Cartography using PCIe Contention,” in 2021 IEEE 29th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), May 2021, pp. 224–232.
[17] C. Fang et al., “Gotcha! I Know What You Are Doing on the FPGA Cloud: Fingerprinting Co-Located Cloud FPGA Accelerators via Measuring Communication Links,” in Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, in CCS ’23, ACM, 2023, pp. 2024–2037.