Latency-Driven Container Network Optimization in Edge Industrial IoT

Published on
Team (or department if the offer is not attached to a team)
Thesis start date (if known)
01/09/2025
Location
IRISA
Research unit
IRISA - UMR 6074
Description of the thesis subject

Context:

Cloud computing and its three facets (IaaS, PaaS, and SaaS) have become essential to today's Internet applications, offering many advantages such as scalability, elasticity, and flexibility. Although countless businesses have moved their IT operations to the cloud, many challenges remain before cloud technologies can address a very wide variety of needs.

One limitation of standard cloud platforms is that they centralize their compute, storage, and networking resources in a small number of very large data centers. Although this organization delivers many benefits, it also means that end-user devices are usually located far from the data centers. This generates large volumes of long-distance network traffic and imposes high latency before data can be processed in the cloud and an output returned to the user. One application domain where it is critical to reduce the latency between end devices and the servers that process their data is the Industrial Internet of Things (IIoT), where processed outputs are used to drive industrial processes in real time.

To mitigate these problems, Fog and Edge computing propose to extend cloud platforms with additional resources located in the vicinity of the end users, where their data may be processed. It has been shown that edge computing can indeed reduce user-perceived latency compared to cloud systems [4]. However, this is not always the case, and in some situations an incorrect usage of edge technologies may even lead to worse performance [7]. The objective of this PhD thesis is to provide tools and methods for understanding the sources of network latency in edge computing platforms, and for driving better network routing decisions that avoid the identified inefficiencies.


The topology of an edge computing network can be complex, as it involves multiple networking technologies and makes extensive use of network virtualization techniques. These techniques are invaluable for delivering a simple view of networking to the other components of the edge platform, but they also tend to hide many internal details that could help identify and mitigate unnecessary latency. For instance, standard container orchestration platforms, such as Kubernetes, rely on the Container Network Interface (CNI) to define the network requirements and manage the connectivity of containerized workloads. Consequently, network performance is heavily influenced by the specific implementation of the CNI [3]. In latency-critical applications, dynamic changes in the network can significantly impact the responsiveness of the CNI, posing challenges for routing technologies.

To guarantee quality of service, several solutions exist to reconfigure component placement (migration) and/or change the network routes to these components. One potential technique relies on Segment Routing (SR). SR introduces source-based routing, where nodes choose specific paths for packet forwarding by inserting an ordered list of segments. This approach may offer enhanced packet-forwarding behavior, allowing networks to transport packets through tailored paths based on application requirements. However, identifying precisely which component is the source of problematic latency remains a scarcely addressed question. While some studies have explored the integration of SR with Kubernetes [6], they have predominantly focused on throughput metrics. There remains a significant research gap regarding SR's implications for latency, which warrants further investigation.
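To make the source-routing idea concrete, the following sketch expresses it over an abstract weighted graph in plain Python. All names are hypothetical, and nothing here reflects an actual SR data plane such as SRv6 (where segments are carried in a Segment Routing Header); the point is only that an ordered segment list lets the source steer a flow away from the default shortest path:

```python
# Illustrative sketch of SR-style source routing (hypothetical names):
# a flow visits an ordered list of segments (waypoints), taking the
# lowest-latency sub-path between consecutive segments.
import heapq

def forward_with_segments(topology, source, segments):
    """Return (hop-by-hop path, cumulative latency) through the segments.

    topology: dict mapping node -> {neighbor: link latency in ms}.
    """
    def shortest_path(src, dst):
        # Plain Dijkstra between two nodes.
        dist, prev = {src: 0.0}, {}
        heap = [(0.0, src)]
        while heap:
            d, u = heapq.heappop(heap)
            if u == dst:
                break
            if d > dist.get(u, float("inf")):
                continue
            for v, w in topology[u].items():
                nd = d + w
                if nd < dist.get(v, float("inf")):
                    dist[v], prev[v] = nd, u
                    heapq.heappush(heap, (nd, v))
        path, node = [dst], dst
        while node != src:
            node = prev[node]
            path.append(node)
        return list(reversed(path)), dist[dst]

    path, total = [source], 0.0
    for seg in segments:            # visit each segment in order
        sub, cost = shortest_path(path[-1], seg)
        path += sub[1:]
        total += cost
    return path, total

# Toy four-node topology (latencies in ms). The default shortest path
# A -> D goes through B; inserting segment C (e.g. to avoid a congested
# B) steers the flow onto an explicitly chosen, longer path.
topo = {
    "A": {"B": 1.0, "C": 1.0},
    "B": {"A": 1.0, "D": 1.0},
    "C": {"A": 1.0, "D": 3.0},
    "D": {"B": 1.0, "C": 3.0},
}
```

With `topo` above, `forward_with_segments(topo, "A", ["D"])` yields the default path `A, B, D` at 2.0 ms, while `forward_with_segments(topo, "A", ["C", "D"])` yields `A, C, D` at 4.0 ms; the latency cost of steering is exactly what a latency-aware controller would have to weigh.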

Assignment:

In the context of IIoT applications (i.e., environments with critical response times, such as manufacturing or monitoring of critical equipment), latency is at the center of a tremendous number of studies on optimizing the placement of resources in distributed architectures. To guarantee quality of service, several solutions exist to reconfigure component placement (migration) and can reduce the overall latency by changing the components and routes. However, identifying precisely which component is the source of problematic latency remains scarcely addressed. Before instantiating a migration or a full reconfiguration, which may be triggered by a latency issue, it can be beneficial to check whether the source of the latency can be resolved more cheaply. Some studies compare the response times of the major cloud providers depending on the load [5]. Proper measurement protocols exist, but they always refer to specific case studies [1, 2] and cannot readily be integrated into edge systems.
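Such a measurement protocol can start from a very simple decomposition. The sketch below (a hypothetical probe design in plain Python, run against a loopback echo server) separates TCP connection-setup time from request/response time, a first step toward attributing latency to a specific stage rather than to the end-to-end path as a whole:

```python
# Hypothetical latency probe: split connection setup from the
# request/response exchange. Purely illustrative; a real probe would
# target a remote service and sample repeatedly.
import socket
import threading
import time

def echo_once(srv):
    # Accept one connection and echo one message back.
    conn, _ = srv.accept()
    with conn:
        conn.sendall(conn.recv(64))

def probe(addr):
    """Return (connect_ms, exchange_ms) for one request/response."""
    t0 = time.perf_counter()
    sock = socket.create_connection(addr)
    t1 = time.perf_counter()            # TCP connection established
    sock.sendall(b"ping")
    sock.recv(64)                       # wait for the echo
    t2 = time.perf_counter()
    sock.close()
    return (t1 - t0) * 1000.0, (t2 - t1) * 1000.0

srv = socket.socket()
srv.bind(("127.0.0.1", 0))              # let the OS pick a free port
srv.listen(1)
threading.Thread(target=echo_once, args=(srv,), daemon=True).start()
connect_ms, exchange_ms = probe(srv.getsockname())
srv.close()
```

Repeating such probes from different vantage points (end device, edge node, cloud) is one way to localize where latency accumulates before committing to a migration.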

Objectives:

The objective of this thesis is to study the optimization of the container network in Edge-based IIoT systems based on latency measurements, by evaluating the control-plane cost of a change in the architecture. It will particularly address the problem of how to identify the origin of a latency issue and, based on this finding, propose routing optimizations that take into account the cost and elasticity of the control plane.

Main activities:

  • Survey the state of the art of IoT/Fog emulation and simulation platforms

  • Integrate an IIoT solution into an Edge architecture platform with latency measurement

  • Propose a profiling and classification of latency issues

  • Through the use of Segment Routing, propose an innovative way to optimize the CNI, taking into account latency metrics and control-plane capabilities
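As a starting point for the profiling-and-classification activity, a coarse classifier over RTT samples can already separate structurally different latency issues. The sketch below is purely illustrative (the thresholds and labels are placeholders; in the thesis these would be derived from measurements on the target platform):

```python
# Illustrative latency-profile classifier. Thresholds are placeholders,
# not calibrated values; labels suggest plausible causes only.
import statistics

def classify_latency(samples_ms, base_ms=10.0, jitter_ms=2.0):
    """Label a series of RTT samples with a coarse latency profile."""
    mean = statistics.mean(samples_ms)
    stdev = statistics.pstdev(samples_ms)
    if mean > base_ms and stdev <= jitter_ms:
        return "high-baseline"   # e.g. poor placement / long path
    if stdev > jitter_ms:
        return "high-jitter"     # e.g. queuing or congestion
    return "nominal"
```

A "high-baseline" profile might call for migration or an SR detour, whereas a "high-jitter" profile points at congestion that rerouting alone could relieve; distinguishing the two is precisely what avoids unnecessary reconfigurations.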

Skills:

  • A master's degree in distributed systems, cloud computing, and/or networking.

  • Good knowledge of distributed systems and Network Virtualization

  • Good programming skills (e.g., C++ and Python).

  • Basic knowledge of simulation.

  • Excellent communication and writing skills.

Knowledge of the following technologies is not mandatory but will be considered a plus:

- Cloud resource scheduling

- Routing, SDN, CNI plugins (Calico)

- Revision control systems: git, svn.

- Linux distributions: Debian, Ubuntu.

Bibliography

[1] Dániel Géhberger, Dávid Balla, Markosz Maliosz, and Csaba Simon. Performance evaluation of low latency communication alternatives in a containerized cloud environment. In 2018 IEEE 11th International Conference on Cloud Computing (CLOUD), pages 9–16, 2018.

[2] Devasena Inupakutika, Gerson Rodriguez, David Akopian, Palden Lama, Patricia Chalela, and Amelie G. Ramirez. On the performance of cloud-based mHealth applications: A methodology on measuring service response time and a case study. IEEE Access, 10:53208–53224, 2022.

[3] Zhuangwei Kang, Kyoungho An, Aniruddha Gokhale, and Paul Pazandak. A comprehensive performance evaluation of different Kubernetes CNI plugins for edge-based and containerized publish/subscribe applications. In 2021 IEEE International Conference on Cloud Engineering (IC2E), pages 31–42, 2021.

[4] Zheng Li and Francisco Millar-Bilbao. Characterizing the cloud's outbound network latency: An experimental and modeling study. In 2020 IEEE Cloud Summit, pages 172–173, 2020.

[5] István Pelle, János Czentye, János Dóka, and Balázs Sonkoly. Towards latency sensitive cloud native applications: A performance study on AWS. In 2019 IEEE 12th International Conference on Cloud Computing (CLOUD), pages 272–280, 2019.

[6] José Santos, Jeroen van der Hooft, Maria Torres Vega, Tim Wauters, Bruno Volckaert, and Filip De Turck. SRFog: A flexible architecture for virtual reality content delivery through fog computing and segment routing. In 2021 IFIP/IEEE International Symposium on Integrated Network Management (IM), pages 1038–1043, 2021.

[7] Sami Yangui, Pradeep Ravindran, Ons Bibani, Roch H. Glitho, Nejib Ben Hadj-Alouane, Monique J. Morrow, and Paul A. Polakos. A platform as-a-service for hybrid cloud/fog environments. In 2016 IEEE International Symposium on Local and Metropolitan Area Networks (LANMAN), pages 1–7, 2016.

List of thesis supervisors

Last name, First name
Pierre Guillaume
Type of supervision
Thesis director
Research unit
IRISA
Team

Last name, First name
Lemercier François
Type of supervision
Co-supervisor
Research unit
IRISA
Team
Contact(s)
Name
Lemercier François
Email
francois.lemercier@irisa.fr
Phone
0299837327
Keywords
Edge-Computing, Routing, Performance Evaluation, Latency-based optimization