Direction des Relations Internationales (DRI)

Programme INRIA "Equipes Associées"
/ INRIA "Associate Teams" Programme

I. DEFINITION

EQUIPE ASSOCIEE / ASSOCIATE TEAM	DataCloud@work
sélection / selection	2010

Equipe-Projet INRIA / INRIA Project-Team: KerData

Organisme étranger partenaire / Partner Institution:

Politehnica University of Bucharest (PUB)

Centre de recherche INRIA / INRIA Research Centre:

Rennes - Bretagne Atlantique

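Thème INRIA / INRIA Theme:

Réseaux, systèmes et services, calcul distribué / Networks, systems and services, distributed computing
Calcul distribué et applications à très haute performance / Distributed computing and very high performance applications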

Pays / Country :

Romania

	Coordinateur français / French Coordinator	Coordinateur étranger / Partner Coordinator	Autre partenaire français / Other French Partner
Nom, prénom / Last name, First name	ANTONIU Gabriel	CRISTEA Valentin	MORIN Christine
Grade, statut / Position	Chargé de recherche	Professeur	Directrice de recherche
Organisme d'appartenance / Home Institution (précisez le département et/ou le laboratoire)	INRIA, Centre Rennes - Bretagne Atlantique Equipe KerData	National Center for Information Technology (NCIT) Politehnica University of Bucharest (PUB)	INRIA, Centre Rennes - Bretagne Atlantique Equipe-projet PARIS
Adresse postale / Postal address	Campus de Beaulieu, 35042 Rennes cedex	313, Splaiul Independentei, 0600042, Bucuresti, Romania	Campus de Beaulieu, 35042 Rennes cedex
URL / Website	http://www.irisa.fr/kerdata/people/Gabriel.Antoniu/	http://csite.cs.pub.ro/index.php/en/component/comprofiler/?task=userProfile&user=73/	http://www.irisa.fr/paris/web/component/option,com_uhp/task,view/Itemid,110/id,40/
Téléphone / Telephone	+33 2 99 84 72 44	+40 214 029 332	+33 2 99 84 72 90
Télécopie / Fax	+33 2 99 84 71 71	+40 214 029 333	+33 2 99 84 71 71
Courriel / Email	gabriel.antoniu@inria.fr	valentin.cristea@cs.pub.ro	christine.morin@inria.fr

NOTA: Si la proposition d'Equipe Associée comporte plusieurs partenaires, français et/ou étrangers, vous pouvez :
- soit ajouter une colonne,
- soit dupliquer le tableau ci-dessus autant de fois que nécessaire, en remplaçant "Coordinateur français ou étranger" par "Autre participant français ou étranger".
/ In the case of multiple INRIA project-teams and/or multiple foreign partners, applicants may:
- either add another column on the right,
- or duplicate the above table as many times as needed, replacing "French coordinator" / "Partner coordinator" by "Other French or partner participant".

La proposition en bref / The proposal in brief

Titre de la thématique de collaboration (en français et en anglais) / Title of the collaboration theme (in French and in English) :

Stockage Autonome pour les Services sur Clouds / Autonomic Storage for Cloud Services

Descriptif (environ 10 lignes) / Description (approximately 10 lines) :

While the cloud computing paradigm is progressively being adopted by companies wishing to deliver large-scale distributed services, such as Amazon, IBM, Google or Yahoo!, other research efforts in the area of large-scale distributed computing are exploring the concept of a grid operating system. Both kinds of systems aim at providing seamless access to a powerful distributed processing infrastructure, while hiding as much as possible all aspects related to the management of the underlying physical resources. In both contexts, data management is a key issue: it significantly impacts the quality of service delivered by such distributed infrastructures. In this project, we aim at investigating ways to provide advanced, autonomic storage mechanisms for cloud services. More specifically, the goal is to explore how to build an efficient, secure and reliable storage service for data-intensive distributed applications running in cloud environments by enabling an autonomic behavior. In addition, we will leverage the grid operating system approach as a cloud technology (e.g., by relying on its OS support for virtual organizations). For validation purposes, experimental prototypes will be implemented based on the BlobSeer data-sharing platform (designed by the KerData Team), on the MonALISA monitoring framework (using the expertise of the PUB Team), and on the XtreemOS grid operating system (designed under the leadership of the PARIS Team). The work will also include interactions with the Nimbus team from Argonne National Lab, led by Kate Keahey: experiments will be carried out using the Nimbus cloud software. The validation phase will include intensive, large-scale experiments on the ALADDIN-Grid'5000 grid testbed.

Présentation détaillée de l'Équipe Associée
Detailed presentation of the Associate Team

1. Scientific goals of the proposal

The emerging cloud computing model [1,2,3] is gaining serious interest from both industry and academia in the area of large-scale distributed computing. It provides a new paradigm for managing computing resources: instead of buying and managing hardware, users rent virtual machines and storage space. Various cloud software stacks have been proposed by leading industry companies, like Google, Amazon or Yahoo!. They aim at providing fully configurable virtual machines or virtual storage (IaaS: Infrastructure-as-a-Service [4,5,6]), higher-level services including programming environments such as Map-Reduce [7] (PaaS: Platform-as-a-Service [8,9]) or community-specific applications (SaaS: Software-as-a-Service [10,11]). On the academic side, one of the most visible projects in this area is Nimbus [5,12], from the Argonne National Lab (USA), which aims at providing a reference implementation for an IaaS. In parallel to these trends, other research efforts have focused on the concept of a grid operating system: a distributed operating system for large-scale, wide-area, dynamic infrastructures spanning multiple administrative domains. XtreemOS [13, 14] is such a grid operating system, which provides native support for virtual organizations. Since both the cloud approach and the grid operating system approach deal with resource management on large-scale distributed infrastructures, the relative positioning of these two approaches with respect to each other is currently the subject of ongoing investigation within the PARIS Project-Team (http://www.irisa.fr/paris/web/) at INRIA Rennes - Bretagne Atlantique: a preliminary discussion is available in [15].

In the context of both emerging cloud infrastructures and grid operating systems, some of the most critical open issues relate to data management. The KerData research team (http://www.irisa.fr/kerdata/) of INRIA Rennes - Bretagne Atlantique has recently been created with the goal of exploring ways to address the main challenges raised by data storage and management on cloud infrastructures. The team is designing and implementing BlobSeer [16, 17], a generic data-sharing platform which aims at providing support for storing massive data with fine-grained access control under heavy concurrency on large-scale distributed infrastructures. In addition, it will support versioning and decentralized metadata management. Providing users with the possibility to store and process data on externalized, virtual resources from the cloud requires simultaneously investigating important aspects related to security, efficiency and quality of service. To this end, it becomes necessary to create mechanisms able to provide feedback about the state of the storage system and of the underlying physical infrastructure. The monitored information can then be fed back into the storage system and used by self-managing engines in order to enable an autonomic behavior, possibly with several goals such as self-configuration, self-optimization, or self-healing. To start moving towards this goal, the KerData Team has started to work with the Distributed Systems and Grids team from NCIT (PUB, Romania) on the design of preliminary introspection mechanisms for BlobSeer. This work relies on MonALISA [18,19], a general-purpose monitoring framework whose main contributors belong to the PUB Team. This preliminary work is detailed in [20].

In this project, we aim at investigating several open issues related to autonomic storage in the context of cloud services. The goal is to explore how to build an efficient, secure and reliable storage IaaS for data-intensive distributed applications running in cloud environments by enabling an autonomic behavior, while leveraging the advantages of the grid operating system approach (such as OS support for virtual organizations). For validation purposes, experimental prototypes will be implemented based on the BlobSeer data-sharing platform (designed by the KerData Team), on the XtreemOS grid operating system (designed under the leadership of the PARIS Team) and on the MonALISA monitoring framework (using the expertise of the PUB Team). This work will also involve the Nimbus team from Argonne National Lab, led by Kate Keahey. Experiments will be carried out with the Nimbus cloud software infrastructure. The validation phase will include intensive, large-scale experiments on the Grid'5000 [21,22] grid testbed. We have divided the work into three main directions (each of which corresponds to one of the three years of the project), as described below.

Direction 1: Using BlobSeer for sharing application data in an IaaS

Scenario: Infrastructure as a Service (IaaS) is the delivery of computer infrastructure (typically a platform virtualization environment) as a service. The client typically runs a distributed application using virtual machines (VMs) rented from a service provider. The client applications are executed by the service provider as a set of virtual machines in a secure environment that enforces several restrictions, according to some pre-established contract. In such a context, access to local storage space on the physical machine where the application is running (owned by the service provider) is typically denied. Clients are instead provided with a specialized storage service they can access directly, through a specific API (e.g., Amazon S3 [23]).

Role of BlobSeer: In this context, the BlobSeer storage system will serve to enable the IaaS provider to offer advanced data sharing facilities to collaborating clients running within distinct VMs on the IaaS. BlobSeer's API will be directly made available as a distributed file system (e.g., within a given virtual organization). BlobSeer exposes a multiversioning interface which can be used in two ways: (1) to enable application data checkpointing (as part of checkpointing the application itself) and (2) to expose a multiversioning interface directly at application level through a specific access API. In the second phase, we will also enable the IaaS provider to allow client applications to share application data through a standard POSIX file system API. File system calls are transparently mapped to specific, secured data accesses to the internal storage service implementing data sharing for multiple VMs that form a given distributed application.

Role of MonALISA: First, the MonALISA monitoring framework includes automated management functions performed by higher-level, agent-based services. We will use these facilities to define a self-adaptive, autonomic behavior of BlobSeer through optimized, dynamic control for large-scale data transfers on dedicated circuits, data-transfer scheduling, distributed data scheduling, automated management and performance prediction of remote storage services (e.g., BlobSeer's Data Providers). Second, MonALISA will serve to introduce client monitoring, in order to ensure that the contract established with the provider is being respected. Regarding security in this context, the storage service has to be aware of the different types of clients and of their access rights. Based on configurable policies that can be implemented on top of MonALISA, BlobSeer will support different access patterns and enforce adaptive security rules. Moreover, the MonALISA monitoring framework can be used to monitor and detect malicious behavior. In case of such events, MonALISA will alert the administrators or automatically apply pre-defined policies (e.g., blacklisting users and banning access for specific periods of time). Finally, the same mechanism can also be used to build a consistent reputation system that can further be used by the IaaS provider when scheduling storage resources to users.
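
As an illustration of the kind of pre-defined policy mentioned above (blacklisting a user for a given period after repeated violations), here is a minimal Python sketch. It is not based on MonALISA's or BlobSeer's actual interfaces: the AccessPolicy class, its parameters and the notion of a "violation" event reported by the monitoring layer are all hypothetical, and timestamps are plain numbers of seconds supplied by the caller.

```python
class AccessPolicy:
    """Hypothetical ban policy driven by violation events from a monitoring layer."""

    def __init__(self, max_violations=3, ban_seconds=600):
        self.max_violations = max_violations  # violations tolerated before a ban
        self.ban_seconds = ban_seconds        # duration of a ban, in seconds
        self.violations = {}                  # client -> violations since last ban
        self.banned_until = {}                # client -> time at which the ban ends

    def report_violation(self, client, now):
        """Record one violation; ban the client once the threshold is reached."""
        count = self.violations.get(client, 0) + 1
        self.violations[client] = count
        if count >= self.max_violations:
            self.banned_until[client] = now + self.ban_seconds
            self.violations[client] = 0  # start counting afresh after the ban

    def is_allowed(self, client, now):
        """A client is allowed unless a ban is still running at time `now`."""
        return now >= self.banned_until.get(client, 0)


# Example: two violations trigger a 100-second ban.
policy = AccessPolicy(max_violations=2, ban_seconds=100)
policy.report_violation("client-42", now=0)
policy.report_violation("client-42", now=5)
```

In this example, "client-42" is refused between time 5 and time 105 and accepted again afterwards. A reputation system of the kind mentioned above could be derived from the same violation counters.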

Role of XtreemOS: Here, XtreemOS will be used as an internal cloud technology: the IaaS is XtreemOS and the secure environment where the application is running is the virtual machine itself. In its current version, XtreemOS internally relies on XtreemFS [24,25] for distributed data sharing. Our goal is to explore the possibility of using BlobSeer as an advanced, version-enabled, concurrency-optimized storage back-end, used by operating systems running inside VMs.

Direction 2: Using BlobSeer as a cost-effective storage service built on top of multiple IaaSes

Scenario: We consider an Infrastructure as a Service (IaaS) provider which exposes a specialized storage service to the client applications that run on the rented virtual machines, as presented in the scenario above. The storage service is a large-scale distributed application itself, which has to be able to efficiently handle massive data and heavy concurrent accesses. Therefore it needs to run on a large number of physical storage nodes, which can be rented from other, possibly multiple second-level IaaS providers. Each such second-level IaaS provider has its own pricing policies, charging clients for the number of hours they use the resources, for the amount of stored data or for the amount of network traffic generated, while offering different QoS levels. In this context, it is important to design a cost-effective scheduling policy for our first-level IaaS (in terms of money spent), for deciding how to provision storage space from the various second-level IaaS providers.

Role of BlobSeer: We will design for BlobSeer a cost-based scheduler for its storage manager in order to select one or more IaaSes that will provide the external, virtualized storage resources needed by BlobSeer. The goal is to minimize the costs of storing and transferring the data and to preserve agreed QoS levels. Clients benefit transparently from minimized storage costs, whereas BlobSeer seamlessly handles the dynamic migration of its virtualized storage hosts from one IaaS provider to another.
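
The selection step can be sketched as follows. This is only an illustrative Python sketch under simplifying assumptions, not the scheduler to be designed: the Provider fields, the pricing model (a price per GB stored plus a price per GB transferred) and the use of a single latency bound as the QoS constraint are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    """Hypothetical description of a second-level IaaS offer."""
    name: str
    storage_price: float   # currency units per GB stored per month
    transfer_price: float  # currency units per GB transferred
    latency_ms: float      # monitored QoS indicator


def monthly_cost(provider, stored_gb, transferred_gb):
    """Cost of one month of usage under the assumed linear pricing model."""
    return (provider.storage_price * stored_gb
            + provider.transfer_price * transferred_gb)


def select_provider(providers, stored_gb, transferred_gb, max_latency_ms):
    """Return the cheapest provider meeting the QoS bound, or None if none does."""
    eligible = [p for p in providers if p.latency_ms <= max_latency_ms]
    if not eligible:
        return None
    return min(eligible, key=lambda p: monthly_cost(p, stored_gb, transferred_gb))


providers = [
    Provider("A", storage_price=0.10, transfer_price=0.20, latency_ms=40.0),
    Provider("B", storage_price=0.05, transfer_price=0.30, latency_ms=120.0),
    Provider("C", storage_price=0.12, transfer_price=0.10, latency_ms=60.0),
]
best = select_provider(providers, stored_gb=1000, transferred_gb=500,
                       max_latency_ms=100)
```

Here provider B is excluded by the latency bound, and C is preferred to A because its lower transfer price outweighs its higher storage price for this workload. A real scheduler would also have to account for the cost of migrating data between providers.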

Role of MonALISA: In this case, the MonALISA framework is an essential building block, necessary for building an efficient cost-aware BlobSeer provider manager. Its contribution is twofold: MonALISA will monitor both the data providers for QoS evaluations and the corresponding IaaS sites for pricing information. We rely on MonALISA's ability to collect and store data from a large number of nodes in near real time and to quickly retrieve it on demand. MonALISA provides an abstract data API, which lets the user define what information has to be collected, i.e., the data needed for selecting the best IaaS providers, both in terms of cost and of node capabilities (network latency, memory size, storage space). As the process of monitoring numerous nodes or services yields a large volume of raw data which has to be stored and interpreted (millions of published parameters with high update frequency rates), we will rely on MonALISA's advanced mechanisms such as dynamically-loadable filters to select and aggregate relevant information.
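
The idea of selecting and aggregating relevant information from a large stream of raw samples can be illustrated with the short Python sketch below. It does not use MonALISA's filter API: the (node, parameter, value) sample format and the aggregate function are assumptions made for the example.

```python
from collections import defaultdict


def aggregate(samples, wanted):
    """Keep only the wanted parameters and average them per (node, parameter)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for node, param, value in samples:
        if param in wanted:  # selection: drop parameters nobody asked for
            sums[(node, param)] += value
            counts[(node, param)] += 1
    # aggregation: one averaged value per (node, parameter) pair
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical raw samples as (node, parameter, value) triples.
samples = [
    ("node1", "latency_ms", 10.0),
    ("node1", "latency_ms", 30.0),
    ("node1", "cpu_load", 0.7),
    ("node2", "latency_ms", 50.0),
]
summary = aggregate(samples, wanted={"latency_ms"})
```

Only the averaged latency per node survives the filter, which is the kind of compact input a cost-aware provider manager would need.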

Role of XtreemOS: The future direction for XtreemOS is to explore how its technology can help federate clouds. In this context, XtreemOS aims to offer a unified cloud image, while being set up on top of several cloud infrastructures. An important aspect here is data sharing at virtual organization level, a task currently fulfilled by XtreemFS. However, as XtreemFS was designed in the context of grid computing, it does not include cost-effective resource management for the case where the resources are provisioned from clouds. We will investigate how BlobSeer can implement cost-effective storage on top of multiple, external IaaS providers.

Direction 3: Using BlobSeer for VM management to build a highly-available IaaS

Scenario: IaaS providers rely on virtualization techniques to offer resources to clients. Clients are typically allowed to upload a virtual-machine image to the system, so that they can use an environment compatible with their applications. This image is then executed on each computing element rented to the client. In such a context, BlobSeer can help provide a highly-available service, as it can serve as a storage system for checkpointing images of the virtual machines. The idea is simple: instead of checkpointing virtual-machine instances locally on the computing elements, XtreemOS stores them as binary large objects (BLOBs) within BlobSeer.

Role of BlobSeer: The scenario described above is a perfect application for BlobSeer, which natively provides versioning support for all objects it stores. A new (incremental) version of a BLOB is created each time a write operation is performed on it: this feature can efficiently be used for incremental checkpointing of virtual machines. Moreover, since BlobSeer data (and thus the virtual machines) are globally accessible to the system, various management operations such as migration can be easily implemented on top of it.
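
The versioning behavior described above can be illustrated with a toy in-memory model in Python. This is a sketch of the general technique (each write produces a new version that shares unchanged chunks with the previous one), not BlobSeer's implementation or API: the VersionedBlob class, the tiny chunk size and the restriction to chunk-aligned writes are assumptions made to keep the example short.

```python
class VersionedBlob:
    """Toy model: every write creates a new version sharing unchanged chunks."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.versions = [{}]  # version 0 is empty; maps chunk index -> bytes

    def write(self, offset, data):
        """Apply a chunk-aligned write and return the new version number."""
        if offset % self.chunk_size or len(data) % self.chunk_size:
            raise ValueError("this sketch only supports chunk-aligned writes")
        # Shallow copy: chunks that are not overwritten are shared, not copied.
        snapshot = dict(self.versions[-1])
        for i in range(0, len(data), self.chunk_size):
            index = (offset + i) // self.chunk_size
            snapshot[index] = data[i:i + self.chunk_size]
        self.versions.append(snapshot)
        return len(self.versions) - 1

    def read(self, version):
        """Return the content of a version (assumes no holes between chunks)."""
        chunks = self.versions[version]
        return b"".join(chunks[i] for i in sorted(chunks))


blob = VersionedBlob()
v1 = blob.write(0, b"AAAABBBB")  # first checkpoint of a (tiny) VM image
v2 = blob.write(4, b"CCCC")      # incremental checkpoint: one chunk changed
```

After these two writes, version v1 still reads as b"AAAABBBB" while v2 reads as b"AAAACCCC", and the first chunk is stored only once. Rolling a virtual machine back to an earlier checkpoint then amounts to reading an older version.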

Role of MonALISA: For this scenario, we will rely on MonALISA's extensible monitoring modules, which make it possible to monitor parameters specific to BlobSeer (e.g., BLOB IDs, BLOB sizes, data provider characteristics), to the IaaS or to the VMs. Another valuable property in this context is MonALISA's capability to monitor a large number of heterogeneous nodes with different response times, and at the same time to handle monitored units which are down or not responding, without affecting the other measurements.

Role of XtreemOS: XtreemOS aims to be available as an IaaS cloud, offering clients the possibility to rent virtualized resources. In order to reach this goal, virtual-machine management and checkpointing are crucial aspects that need to be further investigated. In the current version of XtreemOS, support for managing virtual machines is not yet integrated. Integrating BlobSeer as a highly-available storage back-end for virtual machines will help XtreemOS make progress in this direction.

2. Partners presentation

This Associate Team is built by leveraging the strong, specific, and complementary expertise brought by each of the three partners:

- Monitoring for large-scale, distributed infrastructures (PUB), demonstrated through the MonALISA project;
- Large-scale distributed management for clouds and grids (KerData), demonstrated through the BlobSeer project;
- Distributed services and grid operating systems (PARIS), demonstrated through the XtreemOS project.

Romanian partner: the National Center for Information Technology (NCIT) from the Politehnica University of Bucharest (PUB)

Politehnica University of Bucharest (PUB) is the largest technical university in Romania (26,000 students, of whom 1,500 are in the Computer Science Department). The National Center for Information Technology (NCIT, http://csite.cs.pub.ro/ncit) is part of the Computer Science Department of PUB. The Center is dedicated to advanced and inter-disciplinary research. It includes several research and teaching laboratories in the fields of High-Performance Computing, Distributed Systems and Applications, E-Business and E-Government, Artificial Intelligence, and Computer Networks. The Distributed Systems and Grids team (http://csite.cs.pub.ro/ds_team) is part of the NCIT and its research focuses on middleware and applications for large-scale distributed systems. The focus is on distributed system monitoring and control, evaluation of distributed systems using modeling and simulation, resource management, meta-scheduling, web service-based and workflow-based scheduling. The team is actively involved in multiple international, European and national collaborative projects in the area of distributed computing. Below is a brief description of a few selected projects of PUB relevant for this proposal. They illustrate the expertise acquired by the PUB Team in the specific area of large-scale distributed monitoring, a significant success factor for the proposed Associate Team.

International projects. MonALISA (http://monalisa.cern.ch) is a collaborative project between PUB, Caltech (USA) and CERN (Switzerland) aiming at developing a distributed monitoring framework. PUB is the main technical contributor to MonALISA. The project strongly relies on the involvement of former graduate students of PUB, currently hired by Caltech and CERN. The system makes use of agent technology to develop a scalable, fault-tolerant monitoring platform for distributed infrastructures. It collects and processes a wide range of information and provides it to other services or clients in a dynamic, customized, self-describing way. The principles of the architecture are described in this on-line paper. The MonALISA monitoring framework is currently used in production by several important communities including Open Science Grid, CMS, the USLHCNet high-speed transatlantic network, and the ALICE experiment at CERN. More than 350 MonALISA services are running throughout the world (http://monalisa.caltech.edu/monalisa__Looking_Glass.htm), monitoring more than 20,000 compute servers and thousands of concurrent jobs. More than 1.5 million parameters are currently monitored in near-real time with an aggregate update rate of approximately 15,000 parameters per second. The joint MonALISA team received the “CENIC (Corporation for Education Network Initiatives in California) 2006 Innovation Award for High-Performance Applications” for the MonALISA project. In collaboration with the same partners, the PUB Team is also actively involved in the development of 2 other projects: Fast Data Transfer (FDT, http://monalisa.cern.ch/FDT/), successfully demonstrated at SuperComputing conferences in 2007, 2008 and 2009, and MONARC 2 (http://monarc.cacr.caltech.edu), a simulation framework for analyzing the dynamic behavior of Grids by building simulation models that capture specific characteristics of resources and activities.
European projects. EU-NCIT (FP6): the Center was granted the FP6 project EU-NCIT ("NCIT leading to EU IST Excellency"), whose strategic objective was the reinforcement of the scientific and technical know-how and experience in elaborating and coordinating proposals and projects related to several IST research areas of FP7. In Enabling Grids for E-sciencE III (EGEE III, FP7) (http://www.eu-egee.org/), the team is responsible for the development and operation of the education and research networking infrastructure, along with RoEduNet as the national operator, and for Grid- and Cloud-related training and dissemination activities. In the SEE-GRID-SCI (FP7) project (http://www.see-grid.eu/), the team monitors the Grid infrastructure using MonALISA equipped with specific modules and develops scheduling algorithms for computational Grids. P2P-NEXT (FP7) (http://www.p2p-next.org/) is a project aiming at building a next-generation P2P content delivery platform: here, the team is responsible for the technical work packages dedicated to monitoring, upload bandwidth estimation and congestion control. In the SENSEI (FP7) project (http://www.sensei-project.eu/), the team focuses on monitoring and evaluating heterogeneous wireless sensor and actuator networks integrated into a common framework of global scale, available to services and applications via universal service interfaces.
National projects. The team is also involved in several national projects in multiple areas of distributed computing: resource management, activity scheduling, resource optimization, dependability, security, etc. Examples of such projects are: GridMOSI (http://wiki.gridmosi.ro/), MedioGRID (http://mediogrid.utcluj.ro/app), PEGAF, DepSys. The team is a member of the Romanian Grid consortium, RoGrid, which organizes the yearly Grid Initiative Summer School (http://gridinitiative.ncit.pub.ro/) and other events related to Grid research in Romania.

First French partner: the KerData Team from INRIA Rennes - Bretagne Atlantique

The KerData Team (http://www.irisa.fr/kerdata/) was created on July 1, 2009 as a joint research team of INRIA Rennes - Bretagne Atlantique and École Normale Supérieure de Cachan - Antenne de Bretagne. It is strongly engaged in the process of becoming an INRIA project-team. KerData was created by Luc Bougé, Professor at ENS Cachan - Antenne de Bretagne (team leader) and Gabriel Antoniu, Research Scientist at INRIA, both former members of the PARIS Project-Team. KerData focuses on cloud storage for very large distributed data. It addresses the challenges raised by today's data-oriented high-performance applications that need to handle massive, non-structured data - BLOBs: binary large objects (in the order of terabytes) - stored on a large number of nodes (thousands to tens of thousands), accessed under heavy concurrency by a large number of clients (thousands to tens of thousands at a time) with a relatively fine access grain (in the order of megabytes). These challenges are investigated through the design, implementation and experimental validation of a generic data-sharing platform called BlobSeer. Among the applications targeted by the KerData Team, one class is particularly relevant for the work proposed for this project: service-based distributed applications executed in cloud environments.

Background: large-scale distributed data management. The main contribution of Gabriel Antoniu, Luc Bougé and their students during the last 6 years (mainly while they were members of the PARIS Project-Team) was to propose the concept of a grid data-sharing service whose goal is to provide a transparent data access model at grid scale. The service provides grid applications with the abstraction of a globally shared memory, in which data can be easily stored and accessed through global identifiers. This concept has been illustrated through an architecture called JuxMem (http://juxmem.gforge.inria.fr/) that leverages results from several fields: consistency protocols inspired by DSM systems; P2P discovery and data exchange with good scalability and volatility support; algorithms for fault-tolerant distributed systems for dynamic group management in volatile environments. This work was the subject of 3 Ph.D. theses, by Mathieu Jan and Sébastien Monnet (defended in 2006) and Loïc Cudennec (defended in 2009). It has been validated at several levels:

- An extensive large-scale experimental evaluation was performed on the Grid'5000 testbed.
- The proposed transparent access model has been integrated into several current grid programming models (GridRPC, component-based models such as CCM and CCA).
- The proposed architecture and implementation of the service have been integrated with other existing storage architectures and technologies: the DIET GridRPC environment designed by the GRAAL project-team (INRIA - ENS Lyon), the ASSIST component-based environment from the University of Pisa (Italy), the Knowledge Grid environment from the University of Calabria (Italy), the Gfarm grid file system from the University of Tsukuba (Japan).

This work led to collaborations with several partners.

European collaborations. We participated in the CoreGRID European Network of Excellence (http://www.coregrid.net/). Main collaborators: University of Pisa - Marco Danelutto, Marco Aldinucci; University of Calabria - Domenico Talia. We are currently involved in the SCALUS Marie-Curie Initial Training Network (call FP7-PEOPLE-ITN-2008), focused on distributed storage, starting in October 2009 (2009 - 2013). Partners: Universidad Politécnica de Madrid, Barcelona Supercomputing Center, University of Paderborn, Ruprecht-Karls-Universität Heidelberg, Durham University, FORTH, École des Mines de Nantes, XLAB, CERN, NEC, Microsoft Research, Fujitsu, Sun Microsystems.
International collaborations. Bilateral collaborations have also been established with the University of Illinois at Urbana-Champaign, USA (Indranil Gupta) and with the University of Tsukuba, Japan (Osamu Tatebe).
National collaborations. At the national level, the strongest interactions took place with the REGAL, GRAAL and ATLAS teams from INRIA and with the database team (BD) of LIP6. These collaborations were supported by several ANR projects (GDS: 2003-2006, GdX: 2003-2006, LEGO: 2006-2008, RESPIRE: 2006-2008) and benefited from the Grid'5000 testbed.
Industrial partners. Strong support was provided by Sun Microsystems through a 3-year contract (2005-2008), which sponsored Loïc Cudennec’s Ph.D. thesis.

Second French partner: the PARIS Project-Team from INRIA Rennes - Bretagne Atlantique

The PARIS Project-Team (http://www.irisa.fr/paris/web/) from the INRIA Rennes - Bretagne Atlantique research center aims at contributing to the programming of large-scale, parallel and distributed systems. It investigates new approaches to build software mechanisms that hide the complexity of programming computing infrastructures that are both parallel and distributed. Its contribution to the field can thus be summarized as follows: combining parallel and distributed processing whilst preserving performance and transparency. The PARIS Project-Team has carried out research activities on the design and implementation of Grid-aware operating systems. It has designed and implemented Vigne [34,36,37], a system for large-scale dynamic Grids. Since June 2006, Christine Morin has been the scientific coordinator of the XtreemOS European Integrated Project. The objective of the XtreemOS Project [20] is to design, implement and promote a Linux-based Grid operating system providing native support for virtual organizations. The research activities of the PARIS Project-Team in XtreemOS are focused on the design and implementation of a fault-tolerance service offering transparent checkpointing to Grid applications [31,32], on the design of virtual organization and security services [19,22,35], on the design and implementation of system services to manage virtualized infrastructures, and on the design and implementation of LinuxSSI, leveraging the Kerrighed SSI operating system for the cluster flavor of XtreemOS. The PARIS Project-Team has been involved in the Grid'5000 project since its beginning in 2003 (http://www.grid5000.fr/). Grid'5000 is an infrastructure distributed across 9 sites around France, for research in large-scale parallel and distributed systems.

PUB Team: permanent staff involved

Prof. Valentin Cristea (coordinator for PUB) is the Head of the Computer Science and Engineering Department of Politehnica University of Bucharest. His main fields of expertise are Distributed Systems, Grid Computing and E-Services. He is the Director of the National Center for Information Technology, within which he leads the CoLaborator, Distributed Systems and Grid and e-Business/e-Government laboratories. He has a long experience in the development, management and/or coordination of international and national research projects. He collaborated with Prof. Harvey Newman (from Caltech), Iosif Legrand (from CERN) and Prof. Nicolae Tapus (from PUB-NCIT) to define RoDiCA - Romanian Distributed Collaborative Architectures, which led to the development of MonALISA, MONARC2 and other projects. He also collaborates with the University of Wisconsin, USA (NetPy project) and with Rutgers University, USA (VNSim, Prof. Liviu Iftode). He co-supervised the PUB Team in SEE-GRID-SCI (Contract FP7 nr. 211338), EGEE III (Contract FP7), EU-NCIT (Contract INCO-CT-2005-017101), COOPER (Contract no. 027073, FP6), CoLaborator (World Bank and CNCSIS Contract Nr. 26389/2000), and others. In 2003 he received the IBM award for excellence. He is the Romanian coordinator of the Master program on Parallel and Distributed Computer Systems co-developed with the Free University of Amsterdam. He has been a visiting professor at European and US universities, such as: Free University of Berlin (Germany), Oulu University (Finland), Free University of Amsterdam (The Netherlands), Politecnico di Torino (Italy) and Rutgers University (USA). Detailed CV.

Prof. Nicolae Tapus is the vice-rector of the Politehnica University of Bucharest. His main fields of expertise are Distributed Systems, Local Area Networks, Computer Architecture and Grid Computing. He is also a member of the NCIT board. He has long experience in the development, management and/or coordination of national and international research projects. He actively collaborates with IT companies (CISCO, Microsoft, HP) and participates in the elaboration of the strategy for research development in ICT (including Grid development) in Romania, as a member of the government's Experts Council on these matters. He serves as a coordinator of IEEE Computer Society Chapters for IEEE Region 8 (Europe, Middle East and Africa). He co-supervised the PUB involvement in several international projects: EGEE III (Contract FP7), P2P-Next (Contract FP7), SENSEI (Contract FP7), EU-NCIT (Contract INCO-CT-2005-017101), SEE-GRID-SCI (Contract FP7 nr. 211338), CoLaborator (World Bank and CNCSIS Contract Nr. 26389/2000) and others. He was a visiting professor in European and US Universities, such as: Grenoble (France), Free University of Amsterdam (The Netherlands), Politecnico di Torino (Italy) and Maryland University (USA). He is a member of the IEEE and ACM professional organizations, Chair of the Romania Computer Society Chapter, and Director of the ACM International Collegiate Programming Contest for South-Eastern Europe. He manages the organization of the yearly ACM programming contests in Romania, within the PUB. He is a member of the New York Science Academy (1991) and of the Romanian Technical Science Academy (2004). Detailed CV.

Florin Pop is an assistant professor in the Computer Science and Engineering Department of the Politehnica University of Bucharest. His research interests include scheduling in Grid environments (his Ph.D. research), distributed systems, parallel computation, communication protocols and numerical methods. He received his Ph.D. in Computer Science in 2008 with "Magna cum laude" distinction. He is a member of the RoGrid consortium and participates in several research projects in these domains, in collaboration with other universities and research centers from Romania and from abroad (national projects such as CNCSIS, GridMOSI, MedioGRID and international projects such as EGEE, SEE-GRID, EU-NCIT). He received an IBM Ph.D. Assistantship in 2006 (ranked 1st out of 17 awarded students) and a Ph.D. Excellency Grant from Oracle in 2006-2008. Detailed CV.

Ciprian Dobre, currently a post-doc researcher, received his Ph.D. in Computer Science at the Politehnica University of Bucharest in 2008. His main research interests are Grid Computing, Monitoring and Control of Distributed Systems, Modeling and Simulation, Advanced Networking Architectures, Parallel and Distributed Algorithms. He is a member of the RoGrid consortium and is involved in a number of national projects (CNCSIS, GridMOSI, MedioGRID, PEGAF) and international projects (MonALISA, MONARC, VINCI, VNSim, EGEE, SEE-GRID, EU-NCIT). His research activities were recognized with the Innovations in Networking Award for Experimental Applications in 2008 by the Corporation for Education Network Initiatives (CENIC). Detailed CV.

PUB Team: Ph.D. and Master students involved

Alexandru Costan is a Ph.D. student and Teaching Assistant at the Computer Science department of the Politehnica University of Bucharest. His research interests include: Grid Computing, Data Storage and Modeling, P2P systems. He is actively involved in several national and international research projects related to these domains, among which MonALISA, MedioGRID, EGEE, P2P-NEXT and BlobSeer are worth mentioning. His Ph.D. thesis focuses on Data Storage, Representation and Interpretation in Grid Environments. He received a Ph.D. Excellency Grant from Oracle in 2006-2009 and was awarded an IBM Ph.D. Fellowship in 2009. Detailed CV.

Eliana-Dina Tîrşa is a Ph.D. student in Computer Science at the University Politehnica of Bucharest. In July-September 2009 she worked as an INRIA intern, on Dynamic Provisioning of Resources from Clouds in XtreemOS System. Her research interests are Fault Tolerance, Monitoring and Virtualization in Distributed Systems, Cloud Computing and Peer-to-Peer Systems. She is a participant in several national (PEGAF, Depsys) and international (MonALISA, EU-NCIT, P2P-Next) research projects. Detailed CV.

Catalin Leordeanu is a Ph.D. student in the Computer Science Department at the Politehnica University of Bucharest. He received his Master in Computer Science in 2009. His research interests include Security of Distributed Systems, Intrusion Detection and Dependability of Large-Scale Distributed Systems. He also works on numerous national and international projects on these subjects. Detailed CV.

Mugurel Ionut Andreica is a Ph.D. student in the Computer Science Department at the Politehnica University of Bucharest. His research interests include theoretical and practical aspects of Communication Optimization in Distributed Systems, Peer-to-Peer Systems, Multi-core Programming, Distributed Data Storage, Sequential and Distributed Algorithms and Data Structures. He has participated in several EU and national research projects focused on the previously mentioned topics. He was awarded a Ph.D. research scholarship from Oracle and a Ph.D. Fellowship from IBM. Detailed CV.

KerData Team: permanent staff involved

Gabriel Antoniu (coordinator for KerData) is a Research Scientist at INRIA Rennes - Bretagne Atlantique (CR1) and is a member of the KerData research team. His research interests include: grid and cloud distributed storage, large-scale distributed data management and sharing, data consistency models and protocols, grid and peer-to-peer systems. He coordinates the involvement of INRIA Rennes - Bretagne Atlantique in the SCALUS project of the Marie-Curie Initial Training Networks programme (ITN), call FP7-PEOPLE-ITN-2008 (2009-2012) and in the CoreGRID ERCIM Working Group. He has been involved in several other international, European and national collaborative projects in these fields, including the CoreGRID European Network of Excellence. Gabriel Antoniu received his Bachelor of Engineering degree from INSA Lyon in 1997, his Master's degree in Computer Science from ENS Lyon in 1998, his Ph.D. degree in Computer Science from ENS Lyon in 2001, and his Habilitation for Research Supervision (HDR) from ENS Cachan in 2009. Detailed CV.

Luc Bougé, Professor, is the Chair of the Informatics and Telecommunication Department (DIT) at ENS Cachan - Antenne de Bretagne. He is also the leader of the KerData Joint Team of INRIA Rennes - Bretagne Atlantique and ENS Cachan - Antenne de Bretagne. His research interests include the design and semantics of parallel programming languages and the management of data in very large distributed systems such as grids, clouds and peer-to-peer (P2P) networks. Detailed CV.

KerData Team: Ph.D. and Master students involved

Bogdan Nicolae is a Ph.D. student at the University of Rennes 1, France, working in the KerData Team at INRIA Rennes - Bretagne Atlantique. His research interests include: parallel and distributed computing, cloud computing, large-scale distributed data-storage solutions, versioning, transactional concurrency control. His main focus is BlobSeer, a storage service for data-intensive distributed applications designed to sustain a high data throughput under heavy access concurrency. He is also actively involved in research activities related to several other projects: LEGO, Gfarm, Hadoop, Grid'5000. His Ph.D. thesis is funded by the French Ministry of Education (2007-2010). Detailed CV.

Alexandra Carpen-Amarie is a Ph.D. student in Computer Science at ENS Cachan, working in the KerData Team at INRIA Rennes - Bretagne Atlantique. Her research interests include large-scale distributed systems, distributed data storage, cloud computing, monitoring in distributed systems. She is working on the design of an introspection layer for the BlobSeer data storage system, a first step towards an autonomic steering for this system. This layer is being implemented as an extension of the MonALISA monitoring framework and evaluated on the Grid'5000 testbed. This work is being carried out in collaboration with Alexandru Costan, Ph.D. student in the PUB Team and member of the MonALISA project (see above). Her Ph.D. thesis is funded through a grant from the INRIA CORDIS program (2008-2011). Detailed CV.

Diana Moise is a Ph.D. student in Computer Science at ENS Cachan, working in the KerData Team at INRIA Rennes - Bretagne Atlantique. Her research interests comprise parallel and distributed computing, distributed file systems, distributed data storage, data-intensive applications, the Map/Reduce paradigm. Her work has so far focused on two projects: developing a file system interface on top of the BlobSeer data storage service; integrating BlobSeer with the Hadoop framework by replacing the default storage layer, the Hadoop File System (HDFS), with BlobSeer in order to improve data throughput and add functionalities. Her Ph.D. thesis is funded through a joint grant of the Brittany Region and of INRIA (2008-2011). Detailed CV.

Viet-Trung Tran is a Ph.D. student in Computer Science at ENS Cachan, working in the KerData Team at INRIA Rennes - Bretagne Atlantique. His research interests are parallel and distributed computing, distributed file systems, high-performance computing, distributed data storage. His recent Master's thesis focused on efficient use of BlobSeer, a large-scale data-management platform, as an underlying substrate for grid file systems. His Ph.D. thesis is funded by the French Ministry of Education (2009-2012). Detailed CV.

PARIS Team: permanent staff involved

The following staff from the PARIS Project-Team will be involved in this associate team:

Christine Morin (coordinator for PARIS) is a senior researcher at INRIA Rennes Bretagne Atlantique. Since July 2009, she has been the scientific leader of the INRIA PARIS Project-Team. She has been leading research activities on single system image OS for high performance computing in clusters, resulting in the Kerrighed Cluster OS, now developed in open source. She is the scientific coordinator of the XtreemOS Project which is a 4-year European integrated project started in June 2006. She is a co-founder of Kerlabs Start-Up, created in 2006 to exploit Kerrighed technology. Her research interests are in operating systems, distributed systems, fault tolerance, cluster, grid and cloud computing. Detailed CV.

Yvon Jégou received his engineering degree from Institut National des Sciences Appliquées (INSA) of Rennes, France and then his Ph.D. degree from the University of Rennes in 1979. He is a full-time INRIA researcher in the PARIS Project-Team. His research activities are focused on architecture, operating systems and compilation techniques for parallel and distributed computing. His current work is focused on the development of a DSM for the implementation of runtime systems on large clusters and for the management of data repositories on the Grid. In the recent past, he participated in the IST POP project on the implementation of an OpenMP runtime system for clusters using distributed shared memory (DSM). He is currently involved in the XtreemOS IP Project. He is also involved in the Grid'5000 Project and serves as the leader of the Grid'5000 Team at INRIA Rennes - Bretagne Atlantique.

PARIS Team: Ph.D. and Master students involved

Pierre Riteau is a Ph.D. student at University of Rennes 1. His research interests are in the areas of virtualization, cloud computing, distributed systems and high-performance computing. His current research focuses on efficient migration and storage of virtual environments for high-performance computing over large-scale distributed architectures. Detailed CV.
Jérôme Gallard is a Ph.D. student in the PARIS Project-Team at INRIA Rennes - Bretagne Atlantique. His thesis work contributes to the XtreemOS European Project, which aims at building and promoting a Linux-based operating system to support virtual organizations for next-generation Grids. His main research activities focus on reliable and efficient execution in grid environments. This includes: distributed operating systems (XtreemOS, Kerrighed - http://www.kerrighed.org/), virtualization, grid/cloud computing (Aladdin-Grid'5000 - http://www.grid5000.fr/). Detailed CV.
Sylvain Jeuland is a Ph.D. student at University of Rennes 1. During his Ph.D. thesis started in 2007, he has been involved in the XtreemOS project and has focused on designing security services for virtual organizations (VOs) in a Grid operating system. This includes mechanisms for user management in VOs, security certificates, authentication, authorization and job submissions to the Grid OS. Detailed CV.

Background of the collaboration between the teams

This collaboration started 2 years ago, after 2 visits by Gabriel Antoniu to PUB. We then set up a bilateral project co-funded by the CNRS and the Romanian Academy of Science. Since then, our joint activities have intensified, especially thanks to the strong involvement of French and Romanian Ph.D. and Master students through long internships and short mutual visits. Note that 3 Ph.D. students who are now members of INRIA's KerData Team previously graduated as engineers at PUB, in Bucharest: this substantially improves the efficiency of our contacts. Most of the Ph.D. students mentioned above will be involved in the associate team. A summary of the main stages of this collaboration is given below.

Bilateral collaboration: A bilateral project submitted in response to a call for projects jointly issued by the French CNRS and by the Romanian Academy of Science was accepted for the period 1/1/2008 - 31/12/2009. This project was called "GridDataViz: Visualisation et contrôle à distance pour une plate-forme de partage de données sur grille basée sur des techniques pair-à-pair". The outcome of this project (which led to a preliminary definition of an introspection layer for BlobSeer based on MonALISA) will serve as a starting point for this Associate Team.

Ph.D. theses: 3 Ph.D. candidates who graduated at PUB as engineers started their theses at INRIA Rennes - Bretagne Atlantique (Bogdan Nicolae in 2007, Alexandra Carpen-Amarie and Diana Moise in 2008). They are now members of the KerData team. We are now looking for funding possibilities for new, jointly supervised Ph.D. theses (co-tutelle) in the next years (2010, 2011).

Internships: 3 Ph.D. students from PUB were hosted in 2009 by KerData and PARIS at INRIA Rennes - Bretagne Atlantique: Alexandru Costan (KerData, one month, funded by the bilateral CNRS-Romanian Academy project); Eliana Tirsa (PARIS, 3 months) and Catalin Leordeanu (PARIS, 3 months), both co-funded by INRIA's Internships programme. Additionally, 2 Master students from PUB are hosted this year for 5 months for their Master research project: Mihaela Vlad (KerData) and Stefania Costache (PARIS), both co-funded by INRIA's Internships programme.

Short mutual visits: In November 2008, Alexandra Carpen-Amarie and Diana Moise made a one-week visit to PUB. They also participated in the HiPerGrid International Workshop, co-organized by our partner team at PUB, where they presented 3 papers. Gabriel Antoniu has visited the PUB team twice a year since 2007 for discussions related to the bilateral project and to the global development of our cooperation, including the preparation of this proposal for an associate team.

Relevant joint papers

The collaboration established between PUB and the KerData and PARIS Teams at INRIA Rennes - Bretagne Atlantique has begun to be productive in terms of co-authored research papers, especially thanks to the presence of 5 PUB interns (4 Ph.D. students, 1 Master student, see above) hosted by the two INRIA teams in 2009. Note that most of these results are very recent: they have been (or are in the process of being) published as INRIA Research reports. Some of them have already been submitted for publication to international conferences and workshops.

Alexandra Carpen-Amarie, Cai Jing, Alexandru Costan, Gabriel Antoniu, Luc Bougé. "Bringing Introspection Into the BlobSeer Data-Management System Using the MonALISA Distributed Monitoring Framework". INRIA Research Report RR-7043, September 2009. Submitted for publication. Available on HAL: http://hal.inria.fr/inria-00419978/.
Alexandra Carpen-Amarie, Cai Jing, Luc Bougé, Gabriel Antoniu, Alexandru Costan. "Monitoring the BlobSeer distributed data-management platform using the MonALISA framework". INRIA Research Report RR-7018, August 2009. Submitted for publication. Available on HAL: http://hal.inria.fr/inria-00410216/.
Eliana-Dina Tîrsa, Jérôme Gallard, Pierre Riteau, Yvon Jégou, Christine Morin. "Towards XtreemOS in the Clouds: Automatic deployment of the XtreemOS Grid Operating System in a Nimbus Cloud". Publication as an INRIA Technical Report in progress.
Catalin Leordeanu, Thomas Ropars, Christine Morin. "Failure Detection in Large-Scale Distributed Systems - Building a Failure History Storage". Publication as an INRIA Research Report in progress.

Other joint papers (less relevant to the focus of this project) include:

Alexandra Carpen-Amarie, Mugurel Ionut Andreica, Valentin Cristea. "An Algorithm for File Transfer Scheduling in Grid Environments". In Proc. 2nd International Workshop on High Performance Grid Middleware (HiPerGrid), Bucharest, Romania, November 2008. Available on HAL: http://hal.inria.fr/inria-00343793/.
Diana Moise, Izabela Moise, Florin Pop, Valentin Cristea. "Resource CoAllocation for Scheduling Tasks with Dependencies, in Grid". In Proc. 2nd International Workshop on High Performance Grid Middleware (HiPerGrid), Bucharest, Romania, November 2008. Available on HAL: http://hal.inria.fr/inria-00343770/.
Izabela Moise, Diana Moise, Florin Pop, Valentin Cristea. "Advance reservation of resources for task execution in grid environments". In Proc. 2nd International Workshop on High Performance Grid Middleware (HiPerGrid), Bucharest, Romania, November 2008. Available on HAL: http://hal.inria.fr/inria-00343780/.

3. Impact

The goal of this associate team is to set up a large-scale distributed data management system for cloud platforms, with self-reconfiguring, self-managing, self-adapting, self-healing autonomous capabilities, coping with the security constraints of multiple virtual organizations regarding administration, authentication and security restrictions.

To the best of our knowledge, no such system exists yet, though various projects have addressed parts of these requirements. In this project, we believe we can address all these requirements at once in an integrated approach, by leveraging the strong, specific and complementary expertise brought by each of the 3 partners:

Large-scale distributed data management for clouds and grids (KerData), demonstrated through the BlobSeer project;
Monitoring for large-scale, distributed infrastructures (PUB), demonstrated through the MonALISA project;
Distributed services and grid operating systems (PARIS), demonstrated through the XtreemOS project.

What is the expected impact on the domain?

Such a data management system is eagerly awaited worldwide and several research groups are competing to provide it. It is of course expected by end-users with high storage needs, but, more importantly, by the managers of the first cloud computing platforms which have recently been released and opened for public use. For instance, the Amazon Elastic Compute Cloud (EC2) is a service which provides resizable compute capacity in the cloud through a web service interface. It is designed to make web-scale computing easier for developers, but it only provides basic services regarding data management through the Simple Storage Service (S3). Amazon S3 provides a simple web services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web. Each object is stored in a single bucket and may only contain from 1 byte to 5 gigabytes of data. Each stored object is retrieved via a unique, developer-assigned key. Only basic authentication mechanisms are provided: objects can be made private or public and rights can be granted to specific users, but no other high-level mechanism is available.

Our goal is to leverage our combined skills and experience to do better than today's basic cloud storage services, such as S3:

Provide advanced storage mechanisms to the user. We aim to provide the service user with a virtually unlimited storage capacity, where objects of any size can be stored without any per-object, size-related restriction. To this end, we will rely on our BlobSeer BLOB management platform, which transparently splits objects into chunks and spreads them over the storage providers as evenly as possible, so that highly efficient parallel access to the chunks is possible. Moreover, versioning facilities will be exposed to the users, who will then be able to manage and access multiple versions of their data.
Equip the storage platform with autonomic behavior. We aim to provide the service platform manager with high-level monitoring tools, so that any poor behavior can be rapidly detected. We also aim to equip the platform with high-level self-managing tools, so that it can adapt to the user requests: load-balancing the distribution of the chunks over the storage providers, adjusting the number of such providers, etc. More importantly, we want to provide the user with a level of error resiliency by automatically replicating the stored data up to a certain level. This level will be specified by the user, but, more importantly, it can be autonomously raised or lowered by the platform itself, depending on the volatility of the nodes, as observed by the monitoring service.
Cope with platform security and administration. Finally, we aim to provide the user with high-level administration tools, so that the user can specify the intended level of security, depending on the sensitivity of the data. For instance, if the storage platform is run over a variety of administrative domains and virtual organizations, the user could specify that the data may only be stored on storage providers managed by a subset of the organizations. Also, the platform manager should be able to drive the platform over the various organizations with full transparency: single authentication is a must.
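
As an illustration of the first and third goals above, the following minimal Python sketch splits an object into fixed-size chunks and spreads them round-robin over the storage providers that belong to the organizations allowed by the user. It is not the BlobSeer implementation or its API: the names Provider, split_into_chunks and place_chunks, the fixed chunk size and the round-robin policy are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Provider:
    """Hypothetical storage provider, tagged with the organization managing it."""
    name: str
    organization: str


def split_into_chunks(data: bytes, chunk_size: int) -> List[bytes]:
    """Split an object into fixed-size chunks (the last one may be shorter)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def place_chunks(num_chunks: int,
                 providers: List[Provider],
                 allowed_orgs: Optional[Set[str]] = None) -> List[Provider]:
    """Assign each chunk to a provider, round-robin over the eligible ones.

    If allowed_orgs is given, only providers managed by those organizations
    are eligible (assumed security policy for this sketch).
    """
    eligible = [p for p in providers
                if allowed_orgs is None or p.organization in allowed_orgs]
    if not eligible:
        raise ValueError("no eligible storage provider")
    return [eligible[i % len(eligible)] for i in range(num_chunks)]


# Example: a 10-byte object, 4-byte chunks, data restricted to organization "orgA".
providers = [Provider("p1", "orgA"), Provider("p2", "orgB"), Provider("p3", "orgA")]
chunks = split_into_chunks(b"x" * 10, 4)
placement = place_chunks(len(chunks), providers, allowed_orgs={"orgA"})
```

Round-robin placement is only the simplest way to spread chunks evenly; a real platform would also take provider load and replication into account.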

What is the expected impact on the partners?

KerData is a young team, with only two permanent members at this time, launched with a specific focus: providing storage for very large distributed data in grids and clouds. The emphasis has been put on building a prototype for managing massive data at large scales (BlobSeer), to validate our initial ideas. Of course, rebuilding everything from scratch is out of the question. Reusing available technology as much as possible has been a priority. For instance, the whole system is based on the Boost C++ libraries. Regarding monitoring, being able to reuse the already proven MonALISA technology will save a lot of time, as it directly provides mechanisms that give BlobSeer a self-* behavior. Also, PUB has been a major source of brilliant Ph.D. students for us in prior years: this substantially helped the KerData Team to be set up in a very efficient and productive way.

The PUB Group has been developing MonALISA for a number of years in close collaboration with Caltech and CERN. This collaboration has mainly targeted the monitoring of very large distributed systems spread across continents. In contrast, the collaboration of PUB with KerData and PARIS opens new fields of application for MonALISA, whose integration into other systems makes them aware of their own behavior in order to take appropriate self-* actions. These new fields bring about very interesting research questions. For instance, the flexibility and modularity of the MonALISA design is of utmost importance in order to let it manage a very large spectrum of parameters.

The PARIS Project-Team has been working for many years in the field of operating systems for clusters and now for grids, with the objective to present the user with the illusion of a single computer (SSI, Single System Image), whatever the number of machines or virtual organizations. Highly sophisticated mechanisms have been implemented at the kernel level for this purpose. In contrast, KerData and PUB have always been working at the user level, with as little dependency on the (Linux) kernel as possible. Collaborating with these groups will open new perspectives for the PARIS Project-Team. Also, it will bring them the sophisticated MonALISA introspection technology, enabling their operating system to become not only fault-tolerant, but fully self-healing. Like the KerData Team, PARIS has benefited from a flow of brilliant interns coming from PUB.

The detailed list of our prior contact history (see above) makes it clear that all three teams have the objective interest, the scientific capacity and the strategic means to collaborate through this project. This is a major win-win project for all of us.

What is the expected impact on our respective institutions?

On a higher level, there is a growing interest of PUB in further enhancing the scientific collaboration with INRIA.

So far, the collaboration has developed in the framework of the GridDataViz Project supported by the French CNRS and the Romanian Academy of Science. This project focused on Grid monitoring and data management in Grid environments. This bilateral project facilitated closer contact between members of the two institutions and the extension of the collaboration. For instance, the application of 3 former PUB students for Ph.D. theses under the supervision of Luc Bougé and Gabriel Antoniu was a remarkable outcome of these contacts.

This fruitful experience is a major incentive to launch this new cooperation project. The emerging cloud computing environments provide a new context for all partners and bring about an opportunity to federate our research efforts. The setup of an INRIA Associate Team will contribute to the extension of the scientific exchanges, but it will also enable an increased number of Master and Ph.D. students from PUB to apply to internships and other academic programs at Rennes and be involved in this project.

In the other direction, the involvement of INRIA in projects with Romania in the broad field of grid and cloud computing has been very limited. Among the 80 Associate Teams currently funded by INRIA, only 3 have partners from Eastern Europe (2 with Russia, 1 with Ukraine) and none has a Romanian partner! Also, the contacts between INRIA and Romanian teams involved in the deployment of a national grid in Romania (the RoGrid Project) are rather recent: in Rennes we have set up bilateral cooperations with PUB and the Technical University of Cluj-Napoca. This global situation is rather unsatisfactory, as this huge potential for collaboration is clearly underused. There have been numerous contacts in the past with Romanian partners in the field of applied mathematics and related fields of Informatics (operational research, databases, etc.). It is clear that, in these fields, Romanian collaborations have been quite beneficial for INRIA. This associate team project can thus be a very good step in correcting this imbalance, at least in the area of distributed computing. We can expect the synergy created through such a team to lead to the submission of joint proposals for European projects in the future.

4. Miscellaneous

PUB Team: relevant publications

Ciprian Dobre, Florin Pop, Valentin Cristea. "Simulation Framework for the Evaluation of Dependable Distributed Systems". In Scalable Computing: Practice and Experience, Scientific International Journal for Parallel and Distributed Computing (SCPE), Volume 10, Number 1, pp. 13-23. http://www.scpe.org, 2009.

Eliana-Dina Tîrsa, Mugurel Ionut Andreica, Alexandru Costan. "Data Replication Techniques with Applications to the MonAlisa Distributed Monitoring System". In Proceedings of the IEEE International Conference on "Computer as a Tool" (EUROCON), pp. 339-346, Sankt-Petersburg, Russia, 18-23 May, 2009.

Florin Pop, Ciprian Dobre, Corina Stratan, Alexandru Costan, Valentin Cristea. "Dynamic Meta-Scheduling Architecture based on Monitoring in Distributed Systems". In Proceedings of The Third International Conference on Complex, Intelligent and Software Intensive System, Third International Workshop on P2P, Parallel, Grid and Internet computing - 3PGIC-2009 (CISIS'09), March 16-19, 2009, Fukuoka, Japan, Published by IEEE Computer Society.

Alexandru Costan, Ciprian Dobre, Ramiro Voicu, Valentin Cristea. "A Monitoring Architecture for High-Speed Networks in Large-Scale Distributed Collaborations". In the 7th IEEE International Symposium on Parallel and Distributed Computing, ISPDC 2008, July 1-5 2008 Krakow, Poland.

Florin Pop, Alexandru Costan, Ciprian Dobre, Corina Stratan, Valentin Cristea. "Monitoring of Complex Applications Execution in Distributed Dependable Systems". In the 8th International Symposium on Parallel and Distributed Computing, ISPDC 2009, July 1-3 2009 Lisbon, Portugal.

Florin Pop, Ciprian Dobre, Valentin Cristea. "Evaluation of Multi-Objective Decentralized Scheduling for Applications in Grid Environment". In Proceedings of 2008 IEEE 4th International Conference on Intelligent Computer Communication and Processing, pp. 231-238, August 28-30, 2008, Cluj-Napoca, Romania, Published by IEEE Computer Society.

KerData Team: relevant publications

Bogdan Nicolae, Gabriel Antoniu, and Luc Bougé. "Enabling high data throughput in desktop grids through decentralized data and metadata management: The BlobSeer approach". In Proc. 15th International Euro-Par Conference on Parallel Processing (Euro-Par ’09), volume 5704 of Lect. Notes in Comp. Science, pages 404–416, Delft, The Netherlands, August 2009. Springer-Verlag.

Viet Trung Tran, Gabriel Antoniu, Bogdan Nicolae, Luc Bougé and Osamu Tatebe. "Towards a Grid File System Based on a Large-Scale BLOB Management Service". In Proc. CoreGRID ERCIM Working Group Workshop on Grids, P2P and Service computing, held in conjunction with Euro-Par 2009, Delft, The Netherlands, August 2009.

Bogdan Nicolae, Gabriel Antoniu, and Luc Bougé. "BlobSeer: How to enable efficient versioning for large object storage under heavy access concurrency". In Proc. 2nd Workshop on Data Management in Peer-to-Peer Systems (DAMAP’2009), Saint Petersburg, Russia, March 2009. Held in conjunction with EDBT’2009.

Bogdan Nicolae, Gabriel Antoniu, Luc Bougé. "Enabling lock-free concurrent fine-grain access to massive distributed data: Application to supernovae detection". In Proc. International Conference CLUSTER 2008, pages 310-315, Tsukuba, Japan, September 2008.

Gabriel Antoniu, Jean-François Deverge, Sébastien Monnet. "How to bring together fault tolerance and data consistency to enable Grid data sharing". In Concurrency and Computation: Practice and Experience 18(13): 1705-1723 (2006).

PARIS Team: relevant publications

John Mehnert-Spahn, Thomas Ropars, Michael Schoettner, and Christine Morin. "The architecture of the XtreemOS grid checkpointing service". In Proc. of EuroPar 2009, LNCS, Delft, The Netherlands, August 2009. Springer.

Massimo Coppola, Yvon Jégou, Brian Matthews, Christine Morin, Luis Pablo Prieto, Oscar David Sanchez, Erica Yang, and Haiyan Yu. "Virtual organization support within a grid-wide operating system". In IEEE Internet Computing, 12(2):20-28, March 2008.

Christine Morin, Jérôme Gallard, Yvon Jégou, and Pierre Riteau. "Clouds: a new playground for the XtreemOS grid operating system". In Parallel Processing Letters, 2009. To appear.

Christine Morin. "XtreemOS: A grid operating system making your computer ready for participating in virtual organizations". In ISORC'07: Proceedings of the 10th IEEE International Symposium on Object and Component-Oriented Real-Time Distributed Computing, pages 393-402, Santorini Island, Greece, May 2007. IEEE Computer Society.

Emmanuel Jeanvoine, Louis Rilling, Christine Morin, and Daniel Leprince. "Using overlay networks to build operating system services for large-scale grids". In Proceedings of the 5th International Symposium on Parallel and Distributed Computing (ISPDC 2006), pages 191-198, Timisoara, Romania, July 2006.

II. PREVISIONS 2010
/ 2010 Forecast

Programme de travail
Work programme

Description du programme scientifique de travail (1 à 2 pages maximum)
/Description of the scientific work programme (maximum 1 to 2 pages)

Methodology used for task definition. Our work in 2010 will focus on Direction 1: Using BlobSeer for sharing application data in an IaaS. We have identified three main tasks, described below. The successful completion of these tasks relies on the substantial involvement of Ph.D. students from all partner teams. Therefore, we defined these tasks by first identifying the Ph.D. students involved, the object of their joint activities and the schedule of their visits. Note that the expected outcome of these tasks will also partly serve (from a technical point of view) the other two directions of the project, planned for the following years.

Task 1: Introducing self-adaptation in BlobSeer based on MonALISA

Ph.D. students involved: Alexandru Costan (PUB), Alexandra Carpen-Amarie (KerData)

Goals. The final goal of this project is to enable autonomic storage for cloud services. As a first milestone, we aim to introduce self-management and self-adaptation facilities in BlobSeer. A preliminary introspection layer has already been jointly defined this year by the PUB and KerData teams [20]. Based on advanced introspection mechanisms that we will build within this framework using MonALISA, we will target several features: automatic management of the replication degree used by the storage (data) providers, automatic load balancing through data migration from overloaded to underloaded data providers, removal of providers with poor communication links or poor performance, and automatic replacement of failed data providers.
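The adaptation features listed above can be illustrated with a minimal sketch. It is not the BlobSeer or MonALISA API: the ProviderStats record, the plan_adaptation function and the load thresholds are all hypothetical names and values chosen for illustration, and the monitoring snapshot is hard-coded rather than collected from a real system.

```python
from dataclasses import dataclass


@dataclass
class ProviderStats:
    """Hypothetical per-provider snapshot, as a monitoring layer might report it."""
    name: str
    load: float        # fraction of capacity in use, between 0.0 and 1.0
    alive: bool = True


def plan_adaptation(stats, high=0.8, low=0.3):
    """Return a list of (action, source, target) tuples.

    The `high` and `low` thresholds are illustrative assumptions.
    """
    actions = []
    # Failed providers are scheduled for replacement.
    for s in stats:
        if not s.alive:
            actions.append(("replace", s.name, None))
    alive = [s for s in stats if s.alive]
    overloaded = sorted((s for s in alive if s.load > high), key=lambda s: -s.load)
    underloaded = sorted((s for s in alive if s.load < low), key=lambda s: s.load)
    # Pair the most loaded provider with the least loaded one, and so on.
    for src, dst in zip(overloaded, underloaded):
        actions.append(("migrate", src.name, dst.name))
    return actions


snapshot = [
    ProviderStats("p1", 0.95),
    ProviderStats("p2", 0.10),
    ProviderStats("p3", 0.50),
    ProviderStats("p4", 0.00, alive=False),
]
plan = plan_adaptation(snapshot)
print(plan)
```

In this sketch the decision step is kept separate from both monitoring and execution, so that the same plan could be computed from any source of provider statistics.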

Main challenges and difficulties. Integrating BlobSeer with MonALISA has proven nontrivial, as demonstrated by our preliminary work carried out in 2009. Intrusiveness, fault tolerance and scalability are key issues: the monitoring system should seamlessly fulfill its function with a very large number of nodes, even in the presence of failures. A serious difficulty comes from the need to simultaneously address multiple properties, some of which are difficult to reconcile: self-protection, failure recovery and self-reconfiguration in response to changes in the environment, all while maintaining near-optimal performance. Existing approaches to these problems typically assume the existence of a performance model that allows optimizations or predictions of the observed behavior. However, creating performance models is inherently difficult and requires knowledge about the application environment. In addition, we are also interested in extending the adaptation strategies to support opportunistic process migrations within the cloud. This, however, requires the development and deployment of new MonALISA modules for dynamic monitoring of a wide range of parameters.

Human resources and organization. Alexandru Costan, Ph.D. student at PUB, is one of the main contributors to the MonALISA distributed monitoring framework. In June 2009, he visited the KerData team and worked together with Alexandra Carpen-Amarie, Ph.D. student in the KerData team, on the design and implementation of an introspection layer for BlobSeer, based on MonALISA. This work served as a validation of the monitoring mechanisms he developed during his Ph.D. thesis. In 2010, Alexandru will visit INRIA again and will work on the design of an upper layer built on this introspection layer: the self-adaptation engine. A Master student (whom we hope to recruit through INRIA's Internships program) will also contribute to this task.

Task 2: Security and client monitoring

Ph.D. students involved: Catalin Leordeanu (PUB), Alexandra Carpen-Amarie (KerData), Diana Moise (KerData), Sylvain Jeuland (PARIS).

Goals. The following situations can be detected through the analysis of the stored user activity logs: users breaking existing policies, abnormal client activity or incorrect client requests. The restrictions set by the provider must be enforced, so all attempts to break them must be detected. These restrictions can take various forms, for example allowing each client to use only certain resources or limiting the bandwidth during certain time periods. Strict monitoring of client activity makes it possible to detect when a client's actions fall outside these restrictions; the system can then restrict the actions of that user or temporarily suspend their access rights. The same analysis lets us determine what constitutes normal client activity, and thus detect unexpected events, such as a sudden increase in the number of requests. Such suspicious activity can reveal users who may have been compromised by an external attacker. In this case the attacker may have taken control of the client and may be using it to access unauthorised data or attempting to affect the system. A compromised client may also try to damage the system by sending large numbers of malformed or incomplete requests, as a form of Denial of Service attack by an external intruder masquerading as one of the clients. Advanced mechanisms must be developed to quickly detect such dangerous activity and isolate any client that may be a security risk. All of these objectives could be reached by developing a specific security system which will continually monitor and analyze the client activity and the state of the system to detect security threats, malicious activity or other kinds of intrusions. Through monitoring, the security system will define (and continuously refine) a suspicion level for each client. When security alerts occur, corresponding to suspicious behavior, the system will automatically take appropriate actions (e.g., ban the client or revoke some of its access rights, according to some predefined policy).
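As an illustration of the per-client suspicion level described above, here is a minimal sketch. The event names, weights, thresholds and decay factor are all assumptions made for the example; the proposal does not define them, and a real policy would be set by the provider.

```python
from collections import defaultdict

# Hypothetical event weights and thresholds (not defined in the proposal).
EVENT_WEIGHTS = {
    "malformed_request": 2.0,
    "policy_violation": 5.0,
    "request_burst": 1.0,
}
RESTRICT_AT = 5.0
BAN_AT = 10.0


class SuspicionTracker:
    """Keeps one suspicion level per client and maps it to an action."""

    def __init__(self, decay=0.5):
        self.decay = decay
        self.levels = defaultdict(float)

    def observe(self, client, event):
        """Raise the client's suspicion level and return the action to take."""
        self.levels[client] += EVENT_WEIGHTS.get(event, 0.0)
        return self.action_for(client)

    def tick(self):
        """Called periodically so that old incidents gradually fade."""
        for client in self.levels:
            self.levels[client] *= self.decay

    def action_for(self, client):
        level = self.levels[client]
        if level >= BAN_AT:
            return "ban"
        if level >= RESTRICT_AT:
            return "restrict"
        return "allow"


tracker = SuspicionTracker()
a1 = tracker.observe("c1", "malformed_request")
a2 = tracker.observe("c1", "policy_violation")
a3 = tracker.observe("c1", "policy_violation")
tracker.tick()
a4 = tracker.action_for("c1")
print(a1, a2, a3, a4)
```

The decay step is one possible way to "continuously refine" the level: a client that stops misbehaving drifts back towards normal status instead of being penalized forever.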

Main challenges and difficulties. A very important challenge related to the above goals is the management of the client's history, which must be stored in a secure, fault-tolerant and scalable manner. For this purpose, given the large number of nodes, a centralized approach does not seem appropriate. A possible solution is to develop a distributed security system where the policies and user logs are managed by a number of entities which are in constant communication. It could be based on the software architecture internally used by BlobSeer for distributed metadata storage. The development of detection methods for malicious activity in the context described above is very difficult. It will be necessary to create novel intelligent algorithms capable of defining and detecting both insecure client activity (according to some predefined pattern) and suspicious behavior (corresponding to previously unknown activity patterns). These novel methods must also meet performance and scalability requirements, in order to be efficiently used in BlobSeer.

Human resources and organization. Catalin Leordeanu, Ph.D. student at PUB, whose thesis focuses on security in distributed systems, will visit INRIA again for 3 months to work on this task and will interact with the Ph.D. students from KerData and PARIS. Afterwards, at PUB, one Romanian Master student will contribute to this task during their master research internship, under the supervision of Catalin.

Task 3: Deploying BlobSeer on an XtreemOS-enabled IaaS based on Nimbus

Ph.D. students involved: Eliana Tirsa (PUB), Alexandra Carpen-Amarie (KerData), Bogdan Nicolae (KerData), Pierre Riteau (PARIS), Jérôme Gallard (PARIS).

Goals. This task aims at enabling BlobSeer as a storage service for sharing data of applications running in a Nimbus-enabled IaaS. There are two main goals that need to be reached. First, we need to design and implement an IaaS client access interface that supports the deployment and management of a BlobSeer instance. This BlobSeer instance is used to share the application data among the VMs running the application. The client access interface must be integrated with the Nimbus cloud software (developed at Argonne National Lab). It should offer the same level of functionality as the client access interface offered by Nimbus for virtual machine deployment and management. Second, we need to design and implement an interface for accessing the BlobSeer data-sharing service for the application running inside the VM. This access interface must access the same BlobSeer instance from within any VM, regardless of the physical machine on which the VM is deployed.

Main challenges and difficulties. As regards deployment and management of BlobSeer instances, there are several aspects that need to be addressed. One such aspect is to define a security policy. In the simplest scenario, each client application is allowed to manage and access only its own BlobSeer instance. In a more complex scenario, access rights to a BlobSeer instance may be shared by multiple client applications. Another aspect is directly related to the integration with the Nimbus cloud software. Nimbus provides a client access interface that enables the control and monitoring of the deployed VMs. The interaction works as follows: the client listens for event notifications that are sent by a so-called workspace service and can react by sending commands to the same service. Providing a similar interface to control and monitor the BlobSeer instance requires reasoning about which commands and events are meaningful from the client point of view and then implementing a corresponding workspace service. With respect to the access interface to BlobSeer from inside a VM, several aspects are to be discussed as well. The access interface needs to know about the key actors of BlobSeer with which it must interact. These actors are defined in a special configuration file. Each VM must keep up with configuration changes: such changes may occur since a BlobSeer instance can be directly manipulated by the client at any time. Providing uniform and up-to-date BlobSeer configuration files across all VMs is not trivial.
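One possible way to keep the configuration uniform across VMs is to attach a version number to it, so that each VM can tell whether its local copy is stale. The sketch below only illustrates that idea: BlobSeerConfig, ConfigRegistry and VmAgent are hypothetical names, the actor addresses are placeholders, and nothing here reflects the actual BlobSeer configuration format or the Nimbus interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlobSeerConfig:
    """Hypothetical configuration record listing the actors a client must contact."""
    version: int
    actors: tuple


@dataclass
class ConfigRegistry:
    """Stand-in for whatever service would publish the current configuration."""
    current: BlobSeerConfig

    def update(self, actors):
        # Every change made by the client produces a new, higher version.
        self.current = BlobSeerConfig(self.current.version + 1, tuple(actors))


@dataclass
class VmAgent:
    """Per-VM agent that refreshes its local copy when it is out of date."""
    name: str
    local: BlobSeerConfig

    def sync(self, registry):
        """Return True if the local configuration had to be refreshed."""
        if registry.current.version > self.local.version:
            self.local = registry.current
            return True
        return False


registry = ConfigRegistry(BlobSeerConfig(1, ("node-a", "node-b")))
vm = VmAgent("vm-1", registry.current)
registry.update(["node-a", "node-c"])   # the client reconfigures the instance
changed = vm.sync(registry)             # the VM notices and refreshes
changed_again = vm.sync(registry)       # nothing new the second time
print(changed, changed_again, vm.local)
```

Whether VMs poll for changes, as here, or receive notifications pushed by a service is a design choice left open by the text above.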

Human resources and organization. Eliana Tirsa, Ph.D. student at PUB, already worked on integrating Nimbus with XtreemOS during her 3-month internship hosted by PARIS in 2009. She will visit the KerData and PARIS Teams for another 3 months in 2010 to work on this task together with the Ph.D. students of our INRIA teams. Additionally, two Master students will be involved. On the one hand, a Romanian Master student at PUB, co-advised by Eliana, will contribute to this task through their Master research project. On the other hand, at INRIA, a student from the local Master Program in Rennes, hosted by the KerData team for their research internship, will also be involved in this task.

Programme d'échanges avec budget prévisionnel
Exchanges schedule and estimated budget

Actions planned for FY 2010

Common actions

We plan to hold an N+N France-Romania Residential Workshop in Romania. It will last 3 days within one week, possibly in spring 2010. The subject will be "Towards autonomic storage environment for very large-scale infrastructures". This workshop will gather 9 members of the French partners and roughly as many members of the Romanian partner. Additional scientists may also be invited to join the workshop, for instance from Kate Keahey's group at ANL. The invited participants will be welcome to spend the rest of the week in Bucharest to hold specific technical discussions at PUB. We budget an overall additional cost of 1000 Euros for this action, excluding the traveling and accommodation expenses listed below.

French visits to the Romanian partner

For each visit, we budget 600 Euros for traveling expenses, plus 80-100 Euros a day for short visits. Longer visits of French students of Romanian origin receive a specific estimation, as hosting expenses can be kept low if accommodation is possible through personal links.

Gabriel Antoniu (Senior): 2 visits of 1 week each, 2000 Euros. Subject: one of the visits for the Workshop, the other one for project management
Luc Bougé (Senior): 1 visit of 1 week, 1000 Euros. Subject: Workshop + Project management
Christine Morin (Senior): 1 visit of 1 week, 1000 Euros. Subject: Workshop + Project management
Bogdan Nicolae (Ph.D.): 1 visit of 1 week, 1000 Euros. Subject: Workshop + work on Task 3
Alexandra Carpen-Amarie (Ph.D.): 1 long visit of 2 months: 2000 Euros (personal accommodation). Subject: all tasks. This long visit with a broad subject is important as her Ph.D. proposal is right in the focus of the collaboration.
Diana Moise (Ph.D.): 1 visit of 1 week, 1000 Euros. Subject: Workshop + work on Task 2
Viet-Trung Tran (Ph.D.): 1 visit of 1 week, 1000 Euros. Subject: Workshop + work on Task 1
Pierre Riteau (Ph.D.): 1 visit of 1 week, 1000 Euros. Subject: Workshop + work on Task 3
Jérôme Gallard (Ph.D.): 1 visit of 1 week, 1000 Euros. Subject: Workshop + work on Task 3

Romanian visits to French partners

We budget visits on the same financial basis in the opposite direction.

Valentin Cristea (Senior): 1 visit of 1 week, 1000 Euros. Subject: Project management
Nicolae Tapus (Senior): 1 visit of 1 week, 1000 Euros. Subject: Project management
Ciprian Dobre (Post-Doc): 1 visit of 2 months, 2000 Euros. Subject: Work on all tasks
Alexandru Costan (Ph.D.): 1 visit of 2 months, 2000 Euros. Subject: Work on Task 1
Catalin Leordeanu (Ph.D.): 1 visit of 2 months, 2000 Euros. Subject: Work on Task 2
Eliana Tirsa (Ph.D.): 1 visit of 2 months, 2000 Euros. Subject: Work on Task 3
Master Intern (to be defined): 1 visit of 2 months, 2000 Euros. Subject: Work on Task 3

1. ESTIMATION DES DÉPENSES EN MISSIONS INRIA VERS LE PARTENAIRE Estimated spending for missions of INRIA researchers abroad	Nombre de personnes Number of persons	Coût estimé Estimated cost
Chercheurs confirmés Senior researcher	3	4000
Post-doctorants Postdoctoral fellows	0
Doctorants Ph.D. students	4	7000
Stagiaires Interns	1
Autre (précisez) : Other (detail): Additional funding for Workshop	0	1000
Total	8	12000

2. ESTIMATION DES DÉPENSES EN INVITATIONS DES PARTENAIRES Estimated spending for invitations of Partner researchers in France	Nombre de personnes Number of persons	Coût estimé Estimated cost
Chercheurs confirmés Senior researcher	2	2000
Post-doctorants Postdoctoral fellows	1	2000
Doctorants Ph.D. students	3	6000
Stagiaires Interns	1	2000
Autre (précisez) : Other (detail):	0	0
Total	7	12000

2. Cofinancing

Is this collaboration already supported by INRIA, by the partner institution or by a third party (European project, National Science Foundation, etc)? Please indicate the related amount of funding.

Additional supports for traveling

French Embassy in Bucharest: We will apply for support for traveling expenses between France and Romania: 2,000 Euros.
University Rennes 1: We will apply for Research (BQR) support for this collaboration (International Program): 2,000 Euros

Additional support from the French site

Master internship in Rennes on the subject of the collaboration: 5 months, 5,000 Euros. This support is already committed through a grant from the International Internship Programme of ENS Cachan

Additional support from the Partner side

N+N workshop in Romania: PUB will organize and support the cost of the workshop. Only 1000 Euros have been provisioned on the French side for this operation to support an additional invited participant, as specified above.
Eiffel Scholarship: PUB will propose to set up a co-supervision (co-tutelle) for a Romanian Ph.D. student supported by a Romanian Ph.D. grant (4800 Euros/year). An application will be made to the Eiffel Doctoral Scholarship Program to support a 10-month visit to Rennes: 1400 Euros/month + additional support for traveling, etc. Total support from the Eiffel Program: 15,000 Euros.

3. Proposed budget

Indicate, in the table below, the estimated overall cost of your project and the budget requested from the DRI under this Associate Team (maximum 20 K€).

Comments	Amount
A. Coût global de la proposition (total des tableaux 1 et 2 : invitations, missions, ...) A. Global cost of the collaboration project	Mutual visits: 24,000
B. Cofinancements utilisés (financements autres que Equipe Associée) B. Cofinancing (other than Associate Team programme)	French Embassy: 2,000 University Rennes 1: 2,000
C. Additional supports not included in this application	ENS Cachan International Internship Programme: 5,000 Eiffel Doctoral Scholarship Program: 14,000
Financement "Équipe Associée" demandé (A.-B.) Funding from the Associate Team programme (maximum 20 000 €)	20,000
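The figures in the budget tables can be cross-checked against the visit lists with a short calculation. The sketch below copies the amounts (in Euros) from the lists above; it assumes that the requested funding is simply the global cost minus the cofinancing, as the "(A.-B.)" label suggests. The variable names are introduced here for illustration only.

```python
# Amounts in Euros, copied from the visit lists above.
french_visits = {
    "Gabriel Antoniu": 2000,
    "Luc Bougé": 1000,
    "Christine Morin": 1000,
    "Bogdan Nicolae": 1000,
    "Alexandra Carpen-Amarie": 2000,
    "Diana Moise": 1000,
    "Viet-Trung Tran": 1000,
    "Pierre Riteau": 1000,
    "Jérôme Gallard": 1000,
}
romanian_visits = {
    "Valentin Cristea": 1000,
    "Nicolae Tapus": 1000,
    "Ciprian Dobre": 2000,
    "Alexandru Costan": 2000,
    "Catalin Leordeanu": 2000,
    "Eliana Tirsa": 2000,
    "Master Intern": 2000,
}
workshop_extra = 1000
cofinancing = {"French Embassy": 2000, "University Rennes 1": 2000}

# Table 1: missions of INRIA researchers, including the workshop supplement.
missions = sum(french_visits.values()) + workshop_extra
# Table 2: invitations of partner researchers in France.
invitations = sum(romanian_visits.values())
# Row A is the sum of both tables; the request is A minus B (assumption).
global_cost = missions + invitations
requested = global_cost - sum(cofinancing.values())

print(missions, invitations, global_cost, requested)
```

With these inputs the totals match the tables: 12000 for each direction, 24,000 overall, and 20,000 requested, which is exactly the programme ceiling.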

References

[1] L. M. Vaquero, L. Rodero-Merino, J. Caceres, M. Lindner. "A break in the clouds: towards a cloud definition". SIGCOMM Comput. Commun. Rev. 39, 1 (Dec. 2008), 50-55.

[2] A. Lenk, M. Klems, J. Nimis, S. Tai, T. Sandholm. "What's inside the Cloud? An architectural map of the Cloud landscape". In Proc. ICSE Workshop on Software Engineering Challenges of Cloud Computing (CLOUD '09), 23 May 2009, pages 23-31.

[3] R. Buyya, Chee Shin Yeo, S. Venugopal. "Market-Oriented Cloud Computing: Vision, Hype, and Reality for Delivering IT Services as Computing Utilities". In Proc. 10th IEEE International Conference on High Performance Computing and Communications (HPCC '08), 25-27 September 2008, pages 5-13.

[4] The Amazon Elastic Compute Cloud: http://aws.amazon.com/ec2/

[5] The Nimbus project: http://workspace.globus.org/

[6] The Eucalyptus project: http://open.eucalyptus.com/

[7] J. Dean and S. Ghemawat. "MapReduce: simplified data processing on large clusters". Communications of the ACM, 51(1):107-113, 2008.

[8] Google App Engine http://code.google.com/appengine/

[9] Microsoft Azure http://www.microsoft.com/azure/default.mspx

[10] Google Docs: http://www.google.com/google-d-s/tour1.html

[11] Microsoft Office Live: http://www.officelive.com/

[12] Kate Keahey, Tim Freeman. "Science Clouds: Early Experiences in Cloud Computing for Scientific Applications". Cloud Computing and Its Applications 2008 (CCA-08), Chicago, IL. October 2008.

[13] Christine Morin. "XtreemOS: a Grid Operating System Making your Computer Ready for Participating in Virtual Organizations". IEEE International Symposium on Object/component/service-oriented Real-time distributed Computing (ISORC), Santorini Island, Greece, May 2007.

[14] The XtreemOS project: http://www.xtreemos.eu/

[15] Christine Morin, Jérôme Gallard, Yvon Jégou, and Pierre Riteau. "Clouds: a new playground for the XtreemOS grid operating system". Parallel Processing Letters, 2009. To appear. Published as INRIA Research Report No RR-6824, February 2009. Available on HAL: http://hal.inria.fr/inria-00358594_v1/

[16] The BlobSeer project: http://blobseer.gforge.inria.fr/

[17] Bogdan Nicolae, Gabriel Antoniu, Luc Bougé. "BlobSeer: How to Enable Efficient Versioning for Large Object Storage under Heavy Access Concurrency". Data Management in Peer-to-Peer Systems, St-Petersburg, Russia, 2009.

[18] I. Legrand, H. Newman, R. Voicu, et al. "MonALISA: An agent based, dynamic service system to monitor, control and optimize grid based applications". In Computing for High Energy Physics, Interlaken, Switzerland, 2004.

[19] The MonALISA project: http://monalisa.cern.ch/

[20] Alexandra Carpen-Amarie, Cai Jing, Alexandru Costan, Gabriel Antoniu, Luc Bougé. "Bringing Introspection Into the BlobSeer Data-Management System Using the MonALISA Distributed Monitoring Framework". INRIA Research Report No RR-7043, September 2009. Submitted for publication. Available on HAL: http://hal.inria.fr/inria-00419978/.

[21] Yvon Jégou, Stephane Lantéri, Julien Leduc, Melab Noredine, Guillaume Mornet, Raymond Namyst, Pascale Primet, Benjamin Quetier, Olivier Richard, El-Ghazali Talbi, and Touche Iréa. "Grid'5000: a large-scale and highly reconfigurable experimental grid testbed". International Journal of High Performance Computing Applications, 20(4):481-494, November 2006.

[22] The Grid'5000 project: http://www.grid5000.org/

[23] The Amazon Simple Storage Service: http://aws.amazon.com/s3/

[24] F. Hupfeld, T. Cortes, B. Kolbeck, E. Focht, M. Hess, J. Malo, J. Marti, J. Stender, E. Cesario. "XtreemFS - a case for object-based file systems in Grids". In Concurrency and Computation: Practice and Experience. Volume 20 Issue 8 June 2008.

[25] The XtreemFS project: http://www.xtreemfs.org/
