Important Dates

Submission Deadline:
~~May 22, 2015~~
June 2, 2015
Notification of Acceptance:
June 30, 2015
Early Registration:
July 17, 2015
Informal Proceedings:
July 31, 2015
Camera Ready:
October 2, 2015

News

19/8/2015: Luc Bougé confirmed as keynote speaker!
18/8/2015: The program is now online
20/5/2015: The submission deadline was extended!

4th Workshop on Big Data Management in Clouds

in conjunction with Euro-Par 2015

Welcome

The fourth edition of the Workshop on Big Data Management in Clouds will be held in Vienna, Austria. BigDataCloud 2015 follows the successful previous editions held in conjunction with Euro-Par. Its goal is to bring together the data management and Clouds / Grids / P2P communities in order to complement the Big Data handling issues with a comprehensive system / infrastructure perspective.

Workshop Program

The workshop will take place in Room EI5.

09h00 - 10h15 Session 1, Chair: Frédéric Desprez (Inria / LIP ENS Lyon)

09:00 - 09:15 Frédéric Desprez, Alexandru Costan. Workshop Opening
09:15 - 09:45 Florian Klein, Kevin Beineke, Michael Schöttner (Heinrich-Heine-Universität Düsseldorf). Distributed Range-Based Meta-Data Management for an In-Memory Storage
09:45 - 10:15 Bartosz Kryza, Jacek Kitowski (AGH University of Science and Technology). File-less Approach to Large Scale Data Management

10h30 - 11h00 Coffee Break

11h00 - 12h30 Session 2, Chair: Alexandru Costan (IRISA / INSA Rennes)

11:00 - 12:00 Keynote: Luc Bougé - Big Data Computing in Distributed, Very-large Scale Clouds: from Execution Models to Programming Models
12:00 - 12:30 Hiroki Ohtsuji, Osamu Tatebe (University of Tsukuba). Network-based Data Processing Architecture for Reliable and High-performance Distributed Storage System

Invited Talk

Prof. Luc Bougé, ENS Rennes / IRISA.

Title: Big Data Computing in Distributed, Very-large Scale Clouds: from Execution Models to Programming Models

Abstract: A major question for today's BigData applications is the programming model. The impressive infrastructures of geographically distributed, interconnected clouds are of no use if there is no matching model to control and program them. This situation is not new, as it has already occurred several times within the 70-year history of computing. A disruptive advance regarding the execution model, for instance, clusters of parallel processors, triggers a matching advance in the programming model, say, message passing and MPI. If the advances in execution models are most often triggered by new advances in technology, the matching advances in programming models are often based on earlier theoretical studies, sometimes rather overlooked in their time. For instance, the origin of the message-passing model of MPI can be traced back to Hoare's CSP and Occam.

However, today's situation is fascinatingly original, as we are facing a double disruption. First, because the scale has reached unheard-of levels, and second, because the focus has moved from control to data. In this talk, we will present a synthetic overview of this general rule, focusing on the interaction between the progress in the execution model and the progress in the programming model. We will apply this analysis to the current situation, focusing on the Map-Reduce programming model for very-large scale Big Data applications. We will show that the origin of the Map-Reduce programming model can be traced back to the theory of parallel skeletons, introduced by Murray Cole in 1989, and BSP, introduced by Leslie Valiant and Bill McColl in 1990. We will show that Map-Reduce is, however, only one possible answer among many others, with specific strengths and weaknesses, and that the future is wide open for other possibilities.

Bio: Luc studied pure mathematics in Paris, and then turned to Informatics for his PhD Thesis. He worked at CNRS with Krzysztof Apt on the semantics of parallel programming languages (Hoare’s CSP language), which was the subject of his Habilitation Thesis in 1987. He was appointed to a full professor position at ENS Lyon in 1990. At the LIP laboratory, he worked in the team of Michel Cosnard and Yves Robert on the semantics of high-performance data-parallel languages (C*, HPF) and their implementation on distributed clusters, and on high-performance multithreading (PM2) and zero-copy communication (Madeleine) libraries. He moved to ENS Cachan/Rennes in 2001, where he joined the IRISA laboratory and the local INRIA Research Center in the team of Thierry Priol. In 2009, he co-founded the KerData team together with Gabriel Antoniu to work specifically on the management of very large-scale data on distributed platforms such as clusters, grids and clouds. Luc is currently serving part-time as a Program Officer in the French National Research Agency (ANR). He is also serving as the Vice-Chair of the Steering Committee of the Euro-Par conference series.

Workshop Description

As data volumes increase at exponential speed in more and more application fields of science, the challenges posed by handling Big Data gain increasing importance. Large scientific experiments, such as climate modelling, genome mapping, and high-energy physics simulations, generate data volumes reaching petabytes per year, further used for real-time or offline processing. Initially designed for powerful and expensive supercomputers, such applications have seen increasing adoption on clouds, exploiting their elasticity and economic model.

However, running such applications in an efficient fashion on clouds is challenging. One such open challenge is how to handle this “data deluge”. Sharing, disseminating and analyzing large data sets has become a critical issue despite the deployment of petascale computing systems, and optical networking speeds reaching up to 100 Gbps. While Map/Reduce covers a large fraction of the development space, there are still many applications that are better served by other models and systems. In such a context, we need to embrace new programming models, scheduling schemes, hybrid infrastructures and scale out of single datacenters to geographically distributed deployments in order to cope with these new challenges effectively.

The BigDataCloud workshop provides a platform for the dissemination of recent research efforts that explicitly aim at addressing these challenges. It supports the presentation of advanced solutions for the efficient management of Big Data in the context of Cloud computing, new development and deployment efforts in running data-intensive computing workloads. In particular, we are interested in how the use of Cloud-based technologies can meet the data intensive scientific challenges of HPC applications that are not well served by the current supercomputers or grids, and are being ported to Cloud platforms. The goal of the workshop is to support the assessment of the current state, introduce future directions, and present architectures and services for future Clouds supporting data intensive computing.

Call for Papers

Formats: PDF TXT PPT

Workshop Topics

The BigDataCloud workshop calls for contributions that address fundamental research and system issues in Cloud data management including but not limited to the following:

Cloud storage architectures for Big Data
Reliability of data intensive applications and services running on the Cloud
Query processing and indexing in Cloud computing systems
Data privacy and security in Clouds
Data-intensive computing on hybrid infrastructures (Grids/Clouds/P2P)
Cloud storage resource management
Data-intensive Cloud-based applications
Content delivery networks using storage Clouds
Data intensive scalable computing on Clouds
Data management within and across multiple geographically distributed data centers
Data handling in MapReduce based computations
Data management in HPC Clouds
Advanced programming models for IaaS, PaaS and SaaS
Elasticity for Cloud data management systems
Self-* and adaptive mechanisms
Many-Task Computing in the Cloud
Performance evaluation of Cloud environments and technologies
Event streaming and real-time processing on Clouds
Energy-efficiency for BigData in Clouds

Organizing Committee

Workshop Co-Chairs

Alexandru Costan, IRISA / INSA Rennes, France
Frédéric Desprez, Inria / ENS Lyon, France

Program Committee

Gabriel Antoniu, Inria, France
Luc Bougé, ENS Rennes, France
Shadi Ibrahim, Inria, France
Olivier Nano, Microsoft Research ATLE, Germany
Bogdan Nicolae, IBM Research, Ireland
Maria S. Pérez, Universidad Politécnica de Madrid, Spain
Florin Pop, University Politehnica of Bucharest, Romania
Anna Queralt, Barcelona Supercomputing Center, Spain
Leonardo Querzoni, University of Rome La Sapienza, Italy
Balaji Subramaniam, Virginia Tech, USA
Domenico Talia, University of Calabria, Italy
Osamu Tatebe, University of Tsukuba, Japan
Radu Tudoran, Huawei European Research Center, Germany
Cristian Zamfir, EPFL, Switzerland

Submission Guidelines

Authors are invited to submit research and application papers not exceeding 12 pages following the Springer LNCS format. The LNCS LaTeX style can be downloaded from the Springer website.

We solicit the submission of academic workshop papers representing original, previously unpublished work. Submitted papers will be carefully evaluated based on originality, significance, technical soundness and clarity of exposition. Papers should be prepared as PDF files and submitted electronically to the BigDataCloud 2015 online submission system. Submission of a paper implies that, should the paper be accepted, at least one of the authors must register and present the paper at the workshop.

Submission page: EuroPar 2015 WorkShops - EasyChair (BigDataCloud Track)

Accepted papers that are presented at the workshop will be published in a revised form in a special Euro-Par Workshop Volume in the Lecture Notes in Computer Science (LNCS) series after the Euro-Par conference.

Previous Workshops

BigDataCloud 2014
BigDataCloud 2013
BigDataCloud 2012
CGWS 2012
CGWS 2011
CGWS 2010
CGWS 2009