[PhD] Cotalis: Guided composition of tasks with logical information systems - application to data analysis workflows in bioinformatics.
Supervisors: Sébastien Ferré, Mireille Ducassé, Peggy Cellier.
In collaboration with the GenOuest bioinformatics platform and the Medical Computer Science Laboratory of the University of Rennes 1.
Funding: ARED/INSA, allocated October 1st, 2012; no longer available.
In a number of domains, there is a collection of available tasks,
each of which has inputs and outputs, and one is
interested in composing those tasks to define workflows. In
bioinformatics, for example, complex data analyses
are performed by composing various elementary data analysis operations
(e.g., search for homologous
sequences, transcription). Other examples are workflows in
organizations where tasks are performed by humans,
software components (e.g., Web services) communicating through
well-defined interfaces, or, at a
finer granularity, the composition of math expressions with operators
and values (e.g., in spreadsheets). Users
face two difficulties when defining workflows: selecting
tasks from often large collections, and expressing
valid task compositions in complex programming languages. Indeed, users
are generally not computer scientists,
but experts in the application domain (e.g., biologists, data
analysts). Manual approaches require users to
"program" the workflow in detail. Automatic approaches require a
precise specification of the workflow, which is
itself difficult to express, and give less control to users. Therefore,
we aim to develop a semi-automatic
and guided approach to task composition.
The LIS team has expertise in guided approaches for data
exploration and authoring. Logical Information
Systems (LIS) let users build complex queries [1] and updates [2] over
Semantic Web [3] data through guided
navigation, suggesting relevant pieces of queries and updates at each
step. On the application side, the LIM
(Laboratoire d'Informatique Medicale de Rennes 1) has worked on
modeling bioinformatics Web services
with Semantic Web technologies, in order to reason about them and their
composition [4]. The GenOuest
Bioinformatics Platform [5] provides a growing collection of tools and
services, which could serve as a case
study.
The objective of this PhD is to extend the guided approach of LIS
to the composition of tasks, given a
model of those tasks in Semantic Web technologies. The GenOuest
platform and its collection of data
analysis tasks will serve as a real application and case study, but the
results should also apply to other domains.
As a side effect, the approach should provide guided search for tasks
according to user requirements.
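
To make the idea of guided composition concrete, here is a minimal sketch in Python. It is not the Cotalis approach itself: it assumes that each task is described only by flat sets of input and output data types (plain strings rather than a Semantic Web model), and the task names, type names, and helper functions are all hypothetical. At each step, the user is shown only the tasks whose inputs are already available, which is the kind of suggestion a guided approach would make.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """A task described only by the data types it consumes and produces."""
    name: str
    inputs: frozenset
    outputs: frozenset


def suggest_tasks(available, tasks):
    """Return the tasks whose inputs are all covered by the available data types."""
    return [t for t in tasks if t.inputs <= available]


def apply_task(available, task):
    """Return the data types available once the task has been added to the workflow."""
    if not task.inputs <= available:
        missing = sorted(task.inputs - available)
        raise ValueError(f"{task.name} is missing inputs: {missing}")
    return available | task.outputs


# Hypothetical catalogue; a real one would come from a model of the platform's tools.
TASKS = [
    Task("homology_search", frozenset({"dna_sequence"}), frozenset({"homologous_sequences"})),
    Task("transcription", frozenset({"dna_sequence"}), frozenset({"rna_sequence"})),
    Task("multiple_alignment", frozenset({"homologous_sequences"}), frozenset({"alignment"})),
]

# The user starts with a DNA sequence and picks one of the suggested tasks.
available = frozenset({"dna_sequence"})
step1 = [t.name for t in suggest_tasks(available, TASKS)]

available = apply_task(available, TASKS[0])  # the user chooses homology_search
step2 = [t.name for t in suggest_tasks(available, TASKS)]
```

In this sketch, multiple_alignment is only suggested once homology_search has produced its output. Matching on exact type names is a deliberate simplification; reasoning over a richer task model would presumably replace the subset test.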
[1] S. Ferré, A. Hermann. Semantic Search: Reconciling Expressive
Querying and Exploratory Search.
ISWC 2011.
[2] A. Hermann, S. Ferré, M. Ducassé. Guided creation and update of
objects in RDF(S) bases. K-CAP
2011.
[3] P. Hitzler, M. Krötzsch, S. Rudolph. Foundations of Semantic Web
Technologies. Chapman &
Hall/CRC, 2009.
[4] N. Lebreton (directed by O. Dameron). Réalisation d'ontologies de
tâches et de domaine en
bioinformatique et utilisation de la sémantique pour l'appariement
semi-automatique de services Web.
PhD thesis, Université de Rennes 1, 2010.
[5] http://www.genouest.org/