  Team SemLIS  

[PhD] Cotalis: Guided composition of tasks with logical information systems - application to data analysis workflows in bioinformatics.

Supervisors: Sébastien Ferré, Mireille Ducassé, Peggy Cellier.

In collaboration with the GenOuest bioinformatics platform and the Medical Computer Science Laboratory of the University of Rennes 1.

Funding: ARED/INSA, allocated October 1st, 2012; no longer available.

    In a number of domains, there is a collection of available tasks, each of which has inputs and outputs, and one is
interested in composing those tasks to define workflows. In bioinformatics, for example, complex data analyses
are performed by composing various elementary data analysis operations (e.g., search for homologous
sequences, transcription). Other examples are workflows in organizations where tasks are performed by humans,
software components (e.g., Web services) communicating through well-defined interfaces, or, at a
finer granularity, the composition of mathematical expressions from operators and values (e.g., in spreadsheets).
Users defining workflows face two difficulties: selecting tasks from collections that are often large, and expressing
valid task compositions in complex programming languages. Indeed, users are generally not computer scientists,
but experts in the application domain (e.g., biologists, data analysts). Manual approaches require users to
"program" the workflow in detail. Automatic approaches require a precise specification of the workflow, which is
itself difficult to express, and give less control to users. Therefore, we aim to develop a semi-automatic
and guided approach to task composition.  
    The LIS team has expertise in guided approaches for data exploration and authoring. Logical Information
Systems (LIS) let users build complex queries [1] and updates [2] over Semantic Web [3] data through guided
navigation, suggesting relevant pieces of queries and updates at each step. On the application side, the LIM
(Laboratoire d'Informatique Médicale de Rennes 1) has worked on modeling bioinformatics Web services
with Semantic Web technologies, in order to reason about them and their composition [4]. The GenOuest
Bioinformatics Platform [5] provides a growing collection of tools and services, which could serve as a case
study.  
    The objective of this PhD is to extend the guided approach of LIS to the composition of tasks, given a
model of those tasks expressed in Semantic Web technologies. The GenOuest platform and its collection of data
analysis tasks will serve as a real application and case study, but the results should also apply to other domains.
As a side effect, the approach should provide guided search for tasks according to user requirements.
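
    To make the idea of guided composition concrete, here is a minimal sketch in Python. It is not the LIS
approach itself, only an illustration under simplifying assumptions: each task is described by the data types
it consumes and the single data type it produces, and a task is suggested as soon as all of its inputs are
available. The task names, the type names, and the functions suggest and apply_task are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """A task described only by the data types it consumes and produces."""
    name: str
    inputs: frozenset
    output: str


# Hypothetical catalogue; task and type names are illustrative only.
TASKS = [
    Task("transcription", frozenset({"dna_sequence"}), "rna_sequence"),
    Task("translation", frozenset({"rna_sequence"}), "protein_sequence"),
    Task("homology_search", frozenset({"protein_sequence"}), "homologous_sequences"),
]


def suggest(tasks, available):
    """Return the names of tasks whose inputs are all currently available."""
    return [t.name for t in tasks if t.inputs <= available]


def apply_task(available, task):
    """Return the data types available once the task is added to the workflow."""
    return available | {task.output}


# Start from a DNA sequence and extend the workflow one step.
available = frozenset({"dna_sequence"})
print(suggest(TASKS, available))   # ['transcription']
available = apply_task(available, TASKS[0])
print(suggest(TASKS, available))   # ['transcription', 'translation']
```

    At each step the user only sees tasks that can validly be added, which is the kind of guidance the
proposal targets; a Semantic Web model of tasks would replace the plain string types used here.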

[1] S. Ferré, A. Hermann. Semantic Search: Reconciling Expressive Querying and Exploratory Search.
ISWC 2011.
[2] A. Hermann, S. Ferré, M. Ducassé. Guided creation and update of objects in RDF(S) bases. K-CAP
2011.
[3] P. Hitzler, M. Krötzsch, S. Rudolph. Foundations of Semantic Web Technologies. Chapman &
Hall/CRC, 2009.
[4] N. Lebreton (directed by O. Dameron). Réalisation d'ontologies de tâches et de domaine en
bioinformatique et utilisation de la sémantique pour l'appariement semi-automatique de services Web.
PhD thesis, Université de Rennes 1, 2010.
[5] http://www.genouest.org/