-
Mouhamadou Ba,
Sébastien Ferré,
and Mireille Ducassé.
Solving Data Mismatches in Bioinformatics Workflows by Generating Data Converters.
Transactions on Large-Scale Data and Knowledge-Centered Systems (TLDKS),
LNCS 9510:88-115,
2016.
Keyword(s): workflow,
data conversion,
bioinformatics,
type system.
Abstract:
Heterogeneity of data and data formats in bioinformatics entails mismatches between inputs and outputs of different services, making it difficult to compose them into workflows. To reduce those mismatches, bioinformatics platforms propose ad hoc converters, called shims. When shims are written by hand, they are time-consuming to develop, and cannot anticipate all needs. When shims are automatically generated, they miss transformations, for example data composition from multiple parts, or parallel conversion of list elements.
This article proposes to systematically detect convertibility from output types to input types. Convertibility detection relies on a rule system based on abstract types, close to XML Schema. Types make it possible to abstract data while precisely accounting for their composite structure. Detection is accompanied by an automatic generation of converters between input and output XML data. We show the applicability of our approach by abstracting concrete bioinformatics types (e.g., complex biosequences) for a number of bioinformatics services (e.g., blast). We illustrate how our automatically generated converters help to resolve data mismatches when composing workflows. We conducted an experiment on bioinformatics services and datatypes, using an implementation of our approach, as well as a survey with domain experts. The detected convertibilities and produced converters were validated as relevant from a biological point of view. Furthermore, the automatically produced graph of potentially compatible services exhibited a connectivity higher than with the ad hoc approaches. Indeed, the experts discovered unknown possible connections.
@article{ba2015TLDKS,
author = {Mouhamadou Ba and Sébastien Ferré and Mireille Ducassé},
title = {Solving Data Mismatches in Bioinformatics Workflows by Generating Data Converters},
journal = {Transactions on Large-Scale Data and Knowledge-Centered Systems (TLDKS)},
year = {2016},
volume = {LNCS 9510},
publisher = {Springer},
pages = {88--115},
keywords = {workflow, data conversion, bioinformatics, type system},
abstract = { Heterogeneity of data and data formats in bioinformatics entails mismatches between inputs and outputs of different services, making it difficult to compose them into workflows. To reduce those mismatches, bioinformatics platforms propose ad hoc converters, called shims. When shims are written by hand, they are time-consuming to develop, and cannot anticipate all needs. When shims are automatically generated, they miss transformations, for example data composition from multiple parts, or parallel conversion of list elements.
This article proposes to systematically detect convertibility from output types to input types. Convertibility detection relies on a rule system based on abstract types, close to XML Schema. Types make it possible to abstract data while precisely accounting for their composite structure. Detection is accompanied by an automatic generation of converters between input and output XML data. We show the applicability of our approach by abstracting concrete bioinformatics types (e.g., complex biosequences) for a number of bioinformatics services (e.g., blast). We illustrate how our automatically generated converters help to resolve data mismatches when composing workflows. We conducted an experiment on bioinformatics services and datatypes, using an implementation of our approach, as well as a survey with domain experts. The detected convertibilities and produced converters were validated as relevant from a biological point of view. Furthermore, the automatically produced graph of potentially compatible services exhibited a connectivity higher than with the ad hoc approaches. Indeed, the experts discovered unknown possible connections. }
}
-
Mireille Ducassé and Peggy Cellier.
Using Bids, Arguments and Preferences in Sensitive Multi-unit Assignments: A p-Equitable Process and a Course Allocation Case Study.
Journal of Group Decision and Negotiation,
25(6):1211-1235,
2016.
[WWW]
[doi:10.1007/s10726-016-9483-9]
Keyword(s): group decision support,
thinklet,
formal concept analysis,
logical information systems.
Abstract:
Bonus distribution in enterprises or course allocation at universities are examples of sensitive multi-unit assignment problems, where a set of resources is to be allocated among a set of agents having multi-unit demands. Automatic processes exist, based on quantitative information, for example bids or preference ranking, or even on lotteries. In sensitive cases, however, decisions are taken by persons also using qualitative information. At present, no multi-unit assignment system supports both quantitative and qualitative information. In this paper, we propose \muaplis, an interactive process for multi-assignment problems where, in addition to bids and preferences, agents can give arguments to motivate their choices. Bids are used to automatically make pre-assignments; qualitative arguments and preferences help decision makers break ties in a well-founded way. A group decision support system, based on Logical Information Systems, allows decision makers to handle bids, arguments and preferences in a unified interface. We say that a process is {\em p-equitable} for a property $p$ if all agents satisfying $p$ are treated equally. We formally demonstrate that \muaplis{} is p-equitable for a number of properties on bids, arguments and preferences. It is also Pareto-efficient and Gale-Shapley-stable with respect to bids. A successful course allocation case study is reported. It spans two university years. The decision makers were confident about the process and the resulting assignment. Furthermore, the students, even the ones who did not get all their wishes, found the process to be equitable.
@Article{ducasse2016,
Author={Mireille Ducassé and Peggy Cellier},
Title={Using Bids, Arguments and Preferences in Sensitive Multi-unit Assignments: A p-Equitable Process and a Course Allocation Case Study},
Pages={1211--1235},
Journal={Journal of Group Decision and Negotiation},
Year={2016},
Volume={25},
Publisher={Springer},
Number={6},
Keywords={group decision support, thinklet, formal concept analysis, logical information systems},
DOI={10.1007/s10726-016-9483-9},
url= {https://hal.archives-ouvertes.fr/hal-01485046},
Abstract={ Bonus distribution in enterprises or course allocation at universities are examples of sensitive multi-unit assignment problems, where a set of resources is to be allocated among a set of agents having multi-unit demands. Automatic processes exist, based on quantitative information, for example bids or preference ranking, or even on lotteries. In sensitive cases, however, decisions are taken by persons also using qualitative information. At present, no multi-unit assignment system supports both quantitative and qualitative information. In this paper, we propose \muaplis, an interactive process for multi-assignment problems where, in addition to bids and preferences, agents can give arguments to motivate their choices. Bids are used to automatically make pre-assignments; qualitative arguments and preferences help decision makers break ties in a well-founded way. A group decision support system, based on Logical Information Systems, allows decision makers to handle bids, arguments and preferences in a unified interface. We say that a process is {\em p-equitable} for a property $p$ if all agents satisfying $p$ are treated equally. We formally demonstrate that \muaplis{} is p-equitable for a number of properties on bids, arguments and preferences. It is also Pareto-efficient and Gale-Shapley-stable with respect to bids. A successful course allocation case study is reported. It spans two university years. The decision makers were confident about the process and the resulting assignment. Furthermore, the students, even the ones who did not get all their wishes, found the process to be equitable.}
}
-
Peggy Cellier,
Sébastien Ferré,
Annie Foret,
and Olivier Ridoux.
Exploration des Données du Défi EGC 2016 à l'aide d'un Système d'Information Logique.
In Cyril de Runz and Bruno Crémilleux, editors,
Journées Francophones Extraction et Gestion des Connaissances, EGC,
RNTI E-30,
pages 443-448,
2016.
Hermann-Éditions.
[WWW]
@inproceedings{CFFR2016egc,
author = {Peggy Cellier and Sébastien Ferré and Annie Foret and Olivier Ridoux},
title = {Exploration des Données du Défi {EGC} 2016 à l'aide d'un Système d'Information Logique},
booktitle = {Journées Francophones Extraction et Gestion des Connaissances, {EGC}},
pages = {443--448},
year = {2016},
OPTurl = {http://editions-rnti.fr/?inprocid=1002198},
url = {https://hal.inria.fr/hal-01253026},
editor = {Cyril de Runz and Bruno Crémilleux},
series = {{RNTI} {E-30}},
publisher = {Hermann-Éditions},
}
-
Sébastien Ferré.
An RDF Design Pattern for the Structural Representation and Querying of Expressions.
In Int. Conf. Knowledge Engineering and Knowledge Management,
LNAI 10024,
2016.
Springer.
[WWW]
Keyword(s): expression,
knowledge representation,
blank node,
querying,
RDF,
Turtle,
SPARQL,
mathematical formulas.
Abstract:
Expressions, such as mathematical formulae, logical axioms, or structured queries, account for a large part of human knowledge. It is therefore desirable to allow for their representation and querying with Semantic Web technologies. We propose an RDF design pattern that fulfills three objectives. The first objective is the structural representation of expressions in standard RDF, so that expressive structural search is made possible. We propose simple Turtle and SPARQL abbreviations for the concise notation of such RDF expressions. The second objective is the automated generation of expression labels that are close to usual notations. The third objective is the compatibility with existing practice and legacy data in the Semantic Web (e.g., SPIN, OWL/RDF). We show the benefits for RDF tools to support this design pattern with the extension of SEWELIS, a tool for guided exploration and editing, and its application to mathematical search.
@inproceedings{Fer2016ekaw:Expr,
author = {Sébastien Ferré},
title = {An RDF Design Pattern for the Structural Representation and Querying of Expressions},
booktitle = {Int. Conf. Knowledge Engineering and Knowledge Management},
year = {2016},
series = {LNAI 10024},
publisher = {Springer},
url={https://hal.inria.fr/hal-01405495},
keywords = {expression, knowledge representation, blank node, querying, RDF, Turtle, SPARQL, mathematical formulas},
abstract = {Expressions, such as mathematical formulae, logical axioms, or structured queries, account for a large part of human knowledge. It is therefore desirable to allow for their representation and querying with Semantic Web technologies. We propose an RDF design pattern that fulfills three objectives. The first objective is the structural representation of expressions in standard RDF, so that expressive structural search is made possible. We propose simple Turtle and SPARQL abbreviations for the concise notation of such RDF expressions. The second objective is the automated generation of expression labels that are close to usual notations. The third objective is the compatibility with existing practice and legacy data in the Semantic Web (e.g., SPIN, OWL/RDF). We show the benefits for RDF tools to support this design pattern with the extension of SEWELIS, a tool for guided exploration and editing, and its application to mathematical search.},
}
-
Sébastien Ferré.
Bridging the Gap Between Formal Languages and Natural Languages with Zippers.
In Harald Sack,
Eva Blomqvist,
Mathieu d'Aquin,
Chiara Ghidini,
Simone Paolo Ponzetto,
and Christoph Lange, editors,
The Semantic Web (ESWC). Latest Advances and New Domains,
LNCS 9678,
pages 269-284,
2016.
Springer.
[WWW]
[doi:10.1007/978-3-319-34129-3_17]
Keyword(s): semantic web,
formal language,
natural language,
zipper,
Montague grammar.
@inproceedings{Fer2016eswc,
author = {Sébastien Ferré},
title = {Bridging the Gap Between Formal Languages and Natural Languages with Zippers},
booktitle = {The Semantic Web ({ESWC}). Latest Advances and New Domains},
pages = {269--284},
year = {2016},
doi = {10.1007/978-3-319-34129-3_17},
editor = {Harald Sack and Eva Blomqvist and Mathieu d'Aquin and Chiara Ghidini and Simone Paolo Ponzetto and Christoph Lange},
series = {LNCS 9678},
publisher = {Springer},
url={https://hal.inria.fr/hal-01405488},
keywords = {semantic web, formal language, natural language, zipper, Montague grammar},
abstract = {The Semantic Web is founded on a number of Formal Languages (FL) whose benefits are precision, lack of ambiguity, and ability to automate reasoning tasks such as inference or query answering. This however poses the challenge of mediation between machines and users because the latter generally prefer Natural Languages (NL) for accessing and authoring knowledge. In this paper, we introduce the N<A>F design pattern based on Abstract Syntax Trees (AST), Huet's zippers and Montague grammars to zip together a natural language and a formal language. Unlike question answering, translation does not go from NL to FL, but as symbol N<A>F suggests, from ASTs (A) of an intermediate language to both NL (N<A) and FL (A>F). ASTs are built interactively and incrementally through a user-machine dialog where the user only sees NL, and the machine only sees FL.}
}
-
Sébastien Ferré.
SPARKLIS on QALD-6 Statistical Questions.
In Semantic Web Evaluation Challenge,
pages 178-187,
2016.
Springer.
Keyword(s): semantic web,
QALD,
statistical questions,
OLAP.
Abstract:
This work focuses on the statistical questions introduced by the QALD-6 challenge. With the growing amount of semantic data, including numerical data, the need for RDF analytics beyond semantic search becomes a key issue of the Semantic Web. We have extended SPARKLIS from semantic search to RDF analytics by covering the computation features of SPARQL (expressions, aggregations and groupings). We could therefore participate in the new task on statistical questions, and we report the achieved performance of SPARKLIS. Compared to other participants, SPARKLIS does not translate spontaneous questions by users, but instead guides users in the construction of a question. Guidance is based on the actual RDF data so as to ensure that built questions are well-formed, non-ambiguous, and inhabited with answers. We show that SPARKLIS enables superior results for both an expert user (94\% correct) and a beginner user (76\% correct).
@inproceedings{Fer2016qald6,
title={{SPARKLIS} on {QALD-6} Statistical Questions},
author={Ferr{\'e}, S{\'e}bastien},
booktitle={Semantic Web Evaluation Challenge},
pages={178--187},
year={2016},
publisher={Springer},
keywords={semantic web, QALD, statistical questions, OLAP},
abstract={This work focuses on the statistical questions introduced by the QALD-6 challenge. With the growing amount of semantic data, including numerical data, the need for RDF analytics beyond semantic search becomes a key issue of the Semantic Web. We have extended SPARKLIS from semantic search to RDF analytics by covering the computation features of SPARQL (expressions, aggregations and groupings). We could therefore participate in the new task on statistical questions, and we report the achieved performance of SPARKLIS. Compared to other participants, SPARKLIS does not translate spontaneous questions by users, but instead guides users in the construction of a question. Guidance is based on the actual RDF data so as to ensure that built questions are well-formed, non-ambiguous, and inhabited with answers. We show that SPARKLIS enables superior results for both an expert user (94\% correct) and a beginner user (76\% correct).},
}
-
Sébastien Ferré.
Semantic Authoring of Ontologies by Exploration and Elimination of Possible Worlds.
In Int. Conf. Knowledge Engineering and Knowledge Management,
LNAI 10024,
2016.
Springer.
[WWW]
Keyword(s): ontology authoring,
semantic web,
description logics,
OWL,
possible world explorer.
Abstract:
We propose a novel approach to ontology authoring that is centered on semantics rather than on syntax. Instead of writing axioms formalizing a domain, the expert is invited to explore the possible worlds of her ontology, and to eliminate those that do not conform to her knowledge. Each elimination generates an axiom that is automatically derived from the explored situation. We have implemented the approach in prototype PEW (Possible World Explorer), and conducted a user study comparing it to Protégé. The results show that more axioms are produced with PEW, without making more errors. More importantly, the produced ontologies are more complete, and hence more deductively powerful, because more negative constraints are expressed.
@inproceedings{Fer2016ekaw:Pew,
author = {Sébastien Ferré},
title = {Semantic Authoring of Ontologies by Exploration and Elimination of Possible Worlds},
booktitle = {Int. Conf. Knowledge Engineering and Knowledge Management},
year = {2016},
series = {LNAI 10024},
publisher = {Springer},
url={https://hal.inria.fr/hal-01405502},
keywords = {ontology authoring, semantic web, description logics, OWL, possible world explorer},
abstract = {We propose a novel approach to ontology authoring that is centered on semantics rather than on syntax. Instead of writing axioms formalizing a domain, the expert is invited to explore the possible worlds of her ontology, and to eliminate those that do not conform to her knowledge. Each elimination generates an axiom that is automatically derived from the explored situation. We have implemented the approach in prototype PEW (Possible World Explorer), and conducted a user study comparing it to Protégé. The results show that more axioms are produced with PEW, without making more errors. More importantly, the produced ontologies are more complete, and hence more deductively powerful, because more negative constraints are expressed.},
}
-
Sébastien Ferré and Peggy Cellier.
Graph-FCA in Practice.
In O. Haemmerlé,
G. Stapleton,
and C. Faron-Zucker, editors,
Int. Conf. Conceptual Structures (ICCS) - Graph-Based Representation and Reasoning,
LNCS 9717,
pages 107-121,
2016.
Springer.
[WWW]
[doi:10.1007/978-3-319-40985-6_9]
Keyword(s): formal concept analysis,
knowledge graph,
graph pattern,
algorithm.
Abstract:
With the rise of the Semantic Web, more and more relational data are made available in the form of knowledge graphs (e.g., RDF, conceptual graphs). A challenge is to discover conceptual structures in those graphs, in the same way as Formal Concept Analysis (FCA) discovers conceptual structures in tables. Graph-FCA has been introduced in a previous work as an extension of FCA for such knowledge graphs. In this paper, algorithmic aspects and use cases are explored in order to study the feasibility and usefulness of G-FCA. We consider two use cases. The first one extracts linguistic structures from parse trees, comparing two graph models. The second one extracts workflow patterns from cooking recipes, highlighting the benefits of n-ary relationships and concepts.
@inproceedings{FerCel2016iccs,
author = {Sébastien Ferré and Peggy Cellier},
title = {Graph-{FCA} in Practice},
booktitle = {Int. Conf. Conceptual Structures ({ICCS}) - Graph-Based Representation and Reasoning},
pages = {107--121},
year = {2016},
doi = {10.1007/978-3-319-40985-6_9},
editor = {O. Haemmerlé and G. Stapleton and C. Faron{-}Zucker},
series = {LNCS 9717},
publisher = {Springer},
url = {https://hal.inria.fr/hal-01405491},
keywords = {formal concept analysis, knowledge graph, graph pattern, algorithm},
abstract = {With the rise of the Semantic Web, more and more relational data are made available in the form of knowledge graphs (e.g., RDF, conceptual graphs). A challenge is to discover conceptual structures in those graphs, in the same way as Formal Concept Analysis (FCA) discovers conceptual structures in tables. Graph-FCA has been introduced in a previous work as an extension of FCA for such knowledge graphs. In this paper, algorithmic aspects and use cases are explored in order to study the feasibility and usefulness of G-FCA. We consider two use cases. The first one extracts linguistic structures from parse trees, comparing two graph models. The second one extracts workflow patterns from cooking recipes, highlighting the benefits of n-ary relationships and concepts.}
}
-
Clément Gautrais,
Peggy Cellier,
Thomas Guyet,
René Quiniou,
and Alexandre Termier.
Understanding Customer Attrition at an Individual Level: a New Model in Grocery Retail Context.
In Proceedings of the 19th International Conference on Extending Database Technology, EDBT 2016, Bordeaux, France, March 15-16, 2016,
pages 686-687,
2016.
[WWW]
@inproceedings{GautraisCGQT16,
author = {Clément Gautrais and Peggy Cellier and Thomas Guyet and René Quiniou and Alexandre Termier},
title = {Understanding Customer Attrition at an Individual Level: a New Model in Grocery Retail Context},
booktitle = {Proceedings of the 19th International Conference on Extending Database Technology, {EDBT} 2016, Bordeaux, France, March 15-16, 2016},
pages = {686--687},
year = {2016},
url = {https://hal.archives-ouvertes.fr/hal-01405172},
}
-
Pierre Maillot,
Sébastien Ferré,
Peggy Cellier,
Mireille Ducassé,
and Franck Partouche.
FORMULIS: Dynamic Form-Based Interface For Guided Knowledge Graph Authoring.
In 20th International Conference on Knowledge Engineering and Knowledge Management, Posters & Demonstrations,
2016.
[WWW]
Keyword(s): semantic web,
form,
knowledge authoring,
user interface.
Abstract:
Knowledge acquisition is a central issue of the Semantic Web. Knowledge cannot always be automatically extracted from existing data, thus contributors have to make efforts to create new data. In this paper, we propose FORMULIS, a dynamic form-based interface designed to make RDF data authoring easier. FORMULIS guides contributors through the creation of RDF data by suggesting fields and values according to the previously filled fields and the previously created resources.
@inproceedings{formulis16ekaw,
title={{FORMULIS}: Dynamic Form-Based Interface For Guided Knowledge Graph Authoring},
author={Maillot, Pierre and Ferré, Sébastien and Cellier, Peggy and Ducassé, Mireille and Partouche, Franck},
booktitle={20th International Conference on Knowledge Engineering and Knowledge Management, Posters \& Demonstrations},
year={2016},
url={https://hal.archives-ouvertes.fr/hal-01443319},
keywords = {semantic web, form, knowledge authoring, user interface},
abstract = {Knowledge acquisition is a central issue of the Semantic Web. Knowledge cannot always be automatically extracted from existing data, thus contributors have to make efforts to create new data. In this paper, we propose FORMULIS, a dynamic form-based interface designed to make RDF data authoring easier. FORMULIS guides contributors through the creation of RDF data by suggesting fields and values according to the previously filled fields and the previously created resources.},
}