Publications of year 2021
Thesis
  1. Francesco Bariatti. Mining Tractable Sets of Graph Patterns with the Minimum Description Length Principle. PhD thesis, Université de Rennes 1, November 2021. [WWW]
    @phdthesis{Bar2021Thesis,
    author = {Bariatti, Francesco},
    title = {Mining {Tractable} {Sets} of {Graph} {Patterns} with the {Minimum} {Description} {Length} {Principle}},
    year = {2021},
    month = nov,
    school = {Université de Rennes 1},
    url = {https://hal.inria.fr/tel-03523742},
    
    }
    


  2. Aurélien Lamercerie. Principe de transduction sémantique pour l'application de théories d'interfaces sur des documents de spécification. PhD thesis, Université de Rennes 1, April 2021. Note: Doctoral thesis supervised by Benoit Caillaud and Annie Foret. [WWW]
    Abstract:
    The specification of technical systems is a complex and error-prone task. From a methodological point of view, the expected characteristics must be rigorously specified. In practice, specifications group together the desired properties in the form of a list of rules to be checked, called requirements. The challenge of this work is to build an analysis process that can be applied to textual documents written in a natural language such as English. The targeted implementation is an end-to-end automated processing chain integrating faculties of interpretation and reasoning on specification data. Precisely, we propose to study and experiment with how to link natural language statements to formal models that can be exploited in a theoretical framework. First, the principle of semantic transduction is advanced to extract and formalize natural language statements. In a second step, the algebraic properties of specification models are studied to define a theory to verify the consistency of requirements.

    @PHDTHESIS{lamer2021,
    url = "http://www.theses.fr/2021REN1S029",
    title = "Principe de transduction sémantique pour l'application de théories d'interfaces sur des documents de spécification",
    author = "Lamercerie, Aurélien",
    year = "2021",
    month = apr,
    school = "Université de Rennes 1",
    note = "Doctoral thesis supervised by Benoit Caillaud and Annie Foret",
    Abstract={The specification of technical systems is a complex and error-prone task. From a methodological point of view, the expected characteristics must be rigorously specified. In practice, specifications group together the desired properties in the form of a list of rules to be checked, called requirements. The challenge of this work is to build an analysis process that can be applied to textual documents written in a natural language such as English. The targeted implementation is an end-to-end automated processing chain integrating faculties of interpretation and reasoning on specification data. Precisely, we propose to study and experiment with how to link natural language statements to formal models that can be exploited in a theoretical framework. First, the principle of semantic transduction is advanced to extract and formalize natural language statements. In a second step, the algebraic properties of specification models are studied to define a theory to verify the consistency of requirements.}
    }
    


Articles in journal or book chapters
  1. Sébastien Ferré. Conceptual Navigation in Large Knowledge Graphs. In Rokia Missaoui, Leonard Kwuida, and Talel Abdessalem, editors, Complex Data Analysis with Formal Concept Analysis. Springer, 2021. Note: To appear. Keyword(s): knowledge graph, formal concept analysis, Graph-FCA, conceptual navigation.
    Abstract:
    A growing part of Big Data is made of knowledge graphs. Major knowledge graphs such as Wikidata, DBpedia or the Google Knowledge Graph count millions of entities and billions of semantic links. A major challenge is to enable their exploration and querying by end-users. The SPARQL query language is powerful but provides no support for exploration by end-users. Question answering is user-friendly but is limited in expressivity and reliability. Navigation in concept lattices supports exploration but is limited in expressivity and scalability. In this paper, we introduce a new exploration and querying paradigm, Abstract Conceptual Navigation (ACN), that merges querying and navigation in order to reconcile expressivity, usability, and scalability. ACN is founded on Formal Concept Analysis (FCA) by defining the navigation space as a concept lattice. We then instantiate the ACN paradigm to knowledge graphs (Graph-ACN) by relying on Graph-FCA, an extension of FCA to knowledge graphs. We continue by detailing how Graph-ACN can be efficiently implemented on top of SPARQL endpoints, and how its expressivity can be increased in a modular way. Finally, we present a concrete implementation available online, Sparklis, and a few application cases on large knowledge graphs.

    @InCollection{Fer2021cda_fca,
    author = {Sébastien Ferré},
    title = {Conceptual Navigation in Large Knowledge Graphs},
    booktitle = {Complex Data Analysis with Formal Concept Analysis},
    OPTcrossref = {},
    OPTkey = {},
    publisher = {Springer},
    year = {2021},
    editor = {Rokia Missaoui and Leonard Kwuida and Talel Abdessalem},
    OPTvolume = {},
    OPTnumber = {},
    OPTseries = {},
    OPTtype = {},
    OPTchapter = {},
    OPTpages = {17--44},
    OPTedition = {},
    OPTmonth = {},
    OPTaddress = {},
    note = {To appear},
    OPTannote = {},
    keywords = {knowledge graph, formal concept analysis, Graph-FCA, conceptual navigation},
    abstract = {A growing part of Big Data is made of knowledge graphs. Major knowledge graphs such as Wikidata, DBpedia or the Google Knowledge Graph count millions of entities and billions of semantic links. A major challenge is to enable their exploration and querying by end-users. The SPARQL query language is powerful but provides no support for exploration by end-users. Question answering is user-friendly but is limited in expressivity and reliability. Navigation in concept lattices supports exploration but is limited in expressivity and scalability. In this paper, we introduce a new exploration and querying paradigm, Abstract Conceptual Navigation (ACN), that merges querying and navigation in order to reconcile expressivity, usability, and scalability. ACN is founded on Formal Concept Analysis (FCA) by defining the navigation space as a concept lattice. We then instantiate the ACN paradigm to knowledge graphs (Graph-ACN) by relying on Graph-FCA, an extension of FCA to knowledge graphs. We continue by detailing how Graph-ACN can be efficiently implemented on top of SPARQL endpoints, and how its expressivity can be increased in a modular way. Finally, we present a concrete implementation available online, Sparklis, and a few application cases on large knowledge graphs.},
    
    }
    


  2. Sébastien Ferré. Application of Concepts of Neighbours to Knowledge Graph Completion. Data Science: Methods, Infrastructure, and Applications, 4:1-28, 2021. [doi:10.3233/DS-200030] Keyword(s): knowledge graph, link prediction, concepts of neighbours.
    Abstract:
    The open nature of Knowledge Graphs (KG) often implies that they are incomplete. Knowledge graph completion (aka. link prediction) consists in inferring new relationships between the entities of a KG based on existing relationships. Most existing approaches rely on the learning of latent feature vectors for the encoding of entities and relations. In general however, latent features cannot be easily interpreted. Rule-based approaches offer interpretability but a distinct ruleset must be learned for each relation. In both latent- and rule-based approaches, the training phase has to be run again when the KG is updated. We propose a new approach that does not need a training phase, and that can provide interpretable explanations for each inference. It relies on the computation of Concepts of Nearest Neighbours (C-NN) to identify clusters of similar entities based on common graph patterns. Different rules are then derived from those graph patterns, and combined to predict new relationships. We evaluate our approach on standard benchmarks for link prediction, where it gets competitive performance compared to existing approaches.

    @Article{Fer2020ds,
    author = {Sébastien Ferré},
    title = {Application of Concepts of Neighbours to Knowledge Graph Completion},
    journal = {Data Science: Methods, Infrastructure, and Applications},
    year = {2021},
    OPTkey = {},
    volume = {4},
    OPTnumber = {},
    pages = {1--28},
    OPTmonth = {},
    OPTannote = {},
    doi = {10.3233/DS-200030},
    keyword = {knowledge graph, link prediction, concepts of neighbours},
    abstract = {The open nature of Knowledge Graphs (KG) often implies that they are incomplete. Knowledge graph completion (aka. link prediction) consists in inferring new relationships between the entities of a KG based on existing relationships. Most existing approaches rely on the learning of latent feature vectors for the encoding of entities and relations. In general however, latent features cannot be easily interpreted. Rule-based approaches offer interpretability but a distinct ruleset must be learned for each relation. In both latent- and rule-based approaches, the training phase has to be run again when the KG is updated. We propose a new approach that does not need a training phase, and that can provide interpretable explanations for each inference. It relies on the computation of Concepts of Nearest Neighbours (C-NN) to identify clusters of similar entities based on common graph patterns. Different rules are then derived from those graph patterns, and combined to predict new relationships. We evaluate our approach on standard benchmarks for link prediction, where it gets competitive performance compared to existing approaches.} 
    }
    


Conference articles
  1. Hugo Ayats, Peggy Cellier, and Sébastien Ferré. Extracting Relations in Texts with Concepts of Neighbours. In Agnès Braud, Aleksey Buzmakov, Tom Hanika, and Florence Le Ber, editors, Formal Concept Analysis, LNCS 12733, pages 155-171, 2021. Springer. [WWW] [doi:10.1007/978-3-030-77867-5_10]
    Abstract:
    During the last decade, the need for reliable and massive Knowledge Graphs (KG) increased. KGs can be created in several ways: manually with forms or automatically with Information Extraction (IE), a natural language processing task for extracting knowledge from text. Relation Extraction is the part of IE that focuses on identifying relations between named entities in texts, which amounts to finding new edges in a KG. Most recent approaches rely on deep learning, achieving state-of-the-art performances. However, those performances are still too low to fully automatize the construction of reliable KGs, and human interaction remains necessary. This is made difficult by the statistical nature of deep learning methods that makes their predictions hardly interpretable. In this paper, we present a new symbolic and interpretable approach for Relation Extraction in texts. It is based on a modeling of the lexical and syntactic structure of text as a knowledge graph, and it exploits Concepts of Neighbours, a method based on Graph-FCA for computing similarities in knowledge graphs. An evaluation has been performed on a subset of TACRED (a relation extraction benchmark), showing promising results.

    @inproceedings{AyaCelFer2021icfca,
    author = {Hugo Ayats and Peggy Cellier and S{\'{e}}bastien Ferr{\'{e}}},
    editor = {Agn{\`{e}}s Braud and Aleksey Buzmakov and Tom Hanika and Florence Le Ber},
    title = {Extracting Relations in Texts with Concepts of Neighbours},
    booktitle = {Formal Concept Analysis},
    series = {LNCS 12733},
    pages = {155--171},
    publisher = {Springer},
    year = {2021},
    url = {https://doi.org/10.1007/978-3-030-77867-5\_10},
    doi = {10.1007/978-3-030-77867-5\_10},
    abstract = {During the last decade, the need for reliable and massive Knowledge Graphs (KG) increased. KGs can be created in several ways: manually with forms or automatically with Information Extraction (IE), a natural language processing task for extracting knowledge from text. Relation Extraction is the part of IE that focuses on identifying relations between named entities in texts, which amounts to finding new edges in a KG. Most recent approaches rely on deep learning, achieving state-of-the-art performances. However, those performances are still too low to fully automatize the construction of reliable KGs, and human interaction remains necessary. This is made difficult by the statistical nature of deep learning methods that makes their predictions hardly interpretable. In this paper, we present a new symbolic and interpretable approach for Relation Extraction in texts. It is based on a modeling of the lexical and syntactic structure of text as a knowledge graph, and it exploits \textit{Concepts of Neighbours}, a method based on Graph-FCA for computing similarities in knowledge graphs. An evaluation has been performed on a subset of TACRED (a relation extraction benchmark), showing promising results.},
    
    }
    


  2. Francesco Bariatti, Peggy Cellier, and Sébastien Ferré. GraphMDL+: interleaving the generation and MDL-based selection of graph patterns. In Chih-Cheng Hung, Jiman Hong, Alessio Bechini, and Eunjee Song, editors, ACM/SIGAPP Symp. Applied Computing (SAC), pages 355-363, 2021. ACM. [WWW] [doi:10.1145/3412841.3441917]
    Abstract:
    Graph pattern mining algorithms ease graph data analysis by extracting recurring structures. However, classic pattern mining approaches tend to extract too many patterns for human analysis. Recently, the GraphMDL algorithm has been proposed, which reduces the generated pattern set by using the Minimum Description Length (MDL) principle to select a small descriptive subset of patterns. The main drawback of this approach is that it needs to first generate all possible patterns and then sieve through their complete set. In this paper we propose GraphMDL+, an approach based on the same description length definitions as GraphMDL but which tightly interleaves pattern generation and pattern selection (instead of generating all frequent patterns beforehand), and outputs a descriptive set of patterns at any time. Experiments show that our approach takes less time to attain equivalent results to GraphMDL and can attain results that GraphMDL could not attain in feasible time. Our approach also allows for more freedom in the pattern and data shapes, since it is not tied to an external approach.

    @inproceedings{BarCelFer2021sac,
    author = {Francesco Bariatti and Peggy Cellier and S{\'{e}}bastien Ferr{\'{e}}},
    editor = {Chih{-}Cheng Hung and Jiman Hong and Alessio Bechini and Eunjee Song},
    title = {{GraphMDL+}: interleaving the generation and MDL-based selection of graph patterns},
    booktitle = {{ACM/SIGAPP} Symp. Applied Computing ({SAC})},
    pages = {355--363},
    publisher = {{ACM}},
    year = {2021},
    url = {https://doi.org/10.1145/3412841.3441917},
    doi = {10.1145/3412841.3441917},
    abstract = {Graph pattern mining algorithms ease graph data analysis by extracting recurring structures. However, classic pattern mining approaches tend to extract too many patterns for human analysis. Recently, the GraphMDL algorithm has been proposed, which reduces the generated pattern set by using the \emph{Minimum Description Length} (MDL) principle to select a small descriptive subset of patterns. The main drawback of this approach is that it needs to first generate all possible patterns and then sieve through their complete set. In this paper we propose GraphMDL+, an approach based on the same description length definitions as GraphMDL but which tightly interleaves pattern generation and pattern selection (instead of generating all frequent patterns beforehand), and outputs a descriptive set of patterns at any time. Experiments show that our approach takes less time to attain equivalent results to GraphMDL and can attain results that GraphMDL could not attain in feasible time. Our approach also allows for more freedom in the pattern and data shapes, since it is not tied to an external approach.},
    
    }
    


  3. Shridhar B. Dandin and Mireille Ducassé. ComVisMD -- Compact 2D Visualization of Multidimensional Data: Experimenting with Two Different Datasets. In Harish Sharma et al., editors, Intelligent Learning for Computer Vision, volume 61 of Lecture Notes on Data Engineering and Communications Technologies, pages 473-485, 2021. Springer Singapore. [WWW] Keyword(s): Tabular data, Visual representation design, Data analysis, Reasoning, Problem solving, Decision making, Data clustering, Aggregation.
    Abstract:
    Interpreting data with many attributes is a difficult issue. A simple 2D display, projecting two attributes onto two dimensions, is relatively easy to interpret but provides limited help to see multidimensional correlations. We propose a tool, ComVisMD, which displays, from a dataset, five dimensions in compact 2D maps. A map contains cells; each one represents an object from the dataset. In addition to the usual horizontal and vertical projections and the use of colors, we offer holes and shapes. In order to compact the display, we partition objects according to two dimensions, grouping values of each dimension into up to seven categories. In this paper, we present two case studies covering two different domains, a cricket player dataset and a heart disease dataset. The cricket dataset has 15 attributes and 2170 objects. We show how, using ComVisMD, correlations between variables can be found in an intuitive way. The heart disease dataset has 14 attributes and 297 objects. Blokh and Stambler, in the June 2015 issue of "Aging and Disease," state that individual attributes show little correlation with heart disease. Yet in combination the correlation improves dramatically. We show how ComVisMD helps visualize those multidimensional correlations between four attributes and heart disease diagnosis.

    @InProceedings{dandin2021,
    author="Dandin, Shridhar B. and Ducass{\'e}, Mireille",
    editor={Sharma, Harish and others},
    title="ComVisMD -- Compact 2D Visualization of Multidimensional Data: Experimenting with Two Different Datasets",
    booktitle="Intelligent Learning for Computer Vision",
    year="2021",
    series = {Lecture Notes on Data Engineering and Communications Technologies},
    volume = {61},
    publisher="Springer Singapore",
    pages="473--485",
    url = {https://hal.archives-ouvertes.fr/hal-03131685},
    KEYWORDS = {Tabular data ; Visual representation design ; Data analysis ; Reasoning ; Problem solving ; Decision making ; Data clustering ; Aggregation},
    HAL_ID = {hal-03131685},
    HAL_VERSION = {v1},
    abstract="Interpreting data with many attributes is a difficult issue. A simple 2D display, projecting two attributes onto two dimensions, is relatively easy to interpret but provides limited help to see multidimensional correlations. We propose a tool, ComVisMD, which displays, from a dataset, five dimensions in compact 2D maps. A map contains cells; each one represents an object from the dataset. In addition to the usual horizontal and vertical projections and the use of colors, we offer holes and shapes. In order to compact the display, we partition objects according to two dimensions, grouping values of each dimension into up to seven categories. In this paper, we present two case studies covering two different domains, a cricket player dataset and a heart disease dataset. The cricket dataset has 15 attributes and 2170 objects. We show how, using ComVisMD, correlations between variables can be found in an intuitive way. The heart disease dataset has 14 attributes and 297 objects. Blokh and Stambler, in the June 2015 issue of ``Aging and Disease,'' state that individual attributes show little correlation with heart disease. Yet in combination the correlation improves dramatically. We show how ComVisMD helps visualize those multidimensional correlations between four attributes and heart disease diagnosis.",
    
    }
    


  4. Sébastien Ferré. Adding Structure and Removing Duplicates in SPARQL Results with Nested Tables. In Further with Knowledge Graphs, pages 227-240, 2021. IOS Press. [doi:10.3233/SSW210047]
    Abstract:
    The results of a SPARQL query are generally presented as a table with one row per result, and one column per projected variable. This is an immediate consequence of the formal definition of SPARQL results as a sequence of mappings from variables to RDF terms. However, because of the flat structure of tables, some of the RDF graph structure is lost. This often leads to duplicates in the contents of the table, and difficulties to read and interpret results. We propose to use nested tables to improve the presentation of SPARQL results. A nested table is a table where cells may contain embedded tables instead of RDF terms, and so recursively. We introduce an automated procedure that lifts flat tables into nested tables, based on an analysis of the query. We have implemented the procedure on top of Sparklis, a guided query builder in natural language, in order to further improve the readability of its UI. It can as well be implemented on any SPARQL querying interface as it only depends on the query and its flat results. We illustrate our proposal in the domain of pharmacovigilance, and evaluate it on complex queries over Wikidata.

    @InProceedings{Fer2021semantics,
    author = {Sébastien Ferré},
    title = {Adding Structure and Removing Duplicates in SPARQL Results with Nested Tables},
    booktitle = {Further with Knowledge Graphs},
    year = {2021},
    pages = {227--240},
    publisher = {IOS Press},
    doi = {10.3233/SSW210047},
    abstract = {The results of a SPARQL query are generally presented as a table with one row per result, and one column per projected variable. This is an immediate consequence of the formal definition of SPARQL results as a sequence of mappings from variables to RDF terms. However, because of the flat structure of tables, some of the RDF graph structure is lost. This often leads to duplicates in the contents of the table, and difficulties to read and interpret results. We propose to use nested tables to improve the presentation of SPARQL results. A nested table is a table where cells may contain embedded tables instead of RDF terms, and so recursively. We introduce an automated procedure that lifts flat tables into nested tables, based on an analysis of the query. We have implemented the procedure on top of Sparklis, a guided query builder in natural language, in order to further improve the readability of its UI. It can as well be implemented on any SPARQL querying interface as it only depends on the query and its flat results. We illustrate our proposal in the domain of pharmacovigilance, and evaluate it on complex queries over Wikidata.},
    
    }
    


  5. Sébastien Ferré. Analytical Queries on Vanilla RDF Graphs with a Guided Query Builder Approach. In Troels Andreasen, Guy De Tré, Janusz Kacprzyk, Henrik Legind Larsen, Gloria Bordogna, and Slawomir Zadrozny, editors, Flexible Query Answering Systems, LNCS 12871, pages 41-53, 2021. Springer. [WWW] [doi:10.1007/978-3-030-86967-0_4]
    Abstract:
    As more and more data are available as RDF graphs, the availability of tools for data analytics beyond semantic search becomes a key issue of the Semantic Web. Previous work requires the modelling of data cubes on top of RDF graphs. We propose an approach that directly answers analytical queries on unmodified (vanilla) RDF graphs by exploiting the computation features of SPARQL 1.1. We rely on the NAF design pattern to design a query builder that completely hides SPARQL behind a verbalization in natural language, and that gives intermediate results and suggestions at each step. Our evaluations show that our approach covers a large range of use cases, scales well on large datasets, and is easier to use than writing SPARQL queries.

    @inproceedings{Fer2021fqas,
    author = {Sébastien Ferré},
    editor = {Troels Andreasen and Guy De Tr{\'{e}} and Janusz Kacprzyk and Henrik Legind Larsen and Gloria Bordogna and Slawomir Zadrozny},
    title = {Analytical Queries on Vanilla {RDF} Graphs with a Guided Query Builder Approach},
    booktitle = {Flexible Query Answering Systems},
    series = {LNCS 12871},
    pages = {41--53},
    publisher = {Springer},
    year = {2021},
    url = {https://doi.org/10.1007/978-3-030-86967-0\_4},
    doi = {10.1007/978-3-030-86967-0\_4},
    abstract = {As more and more data are available as RDF graphs, the availability of tools for data analytics beyond semantic search becomes a key issue of the Semantic Web. Previous work requires the modelling of data cubes on top of RDF graphs. We propose an approach that directly answers analytical queries on unmodified (vanilla) RDF graphs by exploiting the computation features of SPARQL~1.1. We rely on the NAF design pattern to design a query builder that completely hides SPARQL behind a verbalization in natural language, and that gives intermediate results and suggestions at each step. Our evaluations show that our approach covers a large range of use cases, scales well on large datasets, and is easier to use than writing SPARQL queries.},
    
    }
    







Disclaimer:

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.

The documents contained in these directories are made available by the contributing authors to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all other rights are retained by the authors and by the copyright holders, notwithstanding that they present their works here in electronic form. Persons copying this information must adhere to the terms and constraints covered by each author's copyright. These works may not be made available elsewhere without the explicit permission of the copyright holder.




Last modified: Fri Jan 28 13:48:41 2022
Author: ferre.


This document was translated from BibTeX by bibtex2html