[PhD] Geopalis: Exploration of patterns extracted with data mining techniques from relational and geographical data.
Supervisors: Peggy Cellier, Olivier Ridoux, Sébastien Ferré.
Funding: University of Rennes 1, allocated October 1st, 2012; no longer available.
Data mining techniques are used in order to discover emerging
knowledge (patterns) in databases [1].
The problem with such techniques is that there are, in general, too many
resulting patterns for a user to
explore them all by hand. Some methods try to reduce the number of
patterns without a priori pruning,
for example condensed representations [2,3] or constraints [4]. The
number of patterns remains,
nevertheless, high. Other approaches, based on a total ranking, show
the user the top-k
patterns with respect to a specific measure. However, those methods do
not take into account the
user's knowledge and the dependencies that exist between patterns. In
recent work [5], we have
proposed an application of Logical Concept Analysis (LCA) [6] to build
a generic framework to explore
patterns extracted by data mining techniques. The framework is based on
a data structure which
organizes the set of patterns, and provides operations on that
structure, namely navigation in the set of
patterns, selection of patterns of interest, and pruning of patterns
of no interest. The data structure
exploits the fact that patterns are naturally partially ordered. Users
can thus benefit from their
background knowledge to navigate through the patterns until their
goal(s) have been reached, without
a priori pruning.
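The three operations named above (navigation, selection, pruning) can be
illustrated with a minimal sketch. It is not the LCA-based framework of [5]:
it assumes, purely for illustration, that patterns are itemsets ordered by
set inclusion, and the names refinements, select and prune are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set

# Assumption: a pattern is an itemset, and p is more general than q when p is a subset of q.
Pattern = FrozenSet[str]


def refinements(p: Pattern, patterns: Iterable[Pattern]) -> List[Pattern]:
    """Navigation: the immediate specializations of p, i.e. the strictly more
    specific patterns with no other pattern strictly in between."""
    above = [q for q in patterns if p < q]
    minimal = [q for q in above if not any(r < q for r in above)]
    return sorted(minimal, key=sorted)


def select(patterns: Iterable[Pattern], wanted: Pattern) -> Set[Pattern]:
    """Selection: keep wanted and every pattern more specific than it."""
    return {q for q in patterns if wanted <= q}


def prune(patterns: Iterable[Pattern], unwanted: Pattern) -> Set[Pattern]:
    """Pruning: drop unwanted and every pattern more specific than it."""
    return {q for q in patterns if not unwanted <= q}


# Toy pattern set, invented for the example.
patterns = {
    frozenset({"a"}),
    frozenset({"b"}),
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
    frozenset({"a", "b", "c"}),
}

# One navigation step from {a}, then pruning of everything that contains b.
steps = [sorted(q) for q in refinements(frozenset({"a"}), patterns)]
kept = sorted(sorted(q) for q in prune(patterns, frozenset({"b"})))
print(steps)  # [['a', 'b'], ['a', 'c']]
print(kept)   # [['a'], ['a', 'c']]
```

Because the order is only partial, a navigation step returns several
incomparable refinements rather than a single next pattern, which is what
lets users steer the exploration with their background knowledge.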
The subject of the PhD is the exploration of patterns extracted
with data mining techniques from data
that are represented in a semantic web format. Close to problems
addressed by Inductive Databases
(IDB) [7], an expected goal is the definition of a method to help a
user to explore relevant patterns
directly from relational data by computing patterns on demand. The
approach will be adapted to handle
geographical data, for example GPS track logs, in order to discover
behavior patterns. The work will be
carried out in collaboration with Erwan Quesseveur, a researcher in geography
at the University of Rennes 2.
[1] U. M. Fayyad, G. Piatetsky-Shapiro, and P. Smyth. From data mining
to knowledge discovery: an
overview. In Advances in knowledge discovery and data mining. American
Association for Artificial
Intelligence, 1996.
[2] N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal. Discovering
frequent closed itemsets for
association rules. In Int. Conf. on Database Theory, pages 398–416.
Springer-Verlag, 1999.
[3] M. Plantevit and B. Crémilleux. Condensed representation of
sequential patterns according to
frequency-based measures. In Int. Symp. on Advances in Intelligent Data
Analysis, LNCS(5772).
Springer, 2009.
[4] J. Pei, J. Han, and L. V. S. Lakshmanan. Mining frequent itemsets
with convertible constraints. In
Int. Conf. on Data Engineering. IEEE Computer Society, 2001.
[5] P. Cellier, S. Ferré, M. Ducassé, and T. Charnois. Partial orders
and logical concept analysis to
explore patterns extracted by data mining. In Int. Conf. on Conceptual
Structures, LNCS. Springer,
2011.
[6] S. Ferré and O. Ridoux. An introduction to logical information
systems. Information Processing &
Management, 40(3):383–419, Elsevier, 2004.
[7] T. Imielinski and H. Mannila. A database perspective on knowledge
discovery. Communications of
the ACM, 39:58–64, 1996.