Home
Scientific Axes
Members
Publications
Software
Collaborations
Activity Report
Seminars
Positions
Matthias Gallé

 

			
			

PhD Student

(with François Coste from Symbiose team (INRIA, France)

and Gabriel Infante Lopez from NLP team (FaMAF,Argentina)

 

Centre de Recherche INRIA Rennes - Bretagne Atlantique

Campus de Beaulieu
35042 RENNES Cedex - France

 

Tel: +33 2 99 84 75 23
Fax: +33 2 99 84 71 71
Mail: mgalle [at] gmail [dot] com

 

 

 

Goto: Research, Publications, Teaching, Software

 

I succesfully defended my thesis in Februrary 2011, and I am doing now a postdoc at XRCE with the MLDAT team. This page will probably become increasingly out-of-date.

 

 

Research Interests


 

 

Grammatical Inference & DNA sequences

 

I look what context-free grammars can tell us about DNA sequences. Specifically, I infer straight-line grammars to represent such sequences and see what they reveal on the underlying structure.

In our latest work we focus on the Smallest Grammar Problem, and we split the problem of searching for such grammars into two. This permits us to define a search space, and we define algorithms that traverse it. We improve state-of-the-art approximation algorithms for the Smallest Grammar Problem: our experiments reveal that not only the grammars we found are smaller, but that the structures they unveil are dramatically different.

 

Text Algorithms

 

We propose an algorithm to update a suffix array while replacing some of the occurrences of a word in the indexed text. This permits to accelerate SLG algorithms. More details and a some experiments can be found here.

More recently, I focused on a more specific class of repeats, called largest-maximal repeats and their relationship with a similar class for rigid patterns. Contact me if you want to know more about this...

 

Data Compression

 

Straight-line grammars have been used for compressing sequences in a framework called Grammar Based Codes. I extended this concept beyond context-freeness and have some promising results on compression of DNA sequences.

Publications

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

 

 

 


 

 

Reviewing Process

 

2011  

R. Carrascosa, F. Coste, M. Gallé, G. Infante-Lopez

The Smallest Grammar Problem as Constituents Choice and Minimal Grammar Parsing. [pdf]

 

 

Journals

2011  

R. Carrascosa, F. Coste, M. Gallé, G. Infante-Lopez

Searching for Smallest Grammars on Large Sequences and Application to DNA Accepted for publication at Journal of Discrete Algorithms [pdf]

2009  

M. Gallé, P. Peterlongo, F. Coste

In-Place Update of Suffix Array While Recoding Words. [pdf] [bib] [html]

International Journal of Foundations of Computer Science 20-6. pages 1025-1045.

Proceedings

2010

  M. Gallé

A New Tree Edit Distance for Structural Comparison of Sequences. [pdf] [bib]

Dagstuhl Seminar Proceedings: "Structure Discovery in Biology: Motifs, Networks & Phylogenies"

 

2010   R. Carrascosa, F. Coste, M. Gallé and G. Infante-Lopez

Choosing Word Occurrences for the Smallest Grammar Problem. [pdf] [slides] [bib]

In Proceedings of the 4th International Conference on Language and Automata Theory and Applications. pages 154-165.

2008  

M. Gallé, P. Peterlongo, F. Coste

In-Place Update of Suffix Array While Recoding Words preliminary version. [pdf] [slides] [bib]

Prague Stringology Club 2008

Theses

2011   M. Gallé

Searching for Compact Hierarchical Structures in DNA by means of the Smallest Grammar Problem [contact me for the pdf] [slides]

PhD thesis. U. Rennes 1

2007   M. Gallé

Efficient search of similar instances [pdf (spanish)] [slides (spanish)]

Masther thesis. Universidad Nacional de Córdoba

See also the final report of internship of:

2010

M Perrin

Compression de sèquences d'A.D.N. à base de grammaires minimales [pdf (fr] [slides (fr)] ENS, 3rd year

 

Teaching


 

Java (module B211 & B213) E-Miage (what is this?)     2009 - 2010
Complements of Java, ENSAI     2008
Java, INSA     2008

 

 

Software


If you want some really efficient software to calculate exact repetas, maximal, largest maximal or irreducible rigid patterns (besides any program used in one of my publications, contact me.

 

Symbiose Project Team - INRIA/Irisa © 2007 - 2008