Thesis of Pauline Wauquier

Task driven representation learning

Machine learning offers numerous algorithms for the tasks that arise from real-world prediction problems. To solve these tasks, most machine learning algorithms rely in some way on relationships between instances. Pairwise relationships between instances can be obtained by computing a distance between their vector representations. Given the available vector representation of the data, none of the commonly used distances is guaranteed to be representative of the task to be solved. In this work, we investigate the benefit of adapting the vector representation of the data to the distance so that the task is solved more effectively. We focus in particular on an existing graph-based algorithm for classification. We first introduce an algorithm that learns a mapping of the data into a representation space that allows an optimal graph-based classification. By projecting the data into a representation space in which the predefined distance is representative of the task, we aim to outperform the initial vector representation of the data when solving the task. A theoretical analysis of the introduced algorithm establishes the conditions that ensure an optimal classification. A set of empirical experiments allows us to evaluate the gain of the introduced approach and to temper the theoretical analysis.
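As a rough illustration of the idea summarised above, the following sketch shows how a change of representation can alter what the Euclidean distance captures. Everything in it is an assumption made for illustration and is not taken from the thesis: the toy data, the hand-picked linear mapping `W` (a learned mapping would replace it), and the 1-nearest-neighbour rule, which stands in here for a graph-based classifier.

```python
import numpy as np

def pairwise_distances(A, B):
    """Euclidean distance between every row of A and every row of B."""
    diff = A[:, None, :] - B[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def nearest_neighbour_labels(X_train, y_train, X_test):
    """Label each test point with the label of its closest training point."""
    dist = pairwise_distances(X_test, X_train)
    return y_train[dist.argmin(axis=1)]

# Toy data (assumed): the class depends only on the first feature,
# while the second feature has a much larger scale and dominates the distance.
X_train = np.array([[0.0, 0.0], [0.0, 2.0], [1.0, 8.0], [1.0, 10.0]])
y_train = np.array([0, 0, 1, 1])
X_test = np.array([[0.1, 8.5], [0.9, 1.5]])
y_test = np.array([0, 1])

# Hypothetical hand-picked mapping that shrinks the misleading second feature.
W = np.diag([1.0, 0.01])

raw_pred = nearest_neighbour_labels(X_train, y_train, X_test)
mapped_pred = nearest_neighbour_labels(X_train @ W, y_train, X_test @ W)

print("accuracy, initial representation:", (raw_pred == y_test).mean())
print("accuracy, mapped representation: ", (mapped_pred == y_test).mean())
```

On this contrived example the distance computed on the initial representation misleads the neighbour rule, while the same distance computed after the mapping does not. The mapping is fixed by hand here precisely because the sketch does not attempt to reproduce the learning algorithm studied in the thesis.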

Jury

Mr. Marc TOMMASI, Professor at Université de Lille 3. Thesis supervisor
Mr. Ludovic DENOYER, Professor at Université Pierre et Marie Curie, Paris
Ms. Elisa FREMONT, Associate Professor (Maître de conférences, HDR) at Université de Saint-Etienne
Ms. Laetitia JOURDAN, Professor at Université de Lille 1
Mr. Emmanuel VIENNET, Professor at Université de Paris XIII
Ms. Mikaela KELLER, Associate Professor (Maître de conférences) at Université de Lille 3
Ms. Frédérique GRIGOLATO, Head of Clic and Walk

Thesis of the MAGNET team, defended on 29/05/2017