Maude Pupin

Associate Professor (Maître de conférences) in Computer Science
Université Lille 1, CRIStAL laboratory, CNRS, Inria Lille Nord Europe
Bâtiment M3 extension, 59650 Villeneuve d'Ascq
+33 (0)3 28 77 85 55
Current research projects

I work on bioinformatics for nonribosomal peptides (NRPs). These are chemical compounds synthesized by huge enzymes that select not only amino acids but also lipids or carbohydrates, and link them by peptide and non-peptide bonds. A few NRPs already serve as drugs, and their capacity to provide new drugs or other industrial products is believed to be under-exploited. In 2006, I initiated a close collaboration with the ProBioGEM lab, a microbiology laboratory specialized in this topic, to create bioinformatics databases and tools that contribute to discovering new NRPs and to better understanding their activity. We are one of the very few teams in the world with an activity dedicated to NRP bioinformatics, for which we have gained international recognition.
We created NORINE, a unique and freely available resource dedicated to NRPs [1]. NORINE contains more than 1100 peptides, collected from the scientific literature or submitted directly by researchers. Each month it receives 130 unique visitors from all continents, who run around 10 queries each. This recognition by our peers led the worldwide Protein Data Bank, which maintains the single archive of macromolecular structural data, to select our entry identifiers as external references, alongside the UniProt accession codes for gene product sequences.
Nonribosomal peptides have specific properties compared with peptides produced by the universal pathway (DNA transcribed into RNA, then translated into a protein or peptide). They can incorporate more than 500 different building blocks, including amino acids, lipids and carbohydrates. We call these building blocks monomers. NRP structures are not only linear but can also contain cycles and/or branches. We therefore chose to represent NRPs as monomeric structures, that is, undirected graphs whose nodes are labeled by monomer names and whose edges correspond to the bonds between monomers. We designed efficient algorithms, based on a variant of the compatibility graph, to search for a structural pattern [2] or to compare an NRP with others.
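
To make the graph representation concrete, here is a minimal Python sketch of a monomeric structure and a pattern search. It is only an illustration: it uses a naive backtracking search, not the compatibility-graph variant mentioned above, and the function names (make_graph, contains_pattern) and the example peptide are hypothetical.

```python
def make_graph(labels, bonds):
    """Build an undirected labeled graph.

    labels: monomer names, indexed by node number.
    bonds:  pairs (i, j) of node numbers linked by a bond.
    """
    adjacency = {i: set() for i in range(len(labels))}
    for i, j in bonds:
        adjacency[i].add(j)
        adjacency[j].add(i)
    return {"labels": list(labels), "adj": adjacency}


def contains_pattern(peptide, pattern):
    """Return True if every node and bond of `pattern` can be mapped onto
    distinct nodes and bonds of `peptide` with identical monomer labels."""
    p_labels, p_adj = pattern["labels"], pattern["adj"]
    t_labels, t_adj = peptide["labels"], peptide["adj"]
    mapping = {}  # pattern node -> peptide node

    def extend(k):
        if k == len(p_labels):
            return True  # all pattern nodes are mapped
        for candidate in range(len(t_labels)):
            if candidate in mapping.values():
                continue  # peptide node already used
            if t_labels[candidate] != p_labels[k]:
                continue  # monomer names differ
            # Each already-mapped neighbor in the pattern must also be
            # bonded to the candidate in the peptide.
            if all(mapping[n] in t_adj[candidate]
                   for n in p_adj[k] if n in mapping):
                mapping[k] = candidate
                if extend(k + 1):
                    return True
                del mapping[k]  # backtrack
        return False

    return extend(0)


# Hypothetical cyclic peptide Ala-Leu-Val-Leu (the last Leu bonds back to Ala).
cyclic = make_graph(["Ala", "Leu", "Val", "Leu"],
                    [(0, 1), (1, 2), (2, 3), (3, 0)])
leu_val_leu = make_graph(["Leu", "Val", "Leu"], [(0, 1), (1, 2)])
ala_val = make_graph(["Ala", "Val"], [(0, 1)])

print(contains_pattern(cyclic, leu_val_leu))
print(contains_pattern(cyclic, ala_val))
```

The search tries every assignment in the worst case, so it is only practical for small structures; the point is the data model, in which cycles and branches need no special handling.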
Thanks to NORINE, we performed a large-scale statistical analysis of monomeric diversity that revealed a correlation between monomeric composition and peptide activity [3]. This work was highlighted by the American Society for Microbiology. Starting from this observation, we designed a machine learning classifier that predicts the activity of a given NRP, represented by a monomeric composition fingerprint [4], with very good prediction rates.
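
As an illustration of what a monomeric composition fingerprint might look like, the following Python sketch counts the monomers of a peptide over a fixed vocabulary. The exact fingerprint used in [4] may differ; the function name, the vocabulary and the example peptide are assumptions made for this example.

```python
from collections import Counter


def composition_fingerprint(monomers, vocabulary):
    """Count how often each vocabulary monomer occurs in the peptide.

    Monomers absent from the vocabulary are ignored, so every peptide
    maps to a vector of the same length.
    """
    counts = Counter(monomers)
    return [counts[name] for name in vocabulary]


# Hypothetical vocabulary and peptide composition.
vocabulary = ["Ala", "Leu", "Orn", "Val"]
fingerprint = composition_fingerprint(["Leu", "Val", "Leu", "Orn"], vocabulary)
print(fingerprint)  # [0, 2, 1, 1]
```

Vectors of this kind can be passed as features to a standard classifier, since they have a fixed length whatever the size or shape of the peptide.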

Past research projects

I previously worked on DNA sequence analysis, more precisely on the prediction of local repeats.


Articles in international peer-reviewed journals

Book chapters

Oral presentations in international conferences

Oral presentations in national conferences

