CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille

Thesis of Alexandre Berard

Neural Machine Translation Architectures and Applications

This thesis has two main objectives: adapting Neural Machine Translation (NMT) techniques to new tasks, and research replication. Our replication efforts have produced two resources: MultiVec, a framework that facilitates the use of several word-embedding techniques (Word2vec, Bivec and Paragraph Vector); and an NMT framework that implements several architectures and can be used for regular MT, Automatic Post-Editing, and Speech Recognition or Translation. Both resources are publicly available and now extensively used by the research community. We extend our NMT framework to three related tasks: Machine Translation (MT), Automatic Speech Translation (AST) and Automatic Post-Editing (APE). For machine translation, we replicate pioneering neural work and conduct a case study on TED talks in which we advance the state of the art. Automatic speech translation consists of translating speech in one language into text in another language. In this thesis, we focus on the unexplored problem of end-to-end speech translation, which does not use an intermediate source-language text transcription. We propose the first model for end-to-end AST and apply it to two benchmarks: translation of audiobooks and of basic travel expressions. Our final task is automatic post-editing, which consists of automatically correcting the outputs of an MT system in a black-box scenario by training on data produced by human post-editors. We replicate and extend published results on the WMT 2016 and 2017 tasks, and propose new neural architectures for low-resource automatic post-editing.

Jury

- Thesis supervisors: Olivier Pietquin, Laurent Besacier
- Reviewers: Béatrice Daille, Philippe Langlais
- Examiners: Pascale Sébillot, Marc Tommasi, François Yvon

Thesis defended on 15/06/2018
