Several large pre-trained models of source code have recently been described in the literature. Can they, as in NLP, benefit from training on multiple different tasks? What about multiple programming languages?
As part of this project, you will learn how to apply Machine Learning in the Software Engineering domain (ML4SE) by evaluating and fine-tuning pre-trained models of code on different downstream SE tasks.
Your goal will be to identify practical problems that could be solved by a large pre-trained seq2seq model architecture, collect the data, fine-tune the model on those problems, and evaluate the resulting performance.
We expect you to demonstrate both:
Please note: the candidate’s status and paperwork must be reviewed beforehand to assess eligibility for this position.