Evaluation in Natural Language Generation: Lessons from Referring Expression Generation

Jette Viethen, Robert Dale
Centre for Language Technology
Division of Information and Communication Sciences
Macquarie University
Sydney NSW 2109
Australia
jviethen@ics.mq.edu.au
robert.dale@mq.edu.au
 
As one of the most well-defined subtasks in Natural Language Generation (NLG), the generation of referring expressions looks like a strong candidate for piloting shared evaluation tasks. In contrast to other areas of Natural Language Processing, however, it is still unclear what benefit the introduction of such tasks might have for the field of NLG. Based on an earlier evaluation of a number of well-established algorithms for the generation of referring expressions, this paper explores several problems that arise in designing an evaluation for this task, and identifies general considerations that need to be taken into account when evaluating Natural Language Generation subtasks.