Token-wise approach to span-based question answering

The Seventh Conference on Software Engineering and Information Management (SEIM-2022)
Authors:
Abstract:

Language model pre-training has led to significant success across a wide range of natural language processing problems. It has been shown that modern deep contextual language models need only a small number of new parameters for fine-tuning, owing to the power of the base model. Nevertheless, the problem statement itself leaves room for exploring new approaches. Our experiments concern span-based question answering, one of the machine reading comprehension (MRC) tasks. Recent works use loss functions that require the model to predict the start and end positions of the answer in a context document. We propose a new loss that additionally requires the model to correctly predict whether each token is contained in the answer. Our hypothesis is that using this information explicitly can help the model learn more dependencies from the data. Our solution also includes a new span ranking scheme and a scheme for selecting no-answer examples. We also propose approaches for incorporating the relative positions of tokens in dependency trees, as well as the dependency types, into syntax-guided attention. Experiments show that our approaches improve the quality of BERT-like models on the SQuAD datasets.
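The token-wise idea from the abstract can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: it combines the standard start/end cross-entropy with a per-token binary term asking whether each token lies inside the gold span. The function name, the `alpha` weighting, and the tensor shapes are assumptions for the sketch.

```python
import torch
import torch.nn.functional as F

def token_wise_qa_loss(start_logits, end_logits, token_logits,
                       start_pos, end_pos, alpha=1.0):
    """Hypothetical sketch of a token-wise span QA loss.

    start_logits, end_logits, token_logits: (batch, seq_len) tensors.
    start_pos, end_pos: (batch,) gold answer boundary indices.
    alpha: weight of the per-token term (an assumption, not from the paper).
    """
    # Standard span loss: cross-entropy over start and end positions.
    span_loss = (F.cross_entropy(start_logits, start_pos)
                 + F.cross_entropy(end_logits, end_pos))

    # Per-token containment targets: 1 for tokens in [start, end], else 0.
    seq_len = token_logits.size(1)
    positions = torch.arange(seq_len).unsqueeze(0)  # (1, seq_len)
    in_span = ((positions >= start_pos.unsqueeze(1)) &
               (positions <= end_pos.unsqueeze(1))).float()

    # Binary cross-entropy asks the model to label each token as
    # inside/outside the answer, exposing span membership explicitly.
    token_loss = F.binary_cross_entropy_with_logits(token_logits, in_span)
    return span_loss + alpha * token_loss
```

In this reading, the extra term gives the model a dense training signal over the whole span rather than only its two boundary tokens.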