Overcoming Language Priors via Shuffling Language Bias for Robust Visual Question Answering

Recent research has revealed the notorious language prior problem in visual question answering (VQA) tasks based on visual-textual interaction, which indicates that well-developed VQA models tend to rely on learning shortcuts from questions without fully considering visual evidence. To tackle this problem, most existing methods focus on decreasing the incentive to learn prior knowledge by adding a question-only branch, settling for mechanical improvements in accuracy. However, these methods over-correct positive biases that are useful for generalization, leading to degraded performance on the VQA v2 dataset when they are combined with other VQA architectures. In this paper, we propose a robust shuffling language bias (SLB) approach to explicitly balance the prediction distribution, in the hope of alleviating the language prior by increasing training opportunities for VQA models.
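The abstract does not spell out how the shuffling is implemented, so the sketch below is only a rough illustration of the general idea: generating shuffled-question copies of under-represented answers so the effective answer distribution during training becomes more balanced. The helper names (shuffle_question, build_balanced_batch) and the balancing heuristic are assumptions for illustration, not the paper's released code.

```python
import random
from collections import Counter

# Hypothetical sketch of a "shuffling language bias" style augmentation.
# The function names and the balancing rule below are our own assumptions.

def shuffle_question(question: str, rng: random.Random) -> str:
    """Shuffle the word order of a question to weaken the spurious
    question-to-answer shortcut while keeping the same content words."""
    words = question.split()
    rng.shuffle(words)
    return " ".join(words)

def build_balanced_batch(samples, rng=None):
    """Augment a batch with shuffled-question copies so that frequent
    answers no longer dominate the effective training distribution."""
    rng = rng or random.Random(0)
    answer_counts = Counter(s["answer"] for s in samples)
    max_count = max(answer_counts.values())
    augmented = list(samples)
    for s in samples:
        # Rare answers get proportionally more shuffled copies, which is one
        # way to read "increasing training opportunities" in the abstract.
        extra = max_count // answer_counts[s["answer"]] - 1
        for _ in range(extra):
            augmented.append({
                "image_id": s["image_id"],
                "question": shuffle_question(s["question"], rng),
                "answer": s["answer"],
            })
    return augmented

# Toy usage: the rare answer "blue" gains one shuffled copy.
batch = [
    {"image_id": 1, "question": "what color is the banana", "answer": "yellow"},
    {"image_id": 2, "question": "what color is the sky", "answer": "blue"},
    {"image_id": 3, "question": "what color is the grass", "answer": "yellow"},
]
print(len(build_balanced_batch(batch)))  # 4
```

Because such augmentation only rewrites training data, it would slot in alongside existing data augmentation or pre-trained VQA backbones, which is consistent with the paper's claim that the method is cumulative with other architectures.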

Experimental results demonstrate that our method is cumulative with data augmentation and large-scale pre-trained VQA architectures, and achieves competitive performance on both the in-domain benchmark VQA v2 and the out-of-distribution benchmark VQA-CP v2.
