Generating Literal and Implied Subquestions to Fact-check Complex Claims


Verifying complex political claims, where politicians use various tactics to advance their agendas, is critical yet difficult. The performance of automatic fact-checking systems is still limited, and a prediction such as “half-true” is not very useful on its own, since it does not indicate which parts of the claim are true and which are not. In this work, we focus on decomposing a complex claim into several yes-no subquestions whose answers influence the veracity of the claim. We collect CLAIMDECOMP, a dataset of decompositions for over 1000 claims, written by trained annotators who are given the claim and a verification paragraph written by professional fact-checkers. Our comprehensive annotation covers both the explicit propositions of the original claim and its implicit facets, such as additional context that would soften the claim or, more generally, other unstated factors. We study whether state-of-the-art question generation models can generate such subquestions, and show that these models generate reasonable questions to ask, but that predicting the comprehensive set of subquestions from the original claim without evidence remains challenging. We further show that these subquestions can help identify relevant evidence for fact-checking the full claim and derive its veracity from the answers to these questions, suggesting that they can be useful components of a fact-checking pipeline.

arXiv preprint