Free-text explanations are free-form textual justifications that are not constrained to the words or concepts in the instance inputs.

All datasets listed below are in English.

Textual tasks

| Dataset | Task | Collection Method | # Instances | # Explanations per Instance | Total # Annotators |
|---|---|---|---|---|---|
| Jansen et al. (2016) | science exam QA | authors | 363 | 1 | 4 |
| Ling et al. (2017) | solving algebraic word problems | automatic + crowd | ~101K | 1 | n/a |
| Srivastava et al. (2017) | detecting phishing emails | crowd + authors | 7 | 30-35 | 146 |
| BabbleLabble | relation extraction | students + authors | 200 | 1 | 10 |
| e-SNLI | natural language inference | crowd | ~569K | 1 or 3 | 6,325 |
| LIAR-PLUS | verifying claims from text | automatic | 12,836 | 1 | n/a |
| CoS-E v1.0 | commonsense QA | crowd | 8,560 | 1 | n/a |
| CoS-E v1.1 | commonsense QA | crowd | 10,962 | 1 | n/a |
| Sen-Making | commonsense validation | students + authors | 2,021 | 1 | 7 |
| WinoWhy | pronoun coreference resolution | crowd | 273 | 5 | n/a |
| SBIC | social bias inference | crowd | 48,923 | 1-3 | n/a |
| PubHealth | verifying claims from text | automatic | 11,832 | 1 | n/a |
| Wang et al. (2020) | relation extraction | crowd + authors | 373 | 1 | n/a |
| Wang et al. (2020) | sentiment classification | crowd + authors | 85 | 1 | n/a |
| e-delta-NLI | defeasible natural language inference | automatic | 92,298 | ~8 | n/a |

Multimodal tasks

| Dataset | Task | Collection Method | # Instances | # Explanations per Instance | Total # Annotators |
|---|---|---|---|---|---|
| BDD-X | vehicle control for self-driving cars | crowd | ~26K | 1 | n/a |
| VQA-E | visual QA | automatic | ~270K | 1 | n/a |
| VQA-X | visual QA | crowd | 28,180 | 1 or 3 | n/a |
| ACT-X | activity recognition | crowd | 18,030 | 3 | n/a |
| Ehsan et al. (2019) | playing arcade games | crowd | 2,000 | 1 | 60 |
| VCR | visual commonsense reasoning | crowd | ~290K | 1 | n/a |
| e-SNLI-VE | visual-textual entailment | crowd | 11,335 | 3 | n/a |
| ESPRIT | reasoning about qualitative physics | crowd | 2,441 | 2 | n/a |
| VLEP | future event prediction | automatic + crowd | 28,726 | 1 | n/a |
| EMU | reasoning about manipulated images | crowd | 48K | n/a | n/a |