Free-text explanations are free-form textual justifications that are not constrained to the instance inputs.
English
Textual tasks
| Dataset | Task | Collection Method | # Instances | # Explanations per Instance | Total # Annotators |
|---|---|---|---|---|---|
| Jansen et al. (2016) | science exam QA | authors | 363 | 1 | 4 |
| Ling et al. (2017) | solving algebraic word problems | automatic + crowd | ~101K | 1 | n/a |
| Srivastava et al. (2017) | detecting phishing emails | crowd + authors | 7 | 30-35 | 146 |
| BabbleLabble | relation extraction | students + authors | 200 | 1 | 10 |
| e-SNLI | natural language inference | crowd | ~569K | 1 or 3 | 6,325 |
| LIAR-PLUS | verifying claims from text | automatic | 12,836 | 1 | n/a |
| CoS-E v1.0 | commonsense QA | crowd | 8,560 | 1 | n/a |
| CoS-E v1.1 | commonsense QA | crowd | 10,962 | 1 | n/a |
| ECQA | commonsense QA | crowd | 10,962 | 1 | n/a |
| Sen-Making | commonsense validation | students + authors | 2,021 | 1 | 7 |
| ChangeMyView | argument persuasiveness | crowd | 37,718 | 1 | n/a |
| WinoWhy | pronoun coreference resolution | crowd | 273 | 5 | n/a |
| SBIC | social bias inference | crowd | 48,923 | 1-3 | n/a |
| PubHealth | verifying claims from text | automatic | 11,832 | 1 | n/a |
| Wang et al. (2020) | relation extraction | crowd + authors | 373 | 1 | n/a |
| Wang et al. (2020) | sentiment classification | crowd + authors | 85 | 1 | n/a |
| e-delta-NLI | defeasible natural language inference | automatic | 92,298 | ~8 | n/a |
| COPA-SSE (Semi-Structured Explanations for COPA)* | Balanced COPA (commonsense QA, causal reasoning) | crowd | 1,500 | 4-9 (9,747 total) | n/a |
* COPA-SSE explanations are ConceptNet-like triples with free-form head and tail concepts. The authors class this format as structured, but note that it is not very rigid and can also be used as free text.
Multimodal tasks
| Dataset | Task | Collection Method | # Instances | # Explanations per Instance | Total # Annotators |
|---|---|---|---|---|---|
| BDD-X | vehicle control for self-driving cars | crowd | ~26K | 1 | n/a |
| VQA-E | visual QA | automatic | ~270K | 1 | n/a |
| VQA-X | visual QA | crowd | 28,180 | 1 or 3 | n/a |
| ACT-X | activity recognition | crowd | 18,030 | 3 | n/a |
| Ehsan et al. (2019) | playing arcade games | crowd | 2,000 | 1 | 60 |
| VCR | visual commonsense reasoning | crowd | ~290K | 1 | n/a |
| e-SNLI-VE | visual-textual entailment | crowd | 11,335 | 3 | n/a |
| ESPRIT | reasoning about qualitative physics | crowd | 2,441 | 2 | n/a |
| VLEP | future event prediction | automatic + crowd | 28,726 | 1 | n/a |
| EMU | reasoning about manipulated images | crowd | 48K | n/a | n/a |
Multiple Languages
| Dataset | Task | Collection Method | # Instances | # Explanations per Instance | Total # Annotators |
|---|---|---|---|---|---|
| E-KAR | analogical reasoning | crowd | 1,655 (in Chinese); 1,251 (in English) | 5 | n/a |