A Benchmark and Comprehensive Study of Fallacy Detection and Classification
Chadi Helwe¹
Tom Calamai²
Pierre-Henri Paris¹
Chloé Clavel²
Fabian Suchanek¹
1: 2:
Goals
Performance: how do SOTA LLMs and humans perform on fallacy detection and classification?
Benchmark: a unification of existing public fallacy collections
Taxonomy: covers the existing datasets
Annotation Scheme: designed to tackle subjectivity
A fallacy is an erroneous or invalid way of reasoning.
“You must either support my presidential candidacy or be against America!”
This argument is a false dilemma fallacy: it wrongly assumes no other alternatives.
Why bother with fallacies?
First things first, some definitions
Argument
An argument consists of an assertion called the conclusion and one or more assertions called premises, where the premises are intended to establish the truth of the conclusion. Premises or conclusions can be implicit in an argument.
Fallacy
A fallacy is an argument where the premises do not entail the conclusion.
Taxonomies of Fallacies
What about taxonomies in current works on fallacy annotation, detection, and classification?
lack of consistency across the different datasets:
different granularity
different coverage
some are hierarchical, others are not
Our taxonomy aims to systematize and classify those fallacies that are used in current works.
Level 0 is a binary classification
Level 1 groups fallacies into Aristotle’s categories:
Pathos (appeals to emotion),
Ethos (fallacies of credibility),
and Logos (fallacies of logic, relevance, or evidence).
Level 2 contains fine-grained fallacies within the broad categories of Level 1.
Accompanying the taxonomy, we provide both formal and informal definitions for each fallacy.
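As a rough illustration (not part of the benchmark itself), the three levels can be sketched as a nested mapping in Python; the handful of Level-2 types listed below are illustrative examples placed according to the Level-1 descriptions above, not the full inventory.

```python
# Sketch of the three-level taxonomy as a nested mapping (illustrative subset only).
# Level 0: fallacy vs. no fallacy; Level 1: Aristotle's categories;
# Level 2: fine-grained fallacy types.
TAXONOMY = {
    "fallacy": {                                            # Level 0
        "pathos": ["appeal to ridicule", "appeal to fear"], # Level 1: appeals to emotion
        "ethos": ["ad hominem"],                            # Level 1: fallacies of credibility
        "logos": [                                          # Level 1: logic/relevance/evidence
            "false dilemma",
            "false causality",
            "causal oversimplification",
        ],
    },
    "no fallacy": {},                                       # Level 0
}

def level1_of(fallacy: str) -> str:
    """Return the Level-1 category of a Level-2 fallacy type, if it is listed."""
    for category, fallacies in TAXONOMY["fallacy"].items():
        if fallacy in fallacies:
            return category
    raise KeyError(fallacy)
```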
An example
“Why should I support the 2nd Amendment, do I look like toothless hick?”
Appeal to ridicule
This fallacy occurs when an opponent's argument is portrayed as absurd or ridiculous with the intention of discrediting it.
$E_1$ claims $P$. $E_2$ makes $P$ look ridiculous by misrepresenting it as $P'$. Therefore, $\neg P$.
Disjunctive Annotation Scheme
In the last New Hampshire primary election, my favorite candidate won. Therefore, he will also win the next primary election.
Is it:
a false causality?
a causal oversimplification?
Legitimately differing opinions
One annotator may see implicit assertions that another annotator does not see.
“Are you for America? Vote for me!”
Annotators may also have different thresholds for fear or insults.
Different annotators have different background knowledge.
“Use disinfectants or you will get Covid-19!”
Let “$a\ b\ c\ d$” be a text where $a$, $b$, $c$, and $d$ are sentences.
Suppose $S = \{a\ b,\ d\}$, the span “$a\ b$” has the label set $\{l_1, l_2\}$, and the span “$d$” has the label set $\{l_3\}$.
In that case, $G = \{(a\ b, \{l_1, l_2\}),\ (d, \{l_3\})\}$.
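A minimal Python sketch of this example, assuming spans are represented as (start, end) sentence-index pairs and the gold standard as a mapping from spans to their alternative label sets (these concrete data structures are our illustration, not the benchmark's release format):

```python
# Text "a b c d": four sentences, indexed 0..3.
sentences = ["a", "b", "c", "d"]

# Spans as (start, end) sentence-index ranges: "a b" and "d".
S = {(0, 1), (3, 3)}

# Gold standard: each span paired with its set of alternative labels.
G = {
    (0, 1): {"l1", "l2"},  # "a b" may legitimately be labelled l1 or l2
    (3, 3): {"l3"},        # "d" is labelled l3
}
```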
A text is a sequence of sentences $st_1, \ldots, st_n$.
The span of a fallacy in a text is the smallest contiguous sequence of sentences that comprises the conclusion and the premises of the fallacy. If the span comprises a pronoun that refers to a premise or to the conclusion, that premise or conclusion is not included in the span.
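A minimal sketch of this span definition, assuming the premises and the conclusion have already been mapped to sentence indices and pronoun-only references have been excluded; the helper name is ours, not part of the benchmark:

```python
def fallacy_span(premise_idxs: set[int], conclusion_idx: int) -> tuple[int, int]:
    """Smallest contiguous sentence range covering the premises and the conclusion.
    Sentences referred to only via a pronoun are assumed to be excluded from the inputs."""
    idxs = premise_idxs | {conclusion_idx}
    return (min(idxs), max(idxs))

# Premises in sentences 2 and 4, conclusion in sentence 3 -> span (2, 4).
assert fallacy_span({2, 4}, 3) == (2, 4)
```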
Let $\mathcal{F}$ be the set of fallacy types and $\bot$ be a special label which means “no fallacy”.
Given a text and its set of (possibly overlapping) spans $S$, the gold standard $G$ of the text is the set of all spans $s \in S$, each paired with its set of alternative labels, represented as follows:
$G \subseteq S \times \left(\mathcal{P}(\mathcal{F}\cup\{\bot\}) \setminus \{\emptyset, \{\bot\}\}\right)$
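The right-hand side of this formula requires every label set to be non-empty and to not consist solely of $\bot$; a quick check of that constraint in Python, with the string "no fallacy" standing in for $\bot$:

```python
NO_FALLACY = "no fallacy"  # stands for the special label ⊥

def is_valid_label_set(labels: set[str]) -> bool:
    """Gold label sets must be non-empty and must not be exactly {⊥}."""
    return len(labels) > 0 and labels != {NO_FALLACY}

assert is_valid_label_set({"false causality", "causal oversimplification"})
assert is_valid_label_set({"false causality", NO_FALLACY})  # ⊥ may appear alongside a fallacy
assert not is_valid_label_set({NO_FALLACY})
assert not is_valid_label_set(set())
```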
Given a text and its segmentation into (possibly overlapping) spans $S$, a prediction $P$ for the text is the set of all spans $s \in S$, each paired with exactly one predicted label, represented as follows: