Welcome to DialAM-2024

DialAM is the first shared task in dialogical argument mining where argumentation and dialogue information is modelled together in a domain-independent framework.



With the DialAM-2024 task, we propose the first shared task in dialogue argument mining where argumentation and dialogue information is modelled together in a domain-independent framework. The Inference Anchoring Theory (IAT) framework makes it possible to obtain homogeneous annotations of dialogue argumentation, including relevant information and structural data from speech and argumentation, regardless of the domain. This allows a more complete analysis of argumentation in dialogues, together with a consistent cross-domain evaluation of the resulting argument mining systems.

DialAM-2024 consists of two sub-tasks: the identification of propositional (argumentative) relations, and the identification of illocutionary (speech act) relations. For both sub-tasks, all the argumentation and dialogue information will be available for the development of the submitted systems. We invite the community to participate in the DialAM-2024 task and explore how additional information from the dialogue can be integrated into the argument retrieval process. The aim is to take a step forward from sequence modelling approaches, where much of the information relevant to argumentation remains implicit behind the natural language.


Task A: Identification of Propositional Relations

In the first task, the goal is to detect argumentative relations existing between the propositions identified and segmented in the argumentative dialogue. Such relations are: Inference (RA), Conflict (CA), and Rephrase (MA).

Task B: Identification of Illocutionary Relations

In the second task, the goal is to detect illocutionary relations existing between locutions uttered in the dialogue and the argumentative propositions associated with them, such as Asserting, Agreeing, Arguing, or Disagreeing, among others.
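To make the inputs and outputs of the two sub-tasks concrete, the following is a minimal Python sketch of how predictions could be represented. It is illustrative only: the class names, field names, and node identifiers (`p1`, `l1`) are hypothetical and do not reflect the file format distributed with the task. The label sets are the ones named on this page for the QT30 corpus, and the illocutionary edge is simplified to a single locution-to-proposition link.

```python
from dataclasses import dataclass
from enum import Enum


class PropositionalRelation(Enum):
    """Task A labels, as named in the task description."""
    INFERENCE = "RA"
    CONFLICT = "CA"
    REPHRASE = "MA"


class IllocutionaryRelation(Enum):
    """Task B labels, as listed for the QT30 corpus."""
    ASSERTING = "Asserting"
    AGREEING = "Agreeing"
    ARGUING = "Arguing"
    DISAGREEING = "Disagreeing"
    RESTATING = "Restating"
    QUESTIONING = "Questioning"
    DEFAULT_ILLOCUTING = "Default Illocuting"


@dataclass(frozen=True)
class PropositionalEdge:
    """Hypothetical container: a relation between two propositions (Task A)."""
    source_proposition: str
    target_proposition: str
    relation: PropositionalRelation


@dataclass(frozen=True)
class IllocutionaryEdge:
    """Hypothetical container: a relation linking a locution to a proposition (Task B)."""
    locution: str
    proposition: str
    relation: IllocutionaryRelation


# Hypothetical identifiers, used only to show the shape of a prediction.
task_a_prediction = PropositionalEdge("p2", "p1", PropositionalRelation.INFERENCE)
task_b_prediction = IllocutionaryEdge("l1", "p1", IllocutionaryRelation.ASSERTING)
```

Keeping the two label sets in separate enumerations mirrors the separation between the sub-tasks, so a system can be developed and scored for either one on its own.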


We will use the QT30 corpus [1], the largest available corpus of dialogical argumentation in English. QT30 is a collection of 30 episodes of Question Time aired between June 2020 and November 2021, with a total of more than 29 hours of transcribed broadcast material. It comprises 19,842 locutions by more than 400 participants: one moderator, 125 panel members (7 of them appearing more than once), and 300+ audience members. The QT30 dataset contains 10,818 propositional relations divided into Default Inferences, Default Conflicts, and Default Rephrases, and 32,303 illocutionary relations divided into Asserting, Agreeing, Arguing, Disagreeing, Restating, Questioning, and Default Illocuting.

The provided IAT argument maps can be visualised using the OVA3 tool available at https://ova.arg.tech.


Evaluation will happen at two levels:

(i) We will evaluate all the submitted systems in our two sub-tasks independently (i.e., A and B) considering the Precision, Recall, and macro-F1 scores to determine the best system for each of the two specific sub-tasks.

(ii) We will also provide an overall evaluation of the submitted systems when identifying both argument (i.e., propositional relation identification) and dialogue (i.e., illocutionary relation identification) information together. For that purpose we will use the aggregated Precision, Recall, and macro-F1 metrics to compare the submitted argument maps to the original graphs.
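As an illustration of the per-sub-task scores in (i), the following is a minimal Python sketch of macro-averaged Precision, Recall, and F1 over relation labels. It is not the official scoring script: the function name is hypothetical, and the sketch assumes that gold and predicted labels are already aligned one-to-one over the same candidate pairs, with a hypothetical "None" label standing for "no relation".

```python
from collections import Counter


def macro_precision_recall_f1(gold, pred):
    """Return macro-averaged (precision, recall, F1) over all labels in gold or pred."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    labels = sorted(set(gold) | set(pred))
    if not labels:
        raise ValueError("at least one labelled item is required")

    # Count true positives, false positives, and false negatives per label.
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1
            fn[g] += 1

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    # Macro-averaging: every label contributes equally, regardless of frequency.
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy Task A example; "None" is an assumed label for unrelated pairs.
gold = ["RA", "RA", "CA", "MA", "None", "None"]
pred = ["RA", "CA", "CA", "MA", "None", "RA"]
macro_p, macro_r, macro_f1 = macro_precision_recall_f1(gold, pred)
```

Because each label carries equal weight in the macro average, errors on infrequent relation types affect the score as much as errors on frequent ones.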


Ramon Ruiz-Dolz

ARG-tech, University of Dundee

John Lawrence

ARG-tech, University of Dundee

Ella Schad

ARG-tech, University of Dundee

Chris Reed

ARG-tech, University of Dundee

[1] Hautli-Janisz, A., Kikteva, Z., Siskou, W., Gorska, K., Becker, R., & Reed, C. (2022, June). QT30: A corpus of argument and conflict in broadcast debate. In Proceedings of the 13th Language Resources and Evaluation Conference (pp. 3291-3300). European Language Resources Association (ELRA).
[2] Duthie, R., Lawrence, J., Budzynska, K., & Reed, C. (2016, August). The CASS technique for evaluating the performance of argument mining. In Proceedings of the Third Workshop on Argument Mining (ArgMining2016) (pp. 40-49).