Final Results
The evaluation was conducted independently for the two subtasks in DialAM. ARI measures performance on Task A: Identification of Propositional Relations, and ILO measures performance on Task B: Identification of Illocutionary Relations. Finally, GLOBAL reports the results over the complete argument maps.
The evaluation was performed at two levels, Focused and General. Focused evaluates the systems only on the related propositions/locutions in the evaluation files, excluding all non-related cases. General evaluates the whole map, including the non-related class. High performance in General but low performance in Focused indicates a pessimistic approach that over-relies on the non-related class. Conversely, high performance in Focused but low performance in General indicates an optimistic approach that relates too many propositions/locutions.
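The difference between the two levels can be illustrated with a minimal sketch. It is not the official DialAM scorer: it assumes each candidate pair carries one gold and one predicted label, uses plain accuracy as a stand-in for whatever metric the task actually reports, and the names `NON_RELATED`, `accuracy`, and `evaluate`, as well as the label strings, are hypothetical.

```python
# Illustrative sketch only; not the official DialAM evaluation script.
# Assumption: the non-related class is encoded with this placeholder label.
NON_RELATED = "None"


def accuracy(gold, pred):
    """Fraction of positions where the predicted label equals the gold label."""
    if not gold:
        return 0.0
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def evaluate(gold, pred):
    """Score predictions at the General and Focused levels.

    General: every candidate pair, including the non-related class.
    Focused: only the pairs that are related in the gold annotation.
    """
    related = [(g, p) for g, p in zip(gold, pred) if g != NON_RELATED]
    focused_gold = [g for g, _ in related]
    focused_pred = [p for _, p in related]
    return {
        "general": accuracy(gold, pred),
        "focused": accuracy(focused_gold, focused_pred),
    }


# Toy data: six non-related pairs and two related ones (labels are made up).
gold = ["None"] * 6 + ["Inference", "Conflict"]

# A pessimistic system predicts the non-related class everywhere.
pessimistic = evaluate(gold, ["None"] * 8)

# An optimistic system relates every pair.
optimistic = evaluate(gold, ["Inference"] * 6 + ["Inference", "Conflict"])

print(pessimistic)  # {'general': 0.75, 'focused': 0.0}
print(optimistic)   # {'general': 0.25, 'focused': 1.0}
```

On this toy data the pessimistic system scores well at the General level only because most pairs are non-related, while the optimistic system scores well at the Focused level at the cost of many spurious relations.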