Natural Language Processing

The Touché23-ValueEval Dataset for Identifying Human Values behind Arguments

We've created the Touché23-ValueEval dataset, a large collection of over 9,300 arguments annotated with 54 human values, to help develop methods for analyzing the values that make arguments persuasive. Our dataset, which more than doubles the size of its predecessor, has already been used to achieve state-of-the-art results in identifying human values behind arguments, and has shown promising performance with large language models like Llama-2-7B.

May 1, 2024

Are Text Classifiers Xenophobic? A Country-Oriented Bias Detection Method with Least Confounding Variables

Current bias detection methods in machine learning have their own biases and limitations, so we've developed a new approach that directly tests fine-tuned classifiers on real-world data to identify potential biases. Our method, which creates counterfactual examples by modifying named entities in the target data, revealed significant biases in multilingual models, including sentiment analysis and stance recognition models, and shed light on the complex interactions between names, languages, and model predictions. Current models tend to prefer names from countries where the language of the sentence is spoken, which motivates the name AI Xenophobia.
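The counterfactual idea can be illustrated with a minimal sketch. It assumes the named entity has already been located and replaced by a `{name}` slot, and it is not the implementation from the paper: the name lists, the function names, and the deliberately biased `toy_score` stand-in for a fine-tuned classifier are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical name lists per country; a real study would use a much larger gazetteer.
NAMES_BY_COUNTRY = {
    "France": ["Camille", "Antoine"],
    "Germany": ["Lukas", "Hannah"],
}


def make_counterfactuals(template, names_by_country):
    """Yield (country, sentence) pairs by filling the {name} slot of a template."""
    for country, names in names_by_country.items():
        for name in names:
            yield country, template.format(name=name)


def mean_score_by_country(templates, names_by_country, score_fn):
    """Average the classifier score per country over all counterfactual sentences."""
    scores = defaultdict(list)
    for template in templates:
        for country, sentence in make_counterfactuals(template, names_by_country):
            scores[country].append(score_fn(sentence))
    return {country: mean(values) for country, values in scores.items()}


def toy_score(sentence):
    """Stand-in for a fine-tuned classifier, biased on purpose for illustration."""
    return 0.9 if ("Camille" in sentence or "Antoine" in sentence) else 0.5


if __name__ == "__main__":
    templates = ["{name} a adoré ce film."]
    print(mean_score_by_country(templates, NAMES_BY_COUNTRY, toy_score))
```

If the sentences differ only in the substituted name, a gap between the per-country averages points to a dependence on the name rather than on the content.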

May 1, 2024

Deep Natural Language Feature Learning for Interpretable Prediction

A technique for explainability in LLMs: a complex task is broken into subtasks formulated as binary questions in natural language, and each sample is represented by the outputs of a binary classifier on these subtasks.
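As a rough sketch of this representation, each sample becomes a vector with one 0/1 entry per question. The questions, the keyword lookup standing in for the binary classifier, and the function names below are hypothetical and chosen only for illustration; in practice the answers would come from a trained model.

```python
# Hypothetical subtasks, each phrased as a binary question, with toy trigger keywords.
QUESTION_KEYWORDS = {
    "Does the text mention damage?": ["destroyed", "damaged"],
    "Does the text mention a request for help?": ["help", "need"],
}


def toy_answer(sample, question):
    """Stand-in for a binary classifier: 1 if any keyword for the question appears."""
    text = sample.lower()
    return int(any(keyword in text for keyword in QUESTION_KEYWORDS[question]))


def encode(sample, questions, answer_fn):
    """Represent a sample as the vector of binary answers, one per question."""
    return [answer_fn(sample, question) for question in questions]


if __name__ == "__main__":
    questions = list(QUESTION_KEYWORDS)
    print(encode("The flood destroyed the bridge", questions, toy_answer))
```

Because every dimension corresponds to a readable question, a simple model trained on these vectors can be inspected feature by feature.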

Dec 1, 2023

WASSA Workshop @ ACL 23

Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis.

Jul 14, 2023

Findings of WASSA 2023 Shared Task on Empathy, Emotion and Personality Detection in Conversation and Reactions to News Articles

Findings of the shared task on Empathy, Personality, and Emotion Detection from the WASSA workshop @ ACL.

Jul 1, 2023

Multilingual Multi-Target Stance Recognition in Online Public Consultations

We've developed a machine learning approach to automatically recognize the opinions of citizens in online public consultations, using three datasets to train a model that can classify stances on various topics. Our work experiments with different methods, including self-supervised learning, and makes several annotated datasets available for others to use. This work was used in the Touché shared task of CLEF 2023.

Apr 1, 2023

CoFE: A New Dataset of Intra-Multilingual Multi-target Stance Classification from an Online European Participatory Democracy Platform

A new dataset for stance recognition built from the participatory democracy platform of the Conference for the Future of Europe. Because the platform relied on machine translation, the dataset contains highly multilingual interactions: users in the same thread write in their own, different native languages.

Nov 1, 2022

Debating Europe: A Multilingual Multi-Target Stance Classification Dataset of Online Debates

A new dataset of 2,600 online debate comments has been created to improve stance classification models. Fine-tuning and semi-supervised learning can boost accuracy by 3.4% over a baseline model.

Jun 1, 2022

WASSA 2022 Shared Task: Predicting Empathy, Emotion and Personality in Reaction to News Stories

Findings of the shared task on Empathy, Personality, and Emotion Detection from the WASSA workshop @ ACL.

May 1, 2022

How does a Pre-Trained Transformer Integrate Contextual Keywords? Application to Humanitarian Computing

Textual metadata can be integrated into transformers to improve model performance. By analyzing the attention interactions between the metadata and the text to classify, we show that the model uses the semantics of the keyword metadata. We applied this to a humanitarian classification task over tweets, using the disaster event type as context, and finally show that the method is also useful for characterizing a new event, such as a hurricane, in a data-driven way.
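One common way to give a transformer a contextual keyword is to place it in front of the text as its own segment. The sketch below only builds such an input string; the `[SEP]` separator, the `build_input` helper, and the example tweet are assumptions made for illustration, and the summary above does not say this is the exact input format used in the paper.

```python
def build_input(keyword, text, sep="[SEP]"):
    """Prepend a metadata keyword to the text, separated by a marker token."""
    return f"{keyword.strip()} {sep} {text.strip()}"


if __name__ == "__main__":
    # Hypothetical tweet with the disaster event type as the contextual keyword.
    print(build_input("hurricane", "Roads are flooded near the coast, stay safe."))
```

With a real tokenizer, the same effect is usually obtained by passing the keyword and the text as a sentence pair, so the model can attend from one segment to the other.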

May 1, 2021