Article Details
MULTILABEL BINARY CLASSIFICATION HATE SPEECH DETECTION IN HAUSA TWITTER DATA USING PRE-TRAINED TRANSFORMER MODELS
Authored By: Aliyu M., Garko A. B., Awwalu J., Rabiu A. M.
Article Number: 1771760531
Received Date: February 8th 2026
Published Date: February 22nd 2026
Copyright © 2020 Author(s) retain the copyright of this article.
This study presents the development of HausaBERT, a Hausa-specific hate speech detection model designed to address the limitations of existing multilingual transformers in understanding low-resource African languages. Data were collected from Twitter using language and keyword filters to extract Hausa tweets containing ethnoreligious, political, and personal insults. After expert-guided annotation and validation, the dataset underwent automated preprocessing: duplicate removal, text cleaning, and normalization of linguistic structures. Data augmentation was then used to balance the classes at approximately 36,000 instances per category. Four baseline transformer models (BERT, mBERT, RoBERTa, and DistilBERT) were fine-tuned and compared with the proposed HausaBERT ensemble model, using weighted precision, recall, and F1-score as evaluation metrics. HausaBERT outperformed some of the baseline models, with a weighted F1-score of 0.96 and an ensemble accuracy of 0.78, demonstrating superior adaptability to Hausa linguistic patterns. The ensemble approach also achieved a ROC-AUC of 0.9611, indicating robust classification capability across hate speech subcategories. These findings confirm the effectiveness of transformer-based transfer learning for hate speech detection in low-resource languages. The study contributes a reproducible Hausa hate speech corpus and modeling framework, providing a foundation for future research in African language NLP, responsible AI, and culturally aware content moderation.
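The weighted metrics named in the abstract can be illustrated with a minimal sketch. It assumes the usual support-weighted definition (per-class scores averaged with weights proportional to each class's share of the true labels) and single-label predictions; the abstract does not specify the study's exact computation, and the function name `weighted_prf` and the labels in the usage example are hypothetical.

```python
from collections import Counter


def weighted_prf(y_true, y_pred):
    """Return support-weighted (precision, recall, f1).

    Assumption: single-label predictions, with each class weighted by
    its share of the true labels. This is an illustration, not the
    study's evaluation code.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    support = Counter(y_true)      # true instances per class
    predicted = Counter(y_pred)    # predicted instances per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)

    precision = recall = f1 = 0.0
    for label, n in support.items():
        # Per-class scores; precision is 0 when the class is never predicted.
        p = true_pos[label] / predicted[label] if predicted[label] else 0.0
        r = true_pos[label] / n
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = n / total
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    truth = ["hate", "hate", "none", "none"]
    preds = ["hate", "none", "none", "none"]
    print(weighted_prf(truth, preds))
```

Weighting by support keeps a class with few instances from dominating the average, which matters less after the balancing step described above but still applies to any held-out data that is not augmented.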
Aliyu M., Garko A. B., Awwalu J., & Rabiu A. M. (2026). Multilabel binary classification hate speech detection in Hausa Twitter data using pre-trained transformer models. Journal of Science, Technology, and Education (JSTE), 10(11), 139-153. www.nsukjste.com/
- Aliyu M.
- Department of Computer Science, Faculty of Computing, Federal University Dutse, Jigawa State
- Garko A. B.
- Department of Computer Science, Faculty of Computing, Federal University Dutse, Jigawa State
- Awwalu J.
- Department of Computer Science, Faculty of Computing, Federal University Dutse, Jigawa State
- Rabiu A. M.
- Department of Computer Science, Faculty of Computing, Federal University Dutse, Jigawa State