
Detection of Bodo Toxic Words using Recurrent Neural Network

Pranchis Narzaree, Amkar Brahma, Manoj Kr. Deka

Abstract



Detecting toxic words, particularly in a low-resource language such as Bodo, poses specific challenges in natural language processing (NLP). This study explores the application of Recurrent Neural Networks (RNNs), a deep learning approach, to detecting toxic words in the Bodo language. Owing to the sequential nature of RNNs, the model captures contextual dependencies that are crucial for understanding the nuances of a language. The application of deep learning algorithms in NLP has revolutionized the field, enabling significant advances in understanding and generating human language. In this paper, an imbalanced dataset of 11,465 instances is curated from networking sites and dictionaries; it comprises various forms of toxic expressions in the Bodo language, annotated for training purposes. The model is trained only to classify words as toxic or non-toxic. Its performance is evaluated using precision, recall, accuracy and F1-score. The results show that the model classifies toxicity with a test loss of 0.0156 and a test accuracy of 0.9947. The reported performance is a precision of 0.8679, a recall of 0.7731, an accuracy of 0.5070 and an F1-score of 0.8178. The findings highlight the model's effectiveness in identifying toxic words, contributing to the wider context of promoting positive communication. The work emphasizes the importance of NLP solutions for low-resource languages and paves the way for future studies in multilingual and multi-class toxicity identification.
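As a quick consistency check on the reported metrics, the F1-score can be recomputed as the harmonic mean of precision and recall. The sketch below is illustrative only: the function name `f1_score` and the variable names are assumptions introduced here, and it uses nothing beyond the precision and recall figures quoted in the abstract, not the authors' evaluation code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
reported_precision = 0.8679
reported_recall = 0.7731

f1 = f1_score(reported_precision, reported_recall)
print(f"{f1:.4f}")  # 0.8178
```

The recomputed value agrees with the reported F1-score of 0.8178 to four decimal places.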

Keywords


Toxic Words, Low Resource Language, Imbalanced Dataset, Recurrent Neural Network, Natural Language Processing.


