Evaluation of DistilBERT and BiLSTM Models for the Development of Islamic Chatbots Based on Tag Classification
Abstract
This study evaluates DistilBERT and Bidirectional Long Short-Term Memory (BiLSTM) models for intent classification in Islamic chatbots, where the main challenge is a highly imbalanced dataset containing 2,031 unique intents. Following the CRISP-DM methodology, the DistilBERT model was fine-tuned with Focal Loss to address class imbalance, while the BiLSTM model was trained from scratch with a standard loss function. DistilBERT clearly outperformed BiLSTM, reaching an accuracy of 65.15% against 34.50% for BiLSTM, whose low score is attributed to severe overfitting. Although the two final models were similar in size, DistilBERT trained significantly more efficiently. These findings indicate that a Transformer-based architecture combined with an imbalance-aware strategy such as Focal Loss is a more robust and effective solution for large-scale, imbalanced text classification in specialized domains. The practical feasibility of the approach was validated by deploying it in a functional, publicly accessible chatbot prototype.
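To illustrate why Focal Loss helps with class imbalance, the following is a minimal sketch of the per-example multi-class focal loss, FL = -alpha * (1 - p_t)^gamma * log(p_t), written in plain Python. It is not the authors' training code: the function names, the example logits, and the values gamma = 2.0 and alpha = 1.0 are assumptions chosen for illustration, since the abstract does not report the hyperparameters used.

```python
import math


def log_softmax_at(logits, target):
    """Log-probability of the target class, computed stably via max-subtraction."""
    m = max(logits)
    log_total = math.log(sum(math.exp(z - m) for z in logits))
    return (logits[target] - m) - log_total


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Focal loss for one example (gamma and alpha are assumed values).

    The factor (1 - p_t) ** gamma shrinks the loss of examples the model
    already classifies confidently, so training is driven more by hard
    examples, which in imbalanced data often belong to rare classes.
    """
    log_pt = log_softmax_at(logits, target)
    p_t = math.exp(log_pt)
    return -alpha * (1.0 - p_t) ** gamma * log_pt


def cross_entropy(logits, target):
    """Standard cross-entropy is the special case gamma = 0, alpha = 1."""
    return focal_loss(logits, target, gamma=0.0, alpha=1.0)


# Hypothetical logits over three intents; the true intent is index 0 in both.
easy_logits = [6.0, 0.0, 0.0]  # model is confident and correct
hard_logits = [0.0, 1.0, 0.0]  # model prefers a wrong intent

for name, logits in [("easy", easy_logits), ("hard", hard_logits)]:
    ce = cross_entropy(logits, 0)
    fl = focal_loss(logits, 0)
    print(f"{name}: cross-entropy={ce:.4f}  focal={fl:.6f}  ratio={fl / ce:.6f}")
```

With these inputs the focal term leaves most of the loss of the hard example intact while reducing the loss of the easy example by several orders of magnitude, which is the reweighting effect the abstract relies on. In practice the same formula would be applied batch-wise inside the fine-tuning loop of the chosen deep learning framework.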
This work is licensed under a Creative Commons Attribution 4.0 International License.