Threat Detection in Cloud Computing Using Machine Learning
DOI:
https://doi.org/10.5281/ijurd.v1i3.7
Keywords:
Anomaly Detection, BoT-IoT, CSE-CIC-IDS2018, Intrusion Detection System, Self-Supervised Learning, ToN_IoT, Transformer Model
Abstract
Cloud computing has become a critical foundation for hosting large-scale applications and storing sensitive information, but its open and multi-tenant architecture exposes cloud infrastructures to a wide range of cyber threats. Traditional signature-driven intrusion detection systems provide limited defense against new and evolving attack strategies, creating a need for intelligent and adaptive threat detection mechanisms. This study presents a Transformer-based machine learning model enhanced with contrastive self-supervised training for cloud intrusion detection. Network-flow features are transformed into token sequences using embedding and positional encoding, enabling the model to capture relationships across heterogeneous traffic attributes. The proposed system is evaluated on large benchmark datasets, including CSE-CIC-IDS2018, ToN_IoT, and BoT-IoT. Experimental results show that the model achieves 99.21% accuracy, 98.87% macro F1-score, and 99.46% AUROC on CSE-CIC-IDS2018, outperforming existing machine learning baselines such as XGBoost, LightGBM, TabTransformer, and Bi-LSTM. Cross-dataset testing confirms strong generalization, with accuracy improving from 94.15% to 96.91% on ToN_IoT after limited fine-tuning. Low false-positive rates and millisecond-level inference latency indicate suitability for real-time deployment in cloud environments. The findings demonstrate that self-supervised Transformer architectures offer a scalable and effective solution for modern cloud threat detection.
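The abstract states that network-flow features are turned into token sequences through embedding and positional encoding, but it does not give the exact scheme. The following is only a minimal sketch of one common way to do this, assuming a per-feature linear embedding (each scalar feature gets its own weight and bias vector) combined with the standard sinusoidal positional encoding. The feature values, the embedding dimension, the random initialization, and the function names `sinusoidal_position` and `make_feature_embedder` are illustrative assumptions, not the published model.

```python
import math
import random


def sinusoidal_position(pos, dim):
    """Standard sinusoidal positional encoding for a single position."""
    vec = []
    for k in range(dim):
        angle = pos / (10000 ** (2 * (k // 2) / dim))
        # Even indices use sine, odd indices use cosine.
        vec.append(math.sin(angle) if k % 2 == 0 else math.cos(angle))
    return vec


def make_feature_embedder(num_features, dim, seed=0):
    """Build a function that maps one flow record to a token sequence.

    Assumption: every numeric feature i has its own weight vector w_i and
    bias vector b_i, so its token is x_i * w_i + b_i + PE(i). In a real
    model these parameters would be learned; here they are random.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0, 0.02) for _ in range(dim)] for _ in range(num_features)]
    biases = [[rng.gauss(0, 0.02) for _ in range(dim)] for _ in range(num_features)]

    def embed(flow):
        if len(flow) != num_features:
            raise ValueError("expected %d features, got %d" % (num_features, len(flow)))
        tokens = []
        for i, x in enumerate(flow):
            pe = sinusoidal_position(i, dim)
            tokens.append([x * w + b + p for w, b, p in zip(weights[i], biases[i], pe)])
        return tokens

    return embed


# Hypothetical, already-normalized flow record
# (e.g. duration, packet count, byte count, flag count).
flow = [0.12, 0.40, 0.87, 0.05]
embed = make_feature_embedder(num_features=len(flow), dim=8)
tokens = embed(flow)
print(len(tokens), len(tokens[0]))  # prints: 4 8
```

Each feature becomes one token, so a Transformer encoder placed on top of this sequence can attend across heterogeneous traffic attributes, which is the property the abstract highlights.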
License
Copyright (c) 2025 Shyam Sunder Saini

This work is licensed under a Creative Commons Attribution 4.0 International License.