Leveraging GANs for Synthetic Data Generation to Improve Intrusion Detection Systems

Authors

Md. Abdur Rahman, Guillermo A. Francia, and Hossain Shahriar

DOI:

https://doi.org/10.62411/faith.3048-3719-52

Keywords:

Adversarial Learning, Anomaly Detection, Cybersecurity, Cyber Threat Detection, Intrusion Detection System, Random Forest, Synthetic Data Generation

Abstract

This research presents a hybrid intrusion detection approach that combines Generative Adversarial Networks (GANs) for synthetic data generation with Random Forest (RF) as the primary classifier. The study aims to improve detection performance in cybersecurity applications by increasing dataset diversity and addressing a known weakness of traditional models: poor detection of minority attack classes, which are often underrepresented in real-world datasets. The proposed method uses GANs to generate synthetic attack samples that mimic real-world intrusions and combines them with real data from the UNSW-NB15 dataset to create a more balanced training set. This augmentation mitigates class imbalance and improves the generalization capability of the classifier. In our experiments, RF trained on the combined real and synthetic data outperforms RF trained on real data alone: accuracy rises from 97.58% on the original dataset to 98.27% with GAN-generated samples added. The method is further evaluated against alternative classifiers, including Support Vector Machine (SVM), XGBoost, and Gated Recurrent Unit (GRU), as well as against related studies in the field. Our findings indicate that GAN-augmented training improves detection rates, particularly for rare attack types, while maintaining computational efficiency. RF also outperforms the other classifiers, including deep learning models, demonstrating its effectiveness as a lightweight yet robust classification method. Integrating GANs with RF offers a scalable and adaptable framework for intrusion detection that improves resilience against evolving cyber threats.
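The balancing step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how many synthetic samples are generated per class, so the rule used here (top up each class to a fraction of the largest class) is an assumption, as are the names `synthetic_quota`, `augment`, `target_fraction`, and `placeholder_generator`. The placeholder stands in for a trained GAN generator and does not implement one, and the label counts are toy values, not the real UNSW-NB15 distribution.

```python
from collections import Counter


def synthetic_quota(labels, target_fraction=1.0):
    """Return how many synthetic samples to request per class.

    Assumed rule: top up every class to target_fraction * (largest class size).
    Classes already at or above that size get a quota of 0.
    """
    if not 0.0 < target_fraction <= 1.0:
        raise ValueError("target_fraction must be in (0, 1]")
    counts = Counter(labels)
    if not counts:
        return {}
    target = int(max(counts.values()) * target_fraction)
    return {cls: max(0, target - n) for cls, n in counts.items()}


def augment(real_rows, real_labels, generate, target_fraction=1.0):
    """Combine real rows with synthetic rows returned by generate(cls, n)."""
    rows, labels = list(real_rows), list(real_labels)
    for cls, n in synthetic_quota(real_labels, target_fraction).items():
        synthetic = generate(cls, n)
        rows.extend(synthetic)
        labels.extend([cls] * len(synthetic))
    return rows, labels


def placeholder_generator(cls, n):
    # Hypothetical stand-in for sampling n rows of class `cls`
    # from a trained GAN generator; returns dummy feature vectors.
    return [[0.0] for _ in range(n)]


# Toy label distribution (illustrative only, not the real dataset counts).
toy_labels = ["Normal"] * 1000 + ["DoS"] * 200 + ["Worms"] * 10
toy_rows = [[0.0] for _ in toy_labels]

quota = synthetic_quota(toy_labels, target_fraction=0.5)
print(quota)  # {'Normal': 0, 'DoS': 300, 'Worms': 490}

rows, labels = augment(toy_rows, toy_labels, placeholder_generator, 0.5)
print(Counter(labels))
```

In a real pipeline the combined `rows` and `labels` would then be used to fit the classifier, while evaluation would normally use only held-out real data so that synthetic samples do not inflate the reported scores.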

Author Biographies

Md. Abdur Rahman, Jahangirnagar University

Professor at the Department of Mathematics, Jahangirnagar University, Savar, Dhaka, Bangladesh

PhD student in Cybersecurity & Healthcare (Generative AI, LLMs, Big Data, & Quantum ML), University of West Florida, FL, United States

Guillermo A. Francia, University of West Florida

Dr. Guillermo A. Francia, III joined the University of West Florida Center for Cybersecurity in 2018. Previously, Dr. Francia served as the Director of the Center for Information Security and Assurance and held a Distinguished Professor position at Jacksonville State University. Dr. Francia is a recipient of numerous cybersecurity research and curriculum development grants. His projects have been funded by prestigious institutions such as the National Science Foundation, Eisenhower Foundation, Department of Education, Department of Defense, and Microsoft Corporation. His scholarly interests include critical infrastructure security, connected vehicle security, security standards, regulatory compliance and audit, radio frequency signal security, industrial control systems (ICS) security, machine learning (ML) for security, and digital badging for learning and employment records (LERs). In 1996, Dr. Francia received one of the five national awards for Innovators in Higher Education from Microsoft Corporation. He served as a Fulbright scholar to Malta in 2007 and a US-UK Fulbright Cybersecurity research scholar to Imperial College London in the United Kingdom in 2017. Dr. Francia is the recipient of the 2018 National CyberWatch Center Innovations in Cyber Security Education Award (Faculty Development category).

Hossain Shahriar, University of West Florida

Dr. Hossain Shahriar is Associate Director and Professor for the Center for Cybersecurity at the University of West Florida. His research interests include mobile and web security, mHealth, EHR systems and healthcare security, HIPAA compliance checking, malware analysis, and automatic checking and mitigation of vulnerabilities. His research projects have been supported by the National Science Foundation, the National Security Agency, the Department of Defense, the National Institutes of Health, and private industry partners. He has developed and taught courses with open-source materials, such as Ethical Hacking, AI in Cybersecurity, Health Information Security and Privacy, Computing Infrastructure, and Information Security Concepts and Administration. Dr. Shahriar has served as Program Chair (IEEE ICDH), Publication Chair (ACM SAC), and Proceedings Chair (IEEE COMPSAC).

Published

2025-02-28

How to Cite

[1] M. A. Rahman, G. A. Francia, and H. Shahriar, “Leveraging GANs for Synthetic Data Generation to Improve Intrusion Detection Systems,” J. Futur. Artif. Intell. Technol., vol. 1, no. 4, pp. 429–439, Feb. 2025, doi: 10.62411/faith.3048-3719-52.

Section

Articles
