PHISHING EMAIL DETECTION USING LARGE LANGUAGE MODELS (LLMS): A PERFORMANCE EVALUATION OF QWEN AND GEMINI

Andyana Muhandhatul Nabila
Moh Sulthan Arief Rahmatullah

Abstract

The growing complexity of network infrastructure and the rising sophistication of phishing attacks call for advanced cybersecurity solutions. Artificial Intelligence for IT Operations (AIOps) integrates big data analytics, machine learning, and automation to improve real-time detection of and response to security threats. This study evaluates the zero-shot performance of three Large Language Models (LLMs), Gemini 2.5 Pro, Gemini 2.5 Flash, and Qwen 3, in detecting phishing emails in an AIOps environment at Institut Teknologi Sepuluh Nopember (ITS). The models show different strengths: Gemini 2.5 Pro achieved 99.8% accuracy on legitimate emails, minimizing false positives and workflow disruption, while Gemini 2.5 Flash achieved 89.1% accuracy on phishing emails, prioritizing threat prevention. Qwen 3 performed poorly, most likely because it is less well aligned with the nuances of English-language phishing. Because these results were obtained without any fine-tuning, they highlight the out-of-the-box efficacy of LLMs for cybersecurity, offering an accessible, high-performance tool for organizations with limited AI resources. The study underscores the potential of LLMs in AIOps to improve automated security monitoring and incident response, and advocates a layered approach that combines smart technology, user training, and organizational policies to combat evolving phishing threats.
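
The two headline figures in the abstract are per-class accuracies: one measured on legitimate emails only, the other on phishing emails only. The Python sketch below shows how such figures could be computed and what a zero-shot prompt might look like. It is an illustration under stated assumptions, not the authors' pipeline: the prompt wording, the function names `build_zero_shot_prompt` and `per_class_accuracy`, and the toy labels are all hypothetical, and no real model is called.

```python
from collections import Counter

# Assumed label names; the study's exact label set is not given in the abstract.
LABELS = ("phishing", "legitimate")


def build_zero_shot_prompt(email_text: str) -> str:
    """Build a zero-shot prompt: task instructions only, no labelled examples.

    The wording is hypothetical and is not taken from the study.
    """
    return (
        "Classify the email below as 'phishing' or 'legitimate'. "
        "Answer with exactly one word.\n\n"
        f"Email:\n{email_text}"
    )


def per_class_accuracy(y_true, y_pred):
    """Return, for each class, the fraction of its emails labelled correctly.

    For the 'phishing' class this is the detection rate (recall); for the
    'legitimate' class it is one minus the false-positive rate.
    """
    total = Counter()
    correct = Counter()
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    # Skip classes that never occur to avoid dividing by zero.
    return {c: correct[c] / total[c] for c in LABELS if total[c]}


# Toy labels for illustration only; these are NOT the study's data.
y_true = ["phishing"] * 4 + ["legitimate"] * 4
y_pred = ["phishing", "phishing", "phishing", "legitimate"] + ["legitimate"] * 4

print(per_class_accuracy(y_true, y_pred))
```

Reporting the two classes separately makes the trade-off described in the abstract visible: a model can score highly on legitimate emails, producing few false alarms, while still missing phishing emails, and the reverse is also possible.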

Author Biographies

Andyana Muhandhatul Nabila, Sepuluh Nopember Institute of Technology

Department of Information Technology, Faculty of Intelligent Electrical and Informatics Technology, Sepuluh Nopember Institute of Technology

Moh Sulthan Arief Rahmatullah, Sepuluh Nopember Institute of Technology

Department of Information Technology, Faculty of Intelligent Electrical and Informatics Technology, Sepuluh Nopember Institute of Technology

How to Cite

PHISHING EMAIL DETECTION USING LARGE LANGUAGE MODELS (LLMS): A PERFORMANCE EVALUATION OF QWEN AND GEMINI. (2025). Kohesi: Jurnal Sains Dan Teknologi, 8(5), 81-90. https://doi.org/10.2238/v2btct32
