Towards Supporting Penetration Testing Education with Large Language Models: An Evaluation and Comparison

Abstract
Cybersecurity education is challenging, and it is helpful for educators to understand the capabilities of Large Language Models (LLMs) for supporting education. This study evaluates the effectiveness of LLMs in conducting a variety of penetration testing tasks. Fifteen representative tasks were selected to cover a comprehensive range of real-world scenarios. We evaluate the performance of six models (GPT-4o mini, GPT-4o, Gemini 1.5 Flash, Llama 3.1 405B, Mixtral 8x7B, and WhiteRabbitNeo) on the Metasploitable v3 Ubuntu image and OWASP WebGoat. Our findings suggest that GPT-4o mini currently offers the most consistent support, making it a valuable tool for educational purposes. However, its use in conjunction with WhiteRabbitNeo should be considered, given the latter's innovative approach to tool and command recommendations. This study underscores the need for continued research into optimising LLMs for complex, domain-specific tasks in cybersecurity education.
| Original language | English |
|---|---|
| Pages (from-to) | 227-229 |
| Number of pages | 3 |
| Journal | International Conference on Social Networks Analysis, Management and Security, SNAMS |
| Issue number | 2024 |
| DOIs | |
| Publication status | Published - 2024 |
| Event | 11th IEEE International Conference on Social Networks Analysis, Management and Security, SNAMS 2024 - Gran Canaria, Spain Duration: 9 Dec 2024 → 11 Dec 2024 |
Bibliographical note
Publisher Copyright: © 2024 IEEE.

Other keywords
- AI
- Cybersecurity
- Education
- Large Language Models (LLM)
- Penetration testing