Existing works have established multiple benchmarks to highlight the security risks associated with Code GenAI. These risks are primarily reflected in two areas: a model's potential to generate insecure code (insecure coding) and its utility in cyberattacks (cyberattack helpfulness).
While these benchmarks have made significant strides, there remain opportunities for further improvement.
To address these gaps, we develop SecCodePLT, a unified and comprehensive evaluation platform for code GenAIs' risks.
To the best of our knowledge, this is the first platform to enable precise security risks evaluation and end-to-end cyberattack helpfulness assessment of code GenAI.
Additionally, we are the first to reveal the security risks in Cursor, a popular AI code editor.
We introduce a two-stage data creation pipeline, which enables scalability and ensures data quality.
Figure 1: Insecure Coding Data Pipeline
We then construct a cyberattack helpfulness benchmark to evaluate a model's capability in facilitating end-to-end cyberattacks.
Figure 2: Cyberattack Helpfulness Evaluation Framework
1. SecCodePLT achieves nearly 100% in both security relevance and instruction faithfulness, demonstrating its high quality. In contrast, CyberSecEval achieves only 68% and 42% on security relevance and instruction faithfulness, with 3 CWEs receiving scores lower than 30%.
Figure 3: Security Relevance Comparison
Figure 4: Instruction Faithfulness Comparison
2. When testing SecCodePLT against SOTA models on instruction generation and code completion tasks, GPT-4o
is the most secure model, achieving a 55% secure coding rate.
A larger model tends to be more secure.
However, there remains significant room for further improvement.
3. Providing security policy reminders to highlight the potential vulnerabilities improves the secure coding rate by approximately 20%.
Figure 5: Instruction Generation Results
Figure 6: Code Completion Results
4. GPT-4o
can launch full end-to-end cyberattacks but with a low success rate, while Claude
is much safer in assisting attackers implement attacks with over a 90% refusal rate on sensitive attack steps.
Figure 7: Comparison of AI Models' Helpfulness in Cyberattacks
5. Cursor achieves an overall around 60% secure coding rate but fails entirely on some critical CWEs. Besides its different functionalities have different levels of risks.
@misc{yang2024seccodeplt,
title={SecCodePLT: A Unified Platform for Evaluating the Security of Code GenAI},
author={Yu Yang and Yuzhou Nie and Zhun Wang and Yuheng Tang and Wenbo Guo and Bo Li and Dawn Song},
year={2024},
eprint={2410.11096},
archivePrefix={arXiv},
primaryClass={cs.CR},
url={https://arxiv.org/abs/2410.11096},
}