Modeling Attacker Behavior with RL Agents to Anticipate Critical Vulnerability Paths
Date:
In this talk, I presented the key contributions of my PhD research on a proactive approach to cyber defense: simulating attacker behavior with reinforcement learning (RL) agents to anticipate and identify the most critical vulnerability paths in a network, guided by a desired threat model. The agents traverse the network by selecting vulnerabilities to exploit, uncovering the most critical attack paths along the way. The talk focused on two main contributions: (1) Continuous CyberBattleSim, our extension of Microsoft's CyberBattleSim simulator, which supports more realistic vulnerability allocations and cyber-terrain and automates scenario transitions, enabling more effective training and evaluation of RL agents; (2) strategies for improving the generalization and scalability of RL agents by reformulating their observation and action spaces, in particular by representing vulnerabilities in embedding spaces derived from language models and encoding network topology with graph neural networks. Through these contributions, we move toward more scalable and generalizable AI-driven defense solutions that anticipate attacker behavior and strengthen the resilience of dynamic network environments, for example by using predicted attack paths to prioritize vulnerability patching.
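To make the second contribution more concrete, the sketch below shows one way vulnerability descriptions can be mapped into a shared embedding space with an off-the-shelf sentence encoder, so the agent reasons over dense vectors rather than scenario-specific vulnerability IDs. The model choice, the sample descriptions, and the similarity-scoring step are illustrative assumptions, not the exact setup used in the thesis.

```python
# Sketch: embed vulnerability descriptions with a pretrained language model.
# Assumptions: sentence-transformers is installed; the encoder name and the
# sample descriptions are illustrative, not the thesis configuration.
from sentence_transformers import SentenceTransformer
import numpy as np

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder works

# Hypothetical vulnerability descriptions (e.g., drawn from CVE summaries).
descriptions = [
    "Remote code execution via unsanitized input in the admin endpoint.",
    "Privilege escalation through a misconfigured SMB share.",
    "Credential leakage from a world-readable configuration file.",
]

# One fixed-size vector per vulnerability; vulnerabilities unseen during
# training are embedded the same way at test time, which is what enables
# generalization across scenarios with different vulnerability allocations.
vuln_embeddings = encoder.encode(descriptions, normalize_embeddings=True)
print(vuln_embeddings.shape)  # (3, 384) for this encoder

# An agent could then score candidate actions by similarity between a learned
# "intent" vector and the embeddings of the vulnerabilities on a node
# (here the intent vector is random, purely for illustration).
intent = np.random.randn(vuln_embeddings.shape[1]).astype(np.float32)
intent /= np.linalg.norm(intent)
scores = vuln_embeddings @ intent
print(int(scores.argmax()))  # index of the highest-scoring vulnerability
```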
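In the same spirit, a graph neural network can summarize the network topology into per-node embeddings whose computation does not depend on a fixed number of hosts, which is one way to address scalability. The sketch below uses PyTorch Geometric; the feature layout, layer sizes, and toy topology are illustrative assumptions.

```python
# Sketch: encode network topology with a GNN so the agent observes per-node
# embeddings rather than a fixed-size, topology-specific vector.
import torch
from torch_geometric.nn import GCNConv


class TopologyEncoder(torch.nn.Module):
    def __init__(self, in_dim: int, hidden_dim: int = 64, out_dim: int = 32):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, out_dim)

    def forward(self, x, edge_index):
        # Two rounds of message passing: each node embedding aggregates
        # information from its 2-hop neighborhood of reachable hosts.
        h = self.conv1(x, edge_index).relu()
        return self.conv2(h, edge_index)


# Toy network: 4 hosts, undirected edges listed in both directions.
edge_index = torch.tensor(
    [[0, 1, 1, 2, 2, 3],
     [1, 0, 2, 1, 3, 2]], dtype=torch.long)

# Hypothetical per-host features, e.g., discovered flag, privilege level,
# exposed service counts (random here, purely for illustration).
x = torch.randn(4, 8)

encoder = TopologyEncoder(in_dim=8)
node_embeddings = encoder(x, edge_index)
print(node_embeddings.shape)  # torch.Size([4, 32])
```

Because the same encoder weights apply to any number of hosts and edges, an agent trained this way can, in principle, be evaluated on network topologies of sizes it never saw during training.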