GPUHammer attack exposes new risks for AI and cloud computing
A team of researchers from the University of Toronto has unveiled a novel hardware attack, dubbed "GPUHammer", that exposes a significant vulnerability in NVIDIA graphics processing units (GPUs). The attack, a variant of the infamous RowHammer technique, can silently corrupt data and degrade the accuracy of artificial intelligence (AI) models, raising alarm across the cybersecurity and AI communities.
What is GPUHammer?
GPUHammer is a sophisticated evolution of the RowHammer attack, a hardware-level vulnerability first identified in dynamic random-access memory (DRAM) over a decade ago. Traditionally, RowHammer exploits the physical behaviour of DRAM by repeatedly accessing ("hammering") specific rows of memory cells, causing electrical interference that can flip bits in adjacent rows. This can result in data corruption, privilege escalation, and breaches of memory isolation.
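The disturbance mechanism can be illustrated with a toy software model. This is a simulation for intuition only: the flip threshold and row sizes are made-up numbers, real DRAM behaviour is analogue and far more complex, and this is not attack code.

```python
import random

FLIP_THRESHOLD = 50_000  # toy value: disturbances before a victim bit may flip


class ToyDram:
    """Toy model of DRAM rows: activating a row disturbs its neighbours."""

    def __init__(self, rows: int, bits_per_row: int, seed: int = 0):
        self.rows = [[0] * bits_per_row for _ in range(rows)]
        self.disturb = [0] * rows          # accumulated disturbance per row
        self.rng = random.Random(seed)

    def activate(self, row: int):
        """Accessing `row` electrically disturbs the adjacent rows."""
        for victim in (row - 1, row + 1):
            if 0 <= victim < len(self.rows):
                self.disturb[victim] += 1
                if self.disturb[victim] >= FLIP_THRESHOLD:
                    bit = self.rng.randrange(len(self.rows[victim]))
                    self.rows[victim][bit] ^= 1   # a bit flips in the victim row
                    self.disturb[victim] = 0      # flip relieves the charge leakage


# Double-sided hammering: rows 3 and 5 are the aggressors, row 4 the victim.
dram = ToyDram(rows=8, bits_per_row=32)
for _ in range(30_000):
    dram.activate(3)
    dram.activate(5)
print("bits flipped in victim row 4:", sum(dram.rows[4]))   # → 1
```

Because row 4 sits between both aggressors it absorbs two disturbances per loop iteration, crossing the threshold long before the singly-disturbed rows 2 and 6 do; this is why double-sided patterns are the classic RowHammer access strategy.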
While RowHammer attacks have previously focused on CPUs and system memory, GPUHammer marks the first successful demonstration of such an attack against discrete GPUs, specifically those using GDDR6 memory. The researchers targeted the NVIDIA A6000 GPU, a popular choice for AI workloads, and managed to induce multiple bit flips across several DRAM banks.
How does the attack work?
The attack required overcoming significant technical challenges, including reverse-engineering proprietary GPU memory mappings and developing GPU-specific access patterns to maximise the effect. By carefully orchestrating memory accesses, the researchers were able to bypass existing mitigations such as target row refresh (TRR), which are designed to prevent such attacks in modern memory modules.
Once a bit flip is induced, the consequences can be severe. In proof-of-concept tests, a single bit flip was enough to degrade the accuracy of a deep neural network model trained on the ImageNet dataset from 80% to as low as 0.1%. Well-known neural network architectures such as AlexNet, VGG16, ResNet50, DenseNet161, and InceptionV3 were all found to be vulnerable. This means that attackers could potentially sabotage AI systems by corrupting their internal weights, rather than merely manipulating input data.
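The outsized impact of a single flipped bit is easy to see in the IEEE-754 encoding that model weights use: flipping a high exponent bit changes a value by dozens of orders of magnitude. A minimal Python illustration (the weight value 0.5 is an arbitrary example, not a figure from the research):

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Flip one bit in the IEEE-754 float32 encoding of `value`."""
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", bits ^ (1 << bit)))
    return flipped


w = 0.5                      # a plausible small model weight
corrupted = flip_bit(w, 30)  # flip the most significant exponent bit
print(w, "->", corrupted)    # 0.5 -> ~1.7e38
```

A weight silently jumping from 0.5 to roughly 1.7 × 10^38 propagates through every subsequent layer, which is consistent with the catastrophic accuracy collapse the researchers observed.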
Why does this matter?
GPUs are the backbone of modern AI and machine learning, powering everything from autonomous vehicles to fraud detection systems and cloud computing platforms. The ability to silently corrupt AI models without direct access to their data or code represents a new class of threat. In shared GPU environments, such as cloud-based machine learning platforms or virtual desktop infrastructures, a malicious user could potentially launch GPUHammer against neighbouring workloads, affecting their reliability and integrity.
This attack vector is particularly concerning because it operates below the level of traditional security controls, evading detection by standard software-based defences. Silent corruption of AI models could lead to undetected errors, loss of trust in automated systems, and significant operational disruptions.
Industry response and mitigation
Following responsible disclosure in January 2025, NVIDIA acknowledged the vulnerability and issued a security advisory. The company is urging users to enable system-level error correction codes (ECC) as a primary defence. ECC works by adding redundant bits to memory, allowing single-bit errors to be detected and corrected before they cause harm.
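Production ECC uses stronger SECDED codes, but the classic Hamming(7,4) code below is a minimal sketch of the same principle: redundant parity bits let the receiver pinpoint and repair a single flipped bit.

```python
def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit Hamming codeword (positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4            # parity over positions 1,3,5,7
    p2 = d1 ^ d3 ^ d4            # parity over positions 2,3,6,7
    p3 = d2 ^ d3 ^ d4            # parity over positions 4,5,6,7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(codeword):
    """Locate and repair a single flipped bit via the parity syndrome."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3   # 1-based position of the bad bit, 0 if clean
    if syndrome:
        c[syndrome - 1] ^= 1
    return c


codeword = hamming74_encode([1, 0, 1, 1])
corrupted = list(codeword)
corrupted[2] ^= 1                     # simulate a RowHammer-style single-bit flip
print(hamming74_correct(corrupted) == codeword)   # → True
```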
Enabling ECC is particularly important for data centre and workstation GPUs handling sensitive AI workloads. However, it is not without drawbacks: enabling ECC can reduce memory capacity by around 6.25% and may introduce up to a 10% slowdown in machine learning inference tasks on affected GPUs. Newer NVIDIA GPUs, such as the H100 and RTX 5090, are not vulnerable to GPUHammer due to integrated on-die ECC, which provides robust protection against such attacks.
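The capacity cost is simple arithmetic, assuming the quoted 6.25% corresponds to one sixteenth of DRAM being reserved for check data, and using the A6000's 48 GB as an illustrative capacity:

```python
TOTAL_GB = 48.0        # advertised capacity of an A6000, used here for illustration
ECC_OVERHEAD = 1 / 16  # 6.25%: one sixteenth of DRAM reserved for check bits

usable = TOTAL_GB * (1 - ECC_OVERHEAD)
print(f"usable with ECC: {usable:.1f} GB")   # → usable with ECC: 45.0 GB
```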
NVIDIA also recommends monitoring GPU error logs for ECC-related corrections, which can signal ongoing bit-flip attempts. Selective ECC activation for high-risk workloads and regular security reviews are advised to minimise performance impact while maintaining protection.
Broader implications for cybersecurity
The emergence of GPUHammer highlights the evolving landscape of hardware-based attacks and the need for holistic security approaches. As AI systems become more pervasive, attackers are increasingly targeting the underlying hardware to bypass traditional defences. The discovery of GPUHammer is expected to prompt a re-evaluation of security practices in both hardware design and AI deployment, with industry leaders and cloud providers working to patch susceptible architectures and update risk management strategies.
Experts warn that this is likely just the beginning. As attackers continue to innovate, the arms race between hardware security and exploitation will intensify, underscoring the importance of proactive research, responsible disclosure, and collaboration between academia, industry, and government.
GPUHammer represents a significant milestone in the evolution of hardware-level attacks, exposing new vulnerabilities in the infrastructure powering AI and cloud computing. While mitigations exist, the attack serves as a stark reminder that as technology advances, so too do the methods of those seeking to undermine it. Ongoing vigilance and investment in hardware security will be essential to safeguard the future of AI-driven innovation.