Lifestyle

Typographic Attacks on Vision-LLMs: Evaluating Adversarial Threats in Autonomous Driving Systems

AuthorSeptember 28, 2025

2 8 minutes read

Typographic Attacks on Vision-LLMs: Evaluating Adversarial Threats in Autonomous Driving Systems

Estimated reading time: 7 minutes

Vision-Large-Language Models (Vision-LLMs) are vital for autonomous driving but are highly susceptible to *typographic adversarial attacks*.
These attacks exploit the *auto-regressive capabilities* of Vision-LLMs, leading to dangerous misinterpretations of traffic signs and visual information.
New research introduces a *dataset-independent framework* and *linguistic augmentation schemes* to generate and understand realistic, transferable typographic threats.
Key mitigation strategies include implementing *adversarial training*, ensuring *multi-modal sensor redundancy*, and establishing *industry-wide security benchmarks*.
Existing Vision-LLMs, such as LLaVA, Qwen-VL, VILA, and Imp, have shown particular vulnerability to these specific types of attacks.

Understanding Typographic Attacks and Vision-LLMs
A Novel Framework for Real-World Threat Assessment
Real-World Example: The “Ghost” Speed Limit Sign
Mitigating the Threat: A Call to Action for Safer Autonomous Systems
Conclusion
FAQ

The rise of autonomous driving (AD) systems promises a future of safer, more efficient transportation. At the heart of this revolution are sophisticated artificial intelligence models, particularly Vision-Large-Language Models (Vision-LLMs). These advanced systems leverage their visual-language reasoning capabilities to perceive, predict, plan, and control vehicles, making real-time decisions in complex traffic scenarios. However, as with any cutting-edge technology, their integration introduces new vulnerabilities, specifically against an insidious threat known as typographic attacks.

Despite their impressive capabilities, Vision-LLMs are unfortunately not impervious against adversarial attacks that can misdirect their reasoning processes. This article delves into the critical issue of typographic attacks, exploring how they can compromise the reliability and safety of AD systems, and what steps can be taken to mitigate these risks. We’ll examine recent research that not only highlights these vulnerabilities but also proposes innovative frameworks for understanding and addressing them in realistic traffic environments.

Understanding Typographic Attacks and Vision-LLMs

Vision-LLMs represent a significant leap forward in AI, combining the power of large language models with advanced computer vision. They excel at tasks requiring both visual understanding and linguistic interpretation, such as answering questions about images, performing zero-shot optical character recognition, and providing textual justifications for complex scenarios. This ability to convey explicit reasoning steps on the fly makes them ideal candidates for integration into safety-critical AD systems, where transparency and reliability are paramount.

However, this very strength—their auto-regressive capabilities honed through large-scale pretraining with visual-language alignment—can be turned into a weakness. Typographic attacks exploit this by introducing subtle, visually manipulated text or symbols into an image that can trick the Vision-LLM into misinterpreting a scene. These attacks were first studied in the context of models like CLIP (Contrastive Language-Image Pre-training) and have evolved from developing general datasets to focusing on more targeted, real-world applications.

The potential implications for autonomous driving are severe. Imagine a scenario where a Vision-LLM misreads a stop sign as a speed limit sign, or misinterprets a warning label on an obstruction. Such errors, though stemming from seemingly minor visual perturbations, could lead to catastrophic outcomes, compromising not only the vehicle’s occupants but also other road users.

Verbatim Content from Research Paper

“Table of Links
Abstract and 1. Introduction
Related Work
2.1 Vision-LLMs
2.2 Transferable Adversarial Attacks

Preliminaries
3.1 Revisiting Auto-Regressive Vision-LLMs
3.2 Typographic Attacks in Vision-LLMs-based AD Systems

Methodology
4.1 Auto-Generation of Typographic Attack
4.2 Augmentations of Typographic Attack
4.3 Realizations of Typographic Attacks

Experiments

Conclusion and References

Abstract
Vision-Large-Language-Models (Vision-LLMs) are increasingly being integrated into autonomous driving (AD) systems due to their advanced visual-language reasoning capabilities, targeting the perception, prediction, planning, and control mechanisms. However, Vision-LLMs have demonstrated susceptibilities against various types of adversarial attacks, which would compromise their reliability and safety. To further explore the risk in AD systems and the transferability of practical threats, we propose to leverage typographic attacks against AD systems relying on the decision-making capabilities of Vision-LLMs. Different from the few existing works developing general datasets of typographic attacks, this paper focuses on realistic traffic scenarios where these attacks can be deployed, on their potential effects on the decision-making autonomy, and on the practical ways in which these attacks can be physically presented. To achieve the above goals, we first propose a dataset-agnostic framework for automatically generating false answers that can mislead Vision-LLMs’ reasoning. Then, we present a linguistic augmentation scheme that facilitates attacks at image-level and region-level reasoning, and we extend it with attack patterns against multiple reasoning tasks simultaneously. Based on these, we conduct a study on how these attacks can be realized in physical traffic scenarios. Through our empirical study, we evaluate the effectiveness, transferability, and realizability of typographic attacks in traffic scenes. Our findings demonstrate particular harmfulness of the typographic attacks against existing Vision-LLMs (e.g., LLaVA, Qwen-VL, VILA, and Imp), thereby raising community awareness of vulnerabilities when incorporating such models into AD systems. We will release our source code upon acceptance.
1 Introduction
Vision-Language Large Models (Vision-LLMs) have seen rapid development over the recent years [1, 2, 3], and their incorporation into autonomous driving (AD) systems have been seriously considered by both industry and academia [4, 5, 6, 7, 8, 9]. The integration of Vision-LLMs into AD systems showcases their ability to convey explicit reasoning steps to road users on the fly and satisfy the need for textual justifications of traffic scenarios regarding perception, prediction, planning, and control, particularly in safety-critical circumstances in the physical world. The core strength of VisionLLMs lies in their auto-regressive capabilities through large-scale pretraining with visual-language alignment [1], making them even able to perform zero-shot optical character recognition, grounded reasoning, visual-question answering, visual-language reasoning, etc. Nevertheless, despite their impressive capabilities, Vision-LLMs are unfortunately not impervious against adversarial attacks that can misdirect the reasoning processes [10]. Any successful attack strategies have the potential to pose critical problems when deploying Vision-LLMs in AD systems, especially those that may even bypass the models’ black-box characteristics. As a step towards their reliable adoption in AD, studying the transferability of adversarial attacks is crucial to raising awareness of practical threats against deployed Vision-LLMs, and to efforts in building appropriate defense strategies for them.

A Novel Framework for Real-World Threat Assessment

Recent groundbreaking research has aimed to fill the existing gaps in understanding typographic attacks within safety-critical systems. Unlike previous efforts that focused on general datasets, this new approach specifically investigates realistic traffic scenarios and the practical ways these attacks can be physically manifested. The methodology is threefold:

Dataset-Independent Framework: A novel framework has been introduced to automatically generate misleading answers. This framework is not tied to specific datasets, allowing for broader application in disrupting the reasoning processes of Vision-LLMs. By focusing on generating “false answers,” it aims to directly mislead the decision-making autonomy of the AI.
Linguistic Augmentation Schemes: To enhance the potency of these attacks, a sophisticated linguistic augmentation scheme has been developed. This scheme facilitates stronger typographic attacks at both image-level and region-level reasoning. Critically, it can be extended to target multiple reasoning tasks simultaneously, making the attacks more comprehensive and challenging to defend against.
Empirical Study in Semi-Realistic Scenarios: The research extends beyond theoretical attack generation by conducting a thorough study on how these attacks can be realized in physical traffic environments. This practical investigation provides invaluable insights into the effectiveness, transferability, and real-world feasibility of typographic attacks in traffic scenes.

Real-World Example: The “Ghost” Speed Limit Sign

Consider an autonomous vehicle approaching a construction zone with a temporary speed limit sign displaying “25 MPH.” A typographic attack could involve placing a subtly modified sticker or graffiti near the sign that, to the human eye, appears to be a mere imperfection or unrelated text. However, a Vision-LLM might interpret the visual noise in conjunction with the sign, processing “25 MPH” as “75 MPH” due to a strategically placed, almost imperceptible character alteration. This misinterpretation could lead the AD system to dangerously accelerate through the construction zone, demonstrating the immediate and severe safety risks posed by such vulnerabilities.

Mitigating the Threat: A Call to Action for Safer Autonomous Systems

The empirical findings from this research are stark: typographic attacks pose a particular harmfulness against existing Vision-LLMs like LLaVA, Qwen-VL, VILA, and Imp. These results serve as a crucial alert to the community regarding the vulnerabilities inherent in incorporating such models into AD systems. While the advanced capabilities of Vision-LLMs are undeniably transformative, their susceptibility to these subtle yet potent attacks demands immediate attention.

Building robust and resilient autonomous vehicles requires a proactive stance against adversarial threats. This involves not only understanding the nature of these attacks but also developing comprehensive defense strategies. Here are three actionable steps for researchers, developers, and policymakers:

Develop Robust Adversarial Training and Detection Mechanisms: Integrate advanced adversarial training techniques during Vision-LLM development to expose models to typographic perturbations. Simultaneously, research and implement real-time detection systems capable of identifying and flagging visually manipulated text or signs in traffic environments before they can influence decision-making.
Prioritize Multi-Modal Redundancy in Sensor Fusion: Do not rely solely on Vision-LLMs for critical decision-making based on visual text. Implement redundant systems that cross-reference visual information with other sensor data (e.g., LiDAR, radar, GPS map data, vehicle-to-infrastructure communication) to validate interpreted commands and warnings. If a Vision-LLM identifies a speed limit, a radar check for surrounding vehicle speeds or a map database check for typical speed limits in that area can act as a crucial failsafe.
Establish Industry-Wide Security Benchmarks and Standards: Collaborate across the autonomous driving industry to create standardized benchmarks for evaluating the resilience of Vision-LLMs against various adversarial attacks, including typographic ones. Regular, independent audits and penetration testing should become a mandatory part of the development and deployment lifecycle for AD systems.

Conclusion

The integration of Vision-LLMs into autonomous driving systems holds immense promise, but it also introduces complex security challenges. Typographic attacks, once a theoretical concern, are now demonstrably practical threats with the potential for severe real-world consequences. The research discussed here provides a vital roadmap for understanding these vulnerabilities and underscores the urgent need for a concerted effort to build more resilient AI systems for our roads.

By proactively addressing these adversarial threats through innovative research, robust defense mechanisms, and industry collaboration, we can ensure that the future of autonomous driving is not only intelligent and efficient but, most importantly, safe for everyone.

Learn More About AI Security in Autonomous Vehicles

This paper is available on arxiv under CC BY 4.0 DEED license.

Authors:
(1) Nhat Chung, CFAR and IHPC, A*STAR, Singapore and VNU-HCM, Vietnam;
(2) Sensen Gao, CFAR and IHPC, A*STAR, Singapore and Nankai University, China;
(3) Tuan-Anh Vu, CFAR and IHPC, A*STAR, Singapore and HKUST, HKSAR;
(4) Jie Zhang, Nanyang Technological University, Singapore;
(5) Aishan Liu, Beihang University, China;
(6) Yun Lin, Shanghai Jiao Tong University, China;
(7) Jin Song Dong, National University of Singapore, Singapore;
(8) Qing Guo, CFAR and IHPC, A*STAR, Singapore and National University of Singapore, Singapore.

Frequently Asked Questions (FAQ)

What are Vision-LLMs and why are they important for autonomous driving?

Vision-Large-Language Models (Vision-LLMs) are advanced AI systems that combine the capabilities of large language models with computer vision. They are crucial for autonomous driving because they can perceive visual information, interpret it linguistically, and provide explicit reasoning for decisions, aiding in perception, prediction, planning, and control in complex traffic scenarios.

How do typographic attacks work against Vision-LLMs?

Typographic attacks exploit the auto-regressive capabilities of Vision-LLMs. They involve introducing subtle, visually manipulated text, symbols, or perturbations into an image. These minor visual alterations can trick the Vision-LLM into misinterpreting a scene, such as misreading a stop sign or a speed limit, leading to potentially dangerous decisions.

What are the real-world implications of these attacks?

The real-world implications are severe. A Vision-LLM misinterpreting a traffic sign or warning due to a typographic attack could cause an autonomous vehicle to make critical errors, such as accelerating in a construction zone or failing to stop, risking the safety of occupants and other road users.

What strategies can mitigate typographic attack risks?

Mitigation strategies include developing robust adversarial training techniques for Vision-LLMs, implementing real-time detection systems for manipulated visuals, prioritizing multi-modal redundancy in sensor fusion (cross-referencing visual data with LiDAR, radar, GPS), and establishing industry-wide security benchmarks and standards for AD systems.

Which Vision-LLMs are most vulnerable to these attacks?

According to recent research, existing Vision-LLMs such as LLaVA, Qwen-VL, VILA, and Imp have demonstrated particular harmfulness and vulnerability to typographic attacks. This highlights the urgent need for addressing these threats in current and future AD system integrations.

AuthorSeptember 28, 2025

2 8 minutes read

Typographic Attacks on Vision-LLMs: Evaluating Adversarial Threats in Autonomous Driving Systems

Typographic Attacks on Vision-LLMs: Evaluating Adversarial Threats in Autonomous Driving Systems