A novel type of cyber-attack, delivered through generative artificial intelligence (AI) services, has been developed by an international team of cybersecurity researchers. The self-propagating worm has the ability to infiltrate AI services, steal data, and distribute spam through email correspondence.
Observations on generative AI systems
Generative AI systems like OpenAI ChatGPT and Google Gemini are widely used for tasks such as event creation in calendars and product orders. However, cybersecurity researchers warn against potential threats posed by these systems. They have developed a new kind of worm attack, named Morris II after the first computer worm, Morris, which infected ten percent of all computers connected to the internet at the time in 1988. Morris II launches attacks on AI-based virtual assistants via email and steals data from these emails, avoiding protection measures put in place by ChatGPT and Gemini.
The researchers tested the new attack model in isolated environments and found that large language models’ multimodal characters, or their ability to work with text, images, and video, made the attack possible. Although generative AI worms have not yet been detected in practice, the researchers urge independent developers, startups, and tech companies to consider this threat.
Understanding the mechanism of these AI worms
Generative AI systems generally work by receiving textual commands, such as requests to answer a question or create an image. These commands can be used against the system, compelling it to ignore safety measures and generate inappropriate content. The working principle of generative AI worms revolves around ‘adversarial self-replicating prompts,’ where the AI model generates a command in response to a command, much like traditional SQL injection and buffer overflow attack schemes.
Demonstration of the worm attack
The researchers created an email service capable of receiving and sending messages through a generative AI. The service connects to ChatGPT, Gemini, and an open model called LlaVA, and then uses self-replicating textual instructions, or similar instructions embedded within an image file, to exploit AI vulnerabilities. A trial attack induced the compromised email service to carry out an online search and generate a response, facilitating ‘AI heist’ and data theft.
Possibility of data theft and the way forward
The researchers warned that this technique could be exploited to extract personal information, such as phone numbers, credit card numbers, and social security numbers from emails. They attributed the method’s success to flaws in AI ecosystem design and shared their findings with Google and OpenAI. While OpenAI confirmed the threat and their ongoing measures to strengthen system robustness, Google refused to comment. Experts recommend that users not grant AI privileges such as issuing emails on their behalf. All actions should require human approval. Even so, researchers predict that such generative AI worms will be operational within the next two or three years.
This post was last modified on 03/03/2024