Lock Up Your LLMs: Pulling the Plug

18 Jul 2024

It sounds like science fiction, but with all the hype around AI, the danger of ‘kidnapping’ is worth talking about.


It goes like this - companies that can afford to build their own internal LLMs feed all of their valuable IP (intellectual property, covering everything from trade secret designs to marketing campaigns and product strategies) into the model so that it can generate relevant responses. Essentially, while it’s still a non-sentient AI model, it’s also a repository of all the most valuable information the company has.


This makes it a fantastic target for a criminal attacker, or even an unethical competitor. If it can be poisoned to generate bad responses, cloned to give the attacker more insider knowledge than they know what to do with, or even somehow locked down for ransom, then it can do a huge amount of damage to the company.


Companies already struggle to secure their systems, as the regular headlines about breaches at large enterprises show. An LLM containing everything about the company is quickly going to become a favoured target for virtual ‘kidnapping.’


Whether it’s a case of adapting ransomware to encrypt models, exfiltrating them and destroying the local copy, or subverting them into an insider threat (say, if the LLM has been integrated with the company’s invoicing and finance systems without effective human oversight), the potential for harm is dramatic.


Making it worse, the value of these models will come from their availability and accessibility within the company. If no one can use them, they’re nothing but a resource sink. They must be used effectively to have value, and that means they must be accessible. How do we match the need for availability with the need for security on these systems?

Is Software Security Enough?

Maybe software-defined networks could be a solution? The problem is that any software solution is inherently vulnerable to compromise. And if we’re talking about an LLM which has been integrated with administrative systems (this is already being done; the ethics and risks of it are a whole article in themselves - maybe a book), then it would be simplicity itself for a malicious party to prompt the AI to create a way in for further attacks.


Software solutions, and classic devices such as firewalls and data diodes, definitely have a place in a secure AI system design, but something with so much value to a company justifies a much more absolute approach. If we go old-school, the last resort against any attack is pulling the plug. Ultimately, unless an attacker can change the laws of physics (or is sitting inside the data centre, which is unlikely), unplugging the network is pretty unbeatable.


The problem is that unplugging and reconnecting the network takes time - phoning your network engineer to plug the cable in or pull it out every time you want to send an authorised prompt doesn’t seem like a great option.

The Goldilock FireBreak

I recently came across a security device which impressed me, and that doesn’t happen often these days. It wasn’t because of some brand new quantum dark web AI firewall technology of the kind so many vendors advertise. Instead, it was because the company behind it has taken a very old idea, thought about it, and brought it bang up to date in a way that opens up a lot of possibilities for secure network topologies.


The approach is simple, and pretty unbeatable (obviously nothing is 100% secure, but the attack vectors against a physical disconnect are vanishingly few). Essentially the FireBreak (as opposed to firewall, of course) unplugs the network cable on command - or plugs it back in. It’s not quite having someone on call in the data centre to manipulate cables on demand, but that’s an easy way to think about it - except that this person can plug or unplug the cables in a fraction of a second, in response to a trigger sent by whatever method you choose.


Often management networks are separated from the rest only by a firewall, whereas FireBreak itself has a control plane that is completely out-of-band from the network. I can easily imagine tying FireBreak into your ticketing system so that the management planes are only physically connected during defined change windows, making them unavailable even to the most privileged attacker without going through a documented process.
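As a rough illustration of that change-window idea, the sketch below decides whether the management link should be physically connected at a given moment. Everything in it is an assumption made for illustration: `ChangeWindow`, `FakeSwitch`, `window_is_open` and `reconcile` are hypothetical names, the window is hard-coded rather than pulled from a real ticketing system, and `FakeSwitch` is a stand-in that does not represent FireBreak's actual control interface.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ChangeWindow:
    """An approved change window (hypothetical shape of a ticketing export)."""
    ticket_id: str
    start: datetime
    end: datetime


class FakeSwitch:
    """Hypothetical stand-in for a physical disconnect device, not a real API."""

    def __init__(self) -> None:
        self.connected = False  # fail safe: start disconnected

    def set_connected(self, state: bool) -> None:
        self.connected = state


def window_is_open(windows: list[ChangeWindow], now: datetime) -> bool:
    """True only while at least one approved window covers `now`."""
    return any(w.start <= now < w.end for w in windows)


def reconcile(switch: FakeSwitch, windows: list[ChangeWindow], now: datetime) -> bool:
    """Drive the switch to whatever state the change calendar demands."""
    desired = window_is_open(windows, now)
    if switch.connected != desired:
        switch.set_connected(desired)
    return desired


if __name__ == "__main__":
    windows = [
        ChangeWindow(
            ticket_id="CHG-0001",  # made-up ticket reference
            start=datetime(2024, 7, 18, 22, 0, tzinfo=timezone.utc),
            end=datetime(2024, 7, 18, 23, 0, tzinfo=timezone.utc),
        )
    ]
    switch = FakeSwitch()
    # Inside the approved window: the link is connected.
    reconcile(switch, windows, datetime(2024, 7, 18, 22, 30, tzinfo=timezone.utc))
    print(switch.connected)
    # After the window closes: the link is pulled again.
    reconcile(switch, windows, datetime(2024, 7, 18, 23, 30, tzinfo=timezone.utc))
    print(switch.connected)
```

The point of the design is that the default state is disconnected, and the only route to a connected state is an approved ticket. Running `reconcile` on a schedule would also pull the link again as soon as the window closes, even if nobody remembers to do it.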


“Operating in a so-called ‘post-breach era’, the security industry seems resigned to the idea that cyber-attacks are a matter of ‘when’ and not ‘if’. At Goldilock, we wanted to find a way to empower individuals and organisations to take more accountability and control over what happens to their data. Our flagship product, FireBreak, was born out of this idea. Its hardware-based approach enables users to physically segment all digital assets, from LLMs to entire networks, remotely, instantly and without using the internet.


This allows users to physically connect and disconnect when required, from wherever they are in the world, hiding any assets from view and enhancing their existing depth of defence, including against AI, which is essentially just a type of software.


Ultimately, we want people to rethink what needs to be kept online and see a move away from the ‘always-on’ model that can leave sensitive data exposed. Because any data kept online is at risk of being breached. With more and more companies training their own LLM models in-house, a new realm of security considerations needs to be made. Network segmentation should be part of a holistic cybersecurity strategy that keeps bad actors out and ensures they can be stopped in their tracks if they do breach external parameters.”


-- Tony Hasek, CEO and Co-Founder of Goldilock


So how does this help with the LLM kidnappings?


LLMs are big. No one is going to clone one in seconds. They can also be used asynchronously - they don’t depend on a constant connection to generate a response to a prompt, and could easily reconnect to send the response when it is finished instead of working in an interactive session.


There is no need for users to have an active session to the LLM while they’re writing a prompt. There’s no need for them to have an active connection while they’re waiting for the response to be generated, either. Connecting only for the fraction of a second needed to send a prompt or receive the response is one very effective way to minimise exposure time to threats.
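The connect-only-when-needed pattern described above can be sketched in a few lines. This is a minimal simulation under stated assumptions, not a real integration: `FakeLink`, `FakeModel`, `brief_connection` and `demo` are hypothetical names, the "model" just echoes the prompt, and nothing here reflects how FireBreak or any particular LLM server is actually driven.

```python
import time
from contextlib import contextmanager


class FakeLink:
    """Hypothetical stand-in for a physically switched network link."""

    def __init__(self) -> None:
        self.connected = False
        self.exposure_seconds = 0.0  # total time spent connected

    def connect(self) -> None:
        self.connected = True

    def disconnect(self) -> None:
        self.connected = False


@contextmanager
def brief_connection(link: FakeLink):
    """Connect for the duration of the block, then always disconnect."""
    started = time.monotonic()
    link.connect()
    try:
        yield link
    finally:
        # Runs even if the block raises, so the link is never left open.
        link.disconnect()
        link.exposure_seconds += time.monotonic() - started


class FakeModel:
    """Stands in for the isolated LLM host: prompts queue up, answers wait."""

    def __init__(self, link: FakeLink) -> None:
        self.link = link
        self.inbox: list[tuple[str, str]] = []
        self.outbox: dict[str, str] = {}

    def _require_link(self) -> None:
        if not self.link.connected:
            raise ConnectionError("link is physically disconnected")

    def submit(self, job_id: str, prompt: str) -> None:
        self._require_link()
        self.inbox.append((job_id, prompt))

    def work_offline(self) -> None:
        # Generation itself needs no network connection.
        for job_id, prompt in self.inbox:
            self.outbox[job_id] = f"response to: {prompt}"
        self.inbox.clear()

    def fetch(self, job_id: str) -> str:
        self._require_link()
        return self.outbox.pop(job_id)


def demo() -> tuple[str, FakeLink]:
    link = FakeLink()
    model = FakeModel(link)
    with brief_connection(link):      # connected: hand over the prompt
        model.submit("job-1", "Summarise the Q3 risks")
    model.work_offline()              # disconnected: model generates
    with brief_connection(link):      # connected: collect the answer
        answer = model.fetch("job-1")
    return answer, link


if __name__ == "__main__":
    answer, link = demo()
    print(answer, link.connected)
```

Using a context manager keeps the disconnect in a `finally` block, so an error while sending a prompt cannot leave the link open by accident. The `exposure_seconds` counter is only there to show that time spent connected can be measured and audited.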


This disconnect/reconnect happens too quickly for humans to be aware of any delay, and LLMs have very few use cases that are time-critical on the level of microseconds.


It’s not the only layer of security to apply, and there are many other ways it can be applied, but it’s definitely one worth thinking about.