Security Notes: What Is A Prompt Injection Attack?

Artificial Intelligence (AI) has taken the world by storm by helping people and companies alike to analyze data, detect objects, automate various tasks and much more.

It has been accompanied by two problems, one of which is not at all surprising: Large tech companies’ Large Language Model (LLM) AI products have been tracking users’ activity, which leads us to the other problem: Prompt injection attacks.

Prompt injection attacks are attempts to make large language AI models (for example ChatGPT) reveal the chat history of other users.

This means that the sensitive information that enterprises’ employees sometimes enter in LLM chatbots can be stolen by malicious actors. The same applies to personal users that enter sensitive information.

Large Language Models do not properly distinguish between the instructions set by developers (system prompts) and the instructions received from end users and hackers.

What this means is that prompt injection attacks trick AI models into breaking their company rules using manipulation tactics.

OpenAI and the other major Large Language Model vendors have been trying to prevent this from happening with guard rails. However, they haven’t been able to eliminate the problem.

The next time you enter sensitive information in an AI chatbot, bear in mind that it could be stolen by another user!

What You Can Do

Utilize offline models if applicable. The model that powers Gemini is available for use offline. Meta and OpenAI offer some of their Llama and GPT-OSS models offline as well. You can run them offline using LM-Studio, or set them up manually.