Prompt Injection

An unsolved challenge in LLM applications

AI Insiders

whoami

  • Donato Capitella
  • Software Engineer and Principal Security Consultant at WithSecure (Cyber Security Consultancy)
  • But mostly a tech enthusiast who likes to discover how things work by breaking them apart.
  • Recent YouTube channel on AI: LLM Chronicles

If building or testing LLM-powered applications, you'll learn:

  • How attackers leverage prompt injection / jailbreaking
  • What's the impact when LLMs are given access to tools/plugins
  • Guidelines to secure your LLM applications against injection/jailbreaking

Threat Landscape

MITRE ATLAS™

(Adversarial Threat Landscape for Artificial-Intelligence Systems)

Terminology

Adversarial Prompts

Google Docs - Summary

Browser Extensions

Access to Tools / Plugins

Prompt Injection can become a very serious issue when LLMs are given access to tools/plugins

*Yao, S. et al. (2022). ReAct: Synergizing Reasoning and Acting in Language Models https://arxiv.org/abs/2210.03629

Case Studies

Hands-on examples of Prompt Injection with synthetic Langchain applications.

Bank Agent
Example of Direct Prompt Injection with a chat agent that helps bank customers with their transactions.
Email Agent
Example of Indirect Prompt Injection with a chat agent that helps users with their mailbox.

Defense Strategies

Prompt Injection

UNSOLVED CHALLENGE?

References