Prevention Methods
NIST-Recommended Strategies
For Direct Injection
- Train models to identify adversarial prompts
- Curate training datasets carefully
- Implement robust content filtering
- Use reinforcement learning from human feedback (RLHF)
For Indirect Injection
- Filter instructions from retrieved inputs
- Implement LLM moderators for anomaly detection
- Use interpretability-based solutions
- Validate external data sources before processing
Source: NIST AI Risk Management Framework
Input Sanitization
func sanitizeInput(_ input: String) -> String {
var cleaned = input
let patterns = [
"ignore previous",
"system prompt",
"admin mode",
"debug mode",
"override",
"jailbreak"
]
for pattern in patterns {
cleaned = cleaned.replacingOccurrences(
of: pattern,
with: "",
options: .caseInsensitive
)
}
return cleaned
}
Context Isolation
Separate system prompts from user input to prevent exposure.
actor SecureContext {
private let systemPrompt: String
init() {
self.systemPrompt = loadSystemPrompt()
}
func process(_ userInput: String) async -> String {
// System prompt never exposed to user input
let sanitized = sanitizeInput(userInput)
return await generateResponse(sanitized)
}
}
Rate Limiting
Prevent brute-force attacks and resource exhaustion.
actor RateLimiter {
private var requests: [String: [Date]] = [:]
func checkLimit(for userId: String) async -> Bool {
let now = Date()
var userRequests = requests[userId] ?? []
userRequests = userRequests.filter {
now.timeIntervalSince($0) < 60
}
guard userRequests.count < 10 else {
return false
}
userRequests.append(now)
requests[userId] = userRequests
return true
}
}
Output Filtering
Validate and filter LLM responses before returning to users.
func filterOutput(_ response: String) -> String {
let sensitivePatterns = [
"system prompt",
"api key",
"password",
"secret"
]
var filtered = response
for pattern in sensitivePatterns {
if filtered.lowercased().contains(pattern) {
return "[FILTERED: Sensitive information detected]"
}
}
return filtered
}
Monitoring & Logging
actor SecurityMonitor {
func logInteraction(userId: String, input: String, output: String) {
let event = SecurityEvent(
timestamp: Date(),
userId: userId,
inputLength: input.count,
suspiciousPatterns: detectPatterns(input),
outputLength: output.count
)
if event.suspiciousPatterns.count > 0 {
alertSecurityTeam(event)
}
}
}
Best Practices Checklist
- ✅ Never trust user input
- ✅ Validate and sanitize all inputs
- ✅ Isolate system prompts from user context
- ✅ Monitor for suspicious patterns
- ✅ Implement rate limiting
- ✅ Log security events
- ✅ Use RLHF for model alignment
- ✅ Filter instructions from external sources
- ✅ Implement LLM moderators
- ✅ Regular security audits
OWASP LLM01 Mitigation
The OWASP Top 10 for LLM Applications recommends:
- Implement strict input validation
- Use parameterized queries where applicable
- Separate user input from system instructions
- Monitor for injection attempts
- Implement defense-in-depth strategies
Source: OWASP Top 10 for LLM Applications v1.1
Next: Incident Response →