
Balancing Security and AI Accessibility

The challenge: Keep your site secure from threats while ensuring AI platforms can access and index your content. Here’s how to achieve both.
Key Principle: Security and AI visibility aren’t mutually exclusive. Proper configuration allows both.

Security Headers for AI-Friendly Sites

1. Content Security Policy (CSP)

Protect against XSS attacks while allowing AI crawlers:
<meta http-equiv="Content-Security-Policy" 
      content="default-src 'self'; 
               script-src 'self' 'unsafe-inline'; 
               style-src 'self' 'unsafe-inline'; 
               img-src 'self' data: https:;">
Or via HTTP header:
Content-Security-Policy: default-src 'self'; script-src 'self' 'unsafe-inline'
CSP headers don't affect crawler access - they only control browser behavior. Note that 'unsafe-inline' in script-src weakens the XSS protection CSP provides; drop it (or switch to nonces or hashes) if your scripts allow.

2. CORS (Cross-Origin Resource Sharing)

Allow AI platforms to access your content:
// Node.js/Express example
app.use((req, res, next) => {
  res.header('Access-Control-Allow-Origin', '*');
  res.header('Access-Control-Allow-Methods', 'GET, POST');
  res.header('Access-Control-Allow-Headers', 'Content-Type');
  next();
});
For static content, configure your server:
# Nginx
location / {
    add_header Access-Control-Allow-Origin *;
}

3. X-Frame-Options

Prevent clickjacking while keeping content accessible:
X-Frame-Options: SAMEORIGIN
Note that browsers only honor X-Frame-Options when it arrives as an HTTP response header; a <meta http-equiv> tag is ignored, so set it in your server or application configuration instead (see the middleware sketch after X-Content-Type-Options below).

4. X-Content-Type-Options

Prevent MIME type sniffing:
X-Content-Type-Options: nosniff
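If your pages are served by an Express app (as in the examples above), a minimal sketch that sets both of these headers on every response; the middleware below is an illustration, not a required setup:
// Set clickjacking and MIME-sniffing protections on every response
app.use((req, res, next) => {
  res.setHeader('X-Frame-Options', 'SAMEORIGIN');     // only same-origin pages may frame this content
  res.setHeader('X-Content-Type-Options', 'nosniff'); // browsers must respect the declared Content-Type
  next();
});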

Rate Limiting for AI Crawlers

Protect your server from abuse while still giving legitimate AI crawlers room to work:
// Express rate limiting
const rateLimit = require('express-rate-limit');

// Identify known AI crawlers by user agent
const isAiCrawler = (req) => {
  const userAgent = req.get('User-Agent') || '';
  return /GPTBot|Claude-Web|PerplexityBot|Google-Extended/i.test(userAgent);
};

// General rate limit for regular traffic; known AI crawlers are exempted here
const generalLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // limit each IP to 100 requests per window
  skip: isAiCrawler
});

// Separate, more generous limit that applies only to known AI crawlers
const botLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 1000,
  skip: (req) => !isAiCrawler(req)
});

app.use('/api/', botLimiter, generalLimiter);

Authentication Strategy

Protect sensitive areas while keeping public content accessible:
Keep these accessible to AI crawlers:
  • Homepage
  • Product pages
  • Blog posts
  • Documentation
  • Pricing pages
  • About/Contact pages
Require authentication for:
  • User dashboards
  • Admin panels
  • API endpoints (except public ones)
  • User data and profiles
  • Payment information
  • Internal tools
// Middleware that leaves public paths open and requires auth everywhere else
const publicPaths = [
  '/blog',
  '/products',
  '/docs',
  '/pricing',
  '/about'
];

const requireAuth = (req, res, next) => {
  // Match the homepage exactly; a '/' prefix check would make every path public
  const isPublic = req.path === '/' ||
    publicPaths.some(path => req.path.startsWith(path));

  if (isPublic) {
    return next();
  }

  // Check authentication for protected routes
  if (!req.user) {
    return res.status(401).json({ error: 'Unauthorized' });
  }

  next();
};

app.use(requireAuth);

HTTPS and SSL/TLS

Always use HTTPS for security and SEO:
1. Get an SSL certificate: Use Let's Encrypt for free SSL certificates or purchase one from a CA.
2. Configure your server: Enable HTTPS on your server (nginx, Apache, etc.).
3. Force HTTPS: Redirect all HTTP traffic to HTTPS (an Express alternative for apps behind a TLS-terminating proxy follows these steps):
server {
    listen 80;
    server_name asklantern.com;
    return 301 https://$server_name$request_uri;
}
4. Update internal links: Ensure all internal links use HTTPS.
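If a Node app handles requests behind a proxy or load balancer that terminates TLS, the same redirect can be done in Express. A sketch, assuming the proxy sets X-Forwarded-Proto:
// Redirect plain-HTTP requests to HTTPS when TLS is terminated upstream
app.set('trust proxy', 1); // lets req.secure reflect X-Forwarded-Proto

app.use((req, res, next) => {
  if (req.secure) {
    return next();
  }
  res.redirect(301, `https://${req.headers.host}${req.originalUrl}`);
});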
AI platforms prefer HTTPS sites - it’s a trust signal for content quality.

robots.txt Security Considerations

Prevent access to sensitive paths:
# Block sensitive directories
User-agent: *
Disallow: /admin/
Disallow: /api/private/
Disallow: /.env
Disallow: /config/
Disallow: /backup/
Disallow: /database/

# Allow AI crawlers to public content
User-agent: GPTBot
User-agent: Claude-Web
User-agent: PerplexityBot
User-agent: Google-Extended
Allow: /
Disallow: /admin/
Disallow: /api/private/

# Reference security policy
# Security: https://asklantern.com/security.txt

security.txt File

Create a security.txt file at /.well-known/security.txt:
Contact: [email protected]
Expires: 2026-12-31T23:59:59.000Z
Encryption: https://asklantern.com/pgp-key.txt
Preferred-Languages: en
Canonical: https://asklantern.com/.well-known/security.txt
Policy: https://asklantern.com/security-policy
This helps security researchers report vulnerabilities responsibly.
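If the site runs as an Express app rather than a static host, one way to expose the file is a small route; the file name and location here are assumptions for illustration:
const path = require('path');

// Serve security.txt at the well-known location
app.get('/.well-known/security.txt', (req, res) => {
  res.type('text/plain');
  res.sendFile(path.join(__dirname, 'security.txt')); // adjust the path to where the file actually lives
});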

API Security for AI Access

Secure your API while allowing AI platforms:
// API key validation
const validateApiKey = (req, res, next) => {
  const apiKey = req.headers['x-api-key'];

  // isValidApiKey() stands in for your own lookup against stored keys
  if (!apiKey || !isValidApiKey(apiKey)) {
    return res.status(401).json({
      error: 'Invalid or missing API key'
    });
  }

  next();
};

// Public endpoints (no auth required)
app.get('/api/public/stats', (req, res) => {
  res.json({ stats: getPublicStats() });
});

// Protected endpoints
app.get('/api/private/data', validateApiKey, (req, res) => {
  res.json({ data: getPrivateData() });
});

DDoS Protection

Protect against attacks while allowing legitimate AI crawlers:

  • Use a CDN: Services like Cloudflare provide DDoS protection and bot management
  • Rate limiting: Implement per-IP rate limits with exceptions for known good bots
  • Bot detection: Use services to identify and block malicious bots while allowing AI crawlers (see the verification sketch after this list)
  • Monitor traffic: Track unusual traffic patterns in your Lantern dashboard
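User-agent strings are easy to spoof, so before exempting a "known AI crawler" from rate limits or firewall rules it is worth verifying where the request actually came from. Below is a minimal reverse-DNS plus forward-confirm sketch in Node.js; the hostname suffixes are illustrative assumptions, so check each platform's published documentation for its real hostnames or IP ranges:
const dns = require('dns').promises;

// Illustrative suffixes only; confirm against each crawler's published docs
const TRUSTED_SUFFIXES = ['.openai.com', '.googlebot.com', '.google.com'];

async function isVerifiedCrawler(ip) {
  try {
    // Reverse lookup: IP -> claimed hostname(s)
    const hostnames = await dns.reverse(ip);
    for (const hostname of hostnames) {
      if (!TRUSTED_SUFFIXES.some(suffix => hostname.endsWith(suffix))) {
        continue;
      }
      // Forward-confirm: the hostname must resolve back to the same IP
      const addresses = await dns.resolve4(hostname);
      if (addresses.includes(ip)) {
        return true;
      }
    }
  } catch (err) {
    // DNS lookup failed; treat the request as unverified
  }
  return false;
}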

Cloudflare Configuration

If using Cloudflare, configure bot protection:
  1. Security Level: Set to “Medium” (not “I’m Under Attack”)
  2. Bot Fight Mode: Enable with exceptions for known AI bots
  3. Rate Limiting: Configure rules that exclude verified bots
  4. Firewall Rules: Allow specific user agents:
(http.user_agent contains "GPTBot" or 
 http.user_agent contains "Claude-Web" or 
 http.user_agent contains "PerplexityBot" or 
 http.user_agent contains "Google-Extended") 
and not (cf.threat_score > 30)

Data Privacy Compliance

Ensure compliance while maintaining AI visibility:
  • Only expose public information to AI crawlers
  • Don’t include personal data in public pages
  • Implement cookie consent if tracking users
  • Keep your privacy policy publicly accessible, including to AI crawlers
  • Disclose data collection practices
  • Provide opt-out mechanisms
  • Don’t sell personal information
  • Make privacy policy accessible
  • Follow local data protection laws
  • Implement geo-blocking if needed
  • Provide localized privacy notices

Monitoring and Logging

Track security and bot activity:
// Log AI crawler access
const logBotAccess = (req, res, next) => {
  const userAgent = req.get('User-Agent') || '';
  
  if (/GPTBot|Claude-Web|PerplexityBot|Google-Extended/i.test(userAgent)) {
    console.log({
      timestamp: new Date(),
      bot: userAgent,
      path: req.path,
      ip: req.ip
    });
  }
  
  next();
};

app.use(logBotAccess);
Use Lantern to monitor which AI platforms are accessing your site and track any unusual patterns.

Security Checklist

1. Enable HTTPS: ✅ SSL certificate installed and configured
2. Configure headers: ✅ CSP, CORS, X-Frame-Options properly set
3. Update robots.txt: ✅ Block sensitive paths, allow AI crawlers to reach public content
4. Implement rate limiting: ✅ Protect against abuse, allow legitimate crawlers
5. Add security.txt: ✅ Provide security contact information
6. Monitor access: ✅ Track AI crawler activity in your Lantern dashboard
7. Regular audits: ✅ Review security configurations quarterly

Common Security Mistakes

Don’t do these:
  • ❌ Blocking all bots with aggressive security
  • ❌ Using “I’m Under Attack” mode permanently
  • ❌ Requiring login for all pages
  • ❌ Blocking AI crawlers by mistake
  • ❌ Exposing sensitive data in public pages
  • ❌ Ignoring HTTPS

Testing Your Security

  • SSL Test: Use SSL Labs to test your SSL configuration
  • Security Headers: Check your response headers with the Security Headers scanner
  • Bot Access: Verify AI crawlers can access your public content (a quick check follows this list)
  • Lantern Monitoring: Monitor crawler activity in your dashboard
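For the bot-access check, one quick way to see what a crawler receives is to request a public page while presenting a crawler-style user agent. This uses Node 18+'s global fetch; the user-agent string is a simplified token rather than any crawler's full string, and the URL is an example:
// Fetch a public page as an AI crawler would and report the response
const checkBotAccess = async () => {
  const res = await fetch('https://asklantern.com/', {
    headers: { 'User-Agent': 'GPTBot/1.0' } // simplified token for illustration
  });
  console.log(res.status, res.headers.get('content-type'));
};

checkBotAccess();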

Next Steps

  • View all technical guides: Return to the technical optimization overview