SECURE WEB SCRAPING

PROTECT AGENTS FROM WEB ATTACKS

Automatically scan scraped web content for hidden malicious instructions. Block indirect prompt injections before they reach your AI agents.

Key Capabilities

Hidden Instruction Detection

Detect malicious instructions hidden in HTML comments, invisible text, and metadata.
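To make the idea concrete, here is a minimal sketch of how hidden text could be surfaced using only Python's standard library. It is an illustration, not PromptGuard's actual detector: the class name `HiddenTextFinder`, the helper `find_hidden_text`, and the location labels are all hypothetical, and it only checks inline styles and the `hidden` attribute (not external CSS).

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class HiddenTextFinder(HTMLParser):
    """Hypothetical helper: collects text a reader would not see when rendered."""

    def __init__(self):
        super().__init__()
        self.findings = []   # list of (location, text) tuples
        self._stack = []     # one bool per open element: is it hidden?

    def _is_hidden(self, attrs):
        attrs = dict(attrs)
        style = (attrs.get("style") or "").replace(" ", "").lower()
        return ("hidden" in attrs
                or "display:none" in style
                or "visibility:hidden" in style)

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            content = dict(attrs).get("content")
            if content:
                self.findings.append(("metadata", content))
        if tag not in VOID_TAGS:
            # Children of a hidden element are hidden too.
            parent_hidden = bool(self._stack) and self._stack[-1]
            self._stack.append(parent_hidden or self._is_hidden(attrs))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        if self._stack and self._stack[-1] and data.strip():
            self.findings.append(("invisible_text", data.strip()))

    def handle_comment(self, data):
        if data.strip():
            self.findings.append(("html_comment", data.strip()))


def find_hidden_text(html):
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.findings
```

Each finding is only a candidate for review: ordinary pages also contain comments and meta descriptions, so a real scanner would still need to judge whether the text reads like an instruction.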

Unicode Trick Detection

Identify zero-width characters and other Unicode tricks used to smuggle instructions.
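As a rough illustration of what such a check can look like, the sketch below flags Unicode format characters (general category "Cf"), which include zero-width spaces and joiners, and decodes text hidden in the Unicode tag block. The function names are hypothetical and this is not PromptGuard's implementation. Note that "Cf" also covers a few characters with legitimate uses, such as the soft hyphen, so hits are signals rather than verdicts.

```python
import unicodedata

# Unicode "tag" characters (U+E0000-U+E007F) mirror ASCII but are not displayed.
TAG_BLOCK = range(0xE0000, 0xE0080)


def find_invisible_chars(text):
    """Return (index, codepoint, name) for each format or tag character."""
    hits = []
    for i, ch in enumerate(text):
        # Category "Cf" covers zero-width spaces/joiners, bidi controls,
        # the byte order mark, and the assigned tag characters.
        if unicodedata.category(ch) == "Cf" or ord(ch) in TAG_BLOCK:
            hits.append((i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNNAMED")))
    return hits


def decode_tag_smuggling(text):
    """Recover ASCII text hidden as Unicode tag characters, if any."""
    return "".join(chr(ord(ch) - 0xE0000) for ch in text
                   if 0xE0020 <= ord(ch) <= 0xE007E)
```

For example, `find_invisible_chars("Hello\u200bworld")` reports a single ZERO WIDTH SPACE at index 5.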

Content Sanitization

Automatically sanitize scraped content to remove potential threats while preserving useful information.

URL Allowlisting

Define trusted domains that bypass scanning or get reduced scrutiny.
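Allowlist matching is easy to get subtly wrong, so here is a minimal sketch of one cautious approach. The `TRUSTED_DOMAINS` set and the `is_trusted` function are hypothetical and do not reflect PromptGuard's configuration format. The sketch assumes you only want to trust HTTPS URLs.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with the domains you actually trust.
TRUSTED_DOMAINS = {"example.com", "docs.python.org"}


def is_trusted(url, trusted=TRUSTED_DOMAINS):
    """True if the URL is https and its host is a trusted domain or subdomain."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower().rstrip(".")
    if parts.scheme != "https" or not host:
        return False
    # Match on whole labels so "example.com.evil.net" and "notexample.com" fail.
    return any(host == d or host.endswith("." + d) for d in trusted)
```

Comparing the parsed hostname, rather than searching the raw URL string for a trusted name, avoids lookalike hosts that merely contain an allowlisted domain.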

Configurable Sensitivity

Adjust detection sensitivity based on your risk tolerance and use case.

Threat Reporting

Get detailed reports on detected threats including location, type, and severity.

How Secure Scraping Works

1. Fetch

Your agent fetches web content through PromptGuard's secure scraping proxy.

2. Scan

Multi-layer scanning detects hidden instructions, invisible text, and malicious patterns.

3. Sanitize

Threats are removed or flagged, and the clean content is returned to your agent.
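The three steps above can be pictured with a toy pipeline. This is only a sketch under simplifying assumptions: the `SUSPICIOUS` patterns, the `Threat` dataclass, and the `scan`, `sanitize`, and `secure_fetch` functions are hypothetical stand-ins, and a production scanner would rely on much richer signals than two regular expressions.

```python
import re
from dataclasses import dataclass

# Hypothetical phrases; a real scanner would use far richer signals.
SUSPICIOUS = [
    ("instruction_override",
     re.compile(r"ignore (all )?(previous|prior) instructions", re.I)),
    ("data_exfiltration",
     re.compile(r"(send|reveal) .*(api key|password|system prompt)", re.I)),
]


@dataclass
class Threat:
    type: str
    line: int      # 1-based line number in the fetched text
    excerpt: str


def scan(text):
    """Step 2: flag each line that matches a suspicious pattern."""
    threats = []
    for number, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in SUSPICIOUS:
            if pattern.search(line):
                threats.append(Threat(kind, number, line.strip()))
                break  # one finding per line is enough to flag it
    return threats


def sanitize(text, threats):
    """Step 3: drop flagged lines and keep the rest."""
    flagged = {t.line for t in threats}
    return "\n".join(line
                     for number, line in enumerate(text.splitlines(), start=1)
                     if number not in flagged)


def secure_fetch(url, fetch):
    """Run fetch -> scan -> sanitize. `fetch` is any callable returning page text."""
    raw = fetch(url)          # Step 1
    threats = scan(raw)       # Step 2
    return sanitize(raw, threats), threats  # Step 3
```

Passing `fetch` in as a callable keeps the sketch independent of any HTTP library and makes it easy to try against a fixed string.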

Secure Web Scraping

python
from promptguard import PromptGuard

pg = PromptGuard(api_key="your-api-key")

# Scrape with automatic threat detection
result = pg.scrape.fetch(
    url="https://example.com/article",
    scan_for_injection=True,
    sanitize=True
)

if result.threats_detected:
    print(f"⚠️ Found {len(result.threats_detected)} threats")
    for threat in result.threats_detected:
        print(f"  - {threat.type}: {threat.description}")

# Pass only the sanitized content to the model.
# call_llm is a placeholder for your own LLM client call.
agent_response = call_llm(result.sanitized_content)

Why PromptGuard Secure Scraping?

✓ PROMPTGUARD

  • Multi-layer threat detection
  • Automatic content sanitization
  • Hidden text and Unicode detection
  • Detailed threat reports
  • Included in Pro and Enterprise plans

✗ OTHER SOLUTIONS

  • No scraping protection
  • Manual content review required
  • Vulnerable to hidden instructions
  • No visibility into threats
  • Separate expensive product

Secure Your Web Scraping

Let your AI agents browse the web safely. Automatic protection from indirect prompt injection.