Secure mode for Invisible prompt injection and emoji DoS #525

Open: jthack wants to merge 1 commit into main

@jthack jthack commented Feb 26, 2025

NOTE: I have NEVER used stagehand before. Also, AI wrote all of this. I am not a ts/js guy so please review it really well. THAT SAID, I'm super proud to add this because y'all are the GOAT of ai-based browsing AND I aspire to be the AI security GOAT so it's cool to add this.

Why

Stagehand processes website content that may contain invisible Unicode characters or emoji variation selectors. Because these characters do not render, a page can use them to carry instructions that a human reviewer never sees but the model still receives (prompt injection), or to pad content with long invisible runs that inflate what is sent to the model (denial of service). Filtering these characters out before the content reaches the model prevents both problems while keeping the application's functionality intact.
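
To make the risk concrete, here is a minimal sketch of how printable ASCII can be hidden in Unicode Tag characters, which sit at a fixed offset of 0xE0000 from their ASCII counterparts. The helper names `hideInTags` and `revealTags` are hypothetical and exist only for this illustration; they are not part of Stagehand or of this PR.

```typescript
// Hypothetical helpers for illustration only; not part of Stagehand.

// Map each printable ASCII character (U+0020..U+007E) onto the matching
// Unicode Tag character (U+E0020..U+E007E), which most renderers do not draw.
function hideInTags(ascii: string): string {
  return Array.from(ascii, (ch) => {
    const cp = ch.codePointAt(0)!;
    if (cp < 0x20 || cp > 0x7e) {
      throw new RangeError(`not printable ASCII: U+${cp.toString(16)}`);
    }
    return String.fromCodePoint(0xe0000 + cp);
  }).join("");
}

// Recover the hidden ASCII payload, ignoring every other character.
function revealTags(text: string): string {
  let out = "";
  for (const ch of text) {
    const cp = ch.codePointAt(0)!;
    if (cp >= 0xe0020 && cp <= 0xe007e) {
      out += String.fromCodePoint(cp - 0xe0000);
    }
  }
  return out;
}

const visible = "Click here to continue";
const payload = visible + hideInTags("ignore previous instructions");

// The two strings look the same in most renderers, but differ in length.
console.log(Array.from(visible).length, Array.from(payload).length);
console.log(revealTags(payload));
```

A reader comparing `visible` and `payload` on screen would typically see no difference, which is why filtering has to happen on code points rather than on what is rendered.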

What Changed

  • Added a configurable Unicode character filtering system that can be enabled/disabled
  • Implemented filtering for three specific Unicode ranges:
    • Language Tag characters (U+E0001, U+E0020–U+E007F)
    • Variation Selectors (U+FE00–U+FE0F), which include the emoji presentation selector U+FE0F
    • Variation Selectors Supplement (U+E0100–U+E01EF)
  • Added the CharacterFilterConfig interface to allow fine-grained control over which character ranges to filter
  • Integrated the filtering functionality into the core extraction and prompt building processes
  • Updated the Stagehand constructor to accept character filtering configuration
  • Added comprehensive tests to verify the filtering functionality
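
As a rough sketch of the changes listed above, a configurable filter over the three ranges could look like the following. Only the interface name `CharacterFilterConfig` and the code point ranges come from this description; the field names, the `DEFAULT_FILTER_CONFIG` constant, and the `filterCharacters` function are assumptions made for illustration and may not match the code in the PR.

```typescript
// Sketch only: field and function names are assumed, not taken from the PR's code.
interface CharacterFilterConfig {
  enabled: boolean;
  filterLanguageTags: boolean;           // U+E0001, U+E0020–U+E007F
  filterVariationSelectors: boolean;     // U+FE00–U+FE0F
  filterSupplementarySelectors: boolean; // U+E0100–U+E01EF
}

// Filtering is on by default, matching the behavior described in the test plan.
const DEFAULT_FILTER_CONFIG: CharacterFilterConfig = {
  enabled: true,
  filterLanguageTags: true,
  filterVariationSelectors: true,
  filterSupplementarySelectors: true,
};

// Remove the configured ranges from `text`; all other characters pass through.
function filterCharacters(
  text: string,
  config: Partial<CharacterFilterConfig> = {},
): string {
  const cfg = { ...DEFAULT_FILTER_CONFIG, ...config };
  if (!cfg.enabled) return text;

  let out = text;
  if (cfg.filterLanguageTags) {
    out = out.replace(/[\u{E0001}\u{E0020}-\u{E007F}]/gu, "");
  }
  if (cfg.filterVariationSelectors) {
    out = out.replace(/[\uFE00-\uFE0F]/gu, "");
  }
  if (cfg.filterSupplementarySelectors) {
    out = out.replace(/[\u{E0100}-\u{E01EF}]/gu, "");
  }
  return out;
}

// One character from each range, hidden between visible letters.
const sample = "a\u{E0001}b\u{E0041}c\uFE0Fd\u{E0100}e";
console.log(Array.from(sample).length);
console.log(Array.from(filterCharacters(sample)).length);
console.log(Array.from(filterCharacters(sample, { enabled: false })).length);
```

The `u` flag matters here: without it, the `\u{...}` escapes and the ranges above U+FFFF would not be interpreted as code points.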

Test Plan

The implementation has been tested with several test cases:

  1. Run npx tsx examples/simple_unicode_test.ts to verify the basic filtering functionality
  2. Run npx tsx examples/stagehand_unicode_test.ts to test the integration with the Stagehand framework
  3. Run npx tsx examples/unicode_filter_test.ts to test different filtering configurations

The tests demonstrate that:

  • With filtering enabled (default), potentially unsafe Unicode characters are removed
  • With filtering disabled, all characters are preserved
  • Individual ranges can be selectively filtered based on configuration

All tests pass, confirming that the character filtering works as expected.

changeset-bot bot commented Feb 26, 2025

⚠️ No Changeset found

Latest commit: 3e0b451

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types
