
Prompt_security #920

Merged
9 commits merged into NVIDIA:develop on Feb 2, 2025
Conversation

@lior-ps (Contributor) commented Jan 4, 2025

Description

Prompt Security is a startup specializing in security services for LLMs and generative AI. By adding guardrails, we can protect prompts and responses from a wide variety of risks, such as prompt injection, jailbreaks, sensitive data disclosure, and inappropriate content.
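For context, enabling these guardrails in a NeMo Guardrails config might look like the sketch below. The flow names match those discussed in review; the exact layout follows the standard rails config format and is an assumption, not copied from this PR:

```yaml
rails:
  input:
    flows:
      - protect prompt    # check user prompts before they reach the LLM
  output:
    flows:
      - protect response  # check LLM responses before they reach the user
```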

Related Issue(s)

None

Checklist

  • [x] I've read the CONTRIBUTING guidelines.
  • [x] I've updated the documentation, if applicable.
  • [x] I've added tests, if applicable.
  • [x] I've @mentioned the person or team responsible for reviewing the proposed changes.

@Pouyanpi Pouyanpi self-requested a review January 8, 2025 06:24
@Pouyanpi Pouyanpi self-assigned this Jan 8, 2025
@Pouyanpi Pouyanpi added enhancement New feature or request status: in review labels Jan 8, 2025
.gitignore — review thread (outdated, resolved)
@Pouyanpi (Collaborator) left a comment

Thank you @lior-ps for the PR. My review is not complete yet, but please feel free to take a look at the comments.

I think my recent comment here also applies to this PR. It'd be great to have the possibility to run live tests.

nemoguardrails/library/prompt_security/actions.py — review threads (outdated, resolved)
nemoguardrails/library/prompt_security/flows.co — review thread (outdated, resolved)
@Pouyanpi (Collaborator)

Another piece of feedback:

As long as we are on Colang 1.0, one should not use the same flow name for both input and output rails, as you are currently doing; use separate flows instead (e.g., protect prompt and protect response).

Currently, when both input and output rails are activated and the interaction spans multiple rounds, both the user and bot messages might be available in a context variable in subsequent rounds (one could argue that this is a bug). So passing them explicitly in the action definition is the appropriate way to do it.

I will highlight the code lines that need this change.

@Pouyanpi (Collaborator) left a comment

Applied the suggestion in this comment.

The Colang 2.0 flows need changes, but I will provide the code.

@Pouyanpi (Collaborator)

Thank you @lior-ps, it looks great. I just tried to run the tests without the mocks and am facing some issues.

Would you please have a look?

For example, once I comment out the relevant lines of test_prompt_security_protection_input:

```python
@pytest.mark.unit
def test_prompt_security_protection_input():
    config = RailsConfig.from_content(
        yaml_content="""
            models: []
            rails:
              input:
                flows:
                  - protect prompt
        """,
        colang_content="""
            define user express greeting
              "hi"

            define flow
              user express greeting
              bot express greeting

            define bot inform answer unknown
              "I can't answer that."
        """,
    )

    chat = TestChat(
        config,
        llm_completions=[
            "  express greeting",
            '  "Hi! My name is John as well."',
        ],
    )

    # chat.app.register_action(retrieve_relevant_chunks, "retrieve_relevant_chunks")
    # chat.app.register_action(mock_protect_text(True), "protect_text")
    chat >> "Hi! I am Mr. John! And my email is [email protected]"
    chat << "I can't answer that."
```

I get `Hi! My name is John as well.`; ideally, we should mock the actual behavior.

@Pouyanpi Pouyanpi added this to the v0.12.0 milestone Jan 21, 2025
@lior-ps (Contributor, Author) commented Jan 25, 2025

Hi @Pouyanpi, I fixed the pytest code. Can you please check again?

@Pouyanpi Pouyanpi self-requested a review January 27, 2025 07:55
@Pouyanpi (Collaborator)

Thank you @lior-ps, it looks good (maybe we can add more tests later).

Would you please sign your commits and run pre-commit per the contributing guidelines?

You can do an interactive rebase to just sign them and apply the pre-commit hooks.

@lior-ps (Contributor, Author) commented Feb 1, 2025

Thanks @Pouyanpi!
I signed all the commits and applied the pre-commit hooks.
Can we merge now?


github-actions bot commented Feb 1, 2025

Documentation preview

https://nvidia.github.io/NeMo-Guardrails/review/pr-920

@Pouyanpi Pouyanpi merged commit 08569c8 into NVIDIA:develop Feb 2, 2025
2 checks passed
@Pouyanpi (Collaborator) commented Feb 2, 2025

Thank you Lior 👍🏻 I had to sign all the commits; one was unsigned. It's always best to rebase onto develop first.

@lior-ps (Contributor, Author) commented Feb 2, 2025

Thank you @Pouyanpi
Is there anything else I should do from my side?

@Pouyanpi (Collaborator) commented Feb 2, 2025

No @lior-ps, all looks good, the PR is merged, thanks 👍🏻
