
Prompt_security #920

Merged
9 commits merged into NVIDIA:develop on Feb 2, 2025
Conversation

@lior-ps (Contributor) commented Jan 4, 2025

Description

Prompt Security is a startup specializing in security services for LLMs and generative AI. By adding guardrails, we can protect prompts and responses from a wide variety of risks, such as prompt injection, jailbreaks, sensitive data disclosure, and inappropriate content.
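For context, enabling these guardrails in a NeMo Guardrails config might look like the sketch below. The flow names match those discussed in review; the exact layout follows the standard rails config format and is an assumption, not copied from this PR:

```yaml
rails:
  input:
    flows:
      - protect prompt    # check user prompts before they reach the LLM
  output:
    flows:
      - protect response  # check LLM responses before they reach the user
```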

Related Issue(s)

None

Checklist

  • [x] I've read the CONTRIBUTING guidelines.
  • [x] I've updated the documentation, if applicable.
  • [x] I've added tests, if applicable.
  • [x] I've @mentioned the person or team responsible for reviewing the proposed changes.

@Pouyanpi Pouyanpi self-requested a review January 8, 2025 06:24
@Pouyanpi Pouyanpi self-assigned this Jan 8, 2025
@Pouyanpi Pouyanpi added enhancement New feature or request status: in review labels Jan 8, 2025
.gitignore — review thread (outdated, resolved)
@Pouyanpi (Collaborator) left a comment

Thank you @lior-ps for the PR. My review is not complete yet, but please feel free to take a look at the comments.

I think my recent comment here also applies to this PR. It'd be great to have the possibility to run live tests.

nemoguardrails/library/prompt_security/actions.py — review threads (outdated, resolved)
nemoguardrails/library/prompt_security/flows.co — review thread (outdated, resolved)
@Pouyanpi (Collaborator)

Another piece of feedback:

As long as we are on Colang 1.0, one should not use the same flow name for both input and output rails, as you are currently doing; use separate flows instead (e.g., protect prompt and protect response).

Currently, when both input and output rails are activated and the interaction spans multiple rounds, both the user and bot messages might be available in a context variable in subsequent rounds (one could argue that this is a bug). So passing them explicitly in the action definition is the appropriate way to do it.

I will highlight the code lines that need this change.

@Pouyanpi (Collaborator) left a comment

Applied the suggestion in this comment.

The Colang 2.0 flows need changes, but I will provide the code.

@Pouyanpi (Collaborator)

Thank you @lior-ps, it looks great. I just tried to run the tests without the mocks and am facing some issues.

Would you please have a look?

For example, once I comment out the relevant lines of test_prompt_security_protection_input:

```python
@pytest.mark.unit
def test_prompt_security_protection_input():
    config = RailsConfig.from_content(
        yaml_content="""
            models: []
            rails:
              input:
                flows:
                  - protect prompt
        """,
        colang_content="""
            define user express greeting
              "hi"

            define flow
              user express greeting
              bot express greeting

            define bot inform answer unknown
              "I can't answer that."
        """,
    )

    chat = TestChat(
        config,
        llm_completions=[
            "  express greeting",
            '  "Hi! My name is John as well."',
        ],
    )

    # chat.app.register_action(retrieve_relevant_chunks, "retrieve_relevant_chunks")
    # chat.app.register_action(mock_protect_text(True), "protect_text")
    chat >> "Hi! I am Mr. John! And my email is [email protected]"
    chat << "I can't answer that."
```

I get `Hi! My name is John as well.`; ideally, we should mock the actual behavior.

@Pouyanpi Pouyanpi added this to the v0.12.0 milestone Jan 21, 2025
@lior-ps (Contributor, Author) commented Jan 25, 2025

Hi @Pouyanpi, I fixed the pytest code. Can you please check again?

@Pouyanpi Pouyanpi self-requested a review January 27, 2025 07:55
@Pouyanpi (Collaborator)

Thank you @lior-ps, it looks good (maybe we can add more tests later).

Would you please sign your commits and run pre-commit per the contributing guidelines?

You can do an interactive rebase to just sign them and apply the pre-commit hooks.

@lior-ps (Contributor, Author) commented Feb 1, 2025

Thanks @Pouyanpi!
I signed all the commits and applied the pre-commit hooks.
Can we merge now?


github-actions bot commented Feb 1, 2025

Documentation preview

https://nvidia.github.io/NeMo-Guardrails/review/pr-920

@Pouyanpi Pouyanpi merged commit 08569c8 into NVIDIA:develop Feb 2, 2025
2 checks passed
@Pouyanpi (Collaborator) commented Feb 2, 2025

Thank you Lior 👍🏻 I had to sign all the commits; one was unsigned. It's always best to rebase onto develop first.

@lior-ps (Contributor, Author) commented Feb 2, 2025

Thank you @Pouyanpi
Is there anything else I should do from my side?

@Pouyanpi (Collaborator) commented Feb 2, 2025

No @lior-ps, all looks good, the PR is merged, thanks 👍🏻
