Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(secret): enhance secret scanning for python binary files #7223

Merged
merged 14 commits into from
Sep 30, 2024

Conversation

afdesk
Copy link
Contributor

@afdesk afdesk commented Jul 25, 2024

Description

Notes
this way doesn't detect secrets inside .odt (LibreOffice format) and .pdf.

Demo file:

secret1 = "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"
print(secret1)
$ python3 -m compileall .

Before:

$ trivy fs --scanners secret __pycache__/
2024-08-01T18:03:59+06:00	INFO	[secret] Secret scanning is enabled
2024-08-01T18:03:59+06:00	INFO	[secret] If your scanning is slow, please try '--scanners vuln' to disable secret scanning
2024-08-01T18:03:59+06:00	INFO	[secret] Please see also https://aquasecurity.github.io/trivy/v0.54/docs/scanner/secret#recommendation for faster secret detection

After:

$ trivy fs --scanners secret __pycache__/
secret.cpython-310.pyc (secrets)

Total: 1 (UNKNOWN: 0, LOW: 0, MEDIUM: 0, HIGH: 0, CRITICAL: 1)

CRITICAL: GitHub (github-fine-grained-pat)
═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
GitHub Fine-grained personal access tokens
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Related issues

Checklist

  • I've read the guidelines for contributing to this repository.
  • I've followed the conventions in the PR title.
  • I've added tests that prove my fix is effective or that my feature works.
  • I've updated the documentation with the relevant information (if needed).
  • I've added usage information (if the PR introduces new options)
  • I've included a "before" and "after" example to the description (if the PR is a user interface change).

@afdesk afdesk changed the title feat: enhance secret scanning for specific binary files feat(secret): enhance secret scanning for specific binary files Aug 1, 2024
@afdesk afdesk changed the title feat(secret): enhance secret scanning for specific binary files feat(secret): enhance secret scanning for python binary files Aug 14, 2024
@afdesk afdesk marked this pull request as ready for review August 14, 2024 06:52
@afdesk afdesk requested a review from knqyf263 as a code owner August 14, 2024 06:52
Copy link
Collaborator

@knqyf263 knqyf263 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For binary files, we should not show the line as it is not printable. grep just shows "matches".

$ grep github_pat __pycache__/main.cpython-312.pyc
Binary file __pycache__/main.cpython-312.pyc matches

printalbe = append(printalbe, byte(' '))
wasReadable = true
}
printalbe = append(printalbe, current[0])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implementation concats all single printable characters, leading to the long string. Even in a binary file, there are many printable character strings when viewed on a per-byte basis. I think the strings way, setting the minimum length, is better like I shared.

@@ -63,6 +66,10 @@ func init() {
analyzer.RegisterAnalyzer(NewSecretAnalyzer(secret.Scanner{}, ""))
}

func isAllowedBinary(filename string) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we omit the is prefix?

Suggested change
func isAllowedBinary(filename string) bool {
func allowedBinary(filename string) bool {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 439 to 454
if args.Binary {
for _, match := range matched {
findings = append(findings, types.SecretFinding{
RuleID: match.Rule.ID,
Category: match.Rule.Category,
Severity: lo.Ternary(match.Rule.Severity == "", "UNKNOWN", match.Rule.Severity),
Title: match.Rule.Title,
Match: fmt.Sprintf("Binary file %q matches a rule %q", args.FilePath, match.Rule.Title),
StartLine: 1,
EndLine: 1,
})
}
} else {
for _, match := range matched {
findings = append(findings, toFinding(match.Rule, match.Location, censored))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I sugest to use toFinding function.
something like this:

Suggested change
if args.Binary {
for _, match := range matched {
findings = append(findings, types.SecretFinding{
RuleID: match.Rule.ID,
Category: match.Rule.Category,
Severity: lo.Ternary(match.Rule.Severity == "", "UNKNOWN", match.Rule.Severity),
Title: match.Rule.Title,
Match: fmt.Sprintf("Binary file %q matches a rule %q", args.FilePath, match.Rule.Title),
StartLine: 1,
EndLine: 1,
})
}
} else {
for _, match := range matched {
findings = append(findings, toFinding(match.Rule, match.Location, censored))
}
for _, match := range matched {
finding := toFinding(match.Rule, match.Location, censored)
// These fields will be unreadable for binaries,
// Therefore overwrite them.
if args.Binary {
finding.Match = fmt.Sprintf("Binary file %q matches a rule %q", args.FilePath, match.Rule.Title)
finding.Code = types.Code{}
}
findings = append(findings, finding)
}

It might make sense to add flag to toFinding function to skip findLocation.

wdyt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure! it's a better way!

Copy link
Contributor

@DmitriyLewen DmitriyLewen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@knqyf263 knqyf263 added this pull request to the merge queue Sep 30, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Sep 30, 2024
@knqyf263 knqyf263 added this pull request to the merge queue Sep 30, 2024
Merged via the queue into aquasecurity:main with commit 60725f8 Sep 30, 2024
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feat: enhance secret scanning for specific binary files
3 participants