Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(fs): add --max-file-size flag to skip the files greater than a particular size #7304

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

zesiar0
Copy link

@zesiar0 zesiar0 commented Aug 4, 2024

@CLAassistant
Copy link

CLAassistant commented Aug 4, 2024

CLA assistant check
All committers have signed the CLA.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@@ -74,13 +76,19 @@ type analyzer interface {
Version() int
Analyze(ctx context.Context, input AnalysisInput) (*AnalysisResult, error)
Required(filePath string, info os.FileInfo) bool
Description() string
Copy link
Contributor

@nikpivkin nikpivkin Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is not used anywhere. I think the point is that the description should return a human-readable description of the analyser and be used in generating documentation.

/cc @knqyf263

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but we can do it in another PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @knqyf263
Let's move adding of this function to another PR.

}

type PostAnalyzer interface {
Type() Type
Version() int
PostAnalyze(ctx context.Context, input PostAnalysisInput) (*AnalysisResult, error)
Required(filePath string, info os.FileInfo) bool
Description() string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

Comment on lines 556 to 562
if _, ok := ag.fileSizes[analyzerType]; !ok {
return true
}
if fileInfo.Size() > ag.fileSizes[analyzerType] {
return false
}
return true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if _, ok := ag.fileSizes[analyzerType]; !ok {
return true
}
if fileInfo.Size() > ag.fileSizes[analyzerType] {
return false
}
return true
maxSize, ok := ag.fileSizes[analyzerType]
if !ok {
return true
}
return fileInfo.Size() <= maxSize

@nikpivkin
Copy link
Contributor

Tests should be added for the new flag, similar to the file-patterns flag.

@@ -411,6 +435,10 @@ func (ag AnalyzerGroup) AnalyzeFile(ctx context.Context, wg *sync.WaitGroup, lim
continue
}

if ag.fileSizeMatch(a.Type(), info) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the condition be inverted?

@nikpivkin
Copy link
Contributor

A flag needs to be added when the group is initialised. https://github.com/aquasecurity/trivy/blob/main/pkg/flag/scan_flags.go#L147

@@ -0,0 +1,68 @@

## Analyzers
Use the following types of analyzers for the [--max-file-size] flags
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Use the following types of analyzers for the [--max-file-size] flags
Use the following types of analyzers for the [--max-file-size] and [--file-pattern] flags

Comment on lines 338 to 351
for _, f := range opts.MaxFileSize {
// e.g. "jar:5mb secret:5kb"
s := strings.SplitN(f, separator, 2)
if len(s) != 2 {
return group, xerrors.Errorf("invalid max file size (%s) expected format: \"analyzerType:maxFileSize\" e.g. \"jar:5mb secret:5kb\"", f)
}

analyzerType := Type(s[0])
maxFileSize, err := units.FromHumanSize(s[1])
if err != nil {
return group, xerrors.Errorf("invalid file size (%s): %w", s[1], err)
}
group.fileSizes[analyzerType] = maxFileSize
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we move this logic into the flag package:

func (f *ScanFlagGroup) ToOptions(args []string) (ScanOptions, error) {
if err := parseFlags(f); err != nil {
return ScanOptions{}, err
}
var target string
if len(args) == 1 {
target = args[0]
}
parallel := f.Parallel.Value()
if f.Parallel != nil && parallel == 0 {
log.Info("Set '--parallel' to the number of CPUs", log.Int("parallel", runtime.NumCPU()))
parallel = runtime.NumCPU()
}
return ScanOptions{
Target: target,
SkipDirs: f.SkipDirs.Value(),
SkipFiles: f.SkipFiles.Value(),
OfflineScan: f.OfflineScan.Value(),
Scanners: xstrings.ToTSlice[types.Scanner](f.Scanners.Value()),
FilePatterns: f.FilePatterns.Value(),
MaxFileSize: f.MaxFileSize.Value(),
Parallel: parallel,
SBOMSources: f.SBOMSources.Value(),
RekorURL: f.RekorURL.Value(),
DetectionPriority: ftypes.DetectionPriority(f.DetectionPriority.Value()),
}, nil
}

PS. opts.MaxFileSize will be map[Type]int64.

Then users will get an error message if the wrong template is used immediately when launching Trivy.
@knqyf263 wdyt?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good

@@ -196,14 +203,30 @@ func (f *ScanFlagGroup) ToOptions(args []string) (ScanOptions, error) {
parallel = runtime.NumCPU()
}

maxFileSize := make(map[analyzer.Type]int64)
for _, f := range f.MaxFileSize.Value() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should not overwrite a variable that is in use:

Suggested change
for _, f := range f.MaxFileSize.Value() {
for _, fileSize := range f.MaxFileSize.Value() {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to add this page to docs.
Update mkdocs.yml for that.

But i am not sure about place for this page.
I think we can add this page to References.
@knqyf263 wdyt?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sounds good

@@ -74,13 +76,19 @@ type analyzer interface {
Version() int
Analyze(ctx context.Context, input AnalysisInput) (*AnalysisResult, error)
Required(filePath string, info os.FileInfo) bool
Description() string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @knqyf263
Let's move adding of this function to another PR.

Comment on lines +533 to +537
if tt.args.maxFileSize != nil {
maxFileSize = tt.args.maxFileSize
} else {
maxFileSize = map[analyzer.Type]int64{}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if tt.args.maxFileSize != nil {
maxFileSize = tt.args.maxFileSize
} else {
maxFileSize = map[analyzer.Type]int64{}
}
maxFileSize := map[analyzer.Type]int64{}
if tt.args.maxFileSize != nil {
maxFileSize = tt.args.maxFileSize
}

return ScanOptions{}, xerrors.Errorf("invalid max file size (%s) expected format: \"analyzerType:maxFileSize\" e.g. \"jar:5mb secret:5kb\"", f)
}

analyzerType := analyzer.Type(s[0])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You added AllAnalyzerTypes.
We can currently check analyzerType

Suggested change
analyzerType := analyzer.Type(s[0])
analyzerType := analyzer.Type(s[0])
if !slices.Contains(analyzer.AllAnalyzerTypes(), analyzerType) {
return ScanOptions{}, xerrors.Errorf("invalid analyzer type %q. See analyzer list in %s", analyzerType, doc.URL("/docs/references/analyzers", ""))
}

Copy link

This PR is stale because it has been labeled with inactivity.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and will be auto-closed. label Oct 29, 2024
@DmitriyLewen DmitriyLewen removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and will be auto-closed. label Oct 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feat: add an option to skip the files greater than a particular size
5 participants