Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BCDA-7565: Support S3 file downloads for opt-out files #890

Merged
merged 20 commits into from
Dec 6, 2023
Merged

Conversation

kyeah
Copy link
Contributor

@kyeah kyeah commented Nov 30, 2023

🎫 Ticket

https://jira.cms.gov/browse/BCDA-7565

🛠 Changes

  • Fix intermittent test errors due to db.Close() statements (switch to sqlmock)
  • Add localstack (for spinning up local AWS S3 mock server)
  • Add new CLI options for import-suppression-directory
  • Add S3FileHandler to search and import files from S3

ℹ️ Context for reviewers

We run a daily Jenkins job that sshes onto an NFS server and reads files from local disk. In the future, the BFD team will be serving files through their S3 bucket; this is an interim solution that switches our file handling to read from their S3 bucket when given the --filesource s3 option.

✅ Acceptance Validation

I added files locally to dev S3 and then manually ran the import in dev.

This job imported files successfully.

This job correctly did nothing after seeing nothing in S3.

This job failed due to an invalid file header (which was expected.)

This job showed one skipped file due to an invalid filename (also expected.)

See the ops PR for more details.

🔒 Security Implications

  • This PR adds a new software dependency or dependencies.
  • This PR modifies or invalidates one or more of our security controls.
  • This PR stores or transmits data that was not stored or transmitted before.
  • This PR requires additional review of its security implications for other reasons.

This adds a new local-development-only dependency for Localstack for testing against a fake S3 server.

@@ -1,120 +1,122 @@
$ANSIBLE_VAULT;1.1;AES256
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated this to include BFD_S3_ENDPOINT=http://localhost:4566

@codecov-commenter
Copy link

Codecov Report

Merging #890 (61941eb) into master (8e1a0c5) will decrease coverage by 0.02%.
Report is 1 commits behind head on master.
The diff coverage is 84.87%.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #890      +/-   ##
==========================================
- Coverage   79.63%   79.62%   -0.02%     
==========================================
  Files          93       97       +4     
  Lines       10671    10852     +181     
==========================================
+ Hits         8498     8641     +143     
- Misses       1637     1663      +26     
- Partials      536      548      +12     
Files Coverage Δ
bcda/bcdacli/cli.go 75.69% <100.00%> (+1.27%) ⬆️
bcda/suppression/bcda_saver.go 100.00% <100.00%> (ø)
optout/fake_saver.go 100.00% <100.00%> (ø)
optout/models.go 100.00% <ø> (ø)
bcda/suppression/suppression.go 89.50% <87.23%> (-3.83%) ⬇️
optout/local_file_handler.go 93.54% <93.54%> (ø)
optout/s3_file_handler.go 73.57% <73.57%> (ø)

... and 5 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 400f1b3...61941eb. Read the comment docs.

@kyeah kyeah marked this pull request as ready for review December 5, 2023 00:18
@alex-dzeda
Copy link
Contributor

@kyeah It looks like this is the first use of localstack in the bcda-app project; would this be something that needs to trigger one of the "security implications" checkboxes?

@kyeah
Copy link
Contributor Author

kyeah commented Dec 5, 2023

@kyeah It looks like this is the first use of localstack in the bcda-app project; would this be something that needs to trigger one of the "security implications" checkboxes?

Thanks! I'll check the "new dependency" box but add a note that it's only installed for local development/testing. It shouldn't need to wait for approval (cc. @StewGoin @bhodges-navapbc)

Copy link
Contributor

@alex-dzeda alex-dzeda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, comments mainly concerned with the potential for region based issues as well as cross-talk between consumers, but may be false alarms!

return sc, func() {}, err
}

func (handler *S3FileHandler) CleanupOptOutFiles(suppresslist []*OptOutFilenameMetadata) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these opt-out files per-consumer, or shared between them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per consumer (each API team has their own S3 folders that only they have access to.)

sess := session.Must(session.NewSession())

config := aws.Config{
Region: aws.String("us-east-1"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we see a scenario where a file may not be in us-east-1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah thanks, I said on our call earlier that it would automatically use whatever is configured in the AWS profile, but I forgot that this is hardcoded.

The BFD S3 buckets will live in us-east-1 so I think we can leave them hardcoded for now to avoid the "too many options/parameters/environment variables" fatigue, but open to alternative thoughts!

I did note that this region is hardcoded in several other places in the codebase (e.g. the insights pipeline and a looot of Jenkins workflows)

Copy link
Contributor

@alex-dzeda alex-dzeda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks for addressing comments!

@kyeah kyeah merged commit a17ab7a into master Dec 6, 2023
@kyeah kyeah deleted the kev/cli-s3 branch December 6, 2023 15:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants