Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

prefect-aws doesn't calculate paths correctly if bucket_folder is set #15267

Open
catermelon opened this issue Sep 6, 2024 · 2 comments · May be fixed by #15287
Open

prefect-aws doesn't calculate paths correctly if bucket_folder is set #15267

catermelon opened this issue Sep 6, 2024 · 2 comments · May be fixed by #15287
Labels
bug Something isn't working integrations Related to integrations with other services

Comments

@catermelon
Copy link

Bug summary

Hi Prefect friends, thanks so much for making this thing.

I found some minor unexpected behavior when using the prefect-aws library and writing files to S3.

The issue is specifically with this line in S3Bucket._resolve_path:

 path = (
     (Path(self.bucket_folder) / path).as_posix() if self.bucket_folder else path
)

I have a block where I set bucket_folder to be a prefix like /appname and I was attempting to write files with date partitioning like /2024/09/06/whatever.gz.

I expected this would write to /appname/2024/09/06/whatever.gz, but it doesn't. It'll instead spit out /2024/09/06/whatever.gz. Sending in a non-absolute path like 2024/09/06/whatever.gz does work.

So that's all correct behavior for Path, because resolving an absolute path against another directory should just give you the absolute path back. But it's surprising behavior for this code because I would assume the prefix would always apply.

Anyways, this is really minor, and I can work around it. I can send in a pull issue if you want, it should be pretty trivial to change.

Thanks so much.

Version info (prefect version output)

Version:             2.20.4
API version:         0.8.4
Python version:      3.10.10
Git commit:          3b951c35
Built:               Wed, Aug 28, 2024 9:09 PM
OS/Arch:             darwin/arm64
Profile:             prod
Server type:         server

Additional context

No response

@catermelon catermelon added the bug Something isn't working label Sep 6, 2024
@zzstoatzz
Copy link
Collaborator

zzstoatzz commented Sep 6, 2024

hi @catermelon - thank you for the issue!

hmm I'm looking at this now - assuming all these pass, is this the behavior you would expect?

@pytest.mark.parametrize(
    "bucket_folder, path, expected",
    [
        (None, "", ""),
        (None, "file.txt", "file.txt"),
        ("prefix", "file.txt", "prefix/file.txt"),
        ("prefix", "/file.txt", "prefix/file.txt"),
        ("prefix/", "file.txt", "prefix/file.txt"),
        ("prefix/", "/file.txt", "prefix/file.txt"),
        ("/prefix", "file.txt", "prefix/file.txt"),
        ("/prefix", "/file.txt", "prefix/file.txt"),
        ("/prefix/", "file.txt", "prefix/file.txt"),
        ("/prefix/", "/file.txt", "prefix/file.txt"),
    ],
)
def test_resolve_path(s3_bucket, bucket_folder, path, expected):
    s3_bucket.bucket_folder = bucket_folder
    assert s3_bucket._resolve_path(path) == expected

or would you expect the leading / to remain?

@zzstoatzz zzstoatzz linked a pull request Sep 9, 2024 that will close this issue
@zzstoatzz zzstoatzz added the integrations Related to integrations with other services label Sep 9, 2024
@catermelon
Copy link
Author

I think that all looks good! For my use case, the leading / is irrelevant, so I'm good with either. Thanks so much!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working integrations Related to integrations with other services
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants