Segmentation Fault when multi-part upload through proxy #4300

Open
1 task
hjia97 opened this issue Oct 12, 2024 · 5 comments
Labels
bug This issue is a confirmed bug. closing-soon This issue will automatically close in 4 days unless further comments are made. p3 This is a minor priority issue response-requested Waiting on additional information or feedback. s3

Comments

@hjia97

hjia97 commented Oct 12, 2024

Describe the bug

Hello! I'd like to report an issue I encountered during a multi-threaded upload with Boto3.

I can reproduce the crash consistently on my Mac and in an Alpine 3.20 Docker container.

I have a standalone Python script that performs a multipart upload of a large file to S3 through a proxy. The thread count is set to 40, though any large number triggers the crash.

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

The upload should complete without crashing. Instead, my Python program crashes instantly:

zsh: segmentation fault  python3 upload.py

Current Behavior

I used lldb to track down the exact point of failure. It looks like the OpenSSL library used by Boto3 is freeing a memory location that has already been freed.
Crash Screenshot
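
Alongside lldb, Python's standard-library `faulthandler` module can show which Python threads were running when the native crash happened. The following is a minimal sketch, assuming it is added at the top of the upload script; the helper name `enable_crash_tracebacks` is hypothetical and not part of the original script.

```python
import faulthandler
import sys


def enable_crash_tracebacks(stream=None):
    """Dump the Python traceback of every thread on SIGSEGV, SIGABRT, and similar.

    `stream` must be a real file with a file descriptor (stderr by default).
    """
    if stream is None:
        stream = sys.stderr
    # all_threads=True matters here: the upload runs on many worker threads.
    faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()


if __name__ == "__main__":
    # Call this before creating the boto3 client and starting the upload.
    enable_crash_tracebacks()
```

The same effect is available without code changes by running `python3 -X faulthandler upload.py` or by setting the `PYTHONFAULTHANDLER` environment variable. The output only identifies the Python frames involved; the native OpenSSL frames still need lldb.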

Reproduction Steps

The code I used:


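import boto3
from boto3.s3.transfer import TransferConfig
from botocore.config import Config

# The credentials, THRESHOLD, CHUNKSIZE, MAX_BANDWIDTH, proxy, region_name,
# file_path, bucket, and dest_path are defined elsewhere in the script.
boto_session = boto3.Session(aws_session_token=aws_session_token, aws_access_key_id=aws_access_key_id,
                             aws_secret_access_key=aws_secret_access_key)

# Multipart settings; the crash appears with a high max_concurrency.
transfer_config = {
    'multipart_threshold': THRESHOLD,
    'multipart_chunksize': CHUNKSIZE,
    'max_concurrency': 40,
    'use_threads': True}

tc = TransferConfig(max_bandwidth=MAX_BANDWIDTH, **transfer_config)

# Route HTTPS traffic through the proxy, with client certificates on both hops.
client_config = Config(endpoint_discovery_enabled=True,
                       proxies={'https': 'https://' + proxy},
                       proxies_config={
                           'proxy_client_cert': 'cert.pem',
                           'proxy_use_forwarding_for_https': True,
                           'proxy_ca_bundle': 'bundle.crt'
                       },
                       client_cert='cert.pem')

client = boto_session.client('s3', use_ssl=True, verify='ion_dev_certificate.pem',
                             config=client_config,
                             region_name=region_name)

# The segmentation fault occurs shortly after this call starts the upload.
client.upload_file(file_path, bucket, dest_path, Config=tc)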

Possible Solution

No response

Additional Information/Context

The crash can be reproduced consistently on my Mac and in a clean Alpine 3.20 Docker container.
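
To put a number on "consistently", the script can be run several times in a child process while recording which runs were killed by a signal. This is only a sketch under stated assumptions: the helper name `signals_that_killed` is hypothetical, and it relies on the POSIX convention that a negative return code means death by signal.

```python
import signal
import subprocess


def signals_that_killed(cmd, runs=5):
    """Run `cmd` `runs` times; return the signal for each run that was killed by one."""
    fatal_signals = []
    for _ in range(runs):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        # On POSIX, a return code of -N means the child was terminated by signal N.
        if result.returncode < 0:
            fatal_signals.append(signal.Signals(-result.returncode))
    return fatal_signals
```

For example, `signals_that_killed([sys.executable, "upload.py"], runs=5)` would return one `SIGSEGV` entry per crashing run. A double free detected by the allocator can also surface as `SIGABRT`, which is why the sketch records the signal rather than checking only for a segmentation fault.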

SDK version used

[email protected],

Environment details (OS name and version, etc.)

Alpine 3.20, Mac

@hjia97 hjia97 added bug This issue is a confirmed bug. needs-triage This issue or PR still needs to be triaged. labels Oct 12, 2024
@adev-code adev-code self-assigned this Oct 18, 2024
@adev-code adev-code added s3 investigating This issue is being investigated and/or work is in progress to resolve the issue. p3 This is a minor priority issue and removed needs-triage This issue or PR still needs to be triaged. labels Oct 18, 2024
@adev-code

Hi @hjia97, thanks for reaching out. To investigate further, please include the full debug output from both Mac and Alpine, redacting any sensitive information. Thank you.

@adev-code adev-code added response-requested Waiting on additional information or feedback. and removed investigating This issue is being investigated and/or work is in progress to resolve the issue. labels Oct 23, 2024
@hjia97
Author

hjia97 commented Oct 29, 2024

[Screenshot 2024-10-28 at 7 23 49 PM]

My program hits a segmentation fault right after the upload is called. If I don't use the proxy, the error does not happen.

@hjia97
Author

hjia97 commented Oct 29, 2024

2024-10-29 00:01:32,688 urllib3.connectionpool [DEBUG] Starting new HTTPS connection (35): d3-ppd-us-systemlogs.s3.us-west-2.amazonaws.com:443
2024-10-29 00:01:32,688 botocore.endpoint [DEBUG] Sending http request: <AWSPreparedRequest stream_output=False, method=PUT, url=xxxxx, headers={'User-Agent': b'Boto3/1.35.50 md/Botocore#1.35.50 ua/2.0 os/macos#23.3.0 md/arch#arm64 lang/python#3.12.6 md/pyimpl#CPython cfg/retry-mode#legacy Botocore/1.35.50', 'Content-MD5': b'XzY+DlipXwbL6bvGYsXftg==', 'Expect': b'100-continue', 'X-Amz-Date': b'20241029T070132Z', 'X-Amz-Security-Token': xxxxxxx', 'X-Amz-Content-SHA256': b'UNSIGNED-PAYLOAD', 'Authorization': b'AWS4-HMAC-SHA256 Credential=ASIASSCAUKUUGYTYJSW3/20241029/us-west-2/s3/aws4_request, SignedHeaders=content-md5;host;x-amz-content-sha256;x-amz-date;x-amz-security-token, Signature=fecb0c0a6ac02703249839da472fa62835ac8352e39cbd77836c7b93ace035da', 'amz-sdk-invocation-id': b'a2edc30c-3494-4bbf-bb97-b7574f2c7768', 'amz-sdk-request': b'attempt=1', 'Content-Length': '5242880'}>

2024-10-29 00:01:32,688 urllib3.connectionpool [DEBUG] Starting new HTTPS connection (36): xxx.s3.us-west-2.amazonaws.com:443

2024-10-29 00:01:32,703 botocore.awsrequest [DEBUG] 100 Continue response seen, now sending request body.

Python(31188,0x28a073000) malloc: Double free of object 0x141e920f0

Python(31188,0x177e37000) malloc: Double free of object 0x141e920f0

Python(31188,0x28a073000) malloc: *** set a breakpoint in malloc_error_break to debug

Python(31188,0x177e37000) malloc: *** set a breakpoint in malloc_error_break to debug

Python(31188,0x28f0af000) malloc: Incorrect checksum for freed object 0x141e8f1d8: probably modified after being freed.

Corrupt value: 0x4254415446784d45

Python(31188,0x28f0af000) malloc: *** set a breakpoint in malloc_error_break to debug

@github-actions github-actions bot removed the response-requested Waiting on additional information or feedback. label Oct 30, 2024
@adev-code

Hi @hjia97,

Thanks for reaching out. I'm a bit confused about the nature of the issue; this is not something we've seen happen before.
In my reproduction I set up an EC2 server as a proxy. My local application routes traffic through that proxy, and my uploads reach S3 successfully.

I suggest that you enable the SDK logs in your original code and also capture the wire logs as they are sent from your proxy. Please compare and share the two logs side by side for the failing request so we can check for discrepancies.
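
One way to turn on SDK logging on the client side is with Python's standard `logging` module. The following is a minimal sketch, not an official recipe: the helper name `enable_sdk_debug_logging` and the `SDK_LOGGERS` tuple are hypothetical, and the format string is chosen to match the log excerpt earlier in this thread.

```python
import logging
import sys

# Logger names in the SDK stack; urllib3 and botocore both appear in the log excerpt.
SDK_LOGGERS = ("boto3", "botocore", "s3transfer", "urllib3")


def enable_sdk_debug_logging(stream=None):
    """Attach one DEBUG-level stream handler to each SDK logger and return it."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s [%(levelname)s] %(message)s")
    )
    for name in SDK_LOGGERS:
        logger = logging.getLogger(name)
        logger.setLevel(logging.DEBUG)
        logger.addHandler(handler)
    return handler
```

Calling `enable_sdk_debug_logging()` before creating the client should produce output in the same shape as the excerpt. Boto3 also ships its own helper, `boto3.set_stream_logger('')`, which enables debug logging for all loggers. Debug output includes request headers such as the security token and signature, so redact it before posting.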

@adev-code adev-code added the response-requested Waiting on additional information or feedback. label Nov 1, 2024

Greetings! It looks like this issue hasn’t been active in longer than five days. We encourage you to check if this is still an issue in the latest release. In the absence of more information, we will be closing this issue soon. If you find that this is still a problem, please feel free to provide a comment or upvote with a reaction on the initial post to prevent automatic closure. If the issue is already closed, please feel free to open a new one.

@github-actions github-actions bot added the closing-soon This issue will automatically close in 4 days unless further comments are made. label Nov 12, 2024

2 participants