Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#4981] Improvement(test): Add integration tests for Hive catalog with S3 location. #4982

Merged
merged 6 commits into from
Sep 24, 2024

Conversation

yuqi1129
Copy link
Contributor

What changes were proposed in this pull request?

Add integration tests to test hive catalog with S3 location.

Why are the changes needed?

To make code more robust.

Fix: #4981

Does this PR introduce any user-facing change?

N/A.

How was this patch tested?

The changes itself is tested.

@yuqi1129 yuqi1129 marked this pull request as draft September 21, 2024 09:17
@yuqi1129
Copy link
Contributor Author

yuqi1129 commented Sep 21, 2024

This PR is not ready for review until #4980 is merged.

@yuqi1129 yuqi1129 changed the title [#4981] Improvement(test): Add S3 related configuration to support Hive S3 schema/table. [#4981] Improvement(test): Add Hive catalog with S3 location integration tests. Sep 21, 2024
@yuqi1129 yuqi1129 self-assigned this Sep 21, 2024
@yuqi1129 yuqi1129 changed the title [#4981] Improvement(test): Add Hive catalog with S3 location integration tests. [#4981] Improvement(test): Add integration tests for Hive catalog with S3 location. Sep 21, 2024
@yuqi1129 yuqi1129 marked this pull request as ready for review September 23, 2024 02:53
* container with S3 is Hive3 container and the Hive container is Hive2 container. There is
* something wrong with the hive2 container to access the S3.
*/
private static volatile HiveContainer hiveContainerWithS3;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot share the container even if we uniform the hive version to hive3 for the following reasons.

  1. We need to ensure the s3 environment is ready before starting the Hive container so we can share it.
  2. for the reason in #(1, we need to start localstack(S3 simulator) first, however we can't make sure CatalogHiveS3IT was the first one to execute.

@jerryshao
Copy link
Contributor

I'm OK with this PR itself. But I think compared to IT, it is more valuable to add some docs to claim that this can be run on S3 and walk users to some references about how to configure Hive on S3, that should be enough.

@yuqi1129
Copy link
Contributor Author

I'm OK with this PR itself. But I think compared to IT, it is more valuable to add some docs to claim that this can be run on S3 and walk users to some references about how to configure Hive on S3, that should be enough.

Got it, it's not a big deal to add a document reference about it and I'll do it today.

@jerryshao jerryshao merged commit 870c569 into apache:main Sep 24, 2024
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Improvement] Add integration test for Hive with S3
2 participants