Add data compaction policy validator #1238

Open
wants to merge 5 commits into
base: main

Conversation

Contributor

@flyrain flyrain commented Mar 21, 2025

This PR adds the PolicyValidator interface and one implementation for the data compaction policy. The policy validator is used when a policy is created or updated, and when a policy is attached to a Polaris entity.
For more context, see the design doc: https://docs.google.com/document/d/1kIiVkFFg9tPa5SH70b9WwzbmclrzH3qWHKfCKXw5lbs/edit?tab=t.0#heading=h.nly223xz13km
cc @HonahX

Contributor Author

flyrain commented Mar 21, 2025

The failure seems unrelated.

QuarkusApplicationIntegrationTest > initializationError FAILED
    java.lang.NullPointerException at PolarisApplicationIntegrationTest.java:137
Timed out trying to set fork join ClassLoader, this should never happen unless something has tied up a fork join thread before the app launched [Error Occurred After Shutdown]

OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
Timed out trying to set fork join ClassLoader, this should never happen unless something has tied up a fork join thread before the app launched [Error Occurred After Shutdown]

> Task :polaris-quarkus-service:test

QuarkusRestCatalogIntegrationTest > initializationError FAILED
    java.lang.NullPointerException at PolarisRestCatalogIntegrationTest.java:173
Timed out trying to set fork join ClassLoader, this should never happen unless something has tied up a fork join thread before the app launched [Error Occurred After Shutdown]


> Task :polaris-quarkus-service:test

QuarkusRestCatalogViewAzureIntegrationTest > initializationError FAILED
    java.lang.NullPointerException at PolarisRestCatalogViewIntegrationBase.java:76
Timed out trying to set fork join ClassLoader, this should never happen unless something has tied up a fork join thread before the app launched [Error Occurred After Shutdown]

Contributor

@HonahX HonahX left a comment

Thanks @flyrain! This looks like a great start for policy validators. Overall LGTM!

case METADATA_COMPACTION:
case SNAPSHOT_RETENTION:
case ORPHAN_FILE_REMOVAL:
default:
Contributor

Just checking my understanding: in the future, when we add support for custom types, we will need to load the custom type's validator here?

Contributor Author

Yeah, something like this:

      ctor = DynConstructors.builder(PolicyValidator.class).impl(impl).buildChecked();
      policyValidator = ctor.newInstance();
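
To make the idea concrete, here is a minimal sketch of loading a validator by class name. It uses plain java.lang.reflect rather than DynConstructors so that it stays self-contained. The PolicyValidator interface shown here is a simplified stand-in, and AcceptAllValidator and CustomValidatorLoader are hypothetical names, not part of this PR.

```java
import java.lang.reflect.Constructor;

// Simplified stand-in for the PolicyValidator interface added in this PR (assumption).
interface PolicyValidator {
  void validate(String content);
}

// Hypothetical custom validator that a user might plug in.
class AcceptAllValidator implements PolicyValidator {
  @Override
  public void validate(String content) {
    // Accepts any content; a real validator would parse and check it.
  }
}

final class CustomValidatorLoader {
  private CustomValidatorLoader() {}

  // Loads a validator by class name through its no-arg constructor.
  static PolicyValidator load(String impl) {
    try {
      Class<?> clazz = Class.forName(impl);
      if (!PolicyValidator.class.isAssignableFrom(clazz)) {
        throw new IllegalArgumentException(impl + " does not implement PolicyValidator");
      }
      Constructor<? extends PolicyValidator> ctor =
          clazz.asSubclass(PolicyValidator.class).getDeclaredConstructor();
      return ctor.newInstance();
    } catch (ReflectiveOperationException e) {
      // Covers missing classes, missing constructors, and instantiation failures.
      throw new IllegalArgumentException("Cannot load policy validator: " + impl, e);
    }
  }
}
```

Failing fast with a clear message when the class is missing or has the wrong type keeps a misconfigured custom policy type from surfacing later as an obscure error.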

* @param targetEntity the target Polaris entity to attach the policy to
* @return {@code true} if the policy is attachable to the target entity; {@code false} otherwise
*/
public static boolean canAttach(PolicyEntity policy, PolarisEntity targetEntity) {
Contributor

Just brainstorming about the name: maybe "isAttachable"?

Contributor Author

The method's primary role is to decide whether a policy can be attached to a target, so "canAttach" seems appropriate. I think "isAttachable" would be more suitable if the method only took a policy entity. WDYT?
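
As an illustration of why the answer depends on both arguments, here is a rough sketch of a two-argument check. The enums PolicyKind and TargetKind and the class AttachmentRules are hypothetical simplifications of PolicyEntity and PolarisEntity, and the set of allowed targets for data compaction is an assumption, not something stated in this PR.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical, simplified stand-ins for policy and target entity types.
enum PolicyKind { DATA_COMPACTION, SNAPSHOT_RETENTION }

enum TargetKind { CATALOG, NAMESPACE, TABLE, VIEW, PRINCIPAL }

final class AttachmentRules {
  // Assumption: data compaction applies to tables and to containers of tables.
  private static final Set<TargetKind> COMPACTION_TARGETS =
      EnumSet.of(TargetKind.CATALOG, TargetKind.NAMESPACE, TargetKind.TABLE);

  private AttachmentRules() {}

  // The result depends on the pair (policy, target), not on the policy alone.
  static boolean canAttach(PolicyKind policy, TargetKind target) {
    switch (policy) {
      case DATA_COMPACTION:
        return COMPACTION_TARGETS.contains(target);
      default:
        // Policy kinds without rules in this sketch are never attachable.
        return false;
    }
  }
}
```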

import org.apache.polaris.core.exceptions.PolarisException;

/** Exception thrown when a policy is invalid or violates defined rules. */
public class InvalidPolicyException extends PolarisException {
Contributor

Should this map to status code 400 (BadRequestError) in the REST layer?

}

try {
var policy = PolicyValidatorUtil.MAPPER.readValue(content, DataCompactionPolicy.class);
Contributor

Suggested change
- var policy = PolicyValidatorUtil.MAPPER.readValue(content, DataCompactionPolicy.class);
+ DataCompactionPolicy policy = PolicyValidatorUtil.MAPPER.readValue(content, DataCompactionPolicy.class);

Shall we explicitly write out the type here?

Contributor Author

I'm OK either way.

if (!POLICY_SCHEMA_VERSIONS.contains(policy.getVersion())) {
throw new InvalidPolicyException("Invalid policy version: " + policy.getVersion());
}

Contributor

This may be a follow-up: do we need to validate the data compaction configs (if they are present)? For example, target_file_size_bytes needs to be a value larger than 0.

Contributor Author

We cannot know the semantics of the entries in the config map. A user can give any string for target_file_size_bytes and it should still be accepted, because target_file_size_bytes has no meaning in the schema.
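
A minimal sketch of that behavior, under stated assumptions: only the schema-level version field is checked, and config entries pass through as opaque strings. The class CompactionPolicyChecks is hypothetical, InvalidPolicyException is a local stand-in for the exception in this PR, and the version string is a placeholder.

```java
import java.util.Map;
import java.util.Set;

// Local stand-in for the InvalidPolicyException added in this PR (assumption).
class InvalidPolicyException extends RuntimeException {
  InvalidPolicyException(String message) {
    super(message);
  }
}

final class CompactionPolicyChecks {
  // Placeholder version string; the supported versions are not shown in this thread.
  static final Set<String> POLICY_SCHEMA_VERSIONS = Set.of("2025-02-03");

  private CompactionPolicyChecks() {}

  // Validates the schema-level field only; config entries are treated as opaque strings.
  static void validate(String version, Map<String, String> config) {
    // Set.of(...) throws on null lookups, so check for null explicitly.
    if (version == null || !POLICY_SCHEMA_VERSIONS.contains(version)) {
      throw new InvalidPolicyException("Invalid policy version: " + version);
    }
    // Intentionally no inspection of config values such as target_file_size_bytes.
  }
}
```

With this shape, a config of target_file_size_bytes = "not-a-number" is accepted, while an unknown version is rejected.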

* specific language governing permissions and limitations
* under the License.
*/
package org.apache.polaris.core.policy.validator;
Contributor

Shall we move Data Compaction Policy Validator to a submodule data-compaction? I think a submodule for each policy type can offer a more organized layout.

Contributor Author

@flyrain flyrain Mar 22, 2025

You mean a sub-package? Makes sense, will move it.

import com.fasterxml.jackson.databind.annotation.JsonDeserialize;
import java.util.Map;

public class DataCompactionPolicy {
Contributor

Suggested change
- public class DataCompactionPolicy {
+ public class DataCompactionPolicyContent {

Shall we append "Content" to the name to differentiate it from the Policy class generated by the OpenAPI generator?


2 participants