Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extract compileRequest from BaseSingleStageRequestHandler.doHandleRequest #15073

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

vrajat
Copy link
Collaborator

@vrajat vrajat commented Feb 17, 2025

BaseSingleStageBrokerRequestHandler.doHandleRequest is a large function of ~600 LOC. This makes it hard to reason about the function as well as extend and reuse parts of the function. The function can be broken down into four parts:

  • Compile
  • Validate
  • Plan (generate server request)
  • Execute

This PR only extracts the code to compile the request. The o/p of compile is stored in a CompileResult object that the rest of the code can use. A couple of open points are:

  • This PR does not try to be exactly correct about the boundaries of these parts. Eventually some code has to be moved around.
  • SQL Parser & Validation errors should ideally be thrown as exceptions and then converted to BrokerResponse. This can be taken up in a subsequent PR.

Otherwise the code has not been changed.

@codecov-commenter
Copy link

codecov-commenter commented Feb 17, 2025

Codecov Report

Attention: Patch coverage is 53.91304% with 53 lines in your changes missing coverage. Please review.

Project coverage is 63.43%. Comparing base (59551e4) to head (79b0b4f).
Report is 1754 commits behind head on master.

Files with missing lines Patch % Lines
...sthandler/BaseSingleStageBrokerRequestHandler.java 53.91% 42 Missing and 11 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #15073      +/-   ##
============================================
+ Coverage     61.75%   63.43%   +1.68%     
- Complexity      207     1483    +1276     
============================================
  Files          2436     2748     +312     
  Lines        133233   154471   +21238     
  Branches      20636    23817    +3181     
============================================
+ Hits          82274    97989   +15715     
- Misses        44911    49086    +4175     
- Partials       6048     7396    +1348     
Flag Coverage Δ
custom-integration1 100.00% <ø> (+99.99%) ⬆️
integration 100.00% <ø> (+99.99%) ⬆️
integration1 100.00% <ø> (+99.99%) ⬆️
integration2 0.00% <ø> (ø)
java-11 63.40% <53.91%> (+1.69%) ⬆️
java-21 63.32% <53.91%> (+1.70%) ⬆️
skip-bytebuffers-false 63.43% <53.91%> (+1.68%) ⬆️
skip-bytebuffers-true 63.30% <53.91%> (+35.57%) ⬆️
temurin 63.43% <53.91%> (+1.68%) ⬆️
unittests 63.43% <53.91%> (+1.68%) ⬆️
unittests1 56.04% <ø> (+9.15%) ⬆️
unittests2 34.00% <53.91%> (+6.27%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see this extraction very useful because it cannot really be reused. It will be very useful if we can abstract out the offline/realtime handling logic, so that the logic can be reused instead of repeated twice

@vrajat
Copy link
Collaborator Author

vrajat commented Feb 21, 2025

The ultimate goal is to abstract out the routing logic. Also the validation logic has to be extracted because logical table validation will be different.

This PR will make it easier for the subsequent abstractions because a few small number of objects can be passed around. Right now there is too much dependence on local variables which makes it harder to reason as well as look at a diff and be confident the changes are correct.

@vrajat
Copy link
Collaborator Author

vrajat commented Feb 21, 2025

Another reason to have these boundaries is that what should be in compile or validation phase is done much later. These boundaries will force some discipline. Couple of examples:

  • authorization is a few statements apart.
  • QPS validation is after getting table configs even though they depend on table names.

Maybe the sequence is the correct. However the reasoning is not clear to a newbie. If there are boundaries, then the reason for an op not in the right phase will (hopefully) be documented

Comment on lines 358 to 360
if (compileResult != null) {
rawTableName = compileResult._rawTableName;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compileResult will always be null, and this will be a behavior change. We need to either handle the exception inside the extracted method, or find a way to return the raw table name.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed the refactor here. I've kept the same code structure in the compileRequest function.

static class CompileResult {
final PinotQuery _pinotQuery;
final PinotQuery _serverPinotQuery;
Schema _schema;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(minor) Should this be final?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -306,139 +306,71 @@ protected BrokerResponse handleRequest(long requestId, String query, SqlNodeAndO
}
}

static class CompileResult {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(minor) This can be private since it is just a helper wrapper class

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@vrajat vrajat force-pushed the rv-extract-compile-request branch from 1f1da71 to 79b0b4f Compare February 25, 2025 10:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants