-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support storing the code that builds the code model #305
base: code-reflection
Are you sure you want to change the base?
Conversation
👋 Welcome back mabbay! A progress list of the required criteria for merging this PR into |
@mabbay This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 17 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
Webrevs
|
src/jdk.incubator.code/share/classes/jdk/incubator/code/internal/ReflectMethods.java
Outdated
Show resolved
Hide resolved
src/jdk.incubator.code/share/classes/jdk/incubator/code/internal/ReflectMethods.java
Outdated
Show resolved
Hide resolved
To test this you can compile the program
The compilation will fail with the error: |
I made CODE_BUILDER the default storage mechanism for code models, so you can test the capability by running the compiler tests. |
Getting compiler errors due to lack of imports:
After fixing those locally i get many test failures. |
@mcimadamore Mourad implemented a transformation of the code model that builds a code model that adds local variables for values with more than one use, which makes it easier to generate the AST nodes. Would use of the internal |
I'm not sure I have all the context here. The problem here seems to be when you have a value that is resulting from some potentially side-effect operation. E.g. like a method call:
If Let expression nodes are useful when dealing with compact expressions. E.g.
E.g. javac typically uses a let expression when it has to translate a single expression into something more complex, but it wants to do so by keeping the result as an expression (rather than turning the expression into a statement, which is not possible in all cases, such as in the case of a variable initializer). It is true that what seems like a linear list of ops in a block can be modelled as something like more convoluted, like so:
This would mean to generate one let expression per op, where the "body" of the let expression is the remainder of the code model block. All this nesting is confusing, but is also avoidable -- a
Doing something like this would probably avoid the need of generating extra local variables -- you now have one var declaration per op in the "statements" part of a At the end of the day either adding extra variables (which can even be done as a pre-processing step, by javac), or using a more functional translation with P.S. |
|
||
private Map<JavaType, Type> mappingFromJavaTypeToType() { | ||
Map<JavaType, Type> m = new HashMap<>(); | ||
Symbol.ModuleSymbol jdk_incubator_code = syms.enterModule(names.jdk_incubator_code); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is effectively an extension to javac's symbol table. My preference would be to:
- add whatever symbol/type you need in
CodeReflectionSymbol
- then set up the translation map simply, like you do for primitive types -- e.g.
Map.entry(<type element>, <javac type>)
}; | ||
} | ||
|
||
private JCTree invokeOpToJCMethodInvocation(CoreOp.InvokeOp invokeOp) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method has some issues -- but perhaps we can keep it for now.
The main problem is that we seem to copy types from the invoked method declaration and put them "as is" in the generated AST. This works only as long as the invoked method doesn't have any generic type. For instance, consider a call to List::add
. The signature of this method accepts X
which is a type parameter of List<X>
. I believe the code you have copies the X
from the declared signature and sticks it into the javac's MethodType
instance attached to the method call. Since this method type is used to describe the use-site (the call), and not the declaration, the type seems incorrect. E.g. calling List<String>:add
should have a type of add(String)
not add(X)
. (btw, a similar problem is present for field access, if the type of the accessed field is a type-variable -- so it's not just an issue with method calls).
The type of the invoked method/accessed field should always be adjusted with Types.memberType
first. This method takes two parameters:
- the receiver type you are using to call the instance method/access the instance field
- the symbol you are trying to access
And returns the instantiated type of that symbol at that receiver type. Examples:
memberType(List<String>, add(X)->void) -> (String)->void
memberType(List<Integer>, add(X)->void) -> (Integer)->void
Once you address this, there is still another problem with generic methods -- as methods can have their own type-variables too. To figure out what is the type to be replaced for these type-variables you generally need to run inference -- which seems way too much for what we're trying to do here. The issue here is that the code model you are processing doesn't expose these details, and so generating the AST needs to "trace back" whatever steps where done when generating this model -- which is an hard problem.
I hope we can get away with just working with erased types, and maybe insert type-conversion to convert the type of the invoked method/accessed field so that it matches the expected op result type.
if (invokeOp.isVarArgs()) { | ||
var lastParam = invokeOp.invokeDescriptor().type().parameterTypes().getLast(); | ||
Assert.check(lastParam instanceof ArrayType); | ||
methodInvocation.varargsElement = typeElementToType(((ArrayType) lastParam).componentType()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same problem as described above -- here we're copying the vararg array at the declaration site into a type at the use-site -- for something like List::of
this won't give the result you expect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now -- while all the points I made above apply (e.g. if you were to try and type-check the generated AST using Attr
you will get several errors), the saving grace here is that you are sending this tree into javac's backend anyway. And the backend plays a lot looser with types, only inserting casts where absolutely needed. Since you are using the op result type on the MethodType
instance you are generating, I believe that should be enough for the backend to at least insert 90% of the type conversions that might be required because of erasure. So, in practice, even if incorrect, the code above might work fine.
…nly if the opMethod has there params.
#351 need to be merged before this PR |
I've verified that all tests pass, except the two test which fail with a compiler crash because of #351. |
.ldc(cp.constantDynamicEntry(cp.bsmEntry(bsmDataAt, List.of(cp.intEntry(1))), natMH)); | ||
MethodType mtype = quotableOpGetterInfo.getMethodType(); | ||
if (quotableOpGetterInfo.getReferenceKind() != MethodHandleInfo.REF_invokeStatic) { | ||
mtype = mtype.insertParameterTypes(0, implClass); | ||
} | ||
if (quotableOpGetterInfo.getMethodType().parameterList().equals(List.of(CodeReflectionSupport.OP_FACTORY_CLASS, | ||
CodeReflectionSupport.TYPE_ELEMENT_FACTORY_CLASS))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If these fields are public/static, why don't we load them from inside the compiler-generated method, instead of injecting them here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am assuming by the compiler-generated method you mean opMethod
. The opMethod
has two params (OpFactory, TypeElementFactory), so loading these fields is necessary before invoking it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can turn opMethod into one with no param and load these fields inside its body. But this removes the flexibility we have now (i.e. we can choose the OpFactory and TypeElementFactory we want to pass).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can turn opMethod into one with no param and load these fields inside its body.
That is what I was asking about -- while I agree what we have seems more flexible, we don't seem to use this flexibility anywhere -- as the method that turns a Quotable
into a Quoted
works with hardwired op and type factories. But maybe this has been already discussed -- in which case it's fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively use a consistent the method signature with the parameters even if they are not used. Uniformity in the method signature is i think a nice property we should aim for, but could be followed up later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep - I agree we can follow up later.
src/jdk.incubator.code/share/classes/jdk/incubator/code/Op.java
Outdated
Show resolved
Hide resolved
|
||
// TODO add VarOps in OpBuilder | ||
funcOp = addVarsWhenNecessary(funcOp); | ||
funcOp.writeTo(System.out); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addVarsWhenNecessary
is used to add necessary VarOp in the code model before transformation to AST. For example if an op result is used more than once, we introduce a VarOp to hold the value, because that's what the equivalent Java code will have. This makes the transformation from code model to AST a bit easier.
We also use it to in case we want an Op (e.g. InvokeOp) to be appended right away to the method we are building. As you now the block of a method is a list of statements, some operations maps to an expression, so they won't be appended until needed. So by putting the result of an op in a VarOp, that VarOp will be mapped to a statement which causes the tree of the op to be added immediately.
For example an InvokeOp
to Block.Builder::op
is something we want to be transformed to a JCTree
and added immediately to the method's block, so that we preserve the order of the operations as in the original code model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Normally, the OpBuilder should produces a model that contains the VarOp when they should be. But I decided to have a seperate transformation as a quick fix. Later I will change the OpBuilder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies -- I meant whether the writeTo
should be removed :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I decided to have a seperate transformation as a quick fix. Later I will change the OpBuilder.
As stated in another comment, I'm not sure changing the op builder is needed -- as the javac code can easily workaround what the op builder does (e.g. what it does doesn't seem obviously wrong to me). But if the decision is to fix op builder that's fine too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to directly avoid changing OpBuilder (investing in a SSA to Var transformation, using liveness analysis, is probably the general right approach), but there might be a more focused approach to generating the AST so we can avoid this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for confirming @PaulSandoz -- we can defer this improvement to a follow up change -- and keep the var-adding transformation in this PR for now.
na.type = typeElementToType(at); | ||
yield na; | ||
} | ||
var ownerType = typeElementToType(newOp.constructorType().returnType()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a strange asymmetry in the model between constructors and methods. Methods have a descriptor, constructors do not. Also, for methods you know from the op if it is varargs, but for constructors this info is not available. My feeling is that (at least coming at this from javac), trying to lump array and class creation under one op is probably why these asymmetries crop up.
This has nothing to do with this PR of course -- but I suspect that if you had a varargs constructor call this code would not work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree on all points. We can have a new.array
operation and then model class creation as an invoke
op with method name being "init" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree on all points. We can have a
new.array
operation and then model class creation as aninvoke
op with method name being "init" ?
not sure I follow -- I was talking about class construction -- arrays are a bit special (e.g. there's no array constructor descriptor, unlike for classes). But a constructor call does share a lot of similarities with a method call, and there should be reusable code between method and constructors -- as it happens inside the javac code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be deferred until we have a better op to model instance creation expression.
src/jdk.incubator.code/share/classes/jdk/incubator/code/internal/CodeModelToAST.java
Outdated
Show resolved
Hide resolved
var nc = treeMaker.NewClass(null, null, clazz, args.toList(), null); | ||
var argTypes = new ListBuffer<Type>(); | ||
for (Value operand : newOp.operands()) { | ||
argTypes.add(typeElementToType(operand.type())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems wrong -- and is related to the point I made above -- the type of the operands here are the types of the actual arguments of the constructor calls. So, if you have:
class Foo {
Foo(String string) { ... }
}
the method type of the constructor call has to say Foo(String)
. But if you "infer" the type of the constructor argument from the actual argument, if the call is like this:
new Foo(null)
We should probably use the op's constructorType
to derive the type of the invoked constructor? (You use that to determine the constructor's owner type)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be deferred until we have a better op to model instance creation expression.
return treeMaker.MethodDef(ms, mb); | ||
} | ||
|
||
public static CoreOp.FuncOp addVarsWhenNecessary(CoreOp.FuncOp funcOp) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I'm missing something obvious. But if you have something like:
func @"f" ()java.util.Map -> {
%0 : java.util.Map<java.lang.Integer, java.lang.Integer> = new @"func<java.util.HashMap>";
%1 : int = constant @"1";
%2 : int = constant @"2";
%3 : java.lang.Object = invoke %0 %1 %2 @"java.util.Map::put(java.lang.Object, java.lang.Object)java.lang.Object";
return %0;
}
Can't we translate as:
Map $0 = new HashMap();
int $1 = 1;
int $2 = 3;
Object $3 = $0.put($1, $2);
return $0;
Note how this translation is very 1-1 and direct, and doesn't require any adaptation step in the model. We just emit a new local variable for each op we see in the body -- and then refer to operands via their corresponding (locally declared) variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(of course we'd only do the above translation for ops that have a non-void result type)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can follow up on this improved translation at a later point.
var name = names.fromString(fieldLoadOp.fieldDescriptor().name()); | ||
var type = typeElementToType(fieldLoadOp.resultType()); | ||
var owner = typeElementToType(fieldLoadOp.fieldDescriptor().refType()); | ||
var sym = new Symbol.VarSymbol(flags, name, type, owner.tsym); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems wrong -- the accessed field symbol should be findable in the scope of the class in which the field is declared. It seems to me that we need some piece of code that turns a method/field descriptor into a java symbol. Look at Lower::lookupMethod
which uses Resolve::resolveInternalMethod
-- I think here we should do that too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is borderline, but again, we can leave it as is for now (the risk is that if we have a bad model javac will just emit a reference to some non-existent field, which will likely cause non-sensical bytecode to be emitted).
.collect(List.collector()); | ||
var restype = typeElementToType(invokeOp.resultType()); | ||
var type = new Type.MethodType(paramTypes, restype, List.nil(), syms.methodClass); | ||
var methodSym = new Symbol.MethodSymbol(flags, name, type, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as for field access (see below) -- we should not create a new method symbol from scratch but "resolve" a method in a given class symbol (using Resolve::findInternalMethod
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to add two helper methods -- one takes a FieldDescriptor and returns a VarSymbol (using Resolve::findFieldInternal
-- the other takes a MethodDescriptor and returns a MethodSymbol (using Resolve::findMethodIntenal
. Then you rewrite the code here (and in field access) to use these methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is related to the one above for var symbol -- creating method/var symbols directly in the backend is not the way to do things -- we should lean on Resolve
instead. It would be nice to fix these -- but we can also take another pass at it later if required.
If we merge 351 into this, those two tests will pass. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot tell what may be left to address for Maurizio's comments and what we can defer. If Maurizio is ok to integrate then i am ok.
It seems we can follow up later on the following:
- avoiding the insertion of Vars
- uniformity of method signatures that produce code models
- adjust the new operation to support an invocation descriptor attribute, similar to the invoke operation.
.ldc(cp.constantDynamicEntry(cp.bsmEntry(bsmDataAt, List.of(cp.intEntry(1))), natMH)); | ||
MethodType mtype = quotableOpGetterInfo.getMethodType(); | ||
if (quotableOpGetterInfo.getReferenceKind() != MethodHandleInfo.REF_invokeStatic) { | ||
mtype = mtype.insertParameterTypes(0, implClass); | ||
} | ||
if (quotableOpGetterInfo.getMethodType().parameterList().equals(List.of(CodeReflectionSupport.OP_FACTORY_CLASS, | ||
CodeReflectionSupport.TYPE_ELEMENT_FACTORY_CLASS))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively use a consistent the method signature with the parameters even if they are not used. Uniformity in the method signature is i think a nice property we should aim for, but could be followed up later.
|
||
// TODO add VarOps in OpBuilder | ||
funcOp = addVarsWhenNecessary(funcOp); | ||
funcOp.writeTo(System.out); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to directly avoid changing OpBuilder (investing in a SSA to Var transformation, using liveness analysis, is probably the general right approach), but there might be a more focused approach to generating the AST so we can avoid this.
In this PR we allow the user to decide how to store the code model. The first option is
TEXT
, if this is selected, we store the textual representation of the code model. The second option isCODE_BUILDR
, if this is selected, we store the code that builds the code model. All work done here is around the second option, because the first option is already supported.Progress
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/babylon.git pull/305/head:pull/305
$ git checkout pull/305
Update a local copy of the PR:
$ git checkout pull/305
$ git pull https://git.openjdk.org/babylon.git pull/305/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 305
View PR using the GUI difftool:
$ git pr show -t 305
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/babylon/pull/305.diff
Using Webrev
Link to Webrev Comment