Expose raw descriptor string slice of `ClassName`, `FieldDescriptor` and `MethodDescriptor` for bindgens #52
Comments
The unqualified segments can be glued back together to make the string, right? I'd take a PR that exposes that as a function on ClassName.
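A minimal sketch of that gluing step, for illustration only. The two structs are stand-ins that merely mirror the `ClassName { segments }` / `UnqualifiedSegment { name }` shape visible in the crate's tests, and the method name `joined` is hypothetical, not an existing cafebabe function.

```rust
use std::borrow::Cow;

// Stand-in types (assumption): they only mirror the field layout seen in the tests.
pub struct UnqualifiedSegment<'a> {
    pub name: Cow<'a, str>,
}

pub struct ClassName<'a> {
    pub segments: Vec<UnqualifiedSegment<'a>>,
}

impl<'a> ClassName<'a> {
    /// Hypothetical helper: rebuilds the internal-form name, e.g. "java/lang/Object".
    /// It allocates a fresh `String` on every call.
    pub fn joined(&self) -> String {
        self.segments
            .iter()
            .map(|seg| &*seg.name) // Cow<str> derefs to &str
            .collect::<Vec<&str>>()
            .join("/")
    }
}
```

The allocation per call is the cost being discussed here: the joined string is a new value, not a slice of the original class data.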
Sorry, is there a bug here? If so please provide a test case to demonstrate it, thanks! |
I've already tried to do it, but it's not an ideal solution: it constructs a new
Here it is (based on your
#[test]
fn test_owned_cow() {
    let chars = Cow::from("(Ljava/lang/Object$Obj;)V".to_string());
    let descriptor = parse_method_descriptor(&chars, 0).unwrap();
    let mut parameters = descriptor.parameters.into_iter();
    let return_type = descriptor.return_type;
    assert_eq!(
        parameters.next().unwrap(),
        FieldDescriptor {
            dimensions: 0,
            field_type: FieldType::Object(ClassName {
                segments: vec![
                    UnqualifiedSegment {
                        name: Cow::Borrowed("java")
                    },
                    UnqualifiedSegment {
                        name: Cow::Borrowed("lang")
                    },
                    UnqualifiedSegment {
                        name: Cow::Borrowed("Object$Obj")
                    },
                ],
            }),
        },
    );
    assert!(parameters.next().is_none());
    assert_eq!(return_type, ReturnDescriptor::Void);
}
|
Sorry, I think it's not easy to make the change: parsed fields available in I've built a self-referencing wrapper for the parsed
pub use unsafe_class::Class;
mod unsafe_class {
    use std::pin::Pin;
    use std::marker::PhantomPinned;
    pub struct Class {
        raw_bytes: Pin<Box<(Vec<u8>, PhantomPinned)>>,
        inner: cafebabe::ClassFile<'static>,
    }
    impl Class {
        pub fn read(raw_bytes: Vec<u8>) -> Result<Self, cafebabe::ParseError> {
            let pinned = Box::pin((raw_bytes, PhantomPinned));
            // SAFETY: `get<'a>(&'a self)` restricts the lifetime
            let fake_static = unsafe {
                std::slice::from_raw_parts(pinned.0.as_ptr(), pinned.0.len())
            };
            let inner = cafebabe::parse_class(fake_static)?;
            Ok(Self { raw_bytes: pinned, inner })
        }
        pub fn get<'a>(&'a self) -> &'a cafebabe::ClassFile<'a> {
            // SAFETY: casts `self.inner` into `cafebabe::ClassFile<'a>` forcefully.
            // Why is `ClassFile` invariant over `'a`?
            unsafe { &*(&raw const (self.inner)).cast() }
        }
    }
}
Do you think it's actually safe? |
Yes, unfortunately there's the pesky case of Java UTF-8 being different from regular UTF-8, so we can't unconditionally use the original class data when representing things as rust strings.
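To illustrate the point about Java's modified UTF-8 (JVMS 4.4.7): NUL is stored as the two bytes `0xC0 0x80`, and supplementary characters are stored as two three-byte surrogate encodings, neither of which is valid standard UTF-8. The sketch below is not cafebabe's implementation; `decode_java_utf8` is a hypothetical name, and it only shows why a decoder can borrow in the common case but has to allocate in the others.

```rust
use std::borrow::Cow;

/// Hypothetical decoder sketch. Returns `None` on malformed input.
pub fn decode_java_utf8(bytes: &[u8]) -> Option<Cow<'_, str>> {
    // Fast path: the bytes are already valid standard UTF-8, so borrow them.
    if let Ok(s) = std::str::from_utf8(bytes) {
        return Some(Cow::Borrowed(s));
    }
    // Slow path: modified UTF-8 encodes each UTF-16 code unit in 1 to 3 bytes.
    let mut units: Vec<u16> = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        let b0 = bytes[i];
        if b0 & 0x80 == 0 {
            units.push(u16::from(b0));
            i += 1;
        } else if b0 & 0xE0 == 0xC0 {
            let b1 = *bytes.get(i + 1)?;
            if b1 & 0xC0 != 0x80 {
                return None;
            }
            units.push(((u16::from(b0) & 0x1F) << 6) | (u16::from(b1) & 0x3F));
            i += 2;
        } else if b0 & 0xF0 == 0xE0 {
            let b1 = *bytes.get(i + 1)?;
            let b2 = *bytes.get(i + 2)?;
            if b1 & 0xC0 != 0x80 || b2 & 0xC0 != 0x80 {
                return None;
            }
            units.push(
                ((u16::from(b0) & 0x0F) << 12)
                    | ((u16::from(b1) & 0x3F) << 6)
                    | (u16::from(b2) & 0x3F),
            );
            i += 3;
        } else {
            return None; // four-byte sequences do not occur in modified UTF-8
        }
    }
    // Surrogate pairs are recombined here; the result is necessarily owned.
    String::from_utf16(&units).ok().map(Cow::Owned)
}
```

In the owned case there is no slice of the original class data that equals the decoded string, which is why the raw bytes cannot be handed out unconditionally as `&str`.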
neat!
I don't know, your grasp of advanced rust is better than mine, I haven't kept up with the language much recently.
I looked again at the JVM spec, and per https://docs.oracle.com/javase/specs/jvms/se21/html/jvms-4.html#jvms-4.2.2 unqualified names are allowed to contain the $ character, so unless there's an actual bug here I don't think there's anything to change. The test case you provided passes as-written, but even if you change the expectation there it's not clear to me why you would expect it to behave differently. |
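For reference, the rule from JVMS 4.2.2 can be sketched as a small predicate: an unqualified name must be non-empty and must not contain `.`, `;`, `[` or `/`. The function name is hypothetical, and the extra restriction on `<` and `>` in method names is deliberately left out.

```rust
/// Hypothetical check for class and field unqualified names per JVMS 4.2.2.
/// Method names have an additional `<` / `>` restriction that is not modelled here.
pub fn is_unqualified_name(name: &str) -> bool {
    !name.is_empty() && !name.contains(|c: char| matches!(c, '.' | ';' | '[' | '/'))
}
```

Under this rule `Object$Obj` is a single valid segment, so keeping it unsplit matches the class file format.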
I'm about to finish porting
How about holding a Note about nested classes: "The binary name of a member class or interface consists of the binary name of its immediately enclosing class or interface, followed by $, followed by the simple name of the member." https://docs.oracle.com/javase/specs/jls/se17/html/jls-13.html#jls-13.1. Have a
|
Well, a workaround in |
While technically this is possible, I think it would be a pretty invasive change and require a massive overhaul of how data is plumbed through the parser. If I'm wrong and there's an easy way to do this I'll consider it.
Not sure what you mean by this, can you elaborate?
This is in the JLS which is specific to Java. cafebabe is a class file parser which means it has to work with all JVM languages not just Java. I'd like to avoid adding Java-specific things into cafebabe directly. |
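A consumer that does want the Java convention could apply it on its own side instead. A rough sketch, assuming the input is a binary name in internal form; `split_nested` is a hypothetical bindgen-side helper, not part of cafebabe.

```rust
/// Hypothetical helper: splits "java/lang/Object$Obj" into the package part
/// and the chain of `$`-separated simple names.
pub fn split_nested(binary_name: &str) -> (&str, Vec<&str>) {
    // Only the last '/'-separated segment can carry the nesting chain.
    let (package, simple) = match binary_name.rfind('/') {
        Some(i) => (&binary_name[..i], &binary_name[i + 1..]),
        None => ("", binary_name),
    };
    (package, simple.split('$').collect())
}
```

This is only a heuristic: `$` is also legal inside ordinary identifiers, so a name containing it is not guaranteed to be a nested class. As far as I know, the `InnerClasses` attribute of the class file is the reliable source for nesting information.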
It is an API design issue. Why should the field type (or method argument/return type) class binary name available in |
Ah I see what you mean. That's a fair point. When I wrote it I was mostly following the spec and in the one case it's a descriptor string and in the other it's not, so I used different types. But I can see your point that from the consumer's point of view it's inconsistent. |
Current workarounds in Dirbaio/java-spaghetti#5:
Description of https://docs.oracle.com/javase/specs/jvms/se21/html/jvms-4.html#jvms-4.4.1: The value of the
Then I took a glance at https://docs.oracle.com/javase/specs/jvms/se21/html/jvms-4.html#jvms-4.5 and https://docs.oracle.com/javase/specs/jvms/se21/html/jvms-4.html#jvms-4.6, but I still can't understand what you meant by "in the one case it's a descriptor string and in the other it's not". |
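Ok after reading over the spec again I agree with you, the way I have it is inconsistent and can be improved. What I'm thinking is that
1. The ClassName structure should have a Cow for the whole string, and instead of the existing Vec of unqualified names, could have an impl method that computes and returns the Vec
2. The this_class (and super_class etc.) members of the ClassFile structure should also be ClassName structures that wrap the Cow values currently present.
Does that seem reasonable? |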
Maybe. However, I had worried about the lifetime of |
Well, I was confused in my earlier thoughts about the workaround for the current version of this library. If changes were to be made inside this library, the lifetime of `ClassName` and other descriptors should still be consistent with the `ClassFile` which contains them.
…________________________________
From: Kartikaya Gupta (kats) ***@***.***>
Sent: February 16, 2025 4:50
To: staktrace/cafebabe ***@***.***>
Cc: wu bobo ***@***.***>; Author ***@***.***>
Subject: Re: [staktrace/cafebabe] Expose raw descriptor string slice of `ClassName`, `FieldDescriptor` and `MethodDescriptor` for bindgens (Issue #52)
Ok after reading over the spec again I agree with you, the way I have it is inconsistent and can be improved. What I'm thinking is that
1. The ClassName structure should have a Cow for the whole string, and instead of the existing Vec of unqualified names, could have an impl method that computes and returns the Vec
2. The this_class (and super_class etc.) members of the ClassFile structure should also be ClassName structures that wrap the Cow values currently present.
Does that seem reasonable?
|
I think in this case the unqualified names would be returned as Strings with copying. I don't mind doing that since it would be an explicit function call from the user instead of happening as part of parsing. |
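A rough sketch of what the proposed shape might look like, assuming a `ClassName` that wraps a single `Cow` for the whole name and computes the segments on request. This is not the crate's current definition; the field name `raw` and the methods `new`, `as_str` and `segments` are hypothetical, and validation of the name is omitted.

```rust
use std::borrow::Cow;

/// Sketch of the proposed layout: one Cow for the whole internal-form name.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ClassName<'a> {
    raw: Cow<'a, str>,
}

impl<'a> ClassName<'a> {
    pub fn new(raw: Cow<'a, str>) -> Self {
        Self { raw }
    }

    /// The whole name as stored, with no allocation.
    pub fn as_str(&self) -> &str {
        &self.raw
    }

    /// Computed on demand; every segment is copied into its own `String`.
    pub fn segments(&self) -> Vec<String> {
        self.raw.split('/').map(str::to_owned).collect()
    }
}
```

With this layout the copying cost is paid only by callers that explicitly ask for segments, while a bindgen that just needs the descriptor text can call `as_str` and borrow.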
I'm porting the `java-spaghetti-gen` to use this library, because the previously used `jreflection` crate is currently unmaintained.
It looks strange to me that `ClassFile<'_>::this_class` is `Cow<'a, str>`, but the object type class name contained in the `FieldDescriptor` is `ClassName` in which the original string slice cannot be read.
These desired functions are available in `jreflection` crate:
https://docs.rs/jreflection/latest/jreflection/field/struct.Field.html#method.descriptor_str
https://docs.rs/jreflection/latest/jreflection/method/struct.Method.html#method.descriptor_str
https://docs.rs/jreflection/latest/jreflection/field/enum.BasicType.html (Note: `Id<'a>` is a wrapper of `&'a str`)
Crate `noak` exposes these original string slices too, but it doesn't parse these field/method descriptors.
PS: It seems like `ClassName`'s parse function splits the class binary name by `/`, but it doesn't care about `$`.