Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: 64-bit Integer Type #227

Open
wants to merge 10 commits into
base: master
Choose a base branch
from
Open

RFC: 64-bit Integer Type #227

wants to merge 10 commits into from

Conversation

carterkozak
Copy link
Contributor

@carterkozak carterkozak commented Feb 14, 2019


## Proposal

New conjure `integer64` type semantics matching `integer` except for the allowed values, and JSON serialized form.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bikeshed: should we use long as the name? We match Java's naming for integer, boolean, double currently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went back and forth between the two, I don't have a strong preference. I think integer<bits> is a bit cleaner if we ever need to add larger integer types, and allows us to alias integer to integer32 and safelong to integer54.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be on board with moving to integer32, float64, etc.

@sfackler
Copy link
Member

How are we going to handle the serialization/deserialization in Jackson, since we need SafeLong to continue serializing as a number and normal longs to serialize as strings?

@carterkozak
Copy link
Contributor Author

We can implement simple serializers in the conjure-java-runtime ObjectMappers factory, similar to the way we serialize java long values safely in our logging library.

@sfackler
Copy link
Member

Handling the serialization in conjure-rust requires a bit of hackery, but it's dooable: palantir/conjure-rust#20


New conjure `integer64` type semantics matching `integer` except for the allowed values, and JSON serialized form.

For the immediate future, typescript clients may represent these values using branded strings.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems to me like this change should block on the typescript serialization layer, or we end up further down the path of this not really being a language-independent RPC layer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementing the typescript generator feature will make it easier to support this proposal with rich numerical types, though the serde layer implementation detail should block improving the Conjure specification unless a feature is provably unsupportable in a target language. By adding this type, we provide an incentive to improve the TS generator. Inverting that dependency would push us to optimize around the wrong details.

@markelliot
Copy link
Contributor

I'd be curious if the common-case for 64 bit integers is that they're used as identifiers or actually used for math. If for math, it seems important we resolve the JavaScript/TypeScript question, because AFAICT there's no common way to do long-math in JS/we may not want to force that into JS.

If used as identifiers (and thus important to represent minimally, where possible to avoid memory bloat), we might want to consider a specialized ID-type like uuid, where, abstractly, the wire-value accurately represents the uniqueness/can be used directly for equality, and where some in-memory representations can be more efficient (by using a semantically equivalent value in fewer bytes).

At any rate, if it's intended to represent identifiers, I think this proposal is perhaps too general.

@carterkozak
Copy link
Contributor Author

I don't think we should add another identifier type beyond rid and uuid, which are both strong identifier types, I would not want to recommend or endorse using 64-bit integers as identifiers.

Furthermore using an identifier wrapper around a 64-bit integer would significantly increase memory overhead over a language-specific primitive 64-bit integer.

It's relatively uncommon to do math on large numbers in the frontend, making these values effectively IDs already. The most common operations are equality comparison and sorting. Math operations can be used on these types using the linked javascript bigint proposal which is already implemented in chrome has has polyfills for other/older browsers. There are other libraries for javascript which provide this functionality, however we should focus on the one which we and google anticipate being adopted into ecmascript.

The PLAIN format of an `integer64` value is the base-10 numerical value.

## Alternatives Considered

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be good to elaborate on the suitability of a JSON number representation, beyond what's written above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, I will update this when I’m at a computer (responses will be a bit delayed for the next couple weeks as I’m ooto)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, let me know if there are points you'd like me to expand upon.

@markelliot
Copy link
Contributor

It's relatively uncommon to do math on large numbers in the frontend, making these values effectively IDs already.

That's kind of my point -- why are we adding this type?

@sfackler
Copy link
Member

The internal service that I work on uses 64 bit integers as timestamps in its API. Safelongs aren't sufficient since 2^53 nanoseconds is only ~104 days. We don't expect any frontends to ever be talking directly to this service, so the level of support in Typescript isn't really all that important to my use case in particular.

We're moving this service's APIs slowly over to Conjure, but lack of 64 bit integer support is going to be a pretty hard blocker on being able to handle even the majority of the API surface.

@carterkozak
Copy link
Contributor Author

why are we adding this type?

Most of our code is neither typescript nor front end, between backend services large number values are more common and helpful, and the usage patterns mentioned above do not apply.

@uschi2000
Copy link
Contributor

For that use-case, @sfackler, it seems sufficient to define int64 == alias of string, internal to your service, right? I'm a little skeptical of adding a half-baked implementation of a thing that would be used by one consumer (in one language) and for which a viable and wire-format-forwards-compatible alternative exists.

@sfackler
Copy link
Member

My particular thing would be dealing with two languages, not one, but in general 64 bit integers should work just fine in all the languages we care about except one.

@uschi2000
Copy link
Contributor

Yep, and I believe that that ("except one") is a deal-breaker.

@sfackler
Copy link
Member

But what level of "fine-ness" are we okay with? Would a polyfilled BigInt be sufficient in Typescript?

Any API that wants to be consistent across a large variety of languages is going to have some amount of impedance mismatch. As an extreme example, set<double> isn't even representable in conjure-rust! It seems like we have to decide what level of support is minimally required across all the languages we care about.

```yml
types:
imports:
Long:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this type definition will only work if deserializers coerce strings to longs, but I'm pretty sure that feature was disabled recently in Java, for instance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct, the reference java implementation recently broke this, however other internal java conjure implementations still support it. My goal for rfc is to standardize behavior, and avoid maintaining multiple implementations.

My preference is to provide a supported mechanism for this feature rather than revert the coercion change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as Rob and I noted, it then likely needs to be handled in all languages.

I think if you weren't dealing with math, that an additional identifier type would give equal treatment in all languages and simply make more efficient use of memory in some languages than others.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are in agreement that the ecmascript bigint type is the correct binding for TS (as noted in the Proposal section). The full implementation requires a bit of work on our typescript generator, but I cannot invest my team in doing so until the specification change has been ratified.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately I don't think we could agree to this specification without the pre-work from happening in TypeScript to support the then small-ish change this RFC would make.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear: pending the typescript client serialization layer, you support this feature. Is that correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if and only there is a serialization layer.

Copy link
Contributor

@iamdanfox iamdanfox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding integer64 seems pretty well justified and the upsides (for both Gotham ids, and also enabling regular usage of 64-bit numbers) outweigh the downsides (added language complexity, potential IR revs).

Given that conjure-typescript already turns some richer types like uuid and rid into plain JavaScript strings, I think it's reasonable for conjure-typescript to initially just turn the new integer64 type into strings too, so that users can pick whichever polyfill they prefer.

Before merging this though, I'd like to get some consensus about how changes like this can actually be rolled out, i.e. is it a new IR format number, or do we just add it to the existing definition.

@markelliot
Copy link
Contributor

@iamdanfox I think there's a pretty big difference between rid, uuid and integer64. There's never going to be an expectation that rid and uuid have more than equality/ordering as types, and so it's natural to render as strings in typescript, whereas there should be an expectation for our numeric types that one can do math. If we're intent on adding a similar type, I'd again propose we target something more narrow, such as i64id, a 64-bit integer id with a natural representation in languages that support 64 bit numbers, and string otherwise.

Alternatively, we could wait for a common typescript polyfill for math, as noted in the other comments.

@iamdanfox
Copy link
Contributor

iamdanfox commented Mar 6, 2019

@markelliot I think the intention of this integer64 proposal is that it is sent over the wire as a quoted string but in all practicable languages (Java, Golang, Python, Rust), you'd interact with a conjure 'integer64' type as a number (allowing sensible +, -, < etc).

In typescript, we could choose to bundle any one of the existing polyfills, BigInt, bn.js, bigi, yaffle or silentmatt-biginteger. However, I think it's actually preferable to represent these as javascript strings initially so that consumers are free to experiment with these polyfills... I'd only want to impose a specific one when we better understand performance & compatibility characteristics. Until then, I think it's appropriate to let consumers choose (they can even just use BigInt("1234") in Chrome today, with no polyfill at all!).

@stale
Copy link

stale bot commented Oct 15, 2019

This PR has been automatically marked as stale because it has not been touched in the last 14 days. If you'd like to keep it open, please leave a comment or add the 'long-lived' label, otherwise it'll be closed in 7 days.

@stale stale bot added the stale label Oct 15, 2019
@jhowarth-pal
Copy link

wanted to show interest from golang rpc side of things

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants