Types and Programming Languages Chapter 15 Subtyping – Part 1

In this meeting we discussed the first half of Types and Programming Languages Chapter 15: Subtyping, up to the end of section 15.4 (i.e. the top of page 193).

Tom's update

Before the meeting, @tomstuart sent the following update to the club mailing list:

Hi Computation Club,

These emails are meant to arrive on a Monday, aren’t they? Sorry.

As Paul’s message said last week, we’re currently trying to read the first half of Types and Programming Languages chapter 15, up to the end of section 15.4 (i.e. the top of page 193). That will give us loads of subtyping-related stuff to discuss, and hopefully leave us ready to tackle the rest of the chapter, which is all about how to incorporate subtyping into some of the extensions we’ve built in recent chapters.

I wanted to incept a few small ideas between now and the next meeting (22nd August) so that they could ferment over the weekend and turn your brain into a glorious risen loaf of understanding:

--

Intuitively, subtyping is all about substitutability. Why is “records with the fields ‘name’ and ‘age’” a subtype of “records with the field ‘name’”? Because, if something in your program (e.g. a function) expects a record with a field called ‘name’ in a particular place, it’s always safe to give it a record with fields called ‘name’ and ‘age’ instead — that’s an acceptable substitute. This is a little bit counterintuitive because “the subtype is bigger”, but it makes sense when you think about it, and is likely to ultimately agree with other intuitions you have about subtypes in your favourite languages.

--

You’re probably familiar with subclassing from object-oriented programming languages, but subclassing and subtyping are different ideas. In a Java program, for example, you might have a class called “Widget”; therefore, because this class exists and can be instantiated, we can also talk about the type “instances of the Widget class”, which is a description of a particular kind of value that might appear when the program runs. Likewise if “Widget” is a subclass of the class “Product”, that usually means that “instances of the Widget class” is a subtype of “instances of the Product class” as a consequence of the way instances of subclasses behave. But just because these ideas often appear together doesn’t mean they’re not separate.

(The confusion is compounded by the fact that we usually just write “Widget” instead of the longwinded “instances of the Widget class” when it’s clear from context that we’re talking about the type rather than the class itself. I’m about to do that right now.)

--

A good candidate for the most interesting and/or confusing bit of what we’re reading is subtyping for function types (the S-Arrow rule). The underlying issue here is: what does substitutability look like for functions? If something in your program expects a function of type Integer → Integer in a particular place, what types of functions can you safely provide instead?

Let’s think about the function’s argument type first. Since the program is expecting a function of type Integer → Integer, it’s liable to call that function with any argument that can be treated as an Integer, which includes values of subtypes of Integer (e.g. Fixnum or Bignum). So an appropriate substitute function must be able to cope with any Integer, but it’s fine if it’s also able to accept values of a supertype (e.g. Numeric). For example, a function of type Numeric → Integer would be a safe substitute, because you can still call it with any Integer.

What about the result type? Since the program is expecting a function of type Integer → Integer, it’s liable to use its result in ways that assume it’s an Integer. So an appropriate substitute function must return a result that can be treated as an Integer, which means it’s fine if it returns values of a subtype (e.g. Fixnum or Bignum). For example, a function of type Integer → Fixnum would be a safe substitute, because you can still use its return values in any place where an Integer is expected.

By putting these together, we can see that e.g. a function of type Numeric → Fixnum would be a suitable substitute wherever a function of type Integer → Integer is expected, because it can cope with all possible inputs and produces only acceptable outputs (hints of Postel’s law here). As the chapter explains, this is called being contravariant in the argument type (“contra” = “against”) and covariant in the result type (“co” = “with”) because of the way the argument and result types of the subtype can safely be supertypes and subtypes of the original function’s argument/result respectively.

Confusing but interesting! Something to discuss more clearly in the meeting?

--

That’s it. Happy reading. See you next week.

Cheers,
-Tom

The chapter

The big picture

We began by reviewing our existing typing rule for application: T-App:

Γ ⊢ t1 : T11 → T12  Γ ⊢ t2 : T11
---------------------------------
        Γ ⊢ t1 t2 : T12

And seeing how some well-behaved programs would be rejected by our existing type system such as:

(λr:{x:Nat}. r.x) {x=0,y=1}

This program would run fine as the abstraction on the left only uses the x field of the record r passed into it and ignores the extra y field but our type system would reject it as it only permits records with exactly an x field.

In order to support more correct programs with our type checker, we can introduce the concept of subtyping. Informally, this allows us to substitute a type T with a subtype S where it is safe to do so.

Subtyping rules

We grappled with what it means to be a subtype in terms of sets with the help of a ~~Venn~~ Euler diagram. If S is a subtype of T, is T a superset of S?

After a baffling exchange about the relative sizes of infinities, we met our first new typing rule of "subsumption", T-Sub which introduces the new notation S <: T meaning "S is a subtype of T":

Γ ⊢ t : S  S <: T
------------------
    Γ ⊢ t : T

Note that we haven't yet defined which types are subtypes of others, only that we can substitute subtypes into terms.

We then went through an exciting new set of subtyping rules (tantalisingly prefixed with S-) starting with S-Refl defining the reflexive nature of subtypes:

S <: S

Then S-Trans which defines the transitive nature of subtypes:

S <: U  U <: T
--------------
    S <: T

While this at first seemed straightforward, we discussed how this was the first rule that requires us to take an intuitive step to invent a type U when deriving. In particular, this makes mechanical derivation difficult compared to the rules we've seen so far.

Subtyping records

With those fundamentals in place, we turned to defining our first subtypes for our record type: first defining that two records, one with "extra" fields at the end relative to the other (e.g. {x:Nat} and {x:Nat, y:Nat}) are subtypes through S-RcdWidth:

{li:Ti^i∈1..n+k} <: {li:Ti^i∈1..n}

We discussed examples such as:

{x:Nat,y:Nat,z:Bool} <: {x:Nat,y:Nat}

With width covered, we then discussed the "depth" of records through S-RcdDepth:

      for each i  Si <: Ti
--------------------------------
{li:Si^i∈1..n} <: {li:Ti^i∈1..n}

And drew up some examples on the board:

{x:{y:Nat, z:Nat}} <: {x:{y:Nat}}

The final record subtyping rule we discussed was the ~~Patuzzo Principle~~, S-RcdPerm:

{kj:Sj^j∈1..n} is a permutation of {li:Ti^i∈1..n}
-------------------------------------------------
      {kj:Sj^j∈1..n} <: {li:Ti^i∈1..n}

This allowed us to answer a question from earlier in the book: can you treat records with identical fields in a different order equivalently? With subtyping, yes!

{x:Nat, y:Nat} <: {y:Nat, x:Nat}

With all of these rules in place, we talked about how the book says that the subtype relationship is not anti-symmetric (after S-RcdPerm is introduced on p184). In particular, the combination of S-Refl and S-Trans means that the subtype relation is a preorder, not a partial order so you cannot construct a tree of subtypes.

To test our understanding, we went through exercise 15.2.1 together:

Draw a derivation showing that {x:Nat,y:Nat,z:Nat} is a subtype of {y:Nat}.

We discussed how we would have to combine rules to show this as no single rule applies: e.g. we can't use S-RcdWidth because {y:Nat} is not a prefix of {x:Nat,y:Nat,z:Nat} (there's a pesky x:Nat in the way) and we can't use S-RcdPerm because the number of fields differ.

Our solution was to use S-Trans to introduce an interstitial type that we could then apply rules to:

{x:Nat,y:Nat,z:Nat} is a permutation of {y:Nat,x:Nat,z:Nat}
----------------------------------------------------------- S-RcdPerm  ------------------------------ S-RcdWidth
{x:Nat,y:Nat,z:Nat} <: {y:Nat,x:Nat,z:Nat}                             {y:Nat,x:Nat,z:Nat} <: {y:Nat}
----------------------------------------------------------------------------------------------------- S-Trans
                            {x:Nat,y:Nat,z:Nat} <: {y:Nat}

Subtyping functions

Having steeled ourselves, we then looked at the subtyping rule for function types, S-Arrow:

T1 <: S1  S2 <: T2
------------------
S1 → S2 <: T1 → T2

The tricky bit here is that the argument type (T1 above) can be replaced with a supertype ("contravariant") and the return type (T2) can be replaced with a subtype ("covariant").

We tried to work through examples to convince us of the safety of this using an example higher-order function that takes a function f:

(λf:{x:Nat,y:Nat} → Bool . f {x=0,y=0})

With the above, we can pass an abstraction that exactly matches the type like so (inventing an addition operator + to provide a better example):

(λf:{x:Nat,y:Nat} → Bool . f {x=0,y=0}) (λr:{x:Nat,y:Nat} . iszero (r.x + r.y))

But we could also pass a function that only takes an x and ignores the y altogether:

(λf:{x:Nat,y:Nat} → Bool . f {x=0,y=0}) (λr:{x:Nat} . iszero r.x)

However, it would not be safe for us to pass the following function:

(λf:{x:Nat,y:Nat} → Bool . f {x=0,y=0}) (λr:{x:Nat,y:Nat,z:Nat} . iszero (r.x + r.y + r.z))

Therefore a function type can only be replaced by another function type where the argument type is a supertype (that is, {x:Nat,y:Nat} → Bool can be replaced with {x:Nat} → Bool).

Looking at return types, we used the following example:

(λf:Bool → {x:Nat,y:Nat} . (f true).x + (f false).y)

Again, we could pass a function with the exact type:

(λf:Bool → {x:Nat,y:Nat} . (f true).x + (f false).y) (λb:Bool . {x=0,y=0})

Or we could pass a function which returns more fields:

(λf:Bool → {x:Nat,y:Nat} . (f true).x + (f false).y) (λb:Bool . {x=0,y=0,z=0})

However, we're not able to pass a function which returns fewer fields:

(λf:Bool → {x:Nat,y:Nat} . (f true).x + (f false).y) (λb:Bool . {x=0})

This means that it is safe to substitute a function type where the return type is a subtype (that is, Bool → {x:Nat,y:Nat} can be replaced with Bool → {x:Nat,y:Nat,z:Nat}).

As this was all rather counter-intuitive, @jcoglan suggested an analogy: funnels.

If we imagine a function type as a funnel, with the top end being the argument type and the bottom end being the return type (perhaps connected to some sort of hopper), we might picture something like so:

    ooo
 \  ooo  /
  \ ooo /
   |ooo| 
  \ ooo /
   \ooo/

It seem intuitive that we could widen the top end and still accept the same input:

    ooo
\   ooo   /
 \  ooo  /
  \ ooo /
   |ooo| 
  \ ooo /
   \ooo/

And we could make the bottom end narrower and still fit into the hopper:

    ooo
 \  ooo  /
  \ ooo /
   \ooo/
    |o| 
  \  o  /
   \ o /

However, we can't say the reverse: if we make the top end narrower then we can't accept the same inputs:

    ooo
    \o/
     | 
  \     /
   \   /

And if we make the bottom end wider then we can't fit into the hopper:

\    ooo    /
 |   ooo   |
   \     /
    \   /

Top and bottom types

Speaking of supertypes, we had just enough time to discuss a type that is the supertype of all types called Top which is defined by the typing rule S-Top:

S <: Top

We discussed how this was a little like Object in languages like Ruby and Java where every type is a subtype of Object.

On the opposite end of the scale, we were introduced to a type that is a subtype of all types called Bot and its corresponding typing rule S-Bot:

Bot <: T

At first, we discussed whether Bot was like nil or null but this would means we could pass values that would pass type checking but fail evaluation, e.g. if null is of type Bot

(λr:{x:Nat} . r.x) null

This would type check (as S-Bot tells us Bot <: {x:Nat}) but would explode when we try to access field x of nothing.

We then went back to the chapter on the topic:

The first thing to notice is that Bot is empty — there are no closed values of type Bot. [...] The emptiness of Bot does not make it useless. On the contrary: Bot provides a very convenient way of expressing the fact that some operations (in particular, throwing an exception or invoking a continuation) are not intended to return.

In this way, Bot seemed closer to the notion of void in languages like C than our previous example of null (though we did briefly discuss how this differed from our existing notion of unit).

Finally, @tomstuart closed the meeting by summarising an important point about the tricksy S-Arrow:

When I have taught this before, people often make the mistake of thinking that functions are now contravariant in their argument types and covariant in their return types: this is not what the chapter is saying.

(As S-Arrow describes the substitutability of entire function types, i.e. when can you replace some type T → T with another, not just the substitutability of argument types and return types in general. To use our funnel analogy, we're not talking about substituting the input and output of the funnel, we're talking about substituting the funnel itself.)

With a mixture of baffled looks, we were out of time!

Retrospective

There were mixed feelings about the success of the meeting: some people saying that it had cleared up their understanding but others saying it had made matters worse
There was agreement that this felt like a huge increase in power of our type system so far, allowing us to type check many more programs than was previously possible without subtyping
@zetter kindly volunteered to organise the next meeting
We discussed when would be a good time for new people to join the club and agreed that we should hold an interstitial soon but that we should at least finish Chapter 15 and do one implementation meeting before then

Thanks

Thanks to @leocassarani and Geckoboard for hosting and providing beverages, to @lsrus, @jcoglan, @tomstuart, @dkandalov and all those who contributed bread and dips, to @tomstuart for leading the meeting and @jcoglan for his excellent funnel analogy.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly