# Subtyping

We use \tau <: \sigma to denote that \tau is a subtype of \sigma, where there is required for a \sigma, an element of \tau can be supplied. Generally speaking, a \tau can be viewed as a \sigma, any term of \tau can be used safely in a context that requires \sigma, i.e., \lbrace s|s:\tau \rbrace \subseteq \lbrace t|t:\sigma\rbrace

## Typing Relations

### New Typing Relation

1. Subsumption: \cfrac{\Gamma\vdash t:\tau\ \ \ \ \ \tau<:\sigma}{\Gamma\vdash t:\sigma}

### Subtyping Relations

1. Reflecxivity: \tau<:\tau
2. Transitivity: \cfrac{\tau<:\sigma\ \ \ \ \ \sigma<:\rho}{\tau<:\rho}

#### Covariance and Contravariance

A special case of subtyping is type constructor, a type \tau\rightarrow \sigma is a subtype of \tau'\rightarrow \sigma' iff \tau'<:\tau and \sigma<:\sigma', the intuition is that a function f: \tau \rightarrow \sigma can be replaced by a function that accepts any type that is a subtype of \tau, and results in any type that is a super type of \sigma. The \tau here is said to be at contravariant position, and \sigma is said to be at covariant position. combined with these facts we have:

1. Type Constructor: \cfrac{\tau'<:\tau\ \ \ \ \ \sigma'<:\sigma}{\tau\rightarrow\sigma <: \tau'\rightarrow\sigma'}

#### Extensions for STLC

A record type r can be considered as a subtype of record type r' iff for all label k:\tau\in r, we have k:\tau\in r' or k:\tau'\in r' such that \tau<:\tau'.
Since order of the labels in a record doesn't matter, we have three supplementary rules for the records in \lambda_{\rightarrow}

1. Narrowing: \lbrace l_i:\tau_i \stackrel{1\leqslant i\leqslant n+k}{} \rbrace<:\lbrace l_i:\tau_i \stackrel{1\leqslant i\leqslant n}{} \rbrace
2. Subtyping on Depth: \cfrac{\forall i.\tau_i<:\sigma_i}{\lbrace l_i:\tau_i\stackrel{1\leqslant i\leqslant n}{}\rbrace<:\lbrace l_i:\sigma_i \stackrel{1\leqslant i\leqslant n}{} \rbrace}
3. Subtyping on Permutation: \cfrac{\lbrace l_i:\tau_i \stackrel{1\leqslant i\leqslant n}{}\rbrace\text{ is a permutation of } \lbrace k_i:\sigma_i \stackrel{1\leqslant i\leqslant n}{}\rbrace }{\lbrace l_i:\tau_i \stackrel{1\leqslant i\leqslant n}{}\rbrace<:\lbrace k_i:\sigma_i \stackrel{1\leqslant i\leqslant n}{}\rbrace}

### Inversion Lemma of the Subtyping Relation (Extended with Record)

1. If \tau<:\sigma_1\rightarrow\sigma_2, then \tau:\tau_1\rightarrow\tau_2 for some \sigma_1<:\tau_1 and \tau_2 <: \sigma_2
2. If \tau<:\lbrace l_i:\tau_i \stackrel{1\leqslant i\leqslant n}{} \rbrace, then \tau:\lbrace k_j:\sigma_j \stackrel{1\leqslant j\leqslant m}{} \rbrace such that \lbrace l_i \stackrel{1\leqslant i\leqslant n}{} \rbrace \subseteq \lbrace k_j \stackrel{1\leqslant j\leqslant m}{} \rbrace, and \sigma_j<:\tau_i for each l_i=k_j

## Top and Bottom Types

Top type is the super type of all type:

1. Top Type: \tau<:\top

Bottom type is the subtype of all type:

1. Bottom Type \bot<:\tau

Bottom type has no canonical form, there is no value with type \bot, since if it is, then we can derive terms like v:\top\rightarrow\top from v:\bot and the subsumption rule, but canonical lemma tells us if a value is of type \tau_1\rightarrow\tau_1, then it has to be the abstraction expression, which leads to the contradition.

This special characteristic of bottom type makes it suitable for expressing some particular operations that are not supposed to return, such as throwing an exception or invoking a continuation (by the way, invoking a continuation will never return because it will suspend itself and resume the captured control flow, remind that the call/cc itself is a control flow operator and continuation is the first class control-flow, so to speak.), we can put normal logic in the success branch of an if expression and the code for raising error on the other one while allow this term to be well-typed, since the type of raising an exception is \bot and thus can be promoted to any desired type by subsumption rule.

Another case that requires attention is reference cell, generally speaking, in a lot of cases, the type constructors are either covariant or contravariant, e.g., it is covariant in List constructor:

\cfrac{\tau<:\sigma}{\mathtt{List}\ \tau<:\mathtt{List}\ \sigma}

This is because that List is considered immutable, which makes the type only appears at covariant position. The reference cell, in constract, is mutable so that the type appears at both covariant and contravariant position, in such cases it must be invariant, i.e., neither covariant nor contravariant, which makes the derivation rule looks like:

\cfrac{\tau<:\sigma\ \ \ \ \ \sigma<:\tau}{\tt Ref\ \tau<:Ref\ \sigma}

## Intersection and Union types

Type \tau_1\cap\tau_2 is an intersection type of \tau_1 and \tau_2, it is the meet of \tau_1 and \tau_2, which means if x:\tau_1\cap\tau_2, then x can be viewed as both \tau_1 and \tau_2

1. Intersection Type I: \tau_1\cap\tau_2<:\tau_1
2. Intersection Type II: \tau_1\cap\tau_2<:\tau_2
3. Intersection Type III: \cfrac{\sigma<:\tau_1\ \ \ \ \ \sigma<:\tau_2}{\sigma<:\tau_1\cap\tau_2}
4. Intersection Type IV: \sigma\rightarrow\tau_1\land\sigma\rightarrow\tau_2<:\sigma\rightarrow(\tau_1\cap\tau_2)

Type \tau_1\cup\tau_2 is an union type of \tau_1 and \tau_2, which means if x:\tau_1\cup\tau_2, then x can be viewed as either \tau_1 or \tau_2

1. Union Type I: \cfrac{\tau_1<:\sigma\ \ \ \ \ \tau_2<:\sigma}{\tau_1\cup\tau_2<:\sigma}
2. Union Type II: \cfrac{\sigma<:\tau_1}{\sigma<:\tau_1\cup\tau_2}
3. Union Type III: \cfrac{\sigma<:\tau_2}{\sigma<:\tau_1\cup\tau_2}
4. Union Type IV: \sigma\rightarrow\tau_1\lor\sigma\rightarrow\tau_2<:\sigma\rightarrow(\tau_1\cup\tau_2)

## Coercion Semantics

The Coercion Semantics for a language with subtyping is a set of functions that transforms the language with subtyping to a language without subtyping, while preserve its semantic(sorta like desugaring), the coercion function is denoted by a double bracket \llbracket-\rrbracket. consider the coercion semantics of \lambda_{\rightarrow}^{<:} with the unique type \tt unit and record types, we have the following basic transformations:

\begin{aligned}
&\llbracket\top\rrbracket&&=&&\tt unit\\
&\llbracket\tau_1\rightarrow\tau_2\rrbracket&&=&&\llbracket\tau_1\rrbracket\rightarrow\llbracket\tau_2\rrbracket\\
&\llbracket\lbrace l_i:\tau_i \stackrel{i\in 1..n}{} \rbrace\rrbracket&&=&&\lbrace l_i:\llbracket\tau_i\rrbracket \stackrel{i\in 1..n}{} \rbrace
\end{aligned}

Since the subtyping involves subsumption rules, if we want to design a coercion function that translate a value of \tau to \sigma, we need to know the exact derivation that promotes \tau to \sigma, i.e., we use calligraphic letter to denote a subderivation tree, where \mathcal{C}::\tau<:\sigma means "A subderivation \mathcal{C} whose conclusion is \tau<:\sigma", thus we have following inductive definition on the subtyping derivation:

\begin{aligned}
&\llbracket \cfrac{}{\tau<:\tau}\rrbracket &&=&& \lambda x:\llbracket \tau\rrbracket.\ \mathbb{x}\\
&\llbracket \cfrac{}{\tau<:\top}\rrbracket &&=&& \lambda x:\llbracket \tau\rrbracket.\ \tt unit\\
&\llbracket \cfrac{\mathcal{C_1}::\tau<:\sigma\ \ \ \ \ \mathcal{C_2}::\sigma<:\rho}{\tau<:\rho}\rrbracket &&=&& \lambda x:\llbracket \tau\rrbracket.\ \llbracket\mathcal{C_2}\rrbracket(\llbracket\mathcal{C_1}\rrbracket\ x)\\
&\llbracket \cfrac{\mathcal{C_1}::\sigma_1<:\tau_1\ \ \ \ \ \mathcal{C_2}::\tau_2<:\sigma_2}{\tau_1\rightarrow\tau_2<:\sigma_1\rightarrow\sigma_2} \rrbracket &&=&& \lambda f:\llbracket\tau_1\rightarrow\tau_2\rrbracket.\ \lambda x:\llbracket \sigma_1\rrbracket.\ \llbracket\mathcal{C_2}\rrbracket(f(\llbracket\mathcal{C_1}\rrbracket\ x))\\
&\llbracket\cfrac{n< m}{\lbrace l_i<:\tau_i\stackrel{i\in1..m}{} \rbrace <:\lbrace l_i:\tau_i\stackrel{1\in1..n}{} \rbrace}\rrbracket&&=&& \lambda r:\lbrace l_i:\llbracket\tau_i\rrbracket\stackrel{i\in1..m}{}\rbrace.\ \lbrace l_i=r.l_i\stackrel{i\in1..n}{}\rbrace\\
&\llbracket \cfrac{\forall i.\mathcal{C_i}::\tau_i<:\sigma_i}{\lbrace l_i:\tau_i\stackrel{i\in 1..n}{}\rbrace<:\lbrace l_i<:\sigma_i\stackrel{i\in1..n}{}\rbrace}\rrbracket &&=&& \lambda r:\lbrace l_i:\llbracket\tau_i\rrbracket\stackrel{i\in1..n}{}\rbrace.\ \lbrace l_i=\llbracket\mathcal{C_i}\rrbracket(r.l_i)\stackrel{i\in1..n}{}\rbrace\\
&\llbracket \cfrac{\lbrace k_j:\tau_j\stackrel{j\in1..n}{}\rbrace\text{ perm. of }\lbrace l_i:\sigma_i\stackrel{i\in1..n}{} \rbrace}{\lbrace k_j:\tau_j \stackrel{j\in1..n}{}\rbrace<:\lbrace l_i:\sigma_i\stackrel{i\in1..n}{}\rbrace}\rrbracket &&=&& \lambda r:\lbrace k_j:\llbracket\tau_i\rrbracket\stackrel{j\in1..n}{}\rbrace.\ \lbrace l_i=r.l_i\stackrel{i\in1..n}{}\rbrace\\
\end{aligned}

We use \llbracket \tau_i\rrbracket in the last rule because it implicitly implies "find the label k_j that has the same type as l_i".
The rule 4 may be a bit confusing, if we want to write a coercion function \mathcal{\llbracket C\rrbracket} on the subtyping rule for arrow type, then we know that \mathcal{\llbracket C\rrbracket} must both accepts and returns an arrow type. We first use \mathcal{\llbracket C_1\rrbracket} to promote x to \tau_1(contravariant), then use \mathcal{\llbracket C_2\rrbracket} to promote the return type to \sigma_2(covariant) after preserving the operational semantic(applying the function to a parameter) by applying f.
We also have these for typing derivations, the only difference is that the \llbracket-\rrbracket on typing derivation is nolonger viewed as a function, but as a term of desugared langauge, if \mathcal{D} derives \Gamma\vdash t:\tau, then \mathcal{\llbracket D\rrbracket}:\llbracket\tau\rrbracket:

\begin{aligned}
&\llbracket\cfrac{x:\tau\in\Gamma}{\Gamma\vdash x:\tau}\rrbracket &&=&& x\\
&\llbracket\cfrac{\mathcal{D}::\Gamma,t_1:\tau\vdash t_2:\sigma}{\Gamma\vdash \lambda t_1:\tau.\ t_2:\tau\rightarrow\sigma}\rrbracket &&=&& \lambda t_1:\llbracket\tau\rrbracket.\ \llbracket\mathcal{D}\rrbracket\\
&\llbracket\cfrac{\mathcal{D_1}::\Gamma\vdash t_1:\tau\rightarrow\sigma\ \ \ \ \ \mathcal{D_2}::\Gamma\vdash t_2:\tau}{\Gamma\vdash t_1\ t_2:\sigma}\rrbracket &&=&& \llbracket\mathcal{D_1}\rrbracket\ \llbracket\mathcal{D_2}\rrbracket\\
&\llbracket \cfrac{\forall i.\mathcal{D_i}::\Gamma\vdash t_i:\tau_i}{\Gamma\vdash\lbrace l_i=t_i\stackrel{i\in1..n}{} \rbrace:\lbrace l_i:\tau_i\stackrel{i\in1..n}{}\rbrace} \rrbracket &&=&&\lbrace l_i=\llbracket\mathcal{D_i}\rrbracket\stackrel{i\in1..n}{}\rbrace\\
&\llbracket\cfrac{\mathcal{D}::\Gamma\vdash t:\lbrace l_i:\tau_i\stackrel{i\in1..n}{}\rbrace}{\Gamma\vdash t.l_j:\tau_j}\rrbracket &&=&& \llbracket\mathcal{D}\rrbracket.l_j\\
&\llbracket\cfrac{\mathcal{D}::\Gamma\vdash t:\tau\ \ \ \ \ \ \mathcal{C}::\tau<:\sigma}{\Gamma\vdash t:\sigma}\rrbracket &&=&& \llbracket\mathcal{C}\rrbracket\ \llbracket\mathcal{D}\rrbracket
\end{aligned}

Note that although a coherent symbol \llbracket-\rrbracket is used, it has difference behavior on typing derivations and subtyping derivations, respectively. Generally speaking, if \llbracket\mathcal{C}\rrbracket::\tau<:\sigma for some \tau and \sigma, then it follows the subtyping derivation semantic, if \llbracket\mathcal{C}\rrbracket::\Gamma\vdash t:\tau for some t and \tau, then it follows the typing derivation semantic.

### Coherence of coercion semantics

A translation function \llbracket-\rrbracket is said to be coherent, if for every derivation \mathcal{C} and \mathcal{D} with the same conclusion, \llbracket\mathcal{C}\rrbracket and \llbracket\mathcal{D}\rrbracket are behaviorally equivalent, i.e., they produce the same result in evaluation.

## Algorithmic Subtyping

The subtyping rules proposed above are not suitable for actual implementation in programming languages, the reason is that the rule for subsumption and transtivity involves an arbitrary type, for example, we can't decide whether to promote a type \tau to \top (which is obviously available at any time by definition of \top and subsumption rule) or just leave it as is.
To fix this, we introduce a trimmed subtyping relation, called algorithmic subtyping, denoted by \Vdash1, where \Vdash\tau<:\sigma means "\tau is algorithmically a subtype of \sigma", the subsumption rule, reflexivity rule, and transitivity rule are removed, and the three rules for records are combined into a single rule:

\begin{gather}
\Vdash \tau<:\top\tag{Top}\\
\cfrac{\Vdash \sigma_1<:\tau_1\ \ \ \ \ \Vdash\tau_2<:\sigma_2}{\Vdash \tau_1\rightarrow\tau_2<:\sigma_1\rightarrow\sigma_2}\tag{Arrow}\\
\cfrac{\lbrace l_i\stackrel{i\in1..n}{} \rbrace\subseteq\lbrace k_j\stackrel{j\in1..m}{} \rbrace\ \ \ \ \ \forall k_j=l_j.\sigma_j<:\tau_i}{\Vdash\lbrace k_j:\sigma_j\stackrel{j\in1..m}{}\rbrace<:\lbrace l_i:\tau_i\stackrel{i\in1..n}{}\rbrace}\tag{Record}
\end{gather}

We need to prove that this modified subtyping relation is equivalent to the original relation.

### Lemma [The nonessential of Reflexivity]

The \tau<:\tau can be derived for every type \tau without using Reflexivity rule.

#### Proof

Straightforward induction on the subtyping derivation of \tau. ∎

### Lemma [The nonessential of Transitivity]

If \tau<:\sigma is derivable, it can be derived without using Transitivity rule.

#### Proof

By induction on the size of the derivation \tau<:\sigma, first noticed that if the last rule used is any rule other than transitivity rule, then the result is immediate, since the size of the subderivation is strictly smaller than the original derivation, apply induction hypothesis we can obtain the fact that the subderivation is transitivity-free, and since the last rule is not transitivity, too, the whole derivation tree is thus transitivity-free. If the last rule used is the transitivity rule, we proceed by the combination of both subderivation of the transtivity rule:

1. Case \cfrac{\tau<:\sigma\ \ \ \ \ \sigma<:\top}{\tau<:\top} for some \tau and \sigma:
The result is immediate, \tau<:\top is always true.
2. Case \cfrac{\tau<:\top\ \ \ \ \ \top<:\sigma}{\tau<:\sigma} for some \tau and \sigma:
The result is immediate since the only rule can be used to derive \tau<:\sigma is \text{Top}(we have eliminated reflexivity)
3. Case \cfrac{\tau_1\rightarrow\tau_2<:\sigma_1\rightarrow\sigma_2\ \ \ \ \ \sigma_1\rightarrow\sigma_2<:\rho_1\rightarrow\rho_2}{\tau_1\rightarrow\tau_2<:\rho_1\rightarrow\rho_2}:
Since we know that both \tau_1\rightarrow\tau_2<:\sigma_1\rightarrow\sigma_2 and \sigma_1\rightarrow\sigma_2<:\rho_1\rightarrow\rho_2, from inversion lemma we can conclude that the subderivations for \sigma_1<:\tau_1, \tau_2<:\sigma_2, \rho_1<:\sigma_1, and \sigma_2<:\rho_2 must exist somewhere in the derivation tree, then, by appending transitivity rule right after the subderivation that causes these four conclusions, we can obtain that \tau_2<:\rho_2 and \rho_1<:\tau_1, and since the newly constructed2 derivation is strictly smaller than the original derivation, we can apply the induction hypothesis to obtain a transitivity-free derivation on them, combined with subtying rules for arrow, we can derive \tau_1\rightarrow\tau_2<:\rho_1\rightarrow\rho_2 without using transitivity rule.
4. The record case is similar with case 3
5. The mix of record and arrow is impossible. ∎

### Proposition [Soundness and Completeness]

\tau<:\sigma iff \Vdash \tau<:\sigma

#### Proof

Straightforward induction on the derivation of \tau<:\sigma, use one of the two lemmas above for reflexivity and transitivity rule. ∎

## Algorithmic Typing

Similar to the subtyping relation, there is one non-syntax-directed rule in typing relation: the subsumption rule \cfrac{\Gamma\vdash t:\tau\ \ \ \ \ \tau<:\sigma}{\Gamma\vdash t:\sigma}.
This typing relation allows a term to be promoted to \top at any time, however, the subsumption rule cannot be deleted, it plays an important role in the subtyping. If a subsumption rule is used in the immediate subderivation, then it can be moved down to the root to become the last rule ever used in the whole derivation tree——except for the application case(这一段的example实在是太他妈的长了，我懒得打出来了), in application case, if the subsumption rule appears at either left or right hand of the premises, it can only be moved to the other side rather than down to the root. This is because it is necessary to bridging the difference between the function's argument type and actual parameter's type——either by promoting the parameter's type or by demoting the argument's type, there is no way to do this without the sumsumption rule.
In those cases to which the subsumption rule can be postponed, it is safe to delete it, results in a smaller (or more refined) type, e.g., in the following derivation:

It is perfectly safe to delete the last subsumption rule, leave the s_1\ s_2 of type S_{12}, the type is smaller but it's more refined. It is sufficient to conclude that all subsumption rules can be eliminated except for those who appear at the premises of an application rule. To deal with the latter case, we involve a more complex application rule:

\cfrac{\Gamma\vdash f:\tau\rightarrow\sigma\ \ \ \ \ \Gamma\vdash t:\rho\ \ \ \ \ \rho<:\tau}{\Gamma\vdash f\ t:\sigma}

which completes the algorithmic typing relation(the \Vdash will also be used to denote the algorithmic typing relation):

\begin{gather}
\cfrac{x:\tau\in\Gamma}{\Gamma\Vdash x:\tau}\\
\cfrac{\Gamma,x:\tau\Vdash t:\sigma}{\Gamma\Vdash \lambda x:\tau.t:\tau\rightarrow\sigma}\\
\cfrac{\Gamma\Vdash f:\tau\rightarrow\sigma\ \ \ \ \ \Gamma\Vdash t:\rho\ \ \ \ \ \Vdash \rho<:\tau}{\Gamma\Vdash f\ t:\sigma}\\
\cfrac{\forall i.\Gamma\Vdash t_i:\tau_i}{\Gamma\Vdash \lbrace l_i=t_i\stackrel{i\in1..n}{}\rbrace:\lbrace l_i:\tau_i\stackrel{i\in1..n}{}\rbrace}\\
\cfrac{\Gamma\Vdash t:\lbrace l_i:\tau_i\stackrel{i\in1..n}{}\rbrace}{\Gamma\Vdash t.l_i:\tau_i}
\end{gather}

## Joins and Meets

A type \tau is called a join of type \sigma and \rho, denoted by \sigma\lor\rho=\tau, if \sigma<:\tau and \rho<:\tau, and for all \upsilon, \tau<:\upsilon only if \sigma<:\upsilon and \rho<:\upsilon, i.e., join is the least upper bound of \sigma and \rho.
A type \tau is called a meet of type \sigma and \rho, denoted by \sigma\land\rho=\tau, if \tau<:\sigma and \tau<:\rho, and for all \upsilon, \upsilon<:\tau only if \upsilon<:\sigma and \upsilon<:\rho, i.e., meet is the greatest lower bound of \sigma and \rho.

1. 原书中用的是一个Katex里压根没有的符号，这里只能做下替换 ↩︎
2. This is exactly the reason why we cannot use induction on derivations. ↩︎

We Choose to Go to the Moon