Executive summary: type
= unsized
⊎ sized
, so we should use
type
as our generalization marker, not unsized
.
- Background: Dynamically Sized Types (DST)
- The Insight:
type
is a better generalization marker - Examples ported from DST, Take 5
Background: Dynamically Sized Types (DST)
The Rust team has been discussing incorporating “dynamically-sized
types” into the static semantics for Rust. Essentially the idea is to
allow code to describe and name static types whose size is only known
at Runtime. E.g. the integer vector [int, ..5]
is known at compile
time to have five elements, and is considered (statically) sized,
while the vector [int]
has unknown size at compile time, and so that
type is called unsized.
There is a series of blog posts about dynamically sized types on niko’s blog. So I will not dive into the details too much here
The main points are that the compiler wants to know whether a type is meant to always have a static size, or if it can potentially be unsized. In a language without type polymorphism, this might be easy to determine directly from the parsed type expression (such as in the vector examples I gave at the outset). But once you add polymorphism, things get a litle harder for the compiler.
Anyway, the plan drafted in Niko’s
“DST, Take 5”
is to add an unsized
keyword, and then use it as a marker to make
certain spots more general than they are by default. The reasoning
here is that in the common case, you want a type parameter to
represent a sized type. (Since there are certain operations you
cannot do with a value of an unsized type, such copying the value into
some other location, the compiler needs to know its size statically so
that it can allocate an appopriate amount of space for it.)
So under that scheme, to write type parameter of most general type,
e.g. for a struct
definition that ends with an unsized field,
you need to write:
1 2 3 4 5 6 7 8 9 10 |
|
That is, you need to use what I will call a “generalization” marker at the spot where you bind a type variable, to indicate that the domain of that type variable is more general than the common-case default of a sized type.
For defining a trait that can be implemented on any possible type,
including unsized ones, you would need to use the unsized
keyword
somewhere there as well. “DST, Take 5” proposed
trait Foo<unsized Self> : NormalBounds { ... }
(or trait Foo : unsized + NormalBounds { ... }
, but this is broken for
various reasons).
I had been suggesting unsized trait Foo : NormalBounds { ... }
,
which Niko rightly objected to (since it is not the trait that is
unsized, but rather potentially its Self type).
Over the Rust work week last week I suggested
trait Foo for unsized : NormalBounds
{ … }, which I think is the first
suggestion that Niko and myself could both stomach. (The reasoning
behind the latter suggestion is that we write impl Trait for
SelfType
, so it makes sense to put the generalization marker into the
same position, i.e. filling the placeholder in: Trait for _
.)
The Insight: type
is a better generalization marker
One of the concerns that Niko has pointed out to me is that it is easy
to (mis)read unsized T
as saying “T
must be unsized”. But that is not
what it is saying; it is saying “T
can be unsized”; you can still pass in
a sized type for T
.
I was reflecting on that this morning, and I realized something:
The whole point of DST is to partition the type universe into (Sized ⊎ Unsized).
So if you want this construct to be more self-documenting, the
generalization marker should be using some name to describe that union
(Sized ⊎ Unsized), rather than the name unsized
.
But we already have a very appropriate name for that union: type
!
So that started me thinking: Why don’t we use type
as our generalization marker?
So the definition of bar
in the example above would be written
1
|
|
In fact, this can have a very simple explanation: If we keep the Sized
trait bound,
then you can just say that
1
|
|
desugars to
1
|
|
and in general, any type variable formal binding <T:Bounds>
desugars
to <type T:Sized+Bounds>
I admit, when I first wrote this, I said “hmm, this looks a bit like
C++, is that a problem?” But I’m coming to like it. The biggest
problem I can foresee is that a developer might be confused about when
they are suppposed to write foo<type T>
versus foo<T>
. But chances
are that someone who does not understand the distinction will not
suffer if they just guess the answer; if they over-generalize, either:
the code will compile successfully anyway, in which case there is no harm, except perhaps w.r.t. forward-compatibility of their library when they may have wished they had imposed the
Sized
bound, orthe compiler will flag a problem in their code, in which case hopefully our error messages will suggest to add a
:Sized
bound or to just not usetype
in the binding forT
.
If they under-generalize, then they (or their library’s clients) will
discover the problem when they apply foo
.
For the trait case, it is a little less obvious what to do.
I think we could likewise write:
trait Foo for type : NormalBounds
for the maximally general case.
trait Foo : NormalBounds
would then desugar to
trait Foo for type : Sized + NormalBounds
So the point is that you would only use the type
keyword when you
wanted to explicitly say “I am generalizing over all types, not just
sized ones”, and thus are opting into the additional constraints that
that scenario presents.
This approach wouldn’t be so palatable under earlier envisioned
designs for DST where e.g. you were restricted to write explicitly
unsized struct S { ... }
for structs that could end up being
unsized. But at this point I think we have collectively decided that
such a restriction is unnecessary and undesired, so there is no worry
that someone might end up having to write type struct S { ... }
,
which definitely looks nonsensical.
There is another potential advantage to this approach that I have not
explored much yet: we could also add an Unsized
trait bound, and
allow people to write <type X:Unsized>
for when they want to
restrict X
to unsized types alone. I am not sure whether this is
actual value in this, but it does not seem absurd to put in a special
case in the coherence checker to allow one to write
impl<X:Sized> SomeTrait for X { ... }
and
impl<X:Unsized> SomeTrait for X { ... }
in order to get full coverage of SomeTrait
for all types.
Finally, another obvious (though obviously post Rust 1.0) direction
that this approach suggests is that if we decide to add
parameterization over constants, we can likewise use the const
keyword in the spot where I have written the generalization marker
type
, e.g.
1
|
|
(In this case const
would not be a generalization marker but instead
a kind marker, since it is changing the domain of the parameter from
being that of a type to being some value within a type.)
Examples ported from DST, Take 5
Here are the ported definitions of Rc
and RcData
.
(Update: had to turn off syntax highlighting to work-around a rendering bug on *
.)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
|
Here is the ImmDeref
example:
1 2 3 4 5 6 7 8 9 10 11 |
|
(I think I need a wider variety of examples, but this is good enough for now.)