A proposal: Sumtypes

February 08

Posted by Richard (Rikki) Andrew Cattermole

Yesterday I mentioned that I wasn't very happy with Walter's design of sum types, at least as per his write-up in his DIP repository.
I have finally after two years written up an alternative to it, that should cover everything you would expect from such a language feature.
There are also a couple of key differences with regards to the tag and ABI that will make value type exceptions aka zero cost exceptions work fairly fast.

A summary of features:

- Support for both a shorthand declaration syntax similar to the ML family and the enum-like syntax proposed by Walter, both with UDAs
- The member of operator refers to the tag name
- Proposed match parameters for both name and type (although matching itself is not proposed)
- Copy constructor and destructor support
- Flexible ABI: if you don't use it, you won't pay for it (i.e. no storage for a value or function pointers for copy constructor/destructor)
- Default initialization prefers :none if present, otherwise the first entry
- Implicit construction based upon value, with assignment expressions preferring the existing tag
- Does not have the null type state
- Comparison based upon tag, and only then value
- Introspection (traits and properties)
- Set operations (merging, checking if type/name is in the set)

No non-introspection method to access a sum type value is currently specified; a follow-up matching proposal would offer it instead.
It can be done using the trait getMember, although it is up to you to validate that the entry is the correct one given the tag of the value.

Latest version: https://gist.github.com/rikkimax/d25c6b2bed8caba008a6967e9e0a7e7c

Walter's DIP: https://github.com/WalterBright/DIPs/blob/sumtypes/DIPs/1NNN-(wgb).md

Example nullable:

sumtype Nullable(T) {
    :none,
    T value
}

sumtype Nullable(T) = :none | T value;

void accept(Nullable!Duration timeout) {}

accept(1.minute);
accept(:value = 1.minute);
accept(:none);
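
For comparison, here is a rough approximation of the Nullable example using the existing library type std.sumtype.SumType. It is only a sketch: the None struct is a hypothetical stand-in for the proposed :none entry, the body of accept is made up for illustration, and since implicit construction of arguments is not available today, each argument is wrapped explicitly.

```d
import core.time : Duration, minutes;
import std.sumtype : SumType, match;

// Hypothetical stand-in for the proposed :none entry.
struct None {}

alias Nullable(T) = SumType!(None, T);

// Returns true when a timeout value is present (illustrative body only).
bool accept(Nullable!Duration timeout)
{
    return timeout.match!(
        (None _) => false,
        (Duration d) => true
    );
}

unittest
{
    // No implicit construction today: wrap each argument explicitly.
    assert(accept(Nullable!Duration(1.minutes)));
    assert(!accept(Nullable!Duration(None())));

    // SumType default initializes to its first member, here None.
    Nullable!Duration empty;
    assert(!accept(empty));
}
```

Listing None first mirrors the proposal's preference for :none as the default state.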

The following is a copy of the proposed member of operator and then the sumtype for posterity's sake.

PR: https://github.com/dlang/dmd/pull/16161

Member Of Operator

The member of operator is an operator that operates on a contextual type with respect to a given statement or declaration.

It may appear as the first term in an expression, then it may be followed with binary and dot expressions.

The syntax of the operator is ':' Identifier.

Context

The context is a type that is provided by the statement or relevant declaration.

Validation

The type that the member of operator results in must be the same as its context type.

If it does not match, it is an error.

Valid Statements and Declarations

- Return expressions
  The compiler rewrites return :Identifier; as return typeof(return).Identifier;.
- Variable declarations
  Type qualifiers alone may not appear as the variable type; there must be a concrete type.
  Conceptually, the variable's type is aliased, and that alias is used both as the variable type and as the context.
  Type var = :Identifier; would internally be rewritten as __Alias var = __Alias.Identifier;.
- Switch statements
  The expression used by the switch statement needs to be aliased as per variable declarations.
  So

    switch(expr) {
        case :Identifier:
            break;
    }

  would be rewritten as

    alias __Alias = typeof(expr);
    switch(expr) {
        case __Alias.Identifier:
            break;
    }

- Function calls
  During parameter to argument matching, the compiler checks whether typeof(param).Identifier is valid for func(:Identifier).
- Function parameter default initialization
  It must support the default initialization of a parameter: void func(Enum e = :Start).
- Comparison
  The left hand side of a comparison is used as the context for the right hand side: e == :Start.
  This may require an intermediary variable to get the type of, prior to the comparison.
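
To make the rewrites above concrete, this is roughly what the expanded forms look like in D as it exists today, using a plain enum. The names Enum, initial, isStart and Context are assumptions for illustration; Context plays the role of the internal __Alias, since identifiers starting with two underscores are reserved.

```d
enum Enum { Start, Stop }

// "return :Start;" would be rewritten to this form.
Enum initial()
{
    return typeof(return).Start;
}

// "Enum e = :Start" as a parameter default would be rewritten to Enum.Start.
bool isStart(Enum e = Enum.Start)
{
    // "case :Start:" would be rewritten to use an alias of typeof(e).
    alias Context = typeof(e);
    switch (e)
    {
        case Context.Start:
            return true;
        default:
            return false;
    }
}

unittest
{
    // "Enum var = :Stop;" would be rewritten to use the variable's own type.
    Enum var = Enum.Stop;

    // "var == :Stop" takes its context from the left hand side.
    assert(var == typeof(var).Stop);

    assert(initial() == Enum.Start);
    assert(isStart());
    assert(!isStart(var));
}
```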

Depends upon: member of operator

SumTypes

Sum types are a union of types, as well as a union of names.
Some names will be applied to a type, others may not be.

It acts as a tagged union, using a tag to determine which type or name is currently active.

The matching capabilities are not specified here.

It is influenced by Walter Bright's DIP, although it is not a continuation of it.

Syntax

Two new declaration syntaxes are proposed.

The first comes from Walter Bright's proposal:

sumtype Identifier (TemplateParameters) {
    @UDAs|opt Type Identifier = Expression,
    @UDAs|opt Type Identifier,
    @UDAs|opt MemberOfOperator,
}

TODO: swap for spec grammar version

The second is a shorthand which comes from the ML family:

sumtype Identifier (TemplateParameters) = @UDAs|opt Type Identifier|opt | @UDAs|opt MemberOfOperator;

TODO: swap for spec grammar version

For a nullable type, the two syntaxes would look like this:

sumtype Nullable(T) {
    :none,
    T value
}

sumtype Nullable(T) = :none | T value;

Member Of

A sumtype is a kind of tagged union.
It uses a tag to differentiate between each member.
The tag is a hash of both the fully qualified name of the type and the name.

The tag should be stored in a CPU word size register, so that if only names and no types are provided, there will be no storage beyond the tag.

When the member of operator is applied to a sumtype, it looks up the entry in the list of names that matches the given identifier.
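
As a sketch of how such a tag could be computed at compile time: the proposal does not name a hash function or say how the type name and entry name are combined, so FNV-1a, the ":" separator, and the helper names fnv1a, entryTag and nameTag below are all assumptions.

```d
import std.traits : fullyQualifiedName;

// FNV-1a is an arbitrary choice; the proposal does not specify a hash.
size_t fnv1a(string text) pure nothrow @nogc @safe
{
    ulong hash = 14695981039346656037UL; // 64-bit FNV offset basis
    foreach (char c; text)
    {
        hash ^= c;
        hash *= 1099511628211UL; // 64-bit FNV prime
    }
    return cast(size_t) hash;
}

// Hypothetical tag of a typed entry: fully qualified type name plus entry name.
enum size_t entryTag(T, string name) = fnv1a(fullyQualifiedName!T ~ ":" ~ name);

// Hypothetical tag of a name-only entry such as :none.
enum size_t nameTag(string name) = fnv1a(":" ~ name);

// The same entry always yields the same tag, evaluated at compile time.
static assert(entryTag!(int, "i") == entryTag!(int, "i"));

// Different type or different name yields a different tag.
static assert(entryTag!(int, "i") != entryTag!(long, "l"));
static assert(nameTag!"none" != entryTag!(int, "none"));
```

Because the tag depends only on the type and name, the same entry gets the same tag in every sumtype that contains it, which is presumably what lets a subset be assigned to a superset without remapping tags.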

Proposed Match Parameters

There are two forms that need to be supported.

Both support a trailing identifier, which names the variable declared in the given scope.

1. The first is the type
2. The second is the member of operator, to match the name

If conflicts are possible, it is recommended to always declare entries with names and to always use those names when matching.

obj.match {
    (:entry varName) => writeln(varName);
}

If you did not specify a type, you may not use the renamed variable declaration for a given entry nor specify the entry by the type.

It will of course be possible to specify an entry based upon the member of operator.

sumtype S = :none;

identity(:none);

S identity(S s) => s;

As a feature this is otherwise known as implicit construction, and it applies to types in general in any location, including function arguments.

Storage

A sumtype at runtime is represented by a flexible ABI.

1. The tag [size_t]
2. Copy constructor [function]
3. Destructor [function]
4. Storage [void[X]]

The tag always exists.

If none of the entries has a copy constructor (including generated ones), the copy constructor field does not exist.

If none of the entries has a destructor (including generated ones), the destructor field does not exist.

If none of the entries takes any storage (that is, no entry has a type), the storage field does not exist.

For entries that do not provide a copy constructor or destructor but need one, a function internal to the object file will be generated that performs the appropriate action (and, should we get reference counting, performs that as well).

For all intents and purposes a sum type is similar to a struct as far as when to call the copy constructors and destructors.
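
The "don't pay for what you don't use" layout can be modelled in today's D with a template struct and static if. This is only a sketch of the idea: the struct name Layout, the field names and the function pointer signatures are assumptions, and ubyte[N] is used where the proposal writes void[X].

```d
// Hypothetical layout model; not the actual ABI of the proposal.
struct Layout(bool hasCopyCtor, bool hasDtor, size_t storageSize)
{
    size_t tag; // always present

    static if (hasCopyCtor)
        void function(void* dst, const(void)* src) copyCtor;

    static if (hasDtor)
        void function(void* obj) dtor;

    static if (storageSize > 0)
        ubyte[storageSize] storage; // the proposal writes this as void[X]
}

// Only names, no types: the tag is all that is stored.
static assert(Layout!(false, false, 0).sizeof == size_t.sizeof);

// Plain data entries: tag plus storage, no function pointers.
static assert(Layout!(false, false, long.sizeof).sizeof == size_t.sizeof + long.sizeof);

// Entries that need both a copy constructor and a destructor, but no storage.
static assert(Layout!(true, true, 0).sizeof == 3 * size_t.sizeof);
```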

Initialization

The default initialization of a sumtype will always prefer :none if present; otherwise it uses the first entry.
The shorthand syntax does not support default initialization expressions, so a first entry declared that way takes the default initialized value of its type.

Assigning a value to a sum type will always prefer the currently selected tag.
If however the value cannot be coerced into the tag's type, a match is done to determine the best candidate based upon the type of the expression.

An example of preferring the currently selected tag:

sumtype S = int i | long l;

S s = :i = 2;

But if we switch to a larger value s = long.max;, this will assign the long instead.

Nullability

A sum type cannot have the type state of null.

Set Operations

A sumtype which is a subset of another is assignable to it.

sumtype S1 = :none | int;
sumtype S2 = :none | int | float;

S1 s1;
S2 s2 = s1;

This covers other scenarios like returning from a function or an argument to a function.
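
With the existing std.sumtype the same widening has to be written out by hand. A minimal sketch, assuming a hypothetical None struct in place of :none and a helper named widen (neither is part of the proposal):

```d
import std.sumtype : SumType, match;

struct None {} // hypothetical stand-in for :none

alias S1 = SumType!(None, int);
alias S2 = SumType!(None, int, float);

// Every member of S1 is also a member of S2, so each case can be rewrapped.
S2 widen(S1 narrow)
{
    return narrow.match!(value => S2(value));
}

unittest
{
    S1 s1 = 42;
    S2 s2 = widen(s1);

    // The int survives the conversion with its value intact.
    assert(s2.match!(
        (int i) => i == 42,
        _ => false
    ));
}
```

The proposal would make this conversion implicit for assignment, function arguments and return values.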

To remove a possible entry from a sumtype you must perform a match (which is not being proposed here):

sumtype S1 = :none | int;
sumtype S2 = :none | int | float;

S1 s1;
S2 s2 = s1;

s2.match {
    (float) => assert(0);
    (default val) s1 = val;
}

To determine if a type is in the set:

sumtype S1 = :none | int;

pragma(msg, int in S1); // true
pragma(msg, :none in S1); // true
pragma(msg, "none" in S1); // true

To merge two sumtypes together use the pipe operator on the type.

sumtype S1 = :none | int i;
sumtype S2 = :none | long l;
alias S3 = S1 | S2; // :none | int i | long l

Or you can expand a sumtype directly into another:

sumtype S1 = :none | int i;
sumtype S2 = :none | S1.expand | long l; // :none | int i | long l

When merging, duplicate types and names are not an error; they will be combined.
However, if the same name appears with two different types, this is an error.
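
A type-only version of this merge can be approximated today with std.sumtype and std.meta. It is a sketch under assumptions: SumType has no entry names, so only the type part of the merge is modelled, and the None struct and the Merge template are hypothetical helpers.

```d
import std.meta : NoDuplicates;
import std.sumtype : SumType;

struct None {} // hypothetical stand-in for :none

alias S1 = SumType!(None, int);
alias S2 = SumType!(None, long);

// Concatenate both member lists and drop repeated types.
alias Merge(A, B) = SumType!(NoDuplicates!(A.Types, B.Types));

alias S3 = Merge!(S1, S2);

// The shared None member appears once in the merged type.
static assert(is(S3 == SumType!(None, int, long)));
```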

Introspection

A sumtype includes all primary properties of types including sizeof.

It has one new property, expand, which is used to expand a sumtype into the one currently being declared.

The trait allMembers will return a set of strings that denote the names of each entry. If an entry has not been given a name by the user, a generated name will be provided that can be used to access it instead.

Using the trait getMember or SumType.Member will return an alias to that entry, so that you may acquire its type or assign to it.

For the trait identifier on an alias of a given entry, it will return the name for that entry.

An is expression may be used to determine if a given type is a sumtype: is(T == sumtype).
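
For reference, the closest existing equivalents in std.sumtype are the isSumType trait and the Types member. The snippet below shows only those library features, not the proposed traits; the None struct is a hypothetical stand-in for :none.

```d
import std.sumtype : SumType, isSumType;

struct None {} // hypothetical stand-in for :none

alias S = SumType!(None, int);

// Closest existing equivalent of is(T == sumtype).
static assert(isSumType!S);
static assert(!isSumType!int);

// Member types are exposed as a compile-time sequence rather than by name.
static assert(S.Types.length == 2);
static assert(is(S.Types[1] == int));
```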

Comparison

The comparison of two sum types is first done based upon the tag; if the tags are not equal, their ordering gives the less than or greater than result.

Should the tags match, a match will occur and the comparison behavior of the given entry type produces the final comparison value.
If a given entry does not have a type, the two compare as equal.
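
The ordering described above can be sketched with a hand-written tagged struct. Everything here is an assumption for illustration: the struct Sample models a sumtype with the entries :none and int i, and the tag values 0 and 1 are placeholders rather than real hashes.

```d
// Hypothetical model of: sumtype Sample = :none | int i;
struct Sample
{
    size_t tag; // 0 = :none, 1 = i (placeholder values, not hashes)
    int value;  // only meaningful when tag == 1

    int opCmp(const Sample rhs) const
    {
        // Step 1: differing tags decide the ordering on their own.
        if (tag != rhs.tag)
            return tag < rhs.tag ? -1 : 1;

        // Step 2: an entry without a type compares as equal.
        if (tag == 0)
            return 0;

        // Step 3: same typed entry, so compare the stored values.
        return (value > rhs.value) - (value < rhs.value);
    }
}

unittest
{
    auto none = Sample(0, 0);
    auto one = Sample(1, 1);
    auto two = Sample(1, 2);

    assert(none < one);                               // decided by tag
    assert(one < two);                                // decided by value
    assert(Sample(0, 5).opCmp(Sample(0, 9)) == 0);    // no type: equal
}
```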

February 08

Re: A proposal: Sumtypes

Posted by Paul Backus
in reply to Richard (Rikki) Andrew Cattermole

On Thursday, 8 February 2024 at 15:42:25 UTC, Richard (Rikki) Andrew Cattermole wrote:

[...]

Latest version: https://gist.github.com/rikkimax/d25c6b2bed8caba008a6967e9e0a7e7c

There are two big-picture issues with this proposal:

1. In addition to sum types, it includes proposals for several unrelated new language features, like a "member of operator" and implicit construction of function arguments and return values.
2. There are several unusual design decisions that are presented without any motivation or rationale. For example:
   - "The tag is a hash of both the fully qualified name of the type and the name."
   - Copy constructors and destructors are stored as fields
   - "The default initialization of a sumtype will always prefer :none if present"
   - "Assigning a value to a sum type, will always prefer the currently selected tag"

There's a lot more I could say, but I don't think there's much value in giving more detailed feedback until these big structural issues are addressed.

February 08

Re: A proposal: Sumtypes

Posted by ryuukk_
in reply to Richard (Rikki) Andrew Cattermole

I'm personally not a fan of having a new keyword; it's a word I can no longer use in my code. We have union and enum, and a sumtype is the combination of both, so why not:

union MyTaggedUnion: enum {
    A a,
    B b,
    C c,
}

It'd lose your one-liner idea, however.

I am not a fan of using .match, and not a fan of having match, which is yet another new keyword. Why not reuse switch?

It's easy to distinguish from C's switch: just check for the presence of case.

switch (value) {
    :A => writeln("This is A: ", value);
    :B => writeln("This is B: ", value);
    else => writeln("something else");
}

I would also make a proposal for switch as an expression, which I guess is already possible with your match idea?

I like it so far, hopefully things move fast from now on.

February 08

Re: A proposal: Sumtypes

Posted by ryuukk_
in reply to Paul Backus

On Thursday, 8 February 2024 at 17:28:32 UTC, Paul Backus wrote:

There are two big-picture issues with this proposal:

In addition to sum types, it includes proposals for several unrelated new language features, like a "member of operator" and implicit construction of function arguments and return values.

Welcome features imo, needed to make the UX nice and pleasant. Repetition is useless and is visual noise; name your types and variables properly instead.

February 08

Re: A proposal: Sumtypes

Posted by Timon Gehr
in reply to Richard (Rikki) Andrew Cattermole

On 2/8/24 16:42, Richard (Rikki) Andrew Cattermole wrote:
> Yesterday I mentioned that I wasn't very happy with Walter's design of sum types, at least as per his write-up in his DIP repository.
> I have finally after two years written up an alternative to it, that should cover everything you would expect from such a language feature.
> There are also a couple of key differences with regards to the tag and ABI that will make value type exceptions aka zero cost exceptions work fairly fast.
> 
> ...
> 
> ```d
> sumtype Nullable(T) {
>      :none,
>      T value
> }
> 
> sumtype Nullable(T) = :none | T value;
> 
> void accept(Nullable!Duration timeout) {}
> 
> accept(1.minute);
> accept(:value = 1.minute);
> accept(:none);
> ```
> ...
> 
> The first comes from Walter Bright's proposal:
> 
> ```d
> sumtype Identifier (TemplateParameters) {
>      @UDAs|opt Type Identifier = Expression,
>      @UDAs|opt Type Identifier,
>      @UDAs|opt MemberOfOperator,
> }
> ```
> ...

> TODO: swap for spec grammar version
> 
> The second is short hand which comes from the ML family:
> 
> ```d
> sumtype Identifier (TemplateParameters) = @UDAs|opt Type Identifier|opt | @UDAs|opt MemberOfOperator;
> ```
> 
> TODO: swap for spec grammar version
> 
> For a nullable type this would look like in both syntaxes:
> 
> ```d
> sumtype Nullable(T) {
>      :none,
>      T value
> }
> 
> sumtype Nullable(T) = :none | T value;
> ```
> ...

I have to say, I am not a big fan of having only parameterless named constructors as a special case.


> Implicit construction and applies to types in general in any location including function arguments.
> 
> ## Storage
> 
> A sumtype at runtime is represented by a flexible ABI.
> 
> 1. The tag [``size_t``]
> 2. Copy constructor [``function``]
> 3. Destructor [``function``]
> 4. Storage [``void[X]``]
> ...

It is not so clear why copy constructor and destructor need to be virtual functions.

> 
> The default initialization of a sumtype will always prefer ``:none`` if present, otherwise it is the first entry.

Just do first entry always.

> For the first entry on the short hand syntax it does not support expressions for the default initialization, therefore it will be the default initialized value of that type.
> ...

In this case, I am not sure what initializers inside a sumtype will do.

> Assigning a value to a sum type, will always prefer the currently selected tag.

Alarm bells go off in my head.

> If however the value cannot be coerced into the tag's type, it will then do a match to determine the best candidate based upon the type of the expression.
> 
> An example of prefering the currently selected tag:
> 
> ```d
> sumtype S = int i | long l;
> 
> S s = :i = 2;
> ```
> ...

I would strongly advise to drop this.

> But if we switch to a larger value ``s = long.max;``, this will assign the long instead.
> 
> ## Nullability
> 
> A sum type cannot have the type state of null.
> ...

I am not sure what that means.

> ## Set Operations
> 
> A sumtype which is a subset of another, will be assignable.
> 
> ```d
> sumtype S1 = :none | int;
> sumtype S2 = :none | int | float;
> 
> S1 s1;
> S2 s2 = s1;
> ```
> ...

This seems like a strange mix of nominal and structural typing.

> This covers other scenarios like returning from a function or an argument to a function.
> 
> To remove a possible entry from a sumtype you must peform a match (which is not being proposed here):
> 
> ```d
> sumtype S1 = :none | int;
> sumtype S2 = :none | int | float;
> 
> S1 s1;
> S2 s2 = s1;
> 
> s2.match {
>      (float) => assert(0);
>      (default val) s1 = val;
> }
> ```
> 
> To determine if a type is in the set:
> 
> ```d
> sumtype S1 = :none | int;
> 
> pragma(msg, int in S1); // true
> pragma(msg, :none in S1); // true
> pragma(msg, "none" in S1); // true
> ```
> ...

I think a priori here you will have an issue with parsing.

> To merge two sumtypes together use the pipe operator on the type.
> 
> ```d
> sumtype S1 = :none | int i;
> sumtype S2 = :none | long l;
> alias S3 = S1 | S2; // :none | int i | long l
> ```
> ...

The flattening behavior is unintuitive.


> Or you can expand a sumtype directly into another:
> 
> ```d
> sumtype S1 = :none | int i;
> sumtype S2 = :none | S1.expand | long l; // :none | int i | long l
> ```
> 
> When merging, duplicate types and names are not an error, they will be combined.
> Although if two names have different types this will error.
> ...

Again this mixes nominal and structural typing in a way that seems unsatisfying. Note that different struct types do not become assignable just because they share member types and names.

February 09

Re: A proposal: Sumtypes

Posted by ryuukk_
in reply to Timon Gehr

On Thursday, 8 February 2024 at 19:26:37 UTC, Timon Gehr wrote:

>> sumtype S = int i | long l;
>>
>> S s = :i = 2;
>>
>> ...
>
> I would strongly advise to drop this.

I agree, i'd make it an error and ask user to be explicit

>> A sumtype which is a subset of another, will be assignable.
>>
>> sumtype S1 = :none | int;
>> sumtype S2 = :none | int | float;
>>
>> S1 s1;
>> S2 s2 = s1;
>>
>> ...
>
> This seems like a strange mix of nominal and structural typing.
>
>> This covers other scenarios like returning from a function or an argument to a function.
>>
>> To remove a possible entry from a sumtype you must peform a match (which is not being proposed here):
>>
>> sumtype S1 = :none | int;
>> sumtype S2 = :none | int | float;
>>
>> S1 s1;
>> S2 s2 = s1;
>>
>> s2.match {
>>      (float) => assert(0);
>>      (default val) s1 = val;
>> }
>>
>> To determine if a type is in the set:
>>
>> sumtype S1 = :none | int;
>>
>> pragma(msg, int in S1); // true
>> pragma(msg, :none in S1); // true
>> pragma(msg, "none" in S1); // true
>>
>> ...
>
> I think a priori here you will have an issue with parsing.
>
>> To merge two sumtypes together use the pipe operator on the type.
>>
>> sumtype S1 = :none | int i;
>> sumtype S2 = :none | long l;
>> alias S3 = S1 | S2; // :none | int i | long l
>>
>> ...
>
> The flattening behavior is unintuitive.
>
>> Or you can expand a sumtype directly into another:
>>
>> sumtype S1 = :none | int i;
>> sumtype S2 = :none | S1.expand | long l; // :none | int i | long l
>>
>> When merging, duplicate types and names are not an error, they will be combined.
>> Although if two names have different types this will error.
>> ...
>
> Again this mixes nominal and structural typing in a way that seems unsatisfying. Note that different struct types do not become assignable just because they share member types and names.

I agree, multiple ways to do the same thing will lead to confusion

February 16

Re: A proposal: Sumtypes

Posted by Kagamin
in reply to Richard (Rikki) Andrew Cattermole

February 16

Re: A proposal: Sumtypes

Posted by Richard (Rikki) Andrew Cattermole
in reply to Kagamin

On 16/02/2024 9:19 PM, Kagamin wrote:
> Is the `Result` type supposed to be sumtype? Is `Result` on top level supposed to be something like `string | UnicodeException | ErrnoException | IOException | SocketException | PostgresqlException | SqliteException`? That would be beyond attribute soup.

Where did you get ``Result`` from?

None of my examples use that identifier, although Walter's does.

February 16

Re: A proposal: Sumtypes

Posted by Kagamin
in reply to Richard (Rikki) Andrew Cattermole

I refer to the idea of implementation of error handling with return types, possibly nogc. Is it not supposed to involve sumtypes somehow?

February 16

Re: A proposal: Sumtypes

Posted by Richard (Rikki) Andrew Cattermole
in reply to Kagamin

On 16/02/2024 10:19 PM, Kagamin wrote:
> I refer to the idea of implementation of error handling with return types, possibly nogc. Is it not supposed to involve sumtypes somehow?

If there is no language level support for it, then yes you will need to declare the sumtype explicitly.

If there is, which my proposal for value type exceptions provides, inference or ``@throws(Exception, MyException)`` will do it, without the need of a sumtype declaration.

Note for language level support, throw and catch would add and remove automatically from the set, so sumtypes are only of note if you catch all.
