digitalmars.D - A proposal: Sumtypes

Richard (Rikki) Andrew Cattermole <richard cattermole.co.nz> writes:
Yesterday I mentioned that I wasn't very happy with Walter's 
design of sum types, at least as per his write-up in his DIP 
repository.
I have finally after two years written up an alternative to it, 
that should cover everything you would expect from such a 
language feature.
There are also a couple of key differences with regards to the 
tag and ABI that will make value type exceptions aka zero cost 
exceptions work fairly fast.

A summary of features:

- Support for both a shorthand declaration syntax similar to the ML 
family and the enum-like syntax proposed by Walter, both with 
UDAs.
- The member of operator refers to the tag name
- Proposed match parameters for both name and type (although 
matching itself is not proposed)
- Copy constructors and destructor support
- Flexible ABI, if you don't use it, you won't pay for it (i.e. 
no storage for a value or function pointers for copy 
constructor/destructor)
- Default initialization using first entry or preferred ``:none``
- Implicit construction based upon value and using assignment 
expression to prefer existing tag
- Does not have the null type state
- Comparison based upon tag, and only then value
- Introspection (traits and properties)
- Set operations (merging, checking if type/name is in the set)
- No non-introspection method to access a sum type value is 
currently specified; a follow-up matching proposal would offer 
one.
   It can be done using the trait ``getMember``, although it is up 
to you to validate that the entry is the correct one for the 
value's current tag.

Latest version: 
https://gist.github.com/rikkimax/d25c6b2bed8caba008a6967e9e0a7e7c

Walter's DIP: 
https://github.com/WalterBright/DIPs/blob/sumtypes/DIPs/1NNN-(wgb).md

Example nullable:

```d
sumtype Nullable(T) {
     :none,
     T value
}

sumtype Nullable(T) = :none | T value;

void accept(Nullable!Duration timeout) {}

accept(1.minute);
accept(:value = 1.minute);
accept(:none);
```

The following is a copy of the proposed member of operator and 
then the sumtype for posterity's sake.

------------------------

PR: https://github.com/dlang/dmd/pull/16161



The member of operator is an operator that operates on a 
contextual type with respect to a given statement or declaration.

It may appear as the first term in an expression, then it may be 
followed with binary and dot expressions.

The syntax of the operator is ``':' Identifier``.



The context is a type that is provided by the statement or 
relevant declaration.



The type that the member of operator results in is the same as 
the one it is in context of.

If it does not match, it will error.



- Return expressions
     The compiler rewrites ``return :Identifier;`` as ``return 
typeof(return).Identifier;``.
- Variable declarations
     A type qualifier alone may not appear as the variable type; 
there must be a concrete type.
     Think of the variable's type as having been aliased, with the 
alias serving both as the variable type and as the context.
     ``Type var = :Identifier;`` would internally be rewritten as 
``__Alias var = __Alias.Identifier;``.
- Switch statements
     The expression used by the switch statement will need to be 
aliased as per variable declarations.
     So
     ```d
     switch(expr) {
         case :Identifier:
             break;
     }
     ```
     would be rewritten as
     ```d
     alias __Alias = typeof(expr);
     switch(expr) {
         case __Alias.Identifier:
             break;
     }
     ```
- Function calls
     During parameter-to-argument matching, the compiler checks 
whether ``typeof(param).Identifier`` is possible for 
``func(:Identifier)``.
- Function parameter default initialization
     It must support the default initialization of a parameter. 
``void func(Enum e = :Start)``.
- Comparison
     The left hand side of a comparison is used as the context for 
the right hand side ``e == :Start``.
     This may require an intermediary variable to get the type of, 
prior to the comparison.
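
To make the rewrites above concrete, here is a minimal sketch in today's D that spells out by hand what the compiler would produce. The enum ``Direction`` and the function names are made up for illustration, and ``Ctx`` stands in for the compiler-internal ``__Alias``; none of this is part of the PR.

```d
// Sketch only: writes out by hand what the proposed rewrites would produce.
enum Direction { Start, Left, Right }

// `return :Start;` would use typeof(return) as the context.
Direction initial()
{
    return typeof(return).Start;
}

// `case :Left:` would use the type of the switch expression as the context.
string describe(Direction d)
{
    alias Ctx = typeof(d); // stands in for the compiler-internal __Alias
    final switch (d)
    {
        case Ctx.Start: return "start";
        case Ctx.Left:  return "left";
        case Ctx.Right: return "right";
    }
}

// `void func(Direction d = :Start)` would default to Direction.Start.
Direction orDefault(Direction d = Direction.Start)
{
    return d;
}

unittest
{
    assert(initial() == Direction.Start);
    assert(describe(Direction.Left) == "left");
    assert(orDefault() == Direction.Start);
}
```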

------------------------

Depends upon: [member of 
operator](https://gist.github.com/rikkimax/9e02ad538d94615d76d869070f7fd65f)



Sum types are a union of types, as well as a union of names.
Some names will be applied to a type, others may not be.

It acts as a tagged union, using a tag to determine which type or 
name is currently active.

The matching capabilities are not specified here.

It is influenced by Walter Bright's DIP, although it is not a 
continuation of it.



Two new declaration syntaxes are proposed.

The first comes from Walter Bright's proposal:

```d
sumtype Identifier (TemplateParameters) {
      UDAs|opt Type Identifier = Expression,
      UDAs|opt Type Identifier,
      UDAs|opt MemberOfOperator,
}
```

TODO: swap for spec grammar version

The second is short hand which comes from the ML family:

```d
sumtype Identifier (TemplateParameters) =  UDAs|opt Type 
Identifier|opt |  UDAs|opt MemberOfOperator;
```

TODO: swap for spec grammar version

For a nullable type, the two syntaxes would look like this:

```d
sumtype Nullable(T) {
     :none,
     T value
}

sumtype Nullable(T) = :none | T value;
```



A sumtype is a kind of tagged union.
It uses a tag to differentiate between each member.
The tag is a hash of both the fully qualified name of the type 
and the name.

The tag should be stored in a CPU word size register, so that if 
only names and no types are provided, there will be no storage.

When the member of operator is applied to a sumtype, it locates 
the entry whose name matches the given identifier.
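
As a rough illustration of a tag derived from the fully qualified type name plus the entry name, here is a sketch that computes one at compile time. The choice of FNV-1a, the ``"\0"`` separator, and the names ``fnv1a``, ``tagOf`` and ``nameTag`` are all assumptions for the example; the proposal does not say which hash is used.

```d
import std.traits : fullyQualifiedName;

// Assumption: FNV-1a (64-bit); the proposal does not name a hash function.
size_t fnv1a(string s) pure nothrow @nogc @safe
{
    ulong h = 0xcbf29ce484222325UL;   // FNV-1a 64-bit offset basis
    foreach (char c; s)
    {
        h ^= c;
        h *= 0x100000001b3UL;         // FNV-1a 64-bit prime, wraps on overflow
    }
    return cast(size_t) h;
}

// Tag for a typed entry: hash of the fully qualified type name and entry name.
// The "\0" separator is an assumption, used to keep the two parts distinct.
enum size_t tagOf(T, string name) = fnv1a(fullyQualifiedName!T ~ "\0" ~ name);

// Tag for a name-only entry such as `:none`.
enum size_t nameTag(string name) = fnv1a("\0" ~ name);

unittest
{
    // Evaluated entirely at compile time, so the tag costs nothing at runtime.
    static assert(tagOf!(int, "i") != tagOf!(long, "l"));
    static assert(nameTag!"none" == nameTag!"none");
}
```

Because such a tag depends only on names, two separately compiled sumtypes that share an entry would agree on its tag without coordinating an index.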



There are two forms that need to be supported.

Both of which support a following name identifier that will be 
used for the variable declaration in the given scope.

1. The first is the type
2. Second is the member of operator to match the name

If conflicts are possible, it is recommended to always declare 
entries with names and to always use those names when matching.

```d
obj.match {
     (:entry varName) => writeln(varName);
}
```

If you did not specify a type, you may not use the renamed 
variable declaration for a given entry nor specify the entry by 
the type.

It will of course be possible to specify an entry based upon the 
member of operator.

```d
sumtype S = :none;

identity(:none);

S identity(S s) => s;
```

As a feature this is otherwise known as implicit construction, and 
it applies to types in general in any location, including function 
arguments.



A sumtype at runtime is represented by a flexible ABI.

1. The tag [``size_t``]
2. Copy constructor [``function``]
3. Destructor [``function``]
4. Storage [``void[X]``]

The tag always exists.

If none of the entries has a copy constructor (including 
generated), this field does not exist.

If none of the entries has a destructor (including generated), 
this field does not exist.

If none of the entries takes any storage (so all entries do not 
have a type), this field does not exist.

For entries that do not provide a copy constructor or destructor 
but need one, the compiler will generate a function internal to 
the object file that performs the appropriate action (and, should 
we get reference counting, performs that as well).

For all intents and purposes a sum type is similar to a struct as 
far as when to call the copy constructors and destructors.
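
The "pay only for what you use" layout can be sketched with a plain templated struct. This is only an approximation: the name ``SumLayout``, the field names, the function pointer signatures and the use of ``static if`` are assumptions, not the ABI the compiler would emit.

```d
// Hypothetical layout sketch; names and signatures are assumptions.
struct SumLayout(bool hasCopy, bool hasDtor, size_t storageSize)
{
    size_t tag;                                              // always present
    static if (hasCopy)
        void function(void* dst, const(void)* src) copyCtor; // only if any entry copies
    static if (hasDtor)
        void function(void* obj) dtor;                       // only if any entry destroys
    static if (storageSize > 0)
        void[storageSize] storage;                           // only if any entry has a type
}

// A sumtype made only of names collapses to the tag.
static assert(SumLayout!(false, false, 0).sizeof == size_t.sizeof);

// Adding copy and destructor support grows the layout.
static assert(SumLayout!(true, true, 8).sizeof > SumLayout!(false, false, 8).sizeof);
```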



The default initialization of a sumtype will always prefer 
``:none`` if present, otherwise it is the first entry.
For the first entry on the short hand syntax it does not support 
expressions for the default initialization, therefore it will be 
the default initialized value of that type.

Assigning a value to a sum type, will always prefer the currently 
selected tag.
If however the value cannot be coerced into the tag's type, it 
will then do a match to determine the best candidate based upon 
the type of the expression.

An example of preferring the currently selected tag:

```d
sumtype S = int i | long l;

S s = :i = 2;
```

But if we switch to a larger value ``s = long.max;``, this will 
assign the long instead.



A sum type cannot have the type state of null.



A sumtype which is a subset of another, will be assignable.

```d
sumtype S1 = :none | int;
sumtype S2 = :none | int | float;

S1 s1;
S2 s2 = s1;
```

This covers other scenarios like returning from a function or an 
argument to a function.
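
For comparison, this is roughly what the same widening costs today with the library ``std.sumtype``, where it has to be written out as a match. ``None`` is a stand-in for the proposed ``:none`` and ``widen`` is a hypothetical helper name.

```d
import std.sumtype : SumType, match;

struct None {}                          // stands in for the proposed `:none`
alias S1 = SumType!(None, int);
alias S2 = SumType!(None, int, float);

// Every member of S1 is also a member of S2, so each case is re-wrapped as S2.
S2 widen(S1 s)
{
    return s.match!(
        (None n) => S2(n),
        (int i)  => S2(i)
    );
}

unittest
{
    S2 s2 = widen(S1(42));
    assert(s2.match!((int i) => i == 42, _ => false));
}
```

Under the proposal, ``S2 s2 = s1;`` would replace the ``widen`` call.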

To remove a possible entry from a sumtype you must perform a match 
(which is not being proposed here):

```d
sumtype S1 = :none | int;
sumtype S2 = :none | int | float;

S1 s1;
S2 s2 = s1;

s2.match {
     (float) => assert(0);
     (default val) s1 = val;
}
```

To determine if a type is in the set:

```d
sumtype S1 = :none | int;

pragma(msg, int in S1); // true
pragma(msg, :none in S1); // true
pragma(msg, "none" in S1); // true
```
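
The closest equivalent with today's library ``std.sumtype`` is a compile-time search through the member type list. ``holds`` is a hypothetical helper and ``None`` again stands in for ``:none``; name-based lookup (``"none" in S1``) has no library counterpart here because library members are unnamed.

```d
import std.meta : staticIndexOf;
import std.sumtype : SumType;

struct None {}                      // stands in for the proposed `:none`
alias S1 = SumType!(None, int);

// Hypothetical helper approximating the proposed `T in S1`.
enum bool holds(S, T) = staticIndexOf!(T, S.Types) != -1;

static assert(holds!(S1, int));
static assert(holds!(S1, None));
static assert(!holds!(S1, float));
```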

To merge two sumtypes together use the pipe operator on the type.

```d
sumtype S1 = :none | int i;
sumtype S2 = :none | long l;
alias S3 = S1 | S2; // :none | int i | long l
```

Or you can expand a sumtype directly into another:

```d
sumtype S1 = :none | int i;
sumtype S2 = :none | S1.expand | long l; // :none | int i | long l
```

When merging, duplicate types and names are not an error; they 
will be combined.
However, if the same name has different types in the two sumtypes, 
this will error.



A sumtype includes all primary properties of types including 
``sizeof``.

It has one new property, ``expand``, which is used to expand a 
sumtype into the one currently being declared.

The trait ``allMembers`` will return a set of strings that denote 
the names of each entry. If an entry has not been given a name by 
the user, a generated name will be provided that can be used to 
access it instead.

Using the trait ``getMember`` or using ``SumType.Member`` will 
return an alias to that entry so that you may acquire its type or 
assign to it.

For the trait ``identifier`` on an alias of a given entry, it 
will return the name for that entry.

An is expression may be used to determine if a given type is a 
sumtype: ``is(T == sumtype)``.



The comparison of two sum types is first done based upon tag, if 
they are not equal that will give the less than and more than 
values.

Should they align, then a match will occur with the behavior for 
the given entry type resulting in the final comparison value.
If a given entry does not have a type, then it will return as 
equal.
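
The comparison rule can be sketched with a hand-rolled tagged struct. Here ``S`` is a hypothetical stand-in for ``sumtype S = :none | int i``, and the tag values 0 and 1 are picked by hand rather than hashed.

```d
// Hypothetical stand-in for `sumtype S = :none | int i`; tags chosen by hand.
struct S
{
    size_t tag;   // 0 = :none, 1 = i
    int i;        // meaningful only when tag == 1

    int opCmp(const S rhs) const
    {
        if (tag != rhs.tag)
            return tag < rhs.tag ? -1 : 1;   // different entries: the tag decides
        if (tag == 0)
            return 0;                        // entries without a type compare equal
        return (i > rhs.i) - (i < rhs.i);    // same entry: compare the payload
    }
}

unittest
{
    assert(S(0, 5) < S(1, 0));               // tag is checked before any value
    assert(S(0, 5).opCmp(S(0, 9)) == 0);     // :none has no value to compare
    assert(S(1, 2) < S(1, 3));
}
```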
Feb 08 2024
Paul Backus <snarwin gmail.com> writes:
On Thursday, 8 February 2024 at 15:42:25 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 Yesterday I mentioned that I wasn't very happy with Walter's 
 design of sum types, at least as per his write-up in his DIP 
 repository.
 I have finally after two years written up an alternative to it, 
 that should cover everything you would expect from such a 
 language feature.
 There are also a couple of key differences with regards to the 
 tag and ABI that will make value type exceptions aka zero cost 
 exceptions work fairly fast.

 [...]

 Latest version: 
 https://gist.github.com/rikkimax/d25c6b2bed8caba008a6967e9e0a7e7c
There are two big-picture issues with this proposal:

1. In addition to sum types, it includes proposals for several unrelated new language features, like a "member of operator" and implicit construction of function arguments and return values.

2. There are several unusual design decisions that are presented without any motivation or rationale. For example:

   - "The tag is a hash of both the fully qualified name of the type and the name."
   - Copy constructors and destructors are stored as fields
   - "The default initialization of a sumtype will always prefer `:none` if present"
   - "Assigning a value to a sum type, will always prefer the currently selected tag"

There's a lot more I could say, but I don't think there's much value in giving more detailed feedback until these big structural issues are addressed.
Feb 08 2024
ryuukk_ <ryuukk.dev gmail.com> writes:
On Thursday, 8 February 2024 at 17:28:32 UTC, Paul Backus wrote:
 There are two big-picture issues with this proposal:

 1. In addition to sum types, it includes proposals for several 
 unrelated new language features, like a "member of operator" 
 and implicit construction of function arguments and return 
 values.
Welcome features imo, needed to make the UX nice and pleasant. Repetition is useless visual noise; name your types and variables properly instead.
Feb 08 2024
ryuukk_ <ryuukk.dev gmail.com> writes:
I'm personally not a fan of having a new keyword, it's words i can 
no longer use in my code, we have `union` and `enum` a sumtype is 
the combination of both, so why not:


```D
union MyTaggedUnion: enum {
     A a,
     B b,
     C c,
}
```

It'd lose your one liner idea however


I am not a fan of using `.match` and not a fan of having `match` 
which is yet another new keyword, why not reuse `switch`?

It's easy to distinguish from C's switch, just check for the 
presence of `case`


```D
switch (value) {
     :A => writeln("This is A: ", value);
     :B => writeln("This is B: ", value);
     else => writeln("something else");
}
```

I would also make a proposal for `switch` as expression, which i 
guess already is possible with your `match` idea?


I like it so far, hopefully things move fast from now on
Feb 08 2024
"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 09/02/2024 7:10 AM, ryuukk_ wrote:
 I'm personally not a fan of having a new keyword, it's words i can no 
 longer use in my code, we have `union` and `enum` a sumtype is the 
 combination of both, so why not:
I suspect we'll be using ``sumtype`` for this regardless. But at least it isn't as bad as Haskell with the ``data`` keyword to designate more or less a sumtype.
 I am not a fan of using `.match` and not a fan of having `match` which is 
 yet another new keyword, why not reuse `switch`?
I am not proposing the match capability, first let's wait and see what Walter proposes in his DConf Online 2024 presentation. If it is a good design that hits the marks then a competing design isn't required to be attempted ;)
Feb 17 2024
Timon Gehr <timon.gehr gmx.ch> writes:
On 2/8/24 16:42, Richard (Rikki) Andrew Cattermole wrote:
 Yesterday I mentioned that I wasn't very happy with Walter's design of 
 sum types, at least as per his write-up in his DIP repository.
 I have finally after two years written up an alternative to it, that 
 should cover everything you would expect from such a language feature.
 There are also a couple of key differences with regards to the tag and 
 ABI that will make value type exceptions aka zero cost exceptions work 
 fairly fast.
 
 ...
 
 ```d
 sumtype Nullable(T) {
      :none,
      T value
 }
 
 sumtype Nullable(T) = :none | T value;
 
 void accept(Nullable!Duration timeout) {}
 
 accept(1.minute);
 accept(:value = 1.minute);
 accept(:none);
 ```
 ...
 
 The first comes from Walter Bright's proposal:
 
 ```d
 sumtype Identifier (TemplateParameters) {
       UDAs|opt Type Identifier = Expression,
       UDAs|opt Type Identifier,
       UDAs|opt MemberOfOperator,
 }
 ```
 ...
 TODO: swap for spec grammar version
 
 The second is short hand which comes from the ML family:
 
 ```d
 sumtype Identifier (TemplateParameters) =  UDAs|opt Type Identifier|opt 
 |  UDAs|opt MemberOfOperator;
 ```
 
 TODO: swap for spec grammar version
 
 For a nullable type this would look like in both syntaxes:
 
 ```d
 sumtype Nullable(T) {
      :none,
      T value
 }
 
 sumtype Nullable(T) = :none | T value;
 ```
 ...
I have to say, I am not a big fan of having only parameterless named constructors as a special case.
 Implicit construction and applies to types in general in any location
including function arguments.
 

 
 A sumtype at runtime is represented by a flexible ABI.
 
 1. The tag [``size_t``]
 2. Copy constructor [``function``]
 3. Destructor [``function``]
 4. Storage [``void[X]``]
 ...
It is not so clear why copy constructor and destructor need to be virtual functions.
 
 The default initialization of a sumtype will always prefer ``:none`` if 
 present, otherwise it is the first entry.
Just do first entry always.
 For the first entry on the short hand syntax it does not support 
 expressions for the default initialization, therefore it will be the 
 default initialized value of that type.
 ...
In this case, I am not sure what initializers inside a sumtype will do.
 Assigning a value to a sum type, will always prefer the currently 
 selected tag.
Alarm bells go off in my head.
 If however the value cannot be coerced into the tag's type, it will then 
 do a match to determine the best candidate based upon the type of the 
 expression.
 
 An example of preferring the currently selected tag:
 
 ```d
 sumtype S = int i | long l;
 
 S s = :i = 2;
 ```
 ...
I would strongly advise to drop this.
 But if we switch to a larger value ``s = long.max;``, this will assign 
 the long instead.
 

 
 A sum type cannot have the type state of null.
 ...
I am not sure what that means.

 
 A sumtype which is a subset of another, will be assignable.
 
 ```d
 sumtype S1 = :none | int;
 sumtype S2 = :none | int | float;
 
 S1 s1;
 S2 s2 = s1;
 ```
 ...
This seems like a strange mix of nominal and structural typing.
 This covers other scenarios like returning from a function or an 
 argument to a function.
 
 To remove a possible entry from a sumtype you must perform a match (which 
 is not being proposed here):
 
 ```d
 sumtype S1 = :none | int;
 sumtype S2 = :none | int | float;
 
 S1 s1;
 S2 s2 = s1;
 
 s2.match {
      (float) => assert(0);
      (default val) s1 = val;
 }
 ```
 
 To determine if a type is in the set:
 
 ```d
 sumtype S1 = :none | int;
 
 pragma(msg, int in S1); // true
 pragma(msg, :none in S1); // true
 pragma(msg, "none" in S1); // true
 ```
 ...
I think a priori here you will have an issue with parsing.
 To merge two sumtypes together use the pipe operator on the type.
 
 ```d
 sumtype S1 = :none | int i;
 sumtype S2 = :none | long l;
 alias S3 = S1 | S2; // :none | int i | long l
 ```
 ...
The flattening behavior is unintuitive.
 Or you can expand a sumtype directly into another:
 
 ```d
 sumtype S1 = :none | int i;
 sumtype S2 = :none | S1.expand | long l; // :none | int i | long l
 ```
 
 When merging, duplicate types and names are not an error, they will be 
 combined.
 Although if two names have different types this will error.
 ...
Again this mixes nominal and structural typing in a way that seems unsatisfying. Note that different struct types do not become assignable just because they share member types and names.
Feb 08 2024
ryuukk_ <ryuukk.dev gmail.com> writes:
On Thursday, 8 February 2024 at 19:26:37 UTC, Timon Gehr wrote:
 ```d
 sumtype S = int i | long l;
 
 S s = :i = 2;
 ```
 ...
I would strongly advise to drop this.
I agree, i'd make it an error and ask user to be explicit
 A sumtype which is a subset of another, will be assignable.
 
 ```d
 sumtype S1 = :none | int;
 sumtype S2 = :none | int | float;
 
 S1 s1;
 S2 s2 = s1;
 ```
 ...
This seems like a strange mix of nominal and structural typing.
 This covers other scenarios like returning from a function or 
 an argument to a function.
 
 To remove a possible entry from a sumtype you must perform a 
 match (which is not being proposed here):
 
 ```d
 sumtype S1 = :none | int;
 sumtype S2 = :none | int | float;
 
 S1 s1;
 S2 s2 = s1;
 
 s2.match {
      (float) => assert(0);
      (default val) s1 = val;
 }
 ```
 
 To determine if a type is in the set:
 
 ```d
 sumtype S1 = :none | int;
 
 pragma(msg, int in S1); // true
 pragma(msg, :none in S1); // true
 pragma(msg, "none" in S1); // true
 ```
 ...
I think a priori here you will have an issue with parsing.
 To merge two sumtypes together use the pipe operator on the 
 type.
 
 ```d
 sumtype S1 = :none | int i;
 sumtype S2 = :none | long l;
 alias S3 = S1 | S2; // :none | int i | long l
 ```
 ...
The flattening behavior is unintuitive.
 Or you can expand a sumtype directly into another:
 
 ```d
 sumtype S1 = :none | int i;
 sumtype S2 = :none | S1.expand | long l; // :none | int i | 
 long l
 ```
 
 When merging, duplicate types and names are not an error, they 
 will be combined.
 Although if two names have different types this will error.
 ...
Again this mixes nominal and structural typing in a way that seems unsatisfying. Note that different struct types do not become assignable just because they share member types and names.
I agree, multiple ways to do the same thing will lead to confusion
Feb 09 2024
"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 09/02/2024 8:26 AM, Timon Gehr wrote:
 On 2/8/24 16:42, Richard (Rikki) Andrew Cattermole wrote:


 A sum type cannot have the type state of null.
 ...
I am not sure what that means.
It comes from type state theory. What it means in the context of D is: ``s is null`` doesn't exist. It also removes the temptation that the entry is considered for the null check. https://en.wikipedia.org/wiki/Typestate_analysis


 A sumtype which is a subset of another, will be assignable.

 ```d
 sumtype S1 = :none | int;
 sumtype S2 = :none | int | float;

 S1 s1;
 S2 s2 = s1;
 ```
 ...
This seems like a strange mix of nominal and structural typing.
It is a mix yes. It is based around the entries being a set. Something almost all D programmers will be familiar with and should be easily explainable.
 This covers other scenarios like returning from a function or an 
 argument to a function.

 To remove a possible entry from a sumtype you must perform a match 
 (which is not being proposed here):

 ```d
 sumtype S1 = :none | int;
 sumtype S2 = :none | int | float;

 S1 s1;
 S2 s2 = s1;

 s2.match {
      (float) => assert(0);
      (default val) s1 = val;
 }
 ```

 To determine if a type is in the set:

 ```d
 sumtype S1 = :none | int;

 pragma(msg, int in S1); // true
 pragma(msg, :none in S1); // true
 pragma(msg, "none" in S1); // true
 ```
 ...
I think a priori here you will have an issue with parsing.
I don't know, `in` expressions already exist, this would be a different version that is differentiated at semantic analysis. I wouldn't be surprised if it already parses with my member of operator PR.
 Or you can expand a sumtype directly into another:

 ```d
 sumtype S1 = :none | int i;
 sumtype S2 = :none | S1.expand | long l; // :none | int i | long l
 ```

 When merging, duplicate types and names are not an error, they will be 
 combined.
 Although if two names have different types this will error.
 ...
Again this mixes nominal and structural typing in a way that seems unsatisfying. Note that different struct types do not become assignable just because they share member types and names.
A struct is not a set; a sumtype is, and with that this design is based upon it being a set in terms of options.
Feb 17 2024
Kagamin <spam here.lot> writes:
Is the `Result` type supposed to be sumtype? Is `Result` on top 
level supposed to be something like `string | UnicodeException | 
ErrnoException | IOException | SocketException | 
PostgresqlException | SqliteException`? That would be beyond 
attribute soup.
Feb 16 2024
"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 16/02/2024 9:19 PM, Kagamin wrote:
 Is the `Result` type supposed to be sumtype? Is `Result` on top level 
 supposed to be something like `string | UnicodeException | 
 ErrnoException | IOException | SocketException | PostgresqlException | 
 SqliteException`? That would be beyond attribute soup.
Where did you get ``Result`` from? None of my examples use that identifier, although Walter's does.
Feb 16 2024
Kagamin <spam here.lot> writes:
I refer to the idea of implementation of error handling with 
return types, possibly nogc. Is it not supposed to involve 
sumtypes somehow?
Feb 16 2024
"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 16/02/2024 10:19 PM, Kagamin wrote:
 I refer to the idea of implementation of error handling with return 
 types, possibly nogc. Is it not supposed to involve sumtypes somehow?
If there is no language-level support for it, then yes, you will need to declare the sumtype explicitly. If there is, which my proposal for value type exceptions provides, inference or `` throws(Exception, MyException)`` will do it, without the need for a sumtype declaration. Note that with language-level support, throw and catch would automatically add to and remove from the set, so sumtypes are only of note if you catch all.
Feb 16 2024
Dukc <ajieskola gmail.com> writes:
On Thursday, 8 February 2024 at 15:42:25 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 Yesterday I mentioned that I wasn't very happy with Walter's 
 design of sum types, at least as per his write-up in his DIP 
 repository.
 I have finally after two years written up an alternative to it, 
 that should cover everything you would expect from such a 
 language feature.
 There are also a couple of key differences with regards to the 
 tag and ABI that will make value type exceptions aka zero cost 
 exceptions work fairly fast.
Thanks for the writeup. I read both DIPs. Honestly, both of them need improvement IMO. At present state, I prefer Walter's DIP, mainly because the details there are better nailed down.

We don't want this special case for pointers - or at least it needs to be much, much more refined before it carries its weight. If I have `sumtype S { a, int* b }`, `S.a == S.b(null);`, right? Well, why doesn't the DIP say the same should happen with `sumtype S { a, Object b } `? Even more interesting case, `sumtype S { a, b, c, d, bool e}`. A boolean has 254 illegal bit patterns - shouldn't they be used for the tag in this case? And what happens with `sumtype S {a, int* b, int* c}`? Since we need space for a separate tag anyway, does it make sense for null `b` to be equal to `a`?

The proposed special case doesn't help much. If one wants a pointer and a special null value, one can simply use a pointer. On the other hand, one might want a pointer AND a separate tag value. To accomplish that, the user will have to either put the 0 value to the end or do something like `sumtype S {int[0] a, int* b}`. Certainly doable, but it's a special case with no good reason.

The query expression is not a good idea. This introduces new syntax that isn't consistent with rest of the language. Instead, I propose that each sumtype has a member function `has`, that returns a DRuntime-defined nested struct with an opDispatch defined for querying the tag:

```D
sumtype Sum {int a, float b, dchar c}

auto sum = Sum.b(2.5);
assert(!sum.has.a);
assert(sum.has.b);
assert(!sum.has.c);
```

Alternatively, we can settle for simply providing a way for the user to get the tag of the sumtype. Then he can use that tag as he'd use it in case of a regular enum. In fact we will want to provide tag access in any case, because the sum type is otherwise too hard to use in `switch` statements.

Like Timon said, the types proposed don't seem to know whether they are supposed to be a unique type.

Consider that any tuple can be used to initialise part of another tuple: `Tuple(int, int, char, char)` can be initialised with `tuple(5, tuple(10, 'x').expand, '\n')`. It makes sense - tuples are defined by their contents and beyond that have no identity of their own.

However, there are excellent reasons why you can't do `std.datetime.StopWatch(999l.nullable.expand, 9082l)`. You aren't supposed to just declare any random bool and two longs as stopwatches just because their internal representation happens to be that. Structs are not just names for tuples, they're independent types that shouldn't be implicitly mixable unless the struct author explicitly declares so.

By saying that a sumtype is always implicitly convertible to another sumtype that can structurally hold the same values, you're making it the tuple of sumtypes. If the user wants to protect the details, he must put it inside a struct or an union. But this feels wrong:

```D
struct MySumType
{
    sumtype Impl = int a | float b | dchar c;
    Impl impl;
}
```

Why do I need to invent three names for this? If I want to define a tuple type that doesn't mix/match freely, I need just one name for the struct I use for that.

If you insist on this implicit conversion thing, I propose that sum types don't have names by default. Instead, they would become part of type declaration syntax. `void` would be the type for members with no values beside the tag, and array indexes would be used for getting the members:

```D
double | float sumTypeInstance = 3.4;
alias SumTypeMixable = int | float | dchar;
struct SumTypeUnmixable
{
    short | wchar | ubyte[2] members;
    alias asShort = members[0];
    alias asWchar = members[1];
    alias asBytePair = members[2];
}
```

Then again, the problem would be: how do you name the members this way? Maybe it can work with udas. `double a | float b | :c sumTypeInstance` could be rewritten to ` memberName(0, "a") memberName(1, "b") memberName(2, "c") double | float | void sumTypeInstance`.

The compiler would check for those udas of the symbol when accessing members via name, and also propagate udas of an alias to any declaration done using it. I suspect this rabbit hole goes a bit too deep though:

```D
alias Type1 = int a | float b;
alias Type2 = int b | float a;
// What would be the member names of this? Sigh.
auto sumtype = [Type1.init, Type2.init];
```

So okay, I don't have very good ideas. Maybe we should just require putting the sum type inside another type if naming is desired.

There is more I could say, on both of these DIPs but I've used a good deal of time on this post already. Maybe I'll do some more another time.
Feb 16 2024
"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 17/02/2024 3:34 AM, Dukc wrote:
 Thanks for the writeup. I read both DIPs. Honestly, both of them need 
 improvement IMO. At present state, I prefer Walter's DIP, mainly because 
 the details there are better nailed down.
Mine isn't nailed down, because it isn't worth the effort if Walter won't even reply to this thread ;)
 By saying that a sumtype is always implicitly convertible to another 
 sumtype that can structurally hold the same values, you're making it 
 the tuple of sumtypes. If the user wants to protect the details, he 
 must put it inside a struct or an union.
What I tried to do was to make it a set. Some places you want a set to merge, some you don't.

```d
sumtype S1 = int;
sumtype S2 = int | float;

S1 s1 = 1;
S2 s2 = s1;
```

That to me makes sense, both sets contain int, it'll be a common operation that people will want to do.

So the question I have is why would this not be appropriate behavior, and instead should error?

More importantly, if you did want to do it, how do you do it without doing a match, then assign the value? Especially when all the compiler would need to do is codegen a blit (or with copy constructor call if provided).
Feb 16 2024
Dukc <ajieskola gmail.com> writes:
On Saturday, 17 February 2024 at 02:46:26 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 What I tried to do was to make it a set. Some places you want a 
 set to merge, some you don't.

 ```d
 sumtype S1 = int;
 sumtype S2 = int | float;

 S1 s1 = 1;
 S2 s2 = s1;
 ```

 That to me makes sense, both sets contain int, it'll be a 
 common operation that people will want to do.

 So the question I have is why would this not be appropriate 
 behavior, and instead should error?
It does not have to error, it makes sense per se. It just mixes poorly with having to name the sum type. Consider enums:

```D
enum E1
{
    a,
    e = 4
}

enum E2
{
    a,
    d = 3,
    e,
    f
}
```

You can't implicitly assign an instance of `E1` to `E2` either even though all legal members of E1 are representable in E2. On the other hand, if you just declared two bunches of `enum int`s, you could exchange them in the same variable.

In the same way, if I have to give a type a name, I'd expect it won't implicitly convert to anything. If the sum types were, by default, declared without a type name then I wouldn't be bothered.
 More importantly, if you did want to do it, how do you do it 
 without doing a match, then assign the value? Especially when 
 all the compiler would need to do is codegen a blit (or with 
 copy constructor call if provided).
S2 can well accept an assignment from an `int` (or `float`). It's assigning `S1` directly, without explicitly fetching the `int` from inside that I'm criticising. In the source code fetching that `int` from `S1` is going to require a match (like any use of a sum type outside unsafe pointer casts), but since there is only one member the compiler can optimise that out.
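As a rough illustration of "fetch the `int`, then assign", here is how the explicit route looks with the existing library type `std.sumtype.SumType`. This is not the proposed `sumtype` syntax, and the helper name `widen` is hypothetical.

```d
import std.sumtype : SumType, match;

// Library stand-ins for the proposed `sumtype S1 = int;` and `sumtype S2 = int | float;`.
alias S1 = SumType!int;
alias S2 = SumType!(int, float);

// Hypothetical helper: fetch the payload out of an S1 and construct an S2 from it.
S2 widen(S1 s)
{
    return s.match!((int i) => S2(i));
}

void main()
{
    auto s1 = S1(1);
    S2 s2 = widen(s1);

    // The int handler comes first so that the int member is not
    // picked up by the float handler through implicit conversion.
    assert(s2.match!(
        (int i) => i == 1,
        (float f) => false
    ));
}
```

Since `S1` has a single member, the `match` in `widen` has exactly one branch, which is the case where the dispatch can be optimised away.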
Feb 19 2024
prev sibling next sibling parent IchorDev <zxinsworld gmail.com> writes:
On Thursday, 8 February 2024 at 15:42:25 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 Yesterday I mentioned that I wasn't very happy with Walter's 
 design of sum types, at least as per his write-up in his DIP 
 repository.
 I have finally after two years written up an alternative to it, 
 that should cover everything you would expect from such a 
 language feature.
 There are also a couple of key differences with regards to the 
 tag and ABI that will make value type exceptions aka zero cost 
 exceptions work fairly fast.
I am pretty pleased with both of these DIPs. The syntax of the sum types could be tweaked a little:

1. I think that the new keyword should be avoided: `sumtype` => `enum union`/`case union` (or similar)
2. The `:none` in the sum type's declaration is really odd... it seems to reference itself from within its own declaration? Why not just use `void`?
3. Why comma-separation between members? This is a union-like type, so use semicolons.

So, all in all my suggestions would look like this:

```d
case union Nullable(T){
    void none;
    T value;
}
```
Mar 03 2024
prev sibling parent reply cc <cc nevernet.com> writes:
On Thursday, 8 February 2024 at 15:42:25 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 sum types
In D there is a case where a variable can be declared in a condition, e.g. `if (auto a = someFunc) a.doWhatever;`. Could similar logic like this work?

```d
sumtype Foo = int | float | string;
Foo foo;
foo = 2;
assert(foo is int);
assert(foo !is float);

void barFloat(float) {}
Foo getPi() => {
    if (userIsBaker)
        return "apple";
    return 3.14f;
}
foo = getPi();
barFloat(foo); // Illegal, use:
if (foo is float) { // inside this scope, accesses to foo treat it as float
    barFloat(foo); // ok now!
} else if (foo is string) {
    writefln("Sure could go for some %s PIE...", foo.toUpper);
}
writefln("handy foo debugger: %s", foo); // template/trait magic would make this ok though and format to library standards.
static if (is(typeof(arg) == sumtype)) ... etc
```
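For comparison, something close to this can already be written with the library type `std.sumtype.SumType`, where each `match` handler plays the role of the narrowed scope. This is a sketch of the library approach, not of the proposed language feature. The `userIsBaker` parameter and the `describe` helper are assumptions made so the example is self-contained.

```d
import std.format : format;
import std.sumtype : SumType, match;
import std.uni : toUpper;

alias Foo = SumType!(int, float, string);

// userIsBaker is passed in as a parameter here so the sketch stands alone.
Foo getPi(bool userIsBaker)
{
    if (userIsBaker)
        return Foo("apple");
    return Foo(3.14f);
}

// Hypothetical helper: inside each handler the value has its concrete type.
string describe(Foo foo)
{
    return foo.match!(
        // int must come before float, otherwise the float handler
        // would also accept int through implicit conversion.
        (int i) => format("int: %d", i),
        (float f) => format("float: %.2f", f),
        (string s) => format("Sure could go for some %s PIE...", s.toUpper)
    );
}

void main()
{
    assert(describe(Foo(2)) == "int: 2");
    assert(describe(getPi(false)) == "float: 3.14");
    assert(describe(getPi(true)) == "Sure could go for some APPLE PIE...");
}
```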
Mar 07 2024
parent "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
On 08/03/2024 12:59 AM, cc wrote:
 On Thursday, 8 February 2024 at 15:42:25 UTC, Richard (Rikki) Andrew 
 Cattermole wrote:
 sum types
 In D there is a case where a variable can be declared in a condition, 
 e.g. `if (auto a = someFunc) a.doWhatever;`. Could similar logic like 
 this work?

 ```d
 sumtype Foo = int | float | string;
 Foo foo;
 foo = 2;
 assert(foo is int);
 assert(foo !is float);

 void barFloat(float) {}
 Foo getPi() => {
     if (userIsBaker)
         return "apple";
     return 3.14f;
 }
 foo = getPi();
 barFloat(foo); // Illegal, use:
 if (foo is float) { // inside this scope, accesses to foo treat it as float
     barFloat(foo); // ok now!
 } else if (foo is string) {
     writefln("Sure could go for some %s PIE...", foo.toUpper);
 }
 writefln("handy foo debugger: %s", foo); // template/trait magic would make this ok though and format to library standards.
 static if (is(typeof(arg) == sumtype)) ... etc
 ```
It could be made to work, yes. However, for matching, I'm waiting on Walter before having opinions about it.
Mar 07 2024