
digitalmars.D - Prototype of Ownership/Borrowing System for D

reply Walter Bright <newshound2 digitalmars.com> writes:
https://github.com/dlang/dmd/pull/10586

It's entirely opt-in by adding the `@live` attribute on a function, and the
implementation is pretty much contained in one module, so it doesn't disrupt the
rest of the compiler.
Nov 19 2019
next sibling parent reply mipri <mipri minimaltype.com> writes:
On Wednesday, 20 November 2019 at 04:59:37 UTC, Walter Bright 
wrote:
 https://github.com/dlang/dmd/pull/10586

 It's entirely opt-in by adding the `@live` attribute on a 
 function, and the implementation is pretty much contained in 
 one module, so it doesn't disrupt the rest of the compiler.
dmd and phobos sure do compile fast! For anyone else:

https://wiki.dlang.org/Starting_as_a_Contributor#Building_from_source

as a bash script:

    mkdir ~/dtest
    cd ~/dtest
    git clone https://github.com/WalterBright/dmd.git
    cd dmd
    git checkout ob
    make -f posix.mak -j8 AUTO_BOOTSTRAP=1
    cd ..
    git clone https://github.com/dlang/phobos
    cd phobos
    make -f posix.mak -j8

and now you can compile with ~/dtest/dmd/generated/linux/release/64/dmd

The longest part of this process is the 'git clone' so why not?

No need to -preview= anything.

    @live: // doesn't add @live to following functions.

I tried writing some bad code and got some nice errors that made sense. This is the first error that surprises me (someone who's never learned Rust or dealt with a borrow checker):

    import std;
    import core.stdc.stdlib: malloc;

    struct Person {
        uint age;
        uint karma;
        char[0] name;
    }

    @live void birthday(Person* p) {
        p.age++;
    } // Error: variable x4.omg.p is left dangling at return

    void main() {
        auto p = cast(Person*)malloc(Person.sizeof + 10);
        birthday(p);
        p.name.ptr[0] = 'B';
        writeln(p);
    }

With linear types in ATS there's a special syntax to suggest that birthday is giving ownership back to caller on return. But without that... this of course works:

    @live Person* birthday(Person* p) {
        p.age++;
        return p;
    }
    ...
        p = birthday(p);
Nov 19 2019
next sibling parent mipri <mipri minimaltype.com> writes:
On Wednesday, 20 November 2019 at 06:57:55 UTC, mipri wrote:
 With linear types in ATS there's a special syntax to
 suggest that birthday is giving ownership back to caller
 on return. But without that... this of course works:

    @live Person* birthday(Person* p) {
        p.age++;
        return p;
    }
    ...
        p = birthday(p);
As does this:

    @live void birthday(scope Person* p) {
        p.age++;
    }

No new syntax needed because it was already present :)
Nov 19 2019
prev sibling next sibling parent reply IGotD- <nise nise.com> writes:
On Wednesday, 20 November 2019 at 06:57:55 UTC, mipri wrote:
   void main() {
       auto p = cast(Person*)malloc(Person.sizeof + 10);
       birthday(p);
       p.name.ptr[0] = 'B';
       writeln(p);
   }
If you would make main @live, would the compiler automatically put in a call to free for variable 'p' at the end of main?
Nov 20 2019
parent reply mipri <mipri minimaltype.com> writes:
On Wednesday, 20 November 2019 at 18:32:42 UTC, IGotD- wrote:
 On Wednesday, 20 November 2019 at 06:57:55 UTC, mipri wrote:
   void main() {
       auto p = cast(Person*)malloc(Person.sizeof + 10);
       birthday(p);
       p.name.ptr[0] = 'B';
       writeln(p);
   }
If you would make main @live, would the compiler automatically put in a call to free for variable 'p' at the end of main?
How I read it is that @live doesn't know anything about malloc or free (except what it knows by their signatures: that malloc returns a pointer and that free takes ownership of a pointer). A glance through the commits seems to support this; I don't see references to either function outside of unit tests, and the docs mention that there's no protection against using the wrong allocator for a pointer, like

    free_allocator2(malloc_allocator1());

This compiles without error, for example:

    import std;
    import core.stdc.stdlib: malloc;

    struct Person {
        uint age;
        uint karma;
        char[0] name;
    }

    @live void birthday(scope Person* p) {
        p.age++;
    }

    void consume(Person* p) { }

    @live void main() {
        auto p = cast(Person*)malloc(Person.sizeof + 10);
        birthday(p);
        writeln(p.age);
        consume(p);
    }

Where valgrind points out the use of undefined memory and the leaked memory. @live just cares that consume() took ownership of the pointer, even if consume() is not a responsible owner.

Of course you get an error if you try to add @live to consume().

@live isn't a superset of @safe either, since it permits the cast.
Nov 20 2019
parent Walter Bright <newshound2 digitalmars.com> writes:
On 11/20/2019 10:52 AM, mipri wrote:
 How I read it is that @live doesn't know anything about malloc or
 free (except what it knows by their signatures: that malloc returns
 a pointer and that free takes ownership of a pointer).
That is correct. It only checks that the pointer value gets "consumed" exactly once. Passing it to another function that takes ownership of the pointer value counts as "consuming" it. What that function does with the pointer value is not the concern of the @live function.
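A minimal sketch of the "consumed exactly once" rule, assuming a compiler with the O/B checker enabled. The helper names makeInt, peek and release are made up for illustration; only malloc and free are real library calls.

```d
import core.stdc.stdlib : malloc, free;
import std.stdio : writeln;

// Not @live: hypothetical helper that hides the cast and the null check.
int* makeInt(int value)
{
    auto p = cast(int*) malloc(int.sizeof);
    assert(p !is null, "out of memory");
    *p = value;
    return p;
}

// A `scope` parameter is only borrowed; the caller remains the owner.
@live int peek(scope const(int)* p)
{
    return *p;
}

// A plain pointer parameter takes ownership, so it must be consumed here.
@live void release(int* p)
{
    free(p);
}

@live void main()
{
    int* p = makeInt(42); // main owns p
    writeln(peek(p));     // borrowed; main still owns p afterwards
    release(p);           // ownership moves to release(): consumed once
    // release(p);        // a second consumption is what the checker should reject
}
```

Uncommenting the second release(p) is the kind of double consumption the checker is meant to catch; whether release() itself frees, leaks or stores the pointer is outside what the caller's check looks at.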
 Where valgrind points out the use of undefined memory and the
 leaked memory. @live just cares that consume() took ownership of
 the pointer, even if consume() is not a responsible owner.
 
 Of course you get an error if you try to add @live to consume().
 
 @live isn't a superset of @safe either, since it permits the cast.
Exactly right.
Nov 20 2019
prev sibling parent TheGag96 <thegag96 gmail.com> writes:
On Wednesday, 20 November 2019 at 06:57:55 UTC, mipri wrote:
snip
Wow, I never even attempted to compile dmd before despite how long I've hung around here... Never really appreciated just how fast compilation speed is. The entire compiler finishes in 9.4 seconds. Puts C/++-based projects to shame...
Nov 20 2019
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
On Wed, Nov 20, 2019 at 3:00 PM Walter Bright via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 https://github.com/dlang/dmd/pull/10586

 It's entirely opt-in by adding the `@live` attribute on a function, and the
 implementation is pretty much contained in one module, so it doesn't disrupt the
 rest of the compiler.
I haven't read thoroughly yet, although I have been following along the way and understand the goals... but I really can't not say straight up that I think `@live` is very upsetting to me.

I hate to bike-shed, but at face value `@live` is a very unintuitive name. It offers me no intuition what to expect, and I have at no point along this process had any idea what it means or why you chose that word, and I think that's an immediate red flag.

Are you really certain there's no way to do this without adding yet more attributes? It would be better if an attribute was not required for this... we're already way overloaded in that department.

Timon appeared to have a competing proposal which didn't add an attribute. I never saw your critique of his work, how do your relative approaches compare?
Nov 19 2019
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/19/2019 11:49 PM, Manu wrote:
 I haven't read thoroughly yet, although I have been following along
 the way and understand the goals... but I really can't not say
 straight up that I think `@live` is very upsetting to me.
 I hate to bike-shed, but at face value `@live` is a very unintuitive
 name. It offers me no intuition what to expect, and I have at no point
 along this process had any idea what it means or why you chose that
 word, and I think that's an immediate red flag.
The "live" refers to the data flow analysis which discovers which pointers are "live" or "dead" at any point in the flow graph. This is critical for what O/B is trying to do.

`@ownerBorrow` just seems a little awkward :-) Andrei proposed `@live`, and I like it. It's short, sweet, and sounds good.
 Are you really certain there's no way to do this without adding yet
 more attributes?
We'll never be able to compile C-like code if we force an O/B system on all the code. There has to be a way to distinguish, like what `pure` does. D would be unusable if everything had to be `pure`. My understanding of O/B is you're going to have to redesign code and data structures to use it effectively. I.e. it'll break everything. Rust has a powerful enough marketing machine to convince people that redesigning your programs is a Good Thing (tm) and perhaps it is, but we don't have the muscle to do that.
 It would be better if an attribute was not required
 for this... we're already way overloaded in that department.
 Timon appeared to have a competing proposal which didn't add an
 attribute. I never saw your critique of his work, how do your relative
 approaches compare?
I don't have a good understanding of Timon's work yet.
Nov 20 2019
next sibling parent reply Jab <jab_293 gmall.com> writes:
On Wednesday, 20 November 2019 at 08:19:46 UTC, Walter Bright 
wrote:
 Are you really certain there's no way to do this without 
 adding yet
 more attributes?
We'll never be able to compile C-like code if we force an O/B system on all the code. There has to be a way to distinguish, like what `pure` does. D would be unusable if everything had to be `pure`. My understanding of O/B is you're going to have to redesign code and data structures to use it effectively. I.e. it'll break everything. Rust has a powerful enough marketing machine to convince people that redesigning your programs is a Good Thing (tm) and perhaps it is, but we don't have the muscle to do that.
An attribute you put on every function just isn't the greatest. If you do fix "@live:" and you can put it at the start of the file, then how do you disable it for a section of code you don't want it on? This is the problem with @nogc as well and possibly some of the other attributes. This leads to every function being individually marked as such as needed. This doesn't really add on to the problem, but it just introduces a new feature with the same problem as those before it.

I don't see why this can't just be attached to @safe, if this is a feature to prevent memory corruption. That's exactly what @safe is for. If you're porting C code you aren't going to be using @safe anyways. Just as you wouldn't be using @live either as you'd effectively have to completely rewrite your code.
Nov 20 2019
next sibling parent Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:
On Wednesday, 20 November 2019 at 19:14:52 UTC, Jab wrote:
 new feature with the same problem as those before it. I don't 
 see why this can't just be attached to @safe, if this is a 
 feature to prevent memory corruption. That's exactly what @safe 
 is for. If you're porting C code you aren't going to be using 
 @safe anyways. Just as you wouldn't be using @live either as 
 you'd effectively have to completely rewrite your code.
Yes, something like "@live" does not really work unless it is a default you occasionally try to escape from. So it seems like it would be better to have "@nolive" instead of "@live", but maybe I don't understand what it is meant to enable yet.

I think @nogc is different, though, you should never be able to escape from that. @nogc is there so that you can ship executables with a bare bones runtime. It is basically available so that both GC and non-GC programs can share libraries, IMO.
Nov 20 2019
prev sibling parent reply mipri <mipri minimaltype.com> writes:
On Wednesday, 20 November 2019 at 19:14:52 UTC, Jab wrote:
 I don't see why this can't just be attached to @safe, if this 
 is a feature to prevent memory corruption. That's exactly what 
 @safe is for.
Well, @live enables a static check, with no runtime cost, but adding @safe to code also makes it retain bounds checks by default. @live you might want to just place everywhere you can, at a cost only to programmer time, but if you add @safe everywhere you might then want to remove bounds checks also from @safe code, which would remove the potentially useful distinction.

Anyway, the comment from the pull request is "This is far enough along that it can be experimented with." For example, if you think @live should be folded into @safe, you can write some code while always trying to put @safe alongside @live, and maybe you'll end up with some results like:

a. I was always able to easily do this, but now I have all these "@safe @live" in my code and frankly any function with only one of these attributes is only going to lack the other attribute due to an oversight, so why not merge these attributes?

b. I found I couldn't add @live to lots of @safe functions for whatever reason. If these attributes were merged I would have to downgrade these functions to @trusted, and in practice I'd lose a ton of safety provisions, just because my code is incompatible with borrow checking.

c. I found that trying to add @safe to @live functions resulted in my breaking portions of the @live function's code out into new @trusted @live functions. If @live was merged with @safe I'd just lose borrow checking of these functions.

An example of the third case follows, where factory() and then newPerson() are results of wanting to add @safe.

    import std;
    import core.stdc.stdlib: malloc;

    struct Person {
        uint age;
        uint karma;
        char[0] _name; // a C-style "flexible array member"

        static Person* factory(size_t size) @live @safe
        in { assert(size >= 4); }
        body {
            Person* newPerson() @live @trusted {
                auto partial = cast(Person*)malloc(Person.sizeof + size);
                partial._name.ptr[0] = '\0';
                return partial;
            }
            auto p = newPerson();
            p.age = uint.init;
            p.karma = uint.init;
            return p;
        }

        // Conditional jump or move depends on uninitialised value(s)
        void identify() @live @trusted {
            _name.ptr[0..4] = "Bob\0";
        }

        string name() @live @trusted {
            return _name.ptr.fromStringz.idup;
        }
    }

    void birthday(scope Person* p) @live @safe {
        p.age++;
    }

    void consume(Person* p) @safe { }

    void main() @safe @live {
        auto p = Person.factory(4);
        birthday(p);
        p.identify();
        writeln(p.name, " ", p.age);
        consume(p);
    }
Nov 20 2019
parent Walter Bright <newshound2 digitalmars.com> writes:
Your experiments with @live show that we need more experience with @live before
we can make a proper determination as to how it fits in with other attributes.
In the meantime, I like it being an independent attribute which makes it easy to
experiment with in various combinations with other attributes.
Nov 20 2019
prev sibling next sibling parent reply Manu <turkeyman gmail.com> writes:
On Wed, Nov 20, 2019 at 6:20 PM Walter Bright via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On 11/19/2019 11:49 PM, Manu wrote:
 I haven't read thoroughly yet, although I have been following along
 the way and understand the goals... but I really can't not say
 straight up that I think `@live` is very upsetting to me.
 I hate to bike-shed, but at face value `@live` is a very unintuitive
 name. It offers me no intuition what to expect, and I have at no point
 along this process had any idea what it means or why you chose that
 word, and I think that's an immediate red flag.
The "live" refers to the data flow analysis which discovers which pointers are "live" or "dead" at any point in the flow graph. This is critical for what O/B is trying to do.

`@ownerBorrow` just seems a little awkward :-) Andrei proposed `@live`, and I like it. It's short, sweet, and sounds good.
 Are you really certain there's no way to do this without adding yet
 more attributes?
We'll never be able to compile C-like code if we force an O/B system on all the code. There has to be a way to distinguish, like what `pure` does. D would be unusable if everything had to be `pure`. My understanding of O/B is you're going to have to redesign code and data structures to use it effectively. I.e. it'll break everything. Rust has a powerful enough marketing machine to convince people that redesigning your programs is a Good Thing (tm) and perhaps it is, but we don't have the muscle to do that.
Is there a path perhaps where you only attribute parameters instead of the whole function? We already have `scope`, `return`, etc on parameters... ?
 It would be better if an attribute was not required
 for this... we're already way overloaded in that department.
 Timon appeared to have a competing proposal which didn't add an
 attribute. I never saw your critique of his work, how do your relative
 approaches compare?
I don't have a good understanding of Timon's work yet.
Okay. I'd like to see a rigorous comparison somewhere with the key differences called out.
Nov 20 2019
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/20/2019 2:57 PM, Manu wrote:
 Is there a path perhaps where you only attribute parameters instead of
 the whole function? We already have `scope`, `return`, etc on
 parameters... ?
Perhaps. Seems rather tedious, though.
Nov 20 2019
parent Walter Bright <newshound2 digitalmars.com> writes:
On 11/20/2019 5:13 PM, Walter Bright wrote:
 On 11/20/2019 2:57 PM, Manu wrote:
 Is there a path perhaps where you only attribute parameters instead of
 the whole function? We already have `scope`, `return`, etc on
 parameters... ?
Perhaps. Seems rather tedious, though.
And less auditable. `scope` and `return` are backed up with compiler checks when they aren't used. `@live` is conceptually different - it adds checks, it does not compensate for added checks.
Nov 20 2019
prev sibling parent reply =?iso-8859-1?Q?Robert_M._M=FCnch?= <robert.muench saphirion.com> writes:
On 2019-11-20 08:19:46 +0000, Walter Bright said:

 On 11/19/2019 11:49 PM, Manu wrote:
 I haven't read thoroughly yet, although I have been following along
 the way and understand the goals... but I really can't not say
 straight up that I think `@live` is very upsetting to me.
 I hate to bike-shed, but at face value `@live` is a very unintuitive
 name. It offers me no intuition what to expect, and I have at no point
 along this process had any idea what it means or why you chose that
 word, and I think that's an immediate red flag.
+1
 The "live" refers to the data flow analysis which discovers which 
 pointers are "live" or "dead" at any point in the flow graph. This is 
 critical for what O/B is trying to do.
So, the name is related to how it's done, not what it gives to the user. That's like selling a thing and stating: "Look this was milled, turned etc." but not telling the customer what it does.
 `@ownerBorrow` just seems a little awkward :-)
Well, why not make it catchy: @borrow

I'm sure 99.99% of all people would get a hint what this is about.

-- 
Robert M. Münch
http://www.saphirion.com
smarter | better | faster
Nov 22 2019
parent reply Doc Andrew <x x.com> writes:
On Friday, 22 November 2019 at 09:10:43 UTC, Robert M. Münch 
wrote:
 So, the name is related to how it's done, not what it gives to 
 the user. That's like selling a thing and stating: "Look this 
 was milled, turned etc." but not telling the customer what it 
 does.

 `@ownerBorrow` just seems a little awkward :-)
Well, why not make it catchy: @borrow
I'm sure 99.99% of all people would get a hint what this is about.
I always kind of thought @live would be a transitional thing until we figured out safe-by-default, and ownership/borrowed pointers would just be "the way."
Nov 22 2019
parent jmh530 <john.michael.hall gmail.com> writes:
On Friday, 22 November 2019 at 14:48:11 UTC, Doc Andrew wrote:
 [snip]

 I always kind of thought @live would be a transitional thing 
 until we figured out safe-by-default, and ownership/borrowed 
 pointers would just be "the way."
While I like the idea of a safe-by-default compiler flag and I'm warming to @live as described by this thread, I'm not sure I'm convinced of the idea of @safe or @live being the default. The nice thing about @system being the default is that anyone from another language can immediately start doing stuff.
Nov 22 2019
prev sibling next sibling parent reply mipri <mipri minimaltype.com> writes:
On Wednesday, 20 November 2019 at 04:59:37 UTC, Walter Bright 
wrote:
 https://github.com/dlang/dmd/pull/10586

 It's entirely opt-in by adding the `@live` attribute on a 
 function, and the implementation is pretty much contained in 
 one module, so it doesn't disrupt the rest of the compiler.
It's doubly-linked lists that are supposed to be impossible without evading the borrow checker, but a linked list seems to be useful if you just want to provoke it:

---
import std.stdio: write, writeln;
import core.stdc.stdlib: malloc, free;

struct List {
    bool empty;
    List* next;
    int value;

    @live void popFront() {
        empty = next.empty;
        value = next.value;
        next = next.next;
    }

    @live List* front() {
        return &this;
    }
}

@live void dump(scope List* p) {
    if (!p.empty) {
        write(p.value, ' ');
        return dump(p.next);
    } else {
        writeln;
    }
}

@live void free_list(List* p) {
    if (!p.empty) {
        free_list(p.next);
        free(p);
    } else {
        free(p);
    }
}

// three undefined states
void incr(scope List* p) {
    while (!p.empty) {
        p.value++;
        p = p.next;
    }
}

// both Undefined and Owner
void dump_negated(scope List* p) {
    foreach (node; *p) {
        write(-node.value, ' ');
    }
    writeln;
}

// all kinds of errors
List* listOf(int[] args...) {
    List* result = cast(List*)malloc(List.sizeof);
    List* node = result;
    foreach (arg; args) {
        node.empty = false;
        node.value = arg;
        node.next = cast(List*)malloc(List.sizeof);
        node = node.next;
    }
    node.empty = true;
    return result;
}

@live void main() {
    List* a = listOf(1, 2, 3);
    dump(a);
    incr(a);
    dump(a);
    dump_negated(a);
    free_list(a);
    // free_list(a); // undefined state and can't be read
}
Nov 20 2019
parent Walter Bright <newshound2 digitalmars.com> writes:
On 11/20/2019 12:37 AM, mipri wrote:
 It's doubly-linked lists that are supposed to be impossible
 without evading the borrow checker, but a linked list seems
 to be useful if you just want to provoke it:
I see you're having fun with it :-)
Nov 20 2019
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 20.11.19 05:59, Walter Bright wrote:
 https://github.com/dlang/dmd/pull/10586
 
 It's entirely opt-in by adding the `@live` attribute on a function, and 
 the implementation is pretty much contained in one module, so it doesn't 
 disrupt the rest of the compiler.
I will look into this on the weekend, but I think it would help if you could answer the following questions:

- What do you want to achieve with borrowing/ownership in D?
- What can already be done with @live? (Ideally with runnable code examples.)
- How will I write a compiler-checked memory safe program that uses varied allocation strategies, including plain malloc, tracing GC and reference counting?

Right now, the only use I can see for @live is as an incomplete and unsound linting tool in @system code. It doesn't make @safe code any more expressive. To me, added expressiveness in @safe code is the whole point of a borrowing scheme.
Nov 20 2019
next sibling parent reply jmh530 <john.michael.hall gmail.com> writes:
On Wednesday, 20 November 2019 at 12:16:29 UTC, Timon Gehr wrote:
 [snip]

 Right now, the only use I can see for @live is as an incomplete 
 and unsound linting tool in @system code. It doesn't make @safe 
 code any more expressive. To me, added expressiveness in @safe 
 code is the whole point of a borrowing scheme.
From here [1], Walter says that "OB will also be turned off in @system code." But I don't see anything about @safe or @system in the changelog/ob.md file.

[1] https://dlang.org/blog/2019/07/15/ownership-and-borrowing-in-d/
Nov 20 2019
parent Walter Bright <newshound2 digitalmars.com> writes:
On 11/20/2019 5:46 AM, jmh530 wrote:
 From here [1], Walter says that "OB will also be turned off in @system code." 
 But I don't see anything about @safe or @system in the changelog/ob.md file.
That was the original plan, making @live an extension of @safe. But in implementing it, I realized it could be used as an independent attribute, and implemented it that way. So yes, you can use it in @system code, and I think it can add value to it, especially code that is a snarl of pointer allocations and frees.
Nov 20 2019
prev sibling next sibling parent reply rikki cattermole <rikki cattermole.co.nz> writes:
On 21/11/2019 1:16 AM, Timon Gehr wrote:
 Right now, the only use I can see for @live is as an incomplete and 
 unsound linting tool in @system code. It doesn't make @safe code any 
 more expressive. To me, added expressiveness in @safe code is the whole 
 point of a borrowing scheme.
You touch upon a very good point. I'm starting to think of @live as a superset of @safe. With @trusted being an escape route. If this is the case then perhaps making all pointers non-null (with asserts) would make sense.
Nov 20 2019
next sibling parent Radu <void null.pt> writes:
On Wednesday, 20 November 2019 at 13:51:34 UTC, rikki cattermole 
wrote:
 On 21/11/2019 1:16 AM, Timon Gehr wrote:
 Right now, the only use I can see for @live is as an 
 incomplete and unsound linting tool in @system code. It 
 doesn't make @safe code any more expressive. To me, added 
 expressiveness in @safe code is the whole point of a borrowing 
 scheme.
You touch upon a very good point. I'm starting to think of @live as a superset of @safe. With @trusted being an escape route. If this is the case then perhaps making all pointers non-null (with asserts) would make sense.
Indeed it looks like @live sits on top of @safe. I would love to see this default to @safe when -preview=borrow (don't think it is implemented ATM) is enabled. I guess the reason for @live is backwards compatibility, but even with this I think using @safe(own), name up to discussion - meaning having @safe accept a parameter, would serve better to limit the attribute soup.
Nov 20 2019
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/20/2019 5:51 AM, rikki cattermole wrote:
 If this is the case then perhaps making all pointers non-null 
 (with asserts) would make sense.
I originally was going to add null to the data flow analysis. But I realized it would be rather useless:

    T* foo();

    T* p = foo(); // is p null or not?

Very quickly, the flow analysis would drop into "dunno if it is null or not" so it just won't be worth much.

In order to make non-null checking actually work, the language semantics would likely need to change to make:

    T*    a non-null pointer
    T?*   an optionally null pointer

or something like that. Two different pointer types would need to exist.

Something like this is orthogonal to what @live is trying to do, so I put it on the shelf for the time being.
Nov 20 2019
next sibling parent reply Doc Andrew <x x.com> writes:
On Wednesday, 20 November 2019 at 22:15:18 UTC, Walter Bright 
wrote:
 I originally was going to add null to the data flow analysis. 
 But I realized it would be rather useless:

   T* foo();

   T* p = foo(); // is p null or not?

 Very quickly, the flow analysis would drop into "dunno if it is 
 null or not" so it just won't be worth much.
You might only have to maintain this "undefined state" for a short period of time:

    T* foo();

    T* p = foo();  //Undefined state
    if(p) {
        //We opened Schrodinger's pointer and know it's not null anymore.
    } else {
        //We know it's null here.
    }
    //Back to undefined state - we shouldn't deref p out here.

And honestly, knowing whether a pointer is in an undefined state is still a very useful piece of information that the flow analysis would provide. Dereferencing a pointer of unknown provenance should be an error in @live code, no different than null.
 In order to make non-null checking actually work, the language 
 semantics would likely need to change to make:

    T*    a non-null pointer
    T?*   an optionally null pointer

 or something like that. Two different pointer types would need 
 to exist.
Maybe. As an alternative to that syntax, consider a pointer with a "notnull" attribute (storage class?) and a corresponding "notnull" function attribute for the return type. I'd expect runtime checks to be inserted in debug mode for the function return and removed in release mode.
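The "checked in debug builds, removed in release builds" behaviour can be approximated today with an `out` contract, since dmd drops contracts when given -release. A minimal sketch, with allocInt as a hypothetical helper; this is only a runtime check on the return value, not the proposed attribute and not flow analysis:

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical helper: the `out` contract stands in for a "notnull" return.
int* allocInt()
out (r; r !is null, "allocInt returned null")
{
    return cast(int*) malloc(int.sizeof);
}

void main()
{
    int* p = allocInt(); // contract is checked on return unless built with -release
    *p = 1;              // callers rely on the contract instead of re-checking
    free(p);
}
```

Unlike a type-level distinction, nothing here stops a caller from later assigning null to p; the contract documents and checks the function boundary only.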
 Something like this is orthogonal to what @live is trying to 
 do, so I put it on the shelf for the time being.
It's orthogonal, but very useful for correctness. I'm anxious to see what comes of it... ...but hopefully not with ?*s scattered everywhere :)
Nov 20 2019
next sibling parent SimonN <eiderdaus gmail.com> writes:
On Thursday, 21 November 2019 at 03:47:21 UTC, Doc Andrew wrote:
    T*    a non-null pointer
    T?*   an optionally null pointer
Yes, optional pointers would be amazing, even though it breaks the syntax in most existing modules that use pointers.
 alternative to that syntax, Consider a pointer with a "notnull" 
 attribute (storage class?) and a corresponding "notnull" 
 function attribute for the return type.
Ideally, the non-nullable pointer is the default, and the nullable pointer should be the oddball that requires explicit annotation. Thus the main benefit of the "notnull" attribute would be preserving existing D, certainly a huge benefit.

Reason: I'd estimate that 80% to 95% of all pointer code can be written with non-nullable pointers. No proof, only a hunch. Many functions already return non-null rather than maybe-null; the type system merely doesn't know it yet. Even if we return a null pointer, then our caller, on receiving the maybe-null pointer from us, tends to immediately check, thus immediately convert to non-nullable, or else return early.

Whenever the common case requires annotation, we foster annotation salad:

    int notnull* foo() const pure @safe nothrow @nogc

With class references, non-nullability is even more common than with pointers. Typically, null class references appear only as member variables during construction of another class:

    class A { /* ... */ }
    class B {
        A a;
        this() {
            /* here is the only time that a is ever null */
            a = new A();
        }
    }
 It's orthogonal, but very useful for correctness.
Yes, it's entirely orthogonal. :-) @live is conceptually different enough from non-nullability, it feels best to solve them separately. @live feels more like @safe instead, and even those turned out independent.

Exciting topic, hard to fit onto existing D codebases.

-- Simon
Nov 20 2019
prev sibling next sibling parent reply Johannes Pfau <nospam example.com> writes:
Am Thu, 21 Nov 2019 03:47:21 +0000 schrieb Doc Andrew:

 On Wednesday, 20 November 2019 at 22:15:18 UTC, Walter Bright wrote:
 I originally was going to add null to the data flow analysis. But I
 realized it would be rather useless:

   T* foo();

   T* p = foo(); // is p null or not?

 Very quickly, the flow analysis would drop into "dunno if it is null or
 not" so it just won't be worth much.
You might only have to maintain this "undefined state" for a short period of time:

    T* foo();

    T* p = foo();  //Undefined state
    if(p) {
        //We opened Schrodinger's pointer and know it's not null anymore.
    } else {
        //We know it's null here.
    }
TypeScript does that. In addition, it disallows dereferencing if the pointer is in the undefined state. So it forces you to check a pointer before you dereference it. I always found that to be quite useful. -- Johannes
Nov 21 2019
next sibling parent Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:
On Thursday, 21 November 2019 at 08:18:56 UTC, Johannes Pfau 
wrote:
 TypeScript does that. In addition, it disallows dereferencing 
 if the pointer is in the undefined state. So it forces you to 
 check a pointer before you dereference it. I always found that 
 to be quite useful.
Yes, but TypeScript is using flow-typing. I guess D could switch to flow-typing somehow, but I am not sure if it is a good idea to do so without doing it in a more wholesome manner.
Nov 21 2019
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/21/2019 12:18 AM, Johannes Pfau wrote:
 TypeScript does that. In addition, it disallows dereferencing if the
 pointer is in the undefined state. So it forces you to check a pointer
 before you dereference it. I always found that to be quite useful.
I would find that to be annoying, as the CPU hardware checks it before dereferencing it, too, for free.
Nov 21 2019
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On Thursday, 21 November 2019 at 11:10:28 UTC, Walter Bright 
wrote:

 I would find that to be annoying, as the CPU hardware checks it 
 before dereferencing it, too, for free.
Why would you wait until runtime when the compiler can do it at compile time? -- /Jacob Carlborg
Nov 21 2019
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/21/2019 4:28 AM, Jacob Carlborg wrote:
 On Thursday, 21 November 2019 at 11:10:28 UTC, Walter Bright wrote:
 
 I would find that to be annoying, as the CPU hardware checks it before 
 dereferencing it, too, for free.
Why would you wait until runtime when the compiler can do it at compile time?
The antecedent said "it forces you to check a pointer before you dereference it"
Nov 21 2019
next sibling parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Friday, 22 November 2019 at 04:55:57 UTC, Walter Bright wrote:
 On 11/21/2019 4:28 AM, Jacob Carlborg wrote:
 Why would you wait until runtime when the compiler can do it 
 at compile time?
The antecedent said "it forces you to check a pointer before you dereference it"
You usually have an escape like "object!.memberfunction()" that tells the compiler that the programmer is certain that he does the right thing.
Nov 21 2019
prev sibling next sibling parent reply mipri <mipri minimaltype.com> writes:
On Friday, 22 November 2019 at 04:55:57 UTC, Walter Bright wrote:
 On 11/21/2019 4:28 AM, Jacob Carlborg wrote:
 On Thursday, 21 November 2019 at 11:10:28 UTC, Walter Bright 
 wrote:
 
 I would find that to be annoying, as the CPU hardware checks 
 it before dereferencing it, too, for free.
Why would you wait until runtime when the compiler can do it at compile time?
The antecedent said "it forces you to check a pointer before you dereference it"
I'd sell it not as "it forces you to test" but "it reliably promotes nullable types to non-nullable types when this is safe"

Kotlin example:

  fun Int.test() = println("definitely not null")
  fun Int?.test() = println("could be null")

  fun main() {
      val n: Int? = 10
      n.test()
      if (n != null)
          n.test()
  }

Output:

  could be null
  definitely not null

So if you test a nullable type for null, everything else in the control path works with a non-nullable type. The effect on the program is that instead of having defensive tests throughout your program, you can have islands of code that definitely don't have to test, and only have the tests exactly where you need them. If you go too far in not having tests, the compiler will catch this as an error.

I've never used TypeScript, but Kotlin's handling of nullable types is nice to the point I'd rather use the billion dollar mistake than an optional type.

IMO what's really exciting about this though is that the compiler smarts that you need for it are *just short* of what you'd need for refinement types, so instead of just

  T?  <-- it could be T or null

you can have

  T(5) <-- this type is statically known to be associated with the number 5.

And functions can be defined so that it's a compile-time error to pass a value to them without, somewhere in the statically known control path, doing something that associates them with 5.

Similarly to how you can't just hand a char[4] to a D function with a signature that's asking for a char[5] - but for any type, for any purpose, and with automatic promotion of a char[???] to char[5] in the same manner that Kotlin promotes Int? to Int

Add a SAT solver, and "this array is associated with a number that is definitely less than whatever this size_t is associated with", and you get compile-time guarantees that you don't need bounds checking for definite control paths in your program. That's cool. ATS does that: it's a compile-time error in ATS to look at argv[2] without checking argc.
Nov 21 2019
next sibling parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Friday, 22 November 2019 at 06:43:01 UTC, mipri wrote:
 I've never used TypeScript, but Kotlin's handling of nullable 
 types
 is nice to the point I'd rather use the billion dollar mistake 
 than
 an optional type.
Kotlin has type inference, but TypeScript has to be able to describe existing JavaScript data-models, so that it can provide stricter typing from existing JavaScript libraries. As a consequence TypeScript has to allow "crazy" mixing of types. Thus a function can have parameters and a return type that is a union of "incompatible" types. To make this work it fully embraces flow-typing, that is: the type is determined by all possible execution paths within a function.

Thus TypeScript has many features to deal with "undefined" (uninitialized or non-existing variable) and "null" (missing object) as you have to deal with those from JavaScript code. And of course, this becomes a very valuable tool when transitioning code to TypeScript from JavaScript, or from a prototype to production code.

The verification language Whiley is following the same core flow-typing principle, but since the type system of Whiley employs an SMT engine it has more potential than TypeScript IMO. See page 8 in http://whiley.org/download/GettingStartedWithWhiley.pdf . (I don't think it is production ready, but interesting nevertheless).
Nov 21 2019
parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Friday, 22 November 2019 at 07:40:30 UTC, Ola Fosheim Grøstad 
wrote:
 to deal with those from JavaScript code. And of course, this 
 becomes a very valuable tool when transitioning code to 
 TypeScript from JavaScript, or from a prototype to production 
 code.
Nnngh, I clearly meant transitioning code from JavaScript to TypeScript. Kinda like how D can be used for transitioning less safe C code to more safe D code.
Nov 21 2019
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
I'm going to suggest that further discussion of null be in a new thread, this 
thread is about the prototype O/B system.
Nov 22 2019
prev sibling parent Doc Andrew <x x.com> writes:
On Friday, 22 November 2019 at 06:43:01 UTC, mipri wrote:
 I'd sell it not as "it forces you to test" but "it reliably
 promotes nullable types to non-nullable types when this is safe"

 Kotlin example:

   fun Int.test() = println("definitely not null")
   fun Int?.test() = println("could be null")

   fun main() {
       val n: Int? = 10
       n.test()
       if (n != null)
           n.test()
   }

 Output:

   could be null
   definitely not null
That's pretty cool.
 So if you test a nullable type for null, everything else in the
 control path works with a non-nullable type. The effect on the
 program is that instead of having defensive tests throughout 
 your
 program, you can have islands of code that definitely don't 
 have to
 test, and only have the tests exactly where you need them. If 
 you
 go too far in not having tests, the compiler will catch this as 
 an
 error.

 I've never used TypeScript, but Kotlin's handling of nullable 
 types
 is nice to the point I'd rather use the billion dollar mistake 
 than
 an optional type.
Yes, after all, pointers are already an "optional" type, where null stands in for None (or whatever).
 IMO what's really exciting about this though is that the 
 compiler
 smarts that you need for it are *just short* of what you'd need 
 for
 refinement types, so instead of just

   T?  <-- it could be T or null

 you can have

   T(5) <-- this type is statically known to be associated with 
 the
            number 5.

 And functions can be defined so that it's a compile-time error 
 to
 pass a value to them without, somewhere in the statically known
 control path, doing something that associates them with 5.

 Similarly to how you can't just hand a char[4] to a D function 
 with
 a signature that's asking for a char[5] - but for any type, for 
 any
 purpose, and with automatic promotion of a char[???] to char[5] 
 in
 the same manner that Kotlin promotes Int? to Int

 Add a SAT solver, and "this array is associated with a number 
 that
 is definitely less than whatever this size_t is associated 
 with",
 and you get compile-time guarantees that you don't need bounds
 checking for definite control paths in your program. That's 
 cool.
 ATS does that: it's a compile-time error in ATS to look at 
 argv[2]
 without checking argc.
Stand by for some more discussion on formal verification... :)

I'm working on a program I'm calling ProveD (for now, at least) to do source-to-source translation using libdparse to generate Why3ML code and verification conditions that can be proven with Why3 and whatever SMT solvers you have installed. I don't have much to show yet, but I should have a PoC that others can hack on in a few months.

A lot of the effort I've spent on it so far is fighting with Ubuntu's broken Why3/Z3/Alt-Ergo/CVC4 packages (pro tip: don't use them) and goofing around with the OCaml package manager OPAM to get everything set up to where I can start running proofs.

I was going to try and get a little further before starting a thread for discussion, but if it comes up as a result of "The null Thread", that might be a good place to start.

-Doc
Nov 22 2019
prev sibling next sibling parent Sebastiaan Koppe <mail skoppe.eu> writes:
On Friday, 22 November 2019 at 04:55:57 UTC, Walter Bright wrote:
 On 11/21/2019 4:28 AM, Jacob Carlborg wrote:
 On Thursday, 21 November 2019 at 11:10:28 UTC, Walter Bright 
 wrote:
 
 I would find that to be annoying, as the CPU hardware checks 
 it before dereferencing it, too, for free.
Why would you wait until runtime when the compiler can do it at compile time?
The antecedent said "it forces you to check a pointer before you dereference it"
I think that is looking at it from the wrong angle. The power of optional types is being able to declare a non-optional type.

I have done a lot of Kotlin programming the last 2 years. The beauty is that when I see a T, I know it cannot be null. I know the compiler knows that, and I know my colleagues know that as well.

If you don't want the check, just use T. If you cannot use T (because you might need it to be null), use T?, and then you need the check. The point is, in the case you end up using T?, you needed the check regardless.

As an example, just like you need to check whether a range isn't empty before you front it, the firstOrNull() function in Kotlin (defined on collections) returns a T?, which needs to be checked before usage. But again, you needed that anyways.
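The range analogy can be sketched in D. This assumes a hypothetical helper named `firstOrNull` built on `std.typecons.Nullable`; it is an illustration only, not an existing Phobos function. The emptiness check happens once inside the helper, and the caller then has to look at `isNull` before using the value.

```d
import std.range.primitives : empty, front;
import std.typecons : Nullable, nullable;

// Hypothetical helper: returns the first element, or an empty Nullable.
Nullable!int firstOrNull(int[] items)
{
    if (items.empty)
        return Nullable!int.init; // no element to hand out
    return nullable(items.front); // safe: emptiness was checked above
}

void main()
{
    int[] none;
    assert(firstOrNull(none).isNull);

    auto head = firstOrNull([7, 8, 9]);
    if (!head.isNull)
        assert(head.get == 7);
}
```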
Nov 21 2019
prev sibling parent reply Johannes Pfau <nospam example.com> writes:
Am Thu, 21 Nov 2019 20:55:57 -0800 schrieb Walter Bright:

 On 11/21/2019 4:28 AM, Jacob Carlborg wrote:
 On Thursday, 21 November 2019 at 11:10:28 UTC, Walter Bright wrote:
 
 I would find that to be annoying, as the CPU hardware checks it before
 dereferencing it, too, for free.
Why would you wait until runtime when the compiler can do it at compile time?
The antecedent said "it forces you to check a pointer before you dereference it"
But only if you don't know for sure already that the pointer can't possibly be null. With non-null types (by default) such situations are rare.

You should really have a look at TypeScript / Kotlin some time. In Javascript, aborting scripts due to null-pointers in untested codepaths has been a main cause of bugs and TypeScript really nicely solved this. I think non-null types + these automated conversions from nullable to non-null are the most important innovations for OOP code in recent years and the feature I miss most when doing OOP in D.

-- Johannes
Nov 22 2019
parent Walter Bright <newshound2 digitalmars.com> writes:
On 11/22/2019 12:08 AM, Johannes Pfau wrote:
 But only if you don't know for sure already that the pointer can't
 possibly be null. With non-null types (by default) such situations are
 rare.
 
 You should really have a look at TypeScript / Kotlin some time. In
 Javascript, aborting scripts due to null-pointers in untested codepaths
 has been a main cause of bugs and TypeScript really nicely solved this. I
 think non-null types + these automated conversions from nullable to
 non-null are the most important innovations for OOP code in recent years and
 the feature I miss most when doing OOP in D.
 
I won't say no, but it is off topic for this thread and there's too much on my plate in the near future to think about that (O/B, and move semantics).
Nov 26 2019
prev sibling parent Johannes Pfau <nospam example.com> writes:
Am Thu, 21 Nov 2019 03:10:28 -0800 schrieb Walter Bright:

 On 11/21/2019 12:18 AM, Johannes Pfau wrote:
 TypeScript does that. In addition, it disallows dereferencing if the
 pointer is in the undefined state. So it forces you to check a pointer
 before you dereference it. I always found that to be quite useful.
I would find that to be annoying, as the CPU hardware checks it before dereferencing it, too, for free.
As Ola mentioned it's most useful as part of a complete flow-typing system and much of the benefit in TypeScript is also caused by having nullable and non-nullable types:

  void foo (Nullable!T value)
  {
      if (value)
          bar(value);
      else
          bar2();
  }

In the call to bar, value is known to be not null, so the type is no longer Nullable!T but T instead. So essentially, this provides a natural way to safely go from nullable to non-null types. Heavy use of non-null types makes sure you don't have to insert if () checks just to satisfy the compiler. In TypeScript the amount of false positives for this error message actually seemed to be quite low.

-- Johannes
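In D today the same shape needs an explicit unwrap. A minimal sketch, assuming `std.typecons.Nullable` as the nullable type and placeholder bodies for `bar` and `bar2` (both made up for illustration): the `isNull` test and the `get` call are two separate steps that a flow-typing compiler would tie together.

```d
import std.typecons : Nullable, nullable;

int bar(int value) { return value * 2; } // placeholder: wants a definite int
int bar2() { return -1; }                // placeholder: fallback without a value

int foo(Nullable!int value)
{
    if (!value.isNull)
        return bar(value.get); // explicit unwrap after the explicit check
    else
        return bar2();
}

void main()
{
    assert(foo(nullable(21)) == 42);
    assert(foo(Nullable!int.init) == -1);
}
```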
Nov 21 2019
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 11/20/2019 7:47 PM, Doc Andrew wrote:
 And honestly, knowing whether a pointer is in an undefined state is still a
very
 useful piece of information that the flow analysis would provide.
 Dereferencing a pointer of unknown provenance should be an error in  live code,
 no different than null.
That's what I considered; there would be so many "cry wolf" false positives it would be useless. Besides, the only way a null pointer dereference could result in memory corruption is if it comes with an offset that is larger than the protected memory zone at address 0. I.e. it's very unlikely. It's so unlikely I've literally never had one (but I've had lots and lots of null pointer seg faults).
Nov 21 2019
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Wednesday, 20 November 2019 at 22:15:18 UTC, Walter Bright 
wrote:
 In order to make non-null checking actually work, the language 
 semantics would likely need to change to make:

    T*    a non-null pointer
    T?*   an optionally null pointer

 or something like that. Two different pointer types would need 
 to exist.
If you're going to go with introducing new meanings like that, why not `&T` for the non-null pointer (to match Rust usage and the implicit comparison with taking the address of a stack variable)? That would make it easier to introduce non-null pointers as opt-in without breaking existing code, and avoid any unintuitive differences with C APIs. [By the way, isn't my bikeshed a lovely colour this season? :-)]
 Something like this is orthogonal to what  live is trying to 
 do, so I put it on the shelf for the time being.
Sure, it makes sense to split the data flow analysis from the question of how any given pointer has its value set. That said, it intuitively feels like a non-null pointer type might want to be implicitly @live. One could avoid imposing that, but it's hard to think of a situation where one would want the one, without the other.

The one thing that Rust does with its references, which I find a little bit _too_ limiting, is the "single mutable reference" constraint. They do this by default in order to avoid various multithreading failure cases, but it blocks some of the easy design options one could use in single-threaded code. _As a default_ that probably makes sense, but it would be good to have an opt-out for code designs that are not going to have those failure cases.
Nov 21 2019
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On Wednesday, 20 November 2019 at 22:15:18 UTC, Walter Bright 
wrote:

 In order to make non-null checking actually work, the language 
 semantics would likely need to change to make:

    T*    a non-null pointer
    T?*   an optionally null pointer

 or something like that. Two different pointer types would need 
 to exist.
That would be really nice. But instead it would be just one pointer type and arbitrary optional types:

  T*       a regular pointer (can't be null)
  T*?      an optional pointer (can point to something or can be empty)
  int      a regular int
  int?     an optional int (can contain an int or can be empty)
  Object   a regular reference type (can't be null)
  Object?  an optional reference (can point to something or can be empty)

Basically there wouldn't be "null" anymore.

-- /Jacob Carlborg
Nov 21 2019
next sibling parent rikki cattermole <rikki cattermole.co.nz> writes:
On 22/11/2019 1:35 AM, Jacob Carlborg wrote:
 On Wednesday, 20 November 2019 at 22:15:18 UTC, Walter Bright wrote:
 
 In order to make non-null checking actually work, the language 
 semantics would likely need to change to make:

    T*    a non-null pointer
    T?*   an optionally null pointer

 or something like that. Two different pointer types would need to exist.
That would be really nice. But instead it would be just one pointer type and arbitrary optional types:

  T*       a regular pointer (can't be null)
  T*?      an optional pointer (can point to something or can be empty)
  int      a regular int
  int?     an optional int (can contain an int or can be empty)
  Object   a regular reference type (can't be null)
  Object?  an optional reference (can point to something or can be empty)

Basically there wouldn't be "null" anymore.

-- /Jacob Carlborg
If we look at @live as a superset of @safe (basically @safe on steroids), which is the way I'm looking at it, all pointers in @live code would have to be non-null if they are alive (so viewable / owned). It simplifies the language somewhat, and allows us to use something like Nullable for optional values. While it is good that we are thinking about it, it can be put off while Walter gets the rest working (as he has been saying in essence).
Nov 21 2019
prev sibling parent reply jmh530 <john.michael.hall gmail.com> writes:
On Thursday, 21 November 2019 at 12:35:03 UTC, Jacob Carlborg 
wrote:
 [snip]

 That would be really nice. But instead it would be just one 
 pointer type and arbitrary  optional types:
 [snip]
What you describe is like Kotlin's approach. The part where you say there would be no null, just empty, is kind of like just renaming null as empty, correct? Kotlin still uses null for T? types.
Nov 21 2019
parent Jacob Carlborg <doob me.com> writes:
On 2019-11-21 14:35, jmh530 wrote:

 What you describe is like Kotlin's approach. The part where you say 
 there would be no null, just empty, is kind of like just renaming null 
 as empty, correct?
Yes, basically.
 Kotlin still uses null for T? types.
Sure, you could have "null" as syntax sugar for representing an empty type. -- /Jacob Carlborg
Nov 21 2019
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/20/2019 4:16 AM, Timon Gehr wrote:
 - What do you want to achieve with borrowing/ownership in D?
I want to prevent the following common issues with pointer code:

1. use after free
2. neglecting to free
3. double free
4. safe casting to immutable
5. safe conversion to/from a shared pointer
 - What can already be done with  live? (Ideally with runnable code examples.)
The test cases included with the PR should give an idea.
 - How will I write a compiler-checked memory safe program that uses varied 
 allocation strategies, including plain malloc,
I'm not sure what clarification you want about plain malloc/free, although there are limitations outlined in ob.md.
 tracing GC
I don't see any value OB adds to tracing GC. A GC is an entirely different solution.
 and reference counting?
The main difficulty (as you pointed out) with RC is holding on to an interior pointer via one reference while another reference frees what it's pointing to. There's been a lot of progress with this with the addition of DIP25, DIP1000, and DIP1012. This further improves it by making the protections transitive.
 Right now, the only use I can see for  live is as an incomplete and unsound 
 linting tool in  system code.
The unsoundness is in dealing with thrown exceptions, which I have some ideas on how to deal with, and conflating different allocators, which I don't have a good idea on.
 It doesn't make  safe code any more expressive. To 
 me, added expressiveness in  safe code is the whole point of a borrowing
scheme.
Technically, live works by adding restrictions, not expressiveness.
Nov 20 2019
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 20.11.19 23:45, Walter Bright wrote:
 On 11/20/2019 4:16 AM, Timon Gehr wrote:
 - What do you want to achieve with borrowing/ownership in D?
I want to prevent the following common issues with pointer code: 1. use after free 2. neglecting to free 3. double free ...
GC prevents those, and those problems cannot appear in @safe code. @live doesn't prevent them at the interface between @live and non-@live code.
 4. safe casting to immutable
 5. safe conversion to/from a shared pointer
 ...
What about user-defined types? What about allowing internal pointers into manually-managed memory to be exposed in safe code?
 
 - What can already be done with  live? (Ideally with runnable code 
 examples.)
The test cases included with the PR should give an idea. ...
Well, they are compiler tests, not use cases.
 
 - How will I write a compiler-checked memory safe program that uses 
 varied allocation strategies, including plain malloc,
I'm not sure what clarification you want about plain malloc/free, although there are limitations outlined in ob.md. ...
I.e., it is not planned that we will be able to write such programs?
 tracing GC
I don't see any value OB adds to tracing GC. A GC is an entirely different solution. ...
The worry is that @live _removes_ value from tracing GC. If every pointer owns its data, how do I express a pointer to GC-owned memory? Do I need to write a "smart" pointer data type that's just a shallow wrapper for a GC pointer? Also, if I do that, how do I make sure different GC-backed pointers don't lend out the same owning pointer at the same time?
 and reference counting?
The main difficulty (as you pointed out) with RC is holding on to an interior pointer via one reference while another reference free's what it's pointing to.
Yup.
 There's been a lot of progress with this with the addition of DIP25, 
 DIP1000, and DIP1012. This further improves it by making the protections 
 transitive.
 ...
As far as I can tell, live doesn't bring us closer to safe RC, because it applies to built-in pointers instead of library-defined smart pointers. I think this is completely backwards. Every owning pointer also needs to know the allocation strategy. Therefore, allowing built-in pointers to own their memory is vastly less useful than allowing library-defined smart pointers to do so.
 
 Right now, the only use I can see for  live is as an incomplete and 
 unsound linting tool in  system code.
The unsoundness is in dealing with thrown exceptions, which I have some ideas on how to deal with,
That's not the only problem. @live code can call non-@live code and obtain pointers from such code. Therefore, @safe @live code is useless, as all accessible pointers mustn't be owning anyway.
 and conflating different allocators, which I 
 don't have a good idea on.
 ...
Do the checks for library-defined smart pointers instead of built-in pointers. Built-in pointers shouldn't care about lifetime nor allocator.
 It doesn't make  safe code any more expressive. To me, added 
 expressiveness in  safe code is the whole point of a borrowing scheme.
Technically, live works by adding restrictions, not expressiveness.
The point of adding restrictions is to gain expressiveness. It's why type systems are a good idea. In this case, the point of borrowing restrictions should be to enable safe code to manipulate interior pointers into manually-managed data structures.
Nov 20 2019
next sibling parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Thursday, 21 November 2019 at 00:59:01 UTC, Timon Gehr wrote:
 allocation strategy. Therefore, allowing built-in pointers to 
 own their memory is vastly less useful than allowing 
 library-defined smart pointers to do so.
Yes, but you could define all builtin pointers as defaulting to borrowing and introduce a syntax for marking a builtin pointer as owning. Then owning library structs could use that syntax to tell the compiler that a pointer field is owning the resource. If you also introduce a syntax for marking other types as borrowing/owning then you could use the same analysis for file handles, texture handles etc.
Nov 21 2019
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/20/2019 4:59 PM, Timon Gehr wrote:
 On 20.11.19 23:45, Walter Bright wrote:
 On 11/20/2019 4:16 AM, Timon Gehr wrote:
 - What do you want to achieve with borrowing/ownership in D?
I want to prevent the following common issues with pointer code: 1. use after free 2. neglecting to free 3. double free ...
GC prevents those,
That's right. The GC is memory safe.
 and those problems cannot appear in  safe code.
safe code has to call free() some time when manually managing memory.
  live doesn't prevent them at the interface between  live and non- live code.
live relies on any function it calls obeying live conventions for its interface. This allows incremental adoption of live code.
 What about user-defined types? What about allowing internal pointers into 
 manually-managed memory to be exposed in  safe code?
Exposing an internal pointer in live code is considered "borrowing" from the root of its container.
 - How will I write a compiler-checked memory safe program that uses varied 
 allocation strategies, including plain malloc,
I'm not sure what clarification you want about plain malloc/free, although there are limitations outlined in ob.md. ...
I.e., it is not planned that we will be able to write such programs?
I believe I covered that in ob.md. What am I missing?
 The worry is that @live _removes_ value from tracing GC. If every pointer
 owns its data, how do I express a pointer to GC-owned memory? Do I need to
write 
 a "smart" pointer data type that's just a shallow wrapper for a GC pointer? 
 Also, if I do that, how do I make sure different GC-backed pointers don't lend 
 out the same owning pointer at the same time?
live does not distinguish a GC-allocated raw pointer from a malloc-allocated raw pointer. This means you'll be able to write generic live code that can handle both equally. Of course, if all you're using is the GC, you won't need to bother with live at all.
 There's been a lot of progress with this with the addition of DIP25, DIP1000, 
 and DIP1012. This further improves it by making the protections transitive.
As far as I can tell, live doesn't bring us closer to safe RC, because it applies to built-in pointers instead of library-defined smart pointers. I think this is completely backwards. Every owning pointer also needs to know the allocation strategy. Therefore, allowing built-in pointers to own their memory is vastly less useful than allowing library-defined smart pointers to do so.
Nothing about live stops programmers from using library-defined smart pointers. The smart pointer would be the owning pointer, and if it exposed an internal pointer that internal pointer would be treated as "borrowing" from the owner and further access to the smart pointer would be denied until the borrower's last use.
 and conflating different allocators, which I don't have a good idea on.
Do the checks for library-defined smart pointers instead of built-in pointers. Built-in pointers shouldn't care about lifetime nor allocator.
People use raw pointers, and that isn't going away (because performance). Telling people "just use smart pointers" is like telling C++ people to do that. It doesn't work reliably. The checks on smart pointers can be done with RAII and reference counting, and the DIPs already implemented.
 The point of adding restrictions is to gain expressiveness. It's why type 
 systems are a good idea. In this case, the point of borrowing restrictions 
 should be to enable  safe code to manipulate interior pointers into 
 manually-managed data structures.
They can do that now as long as the container only exposes interior pointers as 'ref'.
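A minimal sketch of such a container, assuming a hypothetical `Box` type that owns one malloc'd int. The names are made up for illustration, and this is ordinary D rather than anything @live-specific: the raw pointer stays private and the payload is reachable only through a `ref` return.

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical owning container used only for this illustration.
struct Box
{
    private int* payload;

    this(int initial)
    {
        payload = cast(int*) malloc(int.sizeof);
        assert(payload !is null, "allocation failed");
        *payload = initial;
    }

    ~this()
    {
        free(payload); // free(null) is a no-op, so Box.init is fine too
    }

    @disable this(this); // no copies, so there is exactly one owner

    // The interior is exposed only as `ref`, never as a raw int*.
    ref int value() return
    {
        return *payload;
    }
}

void main()
{
    auto box = Box(1);
    box.value() += 41;
    assert(box.value() == 42);
}
```

Disabling the postblit is one design choice among several; a reference-counted variant would allow copies at the cost of a counter.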
Nov 21 2019
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 21.11.19 12:29, Walter Bright wrote:
 On 11/20/2019 4:59 PM, Timon Gehr wrote:
 On 20.11.19 23:45, Walter Bright wrote:
 On 11/20/2019 4:16 AM, Timon Gehr wrote:
 - What do you want to achieve with borrowing/ownership in D?
I want to prevent the following common issues with pointer code: 1. use after free 2. neglecting to free 3. double free ...
GC prevents those,
That's right. The GC is memory safe.
 and those problems cannot appear in  safe code.
safe code has to call free() some time when manually managing memory. ...
@safe code cannot call free, because free is not @safe. In particular, I'm not supposed to free a GC pointer or a pointer into the static data segment. @live is useless in @safe code.

If you want @live to mean: "do these additional checks", that is fine, if people indeed want to write @system code with those checks without a guarantee that their code is safe if the checks pass.
  live doesn't prevent them at the interface between  live and 
 non- live code.
live relies on any function it calls obeying live conventions for its interface. This allows incremental adoption of live code. ...
What it allows is one of the following:

1. a split of the language in two parts that cannot interoperate safely, in a language that claims to support memory safety.
2. @live checks provide no guarantees because they are optional and you can't rely on your callers to obey your desired borrowing/ownership interface.

If you want incremental adoption, just add the missing language features, and let them interoperate with existing code. New code will be written to take advantage of the new features. Don't change the meaning of existing language features based on a function attribute.
 
 What about user-defined types? What about allowing internal pointers 
 into manually-managed memory to be exposed in  safe code?
Exposing an internal pointer in live code is considered "borrowing" from the root of its container. ...
It makes no sense to let the caller decide. The entity exposing the internal pointer should say whether it is borrowed out or not. The data structure manages its invariants, not the caller.
 
 - How will I write a compiler-checked memory safe program that uses 
 varied allocation strategies, including plain malloc,
I'm not sure what clarification you want about plain malloc/free, although there are limitations outlined in ob.md. ...
I.e., it is not planned that we will be able to write such programs?
I believe I covered that in ob.md. What am I missing? ...
You have previously attacked and dismissed my _sound_ designs for ostensibly not being checkable by the compiler (even though they are). I don't understand why this is not a concern for _your_ designs, which freely admit to being uncheckable and unsound. You can't say "@safe means memory safe and this is checked by the compiler" and at the same time "@live @safe relies on unchecked conventions to ensure memory safety across the @live boundary".
 
 The worry is that @live _removes_ value from tracing GC. If every
 pointer owns its data, how do I express a pointer to GC-owned
 memory? Do I need to write a "smart" pointer data type that's just a 
 shallow wrapper for a GC pointer? Also, if I do that, how do I make 
 sure different GC-backed pointers don't lend out the same owning 
 pointer at the same time?
live does not distinguish a GC-allocated raw pointer from a malloc-allocated raw pointer. This means you'll be able to write generic live code that can handle both equally.
I don't think this is the case. The GC-allocated raw pointer allows aliasing while live does not allow aliasing. They have incompatible aliasing restrictions. It's like having a mutable and an immutable reference to the same memory location.
 Of course, if all you're using is the GC, you won't need to bother with  live
at all.
 ...
I hope _nobody_ will have to bother with live, but if they will, it will inevitably infect libraries and suddenly, yes, I will have to deal with it. Also, the GC is not _all_ I am using. I am using the GC when that makes sense and I am not using the GC when using the GC does not make sense. For example, I have code that is unsafe because it uses compile-time reflection to gain access to the internal array backing std.container.Array, because std.container.Array has no safe way to lend out that array to a caller, so it does not do it at all. Note that for this use case, it would be enough if there was some way for std.container.Array to state to the compiler that no invalidating operation may be called while the reference to the internal array is borrowed out. If std.container.Array can rely on all safe code being checked that way, the function that borrows out the internal reference can be safe. If it can't rely on that, no such safe function can be written.
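The invalidation hazard described above can be sketched in a few lines of D. This is only an illustration under stated assumptions: `Buf`, `push`, `borrow` and `release` are hypothetical names standing in for a manually managed container, not the std.container.Array API, and `push` deliberately moves the storage on every call so the effect is deterministic.

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : memcpy;

/// Hypothetical growable buffer, standing in for a manually managed container.
struct Buf
{
    int* data;
    size_t len;

    /// Append a value; this sketch always moves the storage to a fresh block.
    void push(int v) @system
    {
        auto fresh = cast(int*) malloc((len + 1) * int.sizeof);
        assert(fresh !is null, "out of memory");
        if (len) memcpy(fresh, data, len * int.sizeof);
        fresh[len] = v;
        free(data);   // any slice or pointer into the old block now dangles
        data = fresh;
        ++len;
    }

    /// Lend out the internal array.
    int[] borrow() @system { return data[0 .. len]; }

    /// Give the storage back to the allocator.
    void release() @system { free(data); data = null; len = 0; }
}

void main() @system
{
    Buf b;
    b.push(1);
    int[] view = b.borrow();      // points into b's current storage
    const int* before = view.ptr;
    b.push(2);                    // invalidating operation while `view` exists
    assert(b.data !is before);    // storage moved; `view` must not be read now
    b.release();
}
```

Nothing in this sketch stops a caller from reading `view` after the second `push`, which is why every member has to stay @system here. A @safe `borrow` would need the compiler to reject the invalidating call for as long as the borrowed slice is alive, in every @safe caller.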
 
 There's been a lot of progress with this with the addition of DIP25, 
 DIP1000, and DIP1012. This further improves it by making the 
 protections transitive.
As far as I can tell, live doesn't bring us closer to safe RC, because it applies to built-in pointers instead of library-defined smart pointers. I think this is completely backwards. Every owning pointer also needs to know the allocation strategy. Therefore, allowing built-in pointers to own their memory is vastly less useful than allowing library-defined smart pointers to do so.
Nothing about live stops programmers from using library-defined smart pointers.
What about the fact that it is _optional_ for a /caller/ to respect the smart pointer's desired ownership restrictions? That's very restrictive for the smart pointer! It won't be able to provide safe borrowing functionality.
 The smart pointer would be the owning pointer,
Why do you _need_ an unsound live construct to let a smart pointer be an owning pointer?
 and if it 
 exposed an internal pointer that internal pointer would be treated as 
 "borrowing" from the owner and further access to the smart pointer would 
 be denied until the borrower's last use.
 ...
Great. I want that in safe code _if the smart pointer requests it_. No live needed.
 
 and conflating different allocators, which I don't have a good idea on.
Do the checks for library-defined smart pointers instead of built-in pointers. Built-in pointers shouldn't care about lifetime nor allocator.
People use raw pointers, and that isn't going away (because performance).
How about because the `new` operator returns pointers? If there is a difference in performance between a T* and a `struct { T* payload; }` that's an issue with the backend and/or the ABI.
 Telling people "just use smart pointers" is like telling 
 C++ people to do that.
C++ does not have safe!
 It doesn't work reliably.
 ...
It works in safe code because the smart pointer will be the only way for the safe code to manually manage memory. E.g.:

struct MP(T){ // owning, malloc-backed pointer
    private T* payload;
    @disable this();
    @disable this(T*); // can't construct
    @disable this(this); // can't copy (move-only type; therefore track
                         // this type like you track pointers in @live now)
    pragma(inline,true){
        private @system ~this(){} // only current module can drop
                                  // values, in @system or @trusted code
        ref T borrow()return{ return *payload; }
    }
    // can borrow out internal pointer
    alias borrow this;
}

@safe MP!T malloc(); // type tracks allocator
@trusted void free(MP!T); // @safe because pointer is known to be unique and malloc'd

In order to call the safe free function you have to pass a pointer that was allocated with the matching smart pointer type. Note that by "smart" in this case I just mean it knows about the underlying allocator and it prevents the pointer from being leaked. There is no runtime behavior, it's all in the types. I.e., this pointer type would use language features to precisely tell the compiler what restrictions its users have to obey. In this case, they may not invent new MP's, they may not copy MP's and they have to explicitly dispose of the MP. This is essentially what live now does for all raw pointers, but maybe some data types only need a subset of the restrictions. In particular, raw pointers in safe code need none of those restrictions.

There are potential issues if you try to borrow from some entity that is potentially referenced from somewhere else, so that should be disallowed. To bridge the gap, you can implement runtime checks, like Rust's cell does: https://doc.rust-lang.org/std/cell/ This is discussed in my ownership/borrowing post. (Note that my ownership/borrowing post assumes that `scope` pointers and `ref` pointers cannot alias. Aliasing restrictions could also be moved into a separate attribute for backwards-compatibility and better expressiveness.)
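For readers who want something that compiles today, here is a minimal sketch of the move-only part of that idea, assuming nothing beyond current D. The names `Owned`, `make` and `dispose` are hypothetical and chosen to avoid clashing with the C `malloc`/`free`. It only shows that copying can be ruled out in the type; the borrow tracking asked for in the post is not something the present compiler performs, so this is not a sound implementation of the proposal.

```d
import core.stdc.stdlib : malloc, free;

/// Hypothetical move-only owning pointer, loosely following the MP(T) idea.
struct Owned(T)
{
    private T* payload;

    @disable this();      // no default construction
    @disable this(this);  // no copies: a value can only be moved

    private this(T* p) @system { payload = p; }

    /// Borrow the pointee; `return` ties the reference to `this`.
    ref T borrow() return { return *payload; }
}

/// Allocate one T with malloc and initialise it (hypothetical helper).
Owned!T make(T)(T init) @trusted
{
    auto p = cast(T*) malloc(T.sizeof);
    assert(p !is null, "out of memory");
    *p = init;
    return Owned!T(p);
}

/// Release the memory held by an owner (hypothetical helper).
void dispose(T)(ref Owned!T o) @trusted
{
    free(o.payload);
    o.payload = null;
}

// Copying an owner is rejected at compile time.
static assert(!__traits(compiles, (ref Owned!int a) { auto b = a; }));

void main()
{
    auto p = make(41);
    p.borrow() += 1;
    assert(p.borrow() == 42);
    dispose(p);
}
```

Marking `dispose` as @trusted is exactly the step that cannot be justified yet: a reference obtained from `borrow` may still be alive when `dispose` runs, and nothing checks for that. Closing that gap is what the requested language features would be for.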
 The checks on smart pointers can be done with RAII and reference 
 counting, and the dips already implemented.
 ...
Not sure what this is referring to.
 
 The point of adding restrictions is to gain expressiveness. It's why 
 type systems are a good idea. In this case, the point of borrowing 
 restrictions should be to enable  safe code to manipulate interior 
 pointers into manually-managed data structures.
They can do that now as long as the container only exposes interior pointers as 'ref'.
Not really, because aliasing is not considered. live just assumes it does not exist and non- live can introduce it arbitrarily. live is all checks, no derived guarantees.
Nov 23 2019
next sibling parent reply mipri <mipri minimaltype.com> writes:
On Saturday, 23 November 2019 at 23:40:05 UTC, Timon Gehr wrote:
 If you want  live to mean: "do these additional checks", that
 is fine, if people indeed want to write  system code with those
 checks without a guarantee that their code is safe if the
 checks pass.
It's a compile-time guarantee that a class of error can't occur within the code so marked. If a non- live caller makes use of some live code, and introduces his own errors, they're his own error. live remains in the language as a tool that the calling code might use to gain the same protection. As it expands in a code base, so shrink the places where these errors may still be found.

There's not a choice between absolute perfect guarantees (with some other design) vs. a complete absence of guarantees (with this one). The choice is between language defaults, in how easily the guarantees can be defeated, in what you can expect of other people's code.

This is why people don't say that Rust has no real safety just because you can always drop down into an unsafe {} block and do whatever you want, and why Rust hasn't gotten a reputation as an unsafe language even as bugs are occasionally found in the unsafe blocks of its standard library. unsafe {} isn't the default; since you have to opt into errors it's very easy to avoid opting in; and you can expect that other people will make sparing use of unsafe {}.

In D, system is the default and you have to opt in to various protections, so it's a very different language, but live is in line with that language. I think it's also in line with any language that still wants to be able to make reasonable use of foreign code written in unsafe languages--a famous source of annoyance for Rust.
 I hope _nobody_ will have to bother with  live,
It's really hard to see you as only having sincere technical objections to live after reading this. Either live does no good as you say you think, and enthusiasm for alternatives to it will persist, or live will have an effect and this enthusiasm will wane. I don't think there's a future of " live is enthusiastically embraced even though it doesn't help at all."
 but if they
 will, it will inevitably infect libraries and suddenly, yes, I
 will have to deal with it.
How will you have to deal with it? Code can't require that their callers have live. It's your preferred alternative that would necessarily entangle users of libraries.
Nov 23 2019
next sibling parent reply mipri <mipri minimaltype.com> writes:
On Sunday, 24 November 2019 at 02:10:41 UTC, mipri wrote:
 It's really hard to see you as only having sincere technical
 objections to  live after reading this.
Even after rewriting this so many times, I reckon it still won't be received well. So: I'm actually very interested in criticisms of live (I hope more people are testing it than is apparent from the posts here), and even of alternatives that won't happen. But I don't have a four-year degree with a major of "the last 300 years of your bitter disputes about language design", and every single post of yours has required that. (I still have no idea what you could possibly mean with a remark like "It doesn't make safe code any more expressive.") I realize it's tiresome to repeat things that you think are already established, though. Please feel free to disregard my input.
Nov 23 2019
next sibling parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Sunday, 24 November 2019 at 02:33:50 UTC, mipri wrote:
 I'm actually very interested in criticisms of  live (I hope
 more people are testing it than is apparent from the posts
 here), and even of alternatives that won't happen.
I enjoyed reading your testing of this experimental feature. :-)
 But I don't
 have a four-year degree with a major of "the last 300 years of
 your bitter disputes about language design", and every single
 post of yours has required that.
There are many angles to verification and type systems, I don't think you will find anyone that has a complete understanding. Even professors have a rather narrow subfield of verifiable programming where they have deep understanding (and then some overview over the rest of the field). What is certain though, is that there are no easy paths to a workable solution. So being sceptical is healthy... live should be watched as babysteps not as a solution. But you need to take many babysteps to learn how to walk. So in that regard this is an interesting move. I personally think that it would be better to split the language into two, one library-language and one application language. The application language should be almost as easy to deal with as Python, and then move all the complications and attributes and what not down into the library-language. I don't think application programmers want to deal with pure, live etc etc. You have to keep the semantics simple on the higher levels.
Nov 24 2019
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 24.11.19 03:33, mipri wrote:
 On Sunday, 24 November 2019 at 02:10:41 UTC, mipri wrote:
 It's really hard to see you as only having sincere technical
 objections to  live after reading this.
Even after rewriting this so many times,
Rewriting is a waste of time, because I respond to what you say, not how you say it. If you find yourself in such a situation, it can help to think about whether you actually want to say it. My drafts folder is overflowing with posts I never submitted.
 I reckon it still won't be received well.
 ...
I think a simple "I'm sorry, it wasn't my intention to question your motives" would have been more appropriate than predicting unreasonable animosity and implying it would be my fault. If I wanted to, I could choose to get offended by that just the same. :)
 So:
 
 I'm actually very interested in criticisms of  live (I hope
 more people are testing it than is apparent from the posts
 here), and even of alternatives that won't happen. But I don't
 have a four-year degree with a major of "the last 300 years of
 your bitter disputes about language design", and every single
 post of yours has required that.
I don't think this is the case. In particular, the borrowing/ownership discussion is not very old.
 (I still have no idea what you
 could possibly mean with a remark like "It doesn't make
  safe code any more expressive.")
 ...
It's a summary of some of the other points in the post. safe restricts code to be memory safe in a way that is checked by the compiler, such that only trusted functions can be a potential source of memory unsafety in a safe program. (Basically, trusted functions are at the same level of trust as the compiler implementation of safe, so that not everything at this level of trust has to be implemented in the compiler, which makes sense.)

If live safe code can interact arbitrarily with safe code, live safe cannot establish that the invariants that live attempts to preserve (every memory location has a unique mutable reference to it or many non-mutable references, pointers are not leaked, etc) actually hold. Therefore, we cannot use live invariants to write safe code whose safety depends on those invariants. This means live does not improve the expressiveness of safe code: it does not allow us to write new and interesting safe code that we could not write before. Walter however claimed that live enables safe manual memory management in safe code.

What I am complaining about is a discrepancy between the stated goals of live and what it actually accomplishes. The goal is to close the gap, to keep the quality of D high. I believe this is Walter's sincere goal too, this is why he is asking for feedback in the first place.

My arguments are not very complicated, but necessarily a bit abstract, because Walter is not providing any concrete examples of safe code that are helped by live that I could then break immediately by applying that abstract reasoning. The burden of proof shouldn't even be on me, because if live safe indeed enables safe manual memory management, he can demonstrate it by providing a code example that I can't break.
 I realize it's tiresome to repeat things that you think are
 already established, though.
What's tiresome is when people keep responding with nonsense or personal attacks. I have no problem at all with people asking for additional details that I didn't think to provide, or let alone people responding with good points!
 Please feel free to disregard my input.
 
I will disregard your personal attack, but I don't see any reason to disregard your input.
Nov 24 2019
parent mipri <mipri minimaltype.com> writes:
On Sunday, 24 November 2019 at 14:07:31 UTC, Timon Gehr wrote:
 This means  live does not improve
 the expressiveness of  safe code: it does not allow us to write
 new and interesting  safe code that we could not write before.
 Walter however claimed that  live enables safe manual memory
 management in  safe code
OK, I get it. The problem was that 'expressiveness' is so strongly associated with code size or neatness. What you're wanting is not something like "D code can become less verbose or look more modern", but that more of the D code that's currently possible, can be possibly marked safe. That's not a weird thing to want at all. Especially not if there's a long term goal of making safe the default. It can't be the default if too much valid code isn't valid safe.
Nov 24 2019
prev sibling next sibling parent reply Jab <jab_293 gmall.com> writes:
On Sunday, 24 November 2019 at 02:10:41 UTC, mipri wrote:
 On Saturday, 23 November 2019 at 23:40:05 UTC, Timon Gehr wrote:
 If you want  live to mean: "do these additional checks", that
 is fine, if people indeed want to write  system code with those
 checks without a guarantee that their code is safe if the
 checks pass.
It's a compile-time guarantee that a class of error can't occur within the code so marked. If a non- live caller makes use of some live code, and introduces his own errors, they're his own error. live remains in the language as a tool that the calling code might use to gain the same protection. As it expands in a code base, so shrink the places where these errors may still be found.
You can have use-after-free bugs happen in live code, as a result of what is passed into live functions.

void zoo() {
    int* p = cast(int*)malloc( int.sizeof * 2 );
    foo(p, p + 1);
}

@live void foo( int* p, int* q ) {
    free(p);

    *q = 10; // use after free.
}

The very error that live is supposed to stop happens inside of the function on its watch. Sure you can pass a garbage pointer to safe, but the garbage pointer is created in code that isn't marked safe. Here it happens inside of live itself.

I think the more dangerous thing as well is that it appears that it would complain that "q" is a dangling pointer. Which I assume it would want you to free(). But that may not be a pointer you want to free. So the compiler will effectively be forcing you to create an error, or your code won't compile.
 I'm actually very interested in criticisms of  live (I hope 
 more people are testing it than is apparent from the posts 
 here),
I'm not too keen on testing it now. The only reason I'd see to test it is to find any flaws in the system, the doc provided illustrates that there is still a lot of known issues that need to be resolved first. Who knows what kind of changes will be made in that time.
Nov 23 2019
next sibling parent reply mipri <mipri minimaltype.com> writes:
On Sunday, 24 November 2019 at 03:42:00 UTC, Jab wrote:
 You can have use-after-free bugs happen in  live code, as a 
 result of what is passed into  live functions.


 void zoo() {
     int* p = cast(int*)malloc( int.sizeof * 2 );
     foo(p, p + 1);
 }

 @live void foo( int* p, int* q ) {
     free(p);

     *q = 10; // use after free.
 }
And this crashes dmd:
---
import core.stdc.stdlib: malloc, free;

@live void zoo() {
    int* p = cast(int*)malloc( int.sizeof * 2 );
    foo(p, p + 1);
}

@live void foo( scope int* p, scope int* q ) {
    *q = 10; // use after free.
}
---
core.exception.RangeError dmd/ob.d(1526): Range violation

It's zoo's live that does it.
 The very error that  live is supposed to stop happens inside of 
 the function on its watch. Sure you can pass a garbage pointer 
 to  safe, but the garbage pointer is created in code that isn't 
 marked safe. Here it happens inside of  live itself.
The use-after-free occurs there but the caller is still violating foo()'s signature by saying "here are two pointers that you are now the owner of" when it is impossible for foo() to dispose of them safely. OTOH, I guess it's always the case that live code can't safely take ownership of an unknown pointer, because it can't distinguish between a pointer the GC will clean up vs. a pointer that a third party allocator must clean up vs. a pointer to statically allocated memory that doesn't bear cleaning up.
Nov 23 2019
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/23/2019 8:05 PM, mipri wrote:
 And this crashes dmd:
 ---
 import core.stdc.stdlib: malloc, free;
 
 @live void zoo() {
      int* p = cast(int*)malloc( int.sizeof * 2 );
      foo(p, p + 1);
 }
 
 @live void foo( scope int* p, scope int* q ) {
      *q = 10; // use after free.
 }
 ---
 core.exception.RangeError dmd/ob.d(1526): Range violation
Good. I'll take care of that.
Dec 03 2019
parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/3/2019 12:20 AM, Walter Bright wrote:
 Good. I'll take care of that.
Pushed a fix.
Dec 03 2019
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 11/23/2019 7:42 PM, Jab wrote:
 I'm not too keen on testing it now. The only reason I'd see to test it is to 
 find any flaws in the system, the doc provided illustrates that there is still a 
 lot of known issues that need to be resolved first. Who knows what kind of 
 changes will be made in that time.
The point of testing it within what is implemented is to test to see if it works as a concept. If it doesn't work as a concept, there is no point in expending the effort to fill in the corners.
Dec 03 2019
prev sibling next sibling parent reply Dennis <dkorpel gmail.com> writes:
On Sunday, 24 November 2019 at 02:10:41 UTC, mipri wrote:
 There's not a choice between absolute perfect guarantees (with
 some other design) vs. a complete absence of guarantees (with
 this one). The choice is between language defaults, in how
 easily the guarantees can be defeated, in what you can expect
 of other people's code.
Walter's design has the explicit goal of giving 100% mechanically checkable memory safety for (limited) manual memory management. Take for example this reply regarding a different proposal: [1]
 I haven't studied that. Is it a 100% [mechanically checkable 
 memory safety] solution? I know there's one being worked on for 
 C++, but it's an 80-90% solution.

 D needs to have a 100%, mechanically checkable, solution.

 We have 100% for transitive const and function purity. It's 
 hard to get code to pass, but worth it when one succeeds. The 
 Ownership/Borrowing solution is 100% which is what is appealing 
 about it.
But now it appears to be more and more a linting tool for system code that catches some errors but doesn't give any guarantees at all. Or maybe it will, once your entire code base has transitioned to live, and you somehow fixed the issue of mixing memory pools [2]? But then code that is made to work with the garbage collector will be crippled with overly strict aliasing rules from live.

The current path seems to go towards a three-way split of D projects between memory management styles:
- system for code like C/C++
- live for code like Rust

It would be really cool if ownership and borrowing could work alongside the garbage collector instead of introducing this hard split called ` live` for every function.

And if this was all just an experiment to test the waters before creating a final, complete design, I would not be concerned. But the way it is described in the blog post [3] and given the fact that DIP 1021 has been accepted [4], it seems like Walter is already committed to this specific design which has been pointed out to be seriously flawed by Timon Gehr multiple times now.

Walter's attitude seems to be "We plug the holes one by one" [5], but I don't see any holes being plugged recently. The test cases in the pull request do not display any memory corruption. Examples on the news group often include a wrongly trusted free function (e.g. [6]), which do not show anything.

While I am excited that safe is being worked on, it is completely unclear to me how any of the recent developments contribute to safe as a 100% mechanically checked memory safety solution. If we're going to downgrade safe to an 80% memory safety solution relying on strong defaults and well-informed programmers, then the specification needs to be updated to reflect that.
[1] https://forum.dlang.org/post/qlvihc$24nk$1 digitalmars.com
[2] https://github.com/dlang/dmd/pull/10586/files#diff-e96b0b4865baa6204208527156832d3fR252
[3] https://dlang.org/blog/2019/07/15/ownership-and-borrowing-in-d/
[4] https://forum.dlang.org/post/beselqdzploaeqfunzos forum.dlang.org
[5] https://forum.dlang.org/post/qluqlv$c5m$1 digitalmars.com
[6] https://forum.dlang.org/post/qqckhn$1hss$1 digitalmars.com
Nov 24 2019
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 24.11.19 13:30, Dennis wrote:
 ...
Nice summary and collection of references. Thanks!
 
 The current path seems to go towards a three-way split of D projects 
 between memory management styles:
 - ...
 - ...
 -  live for code like Rust
I agree with everything but this. For instance, Rust does not do any lifetime checking for raw pointers. In Rust, you can't access raw pointers in safe code. This makes sense for Rust, because there is no built-in GC or safe mutable global variables acting as the default owners of built-in pointers. In D, safe code can manipulate raw pointers just fine, because of the GC. Lifetime checking for raw pointers still does not make sense in safe code though.
Nov 24 2019
prev sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 24.11.19 03:10, mipri wrote:
 On Saturday, 23 November 2019 at 23:40:05 UTC, Timon Gehr wrote:
 ...
 but if they
 will, it will inevitably infect libraries and suddenly, yes, I
 will have to deal with it.
How will you have to deal with it? Code can't require that their callers have live.
It will inevitably _assume_ that its safe callers have live (otherwise it wouldn't have chosen to bother with live safe). So sure, you can violate that assumption, but then you risk memory corruption in so-called safe functions. It's memory safety by convention.
 It's your preferred alternative that
 would necessarily entangle users of libraries.
 
No, it's a take it or leave it situation. It would enable safe implementations of functions that can't have a safe implementation at all right now. I'm not taking away anyone's option to write system code. As I said, if the goal of live is just to provide some linting in system code, that's fine, but the stated goals are actually more ambitious.
Nov 24 2019
prev sibling next sibling parent reply Sebastiaan Koppe <mail skoppe.eu> writes:
On Saturday, 23 November 2019 at 23:40:05 UTC, Timon Gehr wrote:
 struct MP(T){ // owning, malloc-backed pointer
     private T* payload;
     @disable this();
     @disable this(T*); // can't construct
     @disable this(this); // can't copy (move-only type; therefore track
                          // this type like you track pointers in @live now)
     pragma(inline,true){
         private @system ~this(){} // only current module can drop
                                   // values, in @system or @trusted code
         ref T borrow()return{ return *payload; }
     }
     // can borrow out internal pointer
     alias borrow this;
 }

 @safe MP!T malloc(); // type tracks allocator
 @trusted void free(MP!T); // @safe because pointer is known to be unique and malloc'd
To be honest I don't fully understand all the points you are making. But that, that is a thing of beauty, exactly what I want. The insight I gained from it is that you should not annotate functions, but rather, express the semantics you need by annotating a struct. Taken to its natural conclusion, that would make raw pointers system. To use them in safe you would need a wrapper struct with the semantics you need. This also scales really well, instead of adding yet another annotation to every function every year, you just update your structs with the latest semantics you need. If I have some time I am going to reread your latest posts, as I want to have a better understanding of what you are saying. Thank you for fighting this fight.
Nov 24 2019
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 24.11.19 14:42, Sebastiaan Koppe wrote:
 On Saturday, 23 November 2019 at 23:40:05 UTC, Timon Gehr wrote:
 struct MP(T){ // owning, malloc-backed pointer
     private T* payload;
     @disable this();
     @disable this(T*); // can't construct
     @disable this(this); // can't copy (move-only type; therefore track
                          // this type like you track pointers in @live now)
     pragma(inline,true){
         private @system ~this(){} // only current module can drop
                                   // values, in @system or @trusted code
         ref T borrow()return{ return *payload; }
     }
     // can borrow out internal pointer
     alias borrow this;
 }

 @safe MP!T malloc(); // type tracks allocator
 @trusted void free(MP!T); // @safe because pointer is known to be unique and malloc'd
To be honest I don't fully understand all the points you are making. ...
Most points are in support of this kind of strategy.
 But that, that is a thing of beauty, exactly what I want.
 
 The insight I gained from it is that you should not annotate functions, 
 but rather, express the semantics you need by annotating a struct. Taken 
 to its natural conclusion, that would make raw pointers  system.
This is indeed what Rust does. In D, raw pointers without additional semantics can actually be fine in safe code (for example, `new int` returns an `int*`, and so does `&x` for a module-level (thread-local) `int x;`), but trusted and system code has to be careful what raw pointers it passes to safe functions.
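A small sketch of the two cases mentioned above, assuming an ordinary dmd with default settings. The names `counter` and `counterPtr` are made up for the illustration. Both pointers are plain `int*` values, and aliasing them is harmless because neither is ever freed manually.

```d
int counter; // module-level variables are thread-local by default

/// Taking the address of a module-level variable is allowed in @safe code.
@safe int* counterPtr() { return &counter; }

@safe void main()
{
    int* a = new int;   // GC-owned, never freed by hand
    *a = 1;
    int* b = a;         // aliasing is fine: the GC keeps the memory alive
    *b += 1;
    assert(*a == 2);

    int* c = counterPtr();
    *c = 5;
    assert(counter == 5);
}
```

The default owner here is the GC or the thread's static storage, so no lifetime tracking is needed for this code to be memory safe.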
 To use 
 them in  safe you would need a wrapper struct with the semantics you 
 need. This also scales really well, instead of adding yet another 
  annotation to every function every year, you just update your structs 
 with the latest semantics you need.
 
 If I have some time I am going to reread your latest posts, as I want to 
 have a better understanding of what you are saying.
 ...
Feel free to ask if specific things are unclear. I had a major deadline yesterday and couldn't spend a lot of my time to make my posts more detailed. (And often, it is not so clear a priori which points deserve elaboration.)
 Thank you for fighting this fight.
Sure! I can't help it. :)
Nov 24 2019
parent reply Doc Andrew <x x.com> writes:
On Sunday, 24 November 2019 at 14:51:31 UTC, Timon Gehr wrote:
 This is indeed what Rust does. In D, raw pointers without 
 additional semantics can actually be fine in  safe code (for 
 example, `new int` returns an `int*`, and so does `&x` for a 
 module-level (thread-local) `int x;`), but  trusted and  system 
 code has to be careful what raw pointers it passes to  safe 
 functions.
I just saw this article that might be helpful: https://plv.mpi-sws.org/rustbelt/stacked-borrows/ I'm dealing with sick kiddos today so haven't had time to go over it in much (any) detail, but it looks like an attempt to solve the aliasing problem caused by raw pointers when passed to otherwise safe code.
Nov 24 2019
parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Sunday, 24 November 2019 at 21:51:08 UTC, Doc Andrew wrote:
 I just saw this article that might be helpful:
 https://plv.mpi-sws.org/rustbelt/stacked-borrows/

 I'm dealing with sick kiddos today so haven't had time to go 
 over it in much (any) detail, but it looks like an attempt to 
 solve the aliasing problem caused by raw pointers when passed 
 to otherwise safe code.
Looks more like it is defining what authors should not do in unsafe code, so that Rust can make optimizations under the assumption that unsafe code is well-behaved. Then they provide an interpreter that dynamically borrow-checks tests by running the code, in order to capture situations that Rust cannot deal with statically. Or something to that effect. TLDR; they specify undefined behaviour in order to enable optimization. Probably the opposite of what you are looking for?
Nov 24 2019
parent reply Doc Andrew <x x.com> writes:
On Monday, 25 November 2019 at 00:10:41 UTC, Ola Fosheim Grøstad 
wrote:
 Looks more like it is defining what authors should not do in 
 unsafe code for Rust to make optimizations under the assumption 
 that unsafe code is wellbehaved. Then they provide an 
 interpreter that dynamically borrow checks tests by running the 
 code... In order to capture situations that rust cannot deal 
 with statically. Or something to that effect.

 TLDR; they specify undefined behaviour in order to enable 
 optimization.
Yeah, that's about as much as I was able to take from it yesterday. I wasn't sure if the stacked borrow technique might be useful for catching some of the errors that Timon pointed out with the interface between live and system code. I think most of us would consider the UB an error, rather than an issue with optimization?
Nov 25 2019
parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Monday, 25 November 2019 at 20:32:03 UTC, Doc Andrew wrote:
 On Monday, 25 November 2019 at 00:10:41 UTC, Ola Fosheim 
 Grøstad wrote:
 Looks more like it is defining what authors should not do in 
 unsafe code for Rust to make optimizations under the 
 assumption that unsafe code is wellbehaved. Then they provide 
 an interpreter that dynamically borrow checks tests by running 
 the code... In order to capture situations that rust cannot 
 deal with statically. Or something to that effect.

 TLDR; they specify undefined behaviour in order to enable 
 optimization.
Yeah, that's about as much as I was able to take from it yesterday. I wasn't sure if the stacked borrow technique might be useful for catching some of the errors that Timon pointed out with the interface between @live and @system code. I think most of us would consider the UB an error, rather than an issue with optimization?
Well, my take speedreading the paper was that they try to arrive at a "memory model" for Rust that enables aliasing optimizations while not limiting unsafe code too much. So they try out different models by building the model into the interpreter and run it on many programs to see how limiting their new rules are. Anyway, one of the authors has blogged about what he was interested in exploring here: https://www.ralfj.de/blog/2019/05/21/stacked-borrows-2.1.html
Nov 25 2019
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/23/2019 3:40 PM, Timon Gehr wrote:
 I don't understand why 
 this is not a concern for _your_ designs, which freely admit to being 
 uncheckable and unsound. You can't say "@safe means memory safe and this is 
 checked by the compiler" and at the same time "@live @safe relies on unchecked 
 conventions to ensure memory safety across the @live boundary".
All languages that have an "unsafe" annotation rely on unchecked conventions when the boundary is crossed.
 I don't think this is the case. The GC-allocated raw pointer allows aliasing 
 while @live does not allow aliasing. They have incompatible aliasing 
 restrictions. It's like having a mutable and an immutable reference to the 
 same memory location.
If I may sum this up, you wish not to follow O/B rules in @live functions when you've got a GC pointer. I suspect it is possible to segregate such operations into separate, non-@live functions, and I concede that you'll find this inconvenient. In your scheme, this issue is resolved by distinguishing the two by annotating the non-GC pointers with `scope`.
 What about the fact that it is _optional_ for a /caller/ to respect the smart 
 pointer's desired ownership restrictions? That's very restrictive for the 
 smart pointer! It won't be able to provide @safe borrowing functionality.
I understand that the salient difference between your scheme and mine is that you attach it to the pointer type while mine attaches it to the function. Pretty much all of this discussion is about consequences of this difference, so I don't think it is necessary to go through it point-by-point agreeing with you on those consequences.

But yours has problems, too, so the question is which method is better? Some problems:

1. There doesn't seem to be a way to prevent un-scope pointers from existing and being unsafe.

2. It splits pointers down the middle into two categories: scope and un-scope. This is a very intrusive and drastic change. The only saving grace of the existing `scope` and `return` annotations is that the compiler can infer the vast bulk of them. Will people accept manually adding `scope` to a big chunk of their types? I don't know.
Nov 26 2019
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 26.11.19 10:34, Walter Bright wrote:
 On 11/23/2019 3:40 PM, Timon Gehr wrote:
 I don't understand why this is not a concern for _your_ designs, which 
 freely admit to being uncheckable and unsound. You can't say "@safe 
 means memory safe and this is checked by the compiler" and at the same 
 time "@live @safe relies on unchecked conventions to ensure memory 
 safety across the @live boundary".
All languages that have an "unsafe" annotation rely on unchecked conventions when the boundary is crossed. ...
I was very careful _not_ to cross that boundary. For your argument to make sense, crossing the boundary between @live and non-@live needs to be @system. You can't say that just because @trusted exists, @safe doesn't need to do any checking either. Calling @live from non-@live or the other way around amounts to an unsafe type cast.
 
 I don't think this is the case. The GC-allocated raw pointer allows 
 aliasing while @live does not allow aliasing. They have incompatible 
 aliasing restrictions. It's like having a mutable and an immutable 
 reference to the same memory location.
If I may sum this up, it is you wish to not follow O/B rules in @live functions when you've got a GC pointer.
I don't really want @live functions. I want O/B, but only where it makes sense.
 I suspect it is possible to 
 segregate such operations into separate, non-@live functions,
It is not possible. What if you store a std.container.Array as a field of a class object and want to borrow the internal data? If you null out the class pointer after borrowing the internal array data, the GC may deallocate the internal array data. @live can't protect against such interactions.
 and I concede that you'll find this inconvenient.
 ...
Inconvenient and unsound. I would have to write @system code, and the compiler wouldn't even alert me that there is a problem if I annotate it @safe.
 In your scheme, this issue is resolved by distinguishing the two by 
 annotating the non-GC pointers with `scope`.
 ...
Yes, but `scope` does not track the allocator. `scope` restricts lifetime, and possibly aliasing. As I also said, alternatively, and perhaps preferably, we could annotate aliasing restrictions separately, but the accepted DIP1021 already introduces some checking against aliasing outside @live code.
 
 What about the fact that it is _optional_ for a /caller/ to respect 
 the smart pointer's desired ownership restrictions? That's very 
 restrictive for the smart pointer! It won't be able to provide @safe 
 borrowing functionality.
I understand that the salient difference between your scheme and mine is you attach it to the pointer type while mine attaches it to the function.
You attach it to the function, but the meaning pertains to pointers. This should be a red flag, but apparently you don't think so. Then, you allow @live and non-@live code to interact in a @safe context. This is highly unsound because those calls change the interpretation of pointer types. Those are unsafe pointer casts.
 Pretty much all of this discussion is about consequences of 
 this difference,
Not true. Even if I accept that a function attribute is a reasonable way to go, @live falls short.
 so I don't think it is necessary to go through it 
 point-by-point agreeing with you on those consequences.
 ...
This makes no sense to me. It seems rather weird to be debating the merits of proposals without actually taking into account the consequences of each proposal.
 But yours has problems, too, so the question is which method is better?
 ...
I don't care about "my" method vs "your" method. I want a method that makes sense, does not break @safe or GC and improves @safe. This can just as well be some new combination. The issue was that you weren't responding to any of my points until I made some concrete proposal. I am not attached to that proposal. We can improve it.
 Some problems:
 
 1. Doesn't seem to be a way to prevent un-scope pointers from existing 
 and being unsafe.
 ...
It doesn't statically prevent memory corruption in @system code. I don't think this is a "problem".
 2. The splitting pointers down the middle into two categories: scope and 
 un-scope.
It appears this split exists after https://github.com/dlang/DIPs/blob/master/DIPs/accepted/DIP1021.md
 This is a very intrusive and drastic change.
I think it's less so than splitting functions into two mutually incompatible categories.
 The only saving 
 grace of the existing `scope` and `return` annotations is the compiler 
 can infer the vast bulk of them.
But now there's DIP1021, which checks (unsoundly) that `scope` pointers don't alias. This is not my invention and I would be on board with reverting the decision on DIP1021.
 Will people accept manually adding 
 `scope` to a big chunk of their types? I don't know.
If those types want to manage memory manually and expose the internal memory directly to a @safe caller, yes. If the @safe caller wants to pass around manually-managed memory it will have to annotate stuff as `scope`. This is also true with @live.

If you want to avoid intrusive and drastic language changes, what about reverting DIP1021, moving aliasing checks to run time? We could add opBorrow and opUnborrow primitives. opBorrow would return a scoped value borrowed from the receiver and opUnborrow would be called once the last borrow ends. This would even be quite a bit more precise than doing everything in the type system, because you would only prevent invalidation, not all mutation.
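
A minimal sketch of how such run-time checking could look, written as ordinary library code. The names `Buffer`, `Borrow` and `tryResize` are made up for illustration, and `opBorrow`/`opUnborrow` are called by hand here because the compiler hooks suggested above do not exist. It only aims to show the "prevent invalidation, not all mutation" idea: writing through the borrowed slice is allowed, reallocating while it is borrowed is refused.

```d
import core.stdc.stdlib : free, malloc, realloc;

/// Manually managed buffer. `opBorrow`/`opUnborrow` are ordinary methods
/// here; compiler-invoked hooks with these names are an assumption.
struct Buffer
{
    private int* data;
    private size_t length;
    private size_t borrows; // outstanding borrows, checked at run time

    @disable this(this); // a single owner, no implicit copies

    this(size_t n) // n must be nonzero in this sketch
    {
        data = cast(int*) malloc(n * int.sizeof);
        assert(data !is null, "malloc failed");
        data[0 .. n] = 0;
        length = n;
    }

    ~this()
    {
        assert(borrows == 0, "destroyed while borrowed");
        free(data);
    }

    /// Hands out a view and records that it exists.
    Borrow opBorrow()
    {
        ++borrows;
        return Borrow(&this);
    }

    /// Called by Borrow's destructor when a borrow ends.
    void opUnborrow()
    {
        assert(borrows > 0);
        --borrows;
    }

    /// Invalidating operation: refused while any borrow is outstanding.
    bool tryResize(size_t n)
    {
        if (borrows != 0 || n == 0)
            return false;
        auto p = cast(int*) realloc(data, n * int.sizeof);
        if (p is null)
            return false;
        if (n > length)
            p[length .. n] = 0;
        data = p;
        length = n;
        return true;
    }
}

/// The scoped value a borrow returns; ending its lifetime ends the borrow.
struct Borrow
{
    private Buffer* owner;

    @disable this(this);

    ~this()
    {
        if (owner !is null)
            owner.opUnborrow();
    }

    int[] slice() { return owner.data[0 .. owner.length]; }
}

void main()
{
    auto buf = Buffer(4);
    {
        auto view = buf.opBorrow();
        view.slice()[0] = 42;                 // plain mutation is still allowed
        immutable resized = buf.tryResize(8); // invalidation is refused
        assert(!resized);
    } // `view` is destroyed here, which ends the borrow
    immutable resized = buf.tryResize(8);     // no borrows left, so this succeeds
    assert(resized);
}
```

The cost is one counter update per borrow, which is the performance trade-off against purely static checking.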
Nov 26 2019
next sibling parent Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:
On Tuesday, 26 November 2019 at 13:07:07 UTC, Timon Gehr wrote:
 Yes, but `scope` does not track the allocator. `scope` 
 restricts lifetime, and possibly aliasing. As I also said, 
 alternatively, and perhaps preferably, we could annotate 
 aliasing restrictions separately, but the accepted DIP1021 
 already introduces some checking against aliasing outside @live 
 code.
One possibility for thread local objects: Add "lifetime-names" and let scope be one, and deduce as many as possible. Then have a stack of thread local allocators (i.e. arena). Then let lifetime-name track the allocator-position on that stack. Then only allow the popping of an allocator when there are no pointers carrying the life-time name with allocator-stack-depth 0 (top). The problem is shared objects...
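
A rough run-time emulation of that arena-stack idea, to make the bookkeeping concrete. Everything here (`Arena`, `Tagged`, `pushArena`, `make`, `release`, `tryPopArena`) is a hypothetical name, and the check that the suggestion would do statically through lifetime-names is replaced by a counter, so this is an illustration of the invariant rather than of the proposed mechanism.

```d
/// One bump-allocated region plus a count of handles still pointing into it.
struct Arena
{
    ubyte[] memory;
    size_t used;
    size_t liveHandles;
}

/// Module-level variables are thread-local by default in D.
Arena[] arenaStack;

/// A pointer tagged with the stack position of its arena, standing in
/// for the "lifetime-name" (only meant for pointer-free T in this sketch).
struct Tagged(T)
{
    T* ptr;
    size_t depth;
}

void pushArena(size_t bytes)
{
    arenaStack ~= Arena(new ubyte[](bytes));
}

/// Allocates a T in the top arena and returns a tagged handle to it.
Tagged!T make(T)(T value)
{
    assert(arenaStack.length > 0, "no arena pushed");
    auto top = &arenaStack[$ - 1];
    // Round the offset up to T's alignment.
    immutable start = (top.used + T.alignof - 1) / T.alignof * T.alignof;
    assert(start + T.sizeof <= top.memory.length, "arena exhausted");
    auto p = cast(T*) (top.memory.ptr + start);
    *p = value;
    top.used = start + T.sizeof;
    ++top.liveHandles;
    return Tagged!T(p, arenaStack.length - 1);
}

/// Gives up a handle, so its arena no longer counts it.
void release(T)(ref Tagged!T handle)
{
    if (handle.ptr is null)
        return;
    --arenaStack[handle.depth].liveHandles;
    handle.ptr = null;
}

/// Pops the top arena only if no handle carries its position.
bool tryPopArena()
{
    if (arenaStack.length == 0 || arenaStack[$ - 1].liveHandles != 0)
        return false;
    arenaStack = arenaStack[0 .. $ - 1];
    return true;
}

void main()
{
    pushArena(64);
    auto a = make(10);
    pushArena(64);
    auto b = make(20);
    assert(b.depth == 1 && *b.ptr == 20);

    bool popped = tryPopArena();
    assert(!popped);   // b still names the top arena
    release(b);
    popped = tryPopArena();
    assert(popped);    // nothing refers to the top arena any more

    popped = tryPopArena();
    assert(!popped);   // a keeps the outer arena alive
    release(a);
    popped = tryPopArena();
    assert(popped);
}
```

Because `arenaStack` is thread-local, the sketch says nothing about shared objects, which is the open problem noted above.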
Nov 26 2019
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/26/2019 5:07 AM, Timon Gehr wrote:
 You can't say that just because @trusted exists, @safe doesn't need to do any 
 checking either. Calling @live from non-@live or the other way around amounts 
 to an unsafe type cast.
Of course when @live functions are called, the callers must supply arguments that fit @live rules. When @live functions call other functions, the interface to those functions must agree with @live rules. This is not in dispute. You don't have to keep hitting me over the head with it :-)
 I want O/B, but only where it makes sense.
I don't disagree with that, either.
 I suspect it is possible to segregate such operations into separate, non-@live 
 functions,
It is not possible. What if you store a std.container.Array as a field of a class object and want to borrow the internal data? If you null out the class pointer after borrowing the internal array data, the GC may deallocate the internal array data. @live can't protect against such interactions.
@live never inserts null. It does not change the code at all. It only adds compile time checks. Marking pointers as "Invalid" after they are borrowed is not a code generation or runtime thing, it is purely inside the compiler. BTW, under O/B rules, if you borrow a pointer to the internal data, the owner of the data structure (i.e. the class reference) gets marked "Invalid" (not the field in the class) until the pointer is no longer live.
 Inconvenient and unsound. I would have to write @system code, and the compiler 
 wouldn't even alert me that there is a problem if I annotate it @safe.
This is true. The prototype has set up @live to be an attribute independent of @safe/@trusted/@system annotations. The idea is to try it out without disruption of the logic of the latter 3 attributes. If it works well, I anticipate working it in.
 In your scheme, this issue is resolved by distinguishing the two by annotating 
 the non-GC pointers with `scope`.
 ...
Yes, but `scope` does not track the allocator. `scope` restricts lifetime, and possibly aliasing. As I also said, alternatively, and perhaps preferably, we could annotate aliasing restrictions separately, but the accepted DIP1021 already introduces some checking against aliasing outside @live code.
Ok.
 I understand that the salient difference between your scheme and mine is you 
 attach it to the pointer type while mine attaches it to the function.
You attach it to the function, but the meaning pertains to pointers. This should be a red flag, but apparently you don't think so.
That's right.
 This makes no sense to me. It seems rather weird to be debating the merits of 
 proposals without actually taking into account the consequences of each 
 proposal.
The problem is I agree with your technical criticisms of my proposal, you do not have to keep reiterating them.
 I want a method that makes 
 sense, does not break @safe or GC and improves @safe. This can just as well be 
 some new combination.
We're both on the same team here.
 1. Doesn't seem to be a way to prevent un-scope pointers from existing and 
 being unsafe.
 ...
It doesn't statically prevent memory corruption in @system code. I don't think this is a "problem".
What happens with raw pointers? Would use of them produce a compile time error?
 2. The splitting pointers down the middle into two categories: scope and 
 un-scope.
It appears this split exists after https://github.com/dlang/DIPs/blob/master/DIPs/accepted/DIP1021.md
 This is a very intrusive and drastic change.
I think it's less so than splitting functions into two mutually incompatible categories.
 The only saving grace of the existing `scope` and `return` annotations is the 
 compiler can infer the vast bulk of them.
But now there's DIP1021, which checks (unsoundly) that `scope` pointers don't alias. This is not my invention and I would be on board with reverting the decision on DIP1021.
 Will people accept manually adding `scope` to a big chunk of their types? I 
 don't know.
If those types want to manage memory manually and expose the internal memory directly to a @safe caller, yes. If the @safe caller wants to pass around manually-managed memory it will have to annotate stuff as `scope`. This is also true with @live.
The difference is that `scope` is inferred for parameters for templates, auto return functions, and lambdas. `scope` for local pointers is inferred for all functions. This works out rather nicely in practice such that most of the time, the user doesn't need to add any annotations. I am not seeing this happening with your proposal. I also find concerning the idea that one can have a pointer to a scope pointer - how can that work with O/B?
 If you want to avoid intrusive and drastic language changes, what about 
 reverting DIP1021,
It's currently enabled only with a switch. It's well worth seeing how it works in practice before deciding.
 moving aliasing checks to run time?
Selling runtime checks is a tough battle.
 We could add opBorrow and 
 opUnborrow primitives. opBorrow would return a scoped value borrowed from the 
 receiver and opUnborrow would be called once the last borrow ends. This would 
 even be quite a bit more precise than doing everything in the type system, 
 because you would only prevent invalidation, not all mutation.
I've resisted similar things before on performance grounds. Rust has done a pretty good job selling safety as a compile time phenomenon.
Dec 01 2019
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01.12.19 10:16, Walter Bright wrote:
 On 11/26/2019 5:07 AM, Timon Gehr wrote:
 You can't say that just because @trusted exists, @safe doesn't need to 
 do any checking either. Calling @live from non-@live or the other way 
 around amounts to an unsafe type cast.
Of course when @live functions are called, they must supply arguments that fit @live rules. When @live functions call other functions, the interface to those functions must agree with @live rules. This is not in dispute. You don't have to keep hitting me over the head with it :-) ...
Then this is a bug:
---
@safe:
int* foo(int* p, const int* q) @live;
void main(){
    auto p = new int;
    foo(p, p); // ok
}
---
 
 I want O/B, but only where it makes sense.
I don't disagree with that, either. ...
Ok. (I also want it everywhere where it makes sense.)
 
 I suspect it is possible to segregate such operations into separate, 
 non-@live functions,
It is not possible. What if you store a std.container.Array as a field of a class object and want to borrow the internal data? If you null out the class pointer after borrowing the internal array data, the GC may deallocate the internal array data. @live can't protect against such interactions.
@live never inserts null. It does not change the code at all. It only adds compile time checks. Marking pointers as "Invalid" after they are borrowed is not a code generation or runtime thing, it is purely inside the compiler. BTW, under O/B rules, if you borrow a pointer to the internal data, the owner of the data structure (i.e. the class reference) gets marked "Invalid" (not the field in the class) until the pointer is no longer live. ...
None of this is pertinent. You suggested it may be possible to segregate operations involving GC pointers into non-@live functions, and I explained why it is not possible.
 
 Inconvenient and unsound. I would have to write @system code, and the 
 compiler wouldn't even alert me that there is a problem if I annotate 
 it @safe.
This is true. The prototype has set up @live to be an attribute independent of @safe/@trusted/@system annotations. The idea is to try it out without disruption of the logic of the latter 3 attributes. If it works well, I anticipate working it in. ...
As the goal of @live involves memory safety, I think it is important to specify all interactions of @live with @safe. Otherwise you don't make it clear enough that @live splits the language into two incompatible parts.
 
 In your scheme, this issue is resolved by distinguishing the two by 
 annotating the non-GC pointers with `scope`.
 ...
Yes, but `scope` does not track the allocator. `scope` restricts lifetime, and possibly aliasing. As I also said, alternatively, and perhaps preferably, we could annotate aliasing restrictions separately, but the accepted DIP1021 already introduces some checking against aliasing outside @live code.
Ok.
 I understand that the salient difference between your scheme and mine 
 is you attach it to the pointer type while mine attaches it to the 
 function.
You attach it to the function, but the meaning pertains to pointers. This should be a red flag, but apparently you don't think so.
That's right.
 This makes no sense to me.  It seems rather weird to be debating the 
 merits of proposals without actually taking into account the 
 consequences of each proposal.
The problem is I agree with your technical criticisms of my proposal, you do not have to keep reiterating them. ...
Ok. (You are not making that clear when you respond to a technical criticism by reiterating some aspect of @live's design that is superficially related to that point or by claiming that the criticism isn't technical in nature.)
 I want a method that makes sense, does not break @safe or GC and 
 improves @safe. This can just as well be some new combination.
We're both on the same team here.
 1. Doesn't seem to be a way to prevent un-scope pointers from 
 existing and being unsafe.
 ...
It doesn't statically prevent memory corruption in @system code. I don't think this is a "problem".
What happens with raw pointers?
Nothing happens with raw pointers. Raw pointers are allowed to exist in @safe functions already, and they do not cause memory corruption.
 Would use of them produce a compile time error?
 ...
No.
 
 2. The splitting pointers down the middle into two categories: scope 
 and un-scope.
It appears this split exists after https://github.com/dlang/DIPs/blob/master/DIPs/accepted/DIP1021.md
 This is a very intrusive and drastic change.
I think it's less so than splitting functions into two mutually incompatible categories.
 The only saving grace of the existing `scope` and `return` 
 annotations is the compiler can infer the vast bulk of them.
But now there's DIP1021, which checks (unsoundly) that `scope` pointers don't alias. This is not my invention and I would be on board with reverting the decision on DIP1021.
 Will people accept manually adding `scope` to a big chunk of their 
 types? I don't know.
If those types want to manage memory manually and expose the internal memory directly to a @safe caller, yes. If the @safe caller wants to pass around manually-managed memory it will have to annotate stuff as `scope`. This is also true with @live.
The difference is that `scope` is inferred for parameters for templates, auto return functions, and lambdas. `scope` for local pointers is inferred for all functions. This works out rather nicely in practice such that most of the time, the user doesn't need to add any annotations. I am not seeing this happening with your proposal. ...
I don't see why not, but I think this is a criticism of DIP1021, which tries to prevent `scope` pointers from aliasing. As I have mentioned before, personally, I would prefer to instead have two attributes, `scope` and `owned`. `scope` retains its pre-DIP1021 semantics. `owned` means that there is at most one reference to the annotated data.
 I also find concerning the idea that one can have a pointer to a scope 
 pointer - how can that work with O/B?
 ...
With the attribute split, this would be a pointer to an owning pointer. I.e., we have an `owned(T)** p`. While there can be many references to the `owned(T)*`, at most one of them can use it to manipulate the `owned(T)` at any given time. In particular, dereferencing an `owned(T)**` is not allowed, but we can move the pointed-to owning pointer somewhere else. Let's say we have a function foo(owned(T)*). Now foo(*p) cannot work, because this makes a copy of the owning pointer. However, foo(move(*p)) can work. (`move` would null out *p.)
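
To make the copy-versus-move distinction concrete, here is a small sketch using plain pointers. `owned(T)` is not an existing type constructor, so `int*` stands in for `owned(int)*`, and `moveOut` and `consume` are hypothetical helpers; `moveOut` is written out by hand so that the nulling of the source, which the post assumes `move` would perform, is explicit.

```d
import core.stdc.stdlib : free, malloc;

/// Hypothetical helper: takes the pointer out of `source` and nulls it,
/// so that only one owning pointer exists afterwards.
T* moveOut(T)(ref T* source)
{
    auto taken = source;
    source = null;
    return taken;
}

/// Stands in for foo(owned(T)*): consumes its argument and frees it.
int consume(int* owner)
{
    immutable value = *owner;
    free(owner);
    return value;
}

void main()
{
    int* owner = cast(int*) malloc(int.sizeof);
    assert(owner !is null, "malloc failed");
    *owner = 7;

    int** p = &owner; // plays the role of owned(int)** p

    // consume(*p) would copy the owning pointer and leave two owners,
    // one of them dangling after the free. Moving it out avoids that.
    immutable value = consume(moveOut(*p));
    assert(value == 7);
    assert(*p is null && owner is null);
}
```

Nothing in today's compiler rejects the commented-out `consume(*p)` form; that rejection is exactly what the `owned` attribute is meant to add.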
 
 If you want to avoid intrusive and drastic language changes, what 
 about reverting DIP1021,
It's currently enabled only with a switch. It's well worth seeing how it works in practice before deciding. ...
Is it? If you keep inferring `scope` while `scope` disallows aliasing, and that works well enough in practice, why would you not "see that happening with my proposal"?
 moving aliasing checks to run time?
Selling runtime checks is a tough battle. ...
No matter what you do, the user will need to annotate, have @trusted code and/or do some runtime checking. It's just a matter of balancing them. The `owned` attribute would allow runtime checks to be elided in the common case. If people don't want to annotate with `owned`, libraries would still be able to fall back to runtime checks.
 We could add opBorrow and opUnborrow primitives. opBorrow would return 
 a scoped value borrowed from the receiver and opUnborrow would be 
 called once the last borrow ends. This would even be quite a bit more 
 precise than doing everything in the type system, because you would 
 only prevent invalidation, not all mutation.
I've resisted similar things before on performance grounds.
Fair enough. It's not my preferred solution either. (And my proposals generalize it, because you can borrow out a struct that performs the opUnborrow action in its destructor.)
 Rust has done a pretty good job selling safety as a compile time phenomenon.
I have no interest at all in how Rust is marketed. Rust has bounds checks and runtime checks against unsafe borrowing, because its type system is not expressive enough to formalize functional correctness properties.
Dec 01 2019
parent reply aliak <something something.com> writes:
On Sunday, 1 December 2019 at 14:44:25 UTC, Timon Gehr wrote:
 Rust has done a pretty good job selling safety as a compile 
 time phenomenon.
I have no interest at all in how Rust is marketed. Rust has bounds checks and runtime checks against unsafe borrowing, because its type system is not expressive enough to formalize functional correctness properties.
Can you elaborate on this a bit more? Which correctness properties of functions cannot be formalized by Rust's type system and what particularly is lacking in its type system to be able to do that? If you have any links handy that can explain these concepts also that'd be super appreciated. Also, can D's type system become expressive enough without being too crazy to solve the same?
Dec 01 2019
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01.12.19 16:35, aliak wrote:
 On Sunday, 1 December 2019 at 14:44:25 UTC, Timon Gehr wrote:
 Rust has done a pretty good job selling safety as a compile time 
 phenomenon.
I have no interest at all in how Rust is marketed. Rust has bounds checks and runtime checks against unsafe borrowing, because its type system is not expressive enough to formalize functional correctness properties.
Can you elaborate on this a bit more? Which correctness properties of functions cannot be formalized by rust's type system
Most of them. E.g., Rust's type system cannot prove that an index is within bounds of a std::vec::Vec.
 and what particularly is lacking in it's type system to be able to do that?
 ...
E.g., dependent types.
 If you have any links handy that can explain these concepts also that'd 
 be super appreciated.
 ...
https://softwarefoundations.cis.upenn.edu/
 Also, can D's type system become expressive enough without being too 
 crazy to solve the same?
 
In principle, yes. But I am not aiming for this at the moment. Also, people have different ideas about what "too crazy" means.
Dec 01 2019
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01.12.19 17:04, Timon Gehr wrote:
 
 If you have any links handy that can explain these concepts also 
 that'd be super appreciated.
 ...
https://softwarefoundations.cis.upenn.edu/
Of course there's also approaches based on program logics/contracts, e.g.: https://ethz.ch/content/dam/ethz/special-interest/infk/chair-program-method/pm/documents/Education/Courses/SS2017/Program%20Verification/08-Boogie.pdf https://www.pm.inf.ethz.ch/research/viper.html In such languages, assertion failures are compile-time errors.
Dec 01 2019
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 01.12.19 17:12, Timon Gehr wrote:
 On 01.12.19 17:04, Timon Gehr wrote:
 If you have any links handy that can explain these concepts also 
 that'd be super appreciated.
 ...
https://softwarefoundations.cis.upenn.edu/
Of course there's also approaches based on program logics/contracts, e.g.: https://ethz.ch/content/dam/ethz/special-interest/infk/chair-program-method/pm/documents/Education/Courses/SS2017/Program%20Verification/08-Boogie.pdf https://www.pm.inf.ethz.ch/research/viper.html In such languages, assertion failures are compile-time errors.
Relevant: http://pm.inf.ethz.ch/publications/getpdf.php?bibname=Own&id=AstrauskasMuellerPoliSummers19b.pdf
Dec 01 2019
parent aliak <something something.com> writes:
On Sunday, 1 December 2019 at 16:19:53 UTC, Timon Gehr wrote:
 On 01.12.19 17:12, Timon Gehr wrote:
 On 01.12.19 17:04, Timon Gehr wrote:
 If you have any links handy that can explain these concepts 
 also that'd be super appreciated.
 ...
https://softwarefoundations.cis.upenn.edu/
Of course there's also approaches based on program logics/contracts, e.g.: https://ethz.ch/content/dam/ethz/special-interest/infk/chair-program-method/pm/documents/Education/Courses/SS2017/Program%20Verification/08-Boogie.pdf https://www.pm.inf.ethz.ch/research/viper.html In such languages, assertion failures are compile-time errors.
Relevant: http://pm.inf.ethz.ch/publications/getpdf.php?bibname=Own&id=AstrauskasMuellerPoliSummers19b.pdf
Thanks Timon! Much appreciated. And there goes Christmas... Cheers, - Ali
Dec 02 2019
prev sibling parent reply victoroak <anyone one.com> writes:
On Tuesday, 26 November 2019 at 09:34:55 UTC, Walter Bright wrote:
 I understand that the salient difference between your scheme 
 and mine is you attach it to the pointer type while mine 
 attaches it to the function. Pretty much all of this discussion 
 is about consequences of this difference, so I don't think it 
 is necessary to go through it point-by-point agreeing with you 
 on those consequences.
I think another great difference is that in your proposal a pointer has ownership of the memory, but I think that structs should have ownership over resources and pointers should only be borrows (like references in Rust). This way is more useful and lets you track ownership over resources other than memory. Another point is that in Timon's proposal scope pointers can be members of structs, so you can have a safe slice that borrows from a container.
Nov 26 2019
parent Walter Bright <newshound2 digitalmars.com> writes:
On 11/26/2019 9:31 AM, victoroak wrote:
 I think another great difference is that in your proposal a pointer has 
 ownership of the memory but I think that structs should have ownership over 
 resources and pointers are only borrows (like references in Rust), this way is 
 more useful and lets you track ownership over other resources other than 
 memory.
 
 Another point is that on Timon's proposal scope pointers can be members of 
 structs so you can have a safe slice that borrows from a container.
That's the way both work.
Nov 26 2019
prev sibling next sibling parent mipri <mipri minimaltype.com> writes:
On Wednesday, 20 November 2019 at 04:59:37 UTC, Walter Bright 
wrote:
 https://github.com/dlang/dmd/pull/10586

 It's entirely opt-in by adding the `@live` attribute on a 
 function, and the implementation is pretty much contained in 
 one module, so it doesn't disrupt the rest of the compiler.
Here's a thing:

struct LinearFile {
    import std.stdio: File;
    protected File file;

    // [1]
    static void open(string name, void delegate(LinearFile*) @live f) {
        auto lf = LinearFile(File(name));
        f(&lf);
    }

    void close() { // [3]
        file.close();
    }
}

string readln(scope LinearFile* f) {
    return f.file.readln;
}

void live_close(LinearFile* f) {
    f.close;
}

void main() {
    import std.stdio: write;

    LinearFile.open("/etc/passwd", delegate(f) @live {
        write(f.readln);
        //f.close; // not an error to omit this [2]
    });

    void firstline(LinearFile* f) @live {
        write(f.readln);
        // f.close; // still dangling after this
        f.live_close;
    }
    LinearFile.open("/etc/passwd", &firstline);

    void better(LinearFile* f) @live { // [4]
        write(f.readln);
        destroy(f); // error to omit this
        // write(f.readln); // error to use f after destroy
    }
    LinearFile.open("/etc/passwd", &better);
}

The idea of course is to have a resource that's statically guaranteed to be cleaned up, by giving the caller an @live requirement to not leave a pointer dangling, and no non-borrowing function to pass the pointer to, except for the clean-up function.

With references:

1. I don't know how (or if it's possible) to require callers to be @live though, so instead I require that a passed delegate be @live.

2. @live doesn't seem to work with anonymous delegates though, as the f.close isn't required here.

3. Methods also seem to be non-borrowing, so the LinearFile.close method is a bug. Which isn't convenient for imports at least: import linear: LinearThing, close;

4. ...and it was only when I started to add that it was a shame that you can't just use destructors, that I thought to try them.
Nov 21 2019
prev sibling parent reply rikki cattermole <rikki cattermole.co.nz> writes:
After having some time to think about the messages here and what the 
current specification document says, I have gotten a bit concerned about 
the complexity involved in having multiple states a pointer can be in.

Alternatively what I have derived from these discussions is that what is 
being described is essentially a head const reference type tied to 
lifetime (i.e. DIP25).

With a head const type, I think that we might be able to tick all the 
boxes without much iteration:

- Non-null
- Always points to valid memory
- Pointer cannot be modified _anywhere_ (this includes dynamic array 
appending, but not subslicing)
- Not possible to free (can't & or cast to void*)
- Tied to a point in the stack at compile time, with a predetermined point 
in time for deallocation

I wrote up some code which basically acts as a proposal for it (using 
``const ref`` as syntax, but that is not important).

https://gist.github.com/rikkimax/4cb2cc8ddcac33c1a9bb20de432f9dea

Am I completely bonkers to think that this might be a workable solution?
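
As a rough illustration of some of the properties listed above, here is a minimal library-level sketch in current D. The type name HeadConst and its API are assumptions made for this example and are not taken from the linked gist. It covers only "non-null" and "cannot be rebound"; nothing here checks that the target outlives the reference, which is the part that would need compiler support.

```d
// Hypothetical library type; an assumption for illustration only.
struct HeadConst(T)
{
    private T* ptr;

    @disable this();                    // no default (null) instance
    @disable void opAssign(HeadConst);  // cannot be rebound after construction

    this(ref T target)
    {
        ptr = &target;  // taken from a ref parameter, so never null
    }

    // Access the pointee; only the "head" (the pointer itself) is fixed,
    // the pointee stays mutable.
    ref T get()
    {
        return *ptr;
    }
}

void main()
{
    int a = 1;
    auto r = HeadConst!int(a);
    r.get() = 5;               // the pointee can still be modified
    assert(a == 5);
    // r = HeadConst!int(a);   // would not compile: rebinding is disabled
    // HeadConst!int n;        // would not compile: no default construction
}
```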
Nov 27 2019
parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Wednesday, 27 November 2019 at 10:24:57 UTC, rikki cattermole 
wrote:
 Am I completely bonkers to think that this might be a workable 
 solution?
You need to ensure at least the following:

- nobody reads the pointer and puts it somewhere else
- nobody reads any pointer reachable from the root pointer and puts it 
somewhere else
- the root pointer does not point to any datatype that can be 
modified/extended with new pointers (like a dynamic array of pointers or a 
non-const pointer)

But that isn't enough, is it? You could have an integer id as a parameter that indexes a global array of pointers... so it also has to be pure.

To get to something useful the common assumption should be that you need lifetime-names/variables under the hood. No point in wasting time thinking otherwise... unless someone has written a paper on it.
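
The integer-id escape route, and why purity closes it, can be sketched in D roughly as follows. All names here (registry, leak, peek) are hypothetical and exist only for this illustration. Note that pure by itself only blocks access to mutable globals; a pure function can still write through mutable parameters, so this shows one necessary condition, not a complete solution.

```d
// Hypothetical names throughout; a sketch of escaping via an integer id.
int*[4] registry;   // mutable module-level (thread-local) state

// Not pure: free to stash the pointer in the global table.
void leak(int* p, size_t id)
{
    registry[id] = p;
}

// pure: cannot read or write `registry`, so that route is closed.
int peek(const(int)* p) pure
{
    return *p;
}

// The compiler rejects a pure function that touches the mutable global.
static assert(!__traits(compiles,
    (int* p, size_t id) pure { registry[id] = p; }));

void main()
{
    int x = 42;
    leak(&x, 0);                // compiles: the pointer escaped into a global
    assert(registry[0] is &x);
    assert(peek(&x) == 42);     // pure read, no access to the global table
}
```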
Nov 27 2019
parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Wednesday, 27 November 2019 at 11:12:55 UTC, Ola Fosheim 
Grøstad wrote:
 To get to something useful the common assumption should be that 
 you need lifetime-names/variables under the hood. No point in 
 wasting time thinking otherwise... unless someone has written a 
 paper on it.
That said, lifetime-names work OK with GC. The lifetime of GC objects lasts until the end of the program, so you can just mark it as "*" or "forever"...
Nov 27 2019
parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Wednesday, 27 November 2019 at 11:17:12 UTC, Ola Fosheim 
Grøstad wrote:
 That said, lifetime-names works ok with GC. The lifetime of GC 
 objects lasts until the end of the program, so you can just 
 mark it as "*", i.e. "forever"...
That assumes that you only use GC-scannable pointers of course!
Nov 27 2019
parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
FWIW, Microsoft is working on a new research language, Verona, that 
partitions groups of objects into regions that can each have their 
own memory management scheme. Only one thread can access a given 
region at a time, as I understand it.

https://www.slideshare.net/KTNUK/digital-security-by-design-security-and-legacy-at-microsoft-matthew-parkinson-microsoft

One of the people behind Pony (Sylvan Clebsch) is listed on one 
of the slides, and yeah, this approach has some commonalities 
with Pony, I guess.

Anyway, it is for interacting with C++ code.
Dec 02 2019
parent reply rikki cattermole <rikki cattermole.co.nz> writes:
On 03/12/2019 11:51 AM, Ola Fosheim Grøstad wrote:
 FWIW, Microsoft is working on a new research language Verona that 
 partition groups of objects into regions that can have their own memory 
 management scheme. Only one thread can access the same region at the 
 same time as I understand it.
 
 https://www.slideshare.net/KTNUK/digital-security-by-design-security-and-legacy-at-microsoft-matthew-parkinson-microsoft
 
 
 One of the people behind Pony (Sylvan Clebsch) is listed on one of the 
 slides, and yeah, this approach has some commonalities with Pony, I guess.
 
 Anyway, it is for interacting with C++ code.
I only just watched the talk a couple of hours ago.

There is one key feature that both of us share: object lifetimes are owned by a data structure.

Both of my examples in the code proposal I linked above are based on just this. Some sort of object (either a class or the doubly linked list's nodes) is owned by some sort of data structure (Scoped vs DoubleLinkedList).

I am not convinced about their memory region scheme; based on the talk it looks to be prone to data races with locks, which is worrying.

But so far I'm getting convinced that my idea isn't completely crazy.

I'm going to try and reach out and get a confirmation on how their memory region system works.
Dec 02 2019
parent reply Ola Fosheim Grostad <ola.fosheim.grostad gmail.com> writes:
On Monday, 2 December 2019 at 23:18:55 UTC, rikki cattermole 
wrote:
 I only just watched the talk a couple of hours ago.
Ah, then you have more info on this than me.
 There is one key feature that both of us share.
 Objects life times get owned by a data structure.

 In both of my examples in the code proposal I linked above were 
 based upon just this. Some sort of object (either a class or 
 the double linked lists nodes) is owned by some sort of data 
 structure (Scoped vs DoubleLinkedList).
Ok, maybe I misread the intent you were conveying. Since I haven't watched the talk yet, I don't know how they ensure the integrity. I would suspect they use the type system...
 I am not convinced about their memory region scheme, based upon 
 the talk it looks to be prone to data races with locks. Which 
 is worrying.
Is it possible that they intend to use a verifier that proves that deadlocks or starvation cannot happen, since they are designing a new language from scratch? Or put that on the programmer? (They rule out data races, as only one thread has access to the region.)
 But so far I'm getting convinced that my idea isn't completely 
 crazy.
Maybe write up something more detailed? I believe "separation logic" has been used for partitioning the heap into groups of objects, but I don't know how it works... Probably intricate.
 I'm going to try and reach out and get a confirmation on how 
 their memory region system works.
You could ask if they have pointers to papers, perhaps?
Dec 02 2019
parent reply rikki cattermole <rikki cattermole.co.nz> writes:
On 03/12/2019 7:35 PM, Ola Fosheim Grostad wrote:
 On Monday, 2 December 2019 at 23:18:55 UTC, rikki cattermole wrote:
 There is one key feature that both of us share.
 Objects life times get owned by a data structure.

 In both of my examples in the code proposal I linked above were based 
 upon just this. Some sort of object (either a class or the double 
 linked lists nodes) is owned by some sort of data structure (Scoped vs 
 DoubleLinkedList).
Ok, maybe I misread the intent you were conveying. Since I haven't watched the talk yet, I don't know how they ensure the integrity. I would suspect they use the type system...
It's not described in the talk. It was pretty light on the details we'd need.
 I am not convinced about their memory region scheme, based upon the 
 talk it looks to be prone to data races with locks. Which is worrying.
It is possible that they intend to use a verifier that prove that deadlocks or starvation cannot happen since they design a new language from scratch? Or put that on the programmer? ( they rule out dataraces as only one thread has access to the region)
They rule it out once you 'own' a region. I'm concerned about how they go about 'owning' it. E.g. locks end up with the same problem as if they were on individual objects.
 But so far I'm getting convinced that my idea isn't completely crazy.
Maybe write up something more detailed? I believe "separation logic" has been used for partitioning the heap into groups of objects, but I don't know how it works... Probably intricate.
I'm not the right person to do this. Lifetimes are a bit over my head. In my code example I hand-waved a bunch of details related to them. But basically it's a head const reference tied to the lifetime of the owning memory, e.g. a data structure. That guarantees no pointer modification at any point in time.
 I'm going to try and reach out and get a confirmation on how their 
 memory region system works.
You could ask if they have pointers to papers, perhaps?
They do, and the project is meant to be released in a couple of weeks. Right now they don't have a compiler, so yeah... Lots of wishy-washy theory atm.
Dec 02 2019
parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Tuesday, 3 December 2019 at 07:14:23 UTC, rikki cattermole 
wrote:
 Lifetime's are a bit over my head.
 In my code example I hand waved a bunch of details related to 
 it.
I found this master's thesis on adding lifetimes to Whiley very readable, explained in English and without a lot of formalisms:

http://homepages.ecs.vuw.ac.nz/~djp/files/MSCThesisSebastian.pdf

The difference between this and Rust seems to be that:

1. Rust now ends the lifetime at last use and not at the end of the scope.

2. Rust infers the lifetime variables when they have not been explicitly given, but that is a minor difference.
 But basically its a head const reference tied to the lifetime 
 of the owning memory e.g. a data structure. That guarantees no 
 pointer modification at any point in time.
Hm, I think I would need a more elaborate description to understand what it can express.
 They do and the project is meant to be released in a couple of 
 weeks.
 Right now they don't have a compiler, so yeah... Lots of theory 
 wishy washy atm.
So it was a Christmas teaser... :-) We will have to wait and see then, I guess.
Dec 03 2019