digitalmars.D.learn - Safe Usage of Mutable Ranges in foreach scopes

"Per Nordlöw" <per.nordlow gmail.com> writes:
I use a lot of file parsing looking like

import std.algorithm : splitter;
import std.conv : to;
import std.stdio : File;

alias T = double;
T[] values;

foreach (line; File(path).byLine)
{
     foreach (part; line.splitter(separator))
     {
         values ~= part.to!T;
     }
}

The key D thing here is that this is _both_ fast (because no 
copying of file-memory-slices needs to be performed) _and_ 
"high-level". However I'm currently lacking one feature here in 
D. Namely:

A way to qualify `line` to prevent it from escaping the scope of 
the current foreach iteration. And `part` should 
D-style-transitively inherit this property in the same scope as 
`line`.

This is because `line` is a reference to a volatile buffer whose 
contents change with every foreach iteration. That is a deliberate 
performance design choice which achieves "maximum D performance" 
by avoiding memory reallocation in each iteration of 
`File.byLine`.
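
To make the hazard concrete, here is a minimal sketch (not part of the original post) that contrasts storing the `line` slice directly with storing a copy made via `.idup`. The function names `collectEscaping` and `collectCopied` and the scratch file name are hypothetical and chosen only for illustration.

```d
import std.stdio : File, writeln;
static import std.file;

// Stores the slices handed out by byLine directly.
// BUG (intentional, for illustration): each `line` aliases a buffer that
// byLine may overwrite on the next iteration, so the stored contents are
// not reliable once the loop has moved on.
char[][] collectEscaping(string path)
{
    char[][] lines;
    foreach (line; File(path).byLine)
        lines ~= line;
    return lines;
}

// Copies each line before storing it, so the result stays valid.
string[] collectCopied(string path)
{
    string[] lines;
    foreach (line; File(path).byLine)
        lines ~= line.idup;
    return lines;
}

void main()
{
    enum path = "byline_demo.txt"; // hypothetical scratch file
    std.file.write(path, "alpha\nbeta\ngamma\n");
    scope (exit) std.file.remove(path);

    writeln(collectCopied(path));   // ["alpha", "beta", "gamma"]
    writeln(collectEscaping(path)); // three entries, contents not guaranteed
}
```

The compiler accepts both functions without complaint, which is exactly the gap a scope-restricting qualifier on `line` would be meant to close.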

Such a feature would make the usage of this pattern very (perhaps 
even absolutely) safe from a memory corruption point of view.

Could the scope keyword be used here?

Could the work done in DIP-25 be reused here, Walter?

Destroy!
May 08 2015
"Per Nordlöw" <per.nordlow gmail.com> writes:
On Friday, 8 May 2015 at 11:25:26 UTC, Per Nordlöw wrote:
 Such a feature would make the usage of this pattern very 
 (perhaps even absolutely) safe from a memory corruption point 
 of view.
Correction: not exactly from a memory corruption point of view, but rather to avoid logical bugs when parsing/decoding line-based streams of text.
May 08 2015
"Per Nordlöw" <per.nordlow gmail.com> writes:
On Friday, 8 May 2015 at 11:29:53 UTC, Per Nordlöw wrote:
 Such a feature would make the usage of this pattern very 
 (perhaps even absolutely) safe from a memory corruption point 
 of view.
I guess I should have posted this on digitalmars.D instead ...
May 08 2015
"Per Nordlöw" <per.nordlow gmail.com> writes:
On Friday, 8 May 2015 at 11:32:50 UTC, Per Nordlöw wrote:
 On Friday, 8 May 2015 at 11:29:53 UTC, Per Nordlöw wrote:
 Such a feature would make the usage of this pattern very 
 (perhaps even absolutely) safe from a memory corruption point 
 of view.
An alternative, non-restrictive (relaxed) possible solution here is to change `byLine` to instead return a reference-counted or GC-allocated object. Then, in each iteration, `ByLine.popFront()` checks whether the number of references to the internally stored line is >= 2 (including its own reference). If so, `ByLine.popFront()` allocates a new instance of the internally stored line and returns it in the new iteration (through the return value of `front()`).

I'm assuming this is not implemented. Is it possible to quickly query the number of references (and slices) of a GC-allocated object?

Destroy, once again!
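
For comparison, a minimal sketch (not part of the original post) of the simpler always-copy route that Phobos already offers through `File.byLineCopy`. This is not the copy-only-when-shared scheme proposed above: it allocates a fresh line on every iteration, whether or not the previous one was kept. The function name `readLinesCopied` and the scratch file name are hypothetical.

```d
import std.stdio : File, writeln;
static import std.file;

// Returns every line of `path` as an independently allocated string.
string[] readLinesCopied(string path)
{
    string[] lines;
    // byLineCopy yields a fresh `string` per line, so storing it is safe.
    foreach (line; File(path).byLineCopy)
        lines ~= line;
    return lines;
}

void main()
{
    enum path = "bylinecopy_demo.txt"; // hypothetical scratch file
    std.file.write(path, "1.5 2.5\n3.5\n");
    scope (exit) std.file.remove(path);

    writeln(readLinesCopied(path)); // ["1.5 2.5", "3.5"]
}
```

The trade-off is one allocation per line, which is the cost the proposal above tries to pay only when a line has actually escaped.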
May 08 2015
"Marc Schütz" <schuetzm gmx.net> writes:
On Friday, 8 May 2015 at 11:25:26 UTC, Per Nordlöw wrote:
 Could the scope keyword be used here?

 Could the work done in DIP-25 be reused here, Walter?
I had `scope!(const ...)` in my original proposal [1] to handle exactly this problem. The latest iteration doesn't have it as an explicit annotation anymore, but the functionality is still there in the way it interacts with `@safe` [2]. It's no longer opt-in, because it turned out that `byLine` is just a special case of a more general problem. This became clear during the discussion of RCArray/DIP25 [3].

[1] http://wiki.dlang.org/User:Schuetzm/scope#scope.21.28const_....29
[2]
[3] http://forum.dlang.org/thread/huspgmeupgobjubtsmfe@forum.dlang.org
May 09 2015