www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 8026] New: Fix or disallow randomShuffle() on fixed-sized arrays

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8026

           Summary: Fix or disallow randomShuffle() on fixed-sized arrays
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: bearophile_hugs eml.cc



This comes after a report by Vidar Wahlberg:
http://forum.dlang.org/thread/jnu1an$rjr$1 digitalmars.com

Several functions of std.algorithm don't work with fixed-sized arrays, and
require you to slice them first to turn them into ranges. But
std.random.randomShuffle() accepts fixed-sized arrays as well:


import std.stdio: writeln;
import std.random: randomShuffle;
void main() {
    int[6] data = [1, 2, 3, 4, 5, 6];
    randomShuffle(data);
    writeln(data);
}


This prints the unshuffled array:
[1, 2, 3, 4, 5, 6]


This is bug-prone, and in my opinion it's not acceptable.

I see two alternative solutions:
1) To make randomShuffle() properly support fixed-sized arrays, taking them by
reference;
2) To make randomShuffle() refuse a fixed-sized array at compile-time.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 03 2012
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8026


Jonathan M Davis <jmdavisProg gmx.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jmdavisProg gmx.com
           Severity|enhancement                 |normal



PDT ---
Well, that's certainly weird. Range-based functions don't normally take static
arrays, and I'd argue that they shouldn't, given the problems surrounding
slicing static arrays (it's fine to do it, but you need to be aware of what
you're doing) - though randomShuffle doesn't have the same problem as most
range-based functions do with static arrays given that it's void. Still, I'd
argue that it's probably better for it to require slicing like all the rest.

It looks like the problem is that randomShuffle doesn't have a template
constraint (obviously not good), so I very much doubt that it was ever intended
to work with static arrays without slicing. uninformDistribution (which is
right above randomShuffle) appears to have the same problem.

I'd say that this is definitely a bug, not an enhancement. If I remember, I'll
probably throw together a pull request for it tonight, since it should be a
quick fix.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 03 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8026




PDT ---
https://github.com/D-Programming-Language/phobos/pull/565

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 03 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8026


Vidar Wahlberg <canidae exent.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |canidae exent.net



---

 Well, that's certainly weird. Range-based functions don't normally take static
 arrays, and I'd argue that they shouldn't, given the problems surrounding
 slicing static arrays (it's fine to do it, but you need to be aware of what
 you're doing) - though randomShuffle doesn't have the same problem as most
 range-based functions do with static arrays given that it's void. Still, I'd
 argue that it's probably better for it to require slicing like all the rest.
Are the problems surrounding slicing static arrays easily explainable? From my point of view (fairly new to the language) it's confusing that you can pass a dynamic array to a Range-based function, while you'll need to slice a static array if you want to pass that data to a such function. To me it seems more user friendly if you could pass even static arrays to such method, but I presume there's a good reason why this is avoided. I haven't quite managed to figure out this yet, any pointers would be appreciated. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
May 11 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8026




PDT ---
http://stackoverflow.com/questions/8873265/is-a-static-array-a-forward-range

A static array is a value type. It owns its own memory. It's a container, not a
range.

A dynamic array is a reference type. It does _not_ own its own memory (the
runtime does). It is _not_ a container. It _is_ a range.

If I do

int[] func()
{
    auto arr = [1, 2, 3, 4, 5];
    return find(arr, 3);
}

there is _zero_ difference between passing arr and passing arr[]. In either
case, you're slicing the whole array. And the array being returned is a slice
of arr ([3, 4, 5] to be exact) whose memory is on the heap, owned by the
runtime.

If I do

int[] func()
{
    int[5] arr = [1, 2, 3, 4, 5];
    return find(arr[], 3);
}

I've now allocated a static array _on the stack_. Passing arr to a function
would copy it (since it's a value type). Passing arr[] to a function would
slice it, which means passing a reference to the static array. Now look at what
func returns. It's returning a slice of arr, which is a _local variable_ _on
the stack_. What you've done is the equivalent of

int* func()
{
    int i;
    return &i;
}

You've returned a reference/pointer to a local variable which no longer exists.
Bad things will happen if you do this. So, the semantics of dealing with
dynamic and static arrays are _very_ different. Slicing a static array in the
wrong place can lead to major bugs, whereas slicing a dynamic array is
perfectly safe.


Now, as to range-based functions in general. They're templated functions. That
means that they use IFTI (implicit function template instantiation) - i.e. it
infers the type. So, if you have

auto func(T)(T foo)
{
    ...
}

int[] dArr;
int[5] sArr;

foo(dArr);
foo(sArr);

T is going to be inferred as the _exact type_ that you pass in. So, in the
first case, func gets instantiated as

auto func(int[] foo)
{
    ...
}

whereas in the second, it gets instantiated as

auto func(int[5] foo)
{
    ...
}

int[] is a range, int[5] is not. So, if func is a range-based function, it
should have a template constraint on it, and the static array will fail that
constraint.

auto func(T)(T foo)
    if(isInputRange!T)
{
    ...
}

Static arrays are _not_ ranges and _cannot_ be ranges. At their must basic
level (the input range), ranges must implement empty, front, and popFront, and
static arrays _cannot_ implement popFront, because you cannot change their
length. The _only_ way that a static array can be passed to a range-based
function is if it's sliced so that you have a dynamic array (which _is_ a
range). And as templates take the _exact_ type, no templated function will
automatically slice a static array for you. The language _could_ be changed so
that IFTI treated static arrays as dynamic arrays and automatically sliced
them, but then you'd have problems creating templated functions that actually
wanted to take static arrays rather than dynamic ones.

You _could_ overload every range-based function with an overload specifically
for dynamic arrays (with the idea that the static array would be sliced when
it's passed to it as happens with non-templated functions which take dynamic
arrays), but that would be a royal pain. It would also significantly increase
the risk of using static arrays, because you'd end up returning ranges which
reference the static array from whatever range-based function you called, and
the fact that you're dealing with a static array could very quickly become
buried (easily resulting in returning a range to a static array which then no
longer exists, because it was a local variable). It's much better to require
that the programmer explicitly slice the static array, because then they know
that they're slicing it and then can know that they have to watch to make sure
that no reference to that static array escapes.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 11 2012
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8026




Commits pushed to master at https://github.com/D-Programming-Language/phobos

https://github.com/D-Programming-Language/phobos/commit/76def7af856dfa898b78c0cf67228dac87dd4d83


I added template constraints to all of the templates missing template
constraints in std.random - including randomShuffle, per the bug.

https://github.com/D-Programming-Language/phobos/commit/28bca62304c0ec21a9062b4dc4fcb64b80c1c5d8


Fix issue 8026

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 20 2012
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8026


bearophile_hugs eml.cc changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED



I think few Phobos functions (like walkLength) should accept fixed-sized arrays
too, but now randomShuffle() statically refuses fixed sized arrays and this is
acceptable. So I close this bug report.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
May 22 2012