www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 9102] New: OutputRange should be ref parameter

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9102

           Summary: OutputRange should be ref parameter
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Phobos
        AssignedTo: nobody puremagic.com
        ReportedBy: zan77137 nifty.com



This code doesn't work:

----------------------
import std.algorithm, std.digest.crc;
void main()
{
    CRC32 crcA, crcB;
    const ubyte[] ary3 = [1,2,3,4];
    copy(ary3, crcA);
    copy(ary3, &crcB);
    assert(crcA.finish() == crcB.finish());
}
----------------------
The second parameter of copy is non-ref OutputRange. This will be broken by
receiving OutputRange which is a non-reference type.
Like std.range.put, it is necessary to make ref parameter.
Some similar cases appear elsewhere. (eg. formattedWrite

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Dec 01 2012
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9102


Nick Sabalausky <cbkbbejeap mailinator.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |cbkbbejeap mailinator.com



19:16:43 PST ---
I just hit this (with both copy and base64.decode) and it seems like a pretty
major usability problem with ranges. Should it be a higher severity?

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 19 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9102


Steven Schveighoffer <schveiguy yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |schveiguy yahoo.com



05:59:47 PST ---
This seems to be of a problem of CRC32, not ranges in general.

An output range passed by value must internally be a reference, or else there
is no point!

I propose we rename the put function to something else (ingest?), and create a
new CRC32OutputRange that wraps a pointer to a CRC32.  CRC32OutputRange.put
will throw if the reference is null.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 20 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9102




I think that the functions such as std.algorithm.copy or
std.format.formattedWrite should handle it definitely even if any OutputRange.
In addition, I think that the function to produce a result not to aim at though
given correct OutputRange is not good interface.

OutputRange does not have the rule except what is handed to std.range.put
function.
Therefore, OutputRange such as CRC32 is correct.
But, it is a big burden for all people defining OutputRange that define wrapper
such as CRC32OutputRange to all OutputRange that is value type having states.

Furthermore, OutputRange is intended to update a state by the data which it was
handed to, and it is necessary for most cases to update Range itself.
OutputRange should be handled with a reference type basically.

From these, I think that it is appropriate that a function to treat OutputRange
with ref parameter.
(std.range.put takes ref in the first argument as collateral evidence.)

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 21 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9102




08:54:46 PST ---


 From these, I think that it is appropriate that a function to treat OutputRange
 with ref parameter.
 (std.range.put takes ref in the first argument as collateral evidence.)
If functions accepting output ranges only took ranges via ref, then you have to make copies just to pass ranges to a function. For example: int[] array1 = [1,2,3,4]; int[] array2 = new int[](4); // must make a copy of array2 auto array3 = array2; copy(array1, array3); Note also, the slice case: int[] array4 = new int[](5); copy(array1, array4[1..5]); Such a call is not possible with ref qualifier. With CRC32, it is not a valid output range, because put only affects pass-by-value data. I know the template to check for output range simply checks if put can work with it, but I don't know if there is a valid way to check for this in a template. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Feb 21 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9102


Johannes Pfau <johannespfau gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |johannespfau gmail.com



PST ---
This problem was already mentioned in the review, however no good solution was
found. I don't think it's as simple as saying CRC32 is not a valid OutputRange.
I haven't found any public documentation stating that a OutputRange cannot be a
pure value type or some similar note.

The big question "What type of range is passed by ref/auto ref/value" has not
been answered in any way, phobos sometimes uses pass by value, pass by ref,....

If you're saying OutputRanges shouldn't be passed by ref you're also saying put
is wrong as it does take it's range parameter by ref. This needs some
discussion though, cause last time copy was considered to be wrong.


Anyway, for std.digest a good name for put would be update, that's what's used
in other libraries. But I don't know if we can still break the API.

A nicer solution is to prevent creating copies of this type. IIRC this can be
done with a disabled postblit. Creating a copy is dangerous in any case and we
even have some examples with comment "Do not pass by value". Creating bit
copies of hashes wasn't officially supported so it's not a real API breakage.
CRC32 could still be used with functions which do take OutputRanges by ref,
passing it to ref functions would work and CRC32OutputRange is not necessary,
as CRC32* can be used (think of new CRC32()...)

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 25 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9102




10:14:31 PST ---
I think citing put as justification is not exactly correct.  put is a basic
operation of an output range.  It is meant to be used basically as a method of
the output range, and is meant to modify it's owner.  Considering that methods
always ref their owners, this is a valid practice.

On the other hand, copy's output range is a target, not its owner.  Copy is an
algorithm.  There are valid use cases for having copy consume the output range,
and valid cases for having it not consume the output range (for those output
ranges that might choose to have partially value-like passing).

However, a type without a reference cannot really be a valid output range,
since an output range should not only operate on value-passed data.  In
essence, such a type is not an output range, but a reference to that type is. 
I don't think it's good practice to burden functions that USE output ranges
with the requirement that they figure out how to the parameter a valid output
range.  Such design is impossible for the function to get right.  Making the
wrong decision can critically affect the side effects, and destroy any
intentions of the output range's author.

The only exception I can think of is a range that intentionally throws away
data.  Like a > /dev/null type of redirect.

Actually, any type can be made into an output range if it has a put-like
function, even if it has no references, by using a pointer to an instance, or
creating an instance on the heap.  Perhaps a auto outputRange(string method,
T)(ref T t) function could be useful:

struct CRC32
{
   void ingest(ubyte[]) {...}
}

CRC32 crc;
copy(ary3, outputRange!"ingest"(crc));

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 25 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9102




PST ---
You're probably right with most of that. I'd still like to keep the name as
"put" to avoid API breakage and for consistency with the OOP API. Also
introducing a special reference type complicates the API, if we keep "put" the
pointer type can be used as that reference type.

Disabling the postblit on CRC32 is IMHO a quite nice solution. It should be
done anyway as in 99% if a CRC32 is passed to another function by value it's a
bug and a  disabled postblit would prevent this nicely. It should therefore
also prevent calling copy with a CRC32.

The only place where  disabling postblit should produce a different result than
renaming "put" should be if a function takes an OutputRange by ref, such as the
standalone put function. But I don't think that's a problem.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Feb 25 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9102




11:13:32 PST ---

 You're probably right with most of that. I'd still like to keep the name as
 "put" to avoid API breakage and for consistency with the OOP API. Also
 introducing a special reference type complicates the API, if we keep "put" the
 pointer type can be used as that reference type.
Yeah, this is true. It would be inventing an external type when a reference does just fine. Hopefully the compiler could inline this to be equivalent code. Also, functions which *do* take a ref are short-changed of being able use this as a valid output range. There's not a very good answer to this.
 Disabling the postblit on CRC32 is IMHO a quite nice solution. It should be
 done anyway as in 99% if a CRC32 is passed to another function by value it's a
 bug and a  disabled postblit would prevent this nicely. It should therefore
 also prevent calling copy with a CRC32.
Well, the problem with this is, you disable legitimate copies. For example, if I wanted to save the current state because the next few bytes passed in may not be valid. I don't know if this is a valid concern, but it seems like that is solving the problem by curing the symptoms. I don't really have a horse in this race as I don't use the given code, just offering my opinion.
 The only place where  disabling postblit should produce a different result than
 renaming "put" should be if a function takes an OutputRange by ref, such as the
 standalone put function. But I don't think that's a problem.
I don't think it would be different. A ref of a ref is still a ref ;) -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Feb 25 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9102




PST ---
 Well, the problem with this is, you disable legitimate copies.  For example, if
 I wanted to save the current state because the next few bytes passed in may not
 be valid.

 I don't know if this is a valid concern, but it seems like that is solving the
 problem by curing the symptoms.
Problem is that the std.digest interface doesn't guarantee that a trivial bit copy actually saves the state. This is deliberate to allow wrappers to OpenSSL or other C hash functions to be written as structs (Although it's questionable if those shouldn't be done as OOP API only). Also the risk of unwanted copies is high (e.g. when passing to a function). If a way to copy digests is needed we could always implement an explicit copy/save function for this and enforce all isDigest types to implement this. This would also work for wrapper types or types which have to keep any kind of internal reference. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Feb 25 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9102




10:08:36 PDT ---
FWIW, This problem is *not* specific to CRC32. I've run into the same problem
with Appender. In fact, the problem will occur with *any* struct-based
OutputRange that does not specifically go out of its way to work around this
problem.

Furthermore, it makes absolutely no sense to pass ranges by value anyway since
doing so amounts to an implicit, ad-hoc and error-prone recreation of calling
'.save' - which isn't even sensible for non-ForwardRanges anyway.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Mar 24 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9102




10:15:55 PDT ---
Well, the problem with this is, you disable legitimate copies.  For example, if
I wanted to save the current state because the next few bytes passed in may not
be valid.

I don't know if this is a valid concern, but it seems like that is solving
the problem by curing the symptoms.
It's not a valid concern. If a range's state even *can* be saved, it is supposed to be done via '.save'. Saving a range's state via copying is undefined and reliant on the range's internal implementation details. I'll grant that maybe saving a range *should* have been defined as being done by an ordinary copy, plus postblit if necessary, and that a non-saveable range should have been defined as having postblit disabled, but unfortunately that's just not how ranges have been defined. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Mar 24 2013
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=9102




PDT ---
I agree with Nick here. I'll open a pull request for std.digest next week which
will disable the postblit and add save methods to make the copies.

But AFAIK save hasn't been used with OutputRanges yet, we'll see what the pull
review will bring :-)

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Mar 24 2013