www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Vote for the new std.hash (oops, std.digest)

reply "Dmitry Olshansky" <dmitry.olsh gmail.com> writes:
The discussion around new unified API for digest hash functions 
has subdued, just in time as the review period has ended.

The voting for std.digest package starts today and ends on 29 
August.

Rules are simple: reply in this thread with definite "YES" or 
"NO" on whether this package should go into Phobos. Including 
descriptive short notes is appreciated (esp. the NO ones).


Latest locations of docs and source:

Code: (location changed!)
https://github.com/jpf91/phobos/tree/newHash/std/digest
https://github.com/jpf91/phobos/compare/master...newHash

Docs: (location changed!)
http://dl.dropbox.com/u/24218791/d/phobos/std_digest_digest.html
http://dl.dropbox.com/u/24218791/d/phobos/std_digest_md.html
http://dl.dropbox.com/u/24218791/d/phobos/std_digest_sha.html
http://dl.dropbox.com/u/24218791/d/phobos/std_digest_crc.html
Aug 22 2012
next sibling parent Jens Mueller <jens.k.mueller gmx.de> writes:
Dmitry Olshansky wrote:
 The discussion around new unified API for digest hash functions has
 subdued, just in time as the review period has ended.
 
 The voting for std.digest package starts today and ends on 29
 August.
 
 Rules are simple: reply in this thread with definite "YES" or "NO"
 on whether this package should go into Phobos. Including descriptive
 short notes is appreciated (esp. the NO ones).
 
 
 Latest locations of docs and source:
 
 Code: (location changed!)
 https://github.com/jpf91/phobos/tree/newHash/std/digest
 https://github.com/jpf91/phobos/compare/master...newHash
 
 Docs: (location changed!)
 http://dl.dropbox.com/u/24218791/d/phobos/std_digest_digest.html
 http://dl.dropbox.com/u/24218791/d/phobos/std_digest_md.html
 http://dl.dropbox.com/u/24218791/d/phobos/std_digest_sha.html
 http://dl.dropbox.com/u/24218791/d/phobos/std_digest_crc.html
 
YES. I like it. The API is solid and future-proof for more digesting. Many thanks Johannes. Jens
Aug 23 2012
prev sibling next sibling parent reply "d_follower" <d_follower fakemail.com> writes:
On Wednesday, 22 August 2012 at 12:36:08 UTC, Dmitry Olshansky 
wrote:
 The discussion around new unified API for digest hash functions 
 has subdued, just in time as the review period has ended.

 The voting for std.digest package starts today and ends on 29 
 August.

 Rules are simple: reply in this thread with definite "YES" or 
 "NO" on whether this package should go into Phobos. Including 
 descriptive short notes is appreciated (esp. the NO ones).


 Latest locations of docs and source:

 Code: (location changed!)
 https://github.com/jpf91/phobos/tree/newHash/std/digest
 https://github.com/jpf91/phobos/compare/master...newHash

 Docs: (location changed!)
 http://dl.dropbox.com/u/24218791/d/phobos/std_digest_digest.html
 http://dl.dropbox.com/u/24218791/d/phobos/std_digest_md.html
 http://dl.dropbox.com/u/24218791/d/phobos/std_digest_sha.html
 http://dl.dropbox.com/u/24218791/d/phobos/std_digest_crc.html
Can all the algorithms work at compile time (given immutable constant data, e.g. a string) ? YES : NO. Hope that's descriptive.
Aug 23 2012
parent reply Jens Mueller <jens.k.mueller gmx.de> writes:
d_follower wrote:
 On Wednesday, 22 August 2012 at 12:36:08 UTC, Dmitry Olshansky
 wrote:
The discussion around new unified API for digest hash functions
has subdued, just in time as the review period has ended.

The voting for std.digest package starts today and ends on 29
August.

Rules are simple: reply in this thread with definite "YES" or "NO"
on whether this package should go into Phobos. Including
descriptive short notes is appreciated (esp. the NO ones).


Latest locations of docs and source:

Code: (location changed!)
https://github.com/jpf91/phobos/tree/newHash/std/digest
https://github.com/jpf91/phobos/compare/master...newHash

Docs: (location changed!)
http://dl.dropbox.com/u/24218791/d/phobos/std_digest_digest.html
http://dl.dropbox.com/u/24218791/d/phobos/std_digest_md.html
http://dl.dropbox.com/u/24218791/d/phobos/std_digest_sha.html
http://dl.dropbox.com/u/24218791/d/phobos/std_digest_crc.html
Can all the algorithms work at compile time (given immutable constant data, e.g. a string) ? YES : NO.
It says "Digests do not work in CTFE". Just checked it for MD5. I do not know but I think this is just a current limitation of the CTFE implementation. Jens
Aug 23 2012
parent reply Johannes Pfau <nospam example.com> writes:
Am Thu, 23 Aug 2012 22:46:35 +0200
schrieb Jens Mueller <jens.k.mueller gmx.de>:

 
 It says "Digests do not work in CTFE".
 Just checked it for MD5.
 I do not know but I think this is just a current limitation of the
 CTFE implementation.
It's possible to support CTFE, Piotr Szturmaj has some digests which work in CTFE. But it's difficult as everything which depends on endianness isn't supported in CTFE. https://github.com/pszturmaj/phobos/commit/d06c258b442c5b59ab3a66125c9aea8a4c00a0b7 take for example the setByte function: a[offset / E.sizeof] |= value << ((offset % E.sizeof) * 8); This is necessary because you can't cast from uint to ubyte[4] in CTFE.
Aug 24 2012
next sibling parent Jens Mueller <jens.k.mueller gmx.de> writes:
Johannes Pfau wrote:
 Am Thu, 23 Aug 2012 22:46:35 +0200
 schrieb Jens Mueller <jens.k.mueller gmx.de>:
 
 
 It says "Digests do not work in CTFE".
 Just checked it for MD5.
 I do not know but I think this is just a current limitation of the
 CTFE implementation.
It's possible to support CTFE, Piotr Szturmaj has some digests which work in CTFE. But it's difficult as everything which depends on endianness isn't supported in CTFE. https://github.com/pszturmaj/phobos/commit/d06c258b442c5b59ab3a66125c9aea8a4c00a0b7 take for example the setByte function: a[offset / E.sizeof] |= value << ((offset % E.sizeof) * 8); This is necessary because you can't cast from uint to ubyte[4] in CTFE.
I see. Though I do not understand that limitation of CTFE. Since D has version(LittleEndian) and version(BigEndian) and CTFE should follow versioning I see no reason why casts that depend on endianness should not be supported. When writing a function you have to care about endianness anyway. Jens
Aug 24 2012
prev sibling parent reply "d_follower" <d_follower fakemail.com> writes:
On Friday, 24 August 2012 at 08:18:07 UTC, Johannes Pfau wrote:
 Am Thu, 23 Aug 2012 22:46:35 +0200
 schrieb Jens Mueller <jens.k.mueller gmx.de>:

 
 It says "Digests do not work in CTFE".
 Just checked it for MD5.
 I do not know but I think this is just a current limitation of 
 the
 CTFE implementation.
It's possible to support CTFE, Piotr Szturmaj has some digests which work in CTFE. But it's difficult as everything which depends on endianness isn't supported in CTFE.
Writing a library which is good enough to belong to std IS difficult. With D's incredibly powerful compile-time capabilities, I'm 100% many people will wonder why such a pure functionality as digest doesn't work with CTFE. As long as the interface itself allows future CTFE support, and the work is planned that's a YES from me. But I'd like to see at least one algorithm being implemented (e.g. CRC32 which currently works with CTFE) as a proof of concept (and a unit-test).
Aug 24 2012
parent Johannes Pfau <nospam example.com> writes:
Am Fri, 24 Aug 2012 17:49:00 +0200
schrieb "d_follower" <d_follower fakemail.com>:

 
 But I'd like to see at least one algorithm being implemented 
 (e.g. CRC32 which currently works with CTFE) as a proof of 
 concept (and a unit-test).
Proof of concept: http://dpaste.dzfl.pl/b14e8aad As you can see the templated digest and toHexString functions already work fine in CTFE. Please note that it can only work with data which can be converted to ubyte[] in CTFE (so char[] works, but wchar[] probably not). The changes for CRC32 where quite simple, replacing "nativeToLittleEndian" with version(BigEndian) and bswap and using the getByte function to convert from uint to ubyte[4]. I won't include this code now though, as the review is almost finished. I'll post an additional pull request instead. However, I also have to be sure what getByte does is actually supported and not just some oversight in CTFE.
Aug 27 2012
prev sibling next sibling parent "Jesse Phillips" <Jessekphillips+D gmail.com> writes:
On Wednesday, 22 August 2012 at 12:36:08 UTC, Dmitry Olshansky 
wrote:
 The discussion around new unified API for digest hash functions 
 has subdued, just in time as the review period has ended.
Yes.
Aug 23 2012
prev sibling next sibling parent "Bernard Helyer" <b.helyer gmail.com> writes:
Yes.

The API seems fairly solid to me, and the
need for these things is fairly wide reaching.
Aug 23 2012
prev sibling next sibling parent =?UTF-8?B?U8O2bmtlIEx1ZHdpZw==?= <sludwig outerproduct.org> writes:
Yes
Aug 24 2012
prev sibling next sibling parent reply "Oleg Kuporosov" <Oleg.Kuporosov gmail.com> writes:
Yes
---
Oleg
Aug 25 2012
parent reply Manu <turkeyman gmail.com> writes:
Looks good, though one thing annoys me as always throughout the D docs,
liberal use of auto can make them very difficult to understand.

auto result = hash.finish();

From the examples where this appears, I have absolutely no idea what
'result' could possibly be and what I can do with it, and you're forcing me to go and dig further for that information (waste of time). Surely there would be no harm in just writing the type there for clarity? <rant> I'd personally like to see auto abolished, or at least heavily discouraged for the sake of clarity from code examples throughout the docs. I'm constantly having to chase up what auto's may resolve to when reading examples >_< You may argue this demonstrated un-idiomatic code, and my trouble is due to inexperience; I ask, who is most likely to be reading docs? On 26 August 2012 04:54, Oleg Kuporosov <Oleg.Kuporosov gmail.com> wrote:
 Yes
 ---
 Oleg
Aug 26 2012
next sibling parent reply "Mike James" <foo bar.com> writes:
	charset="UTF-8"
Content-Transfer-Encoding: quoted-printable


"Manu" <turkeyman gmail.com> wrote in message =
news:mailman.1410.1345976415.31962.digitalmars-d puremagic.com...
Looks good, though one thing annoys me as always throughout the D docs, =
liberal use of auto can make them very difficult to understand.=20
auto result =3D hash.finish();From the examples where this appears, I =
have absolutely no idea what 'result' could possibly be and what I can =
do with it, and you're forcing me to go and dig further for that =
information (waste of time).
Surely there would be no harm in just writing the type there for =
clarity?
<rant>
I'd personally like to see auto abolished, or at least heavily =
discouraged for the sake of clarity from code examples throughout the =
docs. I'm constantly having to chase up what auto's may resolve to when =
reading examples >_<
You may argue this demonstrated un-idiomatic code, and my trouble is due =
to inexperience; I ask, who is most likely to be reading docs?



On 26 August 2012 04:54, Oleg Kuporosov <Oleg.Kuporosov gmail.com> =
wrote:


  Yes
  ---
  Oleg




+1.

The documentation should be there for clarity.

-=3Dmike=3D-
Aug 26 2012
next sibling parent reply "Craig Dillabaugh" <cdillaba cg.scs.carleton.ca> writes:
On Sunday, 26 August 2012 at 10:32:37 UTC, Mike James wrote:
 "Manu" <turkeyman gmail.com> wrote in message 
 news:mailman.1410.1345976415.31962.digitalmars-d puremagic.com...
 Looks good, though one thing annoys me as always throughout the 
 D docs, liberal use of auto can make them very difficult to 
 understand.
 auto result = hash.finish();From the examples where this 
 appears, I have absolutely no idea what 'result' could possibly 
 be and what I can do with it, and you're forcing me to go and 
 dig further for that information (waste of time).
 Surely there would be no harm in just writing the type there 
 for clarity?
 <rant>
 I'd personally like to see auto abolished, or at least heavily 
 discouraged for the sake of clarity from code examples 
 throughout the docs. I'm constantly having to chase up what 
 auto's may resolve to when reading examples >_<
 You may argue this demonstrated un-idiomatic code, and my 
 trouble is due to inexperience; I ask, who is most likely to be 
 reading docs?



 On 26 August 2012 04:54, Oleg Kuporosov 
 <Oleg.Kuporosov gmail.com> wrote:


   Yes
   ---
   Oleg




 +1.

 The documentation should be there for clarity.

 -=mike=-
+2 As a very inexperienced D user, I find the use of auto in the documentation frustrating too. Cheers, Craig
Aug 26 2012
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/26/12 8:35 AM, Craig Dillabaugh wrote:
 On Sunday, 26 August 2012 at 10:32:37 UTC, Mike James wrote:
 +2
 As a very inexperienced D user, I find the use of auto in the
 documentation frustrating too.

 Cheers,

 Craig
I'm torn on this. The arguments make sense; on the other hand, people will in all likelihood write their own code in the style promoted by the doc examples. How about this - use auto for code samples, but not for documenting function return types (except Voldemort)? Andrei
Aug 26 2012
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 I'm torn on this. The arguments make sense; on the other hand, 
 people will in all likelihood write their own code in the style 
 promoted by the doc examples.

 How about this - use auto for code samples, but not for 
 documenting function return types (except Voldemort)?
Generally I'd like the documentation examples to not use auto, because this makes the examples more clear and less easy to misunderstand. Programmers are lazy, so I think D programmers learn quickly how much handy 'auto' is. So I think the risk you refer to of them not using auto in their code is low. Teachers know that usually you have to teach people the less handy things, because students learn the lazy things more quickly. Example: in D.learn someone was saying that adding the immutable/const/pure/nothrow annotations is boring. In my code I use often auto for variable defined inside functions, but usually I write the real return type of functions, because this helps me understand better the code I am working one, it's more self-documenting. Even in Haskell, where there is a powerful global type inferencer, you usually write down the types of inputs and outputs of functions, for similar reasons. Bye, bearophile
Aug 26 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, August 26, 2012 10:14:14 Andrei Alexandrescu wrote:
 How about this - use auto for code samples, but not for documenting
 function return types (except Voldemort)?
I tend to agree with this. If a function needs to return auto in order to be reasonable (e.g. it has to because of Voldemort types, it's return type is too complicated to _not_ be auto, or its return type varies), then it should return auto. Otherwise, we should probably be explicit about the return type. However, I think that the code samples should still generally use auto, because that's the style that we think should be used in actual code. - Jonathan M Davis
Aug 26 2012
prev sibling next sibling parent reply David Gileadi <gileadis NSPMgmail.com> writes:
On 8/26/12 7:14 AM, Andrei Alexandrescu wrote:
 How about this - use auto for code samples, but not for documenting
 function return types (except Voldemort)?
Pie-in-the-sky dream: DDOC would advance enough to show a hover hint over the "auto" keyword with the computed type.
Aug 27 2012
parent Danni Coy <danni.coy gmail.com> writes:
On Tue, Aug 28, 2012 at 1:19 AM, David Gileadi <gileadis nspmgmail.com>wrote:

 On 8/26/12 7:14 AM, Andrei Alexandrescu wrote:

 How about this - use auto for code samples, but not for documenting
 function return types (except Voldemort)?
Pie-in-the-sky dream: DDOC would advance enough to show a hover hint over the "auto" keyword with the computed type.
I was just thinking the same thing but not sure what would happen with printed versions of the documentation (that would be a solveable problem though I think).
Aug 28 2012
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 8/26/2012 7:14 AM, Andrei Alexandrescu wrote:
 On 8/26/12 8:35 AM, Craig Dillabaugh wrote:
 On Sunday, 26 August 2012 at 10:32:37 UTC, Mike James wrote:
 +2
 As a very inexperienced D user, I find the use of auto in the
 documentation frustrating too.

 Cheers,

 Craig
I'm torn on this. The arguments make sense; on the other hand, people will in all likelihood write their own code in the style promoted by the doc examples. How about this - use auto for code samples, but not for documenting function return types (except Voldemort)?
I think there need to be explicit types in the doc so people can click on a link to them and see what they are.
Aug 28 2012
parent Jacob Carlborg <doob me.com> writes:
On 2012-08-28 23:29, Walter Bright wrote:

 I think there need to be explicit types in the doc so people can click
 on a link to them and see what they are.
I should work with "auto" as well. -- /Jacob Carlborg
Aug 28 2012
prev sibling parent "Gary Willoughby" <dev kalekold.net> writes:
On Sunday, 26 August 2012 at 10:32:37 UTC, Mike James wrote:
 "Manu" <turkeyman gmail.com> wrote in message 
 news:mailman.1410.1345976415.31962.digitalmars-d puremagic.com...
 Looks good, though one thing annoys me as always throughout the 
 D docs, liberal use of auto can make them very difficult to 
 understand.
 auto result = hash.finish();From the examples where this 
 appears, I have absolutely no idea what 'result' could possibly 
 be and what I can do with it, and you're forcing me to go and 
 dig further for that information (waste of time).
 Surely there would be no harm in just writing the type there 
 for clarity?
I also find this very annoying. I'm trying to learn what types different methods return but i have no idea what is beneath an auto type. Explicit types would aid learning so much more! I understand the need for the auto keyword as a shortcut for productivity but you need to understand the underlying type before you're really comfortable with auto. I would like to see all examples in the language reference using explicit types purely for clarity.
Aug 28 2012
prev sibling next sibling parent reply Johannes Pfau <nospam example.com> writes:
Am Sun, 26 Aug 2012 13:19:35 +0300
schrieb Manu <turkeyman gmail.com>:

 Looks good, though one thing annoys me as always throughout the D
 docs, liberal use of auto can make them very difficult to understand.
 
 auto result = hash.finish();
 
From the examples where this appears, I have absolutely no idea what
'result' could possibly be and what I can do with it, and you're forcing me to go and dig further for that information (waste of time). Surely there would be no harm in just writing the type there for clarity? <rant> I'd personally like to see auto abolished, or at least heavily discouraged for the sake of clarity from code examples throughout the docs. I'm constantly having to chase up what auto's may resolve to when reading examples >_< You may argue this demonstrated un-idiomatic code, and my trouble is due to inexperience; I ask, who is most likely to be reading docs?
OK, I'll fix this if std.digest is accepted into phobos. However, in this example auto is not just used for convenience, it has an actual use case: As documented here http://dl.dropbox.com/u/24218791/d/phobos/std_digest_digest.html#ExampleDigest (finish method) the type returned by finish can differ between different hash implementations. It's always a static array of ubyte, but for crc it's ubyte[4] for sha1 it's ubyte[20] and for md5 it's ubyte[16]. So in templated methods, auto is more generic than hardcoding a type. The equivalent to auto is using digestType!Digest, so auto result = hash.finish(); digestType!(typeof(hash)) result = hash.finish(); are equivalent. But ubyte[4] result = hash.finish(); does only work if hash is a CRC32 digest. It does not work for md5, sha1 digests, etc. But I agree that examples are easier to read without auto, so I'll change them to use the explicit type where possible.
Aug 26 2012
next sibling parent reply Johannes Pfau <nospam example.com> writes:
Am Sun, 26 Aug 2012 15:43:48 +0200
schrieb Johannes Pfau <nospam example.com>:

 Am Sun, 26 Aug 2012 13:19:35 +0300
 schrieb Manu <turkeyman gmail.com>:
 
 Looks good, though one thing annoys me as always throughout the D
 docs, liberal use of auto can make them very difficult to
 understand.
 
 auto result = hash.finish();
 
From the examples where this appears, I have absolutely no idea
what
'result' could possibly be and what I can do with it, and you're forcing me to go and dig further for that information (waste of time). Surely there would be no harm in just writing the type there for clarity? <rant> I'd personally like to see auto abolished, or at least heavily discouraged for the sake of clarity from code examples throughout the docs. I'm constantly having to chase up what auto's may resolve to when reading examples >_< You may argue this demonstrated un-idiomatic code, and my trouble is due to inexperience; I ask, who is most likely to be reading docs?
OK, I'll fix this if std.digest is accepted into phobos.
Seems some people would rather want to keep auto in the examples. I'm not sure what to do about this. I never used auto return types, so the return type can be looked up in the documentation but it's of course not obvious by just reading the examples.
Aug 26 2012
next sibling parent reply "nazriel" <spam dzfl.pl> writes:
Yes.

----
Yea, I agree that docs shouldn't use auto. It's real pain in ...
I am fine with auto in code examples but in function/method 
description the returned type should be clearly pointed out.
Aug 26 2012
parent "Bernard Helyer" <b.helyer gmail.com> writes:
 Yea, I agree that docs shouldn't use auto. It's real pain in ...
 I am fine with auto in code examples but in function/method 
 description the returned type should be clearly pointed out.
I personally only try to use auto where the type is unambiguous. For example, if I'm calling a function in the same module, I'll still write the type out -- a little work for more clarity. Where auto shines of course is redundant type specification.
Aug 26 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/26/12 1:32 PM, Johannes Pfau wrote:
 Seems some people would rather want to keep auto in the examples. I'm
 not sure what to do about this. I never used auto return types, so the
 return type can be looked up in the documentation but it's of course
 not obvious by just reading the examples.
After looking through the docs, it seems examples are in good shape as they are. Andrei
Aug 26 2012
parent Manu <turkeyman gmail.com> writes:
On 27 August 2012 02:06, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org>wrote:

 On 8/26/12 1:32 PM, Johannes Pfau wrote:

 Seems some people would rather want to keep auto in the examples. I'm
 not sure what to do about this. I never used auto return types, so the
 return type can be looked up in the documentation but it's of course
 not obvious by just reading the examples.
After looking through the docs, it seems examples are in good shape as they are.
Assuming it's clear at a glance that it returns byte[]s. Also, are they binary, hex string? base32? :)
Aug 26 2012
prev sibling parent reply Manu <turkeyman gmail.com> writes:
On 26 August 2012 16:43, Johannes Pfau <nospam example.com> wrote:

 Am Sun, 26 Aug 2012 13:19:35 +0300
 schrieb Manu <turkeyman gmail.com>:

 Looks good, though one thing annoys me as always throughout the D
 docs, liberal use of auto can make them very difficult to understand.

 auto result = hash.finish();

From the examples where this appears, I have absolutely no idea what
'result' could possibly be and what I can do with it, and you're forcing me to go and dig further for that information (waste of time). Surely there would be no harm in just writing the type there for clarity? <rant> I'd personally like to see auto abolished, or at least heavily discouraged for the sake of clarity from code examples throughout the docs. I'm constantly having to chase up what auto's may resolve to when reading examples >_< You may argue this demonstrated un-idiomatic code, and my trouble is due to inexperience; I ask, who is most likely to be reading docs?
OK, I'll fix this if std.digest is accepted into phobos. However, in this example auto is not just used for convenience, it has an actual use case: As documented here http://dl.dropbox.com/u/24218791/d/phobos/std_digest_digest.html#ExampleDigest (finish method) the type returned by finish can differ between different hash implementations. It's always a static array of ubyte, but for crc it's ubyte[4] for sha1 it's ubyte[20] and for md5 it's ubyte[16]. So in templated methods, auto is more generic than hardcoding a type. The equivalent to auto is using digestType!Digest, so auto result = hash.finish(); digestType!(typeof(hash)) result = hash.finish(); are equivalent. But ubyte[4] result = hash.finish(); does only work if hash is a CRC32 digest. It does not work for md5, sha1 digests, etc. But I agree that examples are easier to read without auto, so I'll change them to use the explicit type where possible.
Case in point, look how much information you just gave us that is completely obscured by 'auto' :) My suggestion, simplify the example, or, at least the example associated with the direct function documentation: auto md5 = digest!MD5( "The quick brown fox jumps over the lazy dog"); auto sha1 = digest!SHA1( "The quick brown fox jumps over the lazy dog"); auto crc32 = digest!CRC32("The quick brown fox jumps over the lazy dog"); Those autos can be ubyte[n]. Though that said, looking at the API doc for finish(), it appears to return a dynamic array, not a static array as you say... so ubyte[] could be used everywhere, and that would be nice and clear?
Aug 26 2012
parent Johannes Pfau <nospam example.com> writes:
Am Mon, 27 Aug 2012 01:49:12 +0300
schrieb Manu <turkeyman gmail.com>:

 Though that said, looking at the API doc for finish(), it appears to
 return a dynamic array, not a static array as you say... so ubyte[]
 could be used everywhere, and that would be nice and clear?
 
That's the Digest / OOP APIs finish method. As all finish methods in the OOP API must have the same return type so ubyte[] is used. The drawback is that the data is allocated using the GC unless a buffer was explicitly passed to finish. The template API uses ubyte[n] and can therefore return the value using stack space and no allocations. I made another pass through the docs and replaced auto where possible. I also added another simple example.
Aug 27 2012
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2012-08-26 12:19, Manu wrote:
 Looks good, though one thing annoys me as always throughout the D docs,
 liberal use of auto can make them very difficult to understand.

 auto  result = hash.finish();

  From the examples where this appears, I have absolutely no idea what
 'result' could possibly be and what I can do with it, and you're forcing
 me to go and dig further for that information (waste of time).
 Surely there would be no harm in just writing the type there for clarity?

 <rant>
 I'd personally like to see auto abolished, or at least heavily
 discouraged for the sake of clarity from code examples throughout the
 docs. I'm constantly having to chase up what auto's may resolve to when
 reading examples >_<
 You may argue this demonstrated un-idiomatic code, and my trouble is due
 to inexperience; I ask, who is most likely to be reading docs?
I think that in general "auto" shouldn't be used. In some case auto can be used, i.e. where you don't care/don't need to know the actual type and using duck typing instead. If we compare this with OO it would be like using an interface instead of the actual type. I guess in std.algorithm it would be ok to use "auto". -- /Jacob Carlborg
Aug 29 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/22/12 8:36 AM, Dmitry Olshansky wrote:
 The discussion around new unified API for digest hash functions has
 subdued, just in time as the review period has ended.

 The voting for std.digest package starts today and ends on 29 August.

 Rules are simple: reply in this thread with definite "YES" or "NO" on
 whether this package should go into Phobos. Including descriptive short
 notes is appreciated (esp. the NO ones).


 Latest locations of docs and source:

 Code: (location changed!)
 https://github.com/jpf91/phobos/tree/newHash/std/digest
 https://github.com/jpf91/phobos/compare/master...newHash

 Docs: (location changed!)
 http://dl.dropbox.com/u/24218791/d/phobos/std_digest_digest.html
 http://dl.dropbox.com/u/24218791/d/phobos/std_digest_md.html
 http://dl.dropbox.com/u/24218791/d/phobos/std_digest_sha.html
 http://dl.dropbox.com/u/24218791/d/phobos/std_digest_crc.html
I couldn't make time for a review during the window, apologies. Nevertheless I'd like to make a pass in hope that a few issues can be fixed before the code is publicly released. * No need for start() after default construction in the documentation example of CRC32. BTW the documentation of start() suggests that it _must_ be called even right after default construction, is that the case? * In the template interface, no need for void peek(ref ...). Returning by value is fine due to the RVO. Same about finish() - just keep the functions returning by value. * This example: copy(oneMillionRange, &ctx); //Note: You must pass a pointer to copy! suggests we're doing something wrong. I think a better solution would be to have copy() take the target range by "auto ref", and institute this passing convention as a general rule for output ranges. * s/digestType/DigestType/ * s/isDigestable/isDigestible/ * s/startDigest/makeDigest/ - but then again I wonder if requiring start() after default construction is needed. * s/asArray/asStaticArray/. Better yet, we should include this primitive in std.conv. Then we get to use to!(ubyte[16])(dynarray). * The fact that digests aren't unique is trivial by the pigeonhole principle, and should not be listed as a bug (the BUGS: tag implies bugs in the implementation). It's more interesting if a document can be altered to produce a specific digest. If that's what you meant, include a link to the appropriate work. * I don't think the aliases for WrapperDigest!Xyz should be defined. Just call that Digest!Xyz and have people use it. It's about as short as the aliases. I vote for inclusion whether or not the points above are addressed, but of course it would be great if they were minded before release. Apologies for the late review. Andrei
Aug 26 2012
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
One more thing - for future work, it would be great to have a function a 
la mcookie, see http://linux.about.com/library/cmd/blcmdl1_mcookie.htm.

Andrei
Aug 26 2012
prev sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 26-Aug-12 19:11, Andrei Alexandrescu wrote:> On 8/22/12 8:36 AM, 
Dmitry Olshansky wrote:
 The discussion around new unified API for digest hash functions has
 subdued, just in time as the review period has ended.

 The voting for std.digest package starts today and ends on 29 August.

 Rules are simple: reply in this thread with definite "YES" or "NO" on
 whether this package should go into Phobos. Including descriptive short
 notes is appreciated (esp. the NO ones).
I couldn't make time for a review during the window, apologies. Nevertheless I'd like to make a pass in hope that a few issues can be fixed before the code is publicly released.
Well I don't think it'll make any harm to see what issues you have found that need to be addressed. If they prove to be way too much we may consider prolonging review/postponing voting. I'll reply to some points.
 * No need for start() after default construction in the documentation
 example of CRC32. BTW the documentation of start() suggests that it
 _must_ be called even right after default construction, is that the case?
I believe it is. FWIW that's how OpenSSL and it's ilk work and people thought that being able to wrap it cleanly would good idea. And there is Q of being able to do initialization at C-T, say if wrapping OS primitives (like Win32 Crypto Provider).
 * In the template interface, no need for void peek(ref ...). Returning
 by value is fine due to the RVO. Same about finish() - just keep the
 functions returning by value.

 * This example:

 copy(oneMillionRange, &ctx); //Note: You must pass a pointer to copy!

 suggests we're doing something wrong. I think a better solution would be
 to have copy() take the target range by "auto ref", and institute this
 passing convention as a general rule for output ranges.
This one actually is the most important (IMO). Namely the suggested pattern is error prone and needs to get fixed. Interestingly though, some array code may break because it never modified passed slice (and it would now).
 * s/digestType/DigestType/

 * s/isDigestable/isDigestible/

 * s/startDigest/makeDigest/ - but then again I wonder if requiring
 start() after default construction is needed.

 * s/asArray/asStaticArray/. Better yet, we should include this primitive
 in std.conv. Then we get to use to!(ubyte[16])(dynarray).
Better yet there should be (there is ?) a better way to overload to! template.
 * The fact that digests aren't unique is trivial by the pigeonhole
 principle, and should not be listed as a bug (the BUGS: tag implies bugs
 in the implementation). It's more interesting if a document can be
 altered to produce a specific digest. If that's what you meant, include
 a link to the appropriate work.
I do suspect it's the latter as it is a well known issue that MD5 is reversible in reasonable amount of time on modern PCs. See rainbow tables.
 * I don't think the aliases for WrapperDigest!Xyz should be defined.
 Just call that Digest!Xyz and have people use it. It's about as short as
 the aliases.
Actually +1. I'm just wondering if there is something that'll make people do class version but not the struct one (bypassing the wrapper).
 I vote for inclusion whether or not the points above are addressed, but
 of course it would be great if they were minded before release.
 Apologies for the late review.
-- Olshansky Dmitry
Aug 26 2012
parent Johannes Pfau <nospam example.com> writes:
Am Sun, 26 Aug 2012 23:57:24 +0400
schrieb Dmitry Olshansky <dmitry.olsh gmail.com>:

 On 26-Aug-12 19:11, Andrei Alexandrescu wrote:> On 8/22/12 8:36 AM, 
 Dmitry Olshansky wrote:
  >> The discussion around new unified API for digest hash functions
  >> has subdued, just in time as the review period has ended.
  >>
  >> The voting for std.digest package starts today and ends on 29
  >> August.
  >>
  >> Rules are simple: reply in this thread with definite "YES" or
  >> "NO" on whether this package should go into Phobos. Including
  >> descriptive short notes is appreciated (esp. the NO ones).
  >
  > I couldn't make time for a review during the window, apologies.
  > Nevertheless I'd like to make a pass in hope that a few issues can
  > be fixed before the code is publicly released.
  >
 Well I don't think it'll make any harm to see what issues you have
 found that need to be addressed. If they prove to be way too much we
 may consider prolonging review/postponing voting. I'll reply to some
 points.
 
  > * No need for start() after default construction in the
  > documentation example of CRC32. BTW the documentation of start()
  > suggests that it _must_ be called even right after default
  > construction, is that the case?
  >
 I believe it is. FWIW that's how OpenSSL and it's ilk work and people 
 thought that being able to wrap it cleanly would good idea. And there
 is Q of being able to do initialization at C-T, say if wrapping OS 
 primitives (like Win32 Crypto Provider).
Yes, generic code should always call start for the reasons mentioned above. Examples using the CRC32, MD5 and SHA1 digests directly could avoid the call though. I'll also document that the current implementations don't need a call to start after default construction.
 
  > * In the template interface, no need for void peek(ref ...).
  > Returning by value is fine due to the RVO. Same about finish() -
  > just keep the functions returning by value.
OK, that makes sense. As it also makes implementing a digest easier, I'm all for it.
  >
  > * This example:
  >
  > copy(oneMillionRange, &ctx); //Note: You must pass a pointer to
  > copy!
  >
  > suggests we're doing something wrong. I think a better solution
  > would be to have copy() take the target range by "auto ref", and
  > institute this passing convention as a general rule for output
  > ranges.
  >
 This one actually is the most important (IMO).
 Namely the suggested pattern is error prone and needs to get fixed. 
 Interestingly though, some array code may break because it never 
 modified passed slice (and it would now).
I agree this should be fixed in copy. Should we special case arrays then? BTW: This does work as well, but it's still error-prone: ctx = copy(oneMillionRange, ctx);
 
  > * s/digestType/DigestType/
OK
  > * s/isDigestable/isDigestible/
OK
  > * s/startDigest/makeDigest/ - but then again I wonder if requiring
  > start() after default construction is needed.
OK. As makeDigest is meant to work correctly for all digests a call to start is needed (see above, for example OpenSSL wrappers need this)
  > * s/asArray/asStaticArray/. Better yet, we should include this
  > primitive in std.conv. Then we get to use to!(ubyte[16])(dynarray).
  >
 Better yet there should be (there is ?) a better way to overload to! 
 template.
I just removed asArray from the documentation for now. I'll probably file a pull request to merge it into std.conv. Currently the code just asserts in debug mode to make sure the dynamic array is big enough. Would I have to use enforce to get this merged into std.conv.to? BTW: Is the asArray behavior really what'd be expected from to!(ubyte[16])(dynarray)? asArray does explicitly not make a copy, so changing the returned static array will change the dynamic array's content.
  > * The fact that digests aren't unique is trivial by the pigeonhole
  > principle, and should not be listed as a bug (the BUGS: tag
  > implies bugs in the implementation). It's more interesting if a
  > document can be altered to produce a specific digest. If that's
  > what you meant, include a link to the appropriate work.
 
 I do suspect it's the latter as it is a well known issue that MD5 is 
 reversible in reasonable amount of time on modern PCs. See rainbow
 tables.
That was part of the original std.md5 documentation and I just left it in. To be honest I'm not sure what was meant by that statement. I always thought it referred to hash collisions, but as you said those occur for every digest. So I think you're right and it's better to remove this from the BUGS section.
 
  > * I don't think the aliases for WrapperDigest!Xyz should be
  > defined. Just call that Digest!Xyz and have people use it. It's
  > about as short as the aliases.
 
 Actually +1. I'm just wondering if there is something that'll make 
 people do class version but not the struct one (bypassing the
 wrapper).
The interface is already called Digest ;-) I'm not sure, I'd rather leave things as they are. There might be reasons not to use the wrapper template for some digests (e.g. special functionality not covered by the wrapper)
 
  > I vote for inclusion whether or not the points above are
  > addressed, but of course it would be great if they were minded
  > before release. Apologies for the late review.
 
 
Thanks for the review, I hope I addressed most of your points.
Aug 27 2012