digitalmars.D - Why is Base64 considered range of ubytes?

reply Dukc <ajieskola gmail.com> writes:
I just had a range of Base64 characters that I wanted to decode. 
The compiler complained about no matching template. My 
first reaction was to add `.array` and/or `byCodeUnit` to my 
range. That did not work. Hmm, ensure the range is `char[]` and then 
feed it? Nope. Could it be a regression? Hardly, as this example 
from the docs is supposedly unittested:
```
auto encoded = Base64.encoder(cast(ubyte[])"0123456789");
foreach (n; map!q{a - '0'}(Base64.decoder(encoded)))
{
     writeln(n);
}
```

I tested it in a separate project anyway, just in case, and it 
compiled and worked.

Only after adding `.array` to that first line and changing `auto` 
to `char[]` did the problem dawn on me: `Base64.decoder` does not accept 
an input range of characters, it accepts an input range of 
unsigned bytes!

I fail to understand why. Is not the whole point of Base64 to 
encode arbitrary data into readable characters that can be 
unambiguously written on paper?
Jun 03
parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, June 3, 2019 3:13:26 AM MDT Dukc via Digitalmars-d wrote:
 I just had a range of Base64 characters that I wanted to decode.
 The compiler complained about no matching template. My
 first reaction was to add `.array` and/or `byCodeUnit` to my
 range. That did not work. Hmm, ensure the range is `char[]` and then
 feed it? Nope. Could it be a regression? Hardly, as this example
 from the docs is supposedly unittested:
 ```
 auto encoded = Base64.encoder(cast(ubyte[])"0123456789");
 foreach (n; map!q{a - '0'}(Base64.decoder(encoded)))
 {
      writeln(n);
 }
 ```

 I tested it in a separate project anyway, just in case, and it
 compiled and worked.

 Only after adding `.array` to that first line and changing `auto`
 to `char[]` did the problem dawn on me: `Base64.decoder` does not accept
 an input range of characters, it accepts an input range of
 unsigned bytes!

 I fail to understand why. Is not the whole point of Base64 to
 encode arbitrary data into readable characters that can be
 unambiguously written on paper?
It looks like it's set up to operate on strings or on ranges of ubyte, which is certainly weird. It makes some sense that it would accept ranges of ubyte, but given that Base64-encoded data should be valid ASCII, it's also valid UTF-8 and thus would normally be a range of char.

I would have expected encode to take a range of ubyte and output a range of char, and decode to accept a range of char and output a range of ubyte. However, I don't think that it used ranges originally, so the way it uses ranges is probably different from what it would ideally do.

Regardless, I don't see any reason why decode couldn't be made to work with ranges of char if it's already working with ranges of ubyte. It might take more than simply changing the template constraint, due to issues with narrow strings, but it would probably be pretty straightforward - and it wouldn't be a breaking change.

- Jonathan M Davis
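For anyone landing here with the same problem, a minimal sketch of the string-based route mentioned above. It uses the eager `Base64.encode` and `Base64.decode` functions rather than the lazy `Base64.decoder` range from the original post, and the sample data is simply the "0123456789" from the docs example. Converting a lazy character range to a string first (for example with `.array`) is an assumed workaround here, not something this thread verified.

```d
import std.base64 : Base64;
import std.string : representation;

void main()
{
    // Raw bytes in: view the string literal as immutable(ubyte)[].
    auto raw = "0123456789".representation;

    // Eager encode: an array of ubyte goes in, a char[] comes out.
    char[] encoded = Base64.encode(raw);
    assert(encoded == "MDEyMzQ1Njc4OQ==");

    // Eager decode: a plain string goes in, a ubyte[] comes out.
    ubyte[] decoded = Base64.decode("MDEyMzQ1Njc4OQ==");
    assert(decoded == raw);
}
```

This is the "ubyte in, char out, and back" shape described above, just without laziness; whether it fits depends on whether the encoded text can be held in memory as a string.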
Jun 03
parent Dukc <ajieskola gmail.com> writes:
On Monday, 3 June 2019 at 10:14:09 UTC, Jonathan M Davis wrote:
 Regardless, I don't see any reason why decode couldn't be made to work
 with ranges of char if it's already working with ranges of ubyte. It
 might be more than simply changing the template constraint due to
 issues with narrow strings, but it would probably be pretty
 straightforward - and it wouldn't be a breaking change.

 - Jonathan M Davis
I'll consider that next time I decide to contribute.
Jun 03