digitalmars.D - Why is Base64 considered range of ubytes?

Dukc <ajieskola gmail.com> writes:

I just had a range of characters in Base64 that I wanted to decode. 
The compiler complained about no matching template. My 
first reaction was to add `.array` and/or `byCodeUnit` to my 
range. That did not work. Hmm, ensure the range is `char[]` and then 
feed it? Nope. Could it be a regression? Hardly, as this example 
from the docs is supposedly unittested:
```
auto encoded = Base64.encoder(cast(ubyte[])"0123456789");
foreach (n; map!q{a - '0'}(Base64.decoder(encoded)))
{
     writeln(n);
}
```

I tested it in a separate project anyway, just in case, and it 
compiled and worked.

Only after adding `.array` to that first line and changing `auto` 
to `char[]` did the problem dawn on me: `Base64.decoder` does not 
accept an input range of characters, it accepts an input range of 
unsigned bytes!

I fail to understand why. Is not the whole point of Base64 to 
encode arbitrary data into readable characters that can be 
unambiguously written on paper?

Jun 03 2019

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Monday, June 3, 2019 3:13:26 AM MDT Dukc via Digitalmars-d wrote:
 I just had a range of characters in Base64 that I wanted to decode.
 The compiler complained about no matching template. My
 first reaction was to add `.array` and/or `byCodeUnit` to my
 range. That did not work. Hmm, ensure the range is `char[]` and then
 feed it? Nope. Could it be a regression? Hardly, as this example
 from the docs is supposedly unittested:
 ```
 auto encoded = Base64.encoder(cast(ubyte[])"0123456789");
 foreach (n; map!q{a - '0'}(Base64.decoder(encoded)))
 {
      writeln(n);
 }
 ```

 I tested it in a separate project anyway, just in case, and it
 compiled and worked.

 Only after adding `.array` to that first line and changing `auto`
 to `char[]` did the problem dawn on me: `Base64.decoder` does not
 accept an input range of characters, it accepts an input range of
 unsigned bytes!

 I fail to understand why. Is not the whole point of Base64 to
 encode arbitrary data into readable characters that can be
 unambiguously written on paper?

It looks like it's set up to operate on strings or on ranges of ubyte, which
is certainly weird. It makes some sense that it would accept ranges of
ubyte, but given that base64 encoded data should be valid ASCII, it's valid
UTF-8 and thus would normally be a range of char.

I would have expected encode to take a range of ubyte and output a range of
char and decode to accept a range of char and output a range of ubyte, but
I don't think that it used ranges originally, and so the way it uses ranges
is probably different from what it would ideally do. Regardless, I don't see
any reason why decode couldn't be made to work with ranges of char if it's
already working with ranges of ubyte. It might be more than simply changing
the template constraint due to issues with narrow strings, but it would
probably be pretty straightforward - and it wouldn't be a breaking change.
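
For comparison, here is a minimal sketch of the "bytes in, characters out"
typing described above. It uses the array-based `Base64.encode` and
`Base64.decode` from `std.base64`, not the range-based `encoder`/`decoder`
that the original post ran into, so it only illustrates the expected types
and does not change how the range API behaves. The sample bytes are an
arbitrary illustration.

```d
import std.base64 : Base64;

void main()
{
    // Arbitrary binary data goes in as ubyte[] (these bytes spell "Hello").
    ubyte[] data = [0x48, 0x65, 0x6c, 0x6c, 0x6f];

    // The array-based encode returns characters.
    char[] encoded = Base64.encode(data);
    assert(encoded == "SGVsbG8=");

    // The array-based decode accepts characters and returns bytes.
    ubyte[] decoded = Base64.decode(encoded);
    assert(decoded == data);
}
```

If the Base64 text is already available as an array of `char`, going through
`Base64.decode` in this way may be a workable alternative to `Base64.decoder`.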

- Jonathan M Davis

Jun 03 2019

Dukc <ajieskola gmail.com> writes:

On Monday, 3 June 2019 at 10:14:09 UTC, Jonathan M Davis wrote:
 Regardless, I don't see
 any reason why decode couldn't be made to work with ranges of 
 char if it's
 already working with ranges of ubyte. It might be more than 
 simply changing
 the template constraint due to issues with narrow strings, but 
 it would
 probably be pretty straightforward - and it wouldn't be a 
 breaking change.

 - Jonathan M Davis

I'll consider that next time I decide to contribute.

Jun 03 2019
