digitalmars.D - Why is Base64 considered range of ubytes?
- Dukc (22/22) Jun 03 2019 I just had a range of characters in Base64 I wanted to decode.
- Jonathan M Davis (14/36) Jun 03 2019 It looks like it's set up to operate on strings or on ranges of ubyte, w...
- Dukc (2/12) Jun 03 2019 I'll consider that next time I decide to contribute.
I just had a range of characters in Base64 that I wanted to decode. The compiler complained about no matching template. My first reaction was to add `.array` and/or `byCodeUnit` to my range. That did not work. Hmm, ensure the range is `char[]` and then feed it? Nope. Could it be a regression? Hardly, as this example from the docs is supposedly unittested:

```
auto encoded = Base64.encoder(cast(ubyte[])"0123456789");
foreach (n; map!q{a - '0'}(Base64.decoder(encoded)))
{
    writeln(n);
}
```

I tested it in a separate project anyway, just in case, and it compiled and worked. Only after adding `.array` to that first line and changing `auto` to `char[]` did the problem dawn on me: `Base64.decoder` does not accept an input range of characters, it accepts an input range of unsigned bytes! I fail to understand why. Is not the whole point of Base64 to encode arbitrary data into readable characters that can be unambiguously written on paper?
Jun 03 2019
On Monday, June 3, 2019 3:13:26 AM MDT Dukc via Digitalmars-d wrote:
> I just had a range of characters in Base64 that I wanted to decode. The compiler complained about no matching template. My first reaction was to add `.array` and/or `byCodeUnit` to my range. That did not work. Hmm, ensure the range is `char[]` and then feed it? Nope. Could it be a regression? Hardly, as this example from the docs is supposedly unittested:
>
> ```
> auto encoded = Base64.encoder(cast(ubyte[])"0123456789");
> foreach (n; map!q{a - '0'}(Base64.decoder(encoded)))
> {
>     writeln(n);
> }
> ```
>
> I tested it in a separate project anyway, just in case, and it compiled and worked. Only after adding `.array` to that first line and changing `auto` to `char[]` did the problem dawn on me: `Base64.decoder` does not accept an input range of characters, it accepts an input range of unsigned bytes! I fail to understand why. Is not the whole point of Base64 to encode arbitrary data into readable characters that can be unambiguously written on paper?

It looks like it's set up to operate on strings or on ranges of ubyte, which is certainly weird. It makes some sense that it would accept ranges of ubyte, but given that base64 encoded data should be valid ASCII, it's valid UTF-8 and thus would normally be a range of char. I would have expected encode to take a range of ubyte and output a range of char, and decode to accept a range of char and output a range of ubyte, but I don't think that it used ranges originally, and so the way it uses ranges is probably different from what it would ideally do.

Regardless, I don't see any reason why decode couldn't be made to work with ranges of char if it's already working with ranges of ubyte. It might be more than simply changing the template constraint due to issues with narrow strings, but it would probably be pretty straightforward - and it wouldn't be a breaking change.

- Jonathan M Davis
Jun 03 2019
On Monday, 3 June 2019 at 10:14:09 UTC, Jonathan M Davis wrote:
> Regardless, I don't see any reason why decode couldn't be made to work with ranges of char if it's already working with ranges of ubyte. It might be more than simply changing the template constraint due to issues with narrow strings, but it would probably be pretty straightforward - and it wouldn't be a breaking change.
>
> - Jonathan M Davis

I'll consider that next time I decide to contribute.
Jun 03 2019
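
For anyone landing on this thread with the same problem, here is a minimal sketch of one possible workaround. It is not a statement of how Phobos is meant to be used. It relies on the observation from the docs example above that `Base64.decoder` accepts a range whose elements are `ubyte`, so it casts each code unit to `ubyte` before handing the range over. The helper name `decodeChars` is hypothetical and not part of Phobos. The array-based `Base64.decode` is included for comparison, since it takes a string directly and returns `ubyte[]`.

```d
import std.algorithm.iteration : map;
import std.array : array;
import std.base64 : Base64;
import std.utf : byCodeUnit;

/// Hypothetical helper (not in Phobos): lazily decodes a range of
/// Base64 code units by first viewing each element as a ubyte.
auto decodeChars(R)(R chars)
{
    return Base64.decoder(chars.map!(c => cast(ubyte) c));
}

void main()
{
    import std.stdio : writeln;

    // Base64 encoding of the ASCII text "0123456789".
    auto text = "MDEyMzQ1Njc4OQ==";

    // Lazy path: byCodeUnit avoids auto-decoding the string to dchar.
    ubyte[] lazily = decodeChars(text.byCodeUnit).array;

    // Eager path: the array overload accepts the string as-is.
    ubyte[] eagerly = Base64.decode(text);

    assert(lazily == eagerly);
    writeln(cast(const(char)[]) lazily);
}
```

If the whole input is already in memory, the eager `Base64.decode` call is probably the simpler choice. The cast-and-map version is only of interest when the characters arrive as a lazy range.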