digitalmars.D.learn - repack ubyte[] to use only 7 bits

Charles Hixson via Digitalmars-d-learn (72/72) Dec 06 2014 Is there a standard way to do this? The code below is untested, as I

Manolo (14/96) Dec 13 2014 Are you trying to make a "kind-of" Variable-Length quantity

Manolo (19/116) Dec 13 2014 Sorry, lack or accuraccy: the maximum value represented was a 24
Charles Hixson via Digitalmars-d-learn (9/93) Dec 13 2014 What I was trying to do was pack things into 7 bits so I could recode

Manolo (3/115) Dec 13 2014 Sorry, I misunderstood the thing.

Charles Hixson via Digitalmars-d-learn writes:

Is there a standard way to do this?  The code below is untested, as I 
haven't yet written the x7to8 routine, and came up with a better way to 
do what this was to accomplish, but it feels as if this should be 
somewhere in the standard library, if I could only find it.

/** Repack the data from an array of ubytes into an array of ubytes of
  * which only the last 7 are significant.  The high bit will be set only
  * if the byte would otherwise be zero.    */
byte[]    x8to7 (ubyte[] bin)
{
     ubyte[] bout;
     //    bit masks:    0 => 0xfe = 11111110, 0x00 = 00000000
     //                1 => 0x7f = 01111111, 0x00 = 00000000
     //                2 => 0x3f = 00111111, 0x80 = 10000000
     //                3 => 0x1f = 00011111, 0xc0 = 11000000
     //                4 => 0x0f = 00001111, 0xe0 = 11100000
     //                5 => 0x07 = 00000111, 0xf0 = 11110000
     //                6 => 0x03 = 00000011, 0xf8 = 11111000
     //                7 => 0x01 = 00000001, 0xfc = 11111100
     if (bin.length < 1)    return    bout;
     int    fByte, fBit;
     while    (fByte < bin.length)
     {    if (fByte + 1 == bin.length && fBit > 1)  break;
         ubyte    b;
         switch (fBit)
         {    case    0:
                 b    =    bin[fByte]    / 2;
                 break;
             case    1:
                 b    =    bin[fByte] & 0x7f;
                 break;
             case    2:
                 ubyte    b1    =    (bin[fByte] & 0x3f) << 1;
                 ubyte    b2    =    (bin[fByte + 1] & 0x80) >>> 7;
                 b    ~=    (b1 | b2);
                 break;
             case    3:
                 ubyte    b1    =    (bin[fByte] & 0x1f) << 2;
                 ubyte    b2    =    (bin[fByte + 1] & 0xc0) >>> 6;
                 b    ~= (b1 | b2);
                 break;
             case    4:
                 ubyte    b1    =    (bin[fByte] & 0x0f) << 3;
                 ubyte    b2    =    (bin[fByte + 1] & 0xe0) >>> 5;
                 b    ~= (b1 | b2);
                 break;
             case    5:
                 ubyte    b1    =    (bin[fByte] & 0x07) << 4;
                 ubyte    b2    =    (bin[fByte + 1] & 0xf0) >>> 4;
                 b    ~= (b1 | b2);
                 break;
             case    6:
                 ubyte    b1    =    (bin[fByte] & 0x03) << 5;
                 ubyte    b2    =    (bin[fByte + 1] & 0xf8) >>> 3;
                 b    ~= (b1 | b2);
                 break;
             case    7:
                 ubyte    b1    =    (bin[fByte] & 0x01) << 6;
                 ubyte    b2    =    (bin[fByte + 1] & 0xfc) >>> 2;
                 b    ~= (b1 | b2);
                 break;
             default:
                 assert (false, "This path should never be taken");
         }    //    switch (fBit)
         if    (b == 0)    bout    ~=    0x80;
         else            bout    ~=    b;
         fBit    =    fBit + 7;
         if    (fBit > 7)
         {    fByte++;
             fBit -=    7;
         }
     }
}

Dec 06 2014

"Manolo" <Manolo oops.kp> writes:

On Saturday, 13 December 2014 at 10:09:27 UTC, Charles Hixson via 
Digitalmars-d-learn wrote:
 Is there a standard way to do this?  The code below is 
 untested, as I haven't yet written the x7to8 routine, and came 
 up with a better way to do what this was to accomplish, but it 
 feels as if this should be somewhere in the standard library, 
 if I could only find it.

 /** Repack the data from an array of ubytes into an array of 
 ubytes of
  * which only the last 7 are significant.  The high bit will be 
 set only
  * if the byte would otherwise be zero.    */
 byte[]    x8to7 (ubyte[] bin)
 {
     ubyte[] bout;
     //    bit masks:    0 => 0xfe = 11111110, 0x00 = 00000000
     //                1 => 0x7f = 01111111, 0x00 = 00000000
     //                2 => 0x3f = 00111111, 0x80 = 10000000
     //                3 => 0x1f = 00011111, 0xc0 = 11000000
     //                4 => 0x0f = 00001111, 0xe0 = 11100000
     //                5 => 0x07 = 00000111, 0xf0 = 11110000
     //                6 => 0x03 = 00000011, 0xf8 = 11111000
     //                7 => 0x01 = 00000001, 0xfc = 11111100
     if (bin.length < 1)    return    bout;
     int    fByte, fBit;
     while    (fByte < bin.length)
     {    if (fByte + 1 == bin.length && fBit > 1)  break;
         ubyte    b;
         switch (fBit)
         {    case    0:
                 b    =    bin[fByte]    / 2;
                 break;
             case    1:
                 b    =    bin[fByte] & 0x7f;
                 break;
             case    2:
                 ubyte    b1    =    (bin[fByte] & 0x3f) << 1;
                 ubyte    b2    =    (bin[fByte + 1] & 0x80) >>> 
 7;
                 b    ~=    (b1 | b2);
                 break;
             case    3:
                 ubyte    b1    =    (bin[fByte] & 0x1f) << 2;
                 ubyte    b2    =    (bin[fByte + 1] & 0xc0) >>> 
 6;
                 b    ~= (b1 | b2);
                 break;
             case    4:
                 ubyte    b1    =    (bin[fByte] & 0x0f) << 3;
                 ubyte    b2    =    (bin[fByte + 1] & 0xe0) >>> 
 5;
                 b    ~= (b1 | b2);
                 break;
             case    5:
                 ubyte    b1    =    (bin[fByte] & 0x07) << 4;
                 ubyte    b2    =    (bin[fByte + 1] & 0xf0) >>> 
 4;
                 b    ~= (b1 | b2);
                 break;
             case    6:
                 ubyte    b1    =    (bin[fByte] & 0x03) << 5;
                 ubyte    b2    =    (bin[fByte + 1] & 0xf8) >>> 
 3;
                 b    ~= (b1 | b2);
                 break;
             case    7:
                 ubyte    b1    =    (bin[fByte] & 0x01) << 6;
                 ubyte    b2    =    (bin[fByte + 1] & 0xfc) >>> 
 2;
                 b    ~= (b1 | b2);
                 break;
             default:
                 assert (false, "This path should never be 
 taken");
         }    //    switch (fBit)
         if    (b == 0)    bout    ~=    0x80;
         else            bout    ~=    b;
         fBit    =    fBit + 7;
         if    (fBit > 7)
         {    fByte++;
             fBit -=    7;
         }
     }
 }

Are you trying to make a "kind-of" Variable-Length quantity 
encoder ?

eg:
0b10101110: last bit not set, integrate 0b10101110 and stop 
reading.
0b10011001: last bit set, integrate 0b10011000 and continue to 
next byte.

http://en.wikipedia.org/wiki/Variable-length_quantity

except that this algo limits the length to 24 bits. It was used a 
lot with
MIDI, at a time when hardware memory was costly (eg inside 
hardware synthesizer or workstations).

Dec 13 2014

"Manolo" <Manolo oops.kp> writes:

On Saturday, 13 December 2014 at 11:20:21 UTC, Manolo wrote:
 On Saturday, 13 December 2014 at 10:09:27 UTC, Charles Hixson 
 via Digitalmars-d-learn wrote:
 Is there a standard way to do this?  The code below is 
 untested, as I haven't yet written the x7to8 routine, and came 
 up with a better way to do what this was to accomplish, but it 
 feels as if this should be somewhere in the standard library, 
 if I could only find it.

 /** Repack the data from an array of ubytes into an array of 
 ubytes of
 * which only the last 7 are significant.  The high bit will be 
 set only
 * if the byte would otherwise be zero.    */
 byte[]    x8to7 (ubyte[] bin)
 {
    ubyte[] bout;
    //    bit masks:    0 => 0xfe = 11111110, 0x00 = 00000000
    //                1 => 0x7f = 01111111, 0x00 = 00000000
    //                2 => 0x3f = 00111111, 0x80 = 10000000
    //                3 => 0x1f = 00011111, 0xc0 = 11000000
    //                4 => 0x0f = 00001111, 0xe0 = 11100000
    //                5 => 0x07 = 00000111, 0xf0 = 11110000
    //                6 => 0x03 = 00000011, 0xf8 = 11111000
    //                7 => 0x01 = 00000001, 0xfc = 11111100
    if (bin.length < 1)    return    bout;
    int    fByte, fBit;
    while    (fByte < bin.length)
    {    if (fByte + 1 == bin.length && fBit > 1)  break;
        ubyte    b;
        switch (fBit)
        {    case    0:
                b    =    bin[fByte]    / 2;
                break;
            case    1:
                b    =    bin[fByte] & 0x7f;
                break;
            case    2:
                ubyte    b1    =    (bin[fByte] & 0x3f) << 1;
                ubyte    b2    =    (bin[fByte + 1] & 0x80) >>> 
 7;
                b    ~=    (b1 | b2);
                break;
            case    3:
                ubyte    b1    =    (bin[fByte] & 0x1f) << 2;
                ubyte    b2    =    (bin[fByte + 1] & 0xc0) >>> 
 6;
                b    ~= (b1 | b2);
                break;
            case    4:
                ubyte    b1    =    (bin[fByte] & 0x0f) << 3;
                ubyte    b2    =    (bin[fByte + 1] & 0xe0) >>> 
 5;
                b    ~= (b1 | b2);
                break;
            case    5:
                ubyte    b1    =    (bin[fByte] & 0x07) << 4;
                ubyte    b2    =    (bin[fByte + 1] & 0xf0) >>> 
 4;
                b    ~= (b1 | b2);
                break;
            case    6:
                ubyte    b1    =    (bin[fByte] & 0x03) << 5;
                ubyte    b2    =    (bin[fByte + 1] & 0xf8) >>> 
 3;
                b    ~= (b1 | b2);
                break;
            case    7:
                ubyte    b1    =    (bin[fByte] & 0x01) << 6;
                ubyte    b2    =    (bin[fByte + 1] & 0xfc) >>> 
 2;
                b    ~= (b1 | b2);
                break;
            default:
                assert (false, "This path should never be 
 taken");
        }    //    switch (fBit)
        if    (b == 0)    bout    ~=    0x80;
        else            bout    ~=    b;
        fBit    =    fBit + 7;
        if    (fBit > 7)
        {    fByte++;
            fBit -=    7;
        }
    }
 }

 Are you trying to make a "kind-of" Variable-Length quantity 
 encoder ?

 eg:
 0b10101110: last bit not set, integrate 0b10101110 and stop 
 reading.
 0b10011001: last bit set, integrate 0b10011000 and continue to 
 next byte.

 http://en.wikipedia.org/wiki/Variable-length_quantity

 except that this algo limits the length to 24 bits. It was used 
 a lot with
 MIDI, at a time when hardware memory was costly (eg inside 
 hardware synthesizer or workstations).

Sorry, lack or accuraccy: the maximum value represented was a 24 
bit unsigned integer, but the data length was 32 bit for this 
value.
The thing is that the format included various fields, but because 
of the memory price the algo saved space when values where less 
than 0X7F, because only one byte was needed. Nowadays such as 
format would allow for example always 4 bytes to describes the 
data length:

nowadays:
data len "L" | data
0 1 2 3      | 4 ... L-1

so "nowadays", we can afford 4 bytes to say that a field length 
is 1

olddays:
variable len VL | data
1 to ?          | VL ... VL-1

"olddays", they used only one byte to say that a field length is 
1.

Dec 13 2014

Charles Hixson via Digitalmars-d-learn writes:

On 12/13/2014 03:20 AM, Manolo via Digitalmars-d-learn wrote:
 On Saturday, 13 December 2014 at 10:09:27 UTC, Charles Hixson via 
 Digitalmars-d-learn wrote:
 Is there a standard way to do this?  The code below is untested, as I 
 haven't yet written the x7to8 routine, and came up with a better way 
 to do what this was to accomplish, but it feels as if this should be 
 somewhere in the standard library, if I could only find it.

 /** Repack the data from an array of ubytes into an array of ubytes of
  * which only the last 7 are significant.  The high bit will be set only
  * if the byte would otherwise be zero.    */
 byte[]    x8to7 (ubyte[] bin)
 {
     ubyte[] bout;
     //    bit masks:    0 => 0xfe = 11111110, 0x00 = 00000000
     //                1 => 0x7f = 01111111, 0x00 = 00000000
     //                2 => 0x3f = 00111111, 0x80 = 10000000
     //                3 => 0x1f = 00011111, 0xc0 = 11000000
     //                4 => 0x0f = 00001111, 0xe0 = 11100000
     //                5 => 0x07 = 00000111, 0xf0 = 11110000
     //                6 => 0x03 = 00000011, 0xf8 = 11111000
     //                7 => 0x01 = 00000001, 0xfc = 11111100
     if (bin.length < 1)    return    bout;
     int    fByte, fBit;
     while    (fByte < bin.length)
     {    if (fByte + 1 == bin.length && fBit > 1) break;
         ubyte    b;
         switch (fBit)
         {    case    0:
                 b    =    bin[fByte]    / 2;
                 break;
             case    1:
                 b    =    bin[fByte] & 0x7f;
                 break;
             case    2:
                 ubyte    b1    =    (bin[fByte] & 0x3f) << 1;
                 ubyte    b2    =    (bin[fByte + 1] & 0x80) >>> 7;
                 b    ~=    (b1 | b2);
                 break;
             case    3:
                 ubyte    b1    =    (bin[fByte] & 0x1f) << 2;
                 ubyte    b2    =    (bin[fByte + 1] & 0xc0) >>> 6;
                 b    ~= (b1 | b2);
                 break;
             case    4:
                 ubyte    b1    =    (bin[fByte] & 0x0f) << 3;
                 ubyte    b2    =    (bin[fByte + 1] & 0xe0) >>> 5;
                 b    ~= (b1 | b2);
                 break;
             case    5:
                 ubyte    b1    =    (bin[fByte] & 0x07) << 4;
                 ubyte    b2    =    (bin[fByte + 1] & 0xf0) >>> 4;
                 b    ~= (b1 | b2);
                 break;
             case    6:
                 ubyte    b1    =    (bin[fByte] & 0x03) << 5;
                 ubyte    b2    =    (bin[fByte + 1] & 0xf8) >>> 3;
                 b    ~= (b1 | b2);
                 break;
             case    7:
                 ubyte    b1    =    (bin[fByte] & 0x01) << 6;
                 ubyte    b2    =    (bin[fByte + 1] & 0xfc) >>> 2;
                 b    ~= (b1 | b2);
                 break;
             default:
                 assert (false, "This path should never be taken");
         }    //    switch (fBit)
         if    (b == 0)    bout    ~=    0x80;
         else            bout    ~=    b;
         fBit    =    fBit + 7;
         if    (fBit > 7)
         {    fByte++;
             fBit -=    7;
         }
     }
 }

 Are you trying to make a "kind-of" Variable-Length quantity encoder ?

 eg:
 0b10101110: last bit not set, integrate 0b10101110 and stop reading.
 0b10011001: last bit set, integrate 0b10011000 and continue to next byte.

 http://en.wikipedia.org/wiki/Variable-length_quantity

 except that this algo limits the length to 24 bits. It was used a lot 
 with
 MIDI, at a time when hardware memory was costly (eg inside hardware 
 synthesizer or workstations).

What I was trying to do was pack things into 7 bits so I could recode 
0's as 128.  I finally thought clearly about it and realized that I only 
needed to use one particular byte value (I chose 127) to duplicate so I 
could repack things with a string of 0's replaced by 127 followed by the 
length (up to 126) of zeros, and for 127 itself I'd just emit 127 
twice.  This was to pack binary data into a string that C routines 
wouldn't think had ended partway through.  (If I get more than 127 zeros 
in a row, I just have more than one packing code.)

Dec 13 2014

"Manolo" <Manolo blabla.com> writes:

On Saturday, 13 December 2014 at 19:52:33 UTC, Charles Hixson via 
Digitalmars-d-learn wrote:
 On 12/13/2014 03:20 AM, Manolo via Digitalmars-d-learn wrote:
 On Saturday, 13 December 2014 at 10:09:27 UTC, Charles Hixson 
 via Digitalmars-d-learn wrote:
 Is there a standard way to do this?  The code below is 
 untested, as I haven't yet written the x7to8 routine, and 
 came up with a better way to do what this was to accomplish, 
 but it feels as if this should be somewhere in the standard 
 library, if I could only find it.

 /** Repack the data from an array of ubytes into an array of 
 ubytes of
 * which only the last 7 are significant.  The high bit will 
 be set only
 * if the byte would otherwise be zero.    */
 byte[]    x8to7 (ubyte[] bin)
 {
    ubyte[] bout;
    //    bit masks:    0 => 0xfe = 11111110, 0x00 = 00000000
    //                1 => 0x7f = 01111111, 0x00 = 00000000
    //                2 => 0x3f = 00111111, 0x80 = 10000000
    //                3 => 0x1f = 00011111, 0xc0 = 11000000
    //                4 => 0x0f = 00001111, 0xe0 = 11100000
    //                5 => 0x07 = 00000111, 0xf0 = 11110000
    //                6 => 0x03 = 00000011, 0xf8 = 11111000
    //                7 => 0x01 = 00000001, 0xfc = 11111100
    if (bin.length < 1)    return    bout;
    int    fByte, fBit;
    while    (fByte < bin.length)
    {    if (fByte + 1 == bin.length && fBit > 1) break;
        ubyte    b;
        switch (fBit)
        {    case    0:
                b    =    bin[fByte]    / 2;
                break;
            case    1:
                b    =    bin[fByte] & 0x7f;
                break;
            case    2:
                ubyte    b1    =    (bin[fByte] & 0x3f) << 1;
                ubyte    b2    =    (bin[fByte + 1] & 0x80)
 7;



                b    ~=    (b1 | b2);
                break;
            case    3:
                ubyte    b1    =    (bin[fByte] & 0x1f) << 2;
                ubyte    b2    =    (bin[fByte + 1] & 0xc0)
 6;



                b    ~= (b1 | b2);
                break;
            case    4:
                ubyte    b1    =    (bin[fByte] & 0x0f) << 3;
                ubyte    b2    =    (bin[fByte + 1] & 0xe0)
 5;



                b    ~= (b1 | b2);
                break;
            case    5:
                ubyte    b1    =    (bin[fByte] & 0x07) << 4;
                ubyte    b2    =    (bin[fByte + 1] & 0xf0)
 4;



                b    ~= (b1 | b2);
                break;
            case    6:
                ubyte    b1    =    (bin[fByte] & 0x03) << 5;
                ubyte    b2    =    (bin[fByte + 1] & 0xf8)
 3;



                b    ~= (b1 | b2);
                break;
            case    7:
                ubyte    b1    =    (bin[fByte] & 0x01) << 6;
                ubyte    b2    =    (bin[fByte + 1] & 0xfc)
 2;



                b    ~= (b1 | b2);
                break;
            default:
                assert (false, "This path should never be 
 taken");
        }    //    switch (fBit)
        if    (b == 0)    bout    ~=    0x80;
        else            bout    ~=    b;
        fBit    =    fBit + 7;
        if    (fBit > 7)
        {    fByte++;
            fBit -=    7;
        }
    }
 }

 Are you trying to make a "kind-of" Variable-Length quantity 
 encoder ?

 eg:
 0b10101110: last bit not set, integrate 0b10101110 and stop 
 reading.
 0b10011001: last bit set, integrate 0b10011000 and continue to 
 next byte.

 http://en.wikipedia.org/wiki/Variable-length_quantity

 except that this algo limits the length to 24 bits. It was 
 used a lot with
 MIDI, at a time when hardware memory was costly (eg inside 
 hardware synthesizer or workstations).

 What I was trying to do was pack things into 7 bits so I could 
 recode 0's as 128.  I finally thought clearly about it and 
 realized that I only needed to use one particular byte value (I 
 chose 127) to duplicate so I could repack things with a string 
 of 0's replaced by 127 followed by the length (up to 126) of 
 zeros, and for 127 itself I'd just emit 127 twice.  This was to 
 pack binary data into a string that C routines wouldn't think 
 had ended partway through.  (If I get more than 127 zeros in a 
 row, I just have more than one packing code.)

Sorry, I misunderstood the thing.

Dec 13 2014

D Programming

C/C++ Programming

Other

digitalmars.D.learn - repack ubyte[] to use only 7 bits