
digitalmars.D - customized "new" and pointer alignment

reply %u <new new.com> writes:
no answer from digitalmars.D.learn, try it here.

== Posted at 2007/01/29 15:52 to digitalmars.D.learn

I want to do explicit memory allocation for some of my objects.

I'm reading:  http://digitalmars.com/d/memory.html#newdelete

which says:


alignment. This is 8 on win32 systems.

Then on the next section:

http://digitalmars.com/d/memory.html#markrelease

new(size_t sz)
{
    void *p;

    p = &buffer[bufindex];
    bufindex += sz;
    if (bufindex > buffer.length)
        throw new OutOfMemory;
    return p;
}

Is this code correct? The object size (sz) could be any integer, so how
does it ensure the alignment requirement is met?

If the above code in "Mark/Release" is incorrect, can anyone tell me how to
return aligned memory pointers? And for lots of small objects, does alignment
waste too much memory?

Thanks.
Jan 29 2007
next sibling parent reply %u <new new.com> writes:
Also, buffer is declared as: void[] buffer; Is the size of void the same as char?
Jan 29 2007
parent reply "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"%u" <new new.com> wrote in message news:epm1ep$1ld5$1 digitaldaemon.com...

 Also buffer is declared as:

 void[] buffer;

 Is the size of void the same as char?
The size of a void is (AFAIK) defined to be the smallest addressable (or maybe manipulatable) data unit on the machine. So on most computers, it'll be an 8-bit byte. The size of a char variable is always 8 bits, because it's a UTF-8 something-or-other. It's not a codepoint, it's a ...? But it's always 8 bits. So the two sizes are the same mostly by coincidence.
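
For what it's worth, the sizes can be checked directly. The following is a minimal sketch, assuming a current D2 compiler (the thread predates D2, so behaviour in 2007 may have differed). It shows that void.sizeof and char.sizeof are both 1, and that a void[] measures its length in bytes. The variable names are only for illustration.

```d
// Sizes of the types discussed in this thread, checked at compile time.
static assert(void.sizeof == 1);   // void[] elements are single bytes
static assert(char.sizeof == 1);   // UTF-8 code unit
static assert(wchar.sizeof == 2);  // UTF-16 code unit
static assert(dchar.sizeof == 4);  // UTF-32 code unit

void main()
{
    // Any array converts implicitly to void[]; the length is then
    // counted in bytes rather than in elements.
    int[] ints = [1, 2, 3, 4];
    void[] raw = ints;
    assert(raw.length == 4 * int.sizeof);
}
```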
Jan 29 2007
parent reply Chris Paulson-Ellis <chris edesix.com> writes:
Jarrett Billingsley wrote:
 The size of a char variable is always 8 bits, because it's a UTF-8 
 something-or-other.  It's not a codepoint, it's a ...?  But it's always 8 
 bits.
The Unicode term is "code unit".

For the benefit of the Unicode uninitiated, the D spec could be clearer on this point. Despite its name, a char variable does not hold a character, but rather a single unit of the UTF-8 character encoding. For example, the UTF-8 code unit sequence 0xE2 0x82 0xAC decodes into U+20AC, the Unicode code point for the Euro currency symbol character, €.

Similarly, the wchar type is defined to be a UTF-16 code unit, which is usually the same as the corresponding code point, but not for code points > U+FFFF, which are encoded using 2 code units (called a surrogate pair).

The dchar type is a UTF-32 code unit. These are the same as the code points, except for values > U+10FFFF which are beyond the range of Unicode. You are free to use out of range values to mean something within your application, but they will never represent Unicode characters.

Another complication arises from the fact that the UTF encodings can encode "non-character" code points (anything ending in FFFE or FFFF, such as U+FFFE or U+3FFFF). Similarly, the "surrogates" (the code points with the same values as the code units used by UTF-16 to encode code points > U+FFFF) are not characters even though they can be represented in UTF-8 or UTF-32. So even a char or wchar sequence that decodes okay or a single dchar may not be a "character". Again, you can use these code points within your application, but in the words of the code page for U+FFF[EF], they are "not valid for interchange".

Nothing is ever crystal clear in Unicode land.

Chris.
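
The code unit counts described above can be observed in D itself. This is a small sketch, assuming a current D2 compiler with Phobos (std.utf.decode). The code point U+1D11E is not from the post; it is only an example of a code point above U+FFFF.

```d
import std.utf : decode;

void main()
{
    // The Euro sign, U+20AC, takes three UTF-8 code units (chars).
    string euro = "\u20AC";
    assert(euro.length == 3);
    assert(euro[0] == 0xE2 && euro[1] == 0x82 && euro[2] == 0xAC);

    // Decoding the three code units yields the single code point.
    size_t index = 0;
    dchar cp = decode(euro, index);
    assert(cp == 0x20AC && index == 3);

    // In UTF-16 and UTF-32 the same character is one code unit.
    assert("\u20AC"w.length == 1);
    assert("\u20AC"d.length == 1);

    // A code point above U+FFFF needs a surrogate pair in UTF-16.
    wstring pair = "\U0001D11E"w;
    assert(pair.length == 2);
    assert(pair[0] == 0xD834 && pair[1] == 0xDD1E);
    assert("\U0001D11E"d.length == 1);
}
```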
Jan 30 2007
parent reply "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Chris Paulson-Ellis" <chris edesix.com> wrote in message 
news:epohto$1qsd$1 digitaldaemon.com...
 The Unicode term is "code unit".
Code _unit_. I thought it was something like that. I think the people who come up with Unicode have a little too much time on their hands.
Jan 30 2007
parent Julio César Carrascal Urquijo writes:
Jarrett Billingsley wrote:
 Code _unit_.  I thought it was something like that.  I think the people who 
 come up with Unicode have a little too much time on their hands. 
Not all of them. UTF-8 was designed in one night and it is still better than UTF-16: http://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt
Jan 31 2007
prev sibling parent Kevin Bealer <kevinbealer gmail.com> writes:
Most allocators align data to the largest primitive, excepting perhaps 'real'. The malloc() manpage says 'for any data type', which implies at least 64 bits, but I'm not sure I would rely on more than 32-bit alignment on a 32-bit system.

In the calling code you can align data with something like this (not tested, btw):

    // Allocate sz bytes, aligned to a multiple of aln.
    void[] data = new void[sz + aln - 1];  // enough spare bytes to make this work
    size_t shift = (aln - (cast(size_t) data.ptr & (aln - 1))) & (aln - 1);
    data = data[shift .. sz + shift];      // slice the array at a multiple of aln

This assumes 'aln' is a power of 2. I can't imagine why a non-power-of-2 alignment would be useful, but you could replace "& (x-1)" with "% x" to get that.

Kevin
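
Applied to the mark/release buffer from the original question, the same rounding idea might look like the sketch below. It assumes a current D2 compiler. Arena, allocate and alignUp are hypothetical names, not part of the D documentation's example. It also uses a ubyte[] buffer instead of void[] to keep the pointer arithmetic explicit, and returns null rather than throwing when the buffer is exhausted.

```d
// Round n up to the next multiple of aln (aln must be a power of 2).
size_t alignUp(size_t n, size_t aln)
{
    assert(aln != 0 && (aln & (aln - 1)) == 0, "aln must be a power of 2");
    return (n + aln - 1) & ~(aln - 1);
}

// Hypothetical mark/release-style arena that hands out aligned pointers.
struct Arena
{
    ubyte[] buffer;
    size_t bufindex;

    void* allocate(size_t sz, size_t aln = 8)
    {
        // Align the real address, not just the index, in case
        // buffer.ptr itself is not a multiple of aln.
        size_t base = cast(size_t) buffer.ptr;
        size_t start = alignUp(base + bufindex, aln) - base;
        if (start > buffer.length || sz > buffer.length - start)
            return null; // out of space
        bufindex = start + sz;
        return buffer.ptr + start;
    }
}

void main()
{
    auto arena = Arena(new ubyte[64]);
    void* a = arena.allocate(3);
    void* b = arena.allocate(5);
    assert(cast(size_t) a % 8 == 0);
    assert(cast(size_t) b % 8 == 0);
    // The 3-byte object is followed by 5 bytes of padding.
    assert(cast(size_t) b - cast(size_t) a == 8);
}
```

On the question of waste: with this scheme each allocation loses at most aln - 1 bytes to padding, so for 8-byte alignment the worst case is 7 bytes per object, which only matters much when the objects themselves are very small.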
Jan 30 2007