digitalmars.D - customized "new" and pointer alignment
- %u (23/23) Jan 29 2007 no answer from digitalmars.D.learn, try it here.
- %u (4/27) Jan 29 2007 Also buffer is declared as:
- Jarrett Billingsley (8/11) Jan 29 2007 The size of a void is (AFAIK) defined to be the smallest addressable (or...
- Chris Paulson-Ellis (26/29) Jan 30 2007 The Unicode term is "code unit".
- Jarrett Billingsley (4/5) Jan 30 2007 Code _unit_. I thought it was something like that. I think the people ...
- Julio César Carrascal Urquijo (4/6) Jan 31 2007 Not all of them. UTF-8 was designed in one night and it is still better
- Kevin Bealer (15/51) Jan 30 2007 Most allocators align data to the largest primitive excepting perhaps
no answer from digitalmars.D.learn, try it here.

== Posted at 2007/01/29 15:52 to digitalmars.D.learn

I want to do explicit memory allocation for some of my objects. I'm reading:

http://digitalmars.com/d/memory.html#newdelete

which says:

    alignment. This is 8 on win32 systems.

Then in the next section:

http://digitalmars.com/d/memory.html#markrelease

    new(size_t sz)
    {
        void *p;

        p = &buffer[bufindex];
        bufindex += sz;
        if (bufindex > buffer.length)
            throw new OutOfMemory;
        return p;
    }

Is this code correct? I mean, the object size (sz) could be any integer, so how can one ensure the alignment requirement?

If the above code in "Mark/Release" is incorrect, can anyone tell me how to return aligned memory pointers? And for lots of small objects, does alignment waste too much memory?

Thanks.
Jan 29 2007
Also, buffer is declared as:

    void[] buffer;

Is the size of void the same as char?
Jan 29 2007
"%u" <new new.com> wrote in message news:epm1ep$1ld5$1 digitaldaemon.com...
> Also buffer is declared as: void[] buffer;
> Is the size of void the same as char?

The size of a void is (AFAIK) defined to be the smallest addressable (or maybe manipulatable) data unit on the machine. So on most computers, it'll be an 8-bit byte.

The size of a char variable is always 8 bits, because it's a UTF-8 something-or-other. It's not a codepoint, it's a ...? But it's always 8 bits.

So the two sizes are the same mostly by coincidence.
Jan 29 2007
Jarrett Billingsley wrote:
> The size of a char variable is always 8 bits, because it's a UTF-8 something-or-other. It's not a codepoint, it's a ...? But it's always 8 bits.

The Unicode term is "code unit".

For the benefit of the Unicode uninitiated, the D spec could be clearer on this point. Despite its name, a char variable does not hold a character, but rather a single unit of the UTF-8 character encoding. For example, the UTF-8 code unit sequence 0xE2 0x82 0xAC decodes into U+20AC, the Unicode code point for the Euro currency symbol character, €.

Similarly, the wchar type is defined to be a UTF-16 code unit, which is usually the same as the corresponding code point, but not for code points > U+FFFF, which are encoded using 2 code units (called a surrogate pair).

The dchar type is a UTF-32 code unit. These are the same as the code points, except for values > U+10FFFF, which are beyond the range of Unicode. You are free to use out-of-range values to mean something within your application, but they will never represent Unicode characters.

Another complication arises from the fact that the UTF encodings can encode "non-character" code points (anything ending in FFFE or FFFF, such as U+FFFE or U+3FFFF). Similarly, the "surrogates" (the code points with the same values as the code units used by UTF-16 to encode code points > U+FFFF) are not characters even though they can be represented in UTF-8 or UTF-32. So even a char or wchar sequence that decodes okay, or a single dchar, may not be a "character". Again, you can use these code points within your application, but in the words of the code page for U+FFF[EF], they are "not valid for interchange".

Nothing is ever crystal clear in Unicode land.

Chris.
Jan 30 2007
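The code unit / code point distinction described in the post above can be checked directly in D. The following is a minimal sketch, assuming a current D compiler with Phobos; the helper name firstCodePoint and the choice of U+1D11E as the sample code point above U+FFFF are illustrative and do not come from the thread.

```d
import std.utf : decode;

// Hypothetical helper: decode the first code point of a UTF-8 string.
dchar firstCodePoint(string s)
{
    size_t index = 0;
    return decode(s, index);
}

void main()
{
    // char, wchar and dchar are UTF-8, UTF-16 and UTF-32 code units.
    static assert(char.sizeof == 1 && wchar.sizeof == 2 && dchar.sizeof == 4);

    // U+20AC (Euro sign) takes three UTF-8 code units: 0xE2 0x82 0xAC.
    string s = "\u20AC";
    assert(s.length == 3);
    assert(s[0] == 0xE2 && s[1] == 0x82 && s[2] == 0xAC);
    assert(firstCodePoint(s) == 0x20AC);

    // The same code point is a single UTF-16 or UTF-32 code unit.
    wstring w = "\u20AC"w;
    dstring d = "\u20AC"d;
    assert(w.length == 1 && w[0] == 0x20AC);
    assert(d.length == 1 && d[0] == 0x20AC);

    // A code point above U+FFFF needs a surrogate pair in UTF-16.
    wstring clef = "\U0001D11E"w;
    assert(clef.length == 2);
    assert(clef[0] == 0xD834 && clef[1] == 0xDD1E);
}
```

Note that .length counts code units, not characters, which is why the same Euro sign reports a length of 3 as a string and 1 as a wstring.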
"Chris Paulson-Ellis" <chris edesix.com> wrote in message news:epohto$1qsd$1 digitaldaemon.com...
> The Unicode term is "code unit".

Code _unit_. I thought it was something like that.

I think the people who come up with Unicode have a little too much time on their hands.
Jan 30 2007
Jarrett Billingsley wrote:
> Code _unit_. I thought it was something like that. I think the people who come up with Unicode have a little too much time on their hands.

Not all of them. UTF-8 was designed in one night and it is still better than UTF-16:

http://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt
Jan 31 2007
%u wrote:
> Is this code correct? I mean the object size (sz) could be any integer, how can one ensure the alignment requirement? If the above code in "Mark/Release" is incorrect, can anyone tell me how to return aligned memory pointers?

Most allocators align data to the largest primitive, excepting perhaps 'real'. The malloc() manpage says 'for any data type', which implies at least 64 bits, but I'm not sure I would rely on more than 32-bit alignment on a 32-bit system.

In the calling code you can align data with something like this (not tested, btw):

    // Allocate sz bytes, aligned to a multiple of aln.
    void[] data = new void[sz + aln-1]; // enough bytes to make this work
    size_t shift = (aln - (cast(size_t) data.ptr & (aln-1))) & (aln-1); // bytes to skip to reach alignment
    data = data[shift..sz+shift]; // slice the array at a multiple of aln

This assumes 'aln' is a power of 2. I can't imagine why a non-power-of-2 alignment would be useful, but you could replace "& (x-1)" with "% x" to get that.

Kevin
Jan 30 2007
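For the mark/release case asked about at the top of the thread, the same rounding idea can be applied to the buffer index before each allocation. This is only a sketch under stated assumptions: the Arena struct, the alignUp helper, the use of ubyte[] rather than void[] for the buffer, and the fixed alignment of 8 are all illustrative choices, not the code from the D memory management page.

```d
/// Round n up to the next multiple of aln (aln must be a power of 2).
size_t alignUp(size_t n, size_t aln) pure nothrow @nogc
{
    assert(aln != 0 && (aln & (aln - 1)) == 0, "aln must be a power of 2");
    return (n + aln - 1) & ~(aln - 1);
}

/// Hypothetical bump allocator in the spirit of the mark/release example.
struct Arena
{
    ubyte[] buffer;              // assumed backing store
    size_t bufindex;             // offset of the first free byte
    enum size_t alignment = 8;   // assumed default alignment

    /// Returns an aligned pointer to sz bytes, or null if out of space.
    void* allocate(size_t sz)
    {
        // Align the address itself, in case buffer.ptr is not aligned.
        auto base = cast(size_t) buffer.ptr;
        size_t start = alignUp(base + bufindex, alignment) - base;
        if (start > buffer.length || sz > buffer.length - start)
            return null;
        bufindex = start + sz;
        return buffer.ptr + start;
    }
}

void main()
{
    auto arena = Arena(new ubyte[](64));
    void* a = arena.allocate(3);
    void* b = arena.allocate(5);
    assert(cast(size_t) a % 8 == 0);
    assert(cast(size_t) b % 8 == 0);
    // The 3-byte object is followed by 5 bytes of padding.
    assert(cast(size_t) b - cast(size_t) a == 8);
}
```

On the question of waste: with this scheme each allocation loses at most alignment - 1 bytes to padding, so the overhead is proportionally largest for very small objects, as in the 3-byte case above.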