digitalmars.D.learn - enum ubyte[] vs enum ubyte[3]

reply "Johannes Pfau" <spam example.com> writes:
Hi,
I'm currently patching Ragel (http://www.complang.org/ragel/) to generate  
D2 compatible code. Right now it creates output like this for static  
arrays:
------------------------
enum ubyte[] _parseResponseLine_key_offsets = [
	0, 0, 17, 18, 37, 41, 42, 44,
	50, 51, 57, 58, 78, 98, 118, 136,
	138, 141, 143, 146, 148, 150, 152, 153,
	159, 160, 160, 162, 164
];
------------------------
Making it output "enum ubyte[30]" would be more complicated, so I wonder  
if there's a difference between "enum ubyte[]" and "enum ubyte[30]"?

-- 
Johannes Pfau
Dec 20 2010
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Johannes Pfau:

Hello Johannes and thank you for developing your tool for D2 too :-)


 Making it output "enum ubyte[30]" would be more complicated, so I wonder  
 if there's a difference between "enum ubyte[]" and "enum ubyte[30]"?
In D1 an enum ubyte[] is a compile-time constant dynamic array of unsigned bytes; it is a two-word struct that contains a pointer and a length. In D1 you express the same thing with "const ubyte[]".

In D2 an "enum ubyte[30]" is a compile-time constant fixed-size array of 30 unsigned bytes that gets passed around by value. In D1 a "const ubyte[30]" is a compile-time constant fixed-size array of 30 unsigned bytes that gets passed around by reference. So they are two different things, and you use one or the other according to your needs.

Currently there are also some performance issues in D with enums that get re-created each time you use them (this is true for associative arrays, but I don't remember if it is true for dynamic arrays too). So it is better to look at the generated asm to be sure, if you want to avoid performance pitfalls.

Regardless of the array kind you want to use, also take a look at "Hex Strings":
http://www.digitalmars.com/d/2.0/lex.html
They allow you to write byte arrays as hex data:
x"00 FBCD 32FD 0A"

Bye,
bearophile
Dec 20 2010
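A minimal D2 sketch of the distinction described above. The table names and the two helper functions are hypothetical and are not Ragel output; the element values are just the first four entries of the table from the question. It checks the two types at compile time and shows that the fixed-size array is copied when passed, while the dynamic array is a pointer/length pair referring to shared storage.

```d
// Hypothetical tables, shortened from the example in the question.
enum ubyte[]  dynTable   = [0, 0, 17, 18]; // dynamic array (slice)
enum ubyte[4] fixedTable = [0, 0, 17, 18]; // fixed-size (static) array

// The two declarations have different types.
static assert(is(typeof(dynTable) == ubyte[]));
static assert(is(typeof(fixedTable) == ubyte[4]));

// A slice is a pointer plus a length; a static array is just its elements.
static assert(typeof(dynTable).sizeof == 2 * size_t.sizeof);
static assert(typeof(fixedTable).sizeof == 4);

// Receives its own copy of all four bytes, so the caller's array is untouched.
void clobberCopy(ubyte[4] table) { table[0] = 99; }

// Receives a pointer/length pair, so the write is visible to the caller.
void clobberShared(ubyte[] table) { table[0] = 99; }

unittest
{
    ubyte[4] fixed = fixedTable;
    clobberCopy(fixed);
    assert(fixed[0] == 0);

    ubyte[] dyn = dynTable;
    clobberShared(dyn);
    assert(dyn[0] == 99);
}
```

The unittest block runs when the module is compiled with `dmd -unittest -main`.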
parent reply "Johannes Pfau" <spam example.com> writes:
At 20.12.2010, 11:02, bearophile wrote <bearophileHUGS lycos.com>:
 Hello Johannes and thank you for developing your tool for D2 too :-)
Actually it's not mine, I'm just a regular user. I don't think I could ever understand the finite state machine code (especially because it's c++), but patching the c/d1 codegen to output d2 code is easy enough ;-)
 In D1 an enum ubyte[] is a compile-time constant dynamic array of
 unsigned bytes; it is a two-word struct that contains a pointer and a
 length.
Did you mean in D2? I feared that, so I'll have to do some extra work...
 In D2 an "enum ubyte[30]" is a compile-time constant fixed-size array of
 30 unsigned bytes that gets passed around by value.
Yep, that's what I want.
 Regardless of the array kind you want to use, also take a look at "Hex
 Strings":
 http://www.digitalmars.com/d/2.0/lex.html
 They allow you to write byte arrays as hex data:
 x"00 FBCD 32FD 0A"
That's interesting, I'll have a look at it, but ragel shares big parts of the c/c++/d code, so as long as the C syntax works there's no need to change that.
 Bye,
 bearophile
Thanks for your help! -- Johannes Pfau
Dec 20 2010
parent bearophile <bearophileHUGS lycos.com> writes:
Johannes Pfau:

 Did you mean in D2?
Right, sorry. Bye, bearophile
Dec 20 2010
prev sibling next sibling parent reply Nick Voronin <elfy.nv gmail.com> writes:
On Mon, 20 Dec 2010 10:26:16 +0100
"Johannes Pfau" <spam example.com> wrote:

 Hi,
 I'm currently patching Ragel (http://www.complang.org/ragel/) to generate  
 D2 compatible code.
Interesting. Ragel-generated code works fine for me in D2. I suppose it mostly uses such a restricted C-like subset of the language that it didn't change much from D1 to D2.

But if you are going to patch it, please make it add extra {} around action code! The thing is that when there is a label before a {} block (and in Ragel-generated code I saw it's always so) the block isn't considered a new scope, which causes problems when you have local variable declarations inside actions.

Anyway, good luck with whatever you plan :) Ragel is cool.
 Right now it creates output like this for static  
 arrays:
 ------------------------
 enum ubyte[] _parseResponseLine_key_offsets = [
 	0, 0, 17, 18, 37, 41, 42, 44,
 	50, 51, 57, 58, 78, 98, 118, 136,
 	138, 141, 143, 146, 148, 150, 152, 153,
 	159, 160, 160, 162, 164
 ];
 ------------------------
 Making it output "enum ubyte[30]" would be more complicated, so I wonder  
 if there's a difference between "enum ubyte[]" and "enum ubyte[30]"?
One is fixed size array and other is dynamic. Honestly I doubt that it matters for code generated by Ragel, since this is constant and won't be passed around. If it's harder to make it fixed-size then don't bother. -- Nick Voronin <elfy.nv gmail.com>
Dec 20 2010
parent reply "Johannes Pfau" <spam example.com> writes:
On Monday, December 20, 2010, Nick Voronin <elfy.nv gmail.com> wrote:

 On Mon, 20 Dec 2010 10:26:16 +0100
 "Johannes Pfau" <spam example.com> wrote:

 Hi,
 I'm currently patching Ragel (http://www.complang.org/ragel/) to  
 generate
 D2 compatible code.
Interesting. Ragel-generated code works fine for me in D2. I suppose it mostly uses such a restricted C-like subset of language that it didn't change much from D1 to D2.
The most important change is const correctness. Because of that table based output didn't work with D2. And you couldn't directly pass const data (like string.ptr) to Ragel.
 But if you are going to patch it, please make it add extra {} around  
 action code! The thing is that when there is a label before {} block  
 (and in ragel generated code I saw it's always so) the block isn't  
 considered as a new scope which causes problems when you have local  
 variables declaration inside actions.
You mean like this code:
---------------------------------
tr15:
#line 228 "jpf/http/parser.rl"
	{
		if(start != p)
		{
			key = line[(start - line.ptr) .. (p - line.ptr)];
		}
	}
---------------------------------
should become: ?
---------------------------------
tr15:
#line 228 "jpf/http/parser.rl"
	{{
		if(start != p)
		{
			key = line[(start - line.ptr) .. (p - line.ptr)];
		}
	}}
---------------------------------
 One is fixed size array and other is dynamic. Honestly I doubt that it  
 matters for code generated by Ragel, since this is constant and won't be  
 passed around. If it's harder to make it fixed-size then don't bother.
Could a dynamic array cause heap allocations, even if its data is never changed? If not, dynamic arrays would work fine.

-- 
Johannes Pfau
Dec 20 2010
parent reply Nick Voronin <elfy.nv gmail.com> writes:
On Mon, 20 Dec 2010 17:17:05 +0100
"Johannes Pfau" <spam example.com> wrote:

 But if you are going to patch it, please make it add extra {} around  
 action code! The thing is that when there is a label before {} block  
 (and in ragel generated code I saw it's always so) the block isn't  
 considered as a new scope which causes problems when you have local  
 variables declaration inside actions.
You mean like this code:
---------------------------------
tr15:
#line 228 "jpf/http/parser.rl"
	{
		if(start != p)
		{
			key = line[(start - line.ptr) .. (p - line.ptr)];
		}
	}
---------------------------------
should become: ?
---------------------------------
tr15:
#line 228 "jpf/http/parser.rl"
	{{
		if(start != p)
		{
			key = line[(start - line.ptr) .. (p - line.ptr)];
		}
	}}
---------------------------------
Yes. This way it becomes a scope which is kind of what one would expect from it.
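A small sketch of why the doubled braces help, assuming the behavior described above (a block that directly follows a label does not open its own scope). The function and the labels `tr1`, `tr2`, and `done` are hypothetical stand-ins for generated transition code, not actual Ragel output. Both actions declare a local named `tmp`; with the inner braces each declaration lives in its own scope, so the two cannot clash.

```d
// Hypothetical stand-in for generated goto-driven code with two actions.
int runActions(int input)
{
    int result = 0;

    if (input > 0)
        goto tr1;
    goto tr2;

tr1:
    {{
        // Local to this action only, thanks to the inner braces.
        int tmp = input * 2;
        result += tmp;
    }}
    goto done;

tr2:
    {{
        // Same name as above, but in a separate scope.
        int tmp = -input;
        result += tmp;
    }}

done:
    return result;
}

unittest
{
    assert(runActions(3) == 6);
    assert(runActions(-4) == 4);
}
```

The doubled form compiles whether or not the labeled block counts as a scope, which is presumably why it is a safe choice for a code generator.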
 One is fixed size array and other is dynamic. Honestly I doubt that it  
 matters for code generated by Ragel, since this is constant and won't be  
 passed around. If it's harder to make it fixed-size then don't bother.
Could a dynamic array cause heap allocations, even if its data is never changed? If not, dynamic arrays would work fine.
Sorry, I can't provide reliable information on what can happen in general, but right now there is no difference in the code produced for accessing elements of enum ubyte[] and enum ubyte[30]. In both cases the constants are embedded directly in the code. In fact, as long as you only access its elements (no passing the array as an argument, no assignment to another variable and no accessing .ptr) there is no array object at all. If you do any of those, a new object is created every time. I believe Ragel doesn't generate code which passes tables around, so it doesn't matter. -- Nick Voronin <elfy.nv gmail.com>
Dec 20 2010
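To make the allocation question concrete, here is a hedged sketch that contrasts an `enum` table with a `static immutable` one. The `static immutable` alternative is not suggested anywhere in the thread and is shown only as an assumption about what a generator could emit instead; all names are hypothetical. The asserts only check things that hold regardless of compiler optimizations.

```d
// Manifest constant: a use that needs a real array materializes the literal.
enum ubyte[] enumTable = [0, 0, 17, 18];

// Stored once; every use refers to the same storage.
static immutable ubyte[] storedTable = [0, 0, 17, 18];

// Indexing stored data needs no allocation, so this is accepted under @nogc.
ubyte lookup(size_t i) @nogc nothrow
{
    return storedTable[i];
}

// Returning the enum as a slice builds an array for the caller.
ubyte[] freshCopy()
{
    return enumTable;
}

// Returning the immutable table hands out a view of the single stored copy.
immutable(ubyte)[] sharedView()
{
    return storedTable;
}

unittest
{
    assert(lookup(2) == 17);
    assert(freshCopy() == sharedView());          // same contents
    assert(sharedView().ptr is sharedView().ptr); // same storage every time
}
```

Whether plain element access on the `enum` version allocates depends on the compiler, so inspecting the generated code remains the reliable check.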
parent "Johannes Pfau" <spam example.com> writes:
On Tuesday, December 21, 2010, Nick Voronin <elfy.nv gmail.com> wrote:

 On Mon, 20 Dec 2010 17:17:05 +0100
 "Johannes Pfau" <spam example.com> wrote:

 But if you are going to patch it, please make it add extra {} around
 action code! The thing is that when there is a label before {} block
 (and in ragel generated code I saw it's always so) the block isn't
 considered as a new scope which causes problems when you have local
 variables declaration inside actions.
You mean like this code:
---------------------------------
tr15:
#line 228 "jpf/http/parser.rl"
	{
		if(start != p)
		{
			key = line[(start - line.ptr) .. (p - line.ptr)];
		}
	}
---------------------------------
should become: ?
---------------------------------
tr15:
#line 228 "jpf/http/parser.rl"
	{{
		if(start != p)
		{
			key = line[(start - line.ptr) .. (p - line.ptr)];
		}
	}}
---------------------------------
Yes. This way it becomes a scope which is kind of what one would expect from it.
OK, I sent an updated patch to the ragel mailing list.
 One is fixed size array and other is dynamic. Honestly I doubt that it
 matters for code generated by Ragel, since this is constant and won't be
 passed around. If it's harder to make it fixed-size then don't bother.
Could a dynamic array cause heap allocations, even if its data is never changed? If not, dynamic arrays would work fine.
Sorry, I can't provide reliable information on what can happen in general, but right now there is no difference in produced code accessing elements of enum ubyte[] and enum ubyte[30]. In both cases constants are directly embedded in code. In fact as long as you only access its elements (no passing array as an argument, no assignment to another variable and no accessing .ptr) there is no array object at all. If you do -- new object is created every time you do. I believe Ragel doesn't generate code which passes tables around, so it doesn't matter.
Well Adrian Thurston said he'd look into this issue when he merges the D2 patch, so I guess we'll get the correct arrays anyway ;-) -- Johannes Pfau
Dec 21 2010
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday 20 December 2010 01:26:16 Johannes Pfau wrote:
 Hi,
 I'm currently patching Ragel (http://www.complang.org/ragel/) to generate
 D2 compatible code. Right now it creates output like this for static
 arrays:
 ------------------------
 enum ubyte[] _parseResponseLine_key_offsets = [
 	0, 0, 17, 18, 37, 41, 42, 44,
 	50, 51, 57, 58, 78, 98, 118, 136,
 	138, 141, 143, 146, 148, 150, 152, 153,
 	159, 160, 160, 162, 164
 ];
 ------------------------
 Making it output "enum ubyte[30]" would be more complicated, so I wonder
 if there's a difference between "enum ubyte[]" and "enum ubyte[30]"?
ubyte[] is a dynamic array. ubyte[30] is a static array. They are inherently different types. The fact that you're dealing with an enum is irrelevant. So, the code that you're generating is _not_ a static array. It's a dynamic array. This is inherently different from C or C++ where having [] on a type (whether it has a number or not) is _always_ a static array. - Jonathan M Davis
Dec 20 2010
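If emitting the element count is the hard part for a generator, one possible workaround is to let the compiler derive it. This is only a sketch under that assumption and is not something proposed in the thread; the names `keyOffsetsInit` and `keyOffsets` are hypothetical. The generator keeps writing the `[]` form, and a second declaration turns it into a genuine static array whose length is taken from the literal.

```d
// What the generator already emits: a dynamic-array manifest constant.
enum ubyte[] keyOffsetsInit = [0, 0, 17, 18, 37];

// A real static array; its length comes from the literal above,
// so the generator never has to count the elements itself.
static immutable ubyte[keyOffsetsInit.length] keyOffsets = keyOffsetsInit;

// The two declarations really are different types.
static assert(is(typeof(keyOffsetsInit) == ubyte[]));
static assert(is(typeof(keyOffsets) == immutable(ubyte[5])));

unittest
{
    assert(keyOffsets.length == 5);
    assert(keyOffsets[4] == 37);
}
```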