D - Some more ideas

cblack01 (26/26) Aug 24 2002 Hello D World!

Pavel Minayev (6/9) Aug 24 2002 It was discussed not long ago. Walter says it cannot be done, even thoug...

Walter (12/20) Aug 24 2002 array

Pavel Minayev (9/17) Aug 25 2002

Walter (11/27) Aug 25 2002 wrote:

Pavel Minayev (8/11) Aug 25 2002

Burton Radons (7/18) Aug 25 2002 I've done this for my port. Works nice. Assignment, returning, and

Walter (11/27) Aug 26 2002 wrote:

Juan Carlos Arevalo Baeza (10/14) Aug 26 2002 Of course, you can always make a case for this, even when bitmasking ...

Mac Reiter (9/18) Aug 26 2002 Unfortunately, while this is valid in most cases, it is not always valid...
Walter (4/16) Aug 26 2002 It's not so easy, because the compiler has to prove that a.length doesn'...

Pavel Minayev (5/13) Aug 26 2002

Walter (5/16) Aug 26 2002 wrote:

Pavel Minayev (19/21) Aug 27 2002 Just how many dynamic arrays you have in your program? std::vector also ...

Walter (17/38) Aug 27 2002 wrote:

Pavel Minayev (11/17) Aug 27 2002 You get something close to that in a parser. For a large program, with t...

Mac Reiter (19/36) Aug 27 2002 Firstly, I suspect that Walter has a very good understanding of how pars...
Sean L. Palmer (12/29) Aug 27 2002 The trick to game programming memory allocation is to allocate all objec...

Russell Lewis (4/4) Aug 28 2002 One advantage of the gc being a mark & sweep rather than a refcounter is...

Walter (3/7) Aug 29 2002 Nope. You're right. There are a lot of methods of eliminating the gc pau...

Walter (12/31) Aug 29 2002 wrote:

Juan Carlos Arevalo Baeza (10/16) Aug 27 2002 You can always store the size _and_ capacity as part of the allocated

Walter (7/21) Aug 29 2002 one.

Pavel Minayev (6/8) Aug 29 2002
Juan Carlos Arevalo Baeza (14/23) Aug 29 2002 No. Just allocate it together (in the same block of memory as) the ar...

Walter (11/33) Aug 29 2002 allocated

Sean L. Palmer (9/11) Aug 30 2002 A slice is just a pointer and size register pair, right? So just mentio...

Walter (7/13) Aug 30 2002 Yes.

Sean L. Palmer (12/25) Aug 31 2002 Right. It would be a good source of bugs. Just like iterators in STL. ...

Walter (6/13) Aug 31 2002 during

anderson (44/58) Sep 01 2002 Currently, how do you resize slices?

Walter (8/47) Sep 01 2002 allocate a new size and copy.

anderson (7/68) Sep 02 2002 set

Walter (6/7) Sep 02 2002 Right now, a reference to a dynamic array is a 2 word quantity. It fits

chris jones (18/37) Sep 11 2002 allocated

Walter (9/18) Sep 11 2002 of

Roland (7/11) Sep 02 2002 it seems to me not much if end_index (a.length & 7FFFFFFF) is evaluated ...

Walter (6/16) Sep 02 2002 only

Patrick Down (14/17) Aug 25 2002 I have another question along these lines.

Walter (3/20) Aug 25 2002 The entire block will be held.

Patrick Down (5/6) Aug 25 2002 That's what I thought. This should probably be

"cblack01" <cblack01 cox.net> writes:

Hello D World!

I've been looking over the specs for D.  I really like the way it's
evolving.  As always, I have some ideas.

My first suggestion is to use the std::vector resizing algorithm for array
resizing.  This way, you can add elements to an array without having to
worry about inefficiency.

Another suggestion is to include set syntax.  I think Delphi has sets if I'm
not mistaken.  Sets simplify syntax and save lots of typing.  There are two
ways to use sets.  The first is with the "in" keyword, or "is an element
of", which would work like this:

int a;
...
if(a in {1 .. 3, 6, 9}) { ... }

Note the {1 .. 3, 6, 9} following the "in" keyword is a set.  The compiler
would translate the above expression into:

if (a >=1 && a <=3 || a ==6 || a==9) { ... }

Sets could also be used to perform iterations.  Perhaps something like:

for(int a = {1 .. 3, 6, 9}) { ... }

And the for loop would work as you would expect.  The expression could be
expanded by defining the "..." to be a function.  Then the expression would
expand to:

void func(int a) { ... }
for(int a = 1; a <=3; a++)func(a);
func(6);
func(9);

-Craig

Aug 24 2002

Pavel Minayev <evilone omen.ru> writes:

On Sat, 24 Aug 2002 14:56:10 -0400 "cblack01" <cblack01 cox.net> wrote:

 My first suggestion is to use the std::vector resizing algorithm for array
 resizing.  This way, you can add elements to an array without having to
 worry about inefficiency.

It was discussed not long ago. Walter says it cannot be done, even though I 
don't really
understand reasons - but as soon as templates get implemented, we could write
our own vector.

Aug 24 2002

"Walter" <walter digitalmars.com> writes:

"Pavel Minayev" <evilone omen.ru> wrote in message
news:CFN374931050334954 news.digitalmars.com...
 On Sat, 24 Aug 2002 14:56:10 -0400 "cblack01" <cblack01 cox.net> wrote:

 My first suggestion is to use the std::vector resizing algorithm for


array
 resizing.  This way, you can add elements to an array without having to
 worry about inefficiency.

 It was discussed not long ago. Walter says it cannot be done, even though

I
 don't really
 understand reasons - but as soon as templates get implemented, we could

write
 our own vector.

The reason is the resizer doesn't know if some array slice is using the
memory just above it. For example, allocate an array of 100. Take a slice
from 0..50. Now take the slice, and resize it up to 80. It will step all
over the original array. For example:

    char a[] = new char[100];
    char b[] = a[0..50];
    b.length = b.length + 30;

Aug 24 2002

Pavel Minayev <evilone omen.ru> writes:

On Sat, 24 Aug 2002 20:29:23 -0700 "Walter" <walter digitalmars.com> wrote:

 The reason is the resizer doesn't know if some array slice is using the
 memory just above it. For example, allocate an array of 100. Take a slice
 from 0..50. Now take the slice, and resize it up to 80. It will step all
 over the original array. For example:
 
     char a[] = new char[100];
     char b[] = a[0..50];
     b.length = b.length + 30;

 
The answer is simple: slices should not be resized (and the compiler must be
able to catch it). This probably needs some kind of flag, for example the most
significant bit of length could serve as an indicator. Resizing slices is a 
weird
idea anyhow... Of course ~= should also be forbidden, as well as any other 
operations that change length. Once slice is taken, it cannot be changed - only
data within it can.

Aug 25 2002

"Walter" <walter digitalmars.com> writes:

"Pavel Minayev" <evilone omen.ru> wrote in message
news:CFN374934732140394 news.digitalmars.com...
 On Sat, 24 Aug 2002 20:29:23 -0700 "Walter" <walter digitalmars.com>

wrote:
 The reason is the resizer doesn't know if some array slice is using the
 memory just above it. For example, allocate an array of 100. Take a


slice
 from 0..50. Now take the slice, and resize it up to 80. It will step all
 over the original array. For example:

     char a[] = new char[100];
     char b[] = a[0..50];
     b.length = b.length + 30;

 The answer is simple: slices should not be resized (and the compiler must

be
 able to catch it). This probably needs some kind of flag, for example the

most
 significant bit of length could serve as an indicator. Resizing slices is

a
 weird
 idea anyhow... Of course ~= should also be forbidden, as well as any other
 operations that change length. Once slice is taken, it cannot be changed -

only
 data within it can.

The problem is distinguishing a slice from the original allocation. Having
an extra bit I suspect will kill off any performance gains that you'd get
from resizing in place.

Aug 25 2002

Pavel Minayev <evilone omen.ru> writes:

On Sun, 25 Aug 2002 09:25:23 -0700 "Walter" <walter digitalmars.com> wrote:

 The problem is distinguishing a slice from the original allocation. Having
 an extra bit I suspect will kill off any performance gains that you'd get
 from resizing in place.

 
Why? It would be just a single bitwise AND to check the bit, and, if it is a 
slice, another one to
reset it. On the other hand, adding elements to end of list would be _hundred_ 
times faster
on large lists.

Aug 25 2002

Burton Radons <loth users.sourceforge.net> writes:

Pavel Minayev wrote:

 On Sun, 25 Aug 2002 09:25:23 -0700 "Walter" <walter digitalmars.com> wrote:
 
The problem is distinguishing a slice from the original allocation. Having
an extra bit I suspect will kill off any performance gains that you'd get
from resizing in place.

  
 Why? It would be just a single bitwise AND to check the bit, and, if it is a 
 slice, another one to
 reset it. On the other hand, adding elements to end of list would be _hundred_ 
 times faster
 on large lists. 

I've done this for my port.  Works nice.  Assignment, returning, and 
passing as an in argument also clear the bit, although that doesn't have 
to be done if the previous owner doesn't take advantage of it or is 
going out of scope.  The only costly operation is in finding out how 
much more space a list has in it, but this is nothing compared to doing 
allocation too much.

Aug 25 2002

"Walter" <walter digitalmars.com> writes:

"Burton Radons" <loth users.sourceforge.net> wrote in message
news:3D6938BA.1070809 users.sourceforge.net...
 Pavel Minayev wrote:
 On Sun, 25 Aug 2002 09:25:23 -0700 "Walter" <walter digitalmars.com>


wrote:
The problem is distinguishing a slice from the original allocation.



Having
an extra bit I suspect will kill off any performance gains that you'd



get
from resizing in place.

 Why? It would be just a single bitwise AND to check the bit, and, if it


is a
 slice, another one to
 reset it. On the other hand, adding elements to end of list would be


_hundred_
 times faster
 on large lists.

 I've done this for my port.  Works nice.  Assignment, returning, and
 passing as an in argument also clear the bit, although that doesn't have
 to be done if the previous owner doesn't take advantage of it or is
 going out of scope.  The only costly operation is in finding out how
 much more space a list has in it, but this is nothing compared to doing
 allocation too much.

for (i = 0; i < a.length; i++)

becomes:

   for (i = 0; i < (a.length & 7FFFFFFF); i++)

which can have a significant speed penalty.

Aug 26 2002

"Juan Carlos Arevalo Baeza" <jcab JCABs-Rumblings.com> writes:

"Walter" <walter digitalmars.com> wrote in message
news:akdohj$2m1h$1 digitaldaemon.com...


 for (i = 0; i < a.length; i++)

 becomes:

    for (i = 0; i < (a.length & 7FFFFFFF); i++)

 which can have a significant speed penalty.

   Of course, you can always make a case for this, even when bitmasking is
not necessary:

int n = a.length & 7FFFFFFF;
for (i = 0; i < n; i++)

   Although, I would expect any decent compiler (wink, wink) to do that
optimization for you. Ahem...

Salutaciones,
                       JCAB

Aug 26 2002

Mac Reiter <Mac_member pathlink.com> writes:

    for (i = 0; i < (a.length & 7FFFFFFF); i++)

 which can have a significant speed penalty.

   Of course, you can always make a case for this, even when bitmasking is
not necessary:

int n = a.length & 7FFFFFFF;
for (i = 0; i < n; i++)

   Although, I would expect any decent compiler (wink, wink) to do that
optimization for you. Ahem...

Unfortunately, while this is valid in most cases, it is not always valid.  If
something inside the 'for' loop resized 'a', the original loop would still be
correct (at least in terms of not going outside the loop.  'i' would usually
also have to be manipulated by whatever did the resizing to make the loop
logically correct).  The optimized loop would not notice the change to length,
and could cause a memory violation, corruption, or incomplete loop.  If it were
a compiler optimization, presumably the compiler could note the array resizing
inside the loop body and avoid the optimization.

Mac

Aug 26 2002

"Walter" <walter digitalmars.com> writes:

"Juan Carlos Arevalo Baeza" <jcab JCABs-Rumblings.com> wrote in message
news:akds2h$2pvb$1 digitaldaemon.com...
 "Walter" <walter digitalmars.com> wrote in message
 news:akdohj$2m1h$1 digitaldaemon.com...
 for (i = 0; i < a.length; i++)
 becomes:
    for (i = 0; i < (a.length & 7FFFFFFF); i++)
 which can have a significant speed penalty.

    Of course, you can always make a case for this, even when bitmasking is
 not necessary:

 int n = a.length & 7FFFFFFF;
 for (i = 0; i < n; i++)

    Although, I would expect any decent compiler (wink, wink) to do that
 optimization for you. Ahem...

It's not so easy, because the compiler has to prove that a.length doesn't
change inside the loop.

Aug 26 2002

Pavel Minayev <evilone omen.ru> writes:

On Mon, 26 Aug 2002 10:28:16 -0700 "Walter" <walter digitalmars.com> wrote:

 for (i = 0; i < a.length; i++)
 
 becomes:
 
    for (i = 0; i < (a.length & 7FFFFFFF); i++)
 
 which can have a significant speed penalty.
 

 
Store the flag in the capacity field, not in the length (capacity as in 
std::vector).
It is only queried whenever new element is added.

Aug 26 2002

"Walter" <walter digitalmars.com> writes:

"Pavel Minayev" <evilone omen.ru> wrote in message
news:CFN374949765569907 news.digitalmars.com...
 On Mon, 26 Aug 2002 10:28:16 -0700 "Walter" <walter digitalmars.com>

wrote:
 for (i = 0; i < a.length; i++)

 becomes:

    for (i = 0; i < (a.length & 7FFFFFFF); i++)

 which can have a significant speed penalty.

 Store the flag in the capacity field, not in the length (capacity as in
 std::vector).
 It is only queried whenever new element is added.

That makes a dynamic array a 3 word quantity instead of a 2 word one.
There's a significant penalty for that.

Aug 26 2002

Pavel Minayev <evilone omen.ru> writes:

On Mon, 26 Aug 2002 17:46:32 -0700 "Walter" <walter digitalmars.com> wrote:

 That makes a dynamic array a 3 word quantity instead of a 2 word one.
 There's a significant penalty for that.

Just how many dynamic arrays you have in your program? std::vector also uses
3 words, by the way, but I haven't heard anyone complaining yet. I don't 
remember
exactly, but Delphi dynamic strings might also use 3 words.

On other hand, current implementation of dynamic arrays makes them simply 
useless
(at least for me). When you have an "append to end" ~= operator, but advised 
against
using it because it is extremely slow, it is awful. What you propose is doing 
memory
management manually, preallocating chunks of memory and filling them - back to
REDIM days? What's the point of D dynamic arrays then?

One of the nice features of D is that it claims to cover large area of STL 
functionality,
but with simpler approach. std::vector is probably the most basic container 
type,
yet there's nothing in D so far that can be really used instead of it...

Aug 27 2002

"Walter" <walter digitalmars.com> writes:

"Pavel Minayev" <evilone omen.ru> wrote in message
news:CFN374955394183796 news.digitalmars.com...
 On Mon, 26 Aug 2002 17:46:32 -0700 "Walter" <walter digitalmars.com>

wrote:
 That makes a dynamic array a 3 word quantity instead of a 2 word one.
 There's a significant penalty for that.

 Just how many dynamic arrays you have in your program?

I use them all over the place <g>. As a technical aside, being two words
means they can be implemented using register pairs, which should make them
pretty efficient. Three words means they don't get enregistered.

 std::vector also uses
 3 words, by the way, but I haven't heard anyone complaining yet. I don't
 remember
 exactly, but Delphi dynamic strings might also use 3 words.

I didn't know that.

 On other hand, current implementation of dynamic arrays makes them simply
 useless
 (at least for me). When you have an "append to end" ~= operator, but

advised
 against
 using it because it is extremely slow, it is awful. What you propose is

doing
 memory
 management manually, preallocating chunks of memory and filling them -

back to
 REDIM days? What's the point of D dynamic arrays then?

The tradeoff is to get faster loops on arrays at the expense of slower
resizing. The only time the resize speed is really a problem are things
like:
    for (i = 0; i < 1000; i++)
        a ~= "hello";
and I suggest that such uses aren't that common.

 One of the nice features of D is that it claims to cover large area of STL
 functionality,
 but with simpler approach. std::vector is probably the most basic

container
 type,
 yet there's nothing in D so far that can be really used instead of it...

Aug 27 2002

Pavel Minayev <evilone omen.ru> writes:

On Tue, 27 Aug 2002 10:36:10 -0700 "Walter" <walter digitalmars.com> wrote:

 The tradeoff is to get faster loops on arrays at the expense of slower
 resizing. The only time the resize speed is really a problem are things
 like:
     for (i = 0; i < 1000; i++)
         a ~= "hello";
 and I suggest that such uses aren't that common.

You get something close to that in a parser. For a large program, with thousands
of identifiers (e.g. Win32 headers =)), dynamic arrays grow very large, and 
every
new declared identifier requires reallocation... I noticed it when I ran my 
PAS2D
converter. It is _very_ noticeable. The same will happen in a game if one 
decides
to use a dynamic array to store game entities - due to constant creation of new
objects (projectiles, particles, blood etc). This also happens when somebody
reads a string character by character, which is quite a common operation.

Aug 27 2002

Mac Reiter <Mac_member pathlink.com> writes:

In article <CFN374959716746759 news.digitalmars.com>, Pavel Minayev says...
On Tue, 27 Aug 2002 10:36:10 -0700 "Walter" <walter digitalmars.com> wrote:

 The tradeoff is to get faster loops on arrays at the expense of slower
 resizing. The only time the resize speed is really a problem are things
 like:
     for (i = 0; i < 1000; i++)
         a ~= "hello";
 and I suggest that such uses aren't that common.

You get something close to that in a parser. For a large program, with thousands
of identifiers (e.g. Win32 headers =)), dynamic arrays grow very large, and 
every
new declared identifier requires reallocation... I noticed it when I ran my 
PAS2D
converter. It is _very_ noticeable. The same will happen in a game if one 
decides
to use a dynamic array to store game entities - due to constant creation of new
objects (projectiles, particles, blood etc). This also happens when somebody
reads a string character by character, which is quite a common operation.

Firstly, I suspect that Walter has a very good understanding of how parsers
work...

Secondly, wouldn't an associative array be better for some of these
applications?  Parsers, at least, need fast lookups, and depending on what
you're doing in your game you may also need fast lookups (dunno -- not a great
deal of real game programming experience yet -- all book learnin')  Associative
arrays *do* automatically increase in size as necessary, although I do not know
the exact details of the underlying extendible hash mechanism.  The rehash
mechanism is also quite nice for optimizing lookups after loading in an
associative array.

I would assume that associative arrays are also more efficient, at least in the
space domain, for sparse array usages.

This is not meant as an argument against auto-resizing arrays.  I just wanted to
mention that at least one form of auto-resizing array does appear to exist.
Personally, I always use .reserve even on my STL vectors.  It doesn't bother me
to determine my own growth rate mechanism.  But other people have different
tastes...

Mac

Aug 27 2002

"Sean L. Palmer" <seanpalmer earthlink.net> writes:

The trick to game programming memory allocation is to allocate all objects
from fixed pools.  You can even do that in Java.  Too bad you can't disable
the GC completely.

Sean

"Pavel Minayev" <evilone omen.ru> wrote in message
news:CFN374959716746759 news.digitalmars.com...
 On Tue, 27 Aug 2002 10:36:10 -0700 "Walter" <walter digitalmars.com>

wrote:
 The tradeoff is to get faster loops on arrays at the expense of slower
 resizing. The only time the resize speed is really a problem are things
 like:
     for (i = 0; i < 1000; i++)
         a ~= "hello";
 and I suggest that such uses aren't that common.

 You get something close to that in a parser. For a large program, with

thousands
 of identifiers (e.g. Win32 headers =)), dynamic arrays grow very large,

and
 every
 new declared identifier requires reallocation... I noticed it when I ran

my
 PAS2D
 converter. It is _very_ noticeable. The same will happen in a game if one
 decides
 to use a dynamic array to store game entities - due to constant creation

of new
 objects (projectiles, particles, blood etc). This also happens when

somebody
 reads a string character by character, which is quite a common operation.

Aug 27 2002

Russell Lewis <spamhole-2001-07-16 deming-os.org> writes:

One advantage of the gc being a mark & sweep rather than a refcounter is 
that, once you call gc.disable(), it is altogether disabled (other than 
the memory footprint for the gc code).

Am I wrong here?

Aug 28 2002

"Walter" <walter digitalmars.com> writes:

"Russell Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
news:3D6CDE4E.2030704 deming-os.org...
 One advantage of the gc being a mark & sweep rather than a refcounter is
 that, once you call gc.disable(), it is altogether disabled (other than
 the memory footprint for the gc code).

 Am I wrong here?

Nope. You're right. There are a lot of methods of eliminating the gc pause.

Aug 29 2002

"Walter" <walter digitalmars.com> writes:

"Pavel Minayev" <evilone omen.ru> wrote in message
news:CFN374959716746759 news.digitalmars.com...
 On Tue, 27 Aug 2002 10:36:10 -0700 "Walter" <walter digitalmars.com>

wrote:
 The tradeoff is to get faster loops on arrays at the expense of slower
 resizing. The only time the resize speed is really a problem are things
 like:
     for (i = 0; i < 1000; i++)
         a ~= "hello";
 and I suggest that such uses aren't that common.

 You get something close to that in a parser. For a large program, with

thousands
 of identifiers (e.g. Win32 headers =)), dynamic arrays grow very large,

and
 every
 new declared identifier requires reallocation... I noticed it when I ran

my
 PAS2D
 converter. It is _very_ noticeable.

I'm not sure what your program is doing, but if it is a symbol table, using
an associative array will get you much faster results.

 The same will happen in a game if one
 decides
 to use a dynamic array to store game entities - due to constant creation

of new
 objects (projectiles, particles, blood etc).

 This also happens when somebody
 reads a string character by character, which is quite a common operation.

What I do for those cases is read the entire file into one array, and then
map slices onto it. The wc is an example of that (also of the associative
array).

Aug 29 2002

"Juan Carlos Arevalo Baeza" <jcab JCABs-Rumblings.com> writes:

"Walter" <walter digitalmars.com> wrote in message
news:akgeie$2qh4$1 digitaldaemon.com...

 That makes a dynamic array a 3 word quantity instead of a 2 word one.
 There's a significant penalty for that.

 Just how many dynamic arrays you have in your program?

 I use them all over the place <g>. As a technical aside, being two words
 means they can be implemented using register pairs, which should make them
 pretty efficient. Three words means they don't get enregistered.

   You can always store the size _and_ capacity as part of the allocated
block of memory. It would mean an extra level of indirection for some
operations, but the dynamic array would occupy one register, instead of two.

   It's somewhat of a tradeoff, but it sounds reasonable to me: extra
indirection (optimizable in many cases), smaller register footprint and
better reallocation in the worst case.

Salutaciones,
                       JCAB

Aug 27 2002

"Walter" <walter digitalmars.com> writes:

"Juan Carlos Arevalo Baeza" <jcab JCABs-Rumblings.com> wrote in message
news:akgq9d$8c8$1 digitaldaemon.com...
 "Walter" <walter digitalmars.com> wrote in message
 news:akgeie$2qh4$1 digitaldaemon.com...
 That makes a dynamic array a 3 word quantity instead of a 2 word




one.
 There's a significant penalty for that.

 Just how many dynamic arrays you have in your program?

 I use them all over the place <g>. As a technical aside, being two words
 means they can be implemented using register pairs, which should make


them
 pretty efficient. Three words means they don't get enregistered.

    You can always store the size _and_ capacity as part of the allocated
 block of memory. It would mean an extra level of indirection for some
 operations, but the dynamic array would occupy one register, instead of

two.
    It's somewhat of a tradeoff, but it sounds reasonable to me: extra
 indirection (optimizable in many cases), smaller register footprint and
 better reallocation in the worst case.

That would require a memory allocation (for the 3 word quantity) for every
dynamic array. I suspect it would be a big hit on performance.

Aug 29 2002

Pavel Minayev <evilone omen.ru> writes:

"Walter" <walter digitalmars.com> wrote in
news:aklmo8$pm3$1 digitaldaemon.com: 

 That would require a memory allocation (for the 3 word quantity) for
 every dynamic array. I suspect it would be a big hit on performance.

 
Don't suspect. Just try it. A simple D program, written both
using current implementation of dynamic arrays, and then
another that emulates the proposal.

Aug 29 2002

"Juan Carlos Arevalo Baeza" <jcab JCABs-Rumblings.com> writes:

"Walter" <walter digitalmars.com> wrote in message
news:aklmo8$pm3$1 digitaldaemon.com...

    You can always store the size _and_ capacity as part of the allocated
 block of memory. It would mean an extra level of indirection for some
 operations, but the dynamic array would occupy one register, instead of

 two.
    It's somewhat of a tradeoff, but it sounds reasonable to me: extra
 indirection (optimizable in many cases), smaller register footprint and
 better reallocation in the worst case.

 That would require a memory allocation (for the 3 word quantity) for every
 dynamic array. I suspect it would be a big hit on performance.

   No. Just allocate it together (in the same block of memory as) the array
elements themselves. No performance hit besides the extra indirection and
any calculations neccessary to preserve alignment.

   It's a pretty common thing to do. All implementations of std::string that
I know (except one - flex_string by Alexandrescu, which allocates it
separately) of use this to store the string's reference counter, for
example.

   So, for example, an array of 64-bit doubles, if it allocates storage for
4 elements, it allocates 4*8 (elements) + 4 (size) + 4 (capacity) = 40
bytes. Simple and easy.

Salutaciones,
                       JCAB

Aug 29 2002

"Walter" <walter digitalmars.com> writes:

"Juan Carlos Arevalo Baeza" <jcab JCABs-Rumblings.com> wrote in message
news:akm8rv$1e9n$1 digitaldaemon.com...
 "Walter" <walter digitalmars.com> wrote in message
 news:aklmo8$pm3$1 digitaldaemon.com...

    You can always store the size _and_ capacity as part of the



allocated
 block of memory. It would mean an extra level of indirection for some
 operations, but the dynamic array would occupy one register, instead



of
 two.
    It's somewhat of a tradeoff, but it sounds reasonable to me: extra
 indirection (optimizable in many cases), smaller register footprint



and
 better reallocation in the worst case.

 That would require a memory allocation (for the 3 word quantity) for


every
 dynamic array. I suspect it would be a big hit on performance.

    No. Just allocate it together (in the same block of memory as) the

array
 elements themselves. No performance hit besides the extra indirection and
 any calculations neccessary to preserve alignment.

    It's a pretty common thing to do. All implementations of std::string

that
 I know (except one - flex_string by Alexandrescu, which allocates it
 separately) of use this to store the string's reference counter, for
 example.

    So, for example, an array of 64-bit doubles, if it allocates storage

for
 4 elements, it allocates 4*8 (elements) + 4 (size) + 4 (capacity) = 40
 bytes. Simple and easy.

Allocating it together means that slices (as currently done) cannot work. A
slice would require copying the array contents.

Aug 29 2002

"Sean L. Palmer" <seanpalmer earthlink.net> writes:

A slice is just a pointer and size register pair, right?  So just mention in
the spec that if you resize a dynamic array, all slices of it become
invalid.

Or maybe resizing a dynamic array that has had slices made of it, requires
it to be copied (leaving the old array untouched, until GC reclaims it)

Sean

"Walter" <walter digitalmars.com> wrote in message
news:akmeg2$1k62$1 digitaldaemon.com...
 Allocating it together means that slices (as currently done) cannot work.

A
 slice would require copying the array contents.

Aug 30 2002

"Walter" <walter digitalmars.com> writes:

"Sean L. Palmer" <seanpalmer earthlink.net> wrote in message
news:aknd4g$2kgn$1 digitaldaemon.com...
 A slice is just a pointer and size register pair, right?

Yes.

 So just mention in
 the spec that if you resize a dynamic array, all slices of it become
 invalid.

That's the best idea yet. It might actually work <g>. The downside, of
course, is it can lead to subtle and disastrous programming bugs.

 Or maybe resizing a dynamic array that has had slices made of it, requires
 it to be copied (leaving the old array untouched, until GC reclaims it)

That turns every slice into a function call with some arbitrary code being
executed.

Aug 30 2002

"Sean L. Palmer" <seanpalmer earthlink.net> writes:

"Walter" <walter digitalmars.com> wrote in message
news:akpd4j$231a$1 digitaldaemon.com...
 "Sean L. Palmer" <seanpalmer earthlink.net> wrote in message
 news:aknd4g$2kgn$1 digitaldaemon.com...
 A slice is just a pointer and size register pair, right?

 Yes.

 So just mention in
 the spec that if you resize a dynamic array, all slices of it become
 invalid.

 That's the best idea yet. It might actually work <g>. The downside, of
 course, is it can lead to subtle and disastrous programming bugs.

Right.  It would be a good source of bugs.  Just like iterators in STL.  ;)

 Or maybe resizing a dynamic array that has had slices made of it,


requires
 it to be copied (leaving the old array untouched, until GC reclaims it)

 That turns every slice into a function call with some arbitrary code being
 executed.

No, that means every time you slice an array you have to set a flag bit
somewhere in the array so that when the array is resized, it will know it
must be copied somewhere else safe, leaving the existing slices unharmed,
and of course clear the bit in the new copy.  99% of the work happens during
darray resize, which is going to be slow anyway.

Maybe it's a bad idea... that setting of a bit per slice operation may add
up to be noticeable.

Sean

Aug 31 2002

"Walter" <walter digitalmars.com> writes:

"Sean L. Palmer" <seanpalmer earthlink.net> wrote in message
news:akptsn$2l4q$1 digitaldaemon.com...
 No, that means every time you slice an array you have to set a flag bit
 somewhere in the array so that when the array is resized, it will know it
 must be copied somewhere else safe, leaving the existing slices unharmed,
 and of course clear the bit in the new copy.  99% of the work happens

during
 darray resize, which is going to be slow anyway.

 Maybe it's a bad idea... that setting of a bit per slice operation may add
 up to be noticeable.

This would require not only setting a bit per slice, but setting that bit
every time you make a copy of a reference to an array. Issues similar to
reference counting come up.

Aug 31 2002

"anderson" <anderson firestar.com.au> writes:

Currently, how do you resize slices?

Just another idea to add to the list.
Parhaps an array could be taged as either having extra elements or not using
a single bit. Arrays that do not have this bit set would be treated
normally, until they are resized. When an array is resized the bit is set
and it is oversized my a predefinded step. When the gc runs it could also
perform an action to reclaim memory, by simply removing the bit.

ie
For example (jumps of 10)
Length = 8 size  = 8 items  (with out bit set)
Length = 8 size = 10 items  (with bit set)
Length = 12 size  = 12 items  (with out bit set)
Length = 12 size = 20 items  (with bit set)

This way there would be only one extra bit needed per array.

If you have memory to spare a frame count could be added to the array (ie a
byte), so that array items could have slighly longer life times.

Still theres problems with slices.

I think slices should be dealt with like this at compile time as a constant
array.

const int[] Slice;
const int[] Slice2;
int [] something;
...

//Assign slice
Slice = Something[1..10];

//As a parmitor
function(const int[] aSlice)
{

}

//Implicit conversion
Something = Slice;

//Implicit conversion
Slice = Something;

//Error
Slice = Slice ~ Somthing;

//Impicit conversion
Something = Slice ~ Something;

//Fine
Slice = Slice2;

"Walter" <walter digitalmars.com> wrote in message
news:akr27q$143p$1 digitaldaemon.com...
 "Sean L. Palmer" <seanpalmer earthlink.net> wrote in message
 news:akptsn$2l4q$1 digitaldaemon.com...
 No, that means every time you slice an array you have to set a flag bit
 somewhere in the array so that when the array is resized, it will know


it
 must be copied somewhere else safe, leaving the existing slices


unharmed,
 and of course clear the bit in the new copy.  99% of the work happens

 during
 darray resize, which is going to be slow anyway.

 Maybe it's a bad idea... that setting of a bit per slice operation may


add
 up to be noticeable.

 This would require not only setting a bit per slice, but setting that bit
 every time you make a copy of a reference to an array. Issues similar to
 reference counting come up.

Sep 01 2002

"Walter" <walter digitalmars.com> writes:

"anderson" <anderson firestar.com.au> wrote in message
news:aktj5r$t5d$1 digitaldaemon.com...
 Currently, how do you resize slices?

allocate a new size and copy.

 Just another idea to add to the list.
 Parhaps an array could be taged as either having extra elements or not

using
 a single bit. Arrays that do not have this bit set would be treated
 normally, until they are resized. When an array is resized the bit is set
 and it is oversized my a predefinded step. When the gc runs it could also
 perform an action to reclaim memory, by simply removing the bit.

 ie
 For example (jumps of 10)
 Length = 8 size  = 8 items  (with out bit set)
 Length = 8 size = 10 items  (with bit set)
 Length = 12 size  = 12 items  (with out bit set)
 Length = 12 size = 20 items  (with bit set)

 This way there would be only one extra bit needed per array.

 If you have memory to spare a frame count could be added to the array (ie

a
 byte), so that array items could have slighly longer life times.

 Still theres problems with slices.

 I think slices should be dealt with like this at compile time as a

constant
 array.

 const int[] Slice;
 const int[] Slice2;
 int [] something;
 ...

 //Assign slice
 Slice = Something[1..10];

 //As a parmitor
 function(const int[] aSlice)
 {

 }

 //Implicit conversion
 Something = Slice;

 //Implicit conversion
 Slice = Something;

 //Error
 Slice = Slice ~ Somthing;

 //Impicit conversion
 Something = Slice ~ Something;

 //Fine
 Slice = Slice2;

Setting a bit means that accessing an array means masking off the bit. There
can be a significant penalty for loops doing this.

Sep 01 2002

"anderson" <anderson firestar.com.au> writes:

"Walter" <walter digitalmars.com> wrote in message
news:aku2uu$1e8m$1 digitaldaemon.com...
 "anderson" <anderson firestar.com.au> wrote in message
 news:aktj5r$t5d$1 digitaldaemon.com...
 Currently, how do you resize slices?

 allocate a new size and copy.

 Just another idea to add to the list.
 Parhaps an array could be taged as either having extra elements or not

 using
 a single bit. Arrays that do not have this bit set would be treated
 normally, until they are resized. When an array is resized the bit is


set
 and it is oversized my a predefinded step. When the gc runs it could


also
 perform an action to reclaim memory, by simply removing the bit.

 ie
 For example (jumps of 10)
 Length = 8 size  = 8 items  (with out bit set)
 Length = 8 size = 10 items  (with bit set)
 Length = 12 size  = 12 items  (with out bit set)
 Length = 12 size = 20 items  (with bit set)

 This way there would be only one extra bit needed per array.

 If you have memory to spare a frame count could be added to the array


(ie
 a
 byte), so that array items could have slighly longer life times.

 Still theres problems with slices.

 I think slices should be dealt with like this at compile time as a

 constant
 array.

 const int[] Slice;
 const int[] Slice2;
 int [] something;
 ...

 //Assign slice
 Slice = Something[1..10];

 //As a parmitor
 function(const int[] aSlice)
 {

 }

 //Implicit conversion
 Something = Slice;

 //Implicit conversion
 Slice = Something;

 //Error
 Slice = Slice ~ Somthing;

 //Impicit conversion
 Something = Slice ~ Something;

 //Fine
 Slice = Slice2;

 Setting a bit means that accessing an array means masking off the bit.

There
 can be a significant penalty for loops doing this.

Howable using an entire byte for a lifetime value instead.

Sep 02 2002

"Walter" <walter digitalmars.com> writes:

"anderson" <anderson firestar.com.au> wrote in message
news:akvere$2li$1 digitaldaemon.com...
 Howable using an entire byte for a lifetime value instead.

Right now, a reference to a dynamic array is a 2 word quantity. It fits
nicely into the code generator which deals efficently with register pairs.
Increasing the size of the reference will make for much less efficient code
to handle the references.

Sep 02 2002

"chris jones" <flak clara.co.uk> writes:

"Juan Carlos Arevalo Baeza" <jcab JCABs-Rumblings.com> wrote in message
news:akm8rv$1e9n$1 digitaldaemon.com...
 "Walter" <walter digitalmars.com> wrote in message
 news:aklmo8$pm3$1 digitaldaemon.com...

    You can always store the size _and_ capacity as part of the



allocated
 block of memory. It would mean an extra level of indirection for some
 operations, but the dynamic array would occupy one register, instead



of
 two.
    It's somewhat of a tradeoff, but it sounds reasonable to me: extra
 indirection (optimizable in many cases), smaller register footprint



and
 better reallocation in the worst case.

 That would require a memory allocation (for the 3 word quantity) for


every
 dynamic array. I suspect it would be a big hit on performance.

    No. Just allocate it together (in the same block of memory as) the

array
 elements themselves. No performance hit besides the extra indirection and
 any calculations neccessary to preserve alignment.

    It's a pretty common thing to do. All implementations of std::string

that
 I know (except one - flex_string by Alexandrescu, which allocates it
 separately) of use this to store the string's reference counter, for
 example.

Thats how Delphi does it, a dynamic array is a pointer, the first 2 words of
the memory block are the ref count and the number of elements . The pointer
points to the first element of the array so the ref cound is at -8 and
lenght is at -4.

Also in delhpi strings are arrays with copy on write but the neat thing
Delphi does is that it automaticly maintans a trailing null char at the end.
So to cast a delphi string to a C string you just point at the first char,
zero overhead. And because the lenght is stored you dont have to scan the
string to determine its length.

chris

Sep 11 2002

"Walter" <walter digitalmars.com> writes:

"chris jones" <flak clara.co.uk> wrote in message
news:alo02r$njr$2 digitaldaemon.com...
 Thats how Delphi does it, a dynamic array is a pointer, the first 2 words

of
 the memory block are the ref count and the number of elements . The

pointer
 points to the first element of the array so the ref cound is at -8 and
 lenght is at -4.

That will work, but it's slower than D's implementation. It also makes
slices not possible without copying.

 Also in delhpi strings are arrays with copy on write but the neat thing
 Delphi does is that it automaticly maintans a trailing null char at the

end.
 So to cast a delphi string to a C string you just point at the first char,
 zero overhead. And because the lenght is stored you dont have to scan the
 string to determine its length.

D in general tries to put 0 bytes at the ends of character strings for the
same reason, but it is not possible for slices.

Sep 11 2002

Roland <rv ronetech.com> writes:

Walter a �crit :

 for (i = 0; i < a.length; i++)

 becomes:

    for (i = 0; i < (a.length & 7FFFFFFF); i++)

 which can have a significant speed penalty.

it seems to me not much if end_index (a.length & 7FFFFFFF) is evaluated only
once.
if end_index need to be evaluated at each iteration of the loop, the loop can
be rewritten like this:

for (i=0  |  (a.lenght & (0x1000000)); i < a.length; i++)

roland

Sep 02 2002

"Walter" <walter digitalmars.com> writes:

"Roland" <rv ronetech.com> wrote in message
news:3D73B504.4DE3E1C9 ronetech.com...
 Walter a �crit :
 for (i = 0; i < a.length; i++)
 becomes:
    for (i = 0; i < (a.length & 7FFFFFFF); i++)
 which can have a significant speed penalty.

 it seems to me not much if end_index (a.length & 7FFFFFFF) is evaluated

only
 once.
 if end_index need to be evaluated at each iteration of the loop, the loop

can
 be rewritten like this:

 for (i=0  |  (a.lenght & (0x1000000)); i < a.length; i++)

True, but proving that the length of the array does not change within the
loop, directly or indirectly, is non-trivial.

Sep 02 2002

Patrick Down <pat codemoon.com> writes:

"Walter" <walter digitalmars.com> wrote in
news:ak9ime$bdv$2 digitaldaemon.com: 

     char a[] = new char[100];
     char b[] = a[0..50];
     b.length = b.length + 30;

I have another question along these lines.

char[] a;
a.length = 1000000;

char[] b;
b = a[9000..9010];

char[] c;
a = c;

Now except for 9000..9010 the rest of the
1000000 byte array is unreferenced.  Does
b hold the entire original memory block from 
being collected or is there some gc magic 
that will reclaim the unreferenced parts?

Aug 25 2002

"Walter" <walter digitalmars.com> writes:

"Patrick Down" <pat codemoon.com> wrote in message
news:Xns927562C96AF5Dpatcodemooncom 63.105.9.61...
 "Walter" <walter digitalmars.com> wrote in
 news:ak9ime$bdv$2 digitaldaemon.com:

     char a[] = new char[100];
     char b[] = a[0..50];
     b.length = b.length + 30;

 I have another question along these lines.

 char[] a;
 a.length = 1000000;

 char[] b;
 b = a[9000..9010];

 char[] c;
 a = c;

 Now except for 9000..9010 the rest of the
 1000000 byte array is unreferenced.  Does
 b hold the entire original memory block from
 being collected or is there some gc magic
 that will reclaim the unreferenced parts?

The entire block will be held.

Aug 25 2002

Patrick Down <pat codemoon.com> writes:

"Walter" <walter digitalmars.com> wrote in news:akb9vs$28kh$2
 digitaldaemon.com: 

 The entire block will be held.

That's what I thought.  This should probably be
in the "Things to be careful about" section of
"Tips and Tricks of the D programming gurus".

Aug 25 2002

D Programming

C/C++ Programming

Other

D - Some more ideas