www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - Documentation bugs

reply =?ISO-8859-1?Q?Jari-Matti_M=E4kel=E4?= <jmjmak utu.fi.invalid> writes:


http://www.digitalmars.com/d/faq.html#real

2. There's no mention about class templates
(e.g. class foo(type) { ... }) on

http://www.digitalmars.com/d/class.html

3. There aren't any examples of classes that inherit from super classes
/ interfaces with protection attributes. In fact I don't even know how
and why they should work (maybe the way C++ uses them?). E.g.

 class foo : private bar { ... }


Walter, please don't go the c++ way here. That would make the current
interfaces totally useless. E.g.

 interface foo { void fn(); }
 class bar : private foo { private void fn() { ..implementation.. } }
 foo a = new bar();
 a.fn();

would become illegal.

4. One other question - what are the hard limits of DMD? For example,
how many objects of a class can be instantiated at most? We have made a
suffix tree -algorithm that generates a huge amount of objects. I think
the compiler introduces some limits, since the algorithm segfaults only
after about >300 identical successful runs - the amount of successful
identical runs before segfault decreases when the tree depth increases.
It segfaults on the following code:

class Node {
  Node[char] children;
  char[] label;

  void fn(char[] s) {
    assert(s !is null);
    assert(s.length > 0);
    assert(s[0] in children);
    auto a = children[s[0]];
    assert(a !is null);
    char[] l = a.label;  // segfault exactly here
    assert(l != null);
    ...
  }
}

The segfault shouldn't be possible since all objects all guaranteed to
be properly allocated from the heap. We don't use any explicit memory
management. I can post the whole source if needs be.

-- 
Jari-Matti
Mar 07 2006
next sibling parent "Walter Bright" <newshound digitalmars.com> writes:
"Jari-Matti Mäkelä" <jmjmak utu.fi.invalid> wrote in message 
news:dul6sh$30eu$1 digitaldaemon.com...
 4. One other question - what are the hard limits of DMD?
There aren't any, other than identifier length limits.
 For example,
 how many objects of a class can be instantiated at most?
Limited by memory the os can provide.
 We have made a
 suffix tree -algorithm that generates a huge amount of objects. I think
 the compiler introduces some limits, since the algorithm segfaults only
 after about >300 identical successful runs - the amount of successful
 identical runs before segfault decreases when the tree depth increases.
 It segfaults on the following code:

 class Node {
  Node[char] children;
  char[] label;

  void fn(char[] s) {
    assert(s !is null);
    assert(s.length > 0);
    assert(s[0] in children);
    auto a = children[s[0]];
    assert(a !is null);
    char[] l = a.label;  // segfault exactly here
    assert(l != null);
    ...
  }
 }

 The segfault shouldn't be possible since all objects all guaranteed to
 be properly allocated from the heap. We don't use any explicit memory
 management. I can post the whole source if needs be.
I'd start adding in more asserts to track it down.
Mar 07 2006
prev sibling parent reply Thomas Kuehne <thomas-dloop kuehne.cn> writes:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jari-Matti Mäkelä schrieb am 2006-03-08:

[snip]

 4. One other question - what are the hard limits of DMD? For example,
 how many objects of a class can be instantiated at most? We have made a
 suffix tree -algorithm that generates a huge amount of objects. I think
 the compiler introduces some limits, since the algorithm segfaults only
 after about >300 identical successful runs - the amount of successful
 identical runs before segfault decreases when the tree depth increases.
 It segfaults on the following code:

 class Node {
   Node[char] children;
   char[] label;

   void fn(char[] s) {
     assert(s !is null);
     assert(s.length > 0);
     assert(s[0] in children);
     auto a = children[s[0]];
     assert(a !is null);
     char[] l = a.label;  // segfault exactly here
     assert(l != null);
     ...
   }
 }

 The segfault shouldn't be possible since all objects all guaranteed to
 be properly allocated from the heap. We don't use any explicit memory
 management. I can post the whole source if needs be.
Are you sure you aren't hitting the memory limit of your system? If so, it's a bug reported about one year ago: (no exection is thrown if the system is out of memory) http://dstress.kuehne.cn/run/OutOfMemory_03.d Thomas -----BEGIN PGP SIGNATURE----- iD8DBQFEELEX3w+/yD4P9tIRAng8AKCjx9Kf/WPjydCDTUKFWOfDIf6yWgCguFIu 3TCxRGfxa7bw/dN3kfl8KKE= =j/IH -----END PGP SIGNATURE-----
Mar 09 2006
parent reply =?UTF-8?B?SmFyaS1NYXR0aSBNw6RrZWzDpA==?= <jmjmak utu.fi.invalid> writes:
Thomas Kuehne wrote:
 Jari-Matti Mýkelý schrieb am 2006-03-08:
 
 [snip]
 
 4. One other question - what are the hard limits of DMD? For example,
 how many objects of a class can be instantiated at most? We have made a
 suffix tree -algorithm that generates a huge amount of objects. I think
 the compiler introduces some limits, since the algorithm segfaults only
 after about >300 identical successful runs - the amount of successful
 identical runs before segfault decreases when the tree depth increases.
 It segfaults on the following code:

 class Node {
   Node[char] children;
   char[] label;

   void fn(char[] s) {
     assert(s !is null);
     assert(s.length > 0);
     assert(s[0] in children);
     auto a = children[s[0]];
     assert(a !is null);
     char[] l = a.label;  // segfault exactly here
     assert(l != null);
     ...
   }
 }

 The segfault shouldn't be possible since all objects all guaranteed to
 be properly allocated from the heap. We don't use any explicit memory
 management. I can post the whole source if needs be.
Are you sure you aren't hitting the memory limit of your system?
No, I think I'm not. The program is consuming < 50 MB of memory and I have > 2 GB of physical RAM + swap. It takes a while to fill all memory (especially the swap partition), but this algorithm segfaults already in 1-2 seconds. I have another D program that is trying to fill up the memory as fast as possible with small linked lists. It's only able to occupy ~300 MB in 2 seconds. I'm not sure, but I think the problem is in the slicing code (most probably), in associative arrays or in the code that creates new instances of classes. I'm working on a minimal test case. It's ~200 LOC now. -- Jari-Matti
Mar 09 2006
parent reply =?UTF-8?B?SmFyaS1NYXR0aSBNw6RrZWzDpA==?= <jmjmak utu.fi.invalid> writes:
Jari-Matti Mäkelä wrote:
 Thomas Kuehne wrote:
 Jari-Matti Mýkelý schrieb am 2006-03-08:

 [snip]

 4. One other question - what are the hard limits of DMD? For example,
 how many objects of a class can be instantiated at most? We have made a
 suffix tree -algorithm that generates a huge amount of objects. I think
 the compiler introduces some limits, since the algorithm segfaults only
 after about >300 identical successful runs - the amount of successful
 identical runs before segfault decreases when the tree depth increases.
 It segfaults on the following code:

 class Node {
   Node[char] children;
   char[] label;

   void fn(char[] s) {
     assert(s !is null);
     assert(s.length > 0);
     assert(s[0] in children);
     auto a = children[s[0]];
     assert(a !is null);
     char[] l = a.label;  // segfault exactly here
     assert(l != null);
     ...
   }
 }

 The segfault shouldn't be possible since all objects all guaranteed to
 be properly allocated from the heap. We don't use any explicit memory
 management. I can post the whole source if needs be.
Are you sure you aren't hitting the memory limit of your system?
No, I think I'm not. The program is consuming < 50 MB of memory and I have > 2 GB of physical RAM + swap. It takes a while to fill all memory (especially the swap partition), but this algorithm segfaults already in 1-2 seconds. I have another D program that is trying to fill up the memory as fast as possible with small linked lists. It's only able to occupy ~300 MB in 2 seconds. I'm not sure, but I think the problem is in the slicing code (most probably), in associative arrays or in the code that creates new instances of classes. I'm working on a minimal test case. It's ~200 LOC now.
Well, one thing I found out is that the algorithm segfaults exactly when the input strings for the suffix tree are long enough (>180 chars) or the algorithm is repeated several hundreds of times. It works perfectly with shorter input strings. If I try to writefln the segfaulting node contents, the runtime reports that the node labels are invalid unicode - this is very strange since the input strings consist of characters a-f, which are represented using 8-bit values and should remain valid throughout the slicing processes. Ok, then I tried to change the data type from char[] to dchar[] and now I'm not encountering any problems anymore. Maybe there are some problems with the implementation of slices on the char-type? One other thing I noticed was that this segfaulted sometimes too: char[] teststring = "testing"; doAlgorithm(teststring); writefln("%s", teststring); // segfault here. the algorithm doesn't write to the string, change its length nor does it free/delete it in any way. Just plain slicing and reference passing. Very weird. -- Jari-Matti
Mar 09 2006
parent reply Thomas Kuehne <thomas-dloop kuehne.cn> writes:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jari-Matti Mäkelä schrieb am 2006-03-10:
 Jari-Matti Mäkelä wrote:
 Thomas Kuehne wrote:
 Jari-Matti Mýkelý schrieb am 2006-03-08:

 [snip]

 4. One other question - what are the hard limits of DMD? For example,
 how many objects of a class can be instantiated at most? We have made a
 suffix tree -algorithm that generates a huge amount of objects. I think
 the compiler introduces some limits, since the algorithm segfaults only
 after about >300 identical successful runs - the amount of successful
 identical runs before segfault decreases when the tree depth increases.
 It segfaults on the following code:

 class Node {
   Node[char] children;
   char[] label;

   void fn(char[] s) {
     assert(s !is null);
     assert(s.length > 0);
     assert(s[0] in children);
     auto a = children[s[0]];
     assert(a !is null);
     char[] l = a.label;  // segfault exactly here
     assert(l != null);
     ...
   }
 }

 The segfault shouldn't be possible since all objects all guaranteed to
 be properly allocated from the heap. We don't use any explicit memory
 management. I can post the whole source if needs be.
Are you sure you aren't hitting the memory limit of your system?
No, I think I'm not. The program is consuming < 50 MB of memory and I have > 2 GB of physical RAM + swap. It takes a while to fill all memory (especially the swap partition), but this algorithm segfaults already in 1-2 seconds. I have another D program that is trying to fill up the memory as fast as possible with small linked lists. It's only able to occupy ~300 MB in 2 seconds. I'm not sure, but I think the problem is in the slicing code (most probably), in associative arrays or in the code that creates new instances of classes. I'm working on a minimal test case. It's ~200 LOC now.
Well, one thing I found out is that the algorithm segfaults exactly when the input strings for the suffix tree are long enough (>180 chars) or the algorithm is repeated several hundreds of times. It works perfectly with shorter input strings. If I try to writefln the segfaulting node contents, the runtime reports that the node labels are invalid unicode - this is very strange since the input strings consist of characters a-f, which are represented using 8-bit values and should remain valid throughout the slicing processes. Ok, then I tried to change the data type from char[] to dchar[] and now I'm not encountering any problems anymore. Maybe there are some problems with the implementation of slices on the char-type? One other thing I noticed was that this segfaulted sometimes too: char[] teststring = "testing"; doAlgorithm(teststring); writefln("%s", teststring); // segfault here. the algorithm doesn't write to the string, change its length nor does it free/delete it in any way. Just plain slicing and reference passing. Very weird.
Please replace
 doAlgorithm(teststring);
with
 doAlgorithm(teststring.dup);
I know the behaviour shouldn't change as "the algorithm doesn't write to the string", but ... Thomas -----BEGIN PGP SIGNATURE----- iD8DBQFEETQL3w+/yD4P9tIRAtUIAKDTHmAHkQhPfBujP3rXeVsqzu/CjQCgqaOZ mVWeOozmJt+bYTDJxrWZH2E= =xbc+ -----END PGP SIGNATURE-----
Mar 10 2006
parent reply =?UTF-8?B?SmFyaS1NYXR0aSBNw6RrZWzDpA==?= <jmjmak utu.fi.invalid> writes:
Thomas Kuehne wrote:
 Jari-Matti Mýkelý schrieb am 2006-03-10:
 char[] teststring = "testing";
 doAlgorithm(teststring);
 writefln("%s", teststring); // segfault here.

 the algorithm doesn't write to the string, change its length nor does it
 free/delete it in any way. Just plain slicing and reference passing.
 Very weird.
Please replace
 doAlgorithm(teststring);
with
 doAlgorithm(teststring.dup);
I know the behaviour shouldn't change as "the algorithm doesn't write to the string", but ...
I tried that one too. Didn't help in any way. The only assignment-operations I have are the following: class Node { char[] label; this(char[] foo) { label = foo.dup; } void shorten(int c) { label = label[c..$].dup; } } Somehow I though that the CoW prevents the misuse of strings. The dchar-version works perfectly without .dup-cloning on any arbitrary long input. The char-version doesn't work, if too many iterations/nodes are created, but works perfectly when run once with short input strings. The maximum input string length is then a constant. I guess it might be valuable to end up with a short test case here. I'll try my best to create one. -- Jari-Matti
Mar 10 2006
parent reply =?UTF-8?B?SmFyaS1NYXR0aSBNw6RrZWzDpA==?= <jmjmak utu.fi.invalid> writes:
Jari-Matti Mäkelä wrote:
 Somehow I though that the CoW prevents the misuse of strings. The
 dchar-version works perfectly without .dup-cloning on any arbitrary long
 input. The char-version doesn't work, if too many iterations/nodes are
 created, but works perfectly when run once with short input strings. The
 maximum input string length is then a constant. I guess it might be
 valuable to end up with a short test case here. I'll try my best to
 create one.
Actually the dchar-version doesn't work perfectly either. I tried to use the code coverage (dmd -cov) and it "failed": (as you can see, it shouldn't be even possible to run those commented lines.) |import std.stdio, suffixnode, abstractsearcher; | |class SuffixTreeSearcher(T) : AbstractSearcher!(T) { | T[] catenatedStrings; | T[] sentinels; | alias SuffixNode!(T) GenericNode; | GenericNode root; 130| this(T[][] strings, T[] sentinels ...) | { 1| this.sentinels = sentinels; 1| catenatedStrings = strings[0]~sentinels[0]~ strings[1]~sentinels[1]; | } | public T[] searchLCS() { 1| root = new GenericNode(); 206| for (int i = 0; i < catenatedStrings.length; ++i) { 185| root.addSuffix(catenatedStrings[i..$]); | } 84| return null; 388| //return root.mark(sentinels); | } 83| | char[] name() { 1| return "Suffix tree"; | } |} suffixtreesearcher.d is 100% covered I works as it should only when I comment the addSuffix-line: |import std.stdio, suffixnode, abstractsearcher; | |class SuffixTreeSearcher(T) : AbstractSearcher!(T) { | T[] catenatedStrings; | T[] sentinels; | alias SuffixNode!(T) GenericNode; | GenericNode root; | this(T[][] strings, T[] sentinels ...) | { 1| this.sentinels = sentinels; 1| catenatedStrings = strings[0]~sentinels[0]~ strings[1]~sentinels[1]; | } | public T[] searchLCS() { 1| root = new GenericNode(); 206| for (int i = 0; i < catenatedStrings.length; ++i) { |// root.addSuffix(catenatedStrings[i..$]); | } 1| return null; | //return root.mark(sentinels); | } | | char[] name() { 1| return "Suffix tree"; | } |} suffixtreesearcher.d is 100% covered Well, the only think I can think of is that the algorithm truly corrupts the runtime code by using array slicing. I also noticed that although it works fine without -cov, running it >=2 times segfaults with -cov. -- Jari-Matti
Mar 14 2006
parent reply Thomas Kuehne <thomas-dloop kuehne.cn> writes:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jari-Matti Mäkelä schrieb am 2006-03-14:
 Jari-Matti Mäkelä wrote:
 Somehow I though that the CoW prevents the misuse of strings. The
 dchar-version works perfectly without .dup-cloning on any arbitrary long
 input. The char-version doesn't work, if too many iterations/nodes are
 created, but works perfectly when run once with short input strings. The
 maximum input string length is then a constant. I guess it might be
 valuable to end up with a short test case here. I'll try my best to
 create one.
Actually the dchar-version doesn't work perfectly either. I tried to use the code coverage (dmd -cov) and it "failed": (as you can see, it shouldn't be even possible to run those commented lines.) |import std.stdio, suffixnode, abstractsearcher; | |class SuffixTreeSearcher(T) : AbstractSearcher!(T) { | T[] catenatedStrings; | T[] sentinels; | alias SuffixNode!(T) GenericNode; | GenericNode root; 130| this(T[][] strings, T[] sentinels ...) | { 1| this.sentinels = sentinels; 1| catenatedStrings = strings[0]~sentinels[0]~ strings[1]~sentinels[1]; | } | public T[] searchLCS() { 1| root = new GenericNode(); 206| for (int i = 0; i < catenatedStrings.length; ++i) { 185| root.addSuffix(catenatedStrings[i..$]); | } 84| return null; 388| //return root.mark(sentinels); | } 83| | char[] name() { 1| return "Suffix tree"; | } |} suffixtreesearcher.d is 100% covered I works as it should only when I comment the addSuffix-line: |import std.stdio, suffixnode, abstractsearcher; | |class SuffixTreeSearcher(T) : AbstractSearcher!(T) { | T[] catenatedStrings; | T[] sentinels; | alias SuffixNode!(T) GenericNode; | GenericNode root; | this(T[][] strings, T[] sentinels ...) | { 1| this.sentinels = sentinels; 1| catenatedStrings = strings[0]~sentinels[0]~ strings[1]~sentinels[1]; | } | public T[] searchLCS() { 1| root = new GenericNode(); 206| for (int i = 0; i < catenatedStrings.length; ++i) { |// root.addSuffix(catenatedStrings[i..$]); | } 1| return null; | //return root.mark(sentinels); | } | | char[] name() { 1| return "Suffix tree"; | } |} suffixtreesearcher.d is 100% covered Well, the only think I can think of is that the algorithm truly corrupts the runtime code by using array slicing. I also noticed that although it works fine without -cov, running it >=2 times segfaults with -cov.
The code above isn't complete thus makes testing it a bit difficult. As a start, let's try to find out the real coverage numbers: replace
  for (int i = 0; i < catenatedStrings.length; ++i) {
      root.addSuffix(catenatedStrings[i..$]);
  }
with something like
  writefln("LOOP_PRE");
  for (int i = 0; writefln("LOOP_CHECK"), i < catenatedStrings.length; ++i,
writefln("LOOP_INC")) {
      writefln("LOOP");
      root.addSuffix(catenatedStrings[i..$]);
  }
  writefln("LOOP_POST");
That isn't the kind of code I usually want to touch, but since the RT (and maybe the line numbers in the debug info) might be corrupted... Thomas -----BEGIN PGP SIGNATURE----- iD8DBQFEH0zB3w+/yD4P9tIRAqe/AJ9D9Tt8k9MNR0dfTMbSBVDoZhm8FQCcC3ff 552DecBfxViqJKyiEQiTx14= =giOX -----END PGP SIGNATURE-----
Mar 20 2006
parent =?UTF-8?B?SmFyaS1NYXR0aSBNw6RrZWzDpA==?= <jmjmak utu.fi.invalid> writes:
Thomas Kuehne wrote:
 The code above isn't complete thus makes testing it a bit difficult.
 As a start, let's try to find out the real coverage numbers:
 
 replace
 
  for (int i = 0; i < catenatedStrings.length; ++i) {
      root.addSuffix(catenatedStrings[i..$]);
  }
with something like
  writefln("LOOP_PRE");
  for (int i = 0; writefln("LOOP_CHECK"), i < catenatedStrings.length; ++i,
writefln("LOOP_INC")) {
      writefln("LOOP");
      root.addSuffix(catenatedStrings[i..$]);
  }
  writefln("LOOP_POST");
That isn't the kind of code I usually want to touch, but since the RT (and maybe the line numbers in the debug info) might be corrupted...
I'm really grateful that you want to help here. Ok, here are the results: First I commented the original loop and copy-pasted it beneath the original: /++ for (int i = 0; i < catenatedStrings.length; ++i) { root.addSuffix(catenatedStrings[i..$]); } ++/ for (int i = 0; i < catenatedStrings.length; ++i) { root.addSuffix(catenatedStrings[i..$]); } Guess what, the runtime code segfaults! Then I removed one line from the commented part: /++ for (int i = 0; i < catenatedStrings.length; ++i) { root.addSuffix(catenatedStrings[i..$]); ++/ for (int i = 0; i < catenatedStrings.length; ++i) { root.addSuffix(catenatedStrings[i..$]); } Still segfaults. Then yet another line: /++ for (int i = 0; i < catenatedStrings.length; ++i) { ++/ for (int i = 0; i < catenatedStrings.length; ++i) { root.addSuffix(catenatedStrings[i..$]); } Now it "works"! (But the coverage info is still wrong) Then I tried the writefln-method: <snip> | public T[] searchLCS() { 2| root = new GenericNode(); | 2| writefln("Pre loop"); 652| for (int i = 0; writefln("Loop check"), i < catenatedStrings.length; ++i, writefln("Loop inc")) { 324| root.addSuffix(catenatedStrings[i..$]); 312| } 2| writefln("Post loop"); 312| 1438| return root.searchLCS(sentinels); | } 312| <snip> I wasn't able to split the for-statement fully, but here is the best non-segfaulting version: <snip> | public T[] searchLCS() { 2| root = new GenericNode(); | 2| writefln("Pre loop"); 328| for (int i = 0; writefln("Loop check"), i < catenatedStrings.length; 324| ++i, 334| writefln("Loop inc")) { 324| root.addSuffix(catenatedStrings[i..$]); 334| } 1550| writefln("Post loop"); | 336| return root.searchLCS(sentinels); | } <snip> I haven't got much spare time to hunt this bug down, but I'll try to create a "minimal" test case this week. Maybe you can check it out then? There are at least two cases here, the one segfaults without code coverage on (char[]-strings) and the other one pops out when coverage is turned on. I've tested these with DMD 0.147, 0.148 & 0.149, but now I'm going to upgrade to 0.150. -- Jari-Matti
Mar 21 2006