digitalmars.D - What a nice bug!

reply "Denis Koroskin" <2korden gmail.com> writes:
After a *very* small refactoring that involved only renaming a class and its
corresponding file, I got a strange bug that prints "object.Error: Access
Violation" and hangs my GUI application.
After 2 hours of tracking the issue down, I narrowed it to the following code
(heavily modified to reduce its size as much as possible):

final void startMonitoring()
{
    // removing this line makes the issue disappear
    _monitorInfos.length = _monitorInfos.length + 1;

    char[] buffer = null;
    buffer.length = 4096;
}

Note that even though buffer is no longer used, it is what causes the AV and
the hang.
The funny thing is that if you change "buffer.length = 4096;" to "buffer.length
= 4095;", the bug goes away!
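
A threshold at exactly 4096 suggests checking whether the request crosses a size boundary in the allocator. Below is a minimal probe, not a diagnosis. It assumes a current D runtime where arrays expose the .capacity property (which may not be available in DMD 2.030), capacityFor is a hypothetical helper name, and the numbers printed depend on the runtime in use:

```d
import std.stdio;

// Hypothetical helper: allocate a char[] of n elements the same way the
// snippet above does and report how much room the GC block really has.
size_t capacityFor(size_t n)
{
    char[] buffer = null;
    buffer.length = n;
    return buffer.capacity;
}

void main()
{
    // Compare the size that works with the size that triggers the bug.
    foreach (n; [cast(size_t) 4095, 4096])
        writefln("length %s -> capacity %s", n, capacityFor(n));
}
```

If the two capacities differ by much more than one byte, the two requests are being served from different block sizes, which would at least explain why the symptom is sensitive to that exact number.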

Since I got no stack trace from the exception, I have no idea where it comes
from, but it definitely crashes somewhere else.
Even after tracking it this far, there is still too much code to post (500+ KB
of D source code), so I'm trying to reduce it further. The problem is that
changing nearly anything either leads to another crash or makes the problem
disappear. For example, there is now a dummy thread that does absolutely
nothing (I removed all references to it and all of its functionality while
tracking this down):

void run()
{
    while (true) {
        Sleep(100);
    }
}

Thread mainThread = new Thread(&run);
mainThread.start();

But if I remove this thread (comment out "mainThread.start()" or make run()
empty), the program crashes somewhere else, in the GUI code (An exception was
thrown while finalizing an instance of class
ui.control.impl.win32.Win32FormImpl.Win32FormImpl - object.Error: Access
Violation). I don't know what the hell that is either, and I'm trying to hunt
it down.

P.S. DMD 2.030 + Phobos
May 18 2009
next sibling parent reply "Denis Koroskin" <2korden gmail.com> writes:
Arrrgh! Nevermind!

The minute I made a post I realized I forgot to recompile one of the dependency
libraries (win32 headers, moved out to a separate library). Looks like that's
what was a problem.
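
For anyone hitting the same thing, here is a rough sketch of why a stale library can produce crashes in seemingly unrelated places. It assumes the headers changed a struct layout between builds. All names (WindowInfoOld, WindowInfoNew, staleLibrarySetFlags, callStaleLibrary) are hypothetical and not taken from the actual project:

```d
import std.stdio;

// Hypothetical layout the library was compiled against.
struct WindowInfoOld { int handle; int flags; }

// Hypothetical layout after the headers changed: a field was inserted.
struct WindowInfoNew { int handle; int style; int flags; }

// Stands in for code inside the stale library: it still writes `flags`
// at the offset it had in the old layout.
void staleLibrarySetFlags(void* p, int value)
{
    (cast(WindowInfoOld*) p).flags = value;
}

// Stands in for freshly compiled application code using the new layout.
WindowInfoNew callStaleLibrary()
{
    WindowInfoNew info;
    staleLibrarySetFlags(&info, 42);
    return info;
}

void main()
{
    auto info = callStaleLibrary();
    // The value lands in `style`, and `flags` is left untouched.
    writeln("flags = ", info.flags, ", style = ", info.style);
}
```

Nothing fails at the point of the bad write. The corruption only shows up later, wherever the wrong field is eventually read, which would match symptoms that move around when unrelated code changes.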

Sorry for a buzz.
May 18 2009
parent div0 <div0 users.sourceforge.net> writes:
Denis Koroskin wrote:
 Arrrgh! Nevermind!
 
 The minute I made a post I realized I forgot to recompile one of the
 dependency libraries (win32 headers, moved out to a separate library).
 Looks like that's what was a problem.
 
 Sorry for a buzz.

Have you updated to the correct winmain usage?

http://d.puremagic.com/issues/show_bug.cgi?id=2580

I was having occasional bizarre crashes also.
Fixed now I've updated winmain.

-- 
My enormous talent is exceeded only by my outrageous laziness.
http://www.ssTk.co.uk
May 18 2009
prev sibling next sibling parent Kagamin <spam here.lot> writes:
Denis Koroskin Wrote:

 Arrrgh! Nevermind!
 
 The minute I made a post I realized I forgot to recompile one of the
 dependency libraries (win32 headers, moved out to a separate library).
 Looks like that's what was a problem.
 
 Sorry for a buzz.

I also hit a ghostly AV bug which showed up intermittently from run to run and sometimes disappeared after recompilation. I tracked it to the towlower function, then it disappeared. I f33r it still lurks somewhere out there.
May 18 2009
prev sibling parent reply "Denis Koroskin" <2korden gmail.com> writes:
Sorry for bumping this thread once more, but I came across this issue again
today.

It took me all day to cut my project down as much as possible while still
preserving this bug.

Changing nearly anything also makes the bug vanish. Here are some examples:

1) There are a few cases where I extend an empty interface:
module dice.build.Source;
interface Source {}

module dice.build.d.DSourceFile;
class DSourceFile : Source { /* ... */ }

Removing the interface inheritance makes the bug disappear.

2) There are more than 20 empty D source files in this project (!), and
removing _any_ of them also makes this bug disappear!

3) Some data structures acquired padding fields during code simplification:

module dice.filesystem.FileSystemMonitor;
struct MonitorInfo
{
    void[] buffer;
    char[13] padding; // reducing it to 12 or lower makes bug disappear
}

4) There are a few cases like this in the code:

module dice.filesystem.FileSystemMonitor;
buffer.length = 4096;

Reducing this number to 4095 or lower also makes the bug disappear.

5) There is a do-nothing thread, which is never accessed. Its run method looks
like this:
module dice.filesystem.FileSystemMonitor;

private final void run()
{
    while (true) {
        Sleep(100);
    }
}

Making the run() method empty (i.e. letting this thread finish) causes a crash
in another part of the program, and I have no idea why:
An exception was thrown while finalizing an instance of class
ui.control.impl.win32.Win32TreeControlImpl.Win32TreeControlImpl

This class has a parent with a __dtor that doesn't do anything special.
Removing ~this still crashes the program.
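
Regarding example 3, a quick way to see what the 12 vs. 13 threshold changes is to print the struct size for both padding lengths. This is only a sketch: the templated MonitorInfo below is a parameterized stand-in for the real struct, introduced here for illustration, and the sizes depend on the target (on a 32-bit build I would expect 20 and 24 bytes, since the struct is rounded up to the alignment of the slice):

```d
import std.stdio;

// Parameterized stand-in for MonitorInfo so both padding sizes can be
// compared in one program; N is the length of the padding array.
struct MonitorInfo(size_t N)
{
    void[] buffer;
    char[N] padding;
}

void main()
{
    writeln("padding 12: ", MonitorInfo!12.sizeof, " bytes");
    writeln("padding 13: ", MonitorInfo!13.sizeof, " bytes");
}
```

If the size jumps between the two, then growing an array of these structs requests a different amount of memory, which would fit with the suspicion that the problem is tied to allocation sizes rather than to the struct contents.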

Looks like bad codegen or a GC issue, although it may be my bug, too (but I
have no idea what could cause it).
I suspect it has something to do with memory management, because I could
easily remove code that doesn't allocate anything, but couldn't remove most of
the code that allocates.

I tried disabling the GC, but it didn't help.

Could anyone take a look at it, try to reproduce it, or offer some ideas or
advice? Any help is _very_ appreciated!

Here is my test case: http://www.sendspace.com/file/2laod7 (I couldn't find a
better place to store the attachment, sorry)

Extract its contents to d:\Projects\dice (or to C:\Projects\dice, and reflect
the change in main.d); the bug may not be reproducible otherwise. Build and run
it with build_and_run.bat.
Win32 only (as of now).

Also, while debugging this I came across an OPTLINK unexpected termination bug
while linking the executable. Adding one more empty file, or removing another
one, resulted in proper linking. Should I submit a bug report about it? I
didn't because there are already lots of similar ones (so it may be a
duplicate), the test case is huge, and chances are it won't get fixed anyway.
May 25 2009
parent "Denis Koroskin" <2korden gmail.com> writes:
Sorry for a lot of mistakes in this post, I'm sleepy (it's almost 5 am here)
and forgot to make a second pass through the text to fix them.
May 25 2009