
digitalmars.D.learn - ARSD PNG memory usage

reply Joerg Joergonson <JJoergonson gmail.com> writes:
Hi, so, do you have any idea why when I load an image with png.d 
it takes a ton of memory?

I have a 3360x2100 image that should take around 26MB of memory 
uncompressed, plus a bunch of other smaller PNG files.

Are you keeping multiple buffers of the image around? A 
TrueImage, a MemoryImage, an OpenGL texture thing that might be 
in main memory, etc.? Total file size of all the images is only 
about 3MB compressed and 40MB uncompressed, so it's using around 
10x more memory than it should! I tried a GC collect and all that.

I don't think my program will have a chance in hell using that 
much memory; that's just a few images for gui work. I'll be 
loading full-page PNGs later on, possibly many pages (100+) 
that I would want to pre-cache. This would probably cause the 
program to use TB's of space.

I don't know where to begin diagnosing the problem. I am using 
OpenGL, but I imagine that shouldn't really allocate anything new?

I have embedded the images using `import`, but that shouldn't 
really add much size (since it is compressed) or change things.

You could try it out yourself on a test case to see? (Might be a 
Windows thing too.) Create a high-res image (3000x3000, say) and 
load it like:

auto eImage = cast(ubyte[]) import("mylargepng.png");

TrueColorImage image = imageFromPng(readPng(eImage)).getAsTrueColorImage;
OpenGlTexture oGLimage = new OpenGlTexture(image); // will crash without create2dwindow
//oGLimage.draw(0, 0, 3000, 3000);


When I do a bare-minimum loop project (create2dwindow + event 
handler) I get 13% cpu (on an 8-core Skylake @ 4GHz) and 14MB memory.

When I add the code above I get 291MB of memory (for one image).

Here's the full D code source:


module winmain;

import arsd.simpledisplay;
import arsd.png;
import arsd.gamehelpers;

void main()
{
	auto window = create2dWindow(1680, 1050, "Test");

	auto eImage = cast(ubyte[]) import("Mock.png");

	TrueColorImage image = imageFromPng(readPng(eImage)).getAsTrueColorImage; // 178MB
	OpenGlTexture oGLimage = new OpenGlTexture(image);                        // 291MB
	//oGLimage.draw(0, 0, 3000, 3000);

	window.eventLoop(50,
		delegate ()
		{
			window.redrawOpenGlSceneNow();
		},
	);
}

Note that I have modified create2dWindow to take the viewport and 
set it to 2x as large in my own code (removed here). It 
shouldn't matter though, as it's the png and OpenGlTexture that 
seem to have the issue.

Surely once the image is loaded by OpenGL we could potentially 
disregard the other images and virtually no extra memory would be 
required? I do use getpixel though; not sure if that could be 
used on OpenGlTexture's. I don't mind keeping a main-memory copy, 
but I just need it to have a realistic size ;)

So, two problems: 1) the cpu usage, which I'll try to get more 
info on from my side when I can profile, and 2) the 10x memory 
usage. If it doesn't happen on your machine, can you try the 
alternate platform (if on 'nix, go for Windows, or vice versa)? 
This way we can get an idea where the problem might be.

Thanks!  Also, when I try to run the app in 64-bit windows, 
RegisterClassW throws for some reason ;/ I haven't been able to 
figure that one out yet ;/
Jun 16 2016
next sibling parent reply thedeemon <dlang thedeemon.com> writes:
On Friday, 17 June 2016 at 01:51:41 UTC, Joerg Joergonson wrote:
 Hi, so, do you have any idea why when I load an image with 
 png.d it takes a ton of memory?
I've bumped into this previously. It allocates a lot of temporary arrays for decoded chunks of data, and I managed to reduce those allocations a bit; here's the version I used: http://stuff.thedeemon.com/png.d (last changed Oct 2014, so it may need some tweaks today).

But most of the allocations are really caused by using std.zlib. This thing creates tons of temporary arrays/slices, and they are not collected well by the GC. To deal with that I had to use GC arenas for each PNG file I decode. This way all the junk created during PNG decoding is eliminated completely after the decoding ends. See the gcarena module here: https://bitbucket.org/infognition/dstuff

You may see Adam's PNG reader was really the source of motivation for it. ;)
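The idea behind those arenas, in a rough sketch (this is not the actual gcarena API; `loadPixels` and `decodeStep` are made-up names standing in for the real readPng/imageFromPng calls): do all the decoding, then collect at one well-defined point so the junk doesn't linger.

```d
import core.memory : GC;

// decodeStep stands in for the real decoder, which allocates many
// short-lived temporary arrays while it runs.
ubyte[] decodeStep(const(ubyte)[] data)
{
    auto tmp = data.dup;        // pretend temporary work buffer
    return tmp;
}

ubyte[] loadPixels(const(ubyte)[] pngBytes)
{
    auto pixels = decodeStep(pngBytes);
    // Once decoding is done, every temporary it made is unreachable,
    // so one collection at this known point sweeps them all:
    GC.collect();
    GC.minimize();              // hand freed pages back to the OS
    return pixels;
}
```

A real arena goes further by segregating the decode-time allocations from the rest of the heap, but the collect-at-a-boundary pattern is the core of it.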
Jun 16 2016
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 17 June 2016 at 02:55:43 UTC, thedeemon wrote:
 I've bumped into this previously. It allocates a lot of 
 temporary arrays for decoded chunks of data, and I managed to 
 reduce those allocations a bit, here's the version I used:
If you can PR any of it to me, I'll merge. It has actually been on my todo list for a while to change the decoder to generate less garbage.

I have had trouble in the past with temporary arrays being pinned by false pointers and the memory use ballooning from that. The lifetime is really easy to manage, so just malloc/free-ing it would be an easy solution. And like you said, std.zlib basically sucks, so I have to use the underlying C functions; I just haven't gotten around to it.
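For illustration, one shape "use the underlying C functions" can take (a sketch only: `inflateInto` is a made-up helper, and the parameter types of `etc.c.zlib.uncompress` are assumed to follow zlib's uLongf, i.e. c_ulong):

```d
import core.stdc.config : c_ulong;
import etc.c.zlib : uncompress, Z_OK;

// Inflate src into a caller-provided buffer. Unlike std.zlib.uncompress,
// which returns freshly GC-allocated arrays, this reuses dest, so the
// inflate side produces no GC garbage.
ubyte[] inflateInto(const(ubyte)[] src, ubyte[] dest)
{
    c_ulong destLen = cast(c_ulong) dest.length;
    int rc = uncompress(dest.ptr, &destLen,
                        cast(ubyte*) src.ptr, cast(c_ulong) src.length);
    assert(rc == Z_OK, "inflate failed");
    return dest[0 .. cast(size_t) destLen];
}
```

The caller owns `dest`, so it can be malloc'd and freed deterministically, which is exactly the lifetime-is-easy-to-manage case described above.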
Jun 16 2016
parent ketmar <ketmar ketmar.no-ip.org> writes:
On Friday, 17 June 2016 at 03:41:02 UTC, Adam D. Ruppe wrote:
 It actually has been on my todo list for a while to change the 
 decoder to generate less garbage. I have had trouble in the 
 past with temporary arrays being pinned by false pointers and 
 the memory use ballooning from that, and the lifetime is really 
 easy to manage so just malloc/freeing it would be an easy 
 solution, just like you said, std.zlib basically sucks so I 
 have to use the underlying C functions and I just haven't 
 gotten around to it.
did that. decoding still sux, but now it should suck less. ;-) encoder is still using "std.zlib", though. next time, maybe.
Jun 17 2016
prev sibling parent reply Guillaume Piolat <first.last gmail.com> writes:
On Friday, 17 June 2016 at 02:55:43 UTC, thedeemon wrote:
 On Friday, 17 June 2016 at 01:51:41 UTC, Joerg Joergonson wrote:
 Hi, so, do you have any idea why when I load an image with 
 png.d it takes a ton of memory?
I've bumped into this previously. It allocates a lot of temporary arrays for decoded chunks of data, and I managed to reduce those allocations a bit, here's the version I used: http://stuff.thedeemon.com/png.d (last changed Oct 2014, so may need some tweaks today)
Hey, I also stumbled upon this with imageformats decoding PNG. Image loading makes 10x the garbage it should. Let's see what this thread unveils...
Aug 16 2016
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Tuesday, 16 August 2016 at 16:29:18 UTC, Guillaume Piolat 
wrote:
 Hey, I also stumbled upon this with imageformats decoding PNG. 
 Image loading makes 10x the garbage it should.
 Let's see what this threads unveils...
let me know how it is now
Aug 16 2016
next sibling parent Guillaume Piolat <first.last gmail.com> writes:
On Tuesday, 16 August 2016 at 16:40:30 UTC, Adam D. Ruppe wrote:
 On Tuesday, 16 August 2016 at 16:29:18 UTC, Guillaume Piolat 
 wrote:
 Hey, I also stumbled upon this with imageformats decoding PNG. 
 Image loading makes 10x the garbage it should.
 Let's see what this threads unveils...
let me know how it is now
Reverted back to a stb_image translation to avoid the problem altogether (though it's a bit slower now). Rewriting the offending allocations in std.zlib was harder than expected.
Aug 17 2016
prev sibling parent Guillaume Piolat <first.last gmail.com> writes:
On Tuesday, 16 August 2016 at 16:40:30 UTC, Adam D. Ruppe wrote:
 On Tuesday, 16 August 2016 at 16:29:18 UTC, Guillaume Piolat 
 wrote:
 Hey, I also stumbled upon this with imageformats decoding PNG. 
 Image loading makes 10x the garbage it should.
 Let's see what this threads unveils...
let me know how it is now
So I made a small benchmark for testing PNG loading in D:

* dplug.gui.pngload (= stb_image): 134ms / 4.4MB memory
* arsd.png: 118ms / 7MB memory
* imageformats: 108ms / 13.1MB memory

Compiler: ldc-1.0.0-b2, release-nobounds build type
Aug 29 2016
prev sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 17 June 2016 at 01:51:41 UTC, Joerg Joergonson wrote:
 Are you keeping multiple buffers of the image around? A 
 trueimage, a memoryimage, an opengl texture
MemoryImage and TrueColorImage are basically the same thing: MemoryImage is just the interface, TrueColorImage is the implementation. The OpenGL texture is separate, but it references the same memory as the TrueColorImage, so it wouldn't be adding. You might have pinned temporary buffers though. That shouldn't happen on 64 bit, but on 32 bit I have seen it happen a lot.
 When I do a bare loop minimum project(create2dwindow + event 
 handler) I get 13% cpu(on 8-core skylake 4ghz) and 14MB memory.
I haven't seen that here.... but I have a theory now: you have some pinned temporary buffer on 32 bit (on 64 bit, the GC would actually clean it up) that keeps memory usage near the collection boundary.

Then, a small allocation in the loop - which shouldn't be happening, I don't see any in here... - but if there is a small allocation I'm missing, it could be triggering a GC collection cycle each time, eating CPU to scan all that wasted memory without being able to free anything.

If you can run it in the debugger and just see where it is by breaking at random, you might be able to prove it.

That's a possible theory.... I can reproduce the memory usage here, but not the CPU usage. Sitting idle, it is always <1% here (0 if doing nothing, like 0.5% if I move the mouse in the window to generate some activity).

I need to get to bed though; we'll have to check this out in more detail later.
 Thanks!  Also, when I try to run the app in 64-bit windows, 
 RegisterClassW throws for some reason ;/ I haven't been able to 
 figure that one out yet ;/
errrrrr this is a mystery to me too... a hello world on 64 bit seems to work fine, but your program tells me error 998 (invalid memory access) when I run it. WTF, both register class the same way. I'm kinda lost on that.
Jun 16 2016
parent reply Joerg Joergonson <JJoergonson gmail.com> writes:
On Friday, 17 June 2016 at 04:32:02 UTC, Adam D. Ruppe wrote:
 On Friday, 17 June 2016 at 01:51:41 UTC, Joerg Joergonson wrote:
 Are you keeping multiple buffers of the image around? A 
 trueimage, a memoryimage, an opengl texture
MemoryImage and TrueImage are the same thing, memory is just the interface, true image is the implementation. OpenGL texture is separate, but it references the same memory as a TrueColorImage, so it wouldn't be adding.
ok, then it's somewhere in TrueColorImage or the loading of the png.
 You might have pinned temporary buffers though. That shouldn't 
 happen on 64 bit, but on 32 bit I have seen it happen a lot.
Ok, IIRC LDC on both x64 and x86 had high memory usage too, so if it shouldn't happen on 64-bit (if that applies to LDC), then this is not the problem. I'll run -vgc on it and see if it shows anything interesting.
 When I do a bare loop minimum project(create2dwindow + event 
 handler) I get 13% cpu(on 8-core skylake 4ghz) and 14MB memory.
I haven't seen that here.... but I have a theory now: you have some pinned temporary buffer on 32 bit (on 64 bit, the GC would actually clean it up) that keeps memory usage near the collection boundary.
Again, it might be true but I'm pretty sure I saw the problem with ldc x64.
 Then, a small allocation in the loop - which shouldn't be 
 happening, I don't see any in here... - but if there is a small 
 allocation I'm missing, it could be triggering a GC collection 
 cycle each time, eating CPU to scan all that wasted memory 
 without being able to free anything.
Ok, Maybe... -vgc might show that.
 If you can run it in the debugger and just see where it is by 
 breaking at random, you might be able to prove it.
Good idea, hadn't thought about doing that ;) Might be a crapshoot, but who knows...
 That's a possible theory.... I can reproduce the memory usage 
 here, but not the CPU usage though. Sitting idle, it is always 
 <1% here (0 if doing nothing, like 0.5% if I move the mouse in 
 the window to generate some activity)

  I need to get to bed though, we'll have to check this out in 
 more detail later.
me too ;) I'll try to test stuff out a little more when I get a chance.
 Thanks!  Also, when I try to run the app in 64-bit windows, 
 RegisterClassW throws for some reason ;/ I haven't been able 
 to figure that one out yet ;/
errrrrr this is a mystery to me too... a hello world on 64 bit seems to work fine, but your program tells me error 998 (invalid memory access) when I run it. WTF, both register class the same way. I'm kinda lost on that.
Well, it works on LDC x64! again ;) This seems like an issue with DMD x64? I was thinking maybe it has to do with the layout of the struct or something, but I'm not sure.

---

I just ran a quick test: LDC x64 uses about 250MB and 13% cpu. I couldn't check on x86 because of the error

phobos2-ldc.lib(gzlib.c.obj) : fatal error LNK1112: module machine type 'x64' conflicts with target machine type 'X86'

Not sure what that means with gzlib.c.obj. Must be another bug in the LDC alpha ;/

Anyways, we'll figure it all out at some point ;) I'm really liking your lib, by the way. It's let me build a gui and get a lot done and just "work". Not sure if it will work on X11 with just a recompile, but I hope ;)
Jun 16 2016
next sibling parent reply kinke <noone nowhere.com> writes:
On Friday, 17 June 2016 at 04:54:27 UTC, Joerg Joergonson wrote:
 LDC x64 uses about 250MB and 13% cpu.

 I couldn't check on x86 because of the error

 phobos2-ldc.lib(gzlib.c.obj) : fatal error LNK1112: module 
 machine type 'x64' conflicts with target machine type 'X86'

 not sure what that means with gzlib.c.ojb. Must be another bug 
 in ldc alpha ;/
It looks like you're trying to link 32-bit objects to a 64-bit Phobos. The only pre-built LDC for Windows capable of linking both 32-bit and 64-bit code is the multilib CI release, see https://github.com/ldc-developers/ldc/releases/tag/LDC-Win64-master.
Jun 17 2016
parent Joerg Joergonson <JJoergonson gmail.com> writes:
On Friday, 17 June 2016 at 14:39:32 UTC, kinke wrote:
 On Friday, 17 June 2016 at 04:54:27 UTC, Joerg Joergonson wrote:
 LDC x64 uses about 250MB and 13% cpu.

 I couldn't check on x86 because of the error

 phobos2-ldc.lib(gzlib.c.obj) : fatal error LNK1112: module 
 machine type 'x64' conflicts with target machine type 'X86'

 not sure what that means with gzlib.c.ojb. Must be another bug 
 in ldc alpha ;/
It looks like you're trying to link 32-bit objects to a 64-bit Phobos. The only pre-built LDC for Windows capable of linking both 32-bit and 64-bit code is the multilib CI release, see https://github.com/ldc-developers/ldc/releases/tag/LDC-Win64-master.
Yes, it looks that way, but I believe it's not the case (I did check when this error first came up). I'm using the Phobos libs from LDC that are x86. I could be mistaken, but phobos2-ldc.lib(gzlib.c.obj) suggests the problem isn't with the entire Phobos lib, just gzlib.c.obj, and that it is the only object marked incorrectly. Since the error doesn't appear for all the other imports, it seems something got marked wrong in that specific case?
Jun 17 2016
prev sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 17 June 2016 at 04:54:27 UTC, Joerg Joergonson wrote:
 ok, then it's somewhere in TrueColorImage or the loading of the 
 png.
So, opengltexture actually does reallocate if the size isn't right for the texture... and your image was one of those sizes. The texture pixel size needs to be a power of two, so 3000 gets rounded up to 4096, which means an internal allocation. But it can be a temporary one!

So ketmar tackled png.d's loader's temporaries and I took care of gamehelpers.d's... and the test program went down to about 1/3 of its memory usage. Try grabbing the new ones from github now and see if it works for you too.
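The rounding in question, sketched (`nextPowerOfTwo` here is an illustrative helper; the library's internal name may differ):

```d
// Round a texture dimension up to the next power of two, as
// old-school OpenGL texture allocation requires.
int nextPowerOfTwo(int v)
{
    int p = 1;
    while (p < v)
        p <<= 1;
    return p;
}
```

So a 3000-pixel side becomes 4096, and the RGBA staging buffer grows from 3000*3000*4 = 36,000,000 bytes to 4096*4096*4 = 67,108,864 bytes, which is why freeing that temporary promptly matters.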
 Well, It works on LDC x64! again ;) This seems like an issue 
 with DMD x64? I was thinking maybe it has to do the layout of 
 the struct or something, but not sure.
I have a fix for this too, though I don't understand why it works.... I just .dup'd the string literal before passing it to Windows. I think dmd is putting the literal in a bad place for these functions (they do bit tests to see if it is a pointer or an atom, so maybe it is at an address where the wrong bits are set).

In any case, the .dup seems to fix it, so all should work on 32 or 64 bit now. In my tests, now that the big temporary arrays are manually freed, the memory usage is actually slightly lower on 32 bit, but it isn't bad on 64 bit either.

The CPU usage is consistently very low on my computer. I still don't know what could be causing it for you, but maybe it is the temporary garbage... let us know if the new patches make a difference there.
 Anyways, We'll figure it all out at some point ;) I'm really 
 liking your lib by the way. It's let me build a gui and get a 
 lot done and just "work". Not sure if it will work on X11 with 
 just a recompile, but I hope ;)
It often will! If you aren't using any of the native event handler functions or any of the impl.* members, most things just work (exception being the windows hotkey functions, but those are marked Windows anyway!). The basic opengl stuff is all done for both platforms. Advanced opengl isn't implemented on Windows yet though (I don't know it; my opengl knowledge stops in like 1998 with opengl 1.1 sooooo yeah, I depend on people's contributions for that and someone did Linux for me, but not Windows yet. I think.)
Jun 17 2016
next sibling parent Joerg Joergonson <JJoergonson gmail.com> writes:
On Friday, 17 June 2016 at 14:48:22 UTC, Adam D. Ruppe wrote:
 On Friday, 17 June 2016 at 04:54:27 UTC, Joerg Joergonson wrote:
 [...]
So, opengltexture actually does reallocate if the size isn't right for the texture... and your image was one of those sizes. [...]
Cool, I'll check all this out and report back. I'll look into the cpu issue too. Thanks!
Jun 17 2016
prev sibling next sibling parent reply Joerg Joergonson <JJoergonson gmail.com> writes:
On Friday, 17 June 2016 at 14:48:22 UTC, Adam D. Ruppe wrote:
 On Friday, 17 June 2016 at 04:54:27 UTC, Joerg Joergonson wrote:
 ok, then it's somewhere in TrueColorImage or the loading of 
 the png.
So, opengltexture actually does reallocate if the size isn't right for the texture... and your image was one of those sizes. The texture pixel size needs to be a power of two, so 3000 gets rounded up to 4096, which means an internal allocation. But it can be a temporary one! So ketmar tackled png.d's loaders' temporaries and I took care of gamehelper.d's... And the test program went down about to 1/3 of its memory usage. Try grabbing the new ones from github now and see if it works for you too.
Yes, same here! Great! It runs around 122MB in x86 and 107MB x64. Much better!
 Well, It works on LDC x64! again ;) This seems like an issue 
 with DMD x64? I was thinking maybe it has to do the layout of 
 the struct or something, but not sure.
I have a fix for this too, though I don't understand why it works.... I just .dup'd the string literal before passing it to Windows. I think dmd is putting the literal in a bad place for these functions (they do bit tests to see if it is a pointer or an atom, so maybe it is in an address where the wrong bits are set)
Yeah, strange but good catch! It now works in x64! I modified it to to!wstring(title).dup simply to have the same title and classname.
 In any case, the .dup seems to fix it, so all should work on 32 
 or 64 bit now. In my tests, now that the big temporary arrays 
 are manually freed, the memory usage is actually slightly lower 
 on 32 bit, but it isn't bad on 64 bit either.
I have the opposite on memory but not a big deal.
 The CPU usage is consistently very low on my computer. I still 
 don't know what could be causing it for you, but maybe it is 
 the temporary garbage... let us know if the new patches make a 
 difference there.
I will investigate this soon and report back anything. It probably is something straightforward.
 Anyways, We'll figure it all out at some point ;) I'm really 
 liking your lib by the way. It's let me build a gui and get a 
 lot done and just "work". Not sure if it will work on X11 with 
 just a recompile, but I hope ;)
It often will! If you aren't using any of the native event handler functions or any of the impl.* members, most things just work (exception being the windows hotkey functions, but those are marked Windows anyway!). The basic opengl stuff is all done for both platforms. Advanced opengl isn't implemented on Windows yet though (I don't know it; my opengl knowledge stops in like 1998 with opengl 1.1 sooooo yeah, I depend on people's contributions for that and someone did Linux for me, but not Windows yet. I think.)
I found this on non-power-of-2 textures:

https://www.opengl.org/wiki/NPOT_Texture
https://www.opengl.org/registry/specs/ARB/texture_non_power_of_two.txt

It seems like a quick and easy add-on, and you already have the padding code; it could easily be optional (set a flag or pass a bool or whatever). It could definitely save some serious memory for large textures. E.g., a 3000x3000x4 texture takes about 36MB, or 2^25.1 bytes. Since this has to be rounded up to 2^26 = 67MB, we have almost doubled the amount of wasted space. Hence, allowing non-power-of-two textures would probably reduce the memory footprint of my code to near 50MB (around 40MB being the minimum using uncompressed textures).

I might try to get a working version of that at some point. Going to deal with the cpu thing now though. Thanks again.
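Double-checking those figures in code (`bytesRGBA` is just a helper for this post):

```d
// Bytes needed for a w x h RGBA8 (4 bytes per pixel) texture.
long bytesRGBA(long w, long h)
{
    return w * h * 4;
}

// exact size:  3000 * 3000 * 4 = 36_000_000 bytes (~2^25.1)
// padded size: 4096 * 4096 * 4 = 67_108_864 bytes (= 2^26 exactly)
```

The padded buffer is about 1.86x the exact one, so NPOT support would roughly halve the per-texture footprint at these dimensions.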
Jun 17 2016
parent reply Joerg Joergonson <JJoergonson gmail.com> writes:
On Saturday, 18 June 2016 at 00:56:57 UTC, Joerg Joergonson wrote:
 On Friday, 17 June 2016 at 14:48:22 UTC, Adam D. Ruppe wrote:
[...]
Yes, same here! Great! It runs around 122MB in x86 and 107MB x64. Much better!
[...]
Yeah, strange but good catch! It now works in x64! I modified it to to!wstring(title).dup simply to have the same title and classname.
 [...]
I have the opposite on memory but not a big deal.
 [...]
I will investigate this soon and report back anything. It probably is something straightforward.
 [...]
I found this on non-power-of-2 textures: https://www.opengl.org/wiki/NPOT_Texture https://www.opengl.org/registry/specs/ARB/texture_non_power_of_two.txt It seems like a quick and easy add-on, and you already have the padding code; it could easily be optional (set a flag or pass a bool or whatever). It could definitely save some serious memory for large textures. E.g., a 3000x3000x4 texture takes about 36MB, or 2^25.1 bytes. Since this has to be rounded up to 2^26 = 67MB, we have almost doubled the amount of wasted space. Hence, allowing non-power-of-two textures would probably reduce the memory footprint of my code to near 50MB (around 40MB being the minimum using uncompressed textures). I might try to get a working version of that at some point. Going to deal with the cpu thing now though. Thanks again.
Never mind about this. I wasn't keeping in mind that these textures ultimately end up in video card memory. I simply removed your nextpowerof2 code (so the width and height weren't being enlarged) and saw no memory change. Obviously because they are temporary buffers, I guess? If that is the case, then maybe there is one odd temporary still hanging around in png.d?
Jun 17 2016
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Saturday, 18 June 2016 at 01:44:28 UTC, Joerg Joergonson wrote:
 I simply removed your nextpowerof2 code(so the width and height 
 wasn't being enlarged) and saw no memory change). Obviously 
 because they are temporary buffers, I guess?
right, the new code free() them right at scope exit.
 If this is the case, then maybe there is one odd temporary 
 still hanging around in png?
Could be, though the png itself has relatively small overhead, and the opengl texture still adds to it. I'm not sure if video memory is counted by task manager or not... but it could be that loading up the whole ogl driver accounts for some of it. I don't know.
Jun 17 2016
parent reply Joerg Joergonson <JJoergonson gmail.com> writes:
On Saturday, 18 June 2016 at 01:46:32 UTC, Adam D. Ruppe wrote:
 On Saturday, 18 June 2016 at 01:44:28 UTC, Joerg Joergonson 
 wrote:
 I simply removed your nextpowerof2 code(so the width and 
 height wasn't being enlarged) and saw no memory change). 
 Obviously because they are temporary buffers, I guess?
right, the new code free() them right at scope exit.
 If this is the case, then maybe there is one odd temporary 
 still hanging around in png?
Could be, though the png itself has relatively small overhead, and the opengl texture adds to it still. I'm not sure if video memory is counted by task manager or not... but it could be loading up the whole ogl driver that accounts for some of it. I don't know.
Ok. Also, maybe the GC hasn't freed some of those temporaries yet. What's strange is that when the app is run, it seems to do a lot of small allocations, around 64kB or so, for about 10 seconds (I watch the memory increase in TM); then it stabilizes. Not a big deal, just seems a bit weird (maybe some type of lazy allocation going on). Anyways, I'm much happier now ;) Thanks!
Jun 17 2016
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Saturday, 18 June 2016 at 01:57:49 UTC, Joerg Joergonson wrote:
 Ok. Also, maybe the GC hasn't freed some of those temporaries 
 yet.
The way a GC works in general is that it allows allocations to just continue until it considers itself under memory pressure. Then, it tries to do a collection. Since collections are expensive, it puts them off as long as it can and tries to do them as infrequently as reasonable. (Some GCs do smaller collections to spread the cost out, though, so the details always differ by implementation.)

So you'd normally see it go up to some threshold, then stabilize there, even if it is doing a lot of little garbage allocations. However, once the initialization is done here, it shouldn't be allocating any more; the event loop itself doesn't allocate when all is running normally.
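That grow-to-a-threshold behavior is easy to observe with core.memory (a sketch; `churn` just imitates a decoder's short-lived garbage, and GC.stats needs a reasonably recent druntime):

```d
import core.memory : GC;

// Allocate a pile of short-lived garbage, the way a decoder might.
// Memory use climbs toward the GC's threshold even though nothing
// here stays reachable.
void churn()
{
    foreach (i; 0 .. 1_000)
    {
        auto tmp = new ubyte[16 * 1024];
        tmp[0] = cast(ubyte) i;     // touch it so it isn't optimized away
    }
}
```

After `churn()`, an explicit `GC.collect()` plus `GC.minimize()` drops the used size back down, which is what happens implicitly, but at a time the GC chooses, in a normal run.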
Jun 17 2016
parent reply Joerg Joergonson <JJoergonson gmail.com> writes:
On Saturday, 18 June 2016 at 02:17:01 UTC, Adam D. Ruppe wrote:

I have an auto-generator for PNGs, and 99% of the time it works, 
but every once in a while I get an error when loading the PNGs. 
Usually re-running the generator "fixes the problem", so it might 
be on my end. Regardless of where the problem stems from, it would 
be nice to have more info on why, instead of just a range 
violation. previousLine is null at the breakpoint.

All the PNGs generated are loadable by external apps like 
IrfanView, so they are not completely corrupt, but they could 
possibly have something that is screwing png.d up.


The code where the error happens is:

		case 3:
			auto arr = data.dup;
			foreach(i; 0 .. arr.length) {
				auto prev = i < bpp ? 0 : arr[i - bpp];
				arr[i] += cast(ubyte)
					/*std.math.floor*/( cast(int) (prev + previousLine[i]) / 2);
			}

Range violation at png.d(1815)

Any ideas?
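For reference, a guarded version of that loop (a sketch, assuming a missing previousLine should read as zeros, which is how the PNG average filter is defined for the first scanline):

```d
// Unfilter one scanline with the PNG "average" filter (type 3).
// previousLine may be empty/null for the first row; missing bytes
// are treated as zero, per the PNG specification.
ubyte[] unfilterAverage(const(ubyte)[] data, const(ubyte)[] previousLine, size_t bpp)
{
    auto arr = data.dup;
    foreach (i; 0 .. arr.length)
    {
        int prev = i < bpp ? 0 : arr[i - bpp];
        int up = (i < previousLine.length) ? previousLine[i] : 0;
        arr[i] += cast(ubyte)((prev + up) / 2);
    }
    return arr;
}
```

The bounds check on previousLine is what the failing `previousLine[i]` lacks, so a null or short previous row turns into zeros instead of a range violation.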
Jun 19 2016
parent Joerg Joergonson <JJoergonson gmail.com> writes:
Also, for some reason one image has a weird horizontal line at 
the bottom that is not part of the original. It's as if the 
height were one pixel too much and it's reading "junk". I have 
basically a few duplicate images that were generated from the 
same base image; none of the others have this problem.

If I reduce the image dimensions, it doesn't have this problem. 
My guess is that there is probably a bug with a > vs >= or 
something: when the image dimensions are "just right", an extra 
line is added that may be non-zero.

The image dimensions are 124x123.

This is all speculation, but it seems like it is a png.d or 
opengltexture issue. I cannot see this added line in any image 
editor I've tried (PS, IrfanView), and changing the dimensions 
of the image fixes it.

Since it's a hard one to debug without a test case, I will work 
on it... hoping you have some possible points of attack though.
Jun 19 2016
prev sibling parent reply Joerg Joergonson <JJoergonson gmail.com> writes:
 The CPU usage is consistently very low on my computer. I still 
 don't know what could be causing it for you, but maybe it is 
 the temporary garbage... let us know if the new patches make a 
 difference there.
Ok, I tried the breaking-at-random method and I always ended up in system code with no stack trace; it seems it was an alternate thread (maybe GC?). I did a sampling profile and got this:

Function Name         Inclusive  Exclusive  Inclusive %  Exclusive %
_DispatchMessageW@4      10,361          5        88.32         0.04
[nvoglv32.dll]            7,874        745        67.12         6.35
_GetExitCodeThread@8      5,745      5,745        48.97        48.97
_SwitchToThread@0         2,166      2,166        18.46        18.46

So possibly it is simply my system and graphics card. For some reason NVidia might be using a lot of cpu here for no apparent reason? DispatchMessage is still taking quite a bit of that though?

Seems like someone else has a similar issue:

https://devtalk.nvidia.com/default/topic/832506/opengl/nvoglv32-consuming-a-ton-of-cpu/
https://github.com/mpv-player/mpv/issues/152

BTW, trying sleep in the MSG loop:

Error: undefined identifier 'sleep', did you mean function 'Sleep'?

"import core.thread; sleep(10);" ;)

Adding a Sleep(10); to the loop dropped the cpu usage down to 0-1%!

http://stackoverflow.com/questions/33948837/win32-application-with-high-cpu-usage/33948865

Not sure if that's the best approach though, but it does work. They mention using PeekMessage and I don't see you doing that; not sure if it would change things though?
Jun 17 2016
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Saturday, 18 June 2016 at 01:20:16 UTC, Joerg Joergonson wrote:
 Error: undefined identifier 'sleep', did you mean function 
 'Sleep'?		

 "import core.thread; sleep(10);"
It is `Thread.sleep(10.msecs)` or whatever time - `sleep` is a static member of the Thread class.
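For completeness, the working form as a tiny runnable sketch (`sleepAtLeast` is just a wrapper for this post; core.thread and core.time ship with druntime):

```d
import core.thread : Thread;
import core.time : Duration, MonoTime, msecs;

// Sleep for the given duration and return how long we actually slept.
Duration sleepAtLeast(Duration d)
{
    auto start = MonoTime.currTime;
    Thread.sleep(d);    // static member of Thread; takes a Duration, not an int
    return MonoTime.currTime - start;
}
```

Usage: `sleepAtLeast(10.msecs)`. The Duration argument is why the bare `sleep(10)` above fails to compile.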
 They mention to use PeekMessage and I don't see you doing that, 
 not sure if it would change things though?
I am using MsgWaitForMultipleObjectsEx which blocks until something happens. That something can be a timer, input event, other message, or an I/O thing... it doesn't eat CPU unless *something* is happening.
Jun 17 2016
parent Joerg Joergonson <JJoergonson gmail.com> writes:
On Saturday, 18 June 2016 at 02:01:29 UTC, Adam D. Ruppe wrote:
 On Saturday, 18 June 2016 at 01:20:16 UTC, Joerg Joergonson 
 wrote:
 Error: undefined identifier 'sleep', did you mean function 
 'Sleep'?		

 "import core.thread; sleep(10);"
It is `Thread.sleep(10.msecs)` or whatever time - `sleep` is a static member of the Thread class.
 They mention to use PeekMessage and I don't see you doing 
 that, not sure if it would change things though?
I am using MsgWaitForMultipleObjectsEx which blocks until something happens. That something can be a timer, input event, other message, or an I/O thing... it doesn't eat CPU unless *something* is happening.
Yeah, I don't know what, though. Adding Sleep(5); reduces its consumption to 0%, so it is probably just spinning. It might be the nvidia issue creating some weird messages to the app. I'm not too concerned about it, as it's now down to 0; it's a minimal wait time for my app (maybe not acceptable for performance apps, but ok for mine... at least for now). As I continue to work on it, I might stumble on the problem, or it might disappear spontaneously.
Jun 18 2016