digitalmars.D.learn - Bloat with std.(string.)format?
- Chris (22/22) Sep 17 2015 If I have code like this:
- John Colvin (10/32) Sep 17 2015 Some initial bloat is expected, format is pretty big (although
- Chris (7/17) Sep 17 2015 It was in a test program. Only a few lines. But it would still
- John Colvin (13/32) Sep 17 2015 The upfront cost is paid only once per unique template arguments
- Chris (5/40) Sep 17 2015 Thanks.
- John Colvin (4/12) Sep 17 2015 Reasonably so in my testing, but expect more bugs than in a full
If I have code like this:

    auto builder = appender!string;
    builder ~= "Hello, World!";
    builder ~= "I'm here!";
    builder ~= "Now I'm there!";

the object file grows by 10-11 lines with each call to `builder ~=`. If I use this:

    builder ~= format("%s", "Hello, World!");
    builder ~= format("%s", "I'm here!");
    builder ~= format("%s", "Now I'm there!");

the object file is more than twice as big, and it grows by 20 lines with each call to `format`. If I use

    builder ~= format("%s %s %s", "Hello, World!", "I'm here!", "Now I'm there!");

the code bloat is even worse. There are many situations where a format string is preferable to concatenation, but it adds _a lot_ of bloat. Would a custom formatter be preferable to reduce code bloat, or should std/format.d be optimized? (Or both?)

dmd 2.067.1 -release -boundscheck=off -inline -O
Sep 17 2015
On Thursday, 17 September 2015 at 09:54:07 UTC, Chris wrote:
> [...]

Some initial bloat is expected, format is pretty big (although twice as big is a lot, unless your original code was quite small?). The extra bloat per call is likely due to inlining. I would hope that dmd would spot consecutive inlining of the same function and merge them, but perhaps it doesn't.

You could certainly make a less feature complete implementation of format that is smaller.

Have you tried with ldc or gdc? In particular, have you tried using ldc with --gc-sections on linux?
Sep 17 2015
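For illustration, here is a minimal sketch of what such a "less feature complete" formatter might look like. The name `simpleFormat` is hypothetical and not part of Phobos. It assumes only `%s` with string arguments is needed, and it is deliberately a plain function rather than a template, so the compiler emits a single copy no matter how many arguments each call passes. Whether it actually yields smaller object files than std.format would have to be measured.

```d
import std.array : appender;

// Hypothetical helper, not part of Phobos. Replaces each "%s" in fmt with
// the next string argument. Strings only: no width, precision, or other
// specifiers. Any other '%' sequence, or a "%s" with no argument left,
// is copied through unchanged.
string simpleFormat(string fmt, const(string)[] args...)
{
    auto sink = appender!string;
    size_t next = 0;
    for (size_t i = 0; i < fmt.length; ++i)
    {
        if (fmt[i] == '%' && i + 1 < fmt.length && fmt[i + 1] == 's'
            && next < args.length)
        {
            sink.put(args[next++]);
            ++i; // skip the 's' of the placeholder
        }
        else
        {
            sink.put(fmt[i]);
        }
    }
    return sink.data;
}

void main()
{
    auto builder = appender!string;
    builder ~= simpleFormat("%s %s %s", "Hello, World!", "I'm here!", "Now I'm there!");
    assert(builder.data == "Hello, World! I'm here! Now I'm there!");
}
```

The trade-off is that everything std.format does beyond plain string substitution (numbers, user-defined types, padding) is lost.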
On Thursday, 17 September 2015 at 10:33:44 UTC, John Colvin wrote:
> Some initial bloat is expected, format is pretty big (although twice as big is a lot, unless your original code was quite small?).

It was in a test program. Only a few lines. But it would still add a lot of bloat in a program that uses it in different modules, wouldn't it?

> The extra bloat per call is likely due to inlining. I would hope that dmd would spot consecutive inlining of the same function and merge them, but perhaps it doesn't.
>
> You could certainly make a less feature complete implementation of format that is smaller.

Don't know if it's worth the trouble.

> Have you tried with ldc or gdc? In particular, have you tried using ldc with --gc-sections on linux?

Not yet. GDC and LDC always lag behind (this time considerably), so I'm usually stuck with DMD for development.
Sep 17 2015
On Thursday, 17 September 2015 at 10:53:17 UTC, Chris wrote:
> It was in a test program. Only a few lines. But it would still add a lot of bloat in a program that uses it in different modules, wouldn't it?

The upfront cost is paid only once per unique template arguments per binary. So no, it doesn't scale badly there. Inlining, on the other hand, will - roughly speaking - increase binary sizes linearly with the number of calls. That's the cost you pay for (hopefully) better performance.

> Don't know if it's worth the trouble.

I would say not worth it, unless you have a real problem with binary sizes for an actual finished product. Even then, I'd say you could get bigger, easier gains by messing around with -fvisibility settings, --gc-sections, strip etc. on GDC and LDC.

> Not yet. GDC and LDC always lag behind (this time considerably), so I'm usually stuck with DMD for development.

That's a shame. https://github.com/ldc-developers/ldc/releases/tag/v0.16.0-alpha3 is at 2.067.1, is that not up-to-date enough?
Sep 17 2015
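The "once per unique template arguments" point can be made visible with a small sketch. `callCount` below is a hypothetical template written only for this illustration; it says nothing about the size of std.format itself. It relies on the fact that each distinct instantiation of a D function template is a separate function with its own static variables, so calls that share argument types share one copy, while a new combination of argument types creates a new one.

```d
// Hypothetical template, used only to make instantiations observable.
// The static counter belongs to one instantiation, so it advances only
// when that same instantiation is called again.
size_t callCount(Args...)(Args args)
{
    static size_t n; // one counter per unique Args list (thread-local)
    return ++n;
}

void main()
{
    // Same template arguments (a single string): one shared instantiation.
    assert(callCount("Hello, World!") == 1);
    assert(callCount("I'm here!") == 2);
    assert(callCount("Now I'm there!") == 3);

    // Different argument lists: separate instantiations, fresh counters.
    assert(callCount("a", "b", "c") == 1);
    assert(callCount(42) == 1);
}
```

By the same reasoning, many `format("%s", someString)` calls across modules would share one instantiation, while the per-call growth in the original test is more plausibly the inlining cost described above.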
On Thursday, 17 September 2015 at 12:49:03 UTC, John Colvin wrote:
> [...]
> That's a shame. https://github.com/ldc-developers/ldc/releases/tag/v0.16.0-alpha3 is at 2.067.1, is that not up-to-date enough?

Thanks. That's up to date enough now. Is it stable, though?

For version 2.067.1 it took a long time this time. Maybe we should focus some of our efforts on LDC and GDC being up to date faster.
Sep 17 2015
On Thursday, 17 September 2015 at 13:42:15 UTC, Chris wrote:
> [...]
> Thanks. That's up to date enough now. Is it stable, though?

Reasonably so in my testing, but expect more bugs than in a full release.

> For version 2.067.1 it took a long time this time. Maybe we should focus some of our efforts on LDC and GDC being up to date faster.

It would be great to have more people working on them, yes.
Sep 17 2015
On Thursday, 17 September 2015 at 15:17:21 UTC, John Colvin wrote:
> [...]
> It would be great to have more people working on them, yes.

I suppose it's an area most people (including myself) shy away from. I know next to nothing about compiler implementation.
Sep 17 2015
On Thursday, 17 September 2015 at 15:45:10 UTC, Chris wrote:
> I suppose it's an area most people (including myself) shy away from. I know next to nothing about compiler implementation.

Sometimes it's just diagnosis of test failures.
Sep 18 2015