digitalmars.D - Compiler flags when profiling a build
- Per Nordlöw (6/6) Oct 25 2020 What flags do you feed to dmd/ldc when you profile a build?
- Guillaume Piolat (10/16) Oct 25 2020 dub -b release-debug # optimizations AND debug information
- Per Nordlöw (11/13) Oct 25 2020 Why not use dmd's own `-profile` flag? Is too intrusive on
- Gregory II (20/33) Oct 25 2020 I've never used the flag but I don't imagine it is going to be
- Guillaume Piolat (5/15) Oct 25 2020 1. Because LDC doesn't have -profile and changing backends is an
- Guillaume Piolat (4/6) Oct 25 2020 Erratum: ldc does have -profile, at least it works in ldmd2 so
- Johan Engelen (6/13) Oct 25 2020 LDC has --fdmd-trace-functions resp. --finstrument-functions for
- Jacob Carlborg (18/24) Oct 26 2020 For maximum performance you should compile with LDC and do the
- Per Nordlöw (2/4) Oct 26 2020 Thanks
What flags do you feed to dmd/ldc when you profile a build? Do you initially

- compile in debug or release mode?
- activate inlining or not?
- use dmd or ldc?
- use any other alternatives not mentioned above?
Oct 25 2020
On Sunday, 25 October 2020 at 14:08:21 UTC, Per Nordlöw wrote:
> What flags do you feed to dmd/ldc when you profile a build? Do you initially
> - compile in debug or release mode?

dub -b release-debug # optimizations AND debug information

If you want no bounds checks, you can make a custom build type in dub.

> - activate inlining or not?

Inlining on.

> - use dmd or ldc?

LDC

> - use any other alternatives not mentioned above?

The AMD profiler is a very nice alternative to Intel Amplifier.

If you don't provide debug info, the line information will be wrong.

Automate your comparisons to improve statistical significance etc.
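
A minimal sketch of such a custom build type, assuming a dub.sdl recipe. The project name "demo" and the build type name "profile-nobounds" are hypothetical; the build options are the ones dub's built-in release-debug type uses, plus "noBoundsCheck". The script works in a scratch directory so it cannot overwrite an existing recipe, and it only prints the command when dub or ldc2 is not installed.

```shell
# Scratch project so nothing in an existing checkout is overwritten.
workdir=$(mktemp -d)
mkdir -p "$workdir/source"

# Trivial entry point; a real project would already have sources.
printf 'void main() {}\n' > "$workdir/source/app.d"

# "demo" and "profile-nobounds" are hypothetical names chosen for this sketch.
# The options mirror dub's release-debug build type, plus noBoundsCheck.
cat > "$workdir/dub.sdl" <<'EOF'
name "demo"
buildType "profile-nobounds" {
    buildOptions "releaseMode" "optimize" "inline" "debugInfo" "noBoundsCheck"
}
EOF

build_cmd="dub build -b profile-nobounds --compiler=ldc2"

# Build only when both tools are available; otherwise just show the command.
if command -v dub >/dev/null 2>&1 && command -v ldc2 >/dev/null 2>&1; then
    (cd "$workdir" && $build_cmd)
else
    echo "dub or ldc2 not found; would run in $workdir: $build_cmd"
fi
```

In an existing project only the buildType block would be added to the recipe, and the build is then selected with -b as usual.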
Oct 25 2020
On Sunday, 25 October 2020 at 14:34:54 UTC, Guillaume Piolat wrote:
> dub -b release-debug # optimizations AND debug information

Why not use dmd's own `-profile` flag? Is it too intrusive on performance? I've noticed a massive slow-down of about an order of magnitude.

> The AMD profiler is a very nice alternative to Intel Amplifier.

I'm on Linux. What is the preferred open alternative there? What are the pros and cons of the choices oprofile, gprof, sysprof, ...? Does anybody have a good comparison chart?

I want to profile an application that parses files into ASTs and generates text from those ASTs. The current bottleneck is AST-node allocation.
Oct 25 2020
On Sunday, 25 October 2020 at 14:47:50 UTC, Per Nordlöw wrote:
> On Sunday, 25 October 2020 at 14:34:54 UTC, Guillaume Piolat wrote:
> Why not use dmd's own `-profile` flag? Is it too intrusive on performance? I've noticed a massive slow-down of about an order of magnitude.

I've never used the flag, but I don't imagine it is going to be any good anyway. Anything that adds to your own program's computation is going to skew the results. If you aren't profiling the final build, e.g. LDC with all optimizations and inlining on, then your optimizations are kind of pointless.

>> The AMD profiler is a very nice alternative to Intel Amplifier.
> I'm on Linux. What is the preferred open alternative there? What are the pros and cons of the choices oprofile, gprof, sysprof, ...? Does anybody have a good comparison chart?
>
> I want to profile an application that parses files into ASTs and generates text from those ASTs. The current bottleneck is AST-node allocation.

Both AMD's and Intel's profilers support Linux. They give you access to hardware counters; at the very least, you won't find profilers with more information than these. Pick whichever matches your processor.

https://developer.amd.com/amd-uprof/
https://software.intel.com/content/www/us/en/develop/tools/vtune-profiler.html

I also use tracy, which integrates with your code to give you a more accurate picture. You have to write your own D wrapper though, and add code such as mixins to the functions you want to profile. Unlike D's -profile, you can choose what to enable and narrow it down to the specific issue you are trying to fix, so that it doesn't slow your run times too much.

https://github.com/wolfpld/tracy
Oct 25 2020
On Sunday, 25 October 2020 at 14:47:50 UTC, Per Nordlöw wrote:
> Why not use dmd's own `-profile` flag? Is it too intrusive on performance? I've noticed a massive slow-down of about an order of magnitude.

1. Because LDC doesn't have -profile and changing backends is an important and easy optimization.

2. Quoting https://en.wikipedia.org/wiki/Profiling_(computer_programming)#Data_granularity_in_profiler_types :

> In practice, sampling profilers can often provide a more accurate picture of the target program's execution than other approaches, as they are not as intrusive to the target program, and thus don't have as many side effects (such as on memory caches or instruction decoding pipelines). Also since they don't affect the execution speed as much, they can detect issues that would otherwise be hidden.
Oct 25 2020
On Sunday, 25 October 2020 at 19:58:29 UTC, Guillaume Piolat wrote:
> 1. Because LDC doesn't have -profile and changing backends is an important and easy optimization.

Erratum: ldc does have -profile; at least it works in ldmd2, so it should map to an ldc2 flag.
Oct 25 2020
On Sunday, 25 October 2020 at 20:08:52 UTC, Guillaume Piolat wrote:
> On Sunday, 25 October 2020 at 19:58:29 UTC, Guillaume Piolat wrote:
>> 1. Because LDC doesn't have -profile and changing backends is an important and easy optimization.
> Erratum: ldc does have -profile; at least it works in ldmd2, so it should map to an ldc2 flag.

LDC has --fdmd-trace-functions for DMD-style profiling and --finstrument-functions for basic GCC-style profiling.

cheers,
Johan
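
As a rough illustration of the two spellings mentioned above, the sketch below builds one command line for the ldmd2 driver and one for ldc2. The source file name app.d is hypothetical, and whether the two commands produce identical instrumentation is an assumption drawn from the erratum and the reply, not something verified here. The compiler is invoked only if ldc2 and the source file are present.

```shell
# app.d is a hypothetical source file name used only for this sketch.
SRC="app.d"

# Spelling 1: the DMD-compatible driver, as described in the erratum.
LDMD_CMD="ldmd2 -profile $SRC"
# Spelling 2: ldc2 directly, with the flag named in the reply.
LDC_CMD="ldc2 --fdmd-trace-functions $SRC"

# Compile only when the tool and the source exist; otherwise print both forms.
if command -v ldc2 >/dev/null 2>&1 && [ -f "$SRC" ]; then
    $LDC_CMD
else
    echo "would run: $LDMD_CMD"
    echo "or:        $LDC_CMD"
fi
```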
Oct 25 2020
On Sunday, 25 October 2020 at 14:08:21 UTC, Per Nordlöw wrote:
> What flags do you feed to dmd/ldc when you profile a build? Do you initially
> - compile in debug or release mode?
> - activate inlining or not?
> - use dmd or ldc?
> - use any other alternatives not mentioned above?

For maximum performance you should compile with LDC and do the following:

* Enable optimizations: -O3
* Enable Link Time Optimizations (LTO): --flto=full
* Link against druntime and Phobos compiled for LTO: --defaultlib=druntime-ldc-lto,phobos2-ldc-lto
* Target your specific CPU instead of some generic CPU. This will enable SSE and other features that otherwise are disabled: -mcpu=native
* Depending on your preferences, you might want to disable asserts, contracts and invariants, and bounds checks in non-@safe functions: --release
* You can also control the bounds checks in more detail using this flag: --boundscheck
* For profiling I assume you want debug symbols as well: -g

--
/Jacob Carlborg
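
Put together, the flags listed above might be combined into a single invocation along these lines. This is only a sketch: the source file app.d and the output name app are hypothetical, and --boundscheck is left out because the list does not say which setting to pick. The compiler runs only if ldc2 and the source file are present; otherwise the command is just printed.

```shell
# Flags taken from the list above, collected in one place.
LDC_FLAGS="-O3 --flto=full --defaultlib=druntime-ldc-lto,phobos2-ldc-lto -mcpu=native --release -g"

# Hypothetical input and output names for this sketch.
SRC="app.d"
OUT="app"

if command -v ldc2 >/dev/null 2>&1 && [ -f "$SRC" ]; then
    # LDC_FLAGS is deliberately unquoted so each flag becomes its own argument.
    ldc2 $LDC_FLAGS "$SRC" -of="$OUT"
else
    echo "would run: ldc2 $LDC_FLAGS $SRC -of=$OUT"
fi
```

Keeping -g next to the optimization flags means the profiled binary is the optimized one while still carrying line information for the profiler.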
Oct 26 2020
On Monday, 26 October 2020 at 10:18:43 UTC, Jacob Carlborg wrote:
> For maximum performance you should compile with LDC and do the following:

Thanks
Oct 26 2020