digitalmars.D - Profile-Guided Optimization (PGO) support in D ecosystem
- Alexander Zaitsev (37/37) Nov 10 2023 Hi!
- Sergey (12/16) Nov 10 2023 Hi,
- Johan (7/27) Nov 11 2023 This list is a fantastic "overview" Sergey, thanks!
- Imperatorn (6/19) Nov 11 2023 I have only used PGO with LDC, if I remember correctly I posted
- Iain Buclaw (9/16) Nov 12 2023 IIRC, Jon wrote a bit about LTO and PGO (with benchmarks
- max haughton (10/18) Nov 12 2023 If you search for PGO in the dmd repo you will find that I
- Siarhei Siamashka (44/48) Dec 17 2023 The PGO code is there, but it seems to fail at compiling
- max haughton (3/9) Dec 17 2023 Does druntime need building?
- Siarhei Siamashka (55/67) Dec 18 2023 That's a good question. As the author of this code, you probably
- Siarhei Siamashka (8/12) Dec 21 2023 Max, do you have any comments on the described procedure? Are
- Siarhei Siamashka (6/8) Dec 18 2023 Formally submitted an issue about this at
Hi!

I am investigating the state of Profile-Guided Optimization (PGO) across the ecosystem. All my current results (with a lot of benchmarks, PGO-related information, and much more) are available at https://github.com/zamazan4ik/awesome-pgo . I am interested in the state of PGO in the D ecosystem too, which is why I am here.

I have done a little research on PGO in D, but compared to C++ almost no information is available in the official documentation (or I just don't know where to look for it, hah).

I have the following questions (for each D compiler: DMD, GDC, LDC; I am interested in all of them):

1. What is the most up-to-date place for PGO documentation? Right now I found only [this](https://wiki.dlang.org/LDC_LLVM_profiling_instrumentation) for LDC. What about DMD and GDC?
2. Does any D compiler support [Sampling PGO](https://clang.llvm.org/docs/UsersManual.html#using-sampling-profilers) (also known as [AutoFDO](https://github.com/google/autofdo))? If Sampling PGO is not supported, do you plan to support it in the future? For us sampling PGO can be important, since it makes it much easier to gather PGO profiles directly from the production environment without hurting production performance a lot.
3. Do you support [other](https://aaupov.github.io/blog/2023/07/09/pgo) PGO modes like CSIR PGO in D compilers? If not, do you plan to support them in the future?
4. What performance improvements did you get from enabling LTO + PGO on D compilers? Could you please share the numbers for each compiler? With this information it is much easier to consider rebuilding a D compiler locally with PGO (which we have to do due to strict security requirements), since we can estimate the benefits of PGO for the D compiler from actual benchmarks by the compiler developers.
5. Is there any documentation on how to build DMD and GDC with LTO+PGO? I am looking for something like what is [done](https://clickhouse.com/docs/en/operations/optimizing-performance/profile-guided-optimization) in the ClickHouse documentation (or like what is done for Clang or Rustc).
6. Am I right that the officially released D compiler binaries are already LTO + PGO optimized? According to the [script](https://github.com/ldc-developers/ldc/blob/master/.github/workflows/main.yml) it's true at least for LDC. What about other compilers?

Similar questions about LDC in the upstream: https://github.com/ldc-developers/ldc/discussions/4524

Thanks a lot for the help!
Nov 10 2023
On Friday, 10 November 2023 at 12:47:56 UTC, Alexander Zaitsev wrote:
> Hi!
> Similar questions about LDC in the upstream: https://github.com/ldc-developers/ldc/discussions/4524
> Thanks a lot for the help!

Hi,

Some internal details about PGO are in these slides:
https://archive.fosdem.org/2017/schedule/event/ldc_d_optimization/attachments/slides/1819/export/events/attachments/ldc_d_optimization/slides/1819/FOSDEM_2017.pdf

and in blog posts from an LDC dev:
http://johanengelen.github.io/ldc/2016/11/10/Link-Time-Optimization-LDC.html
http://johanengelen.github.io/ldc/2016/04/13/PGO-in-LDC-virtual-calls.html

For an example of PGO + LTO usage and the resulting performance improvements, I think one of the most helpful sources is this repo:
https://github.com/eBay/tsv-utils/blob/master/docs/BuildingWithLTO.md
https://github.com/eBay/tsv-utils/blob/master/docs/lto-pgo-study.md
Nov 10 2023
On Friday, 10 November 2023 at 13:09:01 UTC, Sergey wrote:
> On Friday, 10 November 2023 at 12:47:56 UTC, Alexander Zaitsev wrote:
>> Hi!
>> Similar questions about LDC in the upstream: https://github.com/ldc-developers/ldc/discussions/4524
>> Thanks a lot for the help!
>
> Hi,
> Some internal details about PGO is in this article:
> https://archive.fosdem.org/2017/schedule/event/ldc_d_optimization/attachments/slides/1819/export/events/attachments/ldc_d_optimization/slides/1819/FOSDEM_2017.pdf
> and blog posts from LDC dev:
> http://johanengelen.github.io/ldc/2016/11/10/Link-Time-Optimization-LDC.html
> http://johanengelen.github.io/ldc/2016/04/13/PGO-in-LDC-virtual-calls.html
> For PGO + LTO example of usage and the performance improvements, I think one of the most helpful source is this repo:
> https://github.com/eBay/tsv-utils/blob/master/docs/BuildingWithLTO.md
> https://github.com/eBay/tsv-utils/blob/master/docs/lto-pgo-study.md

This list is a fantastic "overview" Sergey, thanks!
Please add this to the discussion https://github.com/ldc-developers/ldc/discussions/4524, so it does not get lost as easily.

Cheers,
Johan
Nov 11 2023
On Friday, 10 November 2023 at 12:47:56 UTC, Alexander Zaitsev wrote:
> Hi! I am investigating the Profile-Guided Optimization (PGO) state across the ecosystem - all my current results (with a lot of benchmarks, PGO-related information, and much more) are available at https://github.com/zamazan4ik/awesome-pgo . I am interested in PGO state in the D ecosystem too - that's why I am here. I had been researching a little about PGO in D but compared to C++ almost no information is available in the official documentation (or I just don't know where to search it, hah). I have the following questions (for each D compiler: DMD, GDC, LDC - am interested in all of them):
> Thanks a lot for the help!

I have only used PGO with LDC. If I remember correctly, I posted something about it in the forums. Let me see if I can find it.

I think it was this:
https://forum.dlang.org/post/ajorqeooyccwuwpvteue@forum.dlang.org
Nov 11 2023
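For readers who have not seen the LDC workflow referred to above, here is a minimal sketch of one instrumented PGO cycle. It uses the `-fprofile-instr-generate`, `-fprofile-instr-use` and `ldc-profdata merge` commands described on the LDC wiki page linked in the first post. The names `app`, `app.d` and the `RUN` dry-run switch are assumptions made for illustration and do not come from this thread.

```shell
#!/bin/sh
# Sketch of an instrumented PGO cycle with LDC.
# APP and SRC are hypothetical placeholders, not names from this thread.
APP=${APP:-app}
SRC=${SRC:-app.d}
# Dry run by default: each step is only printed. Set RUN= (empty) to execute.
RUN=${RUN-echo}

ldc_pgo_cycle() {
    # 1. Build an instrumented binary that writes a raw profile when it exits.
    $RUN ldc2 -O3 -fprofile-instr-generate="$APP.profraw" "$SRC" -of="$APP-instr" || return 1
    # 2. Run a representative workload to collect the profile.
    $RUN "./$APP-instr" || return 1
    # 3. Convert the raw profile into the indexed format the compiler reads.
    $RUN ldc-profdata merge -output="$APP.profdata" "$APP.profraw" || return 1
    # 4. Rebuild the final binary using the collected profile.
    $RUN ldc2 -O3 -fprofile-instr-use="$APP.profdata" "$SRC" -of="$APP"
}

ldc_pgo_cycle
```

Run as-is, the script only prints the four commands. The quality of the result depends mostly on step 2: the training run should resemble real usage of the program as closely as possible.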
On Friday, 10 November 2023 at 12:47:56 UTC, Alexander Zaitsev wrote:
> Hi!

IIRC, Jon wrote a bit about LTO and PGO (with benchmarks somewhere) for tsv-utils.

https://github.com/eBay/tsv-utils/

> 1. What is the most up-to-date place for PGO documentation? Right now I found only [this](https://wiki.dlang.org/LDC_LLVM_profiling_instrumentation) for LDC. What about DMD and GDC?

Look up any GCC documentation/how-tos on using `-fprofile-generate=` and `-fprofile-use=`.

> 5. Is there any documentation on how to build DMD and GDC with LTO+PGO? I am looking for smth like it's [done](https://clickhouse.com/docs/en/operations/optimizing-performance/profile-guided-optimization) in the ClickHouse documentation (or like it's done for Clang or Rustc).

LTO and PGO aren't a feature of the language, rather the compiler infrastructure.
Nov 12 2023
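As a rough illustration of the GCC-style workflow mentioned above, the following sketch applies `-fprofile-generate=` and `-fprofile-use=` to a GDC build. It assumes that `gdc` accepts these generic GCC options unchanged, which is what the reply above implies. The names `app`, `app.d`, `pgo-data` and the `RUN` dry-run switch are hypothetical.

```shell
#!/bin/sh
# Sketch of the GCC-style PGO cycle applied to GDC.
# SRC, APP and PROFILE_DIR are hypothetical placeholders.
SRC=${SRC:-app.d}
APP=${APP:-app}
PROFILE_DIR=${PROFILE_DIR:-pgo-data}
# Dry run by default: each step is only printed. Set RUN= (empty) to execute.
RUN=${RUN-echo}

gdc_pgo_cycle() {
    # 1. Instrumented build; counters are written under PROFILE_DIR at exit.
    $RUN gdc -O2 -fprofile-generate="$PROFILE_DIR" "$SRC" -o "$APP" || return 1
    # 2. Training run on a representative workload.
    $RUN "./$APP" || return 1
    # 3. Optimized rebuild with the same source and output names.
    $RUN gdc -O2 -fprofile-use="$PROFILE_DIR" "$SRC" -o "$APP"
}

gdc_pgo_cycle
```

Both builds deliberately use the same source and output names, because GCC derives the profile file names from them and a mismatch can leave the second build without matching profile data.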
On Friday, 10 November 2023 at 12:47:56 UTC, Alexander Zaitsev wrote:
> Hi! I am investigating the Profile-Guided Optimization (PGO) state across the ecosystem - all my current results (with a lot of benchmarks, PGO-related information, and much more) are available at https://github.com/zamazan4ik/awesome-pgo . I am interested in PGO state in the D ecosystem too - that's why I am here.
> [...]

If you search for PGO in the dmd repo you will find that I implemented a PGO build for the compiler a while ago. I'm not sure if it's enabled for releases, but we use it internally at Symmetry IIRC.

One thing of note is that GCC has a feature called AutoFDO, which is quite interesting. I think LLVM might have a similar concept, but I'm not sure; it also has a tool called BOLT, which does the same thing only after compilation.
Nov 12 2023
On Sunday, 12 November 2023 at 18:03:19 UTC, max haughton wrote:
> If you search for PGO in the dmd repo you will find that I implemented a pgo build for the compiler a while ago.

The PGO code is there, but it seems to fail when compiling the "dshell_prebuilt.d" file, and this aborts the collection of profiling data prematurely:

```
[...]
Built dmd with PGO instrumentation
Compiling dmd testsuite to generate PGO data
Executing: ldmd2 -m64 -of/tmp/dmd/compiler/test/test_results/d_do_test /tmp/dmd/compiler/test/tools/d_do_test.d -fPIC -I/tmp/dmd/compiler/test/tools -i -version=NoMain
Executing: ldmd2 -m64 -of/tmp/dmd/compiler/test/test_results/unit_test_runner /tmp/dmd/compiler/test/tools/unit_test_runner.d -fPIC /tmp/dmd/compiler/test/tools/paths
Executing: /tmp/dmd/generated/linux/release/64/dmd -conf= -m64 -of/tmp/dmd/compiler/test/test_results/dshell_prebuilt.o -c /tmp/dmd/compiler/test/tools/dshell_prebuilt/dshell_prebuilt.d -fPIC
Executing: ldmd2 -m64 -of/tmp/dmd/compiler/test/test_results/sanitize_json /tmp/dmd/compiler/test/tools/sanitize_json.d -fPIC
/tmp/dmd/compiler/test/tools/dshell_prebuilt/dshell_prebuilt.d(6): Error: unable to read module `stdlib`
/tmp/dmd/compiler/test/tools/dshell_prebuilt/dshell_prebuilt.d(6): Expected 'core/stdc/stdlib.d' or 'core/stdc/stdlib/package.d' in one of the following import paths:
import path[0] = /tmp/dmd/compiler/test/../../druntime/import
import path[1] = /tmp/dmd/compiler/test/../../../phobos
failed to build '/tmp/dmd/compiler/test/test_results/dshell_prebuilt.o'
dmd tests failed! This will not end the PGO build because some data may have been gathered
Merging PGO data
[...]
```

The compiler is still built successfully, but its performance is not optimal this way. After I patched "dshell_prebuilt.d" to strip everything out of it, the error disappeared, the whole DMD test suite seemed to compile, and the compiler got a noticeable performance boost.

> I'm not sure if it's enabled for releases

This doesn't seem to be the case, at least for the https://downloads.dlang.org/releases/2.x/2.106.0/dmd.2.106.0.linux.tar.xz tarball. LTO is enabled, but apparently without PGO.

> but we use it internally at Symmetry IIRC.

It's good to know that DMD with LTO+PGO is already successfully used in production at Symmetry. Would it make sense to also enable this optimization for everyone else?
Dec 17 2023
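The error in the log above says that `core/stdc/stdlib.d` was not found in the listed import paths, the first of which is `druntime/import` inside the dmd checkout. A small pre-flight check along these lines could catch that situation before the PGO build starts. This is only a sketch: the function name `check_druntime_imports` is made up for illustration, and it tests just the first import path and the first of the two expected file names from the log.

```shell
#!/bin/sh
# check_druntime_imports [DMD_ROOT]
# Succeeds if the module that the failing test tool imports can be found in
# the first import path from the log (DMD_ROOT/druntime/import).
# DMD_ROOT defaults to the current directory.
check_druntime_imports() {
    root=${1:-.}
    if [ -f "$root/druntime/import/core/stdc/stdlib.d" ]; then
        echo "druntime imports found under $root/druntime/import"
    else
        echo "druntime imports missing under $root/druntime/import" >&2
        return 1
    fi
}
```

From the root of the dmd checkout, `check_druntime_imports . || exit 1` placed before the PGO build step would stop early instead of silently collecting an incomplete profile.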
On Monday, 18 December 2023 at 01:57:17 UTC, Siarhei Siamashka wrote:
> On Sunday, 12 November 2023 at 18:03:19 UTC, max haughton wrote:
>> [...]
>
> The PGO code is there, but it seems to fail at compiling "dshell_prebuilt.d" file. And aborts collecting the profiling data prematurely because of this: [...]

Does druntime need building?
Dec 17 2023
On Monday, 18 December 2023 at 02:30:21 UTC, max haughton wrote:
> On Monday, 18 December 2023 at 01:57:17 UTC, Siarhei Siamashka wrote:
>> On Sunday, 12 November 2023 at 18:03:19 UTC, max haughton wrote:
>>> [...]
>>
>> The PGO code is there, but it seems to fail at compiling "dshell_prebuilt.d" file. And aborts collecting the profiling data prematurely because of this: [...]
>
> Does druntime need building?

That's a good question. As the author of this code, you probably have a much better idea about how it's supposed to work. I tried to come up with some scriptable step-by-step build instructions:

```
DMD_TAG=v2.106.0
LDMD=ldmd2-1.32.0
git clone --depth 1 --branch "${DMD_TAG}" https://github.com/dlang/dmd.git || exit 1
git clone --depth 1 --branch "${DMD_TAG}" https://github.com/dlang/phobos.git || exit 1
# Step 1: build an LTO-only dmd, then use it to build druntime and Phobos.
cd dmd
make -j4 -f posix.mak HOST_DMD=$LDMD ENABLE_RELEASE=1 ENABLE_LTO=1 || exit 1
cd ../phobos
make -j4 -f posix.mak || exit 1
cd ../dmd
cp generated/linux/release/64/dmd "../dmd_${DMD_TAG}_lto"
# Step 2: erase the non-PGO build output so that it cannot clash with the PGO build.
rm -rf generated
rdmd compiler/src/build.d OS="linux" BUILD="release" MODEL="64" \
    HOST_DMD="$LDMD" CXX="c++" AUTO_BOOTSTRAP="" DOCDIR="" STDDOC="" \
    DOC_OUTPUT_DIR="" MAKE="make" VERBOSE="" ENABLE_RELEASE="1" \
    ENABLE_DEBUG="" ENABLE_ASSERTS="" ENABLE_LTO="1" ENABLE_UNITTEST="" \
    ENABLE_PROFILE="" ENABLE_COVERAGE="" DFLAGS="" dmd-pgo || exit 1
cp generated/linux/release/64/dmd "../dmd_${DMD_TAG}_lto+pgo"
# Both binaries were copied to the parent directory, so list them from there.
cd ..
ls -l dmd_*
```

Running it results in the following:

```
-rwxr-xr-x 1 ssvb ssvb 7626560 Dec 18 13:30 dmd_v2.106.0_lto
-rwxr-xr-x 1 ssvb ssvb 7994232 Dec 18 13:38 dmd_v2.106.0_lto+pgo
```

The `dmd_v2.106.0_lto` file roughly matches the size and performance characteristics of the `dmd` executable from the https://downloads.dlang.org/releases/2.x/2.106.0/dmd.2.106.0.linux.tar.xz release tarball, and `dmd_v2.106.0_lto+pgo` is its faster PGO-enabled upgrade. I observe at least a 10% reduction in compilation time when using the PGO-enabled `dmd`.

It's rather messy, but this somehow works. There are many questions though. For example, should the "dmd-pgo" target be accessible from the makefile without invoking "rdmd compiler/src/build.d" directly? Is sharing the same directory "generated/linux/release" for the produced non-PGO and PGO binaries actually okay? Is the dmd testsuite a good training set, or would collecting profiling data during Phobos compilation be better?

The "dshell_prebuilt.d" glitch happens if the PGO-enabled DMD is built before Phobos & druntime, and this makes everything fragile and non-intuitive. So the LTO-enabled DMD needs to be built first, then we need to use it to compile druntime, and finally the "generated/linux/release" directory has to be erased before the PGO build is started in order not to clash with it.

Either way, providing faster PGO-enabled binary releases of DMD would make it more competitive in the compilation speed race against LDC: https://forum.dlang.org/post/pugqkvthbicqaigemijj@forum.dlang.org :-)
Dec 18 2023
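To check a figure like the 10% mentioned above on your own machine, a crude wall-clock comparison of the two binaries can be enough. The helpers below are a hypothetical sketch, not part of the dmd build system: `time_runs` repeats a command and prints the elapsed whole seconds (it relies on `date +%s`, which is widely available but not strictly POSIX), and `pct_reduction` turns two such timings into a percentage.

```shell
#!/bin/sh
# time_runs N CMD [ARGS...]: run CMD N times and print the elapsed whole seconds.
time_runs() {
    n=$1; shift
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$n" ]; do
        # Stop immediately if the command fails, so a broken run is not timed.
        "$@" >/dev/null 2>&1 || return 1
        i=$((i + 1))
    done
    end=$(date +%s)
    echo $((end - start))
}

# pct_reduction BASE NEW: percentage by which NEW is lower than BASE.
pct_reduction() {
    LC_ALL=C awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (base - new) * 100 / base }'
}
```

A possible use, with `some_module.d` standing in for any sufficiently large D source file and with whatever `-I` import paths your setup needs: `base=$(time_runs 20 ./dmd_v2.106.0_lto -c -o- some_module.d)`, then `new=$(time_runs 20 ./dmd_v2.106.0_lto+pgo -c -o- some_module.d)`, and finally `pct_reduction "$base" "$new"`. Since the resolution is one second, the repetition count should be high enough for each measurement to take at least several tens of seconds.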
On Monday, 18 December 2023 at 12:24:39 UTC, Siarhei Siamashka wrote:
> [...]
> It's rather messy, but this somehow works. There are many questions though.
> [...]

Max, do you have any comments on the described procedure? Are people from Symmetry using the current unmodified `build.d` from the DMD repository for building the compiler with PGO support? Or have you done some extra customizations since then?

The PGO issue is not resolved until the official DMD binary releases actually start using it.
Dec 21 2023
On Monday, 18 December 2023 at 01:57:17 UTC, Siarhei Siamashka wrote:
> This doesn't seem to be the case at least for the https://downloads.dlang.org/releases/2.x/2.106.0/dmd.2.106.0.linux.tar.xz tarball. LTO is enabled, but apparently without PGO.

I have formally submitted an issue about this at https://issues.dlang.org/show_bug.cgi?id=24287, so that the https://github.com/dlang/installer maintainers can perhaps take some action to improve the current situation.
Dec 18 2023