Converting a large mathematical software package written in C++ to C++20 modules

vblanco | 119 points

Thanks to author for doing some solid work in providing data points for modules. For those like me looking for the headline metric, here it is in the conclusion

  While the evidence shown above is pretty clear that building a software package as a module provides the claimed benefits in terms of compile time (a reduction by around 10%, see Section 5.1.1) and perhaps better code structure (Section 5.1.4), the data shown in Section 5.1.2 also make clear that the effect on compile time of downstream projects is at best unclear. 
So, alas, underwhelming in this iteration and perhaps speaks to 'module-fication' of existing source code (deal.II, dates from the '90s I believe), rather than doing it from scratch. More work might be needed in structuring the source code into modules as I have known good speedup with just pch, forward decls etc. (more than 10%). Good data point and rich analysis, nevertheless.
npalli | 16 hours ago

Oh, it’s Wolfgang. In computational math, he has a focus on research software that few others are able to do, he (the deal.ii team more generally) got an award for it last SIAMCSE. Generally a great writer, looking forward to reading this.

trostaft | 17 hours ago

A few points

1) modules only really help address time spent parsing stuff, not time spent doing codegen. Actually they can negatively impact codegen performance because they can make more definitions available for inlining/global opts, even in non-lto builds. For this reason it's likely best to compare using thin-lto in both cases.

2) when your dependencies aren't yet modularized you tend to get pretty big global module fragments, inflating both the size of your BMIs and the parsing time. Header units are supposed to partially address this but right now they are not supported in any build systems properly (except perhaps msbuild?). Also clang is pretty bad at pruning the global module fragment of unused data, which makes this worse again.

barchar | 10 hours ago

I really wonder whether LLMs are helpful in this case. This kind of task should be the forte of LLMs: well-defined syntax and requirements, abundant training material available, and outputs that are verifiable and validatable.

Perhaps we should use LLMs to convert all the legacy programs written in Fortran or COBOL into modern languages.

nsoonhui | 7 hours ago

I would like to see a comparison between modules and precompiled headers. I have a suspicion that using precompiled headers could provide the same build time gains with much less work.

Asooka | 17 hours ago

To be fair, C++’s modules make no sense, just like their namespaces that span multiple translation units.

It’s just more heavy clunky abstractions for the sake of abstractions.

KingLancelot | 16 hours ago

The code block styling is less than ideal.

isatty | 15 hours ago