I want to finally understand how to compile in C well, any resource recommendations?
I am a scientist who has semi-frequently written code in C (and other compiled languages like Fortran). When it comes time to compile, I typically tape together a Makefile from past projects and hope for the best, but even then I spend more time than I'd like to admit trying to figure out why my project is not being compiled or linked correctly. I've had a hard time finding any resources that aren't extremely surface level, or else are not behind some type of paywall. Can anyone recommend me some reading so that I can confidently write Makefiles and compile programs and actually understand what the different flags and commands are doing? I don't need extreme "under the hood" information as I don't intend to do things like write my own compiler, I just want to understand the process a little better. Help a scientist out!
I don't know much about writing makefiles, and I'm sure someone will come by later and give a more comprehensive and sensible answer, but what I'd suggest is: keep it simple, stupid. Start off with a blank makefile, learn the basics, and then focus on gcc flags, where the real meat and potatoes is.
For the most part I'm happy just doing gcc -O2 foo.c, which compiles with optimization level 2 (faster to execute, slower to compile). Note that -j is a make option, not a gcc one: since my CPU has 8 threads, I run make -j8 so that up to 8 compile jobs run in parallel. If I know I'm going to be running the binary on the current CPU I'm using I'll add -march=native, or if I know it'll be run on a different, but still modern CPU I'll do something like -march=skylake to enable support for things like SSE4 or AVX or whatnot.
In a makefile, to keep things neat and maintainable you'll want to put these into variables at the top of the file. You'll want to put those compiler options into a CFLAGS variable, and your source files into another variable, etc, like so:
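A minimal sketch of such a variables section, assuming a hypothetical program called myprog built from main.c and util.c (all three names are placeholders, not from the post above). The small show target is only there so the file can be run on its own: make show prints the compile command the variables expand to. Recipe lines must begin with a tab character.

```makefile
# Placeholder names: swap in your own program and source files.
CC      = gcc
CFLAGS  = -O2 -march=native
SOURCES = main.c util.c
TARGET  = myprog

# Print the command the variables expand to, without compiling anything.
show:
	@echo $(CC) $(CFLAGS) -o $(TARGET) $(SOURCES)

.PHONY: show
```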
Or if you want to compile all the C files in the current directory you could do something like
SOURCES=$(wildcard *.c)
Then further down you'd use a syntax like this to use your variables:
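One way such a rule might look, as a sketch rather than the only way to do it. The program name myprog is a placeholder, and the example assumes every .c file in the directory belongs to the same program. $@ and $^ are GNU make's automatic variables for the target and for all of its prerequisites.

```makefile
CC      = gcc
CFLAGS  = -O2
SOURCES = $(wildcard *.c)
TARGET  = myprog

# Rebuild the program whenever any source file is newer than it.
# $@ expands to the target, $^ to the full list of prerequisites.
$(TARGET): $(SOURCES)
	$(CC) $(CFLAGS) -o $@ $^

# Remove the built program.
clean:
	rm -f $(TARGET)

.PHONY: clean
```

Running make builds myprog, because the first rule in the file is the default target; make clean removes it again.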
The makefile mostly just says what input files you want to compile, and what options you want the compiler to use. If you are targeting a specific distro you can write up an install target to tell it where to put the resulting binaries, etc. The meat and potatoes of optimizing compilation comes down to which CFLAGS you give gcc.
The comprehensive and exhaustive documentation for gcc can be found here. While doing surface level research for this reply I found this basic makefile tutorial here, another surface level tutorial here, and a good list of optimizations for gcc here.
You can also use nproc on Linux to set the number of parallel compile jobs from your processor count automatically. For example, in my Arch Linux machine's /etc/makepkg.conf file I have the following:
So any packages I build myself using Arch's makepkg command will automatically compile with O3 and native optimizations, and will spawn as many compile jobs as there are processor cores, plus two.
I'm curious about the n+2 part. Why do we want to spawn two extra processes? I've heard people say you should use -j with a couple fewer threads than you have, just for system overhead. So, if nproc doesn't account for hyperthreading and only targets physical cores that'd make sense to me; otherwise, are we trying to overprovision here?
nproc counts the number of threads/logical cores. For example, my Ryzen 7 5800X reports 16.
So nproc + 2 is overprovisioning. However, I'm not sure why I have it set up that way.
I vaguely remember reading, years ago, some answer on Stack Overflow that showed nproc + 2 was better for large compilation jobs than just nproc or even nproc * 2, but I could also be misremembering. This was done by the hungariantoast of seven years ago, so honestly there's no telling why I have it set that way.
Hmm. I feel like this whole field of research is surrounded by magic rituals and it probably comes down to your specific codebase, your specific computer, how often you need to recompile, and what else you're doing at the same time.
The only large project I compile semi-often is the linux kernel, and I usually set it to 8 threads when my CPU actually has 12. Mind you, I'm using a laptop as my main computer, so I'm likely watching youtube and/or playing balatro at the same time so I'm willing to wait an extra minute if it doesn't impact those things, lol.
The core functionality of make is simple and very general, extending beyond C and software development: given a target file, its prerequisite files and a recipe that generates the target, it runs the recipe only if the target does not exist yet or any of the prerequisite files are newer than the target. Rules can be dependent: if the target of one rule is a prerequisite of another rule, it'll evaluate the former rule first.
For make, I like the Introduction to Makefiles in the GNU Make manual.
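To make the target/prerequisite/recipe idea concrete, here is a small sketch that has nothing to do with C. The file names raw.dat, data.csv and report.txt are made up for illustration; sort and tr are the standard Unix tools. Recipe lines must start with a tab.

```makefile
# General shape of a rule:
#   target: prerequisites
#   <TAB>recipe

# report.txt is rebuilt only if it is missing or data.csv is newer than it.
report.txt: data.csv
	sort data.csv > report.txt

# data.csv is the target here and a prerequisite above, so make
# evaluates this rule first when raw.dat has changed.
data.csv: raw.dat
	tr ' ' ',' < raw.dat > data.csv
```

With a raw.dat in place, the first make runs both recipes; a second make right afterwards does nothing, since both targets are newer than their prerequisites.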
For the compiler front-end, there are a lot of flags and options for more or less obscure use cases, so it really depends on what you do. The GCC manual provides a helpful summary. Most importantly I think, you pass -c to tell the front-end to compile the given source files into object files without linking them. You pass -o with a name to specify an output name. I usually pass -Wall -O3 when compiling to enable a broad set of common warnings (despite the name, not all of them) and a high degree of optimization respectively. For linking to system libraries I use pkg-config to generate the appropriate flags for the linker and compiler.
So for a simple single-file C application, in the shell I might invoke
The corresponding Makefile might be something like
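A sketch of what such a Makefile could look like with explicit rules. It assumes a single source file foo.c that links against zlib through pkg-config; both names are placeholders chosen for illustration, not taken from the comment above. The comments show roughly the commands you would otherwise type by hand.

```makefile
# Assumption: foo.c uses zlib. Swap in your own file and library names.
CC     = gcc
CFLAGS := -Wall -O3 $(shell pkg-config --cflags zlib)
LDLIBS := $(shell pkg-config --libs zlib)

# Link step, roughly: gcc -o foo foo.o <output of pkg-config --libs zlib>
foo: foo.o
	$(CC) -o foo foo.o $(LDLIBS)

# Compile step, roughly: gcc -Wall -O3 <cflags> -c -o foo.o foo.c
foo.o: foo.c
	$(CC) $(CFLAGS) -c -o foo.o foo.c
```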
or, because GNU make has pre-defined implicit rules for these things based on a set of variables
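A sketch of the shorter form, under the same assumptions as before (a placeholder foo.c linking against zlib). GNU make's built-in rules already know how to turn foo.c into foo.o and foo.o into foo, and they pick up CC, CFLAGS and LDLIBS on their own, so no recipes are written out.

```makefile
# Assumption: foo.c uses zlib. Swap in your own file and library names.
CC     = gcc
CFLAGS := -Wall -O3 $(shell pkg-config --cflags zlib)
LDLIBS := $(shell pkg-config --libs zlib)

# No recipes needed: the implicit rules compile foo.c with
# $(CC) $(CFLAGS) -c and link foo.o with $(CC) ... $(LDLIBS).
foo: foo.o
```

Running make -n prints the commands the implicit rules would run without executing them, which is a handy way to see what the variables end up doing.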
Note that the Makefile only really shines when you are linking multiple independent object files. Then, if you have specified the dependencies correctly, it'll only recompile the objects to which the dependencies have changed, and relink, saving you a whole lot of compilation of source files that haven't changed.
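The incremental rebuild described above can be sketched like this, assuming a hypothetical program sim built from main.c, solver.c and io.c with headers solver.h and io.h (all names are invented for the example). The object rules have no recipes: they only add header prerequisites, and make's implicit rule supplies the compile command.

```makefile
CC     = gcc
CFLAGS = -Wall -O2
OBJS   = main.o solver.o io.o

# Link all objects into the final program.
sim: $(OBJS)
	$(CC) -o $@ $^

# Extra prerequisites: each object also depends on the headers it includes.
main.o:   main.c solver.h io.h
solver.o: solver.c solver.h
io.o:     io.c io.h

# Remove everything that was built.
clean:
	rm -f sim $(OBJS)

.PHONY: clean
```

With this layout, editing solver.c recompiles only solver.o and relinks, while editing solver.h recompiles main.o and solver.o. Because the objects are independent of each other, make -j8 can compile them in parallel.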
In addition to the great info in the comments here, I'd like to share this old topic from 2018:
How do I hack makefiles?
There was also another topic about makefiles posted earlier this year:
Be aware of the Makefile effect
But I haven't read it yet