Load-Bearing C Projects
A guide to building and maintaining nontrivial C projects of high quality.
Estimated reading time: 18 mins
May 23, 2021
C can be a very fun language to program in, if you can set up your project well. A lot of what I know about managing C projects was learned the hard way by experimentation and studying other people’s code. This post is meant to be the guide I never had to organizing and polishing nontrivial C projects, as well as setting up Makefiles for building them and manpages for documentation.
For this post, I’m going to assume some knowledge of C programming, but not much experience working on larger, multi-file C projects. We’ll start with a simple “Hello world” command line utility and grow the project through incremental steps, explaining as we go.
Beginning a Project
Our program begins its life in a simple and traditional way:
/*
 * hello.c - A simple hello world program.
 */
#include <stdio.h>

int main(void)
{
    printf("Hello world!\n");
    return 0;
}
I like to start all of my files with a short, descriptive comment that says what the file contains. It’s very easy to assume this information is so obvious that it doesn’t need to be explicit, but I find that it’s surprisingly helpful, especially when you start to have a lot of files. It might seem obvious when you write it what a file like loader.c does, but it’s much less obvious to someone else, or to future-you. A little comment like “loader.c - Contains code for loading serialized data” helps a lot!
Make it a habit to start giving files descriptive headers right off the bat, and you’ll be glad you did later. It doesn’t need to be anything too verbose; a single sentence can be enormously helpful.
In order to compile this program, you need to invoke a C compiler. On most systems, cc (C Compiler) is a symbolic link to the preferred compiler on that system: gcc on Linux, or clang on macOS. On macOS, gcc is actually a symlink to clang because so many people type gcc when they should use cc. As a best practice, I would recommend using cc instead of gcc when compiling C programs. It’s a good reminder that there are other compilers out there besides GCC. So, let’s compile and run our program:
$ cc hello.c -o hello
$ ./hello
Hello world!
Great, our program compiles and runs! However, typing cc hello.c -o hello every time we want to recompile the program can be a little tedious. make (specifically GNU Make in this blog post) is a great way to set up how you want a project to compile. make isn’t perfect, but it’s very simple to use for personal projects, and it’s available on pretty much every system. There are a lot of alternative build systems like automake or ninja or whatnot, but in my opinion, they introduce a lot of complexity that isn’t needed for most projects. If your project is so complicated that it needs a complicated build system, you should reconsider how much of that complexity is truly necessary and how much of it is due to overengineering.
Automating Builds with Make
The essence of how make works is that it looks for a file called Makefile, which contains a set of rules. Each rule follows a simple pattern: output file, followed by its dependencies, then a shell command (indented with a tab character, not spaces) that will use the dependencies to generate the output file:
hello: hello.c
	cc hello.c -o hello
In this case, that means “the file hello depends on hello.c and can be built by running the command cc hello.c -o hello”. Now, you can try building hello using make:
$ make hello
make: 'hello' is up to date
Oh, it looks like nothing happened. That’s because the Makefile specifies that the output file hello only depends on the input file hello.c, and hello.c hasn’t been modified since we compiled hello. Make will not recompile unless either the output file does not exist, or the dependencies have a more recent file modification time! Try making a trivial change to hello.c and re-running make:
$ touch -m hello.c # Update the file modification time
$ make hello
cc hello.c -o hello
Great, it compiled!
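To make that rebuild decision concrete, here is a minimal C sketch of the timestamp comparison for a rule with a single dependency. The function name needs_rebuild is made up for illustration, and this is an assumption-laden simplification rather than how GNU Make is actually implemented: real make handles any number of dependencies, tries to build missing ones, and may compare timestamps at finer than one-second resolution.

```c
/*
 * rebuild_check.c - Sketch of the timestamp test behind a make rule.
 */
#define _POSIX_C_SOURCE 200809L  /* expose stat() when compiling with -std=c99 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: returns true if `target` should be rebuilt from
 * `dependency`, using whole-second modification times.
 */
bool needs_rebuild(const char *target, const char *dependency)
{
    assert(target != NULL && dependency != NULL);
    struct stat target_info, dependency_info;

    if (stat(target, &target_info) != 0)
        return true;   /* The output file does not exist yet. */
    if (stat(dependency, &dependency_info) != 0)
        return true;   /* Simplification: real make would try to build it. */

    /* Rebuild only if the dependency is strictly newer than the output. */
    return dependency_info.st_mtime > target_info.st_mtime;
}
```

With this sketch, needs_rebuild("hello", "hello.c") would be false right after a build, and true once hello.c has been touched in a later second than the one in which hello was written.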
If you don’t provide any arguments, make defaults to the first rule defined in the Makefile. In this case, that means the hello rule. So, in our case, we can just run make instead of make hello and it will do the same thing.
Cleaning Up
It’s sometimes nice to have a way to force recompilation even when hello.c hasn’t changed, for example, if we want to rebuild the same source code with different compiler flags. A common way to do this is to add a “phony” clean rule that removes build files (“phony” here just means that clean isn’t the name of an actual file, it’s just the name of the rule):
.PHONY: clean
clean:
	rm -f hello
Now, we can run make clean and it will remove the build files and let us rebuild:
$ make clean
rm -f hello
$ make
cc hello.c -o hello
Making Everything
Another common practice is to include a phony all rule as the first rule in the file. This can sometimes be useful if you have multiple build targets that you want to compile by default. For this example, there’s only one target, so that would look like this:
.PHONY: all
all: hello
Notice that there’s actually no “body” to the rule, it’s just a phony target with one dependency: hello. This means that running make all involves nothing besides satisfying the precondition of building hello. In this example, it’s probably not worth having this rule, but keep it in mind as an option for if you have multiple build targets.
Adding Compiler Flags
So far, we have a working Makefile, but sooner or later, it will be useful to have different compiler flags available (for example, warning flags, or optimization flags). The simplest solution is to just add those flags to the Makefile rule:
hello: hello.c
	cc -std=c99 -Wall -Wextra -Werror -O2 hello.c -o hello
However, this makes it tricky to quickly swap between different flag settings. Instead, it’s better to use make variables to store the flag values. I like to split the flags into four categories:
- Warning flags (CWARN)
- Optimization level (O)
- Debug flags (G)
- Everything else (CFLAGS)
This is what it looks like in a Makefile:
CC=cc
CFLAGS=-std=c99
CWARN=-Wall -Wextra -Werror
O=-O2
G=
hello: hello.c
	$(CC) $(CFLAGS) $(CWARN) $(O) $(G) hello.c -o hello
Now, running make will use the default flags:
$ make
cc -std=c99 -Wall -Wextra -Werror -O2 hello.c -o hello
However, we can also manually specify flags, for example, if we want to do a debug build:
$ make G=-ggdb O=-O0
cc -std=c99 -Wall -Wextra -Werror -O0 -ggdb hello.c -o hello
Or we can even specify the compiler to use:
$ make CC=tcc
tcc -std=c99 -Wall -Wextra -Werror -O2 hello.c -o hello
Handling Multiple Files
At this point, we’re in a very good place when it comes to a single-file project. We can run make whenever we want and it will ensure the
program compiles. However, if the project grows over time, it’s likely that
there will be a point when there’s too much code to put in a single file. As a
rule of thumb, I typically don’t let individual C files get more than 700 lines
long, but I won’t split into multiple files unless there is a clean demarcation
of functionality. Having code split into multiple files based on functionality
makes it easier for you (or other contributors) to find the code you’re looking
for, and makes it easier to understand changes in source control. However, if
you go too far and split everything into too many tiny files, then it becomes
tedious to have to hop across many different files when editing or reading code.
Each file should be relatively self-contained and modular: the public-facing API
should not change frequently, and non-API changes to the code in one file should
not require changes in another file.
For this example, let’s consider moving the hello-world-printing logic to a separate file, greeter.c:
/*
 * greeter.c - The logic for greeting things.
 */
#include <stdio.h>
#include "greeter.h"

void greet(char *subject)
{
    printf("Hello %s!\n", subject);
}
If we want to use this code in our main program, our main program will need to import a header file so it knows what functions are available:
/*
* greeter.h - Header file for greeting logic.
*/
void greet(char *subject);
/*
* hello.c - A simple hello world program.
*/
#include "greeter.h"

int main(void)
{
    greet("world");
    return 0;
}
The next step is to tell our Makefile that we need to compile and link this new file as well:
hello: hello.c greeter.h greeter.o
	$(CC) $(CFLAGS) $(CWARN) $(O) $(G) greeter.o hello.c -o hello
greeter.o: greeter.c greeter.h
	$(CC) -c $(CFLAGS) $(CWARN) $(O) $(G) greeter.c -o greeter.o
Some things to notice:
- The hello rule now has 3 dependencies: hello.c, greeter.h, and greeter.o. If any of these files change, hello will be rebuilt.
- There is a new rule: greeter.o. This is an object file that contains the compiled code of greeter.c. make works like a dependency tree, so if greeter.c changes and you build hello, it will rebuild greeter.o first.
- The rule for greeter.o uses the compiler flag -c, which tells the compiler to not invoke the linker, just build an object file that can be used in the main program.
In theory, splitting into multiple files can also greatly improve recompilation time because you can skip recompiling files that weren’t changed. In this example, if you make changes to hello.c, but greeter.c remains unchanged, then make will only recompile hello, but reuse the existing greeter.o. In practice, unless you’re running on an old Raspberry Pi, or have a truly massive project (like an operating system), it will probably only take a few hundred milliseconds to compile your entire project, so you won’t notice much of a difference.
As you add more and more files, adding rules like this will get a bit repetitive, but fortunately, make has some pattern matching rules that allow you to write general-purpose rules. This is how I would normally handle greeter.o:
%.o: %.c %.h
	$(CC) -c $(CFLAGS) $(CWARN) $(O) $(G) $< -o $@
The %.o: %.c %.h part is special Make wildcard syntax that means “if you’re looking for how to build a file called foo.o, and there’s no specific rule for it, then you can build it with this rule and the dependencies foo.c and foo.h.” The $< is a special Make variable representing “the first dependency” (greeter.c), and $@ is a Make variable for “the output file” (greeter.o). The C file itself imports the greeter.h header file, so the C compiler doesn’t need to be passed the header filename, but it’s included in the Makefile because any changes to the header should probably trigger a recompilation of greeter.o.
And just to make things a little bit tidier, I like to put the C files that will need to be compiled in their own make variable:
CC=cc
CFLAGS=-std=c99
CWARN=-Wall -Wextra -Werror
O=-O2
G=
HEADERS=greeter.h
CFILES=greeter.c
OBJFILES=$(CFILES:%.c=%.o)
hello: hello.c $(HEADERS) $(OBJFILES)
	$(CC) $(CFLAGS) $(CWARN) $(O) $(G) $(OBJFILES) hello.c -o hello
%.o: %.c %.h
	$(CC) -c $(CFLAGS) $(CWARN) $(O) $(G) $< -o $@
.PHONY: clean
clean:
	rm -f hello $(OBJFILES)
The OBJFILES=$(CFILES:%.c=%.o) part is make syntax that sets the OBJFILES variable to whatever is in the CFILES variable, but with .c replaced with .o. With this setup, we can easily add new files to the project just by updating HEADERS and CFILES.
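If the substitution syntax feels opaque, it may help to see the same transformation for one filename spelled out in C. This is only an illustrative sketch: object_name is a hypothetical helper, not something make provides, and it covers just the .c to .o case used here. Like the make syntax, it passes names that do not end in .c through unchanged.

```c
/*
 * objname.c - Sketch of what $(CFILES:%.c=%.o) does to one filename.
 */
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: copies `file` into `out`, replacing a trailing ".c"
 * with ".o". Returns 0 on success, or -1 if `out` is too small.
 * Simplification: a bare ".c" with nothing before the dot is left as-is.
 */
int object_name(const char *file, char *out, size_t out_size)
{
    assert(file != NULL && out != NULL);
    size_t len = strlen(file);

    if (len + 1 > out_size)
        return -1;               /* No room for the name plus its NUL. */
    memcpy(out, file, len + 1);  /* Copy the name, including the NUL. */
    if (len > 2 && strcmp(file + len - 2, ".c") == 0)
        out[len - 1] = 'o';      /* "greeter.c" becomes "greeter.o". */
    return 0;
}
```

Applying this to each word of CFILES in turn gives the list that ends up in OBJFILES.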
Preventing Header Duplication
As the number of header files grows, you run the risk of importing the same header file multiple times. This can happen if foo.h and baz.h both #include "qux.h". This causes the compiler to complain about redeclaring variables or types. However, there’s a simple fix: wrap the header file in a #ifndef macro like this:
#ifndef FILE__GREETER_H
#define FILE__GREETER_H
/*
* greeter.h - Header file for greeting logic.
*/
void greet(char *subject);
#endif
The name FILE__GREETER_H isn’t meaningful, it just has to be a unique value that doesn’t collide with any names being used elsewhere. I tend to go with FILE__<filename> as my template. This macro ensures that if a file is #included multiple times, the subsequent includes expand to nothing and the compiler is happy.
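A header that only contains function prototypes, like greeter.h, can usually be included twice without complaint, because repeating an identical prototype is legal C. The guard starts to matter once a header defines a type. The sketch below simulates a double inclusion inside a single file, pasting the guarded header body twice instead of using real #include lines. The header name point.h, the struct, and the function are all hypothetical.

```c
/*
 * guard_demo.c - Simulates one guarded header being included twice.
 */
#include <assert.h>
#include <limits.h>

/* First inclusion (say, via foo.h): the guard macro is not defined yet,
 * so the preprocessor keeps the body and defines the guard. */
#ifndef FILE__POINT_H
#define FILE__POINT_H
struct point { int x, y; };
#endif

/* Second inclusion (say, via baz.h): the guard is already defined, so
 * everything up to #endif is skipped. Without the guard lines, this second
 * definition of struct point would be a compile error. */
#ifndef FILE__POINT_H
#define FILE__POINT_H
struct point { int x, y; };
#endif

/* Taxicab distance of p from the origin, just to show the type is usable. */
int point_manhattan(struct point p)
{
    assert(p.x > INT_MIN && p.y > INT_MIN);  /* -INT_MIN would overflow */
    int ax = p.x < 0 ? -p.x : p.x;
    int ay = p.y < 0 ? -p.y : p.y;
    return ax + ay;
}
```

Deleting the two #ifndef/#define/#endif groups while keeping both struct definitions should reproduce the kind of redefinition error described above.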
C File Layout
At this point, we have a robust project setup that’s easy to compile and extend. That leaves the question of how to divide up the code among different files. The approach that I favor is to have:
- Header files containing datatype declarations and function declarations that are used in other files.
- One .c file for each header file containing function implementations.
- One main .c file that includes the main() function and primarily handles command line argument parsing and invoking functions defined in other files.
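Here is one way the header and implementation halves of that layout could look, shown as a single listing so it compiles on its own. It is a sketch under stated assumptions: struct greeting, greeting_format, and or_default are hypothetical and are not part of the greeter files used in this post. In a real project the first marked section would live in the header and the second in the matching .c file.

```c
/* ---- greeter.h: datatypes and functions that other files may use ---- */
#ifndef FILE__GREETER_H
#define FILE__GREETER_H
#include <stddef.h>

struct greeting {
    const char *salutation;  /* NULL or "" means "Hello" */
    const char *subject;     /* NULL or "" means "world" */
};

/* Writes the greeting into buf; returns what snprintf() returns. */
int greeting_format(const struct greeting *g, char *buf, size_t size);
#endif

/* ---- greeter.c: the implementations for that header ---- */
#include <assert.h>
#include <stdio.h>

/* static: visible only inside this file, so it stays out of the header
 * and can change without affecting any other file. */
static const char *or_default(const char *text, const char *fallback)
{
    return (text != NULL && text[0] != '\0') ? text : fallback;
}

int greeting_format(const struct greeting *g, char *buf, size_t size)
{
    assert(g != NULL && buf != NULL && size > 0);
    return snprintf(buf, size, "%s %s!\n",
                    or_default(g->salutation, "Hello"),
                    or_default(g->subject, "world"));
}
```

The main .c file would then include the header, parse its arguments, and call greeting_format without ever seeing or_default.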
Command Line Argument Parsing
Parsing command line arguments isn’t very fun, but it’s also not very difficult. If you intend to run your program as a command line program, you should support a few basic flags, like --help. Below is my general approach to command line flags. It’s overkill for a program as simple as hello, but it shows how you can begin to build a more feature-rich command line tool:
/*
 * hello.c - A simple hello world program.
 */
#include <stdio.h>
#include <string.h>
#include "greeter.h"

#define startswith(str, prefix) (!strncmp(str, prefix, strlen(prefix)))

const char *description = "hello - A simple hello world program";
const char *usage =
    "Usage: hello <flags...>\n"
    "Flags:\n"
    "  --help|-h            Print this message and exit\n"
    "  --name=|-n <name>    The name being greeted";

int main(int argc, char *argv[])
{
    // Variables that get set by command line flags:
    char *name = "world";

    // Loop over command line arguments:
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "--help") == 0) {
          print_help:
            printf("%s\n%s\n", description, usage);
            return 0;
        } else if (startswith(argv[i], "--name=")) {
            name = &argv[i][strlen("--name=")];
        } else if (argv[i][0] == '-' && argv[i][1] != '-') {
            // Single-char flags:
            for (char *c = &argv[i][1]; *c; c++) {
                switch (*c) {
                case 'h': goto print_help;
                case 'n':
                    name = (i+1 < argc) ? argv[++i] : "";
                    break;
                default:
                    printf("Unknown flag: -%c\n%s\n", *c, usage);
                    return 1;
                }
            }
        } else {
            printf("Unknown flag: %s\n%s\n", argv[i], usage);
            return 1;
        }
    }

    // Do the business logic:
    greet(name);

    return 0;
}
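As more --flag=value options get added, the startswith check and the pointer arithmetic that follows it tend to repeat. One possible refactoring, sketched here with a hypothetical helper called flag_value, folds both steps into a single function that either returns the value or reports that the argument is a different flag.

```c
/*
 * flagvalue.c - Sketch of a helper for --flag=value arguments.
 */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: if `arg` begins with `prefix` (e.g. "--name="),
 * returns a pointer to the text after the prefix, which may be empty.
 * Otherwise returns NULL.
 */
const char *flag_value(const char *arg, const char *prefix)
{
    assert(arg != NULL && prefix != NULL);
    size_t prefix_len = strlen(prefix);

    if (strncmp(arg, prefix, prefix_len) != 0)
        return NULL;            /* arg is some other flag */
    return arg + prefix_len;    /* points into arg, so nothing is copied */
}
```

In the loop above, the --name= branch could then test whether flag_value(argv[i], "--name=") is non-NULL and use the returned pointer as the name. Because the result points into argv, it stays valid for the life of the program.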
Minimum Viable Documentation
Documentation is often a hassle, but there are three things you should really have:
- A README file
- A software license
- A manpage
README
The README file is the first thing to include. I like to use Markdown (which renders nicely on most platforms, but is also human-readable), so I’ll create a README.md file like this:
# hello
This is a simple hello world project.
## Dependencies
There are no dependencies besides a C compiler and `make`.
## Building
Simply run `make` to build.
## Installing
Run `sudo make install` to install.
## Usage
After compiling, run `./hello`. See `man ./hello.1` for more details.
## Code Layout
- `hello.c` The main program logic.
- `greeter.h`/`greeter.c` The implementation of the greeting.
## License
This software is released under the MIT license with the Commons Clause, see `LICENSE` for more details.
Software License
It’s good practice to include a software license with your projects. This lets others know what is or isn’t allowed. Personally, I like the Commons Clause applied to the MIT License. The Commons Clause is not very well known, and if you want to be pedantic, it’s not technically “Open Source”. However, I think the combination of MIT License with the Commons Clause is best aligned with my goals for freely distributing my work:
- My code is available for free
- Anyone can use it
- Anyone can view the source
- Anyone can modify it
- Anyone can redistribute it
- But if anyone wants to sell my work, they’ll need to get my permission first
Here is an example of a license file I would use.
Manpages
Manpages (manual pages) are a bit more of a pain in the neck compared to READMEs and license files, but they’re not fundamentally very complicated. man (the command) looks in a specified location for documentation on a command. That documentation is written using roff typesetting. Here’s an example of a manpage file:
.\" Manpage for hello.
.TH hello 1 "January 2, 2021" "0.1" "hello manual page"
.SH NAME
hello \- A hello world program
.SH SYNOPSIS
.B hello
[\fB-h\fR|\fB--help\fR]
[\fB-n\fR|\fB--name\fR=\fIname\fR]
.SH DESCRIPTION
\fBhello\fR is a hello world program. It greets things.
.SH OPTIONS
.TP
.B \-h\fR, \fB--help
Print a help message.
.TP
.B \-n\fR \fIname\fR, \fB--name\fR=\fIname\fR
Set the name being greeted. Default: \fIworld\fR
.SH EXAMPLES
.TP
.B hello
Print "Hello world!"
.TP
.B hello --help
Print the usage.
.TP
.B hello --name=\fIDave\fR
Print "Hello Dave!"
.SH AUTHOR
Harry Q. Bovik
And this is how it renders:
Fully explaining roff would require a separate blog post, but there are essentially only three things you need to know to write a manpage:
- Lines that begin with things like .SH indicate a formatting directive for a line or section (.SH is section header, for example).
- You can change the formatting of text inline with \fI (italics), \fB (bold), and \fR (regular). These names are slightly misleading because on the terminal, “italic” text is usually rendered as underlined rather than slanted, and the colors depend on your terminal settings (on mine, “italic” is lilac and underlined, and “bold” is blue and bold).
- When you don’t know how to do something, you can just look at an existing manpage to see how they did it. The simplest way to do this is vim $(man -w foobar) which will show you the source code of the manpage for foobar.
Once you have a manpage written out, the file should be saved with the name hello.1 (or whatever your program’s name is, followed by .1). The 1 refers to the manual section number for “executable programs and shell commands”, which is what this is. See man man for more details on the different sections.
If you want to preview your manpage, run man ./hello.1. You must include the ./ prefix or use an absolute path, otherwise man won’t be able to find your file. Eventually, when you install the manpage file, it will either be put into /usr/share/man/man1/ or /usr/local/share/man/man1/, depending on your system. In order to find the correct directory, you can run man -w cat, which will tell you where the manpage for the cat program is (any program will work).
Alternative: Using Pandoc for Manpages
As an alternative to writing manpages with roff typesetting, you can also write them using Markdown (or your format of choice) and convert the text to roff using Pandoc. Pandoc is a really handy tool for converting between different document formats, and it plays very nicely with make as well. Jérôme Belleman has written a nice, short tutorial, and Eddie Antonio has written a slightly more in-depth one. If you do go this route, be sure to check the Pandoc-generated manpages into your source control, so that your users don’t need to have Pandoc installed to access the manpages.
Making Software Installable
At this point, we’re nearly done, there’s just one more thing left: installing the program. We’ll need to add an install rule to our Makefile. This rule has two tasks: put a copy of the executable in the right spot and put a copy of the manpage in the right spot. There are generally standard locations where these things go, but it’s good practice to allow the user to customize the install location with make flags if they want to. Here’s what a simple installation rule would look like:
PREFIX=/usr/local
BINDIR=$(PREFIX)/bin
MANDIR=$(PREFIX)/share/man
.PHONY: install
install: hello
	mkdir -p -m 755 "$(BINDIR)" "$(MANDIR)/man1"
	cp hello "$(BINDIR)/"
	cp hello.1 "$(MANDIR)/man1/"
Also, I like to include an uninstall rule, just as a courtesy to make it easier to cleanly uninstall programs:
.PHONY: uninstall
uninstall:
	rm -f "$(BINDIR)/hello" "$(MANDIR)/man1/hello.1"
Conclusion
And that’s everything! With all of this, you have a pretty swanky setup for a decently-sized C program that compiles, installs, and comes with documentation. This is essentially the same setup used by the Lua programming language’s source code, and by most suckless programs. It’s easy to maintain, and once you have a project with this setup, it’s very easy to copy over to a new project.
Here are the final files:
Epilogue: Languages Other than C
Most of the tips here apply just as well to projects in other programming languages. For example, I write blog posts in Markdown, use Pandoc to convert them to HTML, then use rsync to send them to my server. I have a Makefile that handles all of that, and it looks something like this:
MD_FILES := $(shell find . -name '*.md')
MD_GEN=$(MD_FILES:%.md=%.html)
all: $(MD_GEN)
%.html: %.md
	pandoc --defaults .pandoc/defaults.yml $< -o $@
sync: all
	rsync -a ./ www@bruce-hill.com:/var/www/
.PHONY: all sync
Hopefully after reading this post, you now have a clearer idea of how C projects can be organized, and how to make your own Makefiles and manpages. The tools and ideas described here have a very simple core, but are incredibly flexible and useful, so it’s well worth trying out on your next project. Thanks for sticking around to the end of this rather long post!