Building from source
You've been rash enough to want to build some of the Glasgow Functional Programming tools (GHC, Happy, nofib, etc.) from source. You've slurped the source, from the CVS repository or from a source distribution, and now you're sitting looking at a huge mound of bits, wondering what to do next.
Gingerly, you type make. Wrong already!
The rest of this guide is intended for duffers like me, who aren't really interested in Makefiles and systems configurations, but who need a mental model of the interlocking pieces so that they can make them work, extend them consistently when adding new software, and lay hands on them gently when they don't work.
The source code is held in your source tree. The root directory of your source tree must contain the following directories and files:
Makefile: the root Makefile.
mk/: the directory that contains the main Makefile code, shared by all the fptools software.
configure.in, config.sub, config.guess: these files support the configuration process.
All the other directories are individual projects of the fptools system—for example, the Glasgow Haskell Compiler (ghc), the Happy parser generator (happy), the nofib benchmark suite, and so on. You can have zero or more of these. Needless to say, some of them are needed to build others.
The important thing to remember is that even if you want only one project (happy, say), you must have a source tree whose root directory contains Makefile, mk/, configure.in, and the project(s) you want (happy/ in this case). You cannot get by with just the happy/ directory.
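As a quick sanity check, the requirement above can be sketched as a small shell function. The name check_fptools_root is hypothetical (it is not part of fptools); it simply tests for the files and directory listed above.

```shell
#!/bin/sh
# Hypothetical helper (not part of fptools): report which required entries
# are missing from the root of a source tree.
check_fptools_root() {
    root=$1
    missing=0
    # Files that must exist in the root directory.
    for f in Makefile configure.in config.sub config.guess; do
        if [ ! -f "$root/$f" ]; then
            echo "missing file: $f"
            missing=1
        fi
    done
    # mk/ holds the Makefile code shared by all projects.
    if [ ! -d "$root/mk" ]; then
        echo "missing directory: mk/"
        missing=1
    fi
    return "$missing"
}

# Demonstration on a throwaway directory that contains only happy/.
demo=$(mktemp -d)
mkdir "$demo/happy"
check_fptools_root "$demo" || echo "not a usable source tree"
```

Run against a directory holding just happy/, the function lists every missing entry, which is exactly the situation the paragraph above warns about.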
If you just want to build the software once on a single platform, then your source tree can also be your build tree, and you can skip the rest of this section.
We often want to build multiple versions of our software for different architectures, or with different options (e.g. profiling). It's very desirable to share a single copy of the source code among all these builds.
So for every source tree we have zero or more build trees. Each build tree is initially an exact copy of the source tree, except that each file is a symbolic link to the source file, rather than being a copy of the source file. There are “standard” Unix utilities that make such copies, so standard that they go by different names: lndir and mkshadowdir are two. (If you don't have either, the source distribution includes sources for the X11 lndir; check out fptools/glafp-utils/lndir.) See Section 7.4 for a typical invocation.
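To make the idea of a linked copy concrete, here is a rough stand-in for lndir or mkshadowdir written with find and ln. It is only an illustration, not how either tool is implemented; the function name shadow_tree and the tiny demo tree are made up, and file names containing newlines are not handled.

```shell
#!/bin/sh
# Rough stand-in for lndir/mkshadowdir, for illustration only.
# shadow_tree SRC BUILD: mirror SRC's directories under BUILD and fill
# them with symbolic links back to the files in SRC.
shadow_tree() {
    src=$(cd "$1" && pwd)        # absolute path, so links resolve anywhere
    build=$2
    mkdir -p "$build"
    # Recreate the directory skeleton.
    (cd "$src" && find . -type d) | while read -r d; do
        mkdir -p "$build/$d"
    done
    # Link every regular file back to the source tree.
    (cd "$src" && find . -type f) | while read -r f; do
        ln -s "$src/$f" "$build/$f"
    done
}

# Demonstration with a tiny made-up source tree.
work=$(mktemp -d)
mkdir -p "$work/src/happy/src"
echo "all:" > "$work/src/Makefile"
echo "main = return ()" > "$work/src/happy/src/Main.lhs"
shadow_tree "$work/src" "$work/build-sun4"
ls -l "$work/build-sun4/happy/src"
```

Directories in the build tree are real directories, so built files such as Foo.o land there, while every source file is a link.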
The build tree does not need to be anywhere near the source tree in the file system. Indeed, one advantage of separating the build tree from the source is that the build tree can be placed in a non-backed-up partition, saving your systems support people from backing up untold megabytes of easily-regenerated, and rapidly-changing, gubbins. The golden rule is that (with a single exception—Section 7.3) absolutely everything in the build tree is either a symbolic link to the source tree, or else is mechanically generated. It should be perfectly OK for your build tree to vanish overnight; an hour or two compiling and you're on the road again.
You need to be a bit careful, though, that any new files you create (if you do any development work) are in the source tree, not a build tree!
Remember that the source files in the build tree are symbolic links to the files in the source tree. (The build tree soon accumulates lots of built files like Foo.o, as well.) You can delete a source file from the build tree without affecting the source tree (though it's an odd thing to do). On the other hand, if you edit a source file from the build tree, you'll edit the source-tree file directly. (You can set up Emacs so that if you edit a source file from the build tree, Emacs will silently create an edited copy of the source file in the build tree, leaving the source file unchanged; but the danger is that you think you've edited the source file whereas actually all you've done is edit the build-tree copy. More commonly you do want to edit the source file.)
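The two behaviours just described can be seen with a single linked file. This is a throwaway demonstration; the directory layout is invented for the example.

```shell
#!/bin/sh
# Illustration only: a one-file "source tree" and "build tree".
work=$(mktemp -d)
mkdir "$work/src" "$work/build"
echo "original" > "$work/src/Foo.lhs"
ln -s "$work/src/Foo.lhs" "$work/build/Foo.lhs"

# Writing through the link changes the source-tree file itself.
echo "edited via build tree" > "$work/build/Foo.lhs"
cat "$work/src/Foo.lhs"

# Removing the link leaves the source-tree file in place.
rm "$work/build/Foo.lhs"
ls "$work/src"
```

An editor that saves by writing a fresh file and renaming it over the old name would instead replace the link with a private copy, which is the trap mentioned in the parenthesis above.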
Like the source tree, the top level of your build tree must be (a linked copy of) the root directory of the fptools suite. Inside Makefiles, the root of your build tree is called $(FPTOOLS_TOP). In the rest of this document path names are relative to $(FPTOOLS_TOP) unless otherwise stated. For example, the file ghc/mk/target.mk is actually $(FPTOOLS_TOP)/ghc/mk/target.mk.
When you build fptools you will be compiling code on a particular host platform, to run on a particular target platform (usually the same as the host platform). The difficulty is that there are minor differences between different platforms; minor, but enough that the code needs to be a bit different for each. There are some big differences too: for a different architecture we need to build GHC with a different native-code generator.
There are also knobs you can turn to control how the fptools software is built. For example, you might want to build GHC optimised (so that it runs fast) or unoptimised (so that you can compile it fast after you've modified it). Or, you might want to compile it with debugging on (so that extra consistency-checking code gets included) or off. And so on.
All of this stuff is called the configuration of your build. You set the configuration using a three-step process.
Step 1. Change directory to $(FPTOOLS_TOP) and issue the command autoconf (with no arguments). This GNU program converts $(FPTOOLS_TOP)/configure.in to a shell script called $(FPTOOLS_TOP)/configure.
Some projects, including GHC, have their own configure script. If there's an $(FPTOOLS_TOP)/<project>/configure.in, then you need to run autoconf in that directory too.
Both these steps are completely platform-independent; they just mean that the human-written file (configure.in) can be short, although the resulting shell script, configure, and mk/config.h.in, are long.
In case you don't have autoconf we distribute the results, configure, and mk/config.h.in, with the source distribution. They aren't kept in the repository, though.
Step 2. Run the newly-created configure script, thus:
$ ./configure
configure's mission is to scurry round your computer working out what architecture it has, what operating system, whether it has the vfork system call, where yacc is kept, whether gcc is available, where various obscure #include files are, whether it's a leap year, and what the systems manager had for lunch. It communicates these snippets of information in two ways:
It translates mk/config.mk.in to mk/config.mk, substituting for things between “@” brackets. So, “@HaveGcc@” will be replaced by “YES” or “NO” depending on what configure finds. mk/config.mk is included by every Makefile (directly or indirectly), so the configuration information is thereby communicated to all Makefiles.
It translates mk/config.h.in to mk/config.h. The latter is #included by various C programs, which can thereby make use of configuration information.
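The “@” substitution is easy to picture with sed. The sketch below is not configure itself: the two-line template is invented, and the probes stand in for the far more thorough tests that configure runs.

```shell
#!/bin/sh
# Illustration only: mimic the "@Name@" substitution that configure
# performs, using sed. The probes below are stand-ins for configure's tests.
work=$(mktemp -d)
mkdir "$work/mk"

# A two-line template in the style of mk/config.mk.in.
cat > "$work/mk/config.mk.in" <<'EOF'
HaveGcc = @HaveGcc@
YACC    = @YaccCmd@
EOF

# Decide the values (real configure does far more thorough checks).
if command -v gcc >/dev/null 2>&1; then have_gcc=YES; else have_gcc=NO; fi
yacc_cmd=$(command -v yacc || echo yacc)

# Substitute and write the generated file.
sed -e "s|@HaveGcc@|$have_gcc|" \
    -e "s|@YaccCmd@|$yacc_cmd|" \
    "$work/mk/config.mk.in" > "$work/mk/config.mk"
cat "$work/mk/config.mk"
```

The generated config.mk contains no “@” markers any more, only plain make definitions that every Makefile can include.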
configure takes some optional arguments. Use ./configure --help to get a list of the available arguments. Here are some of the ones you might need:
--with-ghc=<path>: specifies the path to an installed GHC which you would like to use. This compiler will be used for compiling GHC-specific code (e.g. GHC itself). This option cannot be specified using build.mk (see later), because configure needs to auto-detect the version of GHC you're using. The default is to look for a compiler named ghc in your path.
--with-hc=<path>: specifies the path to any installed Haskell compiler. This compiler will be used for compiling generic Haskell code. The default is to use ghc.
--with-gcc=<path>: specifies the path to the installed GCC. This compiler will be used to compile all C files, except any generated by the installed Haskell compiler, which will have its own idea of which C compiler (if any) to use. The default is to use gcc.
configure caches the results of its run in config.cache. Quite often you don't want that; you're running configure a second time because something has changed. In that case, simply delete config.cache.
Step 3. Next, you say how this build of fptools is to differ from the standard defaults by creating a new file mk/build.mk in the build tree. This file is the one and only file you edit in the build tree, precisely because it says how this build differs from the source. (Just in case your build tree does die, you might want to keep a private directory of build.mk files, and use a symbolic link in each build tree to point to the appropriate one.) So mk/build.mk never exists in the source tree; you create one in each build tree from the template. We'll discuss what to put in it shortly.
And that's it for configuration. Simple, eh?
What do you put in your build-specific configuration file mk/build.mk? For almost all purposes all you will do is put make variable definitions that override those in mk/config.mk.in. The whole point of mk/config.mk.in—and its derived counterpart mk/config.mk—is to define the build configuration. It is heavily commented, as you will see if you look at it. So generally, what you do is look at mk/config.mk.in, and add definitions in mk/build.mk that override any of the config.mk definitions that you want to change. (The override occurs because the main boilerplate file, mk/boilerplate.mk, includes build.mk after config.mk.)
For example, config.mk.in contains a definition of the variable GhcHcOpts.
The accompanying comment explains that this is the list of flags passed to GHC when building GHC itself. For doing development, it is wise to add -DDEBUG, to enable debugging code. So in build.mk you would either redefine GhcHcOpts as the default flags followed by -DDEBUG
or, if you prefer, append to the existing definition:
GhcHcOpts += -DDEBUG
(GNU make allows existing definitions to have new text appended using the “+=” operator, which is quite a convenient feature.)
If you want to remove the -O as well (a good idea when developing, because the turn-around cycle gets a lot quicker), you can just override GhcHcOpts altogether, with a definition that lists -DDEBUG but leaves out -O.
When reading config.mk.in, remember that anything between “@...@” signs is going to be substituted by configure later. You can override the resulting definition if you want, but you need to be a bit surer what you are doing. For example, there's a line that says:
YACC = @YaccCmd@
This defines the make variable YACC to be the pathname of a yacc that configure finds somewhere. If you have your own pet yacc you want to use instead, that's fine. Just add this line to mk/build.mk:
YACC = myyacc
You do not have to have a mk/build.mk file at all; if you don't, you'll get all the default settings from mk/config.mk.in.
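The override mechanism depends only on include order, which a toy tree can demonstrate. Everything below is made up for illustration (the real config.mk and boilerplate.mk are far larger); it assumes that gmake or make on your PATH is GNU make, and does nothing if neither is installed.

```shell
#!/bin/sh
# Illustration only: a toy tree showing why build.mk wins over config.mk.
# File contents are made up; only the include order mirrors the text.
work=$(mktemp -d)
mkdir "$work/mk"

printf 'GhcHcOpts = -O\nYACC = yacc\n' > "$work/mk/config.mk"
printf 'GhcHcOpts += -DDEBUG\nYACC = myyacc\n' > "$work/mk/build.mk"

# config.mk first, build.mk second; "-include" tolerates a missing build.mk.
printf 'include mk/config.mk\n-include mk/build.mk\n' > "$work/mk/boilerplate.mk"

# A Makefile whose only job is to print the final values.
printf 'include mk/boilerplate.mk\nshow:\n\t@echo "GhcHcOpts=$(GhcHcOpts)"\n\t@echo "YACC=$(YACC)"\n' \
    > "$work/Makefile"

# Prefer gmake where it exists, else fall back to make (assumed to be GNU).
MAKE_CMD=$(command -v gmake || command -v make || true)
if [ -n "$MAKE_CMD" ]; then
    settings=$(cd "$work" && "$MAKE_CMD" -s show)
    echo "$settings"
fi
```

Deleting the toy mk/build.mk and running the show target again falls back to the config.mk values, matching the statement that build.mk is optional.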
You can also use build.mk to override anything that configure got wrong. One place where this happens often is with the definition of FPTOOLS_TOP_ABS: this variable is supposed to be the canonical path to the top of your source tree, but if your system uses an automounter then the correct directory is hard to find automatically. If you find that configure has got it wrong, just put the correct definition in build.mk.
Let's summarise the steps you need to carry out to get yourself a fully-configured build tree from scratch.
Get your source tree from somewhere (CVS repository or source distribution). Say you call the root directory myfptools (it does not have to be called fptools). Make sure that you have the essential files (see Section 7.1).
(Optional) Use lndir or mkshadowdir to create a build tree.
$ cd myfptools
$ mkshadowdir . /scratch/joe-bloggs/myfptools-sun4
(N.B. mkshadowdir's first argument is taken relative to its second.) You probably want to give the build tree a name that suggests its main defining characteristic (in your mind at least), in case you later add others.
Change directory to the build tree. Everything is going to happen there now.
$ cd /scratch/joe-bloggs/myfptools-sun4
Prepare for system configuration:
$ autoconf
(You can skip this step if you are starting from a source distribution, and you already have configure and mk/config.h.in.)
Some projects, including GHC itself, have their own configure scripts, so it is necessary to run autoconf again in the appropriate subdirectories, e.g.:
$ (cd ghc; autoconf)
Do system configuration:
$ ./configure
Don't forget to check whether you need to add any arguments to configure; for example, a common requirement is to specify which GHC to use with --with-ghc=ghc.
Create the file mk/build.mk, adding definitions for your desired configuration options.
$ emacs mk/build.mk
You can make subsequent changes to mk/build.mk as often as you like. You do not have to run any further configuration programs to make these changes take effect. In theory you should, however, say gmake clean, gmake all, because configuration option changes could affect anything—but in practice you are likely to know what's affected.
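For reference, the steps above can be gathered into one script. This is a sketch, not a supported tool: the function names are hypothetical, the paths and the --with-ghc value are the examples used in the text, and by default it only prints the commands (set DRY_RUN=0 to execute them for real).

```shell
#!/bin/sh
# Sketch only: the configuration steps above as one script. With DRY_RUN=1
# (the default here) each command is printed rather than executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

configure_build_tree() {
    src=$1      # root of the source tree, e.g. myfptools
    build=$2    # where the build tree should live

    run cd "$src"
    run mkshadowdir . "$build"      # or lndir; this step is optional
    run cd "$build"
    run autoconf                    # skip if configure already exists
    run sh -c 'cd ghc && autoconf'  # projects with their own configure.in
    run ./configure --with-ghc=ghc
    run touch mk/build.mk           # then edit it to taste
}

configure_build_tree myfptools /scratch/joe-bloggs/myfptools-sun4
```

Printing first and executing second is a deliberate choice here: the real steps take a while and modify the build tree, so it is worth reading the command list before committing to it.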
At this point you have made yourself a fully-configured build tree, so you are ready to start building real things.
The first thing you need to know is that you must use GNU make, usually called gmake, not standard Unix make. If you use standard Unix make you will get all sorts of error messages (but no damage) because the fptools Makefiles use GNU make's facilities extensively.
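If you are not sure which make you have, a check along the following lines may help. The function name find_gnu_make is hypothetical; it relies on GNU make printing "GNU Make" in response to --version.

```shell
#!/bin/sh
# Hypothetical helper: print the name of a GNU make on the PATH, trying
# gmake first, then make. Returns non-zero if neither is GNU make.
find_gnu_make() {
    for candidate in gmake make; do
        if command -v "$candidate" >/dev/null 2>&1 &&
           "$candidate" --version 2>/dev/null | grep 'GNU Make' >/dev/null; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

if gnu_make=$(find_gnu_make); then
    echo "using $gnu_make"
else
    echo "no GNU make found; install it before building" >&2
fi
```

On many systems the plain make command is already GNU make, in which case you can type make wherever this guide says gmake.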
To just build the whole thing, cd to the top of your fptools tree and type gmake. This will prepare the tree and build the various projects in the correct order.
In any directory you should be able to make the following:
boot: does the one-off preparation required to get ready for the real work. Notably, it does gmake depend in all directories that contain programs. It also builds the necessary tools for compilation to proceed.
Invoking the boot target explicitly is not normally necessary. From the top-level fptools directory, invoking gmake causes gmake boot all to be invoked in each of the project subdirectories, in the order specified by $(AllTargets) in config.mk.
If you're working in a subdirectory somewhere and need to update the dependencies, gmake boot is a good way to do it.
all: makes all the final target(s) for this Makefile. Depending on which directory you are in, a “final target” may be an executable program, a library archive, a shell script, or a Postscript file. Typing gmake alone is generally the same as typing gmake all.
install: installs the things built by all (except for the documentation). Where does it install them? That is specified by mk/config.mk.in; you can override it in mk/build.mk, or by running configure with command-line arguments like --bindir=/home/simonpj/bin; see ./configure --help for the full details.
install-docs: installs the documentation. Otherwise behaves just like install.
uninstall: reverses the effect of install.
clean: deletes all files from the current directory that are normally created by building the program. It doesn't delete the files that record the configuration, or files generated by gmake boot. It also preserves files that could be made by building, but normally aren't because the distribution comes with them.
distclean: deletes all files from the current directory that are created by configuring or building the program. If you have unpacked the source and built the program without creating any other files, make distclean should leave only the files that were in the distribution.
mostlyclean: like clean, but may refrain from deleting a few files that people normally don't want to recompile.
maintainer-clean: deletes everything from the current directory that can be reconstructed with this Makefile. This typically includes everything deleted by distclean, plus more: C source files produced by Bison, tags tables, Info files, and so on.
One exception, however: make maintainer-clean should not delete configure even if configure can be remade using a rule in the Makefile. More generally, make maintainer-clean should not delete anything that needs to exist in order to run configure and then begin to build the program.
check: runs the test suite.
All of these standard targets automatically recurse into sub-directories. Certain other standard targets do not:
configure: is only available in the root directory $(FPTOOLS_TOP); it has been discussed in Section 7.3.
depend: makes a .depend file in each directory that needs it. This .depend file contains mechanically-generated dependency information; for example, suppose a directory contains a Haskell source module Foo.lhs which imports another module Baz. Then the generated .depend file will contain the dependency:
Foo.o : Baz.hi
which says that the object file Foo.o depends on the interface file Baz.hi generated by compiling module Baz. The .depend file is automatically included by every Makefile.
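To show where such a line comes from, here is a deliberately naive generator. It is not how mkdependHS works: it only scans literate Haskell files for "> import M" lines, and it ignores qualified imports, hierarchical module names and the distinction between local and library modules. The two source files are invented for the demonstration.

```shell
#!/bin/sh
# Naive illustration only; this is NOT how mkdependHS works. It scans
# literate Haskell files for "> import M" lines and emits one rule each.
work=$(mktemp -d)
cd "$work" || exit 1

printf '> module Foo where\n> import Baz\n> foo = baz\n' > Foo.lhs
printf '> module Baz where\n> baz = 1\n' > Baz.lhs

for src in *.lhs; do
    obj=${src%.lhs}.o
    # Capture the module name after "> import" and print "Obj : Mod.hi".
    sed -n "s/^> import  *\([A-Za-z0-9_]*\).*/$obj : \1.hi/p" "$src"
done > .depend

cat .depend
```

Baz.lhs imports nothing, so it contributes no rule; only the Foo.o line from the text appears in the generated .depend.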
binary-dist: makes a binary distribution. This is the target we use to build the binary distributions of GHC and Happy.
dist: makes a source distribution. Note that this target does “make distclean” as part of its work; don't use it if you want to keep what you've built.
Most Makefiles have targets other than these. You can discover them by looking in the Makefile itself.
If you want to build GHC (say) and just use it direct from the build tree without doing make install first, you can run the in-place driver script: ghc/compiler/ghc-inplace.
Do NOT use ghc/compiler/ghc, or ghc/compiler/ghc-5.xx, as these are the scripts intended for installation, and contain hard-wired paths to the installed libraries, rather than the libraries in the build tree.
Happy can similarly be run from the build tree, using happy/src/happy-inplace.
Sometimes the dependencies get in the way: if you've made a small change to one file, and you're absolutely sure that it won't affect anything else, but you know that make is going to rebuild everything anyway, the following hack may be useful:
$ gmake FAST=YES
This tells the make system to ignore dependencies and just build what you tell it to. In other words, it's equivalent to temporarily removing the .depend file in the current directory (where mkdependHS and friends store their dependency information).
A bit of history: GHC used to come with a fastmake script that did the above job, but GNU make provides the features we need to do it without resorting to a script. Also, we've found that fastmaking is less useful since the advent of GHC's recompilation checker (see the User's Guide section on "Separate Compilation").