Process of the Linux kernel building ================================================================================ Introduction -------------------------------------------------------------------------------- I won't tell you how to build and install a custom Linux kernel on your machine. If you need help with this, you can find many [resources](https://encrypted.google.com/search?q=building+linux+kernel#q=building+linux+kernel+from+source+code) that will help you do it. Instead, we will learn what occurs when you execute `make` in the root directory of the Linux kernel source code. When I started to study the source code of the Linux kernel, the [makefile](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/Makefile) was the first file that I opened. And it was scary :). The [makefile](https://en.wikipedia.org/wiki/Make_%28software%29) contained `1591` lines of code when I wrote this part and the kernel was the [4.2.0-rc3](https://github.com/torvalds/linux/commit/52721d9d3334c1cb1f76219a161084094ec634dc) release. This makefile is the top makefile in the Linux kernel source code and the kernel building starts here. Yes, it is big, but moreover, if you've read the source code of the Linux kernel you may have noted that all directories containing source code has its own makefile. Of course it is not possible to describe how each source file is compiled and linked, so we will only study the standard compilation case. You will not find here building of the kernel's documentation, cleaning of the kernel source code, [tags](https://en.wikipedia.org/wiki/Ctags) generation, [cross-compilation](https://en.wikipedia.org/wiki/Cross_compiler) related stuff, etc... We will start from the `make` execution with the standard kernel configuration file and will finish with the building of the [bzImage](https://en.wikipedia.org/wiki/Vmlinux#bzImage). It would be better if you're already familiar with the [make](https://en.wikipedia.org/wiki/Make_%28software%29) util, but I will try to describe every piece of code in this part anyway. So let's start. Preparation before the kernel compilation --------------------------------------------------------------------------------- There are many things to prepare before the kernel compilation can be started. The main point here is to find and configure the type of compilation, to parse command line arguments that are passed to `make`, etc... So let's dive into the top `Makefile` of Linux kernel. The top `Makefile` of Linux kernel is responsible for building two major products: [vmlinux](https://en.wikipedia.org/wiki/Vmlinux) (the resident kernel image) and the modules (any module files). The [Makefile](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/Makefile) of the Linux kernel starts with the definition of following variables: ```Makefile VERSION = 4 PATCHLEVEL = 2 SUBLEVEL = 0 EXTRAVERSION = -rc3 NAME = Hurr durr I'ma sheep ``` These variables determine the current version of Linux kernel and are used in different places, for example in the forming of the `KERNELVERSION` variable in the same `Makefile`: ```Makefile KERNELVERSION = $(VERSION)$(if $(PATCHLEVEL),.$(PATCHLEVEL)$(if $(SUBLEVEL),.$(SUBLEVEL)))$(EXTRAVERSION) ``` After this we can see a couple of `ifeq` conditions that check some of the parameters passed to `make`. The Linux kernel `makefiles` provides a special `make help` target that prints all available targets and some of the command line arguments that can be passed to `make`. For example : `make V=1` => verbose build. The first `ifeq` checks whether the `V=n` option is passed to `make`: ```Makefile ifeq ("$(origin V)", "command line") KBUILD_VERBOSE = $(V) endif ifndef KBUILD_VERBOSE KBUILD_VERBOSE = 0 endif ifeq ($(KBUILD_VERBOSE),1) quiet = Q = else quiet=quiet_ Q = @ endif export quiet Q KBUILD_VERBOSE ``` If this option is passed to `make`, we set the `KBUILD_VERBOSE` variable to the value of `V` option. Otherwise we set the `KBUILD_VERBOSE` variable to zero. After this we check the value of `KBUILD_VERBOSE` variable and set values of the `quiet` and `Q` variables depending on the value of `KBUILD_VERBOSE` variable. The `@` symbols suppress the output of command. And if it is present before a command the output will be something like this: `CC scripts/mod/empty.o` instead of `Compiling .... scripts/mod/empty.o`. In the end we just export all of these variables. The next `ifeq` statement checks that `O=/dir` option was passed to the `make`. This option allows to locate all output files in the given `dir`: ```Makefile ifeq ($(KBUILD_SRC),) ifeq ("$(origin O)", "command line") KBUILD_OUTPUT := $(O) endif ifneq ($(KBUILD_OUTPUT),) saved-output := $(KBUILD_OUTPUT) KBUILD_OUTPUT := $(shell mkdir -p $(KBUILD_OUTPUT) && cd $(KBUILD_OUTPUT) \ && /bin/pwd) $(if $(KBUILD_OUTPUT),, \ $(error failed to create output directory "$(saved-output)")) sub-make: FORCE $(Q)$(MAKE) -C $(KBUILD_OUTPUT) KBUILD_SRC=$(CURDIR) \ -f $(CURDIR)/Makefile $(filter-out _all sub-make,$(MAKECMDGOALS)) skip-makefile := 1 endif # ifneq ($(KBUILD_OUTPUT),) endif # ifeq ($(KBUILD_SRC),) ``` We check the `KBUILD_SRC` that represents the top directory of the kernel source code and whether it is empty (it is empty when the makefile is executed for the first time). We then set the `KBUILD_OUTPUT` variable to the value passed with the `O` option (if this option was passed). In the next step we check this `KBUILD_OUTPUT` variable and if it is set, we do following things: * Store the value of `KBUILD_OUTPUT` in the temporary `saved-output` variable; * Try to create the given output directory; * Check that directory created, in other way print error message; * If the custom output directory was created successfully, execute `make` again with the new directory (see the `-C` option). The next `ifeq` statements check that the `C` or `M` options passed to `make`: ```Makefile ifeq ("$(origin C)", "command line") KBUILD_CHECKSRC = $(C) endif ifndef KBUILD_CHECKSRC KBUILD_CHECKSRC = 0 endif ifeq ("$(origin M)", "command line") KBUILD_EXTMOD := $(M) endif ``` The `C` option tells the `makefile` that we need to check all `c` source code with a tool provided by the `$CHECK` environment variable, by default it is [sparse](https://en.wikipedia.org/wiki/Sparse). The second `M` option provides build for the external modules (will not see this case in this part). We also check whether the `KBUILD_SRC` variable is set, and if it isn't, we set the `srctree` variable to `.`: ```Makefile ifeq ($(KBUILD_SRC),) srctree := . endif objtree := . src := $(srctree) obj := $(objtree) export srctree objtree VPATH ``` That tells `Makefile` that the kernel source tree will be in the current directory where `make` was executed. We then set `objtree` and other variables to this directory and export them. The next step is to get value for the `SUBARCH` variable that represents what the underlying architecture is: ```Makefile SUBARCH := $(shell uname -m | sed -e s/i.86/x86/ -e s/x86_64/x86/ \ -e s/sun4u/sparc64/ \ -e s/arm.*/arm/ -e s/sa110/arm/ \ -e s/s390x/s390/ -e s/parisc64/parisc/ \ -e s/ppc.*/powerpc/ -e s/mips.*/mips/ \ -e s/sh[234].*/sh/ -e s/aarch64.*/arm64/ ) ``` As you can see, it executes the [uname](https://en.wikipedia.org/wiki/Uname) util that prints information about machine, operating system and architecture. As it gets the output of `uname`, it parses the output and assigns the result to the `SUBARCH` variable. Now that we have `SUBARCH`, we set the `SRCARCH` variable that provides the directory of the certain architecture and `hdr-arch` that provides the directory for the header files: ```Makefile ifeq ($(ARCH),i386) SRCARCH := x86 endif ifeq ($(ARCH),x86_64) SRCARCH := x86 endif hdr-arch := $(SRCARCH) ``` Note `ARCH` is an alias for `SUBARCH`. In the next step we set the `KCONFIG_CONFIG` variable that represents path to the kernel configuration file and if it was not set before, it is set to `.config` by default: ```Makefile KCONFIG_CONFIG ?= .config export KCONFIG_CONFIG ``` and the [shell](https://en.wikipedia.org/wiki/Shell_%28computing%29) that will be used during kernel compilation: ```Makefile CONFIG_SHELL := $(shell if [ -x "$$BASH" ]; then echo $$BASH; \ else if [ -x /bin/bash ]; then echo /bin/bash; \ else echo sh; fi ; fi) ``` The next set of variables are related to the compilers used during Linux kernel compilation. We set the host compilers for the `c` and `c++` and the flags to be used with them: ```Makefile HOSTCC = gcc HOSTCXX = g++ HOSTCFLAGS = -Wall -Wmissing-prototypes -Wstrict-prototypes -O2 -fomit-frame-pointer -std=gnu89 HOSTCXXFLAGS = -O2 ``` Next we get to the `CC` variable that represents compiler too, so why do we need the `HOST*` variables? `CC` is the target compiler that will be used during kernel compilation, but `HOSTCC` will be used during compilation of the set of the `host` programs (we will see it soon). After this we can see the definition of `KBUILD_MODULES` and `KBUILD_BUILTIN` variables that are used to determine what to compile (modules, kernel, or both): ```Makefile KBUILD_MODULES := KBUILD_BUILTIN := 1 ifeq ($(MAKECMDGOALS),modules) KBUILD_BUILTIN := $(if $(CONFIG_MODVERSIONS),1) endif ``` Here we can see definition of these variables and the value of `KBUILD_BUILTIN` variable will depend on the `CONFIG_MODVERSIONS` kernel configuration parameter if we pass only `modules` to `make`. The next step is to include the `kbuild` file. ```Makefile include scripts/Kbuild.include ``` The [Kbuild](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/Documentation/kbuild/kbuild.txt) or `Kernel Build System` is a special infrastructure to manage building the kernel and its modules. `kbuild` files have the same syntax as makefiles. The [scripts/Kbuild.include](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/scripts/Kbuild.include) file provides some generic definitions for the `kbuild` system. After including this `kbuild` file (back in [makefile](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/Makefile)) we can see the definitions of the variables that are related to the different tools used during kernel and module compilation (like linker, compilers, utils from the [binutils](http://www.gnu.org/software/binutils/), etc...): ```Makefile AS = $(CROSS_COMPILE)as LD = $(CROSS_COMPILE)ld CC = $(CROSS_COMPILE)gcc CPP = $(CC) -E AR = $(CROSS_COMPILE)ar NM = $(CROSS_COMPILE)nm STRIP = $(CROSS_COMPILE)strip OBJCOPY = $(CROSS_COMPILE)objcopy OBJDUMP = $(CROSS_COMPILE)objdump AWK = awk ... ... ... ``` We then define two other variables: `USERINCLUDE` and `LINUXINCLUDE`, which specify paths to header file directories (public for users in the first case and for kernel in the second case): ```Makefile USERINCLUDE := \ -I$(srctree)/arch/$(hdr-arch)/include/uapi \ -Iarch/$(hdr-arch)/include/generated/uapi \ -I$(srctree)/include/uapi \ -Iinclude/generated/uapi \ -include $(srctree)/include/linux/kconfig.h LINUXINCLUDE := \ -I$(srctree)/arch/$(hdr-arch)/include \ ... ``` And the standard flags for the C compiler: ```Makefile KBUILD_CFLAGS := -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \ -fno-strict-aliasing -fno-common \ -Werror-implicit-function-declaration \ -Wno-format-security \ -std=gnu89 ``` These are not the final compilation flags, as they can be updated in other makefiles (for example kbuilds from `arch/`). After all of these, all variables will be exported to be available in the other makefiles. The `RCS_FIND_IGNORE` and the `RCS_TAR_IGNORE` variables contain files that will be ignored in the version control system: ```Makefile export RCS_FIND_IGNORE := \( -name SCCS -o -name BitKeeper -o -name .svn -o \ -name CVS -o -name .pc -o -name .hg -o -name .git \) \ -prune -o export RCS_TAR_IGNORE := --exclude SCCS --exclude BitKeeper --exclude .svn \ --exclude CVS --exclude .pc --exclude .hg --exclude .git ``` With that, we have finished all preparations. The next step is building the `vmlinux` target. Directly to the kernel build -------------------------------------------------------------------------------- We have now finished all the preparations, and next step in the main makefile is related to the kernel build. Before this moment, nothing has been printed to the terminal by `make`. But now the first steps of the compilation are started. We need to go to line [598](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/Makefile#L598) of the Linux kernel top makefile and we will find the `vmlinux` target there: ```Makefile all: vmlinux include arch/$(SRCARCH)/Makefile ``` Don't worry that we have missed many lines in Makefile that are between `export RCS_FIND_IGNORE.....` and `all: vmlinux.....`. This part of the makefile is responsible for the `make *.config` targets and as I wrote in the beginning of this part we will see only building of the kernel in a general way. The `all:` target is the default when no target is given on the command line. You can see here that we include architecture specific makefile there (in our case it will be [arch/x86/Makefile](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/Makefile)). From this moment we will continue from this makefile. As we can see `all` target depends on the `vmlinux` target that defined a little lower in the top makefile: ```Makefile vmlinux: scripts/link-vmlinux.sh $(vmlinux-deps) FORCE ``` The `vmlinux` is the Linux kernel in a statically linked executable file format. The [scripts/link-vmlinux.sh](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/scripts/link-vmlinux.sh) script links and combines different compiled subsystems into vmlinux. The second target is the `vmlinux-deps` that defined as: ```Makefile vmlinux-deps := $(KBUILD_LDS) $(KBUILD_VMLINUX_INIT) $(KBUILD_VMLINUX_MAIN) ``` and consists from the set of the `built-in.o` from each top directory of the Linux kernel. Later, when we will go through all directories in the Linux kernel, the `Kbuild` will compile all the `$(obj-y)` files. It then calls `$(LD) -r` to merge these files into one `built-in.o` file. For this moment we have no `vmlinux-deps`, so the `vmlinux` target will not be executed now. For me `vmlinux-deps` contains following files: ``` arch/x86/kernel/vmlinux.lds arch/x86/kernel/head_64.o arch/x86/kernel/head64.o arch/x86/kernel/head.o init/built-in.o usr/built-in.o arch/x86/built-in.o kernel/built-in.o mm/built-in.o fs/built-in.o ipc/built-in.o security/built-in.o crypto/built-in.o block/built-in.o lib/lib.a arch/x86/lib/lib.a lib/built-in.o arch/x86/lib/built-in.o drivers/built-in.o sound/built-in.o firmware/built-in.o arch/x86/pci/built-in.o arch/x86/power/built-in.o arch/x86/video/built-in.o net/built-in.o ``` The next target that can be executed is following: ```Makefile $(sort $(vmlinux-deps)): $(vmlinux-dirs) ; $(vmlinux-dirs): prepare scripts $(Q)$(MAKE) $(build)=$@ ``` As we can see `vmlinux-dirs` depends on two targets: `prepare` and `scripts`. `prepare` is defined in the top `Makefile` of the Linux kernel and executes three stages of preparations: ```Makefile prepare: prepare0 prepare0: archprepare FORCE $(Q)$(MAKE) $(build)=. archprepare: archheaders archscripts prepare1 scripts_basic prepare1: prepare2 $(version_h) include/generated/utsrelease.h \ include/config/auto.conf $(cmd_crmodverdir) prepare2: prepare3 outputmakefile asm-generic ``` The first `prepare0` expands to the `archprepare` that expands to the `archheaders` and `archscripts` that defined in the `x86_64` specific [Makefile](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/Makefile). Let's look on it. The `x86_64` specific makefile starts from the definition of the variables that are related to the architecture-specific configs ([defconfig](https://github.com/torvalds/linux/tree/master/arch/x86/configs), etc...). After this it defines flags for the compiling of the [16-bit](https://en.wikipedia.org/wiki/Real_mode) code, calculating of the `BITS` variable that can be `32` for `i386` or `64` for the `x86_64` flags for the assembly source code, flags for the linker and many many more (all definitions you can find in the [arch/x86/Makefile](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/Makefile)). The first target is `archheaders` in the makefile generates syscall table: ```Makefile archheaders: $(Q)$(MAKE) $(build)=arch/x86/entry/syscalls all ``` And the second target is `archscripts` in this makefile is: ```Makefile archscripts: scripts_basic $(Q)$(MAKE) $(build)=arch/x86/tools relocs ``` We can see that it depends on the `scripts_basic` target from the top [Makefile](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/Makefile). At the first we can see the `scripts_basic` target that executes make for the [scripts/basic](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/scripts/basic/Makefile) makefile: ```Makefile scripts_basic: $(Q)$(MAKE) $(build)=scripts/basic ``` The `scripts/basic/Makefile` contains targets for compilation of the two host programs: `fixdep` and `bin2`: ```Makefile hostprogs-y := fixdep hostprogs-$(CONFIG_BUILD_BIN2C) += bin2c always := $(hostprogs-y) $(addprefix $(obj)/,$(filter-out fixdep,$(always))): $(obj)/fixdep ``` First program is `fixdep` - optimizes list of dependencies generated by [gcc](https://gcc.gnu.org/) that tells make when to remake a source code file. The second program is `bin2c`, which depends on the value of the `CONFIG_BUILD_BIN2C` kernel configuration option and is a very little C program that allows to convert a binary on stdin to a C include on stdout. You can note here a strange notation: `hostprogs-y`, etc... This notation is used in the all `kbuild` files and you can read more about it in the [documentation](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/Documentation/kbuild/makefiles.txt). In our case `hostprogs-y` tells `kbuild` that there is one host program named `fixdep` that will be built from `fixdep.c` that is located in the same directory where the `Makefile` is. The first output after we execute `make` in our terminal will be result of this `kbuild` file: ``` $ make HOSTCC scripts/basic/fixdep ``` As `script_basic` target was executed, the `archscripts` target will execute `make` for the [arch/x86/tools](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/tools/Makefile) makefile with the `relocs` target: ```Makefile $(Q)$(MAKE) $(build)=arch/x86/tools relocs ``` The `relocs_32.c` and the `relocs_64.c` will be compiled that will contain [relocation](https://en.wikipedia.org/wiki/Relocation_%28computing%29) information and we will see it in the `make` output: ```Makefile HOSTCC arch/x86/tools/relocs_32.o HOSTCC arch/x86/tools/relocs_64.o HOSTCC arch/x86/tools/relocs_common.o HOSTLD arch/x86/tools/relocs ``` There is checking of the `version.h` after compiling of the `relocs.c`: ```Makefile $(version_h): $(srctree)/Makefile FORCE $(call filechk,version.h) $(Q)rm -f $(old_version_h) ``` We can see it in the output: ``` CHK include/config/kernel.release ``` and the building of the `generic` assembly headers with the `asm-generic` target from the `arch/x86/include/generated/asm` that generated in the top Makefile of the Linux kernel. After the `asm-generic` target the `archprepare` will be done, so the `prepare0` target will be executed. As I wrote above: ```Makefile prepare0: archprepare FORCE $(Q)$(MAKE) $(build)=. ``` Note on the `build`. It defined in the [scripts/Kbuild.include](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/scripts/Kbuild.include) and looks like this: ```Makefile build := -f $(srctree)/scripts/Makefile.build obj ``` Or in our case it is current source directory - `.`: ```Makefile $(Q)$(MAKE) -f $(srctree)/scripts/Makefile.build obj=. ``` The [scripts/Makefile.build](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/scripts/Makefile.build) tries to find the `Kbuild` file by the given directory via the `obj` parameter, include this `Kbuild` files: ```Makefile include $(kbuild-file) ``` and build targets from it. In our case `.` contains the [Kbuild](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/Kbuild) file that generates the `kernel/bounds.s` and the `arch/x86/kernel/asm-offsets.s`. After this the `prepare` target finished to work. The `vmlinux-dirs` also depends on the second target - `scripts` that compiles following programs: `file2alias`, `mk_elfconfig`, `modpost`, etc..... After scripts/host-programs compilation our `vmlinux-dirs` target can be executed. First of all let's try to understand what does `vmlinux-dirs` contain. For my case it contains paths of the following kernel directories: ``` init usr arch/x86 kernel mm fs ipc security crypto block drivers sound firmware arch/x86/pci arch/x86/power arch/x86/video net lib arch/x86/lib ``` We can find definition of the `vmlinux-dirs` in the top [Makefile](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/Makefile) of the Linux kernel: ```Makefile vmlinux-dirs := $(patsubst %/,%,$(filter %/, $(init-y) $(init-m) \ $(core-y) $(core-m) $(drivers-y) $(drivers-m) \ $(net-y) $(net-m) $(libs-y) $(libs-m))) init-y := init/ drivers-y := drivers/ sound/ firmware/ net-y := net/ libs-y := lib/ ... ... ... ``` Here we remove the `/` symbol from the each directory with the help of the `patsubst` and `filter` functions and put it to the `vmlinux-dirs`. So we have list of directories in the `vmlinux-dirs` and the following code: ```Makefile $(vmlinux-dirs): prepare scripts $(Q)$(MAKE) $(build)=$@ ``` The `$@` represents `vmlinux-dirs` here that means that it will go recursively over all directories from the `vmlinux-dirs` and its internal directories (depens on configuration) and will execute `make` in there. We can see it in the output: ``` CC init/main.o CHK include/generated/compile.h CC init/version.o CC init/do_mounts.o ... CC arch/x86/crypto/glue_helper.o AS arch/x86/crypto/aes-x86_64-asm_64.o CC arch/x86/crypto/aes_glue.o ... AS arch/x86/entry/entry_64.o AS arch/x86/entry/thunk_64.o CC arch/x86/entry/syscall_64.o ``` Source code in each directory will be compiled and linked to the `built-in.o`: ``` $ find . -name built-in.o ./arch/x86/crypto/built-in.o ./arch/x86/crypto/sha-mb/built-in.o ./arch/x86/net/built-in.o ./init/built-in.o ./usr/built-in.o ... ... ``` Ok, all buint-in.o(s) built, now we can back to the `vmlinux` target. As you remember, the `vmlinux` target is in the top Makefile of the Linux kernel. Before the linking of the `vmlinux` it builds [samples](https://github.com/torvalds/linux/tree/master/samples), [Documentation](https://github.com/torvalds/linux/tree/master/Documentation), etc... but I will not describe it here as I wrote in the beginning of this part. ```Makefile vmlinux: scripts/link-vmlinux.sh $(vmlinux-deps) FORCE ... ... +$(call if_changed,link-vmlinux) ``` As you can see main purpose of it is a call of the [scripts/link-vmlinux.sh](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/scripts/link-vmlinux.sh) script is linking of the all `built-in.o`(s) to the one statically linked executable and creation of the [System.map](https://en.wikipedia.org/wiki/System.map). In the end we will see following output: ``` LINK vmlinux LD vmlinux.o MODPOST vmlinux.o GEN .version CHK include/generated/compile.h UPD include/generated/compile.h CC init/version.o LD init/built-in.o KSYM .tmp_kallsyms1.o KSYM .tmp_kallsyms2.o LD vmlinux SORTEX vmlinux SYSMAP System.map ``` and `vmlinux` and `System.map` in the root of the Linux kernel source tree: ``` $ ls vmlinux System.map System.map vmlinux ``` That's all, `vmlinux` is ready. The next step is creation of the [bzImage](https://en.wikipedia.org/wiki/Vmlinux#bzImage). Building bzImage -------------------------------------------------------------------------------- The `bzImage` file is the compressed Linux kernel image. We can get it by executing `make bzImage` after `vmlinux` is built. That, or we can just execute `make` without any argument and we will get `bzImage` anyway because it is default image: ```Makefile all: bzImage ``` in the [arch/x86/kernel/Makefile](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/Makefile). Let's look on this target, it will help us to understand how this image builds. As I already said the `bzImage` target defined in the [arch/x86/kernel/Makefile](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/Makefile) and looks like this: ```Makefile bzImage: vmlinux $(Q)$(MAKE) $(build)=$(boot) $(KBUILD_IMAGE) $(Q)mkdir -p $(objtree)/arch/$(UTS_MACHINE)/boot $(Q)ln -fsn ../../x86/boot/bzImage $(objtree)/arch/$(UTS_MACHINE)/boot/$@ ``` We can see here, that first of all called `make` for the boot directory, in our case it is: ```Makefile boot := arch/x86/boot ``` The main goal now is to build the source code in the `arch/x86/boot` and `arch/x86/boot/compressed` directories, build `setup.bin` and `vmlinux.bin`, and build the `bzImage` from them in the end. First target in the [arch/x86/boot/Makefile](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/boot/Makefile) is the `$(obj)/setup.elf`: ```Makefile $(obj)/setup.elf: $(src)/setup.ld $(SETUP_OBJS) FORCE $(call if_changed,ld) ``` We already have the `setup.ld` linker script in the `arch/x86/boot` directory and the `SETUP_OBJS` variable that expands to the all source files from the `boot` directory. We can see first output: ```Makefile AS arch/x86/boot/bioscall.o CC arch/x86/boot/cmdline.o AS arch/x86/boot/copy.o HOSTCC arch/x86/boot/mkcpustr CPUSTR arch/x86/boot/cpustr.h CC arch/x86/boot/cpu.o CC arch/x86/boot/cpuflags.o CC arch/x86/boot/cpucheck.o CC arch/x86/boot/early_serial_console.o CC arch/x86/boot/edd.o ``` The next source file is [arch/x86/boot/header.S](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/boot/header.S), but we can't build it now because this target depends on the following two header files: ```Makefile $(obj)/header.o: $(obj)/voffset.h $(obj)/zoffset.h ``` The first is `voffset.h` generated by the `sed` script that gets two addresses from the `vmlinux` with the `nm` util: ```C #define VO__end 0xffffffff82ab0000 #define VO__text 0xffffffff81000000 ``` They are the start and the end of the kernel. The second is `zoffset.h` depens on the `vmlinux` target from the [arch/x86/boot/compressed/Makefile](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/boot/compressed/Makefile): ```Makefile $(obj)/zoffset.h: $(obj)/compressed/vmlinux FORCE $(call if_changed,zoffset) ``` The `$(obj)/compressed/vmlinux` target depends on the `vmlinux-objs-y` that compiles source code files from the [arch/x86/boot/compressed](https://github.com/torvalds/linux/tree/master/arch/x86/boot/compressed) directory and generates `vmlinux.bin`, `vmlinux.bin.bz2`, and compiles program - `mkpiggy`. We can see this in the output: ```Makefile LDS arch/x86/boot/compressed/vmlinux.lds AS arch/x86/boot/compressed/head_64.o CC arch/x86/boot/compressed/misc.o CC arch/x86/boot/compressed/string.o CC arch/x86/boot/compressed/cmdline.o OBJCOPY arch/x86/boot/compressed/vmlinux.bin BZIP2 arch/x86/boot/compressed/vmlinux.bin.bz2 HOSTCC arch/x86/boot/compressed/mkpiggy ``` Where `vmlinux.bin` is the `vmlinux` file with debugging information and comments stripped and the `vmlinux.bin.bz2` compressed `vmlinux.bin.all` + `u32` size of `vmlinux.bin.all`. The `vmlinux.bin.all` is `vmlinux.bin + vmlinux.relocs`, where `vmlinux.relocs` is the `vmlinux` that was handled by the `relocs` program (see above). As we got these files, the `piggy.S` assembly files will be generated with the `mkpiggy` program and compiled: ```Makefile MKPIGGY arch/x86/boot/compressed/piggy.S AS arch/x86/boot/compressed/piggy.o ``` This assembly files will contain the computed offset from the compressed kernel. After this we can see that `zoffset` generated: ```Makefile ZOFFSET arch/x86/boot/zoffset.h ``` As the `zoffset.h` and the `voffset.h` are generated, compilation of the source code files from the [arch/x86/boot](https://github.com/torvalds/linux/tree/master/arch/x86/boot/) can be continued: ```Makefile AS arch/x86/boot/header.o CC arch/x86/boot/main.o CC arch/x86/boot/mca.o CC arch/x86/boot/memory.o CC arch/x86/boot/pm.o AS arch/x86/boot/pmjump.o CC arch/x86/boot/printf.o CC arch/x86/boot/regs.o CC arch/x86/boot/string.o CC arch/x86/boot/tty.o CC arch/x86/boot/video.o CC arch/x86/boot/video-mode.o CC arch/x86/boot/video-vga.o CC arch/x86/boot/video-vesa.o CC arch/x86/boot/video-bios.o ``` As all source code files will be compiled, they will be linked to the `setup.elf`: ```Makefile LD arch/x86/boot/setup.elf ``` or: ``` ld -m elf_x86_64 -T arch/x86/boot/setup.ld arch/x86/boot/a20.o arch/x86/boot/bioscall.o arch/x86/boot/cmdline.o arch/x86/boot/copy.o arch/x86/boot/cpu.o arch/x86/boot/cpuflags.o arch/x86/boot/cpucheck.o arch/x86/boot/early_serial_console.o arch/x86/boot/edd.o arch/x86/boot/header.o arch/x86/boot/main.o arch/x86/boot/mca.o arch/x86/boot/memory.o arch/x86/boot/pm.o arch/x86/boot/pmjump.o arch/x86/boot/printf.o arch/x86/boot/regs.o arch/x86/boot/string.o arch/x86/boot/tty.o arch/x86/boot/video.o arch/x86/boot/video-mode.o arch/x86/boot/version.o arch/x86/boot/video-vga.o arch/x86/boot/video-vesa.o arch/x86/boot/video-bios.o -o arch/x86/boot/setup.elf ``` The last two things is the creation of the `setup.bin` that will contain compiled code from the `arch/x86/boot/*` directory: ``` objcopy -O binary arch/x86/boot/setup.elf arch/x86/boot/setup.bin ``` and the creation of the `vmlinux.bin` from the `vmlinux`: ``` objcopy -O binary -R .note -R .comment -S arch/x86/boot/compressed/vmlinux arch/x86/boot/vmlinux.bin ``` In the end we compile host program: [arch/x86/boot/tools/build.c](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/boot/tools/build.c) that will create our `bzImage` from the `setup.bin` and the `vmlinux.bin`: ``` arch/x86/boot/tools/build arch/x86/boot/setup.bin arch/x86/boot/vmlinux.bin arch/x86/boot/zoffset.h arch/x86/boot/bzImage ``` Actually the `bzImage` is the concatenated `setup.bin` and the `vmlinux.bin`. In the end we will see the output which is familiar to all who once built the Linux kernel from source: ``` Setup is 16268 bytes (padded to 16384 bytes). System is 4704 kB CRC 94a88f9a Kernel: arch/x86/boot/bzImage is ready (#5) ``` That's all. Conclusion ================================================================================ It is the end of this part and here we saw all steps from the execution of the `make` command to the generation of the `bzImage`. I know, the Linux kernel makefiles and process of the Linux kernel building may seem confusing at first glance, but it is not so hard. Hope this part will help you understand the process of building the Linux kernel. Links ================================================================================ * [GNU make util](https://en.wikipedia.org/wiki/Make_%28software%29) * [Linux kernel top Makefile](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/Makefile) * [cross-compilation](https://en.wikipedia.org/wiki/Cross_compiler) * [Ctags](https://en.wikipedia.org/wiki/Ctags) * [sparse](https://en.wikipedia.org/wiki/Sparse) * [bzImage](https://en.wikipedia.org/wiki/Vmlinux#bzImage) * [uname](https://en.wikipedia.org/wiki/Uname) * [shell](https://en.wikipedia.org/wiki/Shell_%28computing%29) * [Kbuild](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/Documentation/kbuild/kbuild.txt) * [binutils](http://www.gnu.org/software/binutils/) * [gcc](https://gcc.gnu.org/) * [Documentation](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/Documentation/kbuild/makefiles.txt) * [System.map](https://en.wikipedia.org/wiki/System.map) * [Relocation](https://en.wikipedia.org/wiki/Relocation_%28computing%29)