ELF stands for the Executable and Linkable Format. It was originally developed by Unix System Laboratories as part of the System V ABI, and is now in widespread use in many other operating systems, such as Linux and FreeBSD.
There are two views of an ELF file. The section view sees the file as a bunch of sections, which are to be linked or loaded in some manner. The program view sees the file as a bunch of ELF segments (not to be confused with Intel segments) which are to be loaded into memory in order to execute the program.
This split is designed to allow someone writing a linker to easily get the information they need (using the section view) and someone writing a loader (that's you) to easily get the information they need (using the program view) without worrying about a lot of the complications of linking.
Because you are writing a loader, not a linker, you can completely ignore the section view. You only care about the program view. This throws away around 80% of the ELF spec. Doesn't that make you feel good?
The first thing you need to find is the ELF header. This header is pretty easy to find, since it will be at location zero of the ELF file you're attempting to load.
The ELF header is exposed in elf.h as the Elf32_Ehdr structure. The first 16 bytes of the structure are used to identify the ELF file. You should check that the first four of these 16 bytes correspond to the values ELFMAG0, ELFMAG1, ELFMAG2 and ELFMAG3. These bytes are the ELF "magic number". Checking them lets you make sure that the file you were given is a genuine ELF file.
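As a minimal sketch of the magic number check, assuming the file has already been read into a byte buffer: the function name has_elf_magic and the buffer arguments are hypothetical, and the four constants are defined here only as stand-ins in case elf.h has not been included.

```c
#include <stddef.h>

/* Stand-ins for the constants normally supplied by elf.h. */
#ifndef ELFMAG0
#define ELFMAG0 0x7f
#define ELFMAG1 'E'
#define ELFMAG2 'L'
#define ELFMAG3 'F'
#endif

/* Hypothetical helper: returns 1 if the image begins with the ELF magic
 * number, 0 otherwise. image points at byte zero of the file. */
int has_elf_magic(const unsigned char *image, size_t size)
{
    /* The identification block at the start of the header is 16 bytes. */
    if (image == NULL || size < 16)
        return 0;

    return image[0] == ELFMAG0 && image[1] == ELFMAG1 &&
           image[2] == ELFMAG2 && image[3] == ELFMAG3;
}
```

A loader would typically call this first and refuse to go any further if it returns 0.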
You can more or less ignore all the other entries in the header (if you want, you can include other sanity checks, like making sure the machine type is correct) except for the e_phoff entry. This entry gives the location in the file (in bytes) of the program headers. It's the program headers that we're going to use to load the program into memory. The e_phnum entry is also important, as it tells you how many program headers are present.
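The following sketch shows one way e_phoff and e_phnum might be used to fetch the program header at a given index. The two structures are local stand-ins that follow the standard ELF32 layout (in the real loader they would come from elf.h instead), read_program_header is a hypothetical name, and the code assumes the program headers are packed back to back with a stride of sizeof(Elf32_Phdr).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins following the standard ELF32 layout. */
typedef struct {
    unsigned char e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version, e_entry, e_phoff, e_shoff, e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum;
    uint16_t e_shentsize, e_shnum, e_shstrndx;
} Elf32_Ehdr;

typedef struct {
    uint32_t p_type, p_offset, p_vaddr, p_paddr;
    uint32_t p_filesz, p_memsz, p_flags, p_align;
} Elf32_Phdr;

/* Hypothetical helper: copies program header number `index` into *out.
 * Returns 1 on success, 0 if the index or offsets fall outside the image. */
int read_program_header(const unsigned char *image, size_t size,
                        unsigned index, Elf32_Phdr *out)
{
    Elf32_Ehdr ehdr;
    uint64_t offset;

    if (size < sizeof ehdr)
        return 0;
    memcpy(&ehdr, image, sizeof ehdr);   /* the ELF header sits at offset 0 */

    if (index >= ehdr.e_phnum)
        return 0;

    /* e_phoff is a byte offset from the start of the file. */
    offset = (uint64_t)ehdr.e_phoff + (uint64_t)index * sizeof *out;
    if (offset + sizeof *out > size)
        return 0;

    memcpy(out, image + offset, sizeof *out);
    return 1;
}
```

Copying with memcpy rather than casting the buffer pointer avoids relying on the buffer being suitably aligned for the structure.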
The linker script that we provide for user programs (programs which your kernel loads) generates exactly two program headers. One contains the code for the program (coming from the .text section) and the other contains the program's static data. Both program headers will have a segment type of PT_LOAD (the other segment types are not needed to load these programs). The p_offset member tells you where the segment's data is located in the file. There are two sizes in each program header. p_filesz specifies how many bytes of data are in the file that need to be copied to memory. p_memsz specifies how much memory you should allocate to the segment. Note that p_memsz is always greater than or equal to p_filesz. For the segment containing the program's code, these values will be equal. Because programs may contain data that is zero-initialized (and hence doesn't need to take up any space in the file), p_memsz may be greater than p_filesz for the segment containing the program data. In that case you should copy p_filesz bytes to memory, and set the rest to zero.
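The copy-then-zero step can be sketched as follows. This is only an illustration under some assumptions: load_segment is a hypothetical name, dest is a destination the caller has already chosen with room for p_memsz bytes, and memcpy and memset stand in for whatever equivalents your kernel provides.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies one segment out of the ELF image.
 * The three p_* arguments are the fields of the segment's program header.
 * Returns 1 on success, 0 if the header values are inconsistent. */
int load_segment(unsigned char *dest,
                 const unsigned char *image, size_t image_size,
                 uint32_t p_offset, uint32_t p_filesz, uint32_t p_memsz)
{
    /* p_memsz must never be smaller than p_filesz. */
    if (p_filesz > p_memsz)
        return 0;

    /* The bytes to copy must lie entirely inside the file image. */
    if (p_offset > image_size || p_filesz > image_size - p_offset)
        return 0;

    /* Copy the bytes that are actually stored in the file... */
    memcpy(dest, image + p_offset, p_filesz);

    /* ...and zero the remainder, which holds zero-initialized data. */
    memset(dest + p_filesz, 0, p_memsz - p_filesz);
    return 1;
}
```

For the code segment the two sizes are equal, so the memset call clears zero bytes and the same function can handle both segments.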
Now you only need to know which program header corresponds to code, and which corresponds to data. To do this, look at the p_flags field. For the code segment this will be PF_R+PF_X (indicating a read-execute segment). For the data segment this will be PF_R+PF_W, indicating a read-write segment.
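A small sketch of the flag test, assuming exactly the two flag combinations described above. The enum and the function name classify_segment are hypothetical, and the PF_* values are defined only as stand-ins in case elf.h has not been included.

```c
#include <stdint.h>

/* Stand-ins for the flag bits normally supplied by elf.h. */
#ifndef PF_X
#define PF_X 0x1  /* execute */
#define PF_W 0x2  /* write */
#define PF_R 0x4  /* read */
#endif

/* Hypothetical classification of a program header by its flags. */
enum segment_kind { SEGMENT_CODE, SEGMENT_DATA, SEGMENT_UNKNOWN };

enum segment_kind classify_segment(uint32_t p_flags)
{
    if (p_flags == (PF_R | PF_X))
        return SEGMENT_CODE;     /* read-execute: the program's code */
    if (p_flags == (PF_R | PF_W))
        return SEGMENT_DATA;     /* read-write: the program's static data */
    return SEGMENT_UNKNOWN;      /* not produced by the provided linker script */
}
```

Treating any other combination as unknown gives the loader a chance to reject a file that was not built with the expected linker script.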
This should be all you need to get the information you want from the ELF file. I haven't talked about exactly what you do with this information, as this was covered in class.
/u/cs452/i586-3.3.3/include/cs452/elf.h contains a header file defining all the ELF structures.
/u/cs452/i586-3.3.3/examples/useful/loader.app.x contains the user program linker script.