ELF is a common standard file format for executable files, object code, shared libraries, and core dumps:
- Not tied to a particular CPU or operating system
- Chosen as the standard binary format for UNIX and UNIX-like systems on x86 and x86-64
a.out
- Short for "assembler output"; historically the default format for executable files.
The name persists in the UNIX world as the default output file name of compilers and linkers, although the format itself has changed to ELF (Executable and Linkable Format).
The ELF header is the first thing in the file:
- specifies whether the file uses the 32-bit or 64-bit format
- identifies all low-level interfaces (byte order, file type, machine type, instruction set architecture, etc.)
Let's look at the header of the executable built from ex1.c.
gcc -o ex1 examples/ex1.c
hexdump -C ex1 | head -n 2
# Output
00000000 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|
00000010 03 00 3e 00 01 00 00 00 40 10 00 00 00 00 00 00 |..>.....@.......|
Understand the ELF Header:
Bytes | Description | Value | Notes |
---|---|---|---|
1-4 | MAGIC NUMBER | 7f 45 4c 46 | magic number (0x7f followed by "ELF") - present in all ELF files, marks the file as ELF |
5 | ELF CLASS | 02 | 64-bit format (01 would mean 32-bit) |
6 | DATA ENCODING | 01 | little-endian byte order |
7 | ELF VERSION | 01 | ELF version |
8 | OS ABI | 00 | System V ABI |
9 | ABI VERSION | 00 | ABI version |
10-16 | PADDING | 00 00 00 00 00 00 00 | padding (unused) |
17-18 | OBJECT FILE TYPE | 03 00 | shared object (ET_DYN) - the type used for position-independent executables |
19-20 | ARCHITECTURE | 3e 00 | x86-64 (AMD64) |
21-24 | ELF VERSION | 01 00 00 00 | ELF version (again) |
25-32 | ENTRY POINT | 40 10 00 00 00 00 00 00 | entry point address - where the program starts executing |
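As a cross-check, the same fields can be decoded programmatically. The following is a minimal Python sketch, assuming a 64-bit ELF file; the function name `decode_header_start` and the lookup dictionaries are illustrative and not part of any library. It decodes the 32 bytes shown in the hexdump output.

```python
import struct

# First 32 bytes of ex1, copied from the two hexdump lines.
HEADER_START = bytes.fromhex(
    "7f454c46020101000000000000000000"
    "03003e00010000004010000000000000"
)

ELF_CLASSES = {1: "ELF32", 2: "ELF64"}
ELF_DATA = {1: "little-endian", 2: "big-endian"}
ELF_TYPES = {1: "ET_REL", 2: "ET_EXEC", 3: "ET_DYN", 4: "ET_CORE"}


def decode_header_start(data: bytes) -> dict:
    """Decode e_ident plus e_type, e_machine, e_version and e_entry (ELF64 only)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2:
        raise ValueError("this sketch only handles the 64-bit layout")
    # The data-encoding byte selects the byte order of every multi-byte field.
    order = "<" if data[5] == 1 else ">"
    e_type, e_machine, e_version, e_entry = struct.unpack_from(order + "HHIQ", data, 16)
    return {
        "class": ELF_CLASSES[data[4]],
        "data": ELF_DATA[data[5]],
        "type": ELF_TYPES.get(e_type, hex(e_type)),
        "machine": e_machine,
        "version": e_version,
        "entry": e_entry,
    }


info = decode_header_start(HEADER_START)
print(info["class"], info["data"], info["type"], hex(info["machine"]), hex(info["entry"]))
```

For the bytes above this reports a 64-bit, little-endian file of type ET_DYN for machine 0x3e with entry point 0x1040.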
The entry point is stored in little-endian byte order as 40 10 00 00 00 00 00 00, so its value is 0x0000000000001040, i.e. 0x1040. It is the address where the program starts executing.
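To make the byte-order point concrete, here is a small Python sketch that interprets the eight bytes at the end of the second hexdump line both ways. The variable names are illustrative only.

```python
# The last eight bytes of the second hexdump line (the entry point field).
entry_bytes = bytes.fromhex("4010000000000000")

# Little-endian: the least significant byte comes first in the file.
little = int.from_bytes(entry_bytes, "little")
# Big-endian: the same bytes read with the most significant byte first.
big = int.from_bytes(entry_bytes, "big")

print(hex(little))  # 0x1040
print(hex(big))     # 0x4010000000000000
```

Only the little-endian reading matches the byte order declared in the header, which is why the entry point is 0x1040.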
It can be checked with `readelf` or `objdump`:
objdump -D -M intel ex1 | grep "<_start>"
Calling `readelf` confirms our findings:
readelf -h ex1
Output:
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: DYN (Position-Independent Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x1040
Start of program headers: 64 (bytes into file)
Start of section headers: 13496 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 13
Size of section headers: 64 (bytes)
Number of section headers: 30
Section header string table index: 29
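The table offsets, entry sizes, and counts reported by `readelf` come from the second half of the 64-byte header. The sketch below, assuming a little-endian ELF64 file, reads those fields and computes where each table starts and ends in the file. The names `table_layout` and `read_table_layout` are hypothetical helpers.

```python
import struct

# Second half of an ELF64 header (file offsets 32..63):
# e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum,
# e_shentsize, e_shnum, e_shstrndx
TABLE_FIELDS = struct.Struct("<QQIHHHHHH")


def table_layout(header: bytes) -> dict:
    """Return (start, end) file offsets of both header tables; end is exclusive."""
    (e_phoff, e_shoff, _e_flags, _e_ehsize,
     e_phentsize, e_phnum,
     e_shentsize, e_shnum, e_shstrndx) = TABLE_FIELDS.unpack_from(header, 32)
    return {
        "program_headers": (e_phoff, e_phoff + e_phentsize * e_phnum),
        "section_headers": (e_shoff, e_shoff + e_shentsize * e_shnum),
        "shstrndx": e_shstrndx,
    }


def read_table_layout(path: str) -> dict:
    """Read the first 64 bytes of a file and compute the table layout."""
    with open(path, "rb") as f:
        return table_layout(f.read(64))
```

With the values from the `readelf` output (13 program headers of 56 bytes starting at offset 64, and 30 section headers of 64 bytes starting at offset 13496), the program header table spans file offsets 64 to 792 and the section header table spans 13496 to 15416. Calling `read_table_layout("ex1")` should report the same ranges for that binary.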
The Program Header Table is a list of the segments in the file; it tells the system how to create a process image.
The Section Header Table is a list of the sections in the file; it tells the linker how to link the file.
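Each entry in the Program Header Table has a fixed layout. As an illustration, the following Python sketch decodes 64-bit little-endian entries; the 56-byte size matches the "Size of program headers" value reported by `readelf`. The `ProgramHeader` tuple and helper functions are hypothetical names introduced for this example.

```python
import struct
from typing import List, NamedTuple

PT_LOAD = 1                  # segment type: loadable segment
PF_X, PF_W, PF_R = 1, 2, 4   # permission flag bits


class ProgramHeader(NamedTuple):
    p_type: int    # kind of segment (PT_LOAD, ...)
    p_flags: int   # permission bits
    p_offset: int  # where the segment starts in the file
    p_vaddr: int   # virtual address of the segment in memory
    p_paddr: int   # physical address (mostly unused)
    p_filesz: int  # bytes occupied in the file
    p_memsz: int   # bytes occupied in memory
    p_align: int   # required alignment


PHDR = struct.Struct("<IIQQQQQQ")  # 4 + 4 + 6 * 8 = 56 bytes


def parse_program_headers(table: bytes, count: int) -> List[ProgramHeader]:
    """Decode `count` consecutive entries from the raw table bytes."""
    return [ProgramHeader(*PHDR.unpack_from(table, i * PHDR.size)) for i in range(count)]


def describe_flags(flags: int) -> str:
    """Render permission bits in the familiar rwx form."""
    return "".join(ch if flags & bit else "-"
                   for ch, bit in (("r", PF_R), ("w", PF_W), ("x", PF_X)))
```

The raw table bytes can be obtained by reading the number of entries times the entry size from the file, starting at the "Start of program headers" offset.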
After a call to `execvp()` (a library wrapper around the `execve` system call), the kernel does the following:
- Load the image of the executable
  - Use `namei()` to find the file's inode
  - Use `readi()` to read the file into memory
- Read file headers:
  - ELF Header
  - Program Header Table
  - Section Header Table
- Load the program into memory
- Position pointers:
  - Set up the base address for the program
  - Set up the stack
- Jump to the entry point
- Execute the program
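The "load the program" and "set up the base address" steps can be modelled in a few lines. This is a simplified user-space sketch, not kernel code: it assumes a 4096-byte page size, takes the loadable segments as plain `(p_vaddr, p_memsz)` pairs, and ignores stack setup and everything else a real loader does. The function name `plan_process_image` and the example segment values are made up for illustration.

```python
PAGE_SIZE = 4096  # assumed page size
ET_DYN = 3        # object file type of position-independent executables


def page_floor(addr: int) -> int:
    """Round an address down to the start of its page."""
    return addr & ~(PAGE_SIZE - 1)


def page_ceil(addr: int) -> int:
    """Round an address up to the next page boundary."""
    return (addr + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)


def plan_process_image(e_type, e_entry, load_segments, load_bias):
    """Compute page-aligned mappings and the address to jump to.

    load_segments: list of (p_vaddr, p_memsz) pairs for loadable segments.
    load_bias: page-aligned base address chosen for a position-independent file.
    """
    # Only position-independent files are shifted by a base address.
    bias = load_bias if e_type == ET_DYN else 0
    mappings = []
    for vaddr, memsz in load_segments:
        start = bias + page_floor(vaddr)
        end = bias + page_ceil(vaddr + memsz)
        mappings.append((start, end))
    return {"mappings": mappings, "entry": bias + e_entry}


# Hypothetical segments and base address; only e_entry = 0x1040 comes from ex1.
plan = plan_process_image(ET_DYN, 0x1040, [(0x0, 0x600), (0x1000, 0x1E5)], 0x555555554000)
print([(hex(a), hex(b)) for a, b in plan["mappings"]], hex(plan["entry"]))
```

For a position-independent file the entry point in the header is relative to the chosen base, so the address the process actually jumps to is the base plus 0x1040.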
Assume we have a program `ex2.c` that uses a shared library built from `ex2lib.c`. First we compile the shared library:
gcc -shared -fPIC -o examples/ex2lib.so examples/ex2lib.c
Then we compile the program:
gcc -o ex2 examples/ex2.c -L./examples/ ./examples/ex2lib.so
Or compile the program as an object file and link it with the shared library:
gcc -c -o ex2.o examples/ex2.c
gcc -o ex2 ex2.o -L./examples/ ./examples/ex2lib.so