
Remove the duplicate paragraph in the PDF file

pull/102/head
Tu Do 3 years ago
parent
commit
d162884ef0
1 changed files with 151 additions and 157 deletions
Operating_Systems_From_0_to_1.pdf

@@ -11185,73 +11185,68 @@ source-level debugger without bother looking at the assembly code from
the split layout. As a consequence, the true cause of the non-working code
could never have been discovered.

8.5.2 Debugging the memory layout

What is the reason for the incorrect assembly code in main displayed
by gdb? There can only be one cause: the bootloader jumped to the wrong
address. But why was the address wrong? We placed the .text section
at address 0x500, with the code of main starting at its first byte,
and instructed the bootloader to retrieve the entry address at offset 0x18,
then jump to it.

It might then seem that the bootloader loaded the operating system
at the wrong address. But we explicitly set the load
address to 50h:00, which is 0x500, so the correct address was used.

[Figure 8.5.1: Memory state after loading 2nd sector. Labels: 0x0, ELF
header, 0x500, .text, Loaded content, 0xFFFFFFFF, Memory.]

After the bootloader loads the 2nd sector, the in-memory state should
look like the figure 8.5.1.
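
The two numbers in the paragraphs above can be checked with a short sketch. It assumes a 32-bit little-endian ELF file, where the entry address is the 4-byte field at offset 0x18 of the header. The helper names and the synthetic 52-byte header are ours, made up for illustration; they are not part of the book's bootloader.

```python
import struct


def linear_address(segment: int, offset: int) -> int:
    """Real-mode translation: segment * 16 + offset."""
    return (segment << 4) + offset


def read_elf32_entry(image: bytes) -> int:
    """Return the entry address stored at offset 0x18 of an ELF32 header."""
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    (entry,) = struct.unpack_from("<I", image, 0x18)
    return entry


# Hypothetical header: only the magic number and the entry field are set.
header = bytearray(52)
header[:4] = b"\x7fELF"
struct.pack_into("<I", header, 0x18, 0x500)

print(hex(linear_address(0x50, 0x00)))       # where the bootloader loads the file
print(hex(read_elf32_entry(bytes(header))))  # where the bootloader jumps
```

Both values come out as 0x500, so the load address and the jump target are each what we asked for. The fault has to lie elsewhere.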
linking and loading on bare metal 263

Here is the problem: 0x500 is the start of the ELF header. The bootloader
actually loads the 2nd sector, which stores the executable as a whole,
to 0x500. Clearly, the .text section, where main resides, is far from 0x500.
Since the executable binary is loaded into memory at 0x500, .text
should be at 0x500 + 0x500 = 0xa00. However, the entry address recorded
in the ELF header remains 0x500 and, as a result, the bootloader jumped
there instead of 0xa00. This is one of the issues that must be fixed.

The other issue is the mapping between the debug info and the memory
address. The debug info is compiled with the assumption that the .text
section starts at 0x500, but the actual loading pushes it another 0x500
bytes, so the section actually sits at 0xa00.
This memory mismatch renders the debug info useless.
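
The arithmetic behind both issues fits in a few lines. This is only a sketch: the constant names are ours, and the values are the ones quoted in the text.

```python
LOAD_BASE = 0x500         # where the bootloader copies the whole file (50h:00)
TEXT_FILE_OFFSET = 0x500  # distance from the start of the file to .text
TEXT_VIRT_ADDR = 0x500    # address the linker and the debug info assume


def runtime_address(load_base: int, file_offset: int) -> int:
    """Address of a file byte once the file is copied to load_base."""
    return load_base + file_offset


actual = runtime_address(LOAD_BASE, TEXT_FILE_OFFSET)
mismatch = actual - TEXT_VIRT_ADDR

print(hex(actual))    # where main really is
print(hex(mismatch))  # how far off the entry address and debug info are
```

The mismatch is the same 0x500 bytes for the entry address and for the debug info, which is why one change to the layout can fix both.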
[Figure 8.5.2: Wrong symbol-memory mappings in debug info. Labels: 0x0,
ELF header, 0x500, .text, Debug Info, Loaded content, "Debug info is
supposed to be here", 0xFFFFFFFF, Memory.]

In summary, we have 2 problems to overcome:

• Fix the entry address to account for the extra offset when loading into
memory.

• Fix the debug info to account for the extra offset when loading into
memory.
First, we need to know the actual layout of the compiled executable binary:
264 operating systems: from 0 to 1

$ readelf -l build/os/os

@@ -11263,18 +11258,18 @@ Output Elf file type is EXEC (Executable file)

Program Headers:

  Type  Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR  0x000000 0x00000000 0x00000000 0x00074 0x00074 R   0x4
  LOAD  0x000500 0x00000500 0x00000500 0x00040 0x00040 R E 0x1000

Section to Segment mapping:

Segment Sections...

00

01 .text

Notice the Offset and the VirtAddr fields: both have the same
@@ -11310,7 +11305,6 @@ Output Elf file type is EXEC (Executable file)
Segment Sections...

00

01 .text

@@ -11332,17 +11326,16 @@ Output Elf file type is EXEC (Executable file)
LOAD 0x001073 0x00001073 0x00001073 0x00006 0x00006 R E 0x1000

Section to Segment mapping:

Segment Sections...

00

01 .text

The key to explaining this phenomenon is the Align field. The value
0x1000 indicates that the offset address of the segment should be divisible
by 0x1000, or, if the distance between segments is divisible by 0x1000,
the linker removes that distance to reduce the binary size. We can do some
experiments to verify this claim (note 19: all the outputs are produced by
the command $ readelf -l build/os/os):

• By setting the virtual address of .text to 0x0 to 0x73 (in os.lds),
@@ -11357,48 +11350,47 @@ Output Elf file type is EXEC (Executable file)

Program Headers:

  Type  Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR  0x000000 0x00000000 0x00000000 0x00074 0x00074 R   0x4
  LOAD  0x001000 0x00000000 0x00000000 0x00006 0x00006 R E 0x1000

Section to Segment mapping:

Segment Sections...

00

01 .text

By default, if we do not specify any virtual address, the offset stays at
0x1000 because 0x1000 is the perfect offset to satisfy the alignment
constraint. Any addition from 0x1 to 0x73 makes the segment misaligned,
but the linker keeps it anyway because it is told so.

• By setting the virtual address of .text to 0x74 (in os.lds):

Output Elf file type is EXEC (Executable file)

Entry point 0x74

There are 2 program headers, starting at offset 52
Program Headers:

  Type  Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR  0x000000 0x00000000 0x00000000 0x00074 0x00074 R   0x4
  LOAD  0x000074 0x00000074 0x00000074 0x00006 0x00006 R E 0x1000

Section to Segment mapping:

Segment Sections...

00

01 .text

PHDR is 0x74 bytes in size, so if LOAD starts at 0x1074, the distance
between the PHDR segment and LOAD segment is 0x1074 − 0x74 = 0x1000
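
The LOAD rows in the outputs above share one invariant, which the ELF format requires of loadable segments: Offset and VirtAddr are equal modulo Align. The sketch below checks that invariant on the four rows shown. The second helper, smallest_offset, is our own simplified model and not a description of how ld actually works; it just picks the first file offset past the 0x74 bytes of headers that satisfies the invariant.

```python
# (Offset, VirtAddr, Align) of the LOAD segment in each readelf output above.
LOAD_SEGMENTS = [
    (0x000500, 0x00000500, 0x1000),
    (0x001073, 0x00001073, 0x1000),
    (0x001000, 0x00000000, 0x1000),
    (0x000074, 0x00000074, 0x1000),
]


def is_congruent(offset: int, vaddr: int, align: int) -> bool:
    """True when Offset and VirtAddr leave the same remainder modulo Align."""
    return offset % align == vaddr % align


def smallest_offset(vaddr: int, align: int, headers_end: int) -> int:
    """Simplified model: first offset >= headers_end congruent to vaddr."""
    candidate = (headers_end // align) * align + vaddr % align
    if candidate < headers_end:
        candidate += align
    return candidate


for offset, vaddr, align in LOAD_SEGMENTS:
    model = smallest_offset(vaddr, align, headers_end=0x74)
    print(hex(vaddr), hex(offset), is_congruent(offset, vaddr, align), hex(model))
```

Under this model, a virtual address of 0x0 cannot use offset 0x0 because the headers occupy it, so the segment lands at 0x1000, while 0x74 fits right after the headers. That agrees with the experiments.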
@@ -11416,7 +11408,6 @@ Now we get a hint how to control the values of Offset and VirtAddr to
produce a desired binary layout. What we need is to change the Align
field to a smaller value for finer-grained control. It might work
out with a binary layout like this:

Output Elf file type is EXEC (Executable file)

@@ -11439,42 +11430,44 @@ Output Elf file type is EXEC (Executable file)
00

01 .text

The binary will look like figure 8.5.3 in memory:

[Figure 8.5.3: A good binary layout. Labels: 0x0, ELF header, 0x100,
.text, 0x500, Loaded content, 0x600, Debug Info, 0xFFFFFFFF, Memory.]

If we set the Offset field to 0x100 from the beginning of the file and
the VirtAddr to 0x600, then when the file is loaded into memory, the actual
address of .text is 0x500 + 0x100 = 0x600; 0x500 is the memory location
where the bootloader loads the file into physical memory and 0x100 is the
offset from the start of the ELF header to .text. The entry address and the
debug info will then take the value 0x600 from the VirtAddr field above,
which exactly matches the actual physical layout. We can do it by changing
os.lds as follows:

main.lds

ENTRY(main);

PHDRS
{
    headers PT_PHDR FILEHDR PHDRS;
    code PT_LOAD;
}

SECTIONS
{
    .text 0x600: ALIGN(0x100) { *(.text) } :code
    .data : { *(.data) }
    .bss : { *(.bss) }
    /DISCARD/ : { *(.eh_frame) }
@@ -11501,7 +11494,6 @@ Output -n
$(OS): $(OS_OBJS)

ld -m elf_i386 -nmagic -Tos.lds $(OS_OBJS) -o $@
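
After linking, the intended property of the layout can be checked mechanically: for every LOAD segment, the load address plus the file offset should equal the virtual address. The sketch below reads the program headers of a 32-bit little-endian ELF image to test this. The function names and the fake_elf helper, which builds a minimal synthetic image for demonstration, are ours and not part of the book's build.

```python
import struct

PT_LOAD = 1
LOAD_BASE = 0x500  # where the bootloader copies the file (50h:00)


def load_segments(image: bytes):
    """Return (Offset, VirtAddr) for each LOAD entry of an ELF32 image."""
    (e_phoff,) = struct.unpack_from("<I", image, 0x1C)
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 0x2A)
    segments = []
    for index in range(e_phnum):
        start = e_phoff + index * e_phentsize
        p_type, p_offset, p_vaddr = struct.unpack_from("<III", image, start)
        if p_type == PT_LOAD:
            segments.append((p_offset, p_vaddr))
    return segments


def layout_matches(image: bytes, load_base: int = LOAD_BASE) -> bool:
    """True when every LOAD segment ends up at its own virtual address."""
    return all(load_base + off == vaddr for off, vaddr in load_segments(image))


def fake_elf(p_offset: int, p_vaddr: int) -> bytes:
    """Hypothetical image: ELF32 header plus one LOAD program header."""
    image = bytearray(52 + 32)
    image[:4] = b"\x7fELF"
    struct.pack_into("<I", image, 0x1C, 52)      # program headers follow the header
    struct.pack_into("<HH", image, 0x2A, 32, 1)  # entry size, entry count
    struct.pack_into("<III", image, 52, PT_LOAD, p_offset, p_vaddr)
    return bytes(image)


print(layout_matches(fake_elf(0x500, 0x500)))  # the old layout
print(layout_matches(fake_elf(0x100, 0x600)))  # the layout we want
```

To check the real binary instead of the synthetic one, read build/os/os in binary mode and pass its contents to layout_matches.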

Finally, we also need to update the top-level Makefile to write more
than one sector into the disk image for the operating system binary, as
@@ -11513,6 +11505,8 @@ Output -n
We update the rule so that the sectors are automatically calculated:

os/Makefile

..... above content omitted ....
bootdisk: bootloader os

@@ -11549,19 +11543,17 @@ Output Elf file type is EXEC (Executable file)
00

01 .text

8.5.3 Testing the new binary
First, we start the QEMU machine:

$ make qemu

In another terminal, we start gdb, load the debug info, and set a
breakpoint at main:

$ gdb

The following output should be produced:

Output ---Type <return> to continue, or q <return> to quit---
[f000:fff0] 0x0000fff0 in ?? ()
@@ -11577,70 +11569,74 @@ Output ---Type <return> to continue, or q <return> to quit---

Output

main.c

B+> 1  void main(){}
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16

B+> 0x600 <main>     push   bp
    0x601 <main+1>   mov    bp,sp
    0x603 <main+3>   nop
    0x604 <main+4>   pop    bp
    0x605 <main+5>   ret
    0x606            aaa
    0x607            add    BYTE PTR [bx+si],al
    0x609            add    BYTE PTR [si],al
    0x60b            add    BYTE PTR [bx+si],al
    0x60d            add    BYTE PTR [bx+si],al
    0x60f            add    BYTE PTR [si],al
    0x611            add    ax,bp
    0x613            push   ss
    0x614            add    BYTE PTR [bx+si],al
    0x616            or     al,0x67
    0x618            adc    al,BYTE PTR [bx+si]
    0x61a            add    BYTE PTR [bx+si+0x2],al

remote Thread 1 In: main L1 PC: 0x600
(gdb) c
Continuing.
[ 0:7c00]
@@ -11648,20 +11644,19 @@ Breakpoint 1, 0x00007c00 in ?? ()
(gdb) c
Continuing.
[ 0: 600]
Breakpoint 2, main () at main.c:1
(gdb) layout split

Now, the displayed assembly is the same as in objdump, except that the
registers are 16-bit ones. This is normal, as gdb is operating in 16-bit mode,
while objdump displays code in 32-bit mode. To make sure, we verify the
raw opcodes by using the x command:

(gdb) x/16xb 0x600

Output 0x600 <main>: 0x55 0x89 0xe5 0x90 0x5d 0xc3 0x37
0x00 0x00 0x04 0x00 0x00 0x00 0x00 0x00

0x608: 0x00
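
The first six bytes of this dump are the whole body of main. The sketch below decodes them with a tiny hand-written table covering only these five instructions, to show why the same bytes appear with bp and sp in gdb (16-bit mode) and with ebp and esp in a 32-bit disassembly. The table and the decode function are ours, written for illustration; this is not a general x86 decoder.

```python
# Encodings of the five instructions in main; "{r}" becomes "" or "e".
OPCODES = {
    b"\x55": ("push", "{r}bp"),
    b"\x89\xe5": ("mov", "{r}bp,{r}sp"),
    b"\x90": ("nop", ""),
    b"\x5d": ("pop", "{r}bp"),
    b"\xc3": ("ret", ""),
}


def decode(code: bytes, bits: int):
    """Decode code using OPCODES, naming registers for 16- or 32-bit mode."""
    prefix = "e" if bits == 32 else ""
    listing = []
    position = 0
    while position < len(code):
        for length in (2, 1):  # try the two-byte encoding first
            chunk = bytes(code[position:position + length])
            if chunk in OPCODES:
                mnemonic, operands = OPCODES[chunk]
                listing.append((mnemonic + " " + operands.format(r=prefix)).strip())
                position += len(chunk)
                break
        else:
            raise ValueError(f"unknown opcode at offset {position}")
    return listing


MAIN_BYTES = bytes([0x55, 0x89, 0xE5, 0x90, 0x5D, 0xC3])

print(decode(MAIN_BYTES, 16))  # register names as gdb shows them
print(decode(MAIN_BYTES, 32))  # register names as a 32-bit disassembly shows them
```

Only the register names change between the two modes; the bytes in memory are identical, which is what the comparison with objdump relies on.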

@@ -11672,7 +11667,7 @@ Output 0x600 <main>: 0x55 0x89 0xe5 0x90 0x5d

$ objdump -z -M intel -S -D build/os/os | less

Output build/os/os: file format elf32-i386

Disassembly of section .text:

@@ -11696,18 +11691,17 @@ Output build/os/os: file format elf32-i386

The raw opcodes displayed by the two programs are the same. This
proves that gdb correctly jumped to the address of main for proper
debugging. This is an extremely important milestone. Being able to
debug on bare metal will help tremendously in writing an operating
system, as a debugger allows a programmer to inspect the internal state of
a running machine at each step to verify the code, step by step, and
gradually build up a solid understanding. Some professional programmers
do not like debuggers, but that is because they understand their domain
deeply enough not to need a debugger to verify their code. When
encountering new domains, a debugger is an indispensable learning tool
because of its verifiability.

However, even with the aid of a debugger, writing an operating system
is still not a walk in the park. The debugger may give access to the
machine at one point in time, but it does not give the cause. To find out
