Yangbo's Blog

MIT 6.828 HW - Boot xv6 & Shell

My homework solutions are uploaded here.

Installing tools

  • Build and install QEMU with make && sudo make install; running the install step under sudo avoids permission errors.
  • GDB prints an auto-loading warning when it is launched for the first time. Add a line to your ~/.gdbinit file as the message below instructs:
    warning: File "/home/ylong/Desktop/hobby_project/abc/MIT/6.828/xv6-public/.gdbinit" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
    To enable execution of this file add
    add-auto-load-safe-path /home/ylong/Desktop/hobby_project/abc/MIT/6.828/xv6-public/.gdbinit
    line to your configuration file "/home/ylong/.gdbinit".
    To completely disable this security protection add
    set auto-load safe-path /
    line to your configuration file "/home/ylong/.gdbinit".

Booting xv6

The 8086 can address up to 1 MB of memory, laid out as follows:

  • DRAM: 0x00000 - 0x9FFFF, 640 KB;
  • BIOS ROM: 0xF0000 - 0xFFFFF, 64 KB;
  • Peripherals: 0xA0000 - 0xEFFFF, 320 KB (graphics memory mapping: 0xB8000 - 0xBFFFF).
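The sizes above follow directly from the address ranges. As a minimal sketch (the helper names region_kb and memory_map_total_kb are hypothetical, introduced only for illustration), the arithmetic can be checked in C:

```c
#include <assert.h>
#include <stdint.h>

/* Size in KB of the inclusive physical address range [first, last]. */
uint32_t region_kb(uint32_t first, uint32_t last)
{
    assert(last >= first);
    return (last - first + 1) / 1024;
}

/* Total KB covered by the three regions listed in the text. */
uint32_t memory_map_total_kb(void)
{
    return region_kb(0x00000, 0x9FFFF)   /* DRAM */
         + region_kb(0xA0000, 0xEFFFF)   /* peripherals */
         + region_kb(0xF0000, 0xFFFFF);  /* BIOS ROM */
}
```

The three regions are contiguous and add up to 1024 KB, which is the full 1 MB address space of the 8086.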

Find the address of _start, the entry point of the kernel:

$ nm kernel | grep _start
8010a48c D _binary_entryother_start
8010a460 D _binary_initcode_start
0010000c T _start

Before entering bootmain

The very first instruction the CPU executes on power-up is a jump instruction stored in ROM. The BIOS then loads the 512-byte boot sector, stored in the first sector of the hard disk, into memory at physical addresses 0x7c00 through 0x7dff. The boot loader code runs next: it switches the CPU from real mode to protected mode and enters the kernel, which takes control from then on.

As shown in the image above, at the entry point 0x7c00 the system is still in real mode, with SS = 0 and the stack pointer ESP pointing to an address below 0x7c00.
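In real mode a physical address is formed as segment * 16 + offset, so with a segment of 0 the offset 0x7c00 is also the physical address. A small sketch of that calculation follows; real_mode_addr, BOOT_SECTOR_SIZE, and check_boot_sector_range are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define BOOT_SECTOR_SIZE 512u  /* bytes, as stated in the text */

/* Real-mode physical address: segment * 16 + offset, wrapped to 20 bits. */
uint32_t real_mode_addr(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}

/* Sanity check of the boot sector range quoted in the text. */
void check_boot_sector_range(void)
{
    uint32_t first = real_mode_addr(0x0000, 0x7c00);
    assert(first == 0x7c00);
    assert(first + BOOT_SECTOR_SIZE - 1 == 0x7dff);
}
```

Different segment:offset pairs can name the same byte: 07c0:0000 and 0000:7c00 both resolve to 0x7c00. As an illustrative value not taken from the text, f000:fff0 resolves to 0xFFFF0, which lies inside the BIOS ROM range.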

When the program is stopped right before the call to bootmain, the system has already entered protected mode, as indicated by SS = 0x10 in the picture below (in protected mode, a segment register such as DS no longer holds a segment address but a selector whose upper bits index a segment descriptor in the GDT). ESP has also been initialized to 0x7c00 right before entering bootmain.
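To see why SS = 0x10 points at a GDT entry, it helps to split the selector into its fields. On x86, bits 15..3 are the descriptor index, bit 2 selects GDT (0) or LDT (1), and bits 1..0 are the requested privilege level. The sketch below uses hypothetical names (selector_fields, decode_selector, make_selector) and is only an illustration of that bit layout.

```c
#include <assert.h>
#include <stdint.h>

struct selector_fields {
    uint16_t index; /* descriptor index within the GDT or LDT */
    uint8_t  ti;    /* table indicator: 0 = GDT, 1 = LDT */
    uint8_t  rpl;   /* requested privilege level, 0..3 */
};

/* Split a 16-bit segment selector into its three fields. */
struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index = (uint16_t)(sel >> 3);
    f.ti    = (uint8_t)((sel >> 2) & 1);
    f.rpl   = (uint8_t)(sel & 3);
    return f;
}

/* Build a selector from its fields (the inverse of decode_selector). */
uint16_t make_selector(uint16_t index, uint8_t ti, uint8_t rpl)
{
    assert(index < 8192 && ti <= 1 && rpl <= 3);
    return (uint16_t)((index << 3) | (ti << 2) | rpl);
}
```

Decoding 0x10 gives index 2, table GDT, privilege level 0, so SS refers to the third GDT entry (entry 0 is the null descriptor).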

When bootmain is called, the call bootmain instruction saves the return address (0x7c4d) on the stack. For a function call with arguments, the arguments are pushed onto the stack as well. At this point, the memory layout looks as depicted below.

Executing bootmain

In bootblock.asm, the first instruction of bootmain, push %ebp, saves the frame pointer EBP on the stack; EBP is typically used to locate function arguments on the stack during a call.

void
bootmain(void)
{
7d3b: 55 push %ebp
7d3c: 89 e5 mov %esp,%ebp

The stack contents look as below after EBP is saved. Note that the address of the stack top decreases as new elements are pushed onto the stack.
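The downward growth can be traced with a toy model of the stack. This is only a sketch: toy_stack and its helpers are hypothetical names, and the saved EBP value of 0 used in the usage note is a placeholder, since the text does not give the real value.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_TOP 0x7c00u  /* ESP right before call bootmain (from the text) */
#define SLOTS 8u           /* how many 4-byte words the toy model holds */

struct toy_stack {
    uint32_t esp;          /* current stack pointer */
    uint32_t slot[SLOTS];  /* slot[i] models the word at STACK_TOP - 4*(i+1) */
};

void toy_init(struct toy_stack *s)
{
    s->esp = STACK_TOP;
    for (uint32_t i = 0; i < SLOTS; i++)
        s->slot[i] = 0;
}

/* A push first decrements ESP by 4, then stores the value at the new ESP. */
void toy_push(struct toy_stack *s, uint32_t value)
{
    assert(s->esp - 4 >= STACK_TOP - 4u * SLOTS);
    s->esp -= 4;
    s->slot[(STACK_TOP - s->esp) / 4 - 1] = value;
}

/* Read the word stored at a stack address that has already been pushed. */
uint32_t toy_read(const struct toy_stack *s, uint32_t addr)
{
    assert(addr >= s->esp && addr < STACK_TOP && addr % 4 == 0);
    return s->slot[(STACK_TOP - addr) / 4 - 1];
}
```

Starting from ESP = 0x7c00, pushing the return address 0x7c4d moves ESP to 0x7bfc, and the following push %ebp moves it to 0x7bf8.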

Entering kernel

The call entry() in bootmain.c jumps into the kernel code and starts kernel execution, which is the main purpose of bootmain. The compiled assembly below shows that the kernel's entry address is stored at address 0x10018.

// Call the entry point from the ELF header.
// Does not return!
entry = (void(*)(void))(elf->entry);
entry();
7dae: ff 15 18 00 01 00 call *0x10018

When the program jumps to the kernel code at address 0x10000c, the return address of the entry() call is also saved on the stack, as shown in the picture below.
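The address 0x10018 in call *0x10018 is the location of the entry field inside the ELF header. The sketch below assumes that the header is loaded at 0x10000 (as xv6's bootmain.c does) and that the leading fields match xv6's elf.h. The struct name elfhdr_prefix and the helper entry_field_address are hypothetical, and only the fields up to entry are reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Leading fields of a 32-bit ELF header, up to and including entry. */
struct elfhdr_prefix {
    uint32_t magic;    /* ELF magic number */
    uint8_t  elf[12];  /* rest of the identification bytes */
    uint16_t type;
    uint16_t machine;
    uint32_t version;
    uint32_t entry;    /* address of the program's entry point */
};

/* Address of the entry field when the header is loaded at base. */
uint32_t entry_field_address(uint32_t base)
{
    assert(offsetof(struct elfhdr_prefix, entry) == 0x18);
    return base + (uint32_t)offsetof(struct elfhdr_prefix, entry);
}
```

With the header at 0x10000, the entry field sits at 0x10000 + 0x18 = 0x10018, and the value stored there is the _start address 0x0010000c reported by nm.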

Shell

Reading notes

Brief notes taken while reading Chapter 0 of the xv6 book:

  • fork and exec are separate calls so that the child process can redirect its I/O before it runs the program. For example,

    if(fork() == 0) {
      close(0);
      open("input.txt", O_RDONLY);
      exec("cat", argv);
    }
  • The dup system call duplicates an existing file descriptor, returning a new one that refers to the same underlying I/O object.

  • The child must close the write end of the pipe before executing wc below. Otherwise wc itself would hold a file descriptor referring to the write end of the pipe and would never see end-of-file on its input.

    int p[2];
    char *argv[2];
    argv[0] = "wc";
    argv[1] = 0;
    pipe(p); // creates a pipe and records the read and write fds in p
    if(fork() == 0) { // child
      close(0);
      dup(p[0]); // dups the read end onto fd 0
      close(p[0]);
      close(p[1]); // closes fds in p
      exec("/bin/wc", argv); // executes wc
    } else { // parent
      close(p[0]); // closes the read side of the pipe
      write(p[1], "hello world\n", 12);
      close(p[1]);
    }
  • cd is a special shell command: it does not fork a child process but changes the current working directory of the shell itself. Other shell commands run in forked child processes.
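The reason cd cannot run in a child is that the working directory is per-process state: a chdir in the child leaves the parent untouched. The sketch below shows this with POSIX calls on a host system rather than xv6's own system calls; the function name child_chdir_leaves_parent_cwd is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the parent's working directory is unchanged after a child
 * process calls chdir("/"), 0 if it changed, and -1 on error. */
int child_chdir_leaves_parent_cwd(void)
{
    char before[4096], after[4096];
    int status;

    if (getcwd(before, sizeof before) == NULL)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: this is what a forked "cd /" would amount to. */
        _exit(chdir("/") == 0 ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    assert(WIFEXITED(status));
    if (WEXITSTATUS(status) != 0)
        return -1;

    if (getcwd(after, sizeof after) == NULL)
        return -1;
    return strcmp(before, after) == 0;
}
```

The function is expected to return 1: the child's directory change disappears when the child exits, which is why the shell has to call chdir in its own process.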

Coding notes

Unlike an OOP language, C has no object inheritance. The code snippet below shows one way to build inheritance-like structures in C.

struct cmd { // act like a base class to be inherited by various commands
  int type; // ' ' (exec), | (pipe), '<' or '>' for redirection
};
struct execcmd {
  int type; // ' '
  char *argv[MAXARGS]; // arguments to the command to be exec-ed
};
struct redircmd {
  int type; // < or >
  struct cmd *cmd; // the command to be run (e.g., an execcmd)
  char *file; // the input/output file
  int mode; // the mode to open the file with
  int fd; // the file descriptor number to use for the file
};
struct pipecmd {
  int type; // |
  struct cmd *left; // left side of pipe
  struct cmd *right; // right side of pipe
};

execcmd represents a command that can be executed directly, with no I/O redirection or pipes. More details about the structure and logic of sh.c can be found in my solution.
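Because every command struct starts with the same int type field, code can inspect type through a struct cmd pointer and then cast to the concrete struct. The following is a minimal sketch of that dispatch, not the actual sh.c logic: count_execs is a hypothetical helper, and the MAXARGS value of 10 is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define MAXARGS 10  /* assumed value for this sketch */

struct cmd      { int type; };
struct execcmd  { int type; char *argv[MAXARGS]; };
struct redircmd { int type; struct cmd *cmd; char *file; int mode; int fd; };
struct pipecmd  { int type; struct cmd *left; struct cmd *right; };

/* Count the directly executable commands in a command tree by
 * switching on the shared type field and casting to the real struct. */
int count_execs(struct cmd *c)
{
    if (c == NULL)
        return 0;

    switch (c->type) {
    case ' ':  /* execcmd: a leaf of the tree */
        return 1;
    case '<':
    case '>': {  /* redircmd wraps exactly one command */
        struct redircmd *r = (struct redircmd *)c;
        return count_execs(r->cmd);
    }
    case '|': {  /* pipecmd has a left and a right command */
        struct pipecmd *p = (struct pipecmd *)c;
        return count_execs(p->left) + count_execs(p->right);
    }
    default:
        assert(!"unknown command type");
        return 0;
    }
}
```

For a tree representing ls < input.txt | wc, the function walks the pipe node, the redirection node, and the two exec leaves, and returns 2.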