Introduction
https://www.intel.com/content/www/us/en/content-details/671200/intel-64-and-ia-32-architectures-software-developer-s-manual-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4.html
Requirements
Prerequisite
Assembly
There are a couple of file extensions used for assembly files: .a, .s, .S, and .asm.
In x86, there are two separate versions of assembly syntax:
- AT&T (used by Unix compilers like gcc)
- Intel/NASM (with a couple of dialects, like MASM vs. NASM itself).
Intel syntax is dominant in the DOS and Windows world, and AT&T syntax is dominant in the Unix world.
The .S file extension is appropriate for assembly files with GNU syntax using as, while .asm is more often associated with Intel-syntax NASM/YASM or MASM source code.
Aspect | AT&T | Intel |
---|---|---|
Parameter order | Source before destination: movl $5, %eax | Destination before source: mov eax, 5 |
Stack/Heap
Storage
Division
- A drive refers to the physical hardware component used to store data.
- A partition is a logical division of a physical drive. Each partition within the drive is acting as a separate storage unit.
- A volume is a storage area with its own file system. Volumes can span across multiple partitions.
Image
- A disk image is a single file or a collection of files that replicates the entire contents and structure of a storage device.
- An ISO image (.iso) is a sector-by-sector copy of an optical disc, such as a CD, DVD or Blu-ray.
- A RAW image is an exact byte-for-byte copy of a storage device without compression or additional data.
Linker
Execution mode
x86 code can be executed in the following modes:
- Real mode (16-bit)
  - Computers that use BIOS start up in this mode.
  - 20-bit segmented memory address space, meaning that only 1 MB of memory can be addressed (since the 80286, slightly more through the HMA).
  - Direct software access to peripheral hardware.
  - No concept of memory protection or multitasking at the hardware level.
- Protected mode (16-bit and 32-bit)
  - In its original 16-bit form (80286), expands addressable physical memory to 16 MB and addressable virtual memory to 1 GB. The 32-bit form (80386 onward) raises the physical limit to 4 GB.
  - Provides privilege levels and protected memory, which prevents programs from corrupting one another.
  - 16-bit protected mode (used during the end of the DOS era) used a complex, multi-segmented memory model.
  - 32-bit protected mode uses a simple, flat memory model.
- Long mode (64-bit)
  - Mostly an extension of the 32-bit (protected mode) instruction set, but unlike the 16-to-32-bit transition, many instructions were dropped in 64-bit mode. Pioneered by AMD.
- Virtual 8086 mode (16-bit)
  - A special hybrid operating mode that allows real mode programs and operating systems to run under the control of a protected mode supervisor operating system.
- System Management Mode (16-bit)
  - Handles system-wide functions like power management, system hardware control, and proprietary OEM-designed code.
  - It is intended for use only by system firmware.
  - All normal execution, including the operating system, is suspended.
  - An alternate software system (which usually resides in the computer's firmware, or a hardware-assisted debugger) is then executed with high privileges.
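The 20-bit address space of real mode follows from how a segment:offset pair is turned into a linear address: the segment is shifted left by four bits and the offset is added. The following C sketch illustrates the arithmetic only; the function names are hypothetical and not part of any existing code in these notes.

```c
#include <assert.h>
#include <stdint.h>

/* Linear address produced by a real-mode segment:offset pair. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* The same address as seen with the A20 line disabled: address bit 20 is
 * forced to zero, so anything above 1 MiB wraps around to low memory. */
uint32_t real_mode_linear_a20_off(uint16_t segment, uint16_t offset)
{
    return real_mode_linear(segment, offset) & 0xFFFFFu;
}

/* Usage example with a few well-known addresses. */
void real_mode_examples(void)
{
    /* 0x07C0:0x0000 and 0x0000:0x7C00 both name the boot sector address. */
    assert(real_mode_linear(0x07C0, 0x0000) == 0x7C00);
    assert(real_mode_linear(0x0000, 0x7C00) == 0x7C00);
    /* The highest reachable address lies in the HMA, just above 1 MiB. */
    assert(real_mode_linear(0xFFFF, 0xFFFF) == 0x10FFEF);
}
```

Different segment:offset pairs can name the same byte, which is why bootloaders usually normalise their segment registers before doing anything else.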
Bootloader
The bootloader is the program that brings your operating system to life, so it needs to bring the kernel into memory somehow. Before transferring control to the kernel, it should also set up an environment that the kernel expects, e.g., protected mode.
The bootloader runs in real mode. Because of this, it has easy access to the BIOS.
Checklist:
- Set up 16-bit segment registers and stack
- Print startup message
- Check presence of PCI, CPUID, MSRs
- Enable the A20 line and confirm that it is enabled
- Load GDTR
- Inform BIOS of target processor mode
- Get memory map from BIOS
- Locate kernel in filesystem
- Allocate memory to load kernel image
- Load kernel image into buffer
- Enable graphics mode
- Check kernel image ELF headers
- Enable long mode, if 64-bit
- Allocate and map memory for kernel segments
- Set up stack
- Set up COM serial output port
- Set up IDT
- Disable PIC
- Check presence of CPU features (NX, SMEP, x87, PCID, global pages, TCE, WP, MMX, SSE, SYSCALL), and enable them
- Assign a PAT to write combining
- Set up FS/GS base
- Load IDTR
- Enable APIC and set it up using information in ACPI tables
- Set up GDT and TSS
BIOS vs UEFI
With a computer running legacy BIOS, the BIOS and the boot loader run in Real mode. The 64-bit operating system kernel checks and switches the CPU into Long mode and then starts new kernel-mode threads running 64-bit code.
With a computer running UEFI, the UEFI firmware (except CSM and legacy Option ROM), the UEFI boot loader and the UEFI operating system kernel all run in Long mode.
Boot sequence
When an x86-based computer starts, the computer tries to run a BIOS initialization program. The BIOS reads and transfers the first 512 bytes (sector 0) of a bootable device, e.g., hard disk. This sector is called the Master Boot Record (MBR). The MBR usually contains two components:
- A tiny bootstrapping program, i.e., a program that starts the operating system.
- A partition table for the disk device.
However, the BIOS has no idea about any of this. It simply loads the first 512 bytes into memory at 0x7C00. If the last two of the loaded 512 bytes are 0x55 and 0xAA, the BIOS considers the sector valid and jumps to that location to start executing whatever code is now located there.
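The validity check described above can be sketched in C as follows. This is only an illustration of the rule (byte 510 is 0x55, byte 511 is 0xAA); the function name is hypothetical and a real BIOS performs the check in firmware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512

/* Returns 1 if the 512-byte sector ends with the boot signature
 * 0x55, 0xAA (in that byte order), and 0 otherwise. */
int has_boot_signature(const uint8_t *sector)
{
    assert(sector != NULL);
    return sector[SECTOR_SIZE - 2] == 0x55 &&
           sector[SECTOR_SIZE - 1] == 0xAA;
}
```

Such a check is handy in build tooling, for example to confirm that a freshly assembled stage 1 image would be accepted before writing it to a disk image.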
Because the MBR may hold a partition table, there are two variants to consider:
1. If the MBR is partitionless, i.e., does not have a partition table, the code in it should load the next step of the boot process, which can be the kernel itself or the next stage in the bootloader sequence.
2. If the MBR holds a partition table, the MBR code looks at the partition table, finds the only partition marked as active, loads the boot sector of that partition, and starts to execute that code. It then follows the same flow as (1). The boot sector is the first sector of a partition, as opposed to the first sector of the whole disk.
The bootloader is often divided into stages because it is usually impossible to fit everything you want to do in 512 bytes. It can be done for some use cases, but that leaves little room for special-case handling or useful error messages. So the MBR itself contains the first stage (stage 1) of the bootloader. Due to the tiny size of the MBR, there is not much it can do, but it manages just enough to load another sector from disk that contains additional bootstrap code, e.g., the boot sector of a partition or a hard-coded sector you have provided. With the code loaded so far, the bootloader is able to enter stage 2, which proceeds to load all code required to boot the kernel. There are many variants here, but if, for example, the kernel is placed in a file system on the boot partition, stage 2 must know enough about this file system to proceed.
Remember that you can only access the BIOS from real mode, where only about 640 KiB of RAM is available. This implies that if your kernel is bigger than that, you can't load the whole kernel using real mode alone! This is unfortunate. The solution is unreal mode! This is not an official processor mode, but rather a technique that involves switching between real mode and protected mode in order to access more than 1 MiB while still using the BIOS.
- TODO: memory map
The bootloader described here assumes a couple of things:
- The kernel is a flat binary, i.e., a binary that does not retain any structure or segments. There are no special headers or descriptors that describe where the code and data go. An alternative would be to use, e.g., the ELF format and have code that can decode that format, jump to, and execute the correct code.
- The size of the binary in terms of sectors is calculated during the build process so that stage 1 of the bootloader knows how many sectors it should load.
- There are two stages of the bootloader. Stage 1 is located within the MBR. It loads stage 2. Stage 2 will make the necessary preparations, load all sectors of the kernel below the 1 MiB mark and jump to the main function of the kernel.
- The kernel is loaded at address 0x100000, which means that unreal mode must be used.
- TODO: A20 line
- TODO: Memory map
- TODO: GDT (temporary?)
- TODO: Paging
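The sector count that the build process hands to stage 1 is a rounded-up division of the binary size by the sector size. A minimal C sketch of that calculation, assuming 512-byte sectors by default and using a hypothetical function name:

```c
#include <assert.h>
#include <stdint.h>

/* Number of whole sectors needed to hold size_in_bytes bytes.
 * A partially filled last sector still counts as one sector. */
uint32_t sectors_needed(uint32_t size_in_bytes, uint32_t sector_size)
{
    assert(sector_size > 0);
    return size_in_bytes / sector_size +
           (size_in_bytes % sector_size != 0 ? 1u : 0u);
}
```

For example, a 10000-byte kernel image needs `sectors_needed(10000, 512)` sectors, which is 20. The build would typically also pad the image to that many full sectors so that the disk read never runs past the end of the file.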
Partition
A partition table is a table that contains data of the partitions on a disk.
MBR partition table
The Master Boot Record (MBR) is always located on the first sector of a hard disk. It contains the partition table for the disk. The partition table comprises 64 bytes in total of the 512-byte sector.
Address within MBR sector (decimal) | Address within MBR sector (hex) | Length (bytes) | Description |
---|---|---|---|
0 - 445 | 0x000 - 0x1BD | 446 | Code area |
446 - 509 | 0x1BE - 0x1FD | 64 | Master partition table |
510 - 511 | 0x1FE - 0x1FF | 2 | Boot record signature |
Address within MBR sector (decimal) | Address within MBR sector (hex) | Length (bytes) | Table entry |
---|---|---|---|
446 - 461 | 0x1BE - 0x1CD | 16 | Primary partition 1 |
462 - 477 | 0x1CE - 0x1DD | 16 | Primary partition 2 |
478 - 493 | 0x1DE - 0x1ED | 16 | Primary partition 3 |
494 - 509 | 0x1EE - 0x1FD | 16 | Primary partition 4 |
Offset (within partition table entry) | Length (bytes) | Description |
---|---|---|
0 | 1 | Boot indicator (80h = active) |
1-3 | 3 | Starting CHS values |
4 | 1 | Partition-type descriptor |
5-7 | 3 | Ending CHS values |
8-11 | 4 | Starting sector |
12-15 | 4 | Partition size (in sectors) |
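The tables above can be turned into a small parser. The following C sketch reads one 16-byte entry out of a 512-byte MBR and looks for the active partition. The struct and function names are hypothetical; the multi-byte fields are read as little-endian, and the CHS fields are skipped for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MBR_TABLE_OFFSET 446 /* 0x1BE */
#define MBR_ENTRY_SIZE   16
#define MBR_ENTRY_COUNT  4

typedef struct {
    uint8_t  boot_indicator; /* 0x80 = active */
    uint8_t  type;           /* partition-type descriptor */
    uint32_t start_sector;   /* starting sector */
    uint32_t sector_count;   /* partition size in sectors */
} mbr_partition_t;

/* Reads a 32-bit little-endian value from a byte buffer. */
static uint32_t mbr_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decodes partition table entry `index` (0..3) of the given MBR sector. */
mbr_partition_t mbr_read_entry(const uint8_t *mbr, int index)
{
    assert(mbr != NULL && index >= 0 && index < MBR_ENTRY_COUNT);
    const uint8_t *e = mbr + MBR_TABLE_OFFSET + index * MBR_ENTRY_SIZE;
    mbr_partition_t part;
    part.boot_indicator = e[0];
    part.type = e[4];
    part.start_sector = mbr_le32(e + 8);
    part.sector_count = mbr_le32(e + 12);
    return part;
}

/* Returns the index of the first entry marked active, or -1 if none is. */
int mbr_find_active(const uint8_t *mbr)
{
    for (int i = 0; i < MBR_ENTRY_COUNT; i++) {
        if (mbr_read_entry(mbr, i).boot_indicator == 0x80)
            return i;
    }
    return -1;
}
```

A stage 1 loader written in assembly would do the same walk by hand: step through the four entries at 0x1BE, test the first byte against 0x80, and read the starting sector from offset 8 of the matching entry.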
GPT partition table
BIOS Parameter Block (BPB)
The boot record is always placed in logical sector number zero. The first sector of a hard disk is called the Master Boot Record (MBR). If the storage medium is partitioned, each partition's first sector holds a Volume Boot Record (VBR). The boot record contains code and data mixed together. The BPB is data that contains information about how the partition is formatted.
Offset within MBR/VBR sector (decimal) | Offset within MBR/VBR sector (hex) | Length (in bytes) | Description |
---|---|---|---|
0 | 0x00 | 3 | Jump instruction that skips over the BPB to the boot code. |
3 | 0x03 | 8 | OEM identifier. |
11 | 0x0B | 2 | Number of bytes per sector. |
13 | 0x0D | 1 | Number of sectors per cluster. |
14 | 0x0E | 2 | Number of reserved sectors. The boot record sectors are included in this value. |
16 | 0x10 | 1 | Number of File Allocation Tables (FATs) on the storage media. |
17 | 0x11 | 2 | Number of root directory entries. |
19 | 0x13 | 2 | The total sectors in the logical volume. If this value is 0, it means there are more than 65535 sectors in the volume, and the actual count is stored in the Large Sector Count entry at 0x20. |
21 | 0x15 | 1 | Media descriptor type. |
22 | 0x16 | 2 | Number of sectors per FAT. FAT12/FAT16 only. |
24 | 0x18 | 2 | Number of sectors per track. |
26 | 0x1A | 2 | Number of heads or sides on the storage media. |
28 | 0x1C | 4 | Number of hidden sectors (i.e., the LBA of the beginning of the partition). |
32 | 0x20 | 4 | Large sector count. This field is set if there are more than 65535 sectors in the volume, resulting in a value which does not fit in the Number of Sectors entry at 0x13. |
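To make the offsets in the table concrete, here is a C sketch that pulls a few BPB fields out of a boot sector, including the fallback from the 16-bit sector count at 0x13 to the large sector count at 0x20. The struct and function names are hypothetical, fields are read as little-endian, and offset 0x0B is taken to be the bytes-per-sector field as in the standard FAT layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t bytes_per_sector;    /* offset 0x0B */
    uint8_t  sectors_per_cluster; /* offset 0x0D */
    uint16_t reserved_sectors;    /* offset 0x0E */
    uint8_t  fat_count;           /* offset 0x10 */
    uint16_t root_entry_count;    /* offset 0x11 */
    uint16_t sectors_per_fat;     /* offset 0x16, FAT12/FAT16 only */
    uint32_t total_sectors;       /* from 0x13, or 0x20 if 0x13 is zero */
} bpb_t;

static uint16_t bpb_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t bpb_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decodes selected BPB fields from the first sector of a FAT volume. */
bpb_t bpb_parse(const uint8_t *sector)
{
    assert(sector != NULL);
    bpb_t bpb;
    bpb.bytes_per_sector = bpb_le16(sector + 0x0B);
    bpb.sectors_per_cluster = sector[0x0D];
    bpb.reserved_sectors = bpb_le16(sector + 0x0E);
    bpb.fat_count = sector[0x10];
    bpb.root_entry_count = bpb_le16(sector + 0x11);
    bpb.sectors_per_fat = bpb_le16(sector + 0x16);
    uint16_t small_count = bpb_le16(sector + 0x13);
    bpb.total_sectors = small_count != 0 ? small_count
                                         : bpb_le32(sector + 0x20);
    return bpb;
}

/* First sector of the root directory on FAT12/FAT16, relative to the
 * start of the volume: it follows the reserved sectors and all FATs. */
uint32_t bpb_root_dir_first_sector(const bpb_t *bpb)
{
    assert(bpb != NULL);
    return bpb->reserved_sectors +
           (uint32_t)bpb->fat_count * bpb->sectors_per_fat;
}
```

Reading the fields byte by byte, rather than casting the sector to a packed struct, avoids alignment and padding surprises at the cost of a little extra code.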
- https://manybutfinite.com/post/how-computers-boot-up/
- https://en.wikipedia.org/wiki/Unreal_mode
- https://wiki.osdev.org/Memory_Map_(x86)
- https://wiki.osdev.org/A20_Line
- https://en.wikipedia.org/wiki/A20_line
- https://www.pixelbeat.org/docs/disk/
Disk
BIOS disk read
AH = 01
The following status codes represent controller status after last disk operation:
Status (AL) | Description |
---|---|
00 | no error |
01 | bad command passed to driver |
02 | address mark not found or bad sector |
03 | diskette write protect error |
04 | sector not found |
05 | fixed disk reset failed |
06 | diskette changed or removed |
07 | bad fixed disk parameter table |
08 | DMA overrun |
09 | DMA access across 64k boundary |
0A | bad fixed disk sector flag |
0B | bad fixed disk cylinder |
0C | unsupported track/invalid media |
0D | invalid number of sectors on fixed disk format |
0E | fixed disk controlled data address mark detected |
0F | fixed disk DMA arbitration level out of range |
10 | ECC/CRC error on disk read |
11 | recoverable fixed disk data error, data fixed by ECC |
20 | controller error (NEC for floppies) |
40 | seek failure |
80 | time out, drive not ready |
AA | fixed disk drive not ready |
BB | fixed disk undefined error |
CC | fixed disk write fault on selected drive |
E0 | fixed disk status error/Error reg = 0 |
FF | sense operation failed |
GRUB
Interrupts
Memory
Higher Half Kernel
Linux (among other Unix-like kernels) resides at virtual addresses 0xC0000000 - 0xFFFFFFFF on 32-bit x86, leaving 0x00000000 - 0xBFFFFFFF for user code, data, stacks, libraries and so on. Kernels with this design are said to be in the higher half.
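The split can be expressed as a pair of helpers. This C sketch assumes a 32-bit address space and, as a simplifying assumption not stated above, that the kernel's virtual range is mapped linearly onto physical memory starting at physical address 0. The names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define KERNEL_VIRTUAL_BASE 0xC0000000u

/* Returns 1 if the virtual address belongs to the kernel half. */
int is_kernel_address(uint32_t vaddr)
{
    return vaddr >= KERNEL_VIRTUAL_BASE;
}

/* Translates a kernel virtual address to a physical address, assuming
 * the linear mapping virtual = physical + KERNEL_VIRTUAL_BASE. */
uint32_t kernel_virt_to_phys(uint32_t vaddr)
{
    assert(is_kernel_address(vaddr));
    return vaddr - KERNEL_VIRTUAL_BASE;
}
```

Under that assumption, a kernel loaded at physical 0x100000 would be linked to run at virtual 0xC0100000.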
Detecting memory
We can request the amount of "low memory", which is the available RAM below 1 MB (usually below 640 KB), using two BIOS functions (INT 0x12 and INT 0x15). Using INT 0x12, we get the total number of KB in the ax register.
clc ; clear carry flag.
int 0x12 ; Request low memory size from BIOS.
jc .error ; Jump to error routine if carry flag is set.
; ax = amount of continuous memory in KB
Segmentation
There are some special combinations of segment registers and general registers that point to important addresses:
Register Pair | Full Name | Description |
---|---|---|
CS:IP | Code Segment : Instruction Pointer | Points to the address where the processor will fetch the next byte of code. |
SS:SP | Stack Segment : Stack Pointer | Points to the address of the top of the stack, i.e., the most recently pushed byte. |
SS:BP | Stack Segment : Base Pointer | Points to the address of the top of the stack frame, i.e., the base of the data area in the call stack for the currently active subprogram. |
DS:SI | Data Segment : Source Index | Often used to point to string data that is about to be copied to ES:DI. |
ES:DI | Extra Segment : Destination Index | Typically used to point to the destination for a string copy, as mentioned above. |
Paging
System Calls
Filesystem
ATA PIO
Word | Description |
---|---|
10..19 | Serial number |
27..46 | Model number |
60..61 | Total number of user addressable logical sectors for 28-bit commands (DWord) |
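The fields in the table can be decoded from the 256-word buffer returned by the IDENTIFY command roughly as follows. This is a sketch with hypothetical function names. It relies on two conventions of the ATA IDENTIFY data: text fields store two ASCII characters per word with the first character in the high byte, and the 28-bit sector count has its low word at index 60.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copies an ASCII field spanning words first_word..last_word into out,
 * un-swapping the two characters in each word and trimming trailing
 * spaces. out must have room for 2 * (word count) + 1 bytes. */
void ata_identify_string(const uint16_t *identify, size_t first_word,
                         size_t last_word, char *out)
{
    assert(identify != NULL && out != NULL && first_word <= last_word);
    size_t len = 0;
    for (size_t w = first_word; w <= last_word; w++) {
        out[len++] = (char)(identify[w] >> 8);
        out[len++] = (char)(identify[w] & 0xFF);
    }
    while (len > 0 && out[len - 1] == ' ')
        len--;
    out[len] = '\0';
}

/* Total number of user addressable sectors for 28-bit commands. */
uint32_t ata_identify_lba28_sectors(const uint16_t *identify)
{
    assert(identify != NULL);
    return (uint32_t)identify[60] | ((uint32_t)identify[61] << 16);
}
```

The model number would be read with `ata_identify_string(buf, 27, 46, model)` into a 41-byte buffer, and the serial number with words 10 to 19 into a 21-byte buffer.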
Block device
In simple terms, a block device can be thought of as a device that reads and writes one block at a time, as opposed to a character device that reads and writes one byte at a time. A block here is a fixed-size sequence of bytes.
A block device is a special file that provides buffered access to a hardware device.
typedef struct block_t {
    block_read_t read;
    block_write_t write;
    void *device;
} block_t;
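As an illustration of how such an interface might be used, the following sketch backs a `block_t` with a plain memory buffer. The signatures chosen for `block_read_t` and `block_write_t`, the 512-byte block size, and every `ramdisk_*` name are assumptions made for this example; they are not defined anywhere in these notes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 512 /* assumed block size */

typedef struct block_t block_t;

/* Assumed callback signatures: transfer one block, return 0 on success. */
typedef int (*block_read_t)(block_t *dev, uint32_t lba, void *buf);
typedef int (*block_write_t)(block_t *dev, uint32_t lba, const void *buf);

struct block_t {
    block_read_t read;
    block_write_t write;
    void *device; /* driver-private state */
};

/* Hypothetical driver state: a memory buffer acting as the disk. */
typedef struct {
    uint8_t *data;
    uint32_t block_count;
} ramdisk_t;

static int ramdisk_read(block_t *dev, uint32_t lba, void *buf)
{
    assert(dev != NULL && buf != NULL);
    ramdisk_t *rd = dev->device;
    if (lba >= rd->block_count)
        return -1; /* out of range */
    memcpy(buf, rd->data + (size_t)lba * BLOCK_SIZE, BLOCK_SIZE);
    return 0;
}

static int ramdisk_write(block_t *dev, uint32_t lba, const void *buf)
{
    assert(dev != NULL && buf != NULL);
    ramdisk_t *rd = dev->device;
    if (lba >= rd->block_count)
        return -1; /* out of range */
    memcpy(rd->data + (size_t)lba * BLOCK_SIZE, buf, BLOCK_SIZE);
    return 0;
}

/* Wraps a ramdisk in the generic block device interface. */
block_t ramdisk_block_device(ramdisk_t *rd)
{
    assert(rd != NULL);
    block_t dev = { ramdisk_read, ramdisk_write, rd };
    return dev;
}
```

Code above this layer, such as a FAT driver, would only call `dev.read(&dev, lba, buf)` and never needs to know whether the blocks come from RAM or from an ATA disk.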
FAT
Creating a virtual disk
dd if=/dev/zero of=ramdisk.img count=30000
fdisk ramdisk.img
Press [n, ENTER, ENTER, ENTER, t, 6, w]
mkfs.fat --offset 2048 ramdisk.img
File Allocation Table (FAT)
- If the file size is larger than the sector size, the file data spans multiple sectors in the cluster.
- If the file size is larger than the cluster size, the file data spans multiple clusters in the cluster chain.
On entries in the FAT
- Directory contents (data) are a series of 32-byte directory entries.
- The value of a FAT entry is the number of the cluster that follows the entry's own cluster in the chain.
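Following a file's cluster chain therefore means repeatedly using the current cluster number as an index into the FAT. The C sketch below assumes FAT16, where entries are 16 bits wide, values of 0xFFF8 and above mark the end of a chain, and data clusters are numbered from 2. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAT16_END_OF_CHAIN 0xFFF8u /* values >= this end the chain */

/* Number of clusters needed to store file_size bytes. */
uint32_t clusters_needed(uint32_t file_size, uint32_t cluster_size)
{
    assert(cluster_size > 0);
    return file_size / cluster_size +
           (file_size % cluster_size != 0 ? 1u : 0u);
}

/* Counts the clusters in the chain starting at first_cluster by
 * following the "next cluster" values stored in the FAT. The count is
 * capped at fat_entries so a corrupt, looping chain still terminates. */
size_t fat16_chain_length(const uint16_t *fat, size_t fat_entries,
                          uint16_t first_cluster)
{
    assert(fat != NULL);
    size_t count = 0;
    uint16_t cluster = first_cluster;
    while (cluster >= 2 && cluster < fat_entries &&
           cluster < FAT16_END_OF_CHAIN && count < fat_entries) {
        count++;
        cluster = fat[cluster]; /* entry value = next cluster in chain */
    }
    return count;
}
```

For a healthy file, `fat16_chain_length` starting at the first cluster recorded in the directory entry should equal `clusters_needed` for the file size in that entry; a mismatch is a sign of file system corruption.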
Inode
The inode is a data structure in a Unix-style file system that describes a file-system object such as a file or a directory. Each inode stores the attributes and disk block locations of the object's data.
References
Threads
Networking
- https://wiki.qemu.org/Documentation/Networking
- https://apiraino.github.io/qemu-bridge-networking/
- https://www.qemu.org/docs/master/system/devices/net.html
TCP
- https://cseweb.ucsd.edu/classes/fa09/cse124/presentations/TCPlinux_implementation.pdf
- https://www.saminiir.com/lets-code-tcp-ip-stack-1-ethernet-arp/
ARM
Introduction
FAT
- https://wiki.osdev.org/FAT
- http://elm-chan.org/docs/fat_e.html
Extras
Bootloader
We have defined
KERNEL_OFFSET equ 0x1000 ;
and we use it to load the kernel at this address
mov bx, KERNEL_OFFSET ; Read from disk and store in 0x1000
when loading from disk.
During linking we are telling the linker to place all executable code at memory address 0x1000.
The linker will ensure that all code references are calculated based on this address.
i386-elf-ld -o $@ -Ttext 0x1000 $^ --oformat binary