Monday, February 24, 2014

Writing a bootloader in C/C++


What is C?

In computing, C is a general-purpose programming language initially developed by Dennis Ritchie between 1969 and 1973 at AT&T Bell Labs.

Why use C? 

C programs are usually small and fast to execute, although the compiled code is machine dependent. The language includes low-level features that are normally available only in assembly or machine language. C is a structured programming language.

Why do I need to write code in C?

If you want your programs to be small and really fast, C is a good choice.

What do I need to write code in C language?

We will use the GNU C compiler, gcc, to compile our C code.

How do we write a C program for the GCC compiler?

Let us write a program to see what it looks like.

Example: test.c

__asm__(".code16\n");
__asm__("jmpl $0x0000, $main\n");

void main() {
}


File: test.ld

ENTRY(main);
SECTIONS
{
     . = 0x7C00;
     .text : AT(0x7C00)
     {
          *(.text);
     }
     .sig : AT(0x7DFE)
     {
          SHORT(0xaa55);
     }
}
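
The addresses in the linker script map directly onto byte offsets inside the 512-byte boot sector. The following is a minimal host-side sketch of that arithmetic, not part of the bootloader itself. The names BOOT_LOAD_ADDR, image_offset and store_le16 are illustrative assumptions, and the sketch assumes the BIOS loads the sector at 0x7C00, as the script does.

```c
#include <stdint.h>

/* Address at which the BIOS places the boot sector (matches ". = 0x7C00"). */
#define BOOT_LOAD_ADDR 0x7C00u

/* Convert a load address from the linker script into a byte offset
 * inside the boot sector image. */
uint32_t image_offset(uint32_t load_addr) {
    return load_addr - BOOT_LOAD_ADDR;
}

/* Store a 16-bit value the way SHORT() does on x86: low byte first. */
void store_le16(uint8_t *dst, uint16_t value) {
    dst[0] = (uint8_t)(value & 0xFFu);
    dst[1] = (uint8_t)(value >> 8);
}
```

Under these assumptions, image_offset(0x7DFE) is 510, so SHORT(0xaa55) fills bytes 510 and 511 of the sector with 0x55 followed by 0xAA.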


How to compile a C program?

At the command prompt, type the following (on a 64-bit host, gcc will likely also need -m32 and ld will likely need -m elf_i386):
  • gcc -c -g -Os -march=i686 -ffreestanding -Wall -Werror test.c -o test.o
  • ld -static -Ttest.ld -nostdlib --nmagic -o test.elf test.o
  • objcopy -O binary test.elf test.bin

What do the above commands mean?

  • gcc -c -g -Os -march=i686 -ffreestanding -Wall -Werror test.c -o test.o
This command compiles the given C code into object code: machine code that has not yet been linked, so its final addresses are not yet fixed.

What does each flag mean?

  • -c: Compiles the given source code without linking.
  • -g: Generates debug information to be used by the GDB debugger.
  • -Os: Optimizes for code size.
  • -march: Generates code for a specific CPU architecture (in our case i686).
  • -ffreestanding: Compiles for a freestanding environment, one in which the standard library may not exist and program startup may not necessarily be at ‘main’.
  • -Wall: Enables most of the compiler's commonly useful warning messages. This option should always be used, as it helps catch mistakes early.
  • -Werror: Treats warnings as errors.
  • test.c: The input source file name.
  • -o: Names the output file.
  • test.o: The output object file name.
With this combination of flags, the compiler reports errors and warnings and produces compact code for the target CPU. If you do not specify -march=i686, GCC generates code for its default target CPU, so when portability matters it is better to state explicitly which CPU you are targeting.
  • ld -static -Ttest.ld -nostdlib --nmagic -o test.elf test.o
This command invokes the linker from the command prompt. What the linker does is explained below.

What does each flag mean?

  • -static: Do not link against shared libraries.
  • -Ttest.ld: Tells the linker to use test.ld as the linker script.
  • -nostdlib: Tells the linker to search only the library directories given on the command line, so the standard C library and its startup files are not linked in.
  • --nmagic: Turns off page alignment of sections and disables linking against shared libraries, which keeps the output compact.
  • -o: Names the output file.
  • test.elf: The output file name. Executable formats are platform dependent (Windows: PE, Linux: ELF).
  • test.o: The input object file name.


What is a linker?

Linking is the final stage of the build. The linker (ld) takes one or more object files or libraries as input and combines them to produce a single (usually executable) file. In doing so, it resolves references to external symbols, assigns final addresses to procedures/functions and variables, and revises code and data to reflect new addresses (a process called relocation).

Also remember that we have no standard library, so none of its convenient functions are available to our code.
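
Because the standard library is unavailable, even simple helpers have to be written by hand. The sketch below shows what such replacements might look like. The names boot_strlen and boot_memset are hypothetical and do not appear in the original code; the header <stddef.h> is one of the few that a freestanding compiler still provides.

```c
#include <stddef.h>

/* Hypothetical replacement for strlen(): counts bytes before the NUL. */
size_t boot_strlen(const char *s) {
    size_t n = 0;
    while (s[n] != '\0') {
        ++n;
    }
    return n;
}

/* Hypothetical replacement for memset(): fills n bytes with value. */
void *boot_memset(void *dst, int value, size_t n) {
    unsigned char *p = dst;
    while (n--) {
        *p++ = (unsigned char)value;
    }
    return dst;
}
```

Helpers like these count against the 510 bytes available for code and data in the boot sector, so it pays to add only what the program really uses.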
  • objcopy -O binary test.elf test.bin
This command strips the ELF headers and metadata and writes only the raw machine code to test.bin. Note that Linux stores executables in a different format than Windows, but we are developing a small piece of boot code that does not depend on any operating system. The BIOS simply loads the raw bytes of the boot sector into memory and jumps to them, so the file must contain nothing but our code and data.
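
A quick way to sanity-check the resulting file is to confirm that it is exactly one sector long and ends with the boot signature. The following host-side sketch does this for a buffer already read into memory. The function name is_boot_sector and the constant SECTOR_SIZE are illustrative assumptions, and the sketch assumes that test.bin is expected to be exactly 512 bytes.

```c
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Returns 1 if the buffer is exactly one sector long and ends with the
 * boot signature bytes 0x55, 0xAA; returns 0 otherwise. */
int is_boot_sector(const uint8_t *image, size_t len) {
    if (image == NULL || len != SECTOR_SIZE) {
        return 0;
    }
    return image[510] == 0x55 && image[511] == 0xAA;
}
```

If the check fails on your build, the usual suspects are extra sections that the linker placed outside the range described in test.ld.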

Why use assembly statements inside a C program?

In Real Mode, the BIOS functions can be easily accessed through software interrupts, using assembly language instructions. This has led to the use of inline assembly in our C code.

How to copy the executable code to a bootable device and then test it?


To create a 1.44 MB floppy disk image, type the following at the command prompt.
  • dd if=/dev/zero of=floppy.img bs=512 count=2880
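
The count of 2880 in the dd command above follows from the geometry of a standard 3.5-inch high-density floppy disk. A minimal sketch of the arithmetic is given here; the enum and function names are illustrative only.

```c
/* Geometry of a standard 3.5-inch high-density floppy disk. */
enum {
    CYLINDERS = 80,
    HEADS = 2,
    SECTORS_PER_TRACK = 18,
    BYTES_PER_SECTOR = 512
};

/* Total number of 512-byte sectors on the disk. */
long floppy_sector_count(void) {
    return (long)CYLINDERS * HEADS * SECTORS_PER_TRACK;
}

/* Total size of the disk image in bytes. */
long floppy_image_bytes(void) {
    return floppy_sector_count() * BYTES_PER_SECTOR;
}
```

This gives 2880 sectors and 1,474,560 bytes, which is the size marketed as 1.44 MB.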
To copy the code to the boot sector of the floppy disk image file, type the following at the command prompt. The conv=notrunc option keeps dd from truncating the image to the size of test.bin.
  • dd if=test.bin of=floppy.img conv=notrunc
To test the program, type the following at the command prompt.
  • bochs
You should see a typical Bochs emulator window as below.





Observation:

Nothing visible happens, because our code does not write anything to the screen. So you only see the message “Booting from Floppy”. Congrats!!!
We use the __asm__ keyword to embed assembly language statements in a C program. It tells the compiler that the string that follows is assembly code supplied by the user.
We also use __volatile__ to tell the compiler not to optimize away or move our assembly statements, and to leave them as they are.

This way of embedding assembly code inside C code is called inline assembly.

Let us see a few more examples.
Let us write a program to print the letter ‘X’ onto the screen.

Example: test2.c

__asm__(".code16\n");
__asm__("jmpl $0x0000, $main\n");

void main() {
__asm__ __volatile__ ("movb $'X' , %al\n");
__asm__ __volatile__ ("movb $0x0e, %ah\n");
__asm__ __volatile__ ("int $0x10\n");
}


After typing the above, save it as test2.c and compile it as instructed before, changing the source file name. Copy the compiled code to the boot sector, then type bochs at the command prompt. You should see the letter ‘X’ on the screen.





Now, let us write a C program to print the string “Hello, World” onto the screen.

For now, we will print it one character at a time.

Example: test3.c

/*generate 16-bit code*/
__asm__(".code16\n");
/*jump boot code entry*/
__asm__("jmpl $0x0000, $main\n");

void main() {
/*print letter 'H' onto the screen*/
__asm__ __volatile__("movb $'H' , %al\n");
__asm__ __volatile__("movb $0x0e, %ah\n");
__asm__ __volatile__("int $0x10\n");

/*print letter 'e' onto the screen*/
__asm__ __volatile__("movb $'e' , %al\n");
__asm__ __volatile__("movb $0x0e, %ah\n");
__asm__ __volatile__("int $0x10\n");

/*print letter 'l' onto the screen*/
__asm__ __volatile__("movb $'l' , %al\n");
__asm__ __volatile__("movb $0x0e, %ah\n");
__asm__ __volatile__("int $0x10\n");

/*print letter 'l' onto the screen*/
__asm__ __volatile__("movb $'l' , %al\n");
__asm__ __volatile__("movb $0x0e, %ah\n");
__asm__ __volatile__("int $0x10\n");

/*print letter 'o' onto the screen*/
__asm__ __volatile__("movb $'o' , %al\n");
__asm__ __volatile__("movb $0x0e, %ah\n");
__asm__ __volatile__("int $0x10\n");

/*print character ',' onto the screen*/
__asm__ __volatile__("movb $',' , %al\n");
__asm__ __volatile__("movb $0x0e, %ah\n");
__asm__ __volatile__("int $0x10\n");

/*print character ' ' onto the screen*/
__asm__ __volatile__("movb $' ' , %al\n");
__asm__ __volatile__("movb $0x0e, %ah\n");
__asm__ __volatile__("int $0x10\n");

/*print letter 'W' onto the screen*/
__asm__ __volatile__("movb $'W' , %al\n");
__asm__ __volatile__("movb $0x0e, %ah\n");
__asm__ __volatile__("int $0x10\n");

/*print letter 'o' onto the screen*/
__asm__ __volatile__("movb $'o' , %al\n");
__asm__ __volatile__("movb $0x0e, %ah\n");
__asm__ __volatile__("int $0x10\n");

/*print letter 'r' onto the screen*/
__asm__ __volatile__("movb $'r' , %al\n");
__asm__ __volatile__("movb $0x0e, %ah\n");
__asm__ __volatile__("int $0x10\n");

/*print letter 'l' onto the screen*/
__asm__ __volatile__("movb $'l' , %al\n");
__asm__ __volatile__("movb $0x0e, %ah\n");
__asm__ __volatile__("int $0x10\n");

/*print letter 'd' onto the screen*/
__asm__ __volatile__("movb $'d' , %al\n");
__asm__ __volatile__("movb $0x0e, %ah\n");
__asm__ __volatile__("int $0x10\n");
}


Now save the above code as test3.c, compile it as before with the new source file name, and copy the compiled code to the boot sector of the floppy image. You should see “Hello, World” on the screen if everything went well.





Let us now rewrite the program that prints “Hello, World” onto the screen.

This time we will define a function that prints a whole string.

Example: test4.c

/*generate 16-bit code*/
__asm__(".code16\n");
/*jump boot code entry*/
__asm__("jmpl $0x0000, $main\n");

/* user defined function to print series of characters terminated by null character */
void printString(const char* pStr) {
     while(*pStr) {
          __asm__ __volatile__ (
          "int $0x10" : : "a"(0x0e00 | *pStr), "b"(0x0007)
          );
          ++pStr;
     }
}

void main() {
/* calling the printString function passing string as an argument */
     printString("Hello, World");
}


Now save the above code as test4.c, compile it as before with the new source file name, and copy the compiled code to the boot sector of the floppy image. You should see “Hello, World” on the screen if everything went well.
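
The "a" and "b" constraints in printString load the AX and BX registers before int $0x10 runs. For the BIOS teletype service, AH holds the function number 0x0E, AL the character, BH the display page and BL the foreground color. The host-side sketch below only illustrates how those 16-bit values are packed. The helper names teletype_ax and teletype_bx are hypothetical, and the cast to unsigned char is an addition that assumes plain char may be signed.

```c
#include <stdint.h>

/* Build the AX value for BIOS int 0x10 teletype output:
 * AH = 0x0E (function number), AL = character to print.
 * The cast to unsigned char keeps a character above 0x7F from being
 * sign-extended into AH on platforms where plain char is signed. */
uint16_t teletype_ax(char c) {
    return (uint16_t)(0x0E00u | (unsigned char)c);
}

/* Build the BX value: BH = display page, BL = foreground color
 * (the color is only used in graphics modes). */
uint16_t teletype_bx(uint8_t page, uint8_t color) {
    return (uint16_t)(((unsigned)page << 8) | color);
}
```

For plain ASCII text such as “Hello, World” the expression 0x0e00 | *pStr gives the same result as teletype_ax; the difference only shows up for characters with the high bit set.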





One point is worth noting. All we are doing is converting the assembly programs written earlier into C programs, as a way of learning. By now you should be comfortable writing programs in assembly and C, and know how to compile and test them.

Next, we will move on to writing loops inside functions and look at more BIOS services.