
The MikeOS Handbook

Version 1.0.0, 16 Sep 2007 - (C) MikeOS Developers

This is the all-in-one documentation file for the MikeOS operating system. It introduces the project, explains how to run it on your PC or in an emulator, and then shows you how to compile and develop for it. If you have just downloaded MikeOS and want to try it straight away, go to the Running MikeOS section.

If you want to write your own operating system, go to the Making an OS section which explains how to get started. The version number for this document, as specified above, matches that of the official MikeOS releases. This Handbook is included in the MikeOS release files.

MikeOS is written by Mike Saunders with contributions from other developers. It is released under a BSD-like open source license.



Introduction

Overview

MikeOS is a 16-bit real mode operating system for x86-compatible PCs, written entirely in assembly language, which boots from a floppy disk or CD-ROM. It features a text-based dialog-driven user interface, a command-line, support for FAT12 (MS-DOS-like) floppy disks and sound via the PC speaker. It can load external programs and has over 30 system calls. Additionally, basic DOS .COM program support is included.

We do not plan to turn MikeOS into a general-purpose operating system like Linux; it is designed as a learning tool, to demonstrate how simple operating systems work. You can use it as the basis of your own OS project, or to learn about x86 assembly language. If you would like to delve into OS development, MikeOS is a good place to start - the code is simple and well-documented, and we're happy to include new features and fixes.

Why 16-bit? Modern operating systems use 32-bit protected mode code, which provides vital features such as memory protection and robust multi-tasking. However, the downside with protected mode is that you lose access to the BIOS, the chunk of code that the PC uses to initialise itself. The BIOS provides rudimentary drivers for the keyboard and display; therefore, 32-bit protected mode OSes, which can't access the BIOS, need to have their own drivers.

By sticking with 16-bit real mode, MikeOS can use the BIOS's keyboard and video drivers, thereby keeping the code free of clutter and sticking to the interesting bits: making an OS work. Because MikeOS is a simple, educational operating system, we're not interested in virtual memory and other 32-bit protected mode features. The MikeOS kernel fits into 64K of RAM - plus, it runs from a floppy disk or CD-ROM, so no hard-drive installation is required. You simply write MikeOS to a floppy or CD-R and boot your PC from it to try it out.


History

MikeOS is written almost entirely from scratch, and is not based on any other operating system. A small chunk of code to read FAT12 floppy disks (the MS-DOS type) was taken from a bootloader by E Dehling. The project began in summer 2006, with the 1.0.0 version released in September 2007. Initially, the OS was an enhanced bootloader that read subsequent sectors from a floppy disk and executed them. Following that, FAT12 read support was integrated, so that external programs could be loaded, and then rudimentary DOS support was added via handlers for int 20h and int 21h. Peter Nemeth subsequently improved the DOS routines.


Get involved

As mentioned, we don't plan to turn MikeOS into a complex general-purpose operating system. However, we welcome bugfixes, patches and new features, providing that they retain the overall simplicity of MikeOS and the code is well-documented. Patches which radically alter the structure of MikeOS will be considered providing they don't add vast layers of complexity to the OS! What we'd love to have: more system calls, more hardware support (providing the drivers are small), better DOS compatibility and general bugfixes. Note that in some places, the source code is designed for readability rather than all-out optimisation.

You can help out by downloading the latest release and testing it out on your PC. Please report any bugs to Mike Saunders, and send him any patches you create as well. If you have a basic grounding in x86 assembly, you should find the source easy to navigate after reading the Development section in this document. If you've never done any OS programming before, please read the Making an OS section, followed by the Development section, which will get you started.



Running MikeOS

Emulation

The easiest way to run MikeOS is via a PC emulator. We recommend QEMU, as it's small, free and open source. QEMU emulates a PC with various hardware devices such as a floppy disk and keyboard - perfect for testing and developing an OS. The MikeOS packages include disk images of the operating system; these are virtual floppy disks and CD-ROMs. After all, a disk is just a series of bytes, so it can be represented as a file!

In Linux, open up a terminal and switch into the disk_images directory in the extracted MikeOS package. You'll see that there are files called mikeos.flp and mikeos.iso - the first is a floppy disk image, the second is a CD-ROM image. You can run QEMU with the MikeOS CD-ROM image using:

qemu -cdrom mikeos.iso -boot d

This tells QEMU to use mikeos.iso as if it was a real CD-ROM disc, and to boot from the D drive (historically the CD-ROM device from DOS days). The QEMU virtual PC will start up and begin running MikeOS. Note that QEMU captures the input of your keyboard and mouse, so if you want to get them back for your normal desktop, press Ctrl+Alt on the left-hand side of your keyboard. The source code also includes a script called test-linux.sh which runs QEMU with PC speaker support enabled.

On Windows and Mac OS X, you can use PC emulators such as VMware or the free VirtualBox alternative, booting them from the virtual mikeos.iso CD-ROM image. Please see the Resources section for links.


Real PCs

MikeOS should run on any PC with at least 1MB of RAM and a 486 or later CPU (ie anything built within the last 15 years will be more than adequate).

To run MikeOS on a real PC, simply burn the mikeos.iso CD image to a CD-R. You'll find that file in the disk_images directory of the expanded MikeOS packages. Note that you can't just copy the file onto a disc; you need to burn it as a direct ISO image. On Windows, Nero supports a 'Burn ISO image' option, and on Linux, you can use K3B or the 'cdrecord' command-line utility.

So, burn mikeos.iso to a CD-R and then boot your PC from it. (For most PCs, this simply means restarting your machine with the MikeOS CD in your drive - but if it doesn't boot from that, you may need to press Del, Esc, F2 or another key at the initial BIOS boot screen to change the boot device order. You can then tell the BIOS to boot from CD-ROM rather than the hard drive.)

Note that for very old machines, you can write the mikeos.flp image file to a 1.44MB floppy disk and boot from that. On Windows, search the internet for RAWRITE, a tool that writes disk images to floppies, and its documentation. If you're a seasoned Linux user, you can use the standard 'dd' utility to write the disk image to a floppy.
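
For Linux users, the 'dd' step might look like the sketch below. The helper name write_floppy is made up for this example, and /dev/fd0 is an assumption: it's the usual name for the first floppy drive on Linux, but check your own system first, because dd will happily overwrite whatever device you point it at.

```shell
# write_floppy IMAGE TARGET
# Copies a raw disk image byte-for-byte onto TARGET. TARGET is normally a
# floppy device such as /dev/fd0 (an assumed name, check your system), but
# a plain file works too, which is handy for a dry run.
write_floppy() {
    dd if="$1" of="$2" bs=512
}

# Real use, as root, with a blank floppy in the drive:
#   write_floppy disk_images/mikeos.flp /dev/fd0
```

Trying it against an ordinary file first is a cheap way to confirm the command does what you expect before touching real hardware.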


Usage

When MikeOS starts, you'll see a dialog box offering you the choice of a command-line (CLI) interface, or a menu-driven program selector. Use the cursor keys to select whichever choice you want and hit enter. If you select the CLI, you'll be able to type in commands - enter 'help' to see a list of in-built commands. Enter 'dir' to list the files on the disk, and type a filename to run a program.

Alternatively, if you choose the menu-driven program selector, you can use the cursor keys to select a program from the list and hit enter to run it. Note that the DOS compatibility in MikeOS is very basic, so if you run a DOS program and it doesn't exit correctly, you will need to restart MikeOS.

Some of the programs included on the disk images:

  • MIKEKERN.BIN - the kernel; don't try to run this as it'll crash the system!
  • HELLO.BIN - the classic 'Hello world' program
  • HARDLIST.BIN - shows PC hardware as reported by the BIOS
  • KEYBOARD.BIN - a music keyboard; press the ZXCVBNM (bottom row) keys to play notes, and Q to quit
  • GFXDEMO.BIN - switches to a different video mode and shows a pattern (press Q to quit)
  • DOSTEST.BIN - a tiny DOS COM program that also runs natively on Windows XP!
  • VLAK.BIN - a small freeware DOS game to demonstrate the DOS compatibility

The source for some of these programs can be found in the programs/ directory. If you want to write MikeOS software or test DOS apps (must be .COM programs of 32K or smaller), see the Development section below for information on adding files to the disk images.



Development

This section explains how to build MikeOS, add programs to the disk images and add your own features to the operating system.


Building

As of the current release, only Linux is supported for building MikeOS. (A script for building on DOS/Windows is included, but it has not been tested by this author.) That said, any Unix-like platform should be capable of building it - so if you modify the build script for, say, FreeBSD or Solaris, please let the developers know and we'll include your script.

Build requirements: the NASM assembler, 'mkisofs' utility and root access. We need root access because we loopback-mount the floppy disk image to insert our files.
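
Before running the build, it can save time to confirm the tools are actually installed. This is a small sketch, not part of the MikeOS package: missing_tools is a hypothetical helper that prints the name of every listed program it can't find on your PATH.

```shell
# missing_tools TOOL...
# Prints the name of each listed tool that is not found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Usage before building (prints nothing when both are installed):
#   missing_tools nasm mkisofs
```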

To build MikeOS, open a terminal and switch into the expanded MikeOS package. Enter sudo bash in Ubuntu-flavoured distros, or just su in others, to switch to the root user. Then enter:

./build-linux.sh

This will use NASM to assemble the bootloader, kernel and supplied programs, then write the bootloader to the mikeos.flp floppy disk image in the disk_images/ directory. (It writes the 512-byte bootloader to the first sector of the floppy disk image to create a Master Boot Record (MBR) and set up a DOS-like filesystem.) Next, the build script loopback-mounts the mikeos.flp image onto the filesystem - in other words, mounting the image as if it was a real floppy. The script copies over the kernel (mikekern.bin) and binaries from the programs/ directory, before unmounting the floppy image.
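
To see what "writing the bootloader to the first sector" means in practice, here is a self-contained sketch using stand-in files in a temporary directory. It is one common way of doing this with 'dd', not a copy of build-linux.sh, which may do it differently; test.flp and boot.bin are hypothetical names, and the "bootloader" is 512 bytes of dummy data.

```shell
# Work in a scratch directory so nothing in the MikeOS tree is touched.
work=$(mktemp -d)

# A blank 1.44MB floppy image: 2880 sectors of 512 bytes each.
dd if=/dev/zero of="$work/test.flp" bs=512 count=2880 2>/dev/null

# Stand-in for bootload.bin: 512 bytes of the letter B (not real code).
dd if=/dev/zero bs=512 count=1 2>/dev/null | tr '\0' 'B' > "$work/boot.bin"

# conv=notrunc overwrites only the first sector and leaves the rest of
# the image, and therefore its size, untouched.
dd if="$work/boot.bin" of="$work/test.flp" bs=512 count=1 conv=notrunc 2>/dev/null

image_size=$(wc -c < "$work/test.flp")
first_byte=$(head -c 1 "$work/test.flp")
echo "image size: $image_size bytes, first byte: $first_byte"
```

Without conv=notrunc, dd would cut the image down to 512 bytes, which is an easy mistake to make when adapting the build for another platform.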

With that done, the script runs the 'mkisofs' utility to generate a CD-ROM ISO image of MikeOS, injecting the floppy image as a boot section. So we end up with two files in the disk_images/ directory: one for floppy disks and one for CD-Rs. You can now use them in an emulator or on a real PC as described in the Running section above.


Adding files

To add programs to the MikeOS disk images, you first have to add them to the floppy image. You can use this method to test out DOS programs (must be in .COM format and 32K or smaller). In the MikeOS main directory, enter the following commands as root:

mkdir looptmp
mount -o loop -t vfat disk_images/mikeos.flp looptmp

Now the contents of the MikeOS virtual floppy disk image are accessible in the newly-created looptmp/ directory. (We have loopback-mounted the disk image onto our filesystem.) Copy your programs into that directory, for example:

cp ~/DOSPROG.BIN looptmp/

When you're done, unmount the virtual floppy image and remove the temporary directory:

umount looptmp
rm -rf looptmp

Now you can run the build-linux.sh script again; the final stage of this generates the CD-ROM ISO image. So not only have you updated the floppy disk image, but the CD version has been updated too, ready for testing. Note: you can't just loopback mount the CD ISO image and add files to it; MikeOS has no concept of a CD filesystem. Everything is contained in the boot block of the CD - it's effectively the floppy image written as a special boot section. So when the CD boots up, it actually thinks it's booting from a floppy disk!
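
Since DOS programs must be 32K or smaller, a quick size check before copying can save a confusing crash later. A minimal sketch, with fits_program_area as a made-up helper name:

```shell
# fits_program_area FILE
# Succeeds if FILE is no larger than 32K (32768 bytes), the limit this
# Handbook gives for programs.
fits_program_area() {
    [ "$(wc -c < "$1")" -le 32768 ]
}

# Usage, while the floppy image is mounted on looptmp/:
#   fits_program_area ~/DOSPROG.BIN && cp ~/DOSPROG.BIN looptmp/
```

The 32768 figure is simply the limit quoted above; as programs start executing at 0x0100 within the 32K area, the practical ceiling may be slightly lower.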


Structure

These are the contents of the MikeOS tarball/Zip file:

  • LICENSE.TXT - The license under which MikeOS code is released
  • README.TXT - A quick snippet of documentation
  • build-linux.sh - A script for building MikeOS on Linux
  • disk_images/mikeos.flp - The floppy disk image
  • disk_images/mikeos.iso - The CD-ROM ISO image
  • doc/handbook.html - This documentation file!
  • doc/CHANGES.TXT - A list of MikeOS changes
  • doc/CREDITS.TXT - The people behind MikeOS
  • dosbuild.bat - A script for building on DOS/Windows
  • programs/ - Directory containing example MikeOS software
  • source/bootload.asm - Source code for the bootloader
  • source/os_cli.asm - Source for the command-line interface
  • source/os_dos.asm - Source for the DOS compatibility routines
  • source/os_main.asm - The core kernel source code file
  • source/syscalls.asm - Source for MikeOS system calls

Note that you may see .bin files in various places; these are simply assembled programs for writing to the disk images. For instance, in the source/ directory you may see mikekern.bin which is the assembled binary kernel file.

By far the most important file is os_main.asm in the source/ directory. This is the core kernel file - you may notice that it's quite small. This is because it's not the entire MikeOS kernel source code; to keep things manageable, we use the %INCLUDE directive to NASM to add other source files during assembly.

So, os_main.asm actually includes os_cli.asm, os_dos.asm and syscalls.asm. When the kernel is assembled, NASM lumps all the files together and converts them to our 64K mikekern.bin file.

As described in the Building section, the build script also assembles the bootloader into a separate file (bootload.bin), and then assembles any .asm files it finds in the programs/ directory.

So, in summary: the main source file is os_main.asm which pulls in other source code files to make the kernel. The bootloader is a separate assembly file, and assembles to a 512-byte Master Boot Record (MBR) which is added to the floppy and CD images. This boot record is executed by the BIOS when a PC starts, and it goes on to load the kernel (mikekern.bin).


Code path

This section describes what happens when MikeOS is running.

The PC BIOS loads our 512-byte bootloader at memory location 0x7C00 (segment 0x07C0, offset 0) and jumps to that point to begin executing it. In bootload.asm, we execute a jmp instruction to jump over the following data section. This data section is a disk header to describe the layout of the disk in DOS FAT format - it's not important for us, but means that we can access the disk just like any other DOS disk.

After that, at bootloader_start, we set up the stack and data registers to make sure that the CPU is pulling data from the right place and not overwriting our code with the stack. The remainder of the bootloader is a DOS file-reading routine which reads the kernel from our disk (mikekern.bin) and loads it into memory at 0x2000:0x0000.

(The 0x2000 defines the segment, and 0x0000 is the offset within that segment, here its very start. Segments are incredibly boring and messy; they're designed for working with memory in 64K chunks. Because the MikeOS kernel is 64K in size, we just load it into a single segment, and then never venture outside of that segment for simplicity reasons. From here onwards, we don't have to bother with segments at all.)

So, the kernel is loaded at 0x2000:0x0000 in memory. Our bootloader is ready to jump to the kernel code and begin executing it, but it actually jumps to 0x2000:0x8000 instead. Why is this? Well, as mentioned, the MikeOS kernel is 64K. However, the first 32K is blank, zeroed-out empty space - we use this later for loading and running programs. So we need to skip past the first 32K and start executing the second half of the kernel - hence the jmp instruction towards the end which starts executing at point 0x8000 in the kernel segment.
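
If you're curious where these segment:offset pairs land in physical memory, real mode uses a simple rule: multiply the segment by 16 and add the offset. The sketch below does that arithmetic in the shell; phys_addr is a hypothetical helper name, not something from the MikeOS sources.

```shell
# phys_addr SEGMENT OFFSET
# Real-mode address translation: physical = segment * 16 + offset.
phys_addr() {
    printf '0x%05X\n' $(( $1 * 16 + $2 ))
}

phys_addr 0x2000 0x0000   # start of the kernel segment: prints 0x20000
phys_addr 0x2000 0x8000   # where the bootloader jumps:  prints 0x28000
```

So the kernel's 64K segment begins 128K into memory, and execution starts 32K above that.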

When the kernel has loaded, this is the 64K memory map of MikeOS:

0xE000 - 0xFFFF: 8K scratchpad
0x8000 upwards:  System call vectors, then kernel code (system calls, menu, DOS compatibility, etc.)
0x0000 - 0x7FFF: 32K space for loading programs

There we can see that the bottom 32K is blank space for loading and executing programs. That's why our bootloader jumps to 0x8000 - the start of the top 32K! (We also have a blank 8K of space at the top of our kernel segment - it's an empty area that programs can use as a buffer.)
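
As a sanity check on the memory map, the sketch below adds up the three regions. The kernel range of 0x8000 to 0xDFFF is an inference (everything between the program area and the scratchpad) rather than a figure stated in this Handbook, and region_size is a made-up helper name.

```shell
# region_size FIRST LAST
# Number of bytes in the inclusive address range FIRST..LAST.
region_size() {
    echo $(( $2 - $1 + 1 ))
}

programs=$(region_size 0x0000 0x7FFF)   # program loading area
kernel=$(region_size 0x8000 0xDFFF)     # vectors + kernel code (inferred range)
scratch=$(region_size 0xE000 0xFFFF)    # scratchpad buffer
echo "programs=$programs kernel=$kernel scratch=$scratch total=$(( programs + kernel + scratch ))"
```

Under that assumption the three regions fill the 64K segment exactly, leaving 24K for the vectors and kernel code.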

At the proper start of our kernel, at 0x8000, we have a series of vectors to system calls. These are right at the beginning of the kernel code for one reason: they don't move. For instance, a program may want to use the MikeOS os_print_string routine, but it doesn't know exactly where in the kernel it is. But if we have a call to it at the beginning of the kernel, a program can always find it!

A program can call 0x8003 and expect that to be the os_print_string routine. (It's not the routine itself, but another call to the real routine at some random position in the kernel.) Because this is the start of the kernel, and we're not going to add code before this point, these vectors always remain in the same place. Look at the file mikedev.inc in the programs/ directory: it shows where all the system call vectors are. In summary, it means people can write MikeOS programs and use its system calls without worrying about changes to the kernel layout. There are lots of system calls available to programs - manipulating strings, getting input, even turning on the PC speaker! See syscalls.asm for their implementations.

As mentioned, the bootloader jumps into the kernel at 0x8000. If you look in os_main.asm in the source/ directory, you'll see that 0x8000 follows the 32K of padded-out space, and we have a jmp instruction to skip past the vectors. We jump to the os_main label, which is where our kernel starts execution:

; =================================================================
; START OF KERNEL CODE

os_main:
	cli
	mov ax, 0
	mov ss, ax                      ; Set stack segment and pointer
	mov sp, 0xF000
	sti

	...

This is where it all begins. Those first five lines give us plenty of room for the stack, so that we don't end up overwriting important code or data when we're pushing and popping. We're ready to go.

	mov cx, 00h                     ; Divide by 0 error handler
	mov si, os_compat_int00
	call os_modify_int_handler

	mov cx, 20h                     ; Set up DOS compatibility...
	mov si, os_compat_int20         ; ...for interrupts 20h and 21h
	call os_modify_int_handler

	mov cx, 21h
	mov si, os_compat_int21
	call os_modify_int_handler

These next nine lines set up our interrupt handlers. MikeOS does not use interrupts for system calls like MS-DOS; it uses direct call instructions as per the PcW 16 Rosanne OS. For DOS compatibility, however, we want to catch 20h and 21h interrupts. MikeOS includes a system call to change interrupt handlers - so as you can see here, we put the number of the interrupt we want to change into CX, and the position in code it should point to in SI, and then call our special routine. Now DOS calls will be handled correctly.

After this, we set up the screen (generic text video mode with block cursor) and then display a dialog box asking whether the user wants a command-line (CLI) or menu-driven interface. We use our own os_dialog_box routine for this - indeed, we can use anything in syscalls.asm. External programs can use the routines in syscalls.asm providing those routines also have vectors in the start of the kernel (the top 32K as described before), and a label for those vectors in mikedev.inc.

So, the kernel pops up a dialog box, and if the user chooses a command-line, the kernel jumps to the code from os_cli.asm. If the user chooses a menu, we continue a bit further down the main code, where we draw a menu and execute the specified program. This program is loaded into the first 32K of RAM, and the kernel's os_program_load routine jumps to 0x0100 - this is the code starting point for DOS and MikeOS apps.

Let's summarise the MikeOS kernel execution path:

  1. The 512-byte Master Boot Record (bootload.bin) loads mikekern.bin into a 64K segment and starts executing it at the upper 32K (0x8000).
  2. The kernel skips over the system call vectors, which are used by external programs, and sets up the DOS interrupt handlers and the screen.
  3. Users can then choose a command-line interface or a menu-driven program selector.
  4. Either way, the selected program is loaded into the first 32K of RAM, and the kernel jumps to 0x0100 to begin executing it.
  5. Programs can use the MikeOS system call vectors at 0x8000 and upwards to access routines in the kernel.
  6. When a program has finished, it uses ret to go back to the os_program_load routine.

Note on DOS compatibility: the DOS routines are extremely basic at present, providing compatibility for a handful of int 21h system calls. They are enough to run some small games and tools - even VisiCalc starts up, although it has problems when running. Peter Nemeth has improved these calls immensely, but there's still a lot missing. If you're interested in helping out, cool! If you'd rather stick with the main MikeOS code, though, never fear: just ignore os_dos.asm completely.


System calls

The vast majority of the code in MikeOS deals with system calls. These are standalone routines that the MikeOS kernel and external programs can call, to minimise code duplication and create a basic API. Look at syscalls.asm in the source/ directory - it has many code chunks for writing text to the screen, manipulating strings, converting numbers, running programs, accessing the PC speaker and so forth.

Adding new system calls is easy and fun - it extends MikeOS! So if you want to help out, this is the best way to start. Open up syscalls.asm in a text editor and paste in the following after the header text:

; -----------------------------------------------------------------
; os_say_hello -- Prints 'Hello' to the screen
; IN/OUT: Nothing

os_say_hello:
	pusha

	mov si, .message
	call os_print_string

	popa
	ret

	.message db 'Hello', 0

There we have it: a new system call that prints 'Hello' to the screen. Hardly a much-needed feature, but it's a starting point. The first three lines are comments explaining what the call does, and what registers it accepts or returns (like variable passing in high-level languages). Then we have the os_say_hello: label which indicates where the code starts, before a pusha.

All system calls should start with pusha and end with popa before ret: this stores registers on the stack at the start, and then pops them off at the end, so that we don't end up changing a bunch of registers and confusing the calling program. (If you're passing back a value, say in AX, you should store AX in a temporary word and drop it back in between the popa and ret, as seen in os_wait_for_key.)

The body of our code simply places the location of our message string into the SI register, then calls another MikeOS routine, os_print_string. You can freely call other routines from your own system call.

Once we've done this, we can access this routine throughout the kernel. But what about external programs? They have no idea where this call is in the kernel! The trick we use is vectors - a series of call and ret snippets at the start of our kernel code, which jump to these routines. Because these vectors are at the start, they never change their position, so we always know where they are.

For instance, right now, your new system call may be at 0x9B9D in the kernel. But if you add another call above, or someone else does, it may move to 0x9FA6 in the kernel binary. We simply don't know where it's going to be. But if we put a vector at the start of our kernel, before anything else happens, we can use that as the entry point, because the vector will never move!

Open up os_main.asm and scroll down to the list of system call vectors. You can see they start from 0x8000. Scroll to the bottom of the list and you'll see something like this:

	call os_file_selector		; 0x807B
	ret

The comment here indicates where this bit of code lies in the kernel binary. Once again, it's static, and basically says: if your program wants to call os_file_selector, it should call 0x807B, as this points to that routine and will never change position.

Let's add a vector to our new call. Add this beneath those two lines:

	call os_say_hello		; 0x807F
	ret

How do we know it's at 0x807F in the kernel binary? Well, just follow the pattern above: each call and ret pair takes up four bytes, so the new vector sits four bytes after the one at 0x807B. If you're unsure, or some vector above is more complicated than a simple call and ret, you can always use ndisasm to disassemble the kernel and look for the location of the final call in the list.
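
The pattern comes from instruction sizes: in 16-bit code a near call occupies three bytes and a ret occupies one. The sketch below works out vector addresses on that basis. It assumes the first vector (os_print_string) is at 0x8003 and that every vector is a plain call/ret pair; vector_addr is a hypothetical helper, and the index 30 for os_file_selector is derived from the addresses quoted here, not read from the source.

```shell
# vector_addr INDEX
# Address of the INDEX-th system call vector (counting from 0), assuming
# the first vector is at 0x8003 and each call/ret pair takes 4 bytes.
vector_addr() {
    printf '0x%04X\n' $(( 0x8003 + 4 * $1 ))
}

vector_addr 0    # os_print_string:               prints 0x8003
vector_addr 30   # os_file_selector:              prints 0x807B
vector_addr 31   # next free slot (os_say_hello): prints 0x807F
```

If any vector in the list is not a simple call and ret, this arithmetic no longer holds and disassembling the kernel is the safer route.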

That's all well and good, but there's one last thing: people writing external programs don't want to call an ugly number like 0x807F when they run our routine. They want to access it by name, so we need to add a line to mikedev.inc in the programs/ directory:

os_say_hello	equ	0x807F	; Prints 'Hello' to screen

Now, any program that includes mikedev.inc will be able to call our routine by name. Et voila: a brand new system call for MikeOS!


Programs

If you want to write a program for MikeOS, here's where to start. The OS has a number of system calls to handle the screen, manage strings, convert numbers, make sounds from the PC speaker and more. You can see a quick list of these in mikedev.inc in the programs/ directory - for their full implementations, see syscalls.asm in source/.

To have your new program assembled and included on the disk images as part of the build process, create it in the programs/ directory. For instance, create a text file there called coolapp.asm with the following contents:

	BITS 16
	%INCLUDE 'mikedev.inc'
	ORG 100h

start:
	mov si, mystring
	call os_print_string

	ret

	mystring db 'My first MikeOS program!', 0

This is a tiny program that prints a message on the screen. The first three lines aren't x86 CPU instructions but directives to the NASM assembler, telling it that we're in 16-bit mode, we want to use the system calls specified in mikedev.inc, and the code should be assembled as if it starts at offset 100h (256 in decimal), which is the point the kernel jumps to when it runs a program.

The start: label isn't mandatory but useful for clarity. In the following two instructions we put the location of our text string into the SI register, then call the MikeOS string printing routine. Finally, the ret returns execution back to the OS.

Build MikeOS as described at the start of the Development section, and when you run it, you'll now see COOLAPP.BIN in the list of files on the disk. Run it to see! You can now extend your program with more features - see the other example programs as a guide.



Making an OS

This is a short guide to operating system development. It is by no means exhaustive, and in writing an OS you'll need to consult many different sources on the internet, but this will give you an overview of the processes and skills involved.


Introduction

Writing an operating system is hard work: you need programming experience, in-depth hardware knowledge and patience. Lots and lots of patience. Unlike, say, programming in Visual Studio, where you're supported by debuggers and documentation galore, OS programming drops you onto the bare metal. You have no libraries to work with - nor any helpful error-catching facilities. When something goes wrong, you'll most likely end up with a scrambled screen and frozen computer.

Yet it's these aspects of OS programming that also make it great fun and tremendously rewarding. You're writing something from scratch, not having to rely on anyone else's libraries to do the grunt work. You have complete control of the machine without some annoying OS routine popping up and blocking your way. The computer does exactly what you tell it to do.

Bear in mind that OSes such as Windows and Linux have taken many years (and thousands of man-hours) to get where they are today. Consequently, it's best to head for something realistic: a small hobby OS like MikeOS. Once you have the basics up-and-running, you can then try to get others involved and start to build up a project.

To start writing your own OS, you need:

  • Programming experience. If you're proficient in PHP, JavaScript, C# or some other high-level language, that's good, but you should learn something more low-level. After all, in writing an OS you'll be coding to the bare hardware, so you won't have any libraries around to help. Learn C - especially pointers and arrays - and you'll have a solid grip on memory management. This Handbook is focused on assembly language (a human-readable form of machine code) but C is, in some respects, a portable wrapper around assembly, so it really helps.

  • Linux. OS development is certainly possible on Windows, but it's so much easier on Linux as you can get a complete development toolchain in a few mouse-clicks/commands. Linux is also really good for making floppy disk and CD-ROM images - you don't need to install loads of fiddly programs. Installing Linux is a doddle these days; grab Ubuntu and install it in VMware or VirtualBox if you don't want to dual-boot. When you're in Ubuntu, get all the tools you need to follow this guide by entering this in a terminal window:
    sudo apt-get install build-essential qemu nasm
    This gets you the development toolchain (compiler etc.), QEMU PC emulator and the NASM assembler, which converts assembly language into raw machine code executable files.

  • Patience. As mentioned, OS programming is tough work, but very rewarding if you get things right. Even after a few weeks, you may only have a simple command-line to show for your efforts. But you're doing something novel - something more challenging and stimulating than writing yet another IRC client!

If you've got these things, you're ready to dive into the world of OS coding!


PC primer

If you're writing an OS for x86 PCs (the best choice, due to the huge amount of documentation available), you'll need to understand the basics of how a PC starts up. Fortunately, you don't need to dwell on complicated subjects such as graphics drivers and network protocols, as you'll be focusing on the essential parts first.

When a PC is powered-up, it starts executing the BIOS (Basic Input/Output System), which is essentially a mini-OS built into the system. It performs a few hardware tests (eg memory checks) and typically spurts out a graphic (eg Dell logo) or diagnostic text to the screen. Then, when it's done, it starts to load your operating system from any media it can find. Many PCs jump to the hard drive and start executing code they find in the Master Boot Record (MBR), a 512-byte section at the start of the hard drive; some try to find executable code on a floppy disk or CD-ROM.

This all depends on the boot order - you can normally specify it in the BIOS options screen. The BIOS loads 512 bytes from the chosen media into memory, and begins executing it. This is the bootloader, the small program that then loads the main OS kernel or a larger boot program (eg GRUB/LILO for Linux systems). This 512 byte bootloader has two special numbers at the end to tell the BIOS that it's a bootable Master Boot Record - we'll cover that later.

Note that PCs have an interesting feature for booting. Historically, most PCs had a floppy drive, so the BIOS was configured to boot from that device. Today, however, many PCs don't have a floppy drive - only a CD-ROM - so a hack was developed to cater for this. When you're booting from a CD-ROM, it can emulate a floppy disk; the BIOS reads the CD-ROM drive, loads in a chunk of data, and executes it as if it was a floppy disk. This is incredibly useful for us OS developers, as we can make floppy disk versions of our OS, but still boot it on CD-only machines. (Floppy disks are really easy to work with, whereas CD-ROM filesystems are much more complicated.)

So, to recap, the boot process is:

  1. Power on: the PC starts up and begins executing the BIOS code.
  2. The BIOS looks for various media such as a floppy disk, CD-ROM or hard drive.
  3. The BIOS loads 512 bytes (the MBR) from the specified media and begins executing it.
  4. Those 512 bytes then go on to load the OS itself, or a more complex bootloader.

For MikeOS, we have the 512-byte bootloader, which we write to a floppy disk image file (a virtual floppy). We can then inject that floppy image into a CD, for PCs that only have CD-ROM drives. Either way, the BIOS loads it as if it was on a floppy, and starts executing it. We have control of the system!


Assembly language primer

Most modern operating systems are written in C/C++. That's very useful when portability and code-maintainability are crucial, but it adds an extra layer of complexity to the proceedings. For your very first OS, you're better off sticking with assembly language, as used in MikeOS. It's more verbose and non-portable, but you don't have to worry about compilers and linkers. Besides, you need a bit of assembly to kick-start any OS.

Assembly language (or colloquially "asm") is a textual way of representing the instructions that a CPU executes. For instance, an instruction to move some memory in the CPU may be 11001001 01101110 - but that's hardly memorable! So assembly provides mnemonics to substitute for these instructions, such as mov ax, 30. They correlate directly with machine-code CPU instructions, but without the meaningless binary numbers.

Like most programming languages, assembly is a list of instructions followed in order. You can jump around between various places and set up subroutines/functions, but it's much more minimal than C# and friends. You can't just print "Hello world" to the screen - the CPU has no concept of what a screen is! Instead, you work with memory, manipulating chunks of RAM, performing arithmetic on them and putting the results in the right place. Sounds scary? It's a bit alien at first, but it's not hard to grasp.

At the assembly language level, there is no such thing as variables in the high-level language sense. What you do have, however, is a set of registers, which are on-CPU memory stores. You can put numbers into these registers and perform calculations on them. In 16-bit mode, these registers can hold numbers between 0 and 65535. Here's a list of the fundamental registers on a typical x86 CPU:

AX, BX, CX, DX - General-purpose registers for storing numbers that you're using. For instance, you may use AX to store the character that has been pressed on the keyboard, while using CX to act as a counter in a loop. (Note: these 16-bit registers can be split into 8-bit registers such as AH/AL, BH/BL etc.)
SI, DI - Source and destination index registers. These point to places in memory for retrieving and storing data.
SP - The Stack Pointer (explained in a moment).
IP - The Instruction Pointer. This contains the location in memory of the instruction being executed. When an instruction has finished, it is incremented and moves on to the next instruction. You can change the contents of this register to move around in your code.
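To make the note about 8-bit halves concrete, here is a minimal sketch using the mov instruction (described further down). The values are arbitrary and chosen only for illustration:

```nasm
	BITS 16

	mov ax, 1234h		; AX = 1234h, so AH = 12h and AL = 34h
	mov al, 56h		; Change only the low byte: AX is now 1256h
	mov ah, 0		; Clear only the high byte: AX is now 0056h
```

AH and AL aren't separate registers - they're simply the upper and lower bytes of AX, so changing one changes AX as well.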

So you can use these registers to store numbers as you work - a bit like variables, but they're much more fixed in size and purpose. There are a few others, notably segment registers. Due to limitations in old PCs, memory was handled in 64K chunks called segments. This is a really messy subject, but thankfully you don't have to worry about it - for the time being, your OS will be less than a kilobyte anyway! In MikeOS, we limit ourselves to a single 64K segment so that we don't have to mess around with segment registers.

The stack is an area of your main RAM used for storing temporary information. It's called a stack because numbers are stacked one-on-top of another. Imagine a Pringles tube: if you put in a playing card, an iPod Shuffle and a beermat, you'll pull them out in the reverse order (beermat, then iPod, and finally playing card). It's the same with numbers: if you push the numbers 5, 7 and 15 onto the stack, you will pop them out as 15 first, then 7, and lastly 5. In assembly, you can push registers onto the stack and pop them out later - it's useful when you want to store temporarily the value of a register while you use that register for something else.
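The 5, 7 and 15 example above could be sketched like this, using the push and pop instructions. The choice of registers is arbitrary:

```nasm
	BITS 16

	mov ax, 5
	push ax			; Stack now holds: 5
	mov ax, 7
	push ax			; Stack now holds: 5, 7
	mov ax, 15
	push ax			; Stack now holds: 5, 7, 15

	pop bx			; BX = 15 (last in, first out)
	pop cx			; CX = 7
	pop dx			; DX = 5
```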

PC memory can be viewed as a line of pigeon-holes ranging from byte 0 to whatever you have installed (millions of bytes on modern machines). At byte number 53,634,246 in your RAM, for instance, you may have your web browser code to view this document. But whereas we humans count in powers of 10 (10, 100, 1000 etc. - decimal), computers are better off with powers of two (because they're based on binary). So we use hexadecimal, which is base 16, as a way of representing numbers. See this chart to understand:

Decimal      0  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20
Hexadecimal  0  1  2  3  4  5  6  7  8  9  A   B   C   D   E   F   10  11  12  13  14

As you can see, whereas our normal decimal system uses 0 - 9, hexadecimal uses 0 - F in counting. It's a bit weird at first, but you'll get the hang of it. In assembly programming, we identify hexadecimal (hex) numbers by tagging a 'h' onto the end - so 0Ah is hex for the number 10. (You can also denote hexadecimal in assembly by prefixing the number with 0x - for instance, 0x0A.)
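As a quick illustration of these notations in NASM, the first three lines below all load the same value into AX:

```nasm
	BITS 16

	mov ax, 10		; Decimal 10
	mov ax, 0Ah		; Hex with 'h' suffix - same value
	mov ax, 0x0A		; Hex with '0x' prefix - same value
	mov ax, 0FFh		; 255 in decimal
```

Note the leading zero in 0FFh: with the 'h' suffix, NASM needs a hex number to start with a digit, otherwise it reads FFh as the name of a label.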

Let's finish off with a few basic assembly instructions. These move data around, compare values and perform calculations. They're the building blocks of your OS - there are hundreds of instructions, but you don't have to memorise them all, because the most important handful are used 90% of the time.

mov - Copies data from one location or register to another. For instance, mov ax, 30 places the number 30 into the AX register. Using square brackets, you can get the number at the memory location pointed to by the register. For instance, if BX contains 80, then mov ax, [bx] means "get the number in memory location 80, and put it into AX". You can move numbers between registers too: mov bx, cx.
add / sub - Add a number to, or subtract a number from, a register. add ax, 0FFh adds FF in hexadecimal (255 in our normal decimal) to the AX register. (The leading zero is needed so that the assembler doesn't mistake FFh for a label.) You can use sub in the same way: sub dx, 50.
cmp - Compares a register with a number. cmp cx, 12 compares the CX register with the number 12. It then updates FLAGS, a special register on the CPU that contains information about the last operation. In this case, if the number 12 is bigger than the value in CX, the comparison generates a negative result, and notes that negative in the FLAGS register. We can use this in the following instructions...
jmp / jg / jl... - Jump to a different part of the code. jmp label jumps (GOTOs) to the part of our source code where we have label: written. But there's more - you can jump conditionally, based on the CPU flags set in the previous command. For instance, if a cmp instruction determined that a register held a smaller value than the one with which it was compared, you can act on that with jl label (jump if less-than to label). Similarly, jge label jumps to 'label' in the code if the value in the cmp was greater-than or equal to its compared number.
int - Interrupt the program and jump to a specified place in memory. Operating systems set up interrupts which are analogous to subroutines in high-level languages. For instance, in MS-DOS, the 21h interrupt provides DOS services (eg opening a file). Typically, you put a value in the AX register, then call an interrupt and wait for a result (passed back in a register too). When you're writing an OS from scratch, you can call the BIOS to perform basic tasks: int 10h for printing characters to the screen, int 13h for reading sectors from a floppy disk etc.

Let's look at some of these instructions in a little more detail. Consider the following code snippet:

	mov bx, 1000h
	mov ax, [bx]
	cmp ax, 50
	jge label
	...

label:
	mov ax, 10

In the first instruction, we move the number 1000h into the BX register. Then, in the second instruction, we store in AX whatever's in the memory location pointed to by BX. This is what the [bx] means: if we just did mov ax, bx it'd simply copy the number 1000h into the AX register. But by using square brackets, we're saying: don't just copy the contents of BX into AX, but copy the contents of the memory address to which BX points. Given that BX contains 1000h, this instruction says: find whatever is at memory location 1000h, and put it into AX.

So, if the memory at location 1000h contains 37, then that number 37 will be put into the AX register via our second instruction. (Because AX is a 16-bit register, mov ax, [bx] actually reads a word - the two bytes at 1000h and 1001h.) Next up, we use cmp to compare the number in AX with the number 50 (the decimal number 50 - we didn't suffix it with 'h'). The following jge instruction acts on the cmp comparison, which has set the FLAGS register as described earlier. The jge label says: if the result from the previous comparison is greater than or equal, jump to the part of the code denoted by label:. So if the number in AX is greater than or equal to 50, execution jumps to label:. If not, execution continues at the '...' stage.
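The same cmp-then-jump pattern can be used to build a loop. Here is a minimal sketch that counts from 0 to 10 in CX; the count_up label and the limit of 10 are made up for this example:

```nasm
	BITS 16

	mov cx, 0		; CX is our counter, starting at zero

count_up:
	add cx, 1		; Add one to the counter
	cmp cx, 10		; Compare the counter with decimal 10
	jl count_up		; If CX is less than 10, go round again

	mov ax, cx		; We only get here once CX has reached 10
```

Each time round, jl looks at the flags set by the cmp just before it. When CX reaches 10 it is no longer less than 10, so the jump isn't taken and execution falls through to the next instruction.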

One last thing: you can insert data into a program with the db (define byte) directive. For instance, this defines a series of bytes with the number zero at the end, representing a string:

	mylabel: db 'Message here', 0

In our assembly code, we know that a string of characters, terminated by a zero, can be found at the mylabel: position. We could also set up a single byte to use somewhat like a variable:

	foo: db 0

Now foo: points at a single byte in the code, which in the case of MikeOS will be writable as the OS is copied completely to RAM. So you could have this instruction:

	mov al, [foo]

This moves the byte pointed to by foo into the AL register.
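Going the other way works too. The following is a small sketch of reading, changing and writing back such a byte. It assumes, as in the bootloader later in this section, that DS already points at the segment where the code is loaded:

```nasm
	BITS 16

	mov al, [foo]		; AL = 0, the byte's starting value
	add al, 5		; AL = 5
	mov [foo], al		; Store AL back into the byte at foo
	add byte [foo], 1	; Or change it in place: foo now holds 6

	jmp $			; Stop here so the CPU doesn't run into the data

	foo: db 0
```

The 'byte' in add byte [foo], 1 is needed because there is no register in that instruction to tell NASM whether we mean one byte or two.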

That's the essentials of x86 PC assembly language, and enough to get you started. When writing an OS, though, you'll need to learn much more as you progress, so see the Resources section for links to more in-depth assembly tutorials.


Your first OS

Now you're ready to write your first operating system kernel! Of course, this is going to be extremely bare-bones, just a 512-byte MBR as described earlier, but it's a starting point for you to expand further. Paste the following code into a file called myfirst.asm and save it into your home directory - this is the source code to your first OS.

	BITS 16

start:
	mov ax, 07C0h		; Set up 4K stack space after this bootloader
	add ax, 288		; (4096 + 512) / 16 bytes per paragraph
	mov ss, ax
	mov sp, 4096

	mov ax, 07C0h		; Set data segment to where we're loaded
	mov ds, ax


	mov si, text_string	; Put string position into SI
	call print_string	; Call our string-printing routine

	jmp $			; Jump here - infinite loop!


	text_string db 'This is my cool new OS!', 0


print_string:			; Routine: output string in SI to screen
	mov ah, 0Eh		; int 10h 'print char' function

.repeat:
	lodsb			; Get character from string
	cmp al, 0
	je .done		; If char is zero, end of string
	int 10h			; Otherwise, print it
	jmp .repeat

.done:
	ret


	times 510-($-$$) db 0	; Pad remainder of MBR sector with 0s
	dw 0xAA55		; The standard PC boot signature

Let's step through this. The BITS 16 line isn't an x86 CPU instruction; it just tells the NASM assembler that we're working in 16-bit mode. NASM can then translate the following instructions into raw x86 binary. Then we have the start: label, which isn't strictly needed as execution begins right at the start of the file anyway, but it's a good marker. From here onwards, note that the semicolon (;) character is used to denote non-executable text comments - we can put anything there.

The following six lines of code aren't really of interest to us - they simply set up the segment registers so that the CPU knows where our handy stack of temporary data is (SS and SP), and where the data segment (DS) is located. As mentioned, segments are a hideously messy way of handling memory from the old 16-bit days, but we just set up the segment registers and forget about them. (07C0h is the segment at which the BIOS loads our code - that's memory location 7C00h, as segment numbers count in steps of 16 bytes - so we start from there.)
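If you're curious how a segment number maps to a real memory location, the arithmetic can be sketched with a couple of NASM constants. LOAD_SEGMENT and LOAD_ADDRESS are names invented for this example, not part of MikeOS:

```nasm
	BITS 16

	LOAD_SEGMENT	equ 07C0h		; Segment where the BIOS puts our code
	LOAD_ADDRESS	equ LOAD_SEGMENT * 16	; Memory location: 07C0h * 16 = 7C00h

	mov ax, LOAD_SEGMENT
	mov ds, ax		; DS:0000 now refers to memory location 7C00h
```

In other words, a location in memory is the segment multiplied by 16, plus an offset within that segment.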

The next part is where the fun happens. The mov si, text_string line says: copy the location of the text string below into the SI register. Simple enough! Then we use call, which is like a GOSUB in BASIC or a function call in C. It means: jump to the specified section of code, but prepare to come back here when we're done.

How does the code know how to do that? Well, when we use a call instruction, the CPU takes the location of the next instruction from the IP (Instruction Pointer) register and pushes it onto the stack before jumping. You may recall from the previous explanation of the stack that it's a last-in first-out memory storage mechanism. All that business with the stack pointer (SP) and stack segment (SS) at the start cleared a space for the stack, so that we can drop temporary numbers there without overwriting our code.

So, the call print_string says: jump to the print_string routine, but push the location of the next instruction onto the stack, so we can pop it off later and resume execution here. Execution has jumped over to print_string: - this routine uses the BIOS to output text to the screen. First we put 0Eh into the AH register (the upper byte of AX). Then we have a lodsb (load string byte) instruction, which retrieves a byte of data from the location pointed to by SI, and stores it in AL (the lower byte of AX). Next we use cmp to check if that byte is zero - if so, it's the end of the string and we quit printing (jump to the .done label).

If it's not zero, we call int 10h (interrupt our code and go to the BIOS), which reads the value in the AH register (0Eh) we set up before. Ah, says the BIOS - 0Eh in the AH register means "print the character in the AL register to the screen!". So the BIOS prints the first character in our string, and returns from the int call. We then jump to the .repeat label, which starts the process again - lodsb to load the next byte from SI (it increments SI each time), see if it's zero and decide what to do. The ret at the end of our string-printing routine means: "we've finished here - return back to the place where we were called by popping the code location from the stack back into the IP register".

So there we have a demonstration of a loop, in a standalone routine. You can see that the text_string label is alongside a stream of characters, which we insert into our OS using db. The text is in apostrophes so that NASM knows it's not code, and at the end we have a zero to tell our print_string routine that we're at the end.

Let's recap: we start off by setting up the segment registers so that our OS knows where the stack and its data reside. Then we point the SI register at a string in our OS binary, and call our string-printing routine. This routine scans through the characters pointed to by SI and displays them until it finds a zero, at which point it returns back into the code that called it. Then the jmp $ line says: keep jumping to the same line. (The '$' in NASM denotes the current point of code.) This sets up an infinite loop, so that the message is displayed and our OS doesn't try to execute the following string!
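Because print_string is a standalone routine, we can call it as many times as we like. The following variant is only a sketch: first_string and second_string are made-up names, and the 13, 10 bytes are the carriage return and line feed characters, which the BIOS 'print char' function treats as "go to the start of the next line":

```nasm
	BITS 16

start:
	mov ax, 07C0h		; Set up 4K of stack space
	add ax, 288		; Stack segment starts (4096 + 512) bytes past our load point
	mov ss, ax
	mov sp, 4096

	mov ax, 07C0h		; Set data segment to where we're loaded
	mov ds, ax

	mov si, first_string	; Point SI at the first message...
	call print_string	; ...print it, and come back here
	mov si, second_string	; Same routine, different string
	call print_string

	jmp $			; Infinite loop, so we never run into the data

	first_string db 'This is my cool new OS!', 13, 10, 0
	second_string db 'Still running...', 0

print_string:			; Routine: output string in SI to screen
	mov ah, 0Eh		; int 10h 'print char' function

.repeat:
	lodsb			; Get character from string
	cmp al, 0
	je .done		; If char is zero, end of string
	int 10h			; Otherwise, print it
	jmp .repeat

.done:
	ret

	times 510-($-$$) db 0	; Pad remainder of boot sector with 0s
	dw 0xAA55		; The standard PC boot signature
```

Each call pushes its own return location onto the stack, so the first ret comes back to the line after the first call, and the second ret to the line after the second.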

The final two lines are interesting. For a PC to recognise a master boot record (MBR), it has to be exactly 512 bytes in size and end with the bytes 55h and AAh (the boot signature). So the first of these lines says: pad out our resulting binary file with zeros until it is 510 bytes in size. Then the second line uses dw (define a word - two bytes) containing the aforementioned boot signature. (x86 CPUs store the low byte of a word first, so dw 0xAA55 puts 55h into the file followed by AAh.) Voila: a 512-byte boot file with the correct numbers at the end for the BIOS to recognise.
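To see how the padding sum works, here is a stripped-down sketch with a single jmp $ standing in for real boot code. In NASM, $ is the current position and $$ is the start of the code, so $-$$ is the number of bytes generated so far:

```nasm
	BITS 16

	jmp $			; Stand-in for real boot code

	times 510-($-$$) db 0	; Emit zeros until we reach byte 510
	db 55h, 0AAh		; The same two bytes that dw 0xAA55 produces
```

Assembling this with -f bin should give a file of exactly 512 bytes, however much or little code comes before the padding (as long as it fits in 510 bytes).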

Let's build our new OS. In a terminal window, in your home directory, enter:

nasm -f bin -o myfirst.bin myfirst.asm

Here we assemble the code from our text file into a raw binary file of machine-code instructions. With the -f bin flag, we tell NASM that we want a plain binary file (not a complicated Linux executable - we want it as plain as possible!). The -o myfirst.bin part tells NASM to generate the resulting binary in a file called myfirst.bin.

Now we need a virtual floppy disk image to which we can write our bootloader-sized kernel. Copy mikeos.flp from the disk_images/ directory of the MikeOS bundle into your home directory, and rename it myfirst.flp. Then enter:

dd status=noxfer conv=notrunc if=myfirst.bin of=myfirst.flp

This uses the 'dd' utility to directly copy our kernel to the first sector of the floppy disk image. When it's done, we can boot our new OS using the QEMU PC emulator as follows (on newer QEMU versions, the command is called qemu-system-i386 rather than qemu):

qemu -fda myfirst.flp

And there you are! Your OS will boot up in a virtual PC. If you want to use it on a real PC, you can write the floppy disk image to a real floppy and boot from it, or generate a CD-ROM ISO image. For the latter, make a new directory called cdiso and move the myfirst.flp file into it. Then, in your home directory, enter:

mkisofs -o myfirst.iso -b myfirst.flp cdiso/

This generates a CD-ROM ISO image called myfirst.iso with bootable floppy disk emulation, using the virtual floppy disk image from before. Now you can burn that ISO to a CD-R and boot your PC from it! (Note that you need to burn it as a direct ISO image and not just copy it onto a disc.)

Next you'll want to improve your OS - explore the MikeOS source code to get some inspiration. Remember that bootloaders are limited to 512 bytes, so if you want to do a lot more, you'll need to make your bootloader load a separate file from the disk and begin executing it, in the same fashion as MikeOS.
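As a first step in that direction, here is a rough sketch of asking the BIOS to read one more sector from the floppy into memory, using int 13h. It is not how MikeOS does it: MikeOS looks up its kernel as a file on the FAT12 disk, whereas this simply grabs the raw sector after the boot sector. The labels are made up, and it assumes we booted from the first floppy drive:

```nasm
	BITS 16

load_next_sector:
	mov ax, 07C0h
	mov es, ax		; ES:BX is where the BIOS will put the data
	mov bx, 512		; 512 bytes past our load point, ie straight after the boot sector

	mov ah, 02h		; int 13h 'read sectors' function
	mov al, 1		; Number of sectors to read
	mov ch, 0		; Cylinder 0
	mov cl, 2		; Sector 2 (sectors count from 1; sector 1 is the boot sector)
	mov dh, 0		; Head 0
	mov dl, 0		; Drive 0 = first floppy drive (an assumption)
	int 13h

	jc .failed		; The BIOS sets the carry flag if the read failed

.loaded:
	jmp $			; A real loader would now jump into the loaded code

.failed:
	jmp $			; A real loader would retry or print an error
```

Real floppy drives can fail on the first attempt, so proper bootloaders usually retry a few times before giving up.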



Frequently Asked Questions

1. Will MikeOS ever use 32-bit / protected mode / C / C++ etc.?

No! MikeOS will not become a general-purpose OS like Linux. We want to keep the code very clean, compact and simple - ideal as a learning tool. If you're interested in developing a more versatile desktop OS, consider joining the Syllable or Haiku projects.


2. Can I install MikeOS to my hard drive?

Not yet, no. MikeOS does not include a hard drive driver, and adding one may make the code too complicated. Also, given the small size of the OS, it's not likely to grow beyond floppy disk size any time soon!


3. Will MikeOS ever have a GUI?

It depends: if the code isn't too big or complex, we would add a simple GUI. We started work on a GUI early in 2007, writing a PS/2 mouse driver, font rendering system and various drawing primitives, but the code grew too complex. If you're an assembler guru and understand the MikeOS code well, we can supply the earlier MikeOS code tree with the GUI - contact us for details.


4. I want to make my own OS; can I use MikeOS as a base?

Of course - that's great. Providing you include the original MikeOS license file (LICENSE.TXT) with your software, and retain the copyright notices, you can do as you please. The license is a BSD-like license, which lets you do anything with the code apart from claim that you wrote all of it! If you start a project based on MikeOS, please let us know and we'll post a link in this Handbook.


5. I've optimised some chunks of your code...

Excellent - if it's still readable, please do send us the patch. However, there are many code snippets in MikeOS which aren't optimised but written for clarity. If you speed up a system call without making the code look really weird, that's good, but as a learning tool we're aiming for the code to be very easy to understand.



Contact

Website

The MikeOS project website is located on SourceForge at http://mikeos.sf.net.

This site has the latest release available for download, plus forums, activity statistics and a bug tracker on the project page. The most up-to-date version of this document is also available at the MikeOS website.


Questions

If you have a question about MikeOS, please first make sure that it hasn't been answered in this documentation (especially the FAQ section). You can email Mike Saunders, the lead developer, with your question and he will try to reply quickly.


Bug reports

If you have found a reproducible bug in MikeOS, please let us know about it so that we can fix it. The SourceForge project page, linked above, includes a bug tracker where you can submit bug reports. However, it may be easier to simply email the main developer directly - no need to sign up to SourceForge.

When submitting a bug report, please detail the exact point at which MikeOS didn't work as expected, what happened (eg any error messages), and the hardware or emulator you're using. NOTE: the DOS compatibility routines are incomplete, so you're likely to encounter various bugs if you try running DOS programs. If you find a minor glitch that looks like an easy fix, please do let us know about it, but if a random DOS .COM program fails to work, chances are we're missing a few routines! Until someone steps in to enhance the DOS compatibility, nothing can be done about that yet.


Patches

If you've made some improvements or additions to MikeOS and wish to submit them, great! If they're small changes - such as a bugfix or minor tweak - you can paste the altered code into an email. Explain what it does and where it goes in the source code, and if it's OK, we'll include it.

If your change is larger (eg a system call) and affects various parts of the code, you're better off with a patch. On UNIX-like systems such as Linux, you can use the diff command-line utility to generate a list of your changes. For this, you will need the original (release) source code tree of MikeOS, along with the tree of your modified code. For instance, you may have the original code in a directory called mikeos-1.0.0/ and your enhanced version in fixed-mikeos-1.0.0/.

Switch to the directory beneath these, and enter:

diff -ru mikeos-1.0.0 fixed-mikeos-1.0.0 > patch.txt

This collates the differences between the two directories, and directs the output to the file patch.txt. Have a look at the file to make sure it's OK (you can see how it shows which lines have changed), and then attach the file to an email.

Please email small fixes and complete patches to Mike Saunders.



Credits

People who have contributed to MikeOS:

  • Mike Saunders - main developer and Handbook author
  • Peter Nemeth - DOS compatibility improvements and bugfixes
  • E Dehling - wrote the original FAT12 code that MikeOS uses

Thanks to Michael Crees for testing, and Helen Ewart for creating the MikeOS cat mascot.



License

This is the license for redistributing and modifying MikeOS. It is based on the BSD license, and essentially states: do what you like with the code, providing you keep this license file intact and don't claim that you wrote the whole thing!

===================================================================
MikeOS -- License
===================================================================

Copyright (C) 2006, 2007 MikeOS Developers -- http://mikeos.sf.net

All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

    * Redistributions of source code must retain the above copyright
      notice, this list of conditions and the following disclaimer.

    * Redistributions in binary form must reproduce the above copyright
      notice, this list of conditions and the following disclaimer in the
      documentation and/or other materials provided with the distribution.

    * Neither the name MikeOS nor the names of any MikeOS contributors
      may be used to endorse or promote products derived from this software
      without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY MIKEOS DEVELOPERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL MIKEOS DEVELOPERS BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


===================================================================


Resources

Tools

These are the Linux programs you'll need to build MikeOS and test it in a PC emulator. Note that depending on your distro, you may be able to get these via your package manager.

  • NASM - the assembler used to build MikeOS
  • QEMU - an excellent PC emulator

Windows users: although Linux is the currently supported build and test platform, QEMU is also available for Windows. Please consult the documentation supplied with the Windows version of QEMU. You may also want to try VirtualBox, a free PC emulator with a user-friendly interface.

Mac OS X users: The Q PC Emulator, based on QEMU, can be used to boot the MikeOS CD-ROM image. Please consult the Q documentation.


Links

See these sites for more information on x86 assembly language and PC development.

  • OSDev.org - very helpful forums where you can get help, and a useful wiki
  • Roby's Tutorial - great guide to x86 assembly language; geared towards programming on MS-DOS but the concepts are still useful for MikeOS programming
  • UNIX assembly - a short guide to assembly on Linux and BSD

If you can recommend any other links that would be useful to MikeOS programmers, please let us know!