|Operating Systems Development Series|
This series is intended to demonstrate and teach operating system development from the ground up.
IntroductionWelcome! In the previous tutorial we talked about how to load and execute a sector. We also have taken a look at the Rings in assembly language, and a detailed look at the BIOS Parameter Block (BPB).
In this tutorial, we are going to use everything we learned to parse the FAT12 file system, and load our second stage loader by name,
This tutorial is going to have alot of code in it. I will do my best to explain everything in detail. Also, this tutorial will have some math in it as well.
cli and hltYou might be curious at why I end all of my demo programs with the instructions "cli" and "hlt". Its actualy pretty simple. If there is no way of stopping the program in some manner, the CPU will just go beyond your program and execute random instructions. This will, ultimately, end in a Triple Fault.
The reason I clear interrupts (cli) as because the interrupts will execute (hence the system is not haulted) even when I wanted to. This can cause problems. So, just having a hlt instruction (without cli) can Triple Fault the cpu.
Because of this, I always end all of my demos with cli and hlt.
Filesystems - TheoryYippe! Its time to talk about filesystems :)
A File System is nothing more then a specification. It helps create the concept of a "file" on a disk.
A file is a group of data that has represents something. This data can be anything we want it to be. It all depends on how we interprate the data.
As you know, a sector is 512 bytes in size. A file is stored across these sectors on disk. If the file exceeds 512 bytes, we have to give it more sectors. Because not all files are evenly 512 bytes in size, we will need to fill in the rest of the bytes (That the file doesnt use). Kind of like what we did for our bootloader.
If a file spans accross several sectors, we call these sectors a Cluster in the FAT file systems. For example, our kernel will most likley span alot of sectors. To load our kernel, we will need to load the cluster (The sectors) from where it is located.
If a file spans across different sectors (Not contigous) across different clusters, it is said to be Fragmented. We will need to collect the different parts of the file.
There are alot of different kinds of file systems. Some are widley use (Like FAT12, FAT16, FAT32, NTFS, ext (Linux), HFS (Used in older MACs); other filesystems are only used by specific companies for in house use (Like the GFS - Google File System).
Alot of OS developers also create there on versions of the FAT file systems (or even something completely new). These are useually not as good as the most common filesystems though (Like FAT and NTFS).
Okay, so we know a little about file systems now. We are going to be using FAT12 for its simplicity. If we decide, we can always use a different one. :)
FAT12 Filesystem - TheoryFAT12 is the first FAT (File Allocation Table) Filesystem released in 1977, and used in Microsoft Disk BASIC. FAT12, as being an older filesystem generally released for floppr disks, had a number of limtations.
FAT16 has support for directories, and over 64,000 files as it uses a 16 bit cluster (file) address, as apposed to FAT16. However, FAT16 and FAT12 are very simular.
To make things simple, we are going to use FAT12. We might spruce things up with FAT16 (or even use FAT32) later :) (FAT32 is quite different then FAT 12/16, so we might ust use FAT16 later.)
FAT12 FileSystem - Disk StorageTo understand more about FAT12, and how it works, it is better to look at the structure of a typical formatted disk.
This is a typical formatted FAT12 disk, from the bootsector to the very last sector on disk.
Understanding this structure will be important when loading and searching for our file.
Note that there are 2 FATs on a disk. It is located *right after* the reserved sectors (or the bootloader, if there is none).
Also note: The Root Directory is right after all of the FATs. This means...
if we add up the number of sectors per FAT, and the reserved sectors, we can get the first sector to the Root Directory. By searching the Root Directory for a simple string (our filename), we can effectivly find the exact sector of the file on disk :)
Lets look closer...
Boot SectorThis section contains the BIOS Parameter Block and the bootloader. Yep--Ours. The BIOS Parameter Block contains information tat help describe our disk.
Extra Reserved SectorsRemember the bpbReservedSectors member of our BPB? Any extra reserved sectors are stored here, right after the bootsector.
File Allocation Tables (FATs)Remember that a cluster represents a series of contiguous sectors on disk. the size of each cluster is normally 2 KB to 32 KiB. The file peices are linked (from one cluster to another using a common data structure, such as a Linked List.
There are two FATs. However, one is just a copy of the first one for data recovery purposes. It useally isnt used.
The File Allocation Table (FAT) is a list of entrys that map to each of these clusters. They help identify important information to aid in storing data to these clusters.
Each entry is a 12 bit value that represents a cluster. The FAT is a linked list-like structure with these entrys that helps identify what clusters are in use.
To better nderstand this lets look at the possible values:
Lets look at this in a deeper way. A cluster, as you know, represents a series of sectors. We define the amount of sectors it represents from the BIOS Paramete Block:
In our case, each cluster is 1 sector. When we get the first sector of Stage 2 (We get this from the root directory), we use this sector as the starting cluster number in the FAT. Once we find the starting cluster, we just refrence the FAT to determin the cluster (The FAT is just an array of 32 bit numbers. We just compare this number with the list above to determin what to do with it.)
The Root Directory TableNow, THIS will be important to us :)
The root directory is a table of 32 byte values that represent information reguarding file and directories. This 32 byte value uses the format:
Wait a sec! Remember that DOS filenames are limited to 11 bytes? This is why:
Remember in a previous tutorial I explained how their are internal and external filenames? The filename structure I explained is the internal filename. As it is limited to 11 bytes, the filename "Stage2.sys" has to become
Searching and reading FAT12 - TheoryOkay, after reading all of the above, you are probably tired of me saying "FAT12" :)
Anywhoo... How is hie information usefull to us?
We are going to be refrencing the BIOS Parameter Block (BPB) alot. Here is the BPB that we created from the prevuis tutorials for refrence:
Please see the previous tutorial for a detailed explanation of each member.
What we are trying to do is to load a second stage loader. Lets look at what we need to do in detail:
Beginning with a filenameThe first thing to do is to create a good filename. Remember: The Filenames must be exactally 11 bytes to insure we don't corrupt the root directory.
I am using "STAGE2.SYS", for my second stage. You can look at an example of its internal filename in the above section.
Creating Stage 2Okay, Stage2 is a seperate program then the bootloader. Our Stage2 will be very simular to a DOS COM program, sound cool?
All Stage2 does right now is print a message and halt. Everything you have already seen from the bootloader:
To assemble with NASM, just assemble it as a binary program (COM programs are binary), and copy it into the floppy disk image. For example:
No PARTCOPY required :)
Step 1: Loading the Root Directory TableNow its time to load Stage2.sys! We will be refrencing the Root directory table alot here, along with the BIOS parameter block for disk information.
Step 1: Get size of root directoryOkay, first we need to get the size of the root directory.
To get the size, just multiply the number of entrys in the root directory. Seems simple enough :)
In Windows, whenever you add a file or directory to a FAT12 formatted disk, Windows automatically adds the file information to the root directory, so we dont need to worry about it. This makes things much simpler.
Dividing the number of root entrys by bytes per sector will tell us how many sectors the root entry uses.
Here is an example:
Remember that the root directory table is a table of 32 byte values (entrys) that reprsent the file information.
Yippe--Okay, we know how much sectors to load in for the root directory. Now, lets find the starting sector to load from :)
Step 2: Get start of root directoryThis is another easy one. First, lets look at a FAT12 formatted disk again:
Okay, note that the Root Directory is located directly after both FATs and reserved sectors. In other words, just add the FATs + reserved sectors, and you found the root direcory!
Pretty easy, huh? Now, we just read the sector to some location in memory:
Root Directory - Complete exampleThis example code is taken directly from the bootloader a the end of the tutorial. It loads the root directory:
Step 2: Find Stage 2Okay, now the root directory table is loaded. Looking at the above code, we loaded it to 0x200. Now, to find our file.
Lets look back at the 32 byte root directory table again (Section Root Directory Table. Remember the first 11 bytes represent the filename. Also remember that, because each root directory entry is 32 bytes, Every 32 bytes will be the start of the next entry - Pointing us back to the first 11 bytes of the next entry.
Hence, all we need to do is compare filenames, and jump to the next entry (32bytes), and test again until we reach the end of the sector. For example...
On to the next step...
Step 3: Loading FAT
Step 1: Get start clusterOkay, so the root directory is loaded and we found the files entry. How do we get its starting cluster?
The starting cluster will be important to us when loading the file.
Step 2: Get size of FATLets look at the BIOS parameter block again. More specifically...
Okay, so how do we find out how many sectors there are in both FATs? Just multiply sectors per FAT by the number of sectors :) Seems simple, ...but...
No, never mind, it is simple ^^
Step 3: Load the FATNow that we know how many sectors to read. Just, um... read it :)
Yey! Now with the FAT stuff out of the way (Not completely!), load in stage 2!
FAT - Complete exampleHere is the complete code taken directly from the bootloader:
LBA and CHSIn loading the image, all we need to do is load each cluster by refrencing the FAT.
There is one little problem we havnt discussed yet though. Okay, We have a cluster number from the FAT. But, How do we use it?
The problem is that this cluster represents a linear address, while, in order to load sectors, we will need a segment/track/head address. (Interrupt 0x13)
There are two ways to access a disk. Either through Cylinder/Head/Sector (CHS) addressing or Liogical Block Addressing (LBA).
The LBA represents an indexed location on disk. The first block being 0, then 1, and so on. LBA simply represents sectors are sequentially numbered with LBA 0. Cant get more basic then that.
You will need to know how to convert between LBA and CHS.
Converting CHS to LBA
The formula to convert CHS to LBA:
That is simple enough. :) Heres an example:
Converting LBA to CHSThis is a little bit more complex, but still is relaitevly easy:
Heres an example...
Not to hard, I hope :)
Load the clusterOkay, in loading Stage 2, we first need to refrence the cluster from the FAT. Pretty simple. Then, conver the cluster number to LBA so we can read it in:
Get next clusterThis is tricky.
Okay, remember each cluster number in the FAT entry is 12 bits. This is a problem. If we read in 1 byte, we are only copying a part of the cluster number!.
Because of this, we have to read a WORD (2 byte) value.
Yet, then again, we run into a problem. Copying 2 bytes (from a 12 bit value) means that we will copy a part of the next cluster entry. For example, imagine this is your FAT:
Notice all even clusters accopy all of the first byte, but part of the second. Also notice that all odd clusters occopy a part of their first byte, but all of the second!
Okay, so what we need to do is to read a 2byte (word) value from the FAT (This is our cluster).
If the cluster is even, Mask out the top 4 bits, as it belongs to the next cluster.
If it is odd, shift it down 4 bits (to discard the bits used by the first cluster.) For example...
The first shot contains the bootloader loading Stage 2 successfully. Stage 2 prints the loading operating system message.
The second shot displays an error message when it cannot find the file (within the root directory).
This demo contains most of the code in this lesson, 2 source files, 2 directories, and 2 batch programs. The first directory contains the stage 1 program -- our bootloader, the second directory contains our stage 2 program - STAGE2.SYS.
ConclusionWow, this tutorial was hard to write. Simply because it is hard explaining such a complex topic in every detail, while still trying to make it very easy to follow. I hope I did well :)
If you have any suggestions that could imporove this tutorial, please let me know :)
Well... I guess thats it: Good bye bootloader!
In the next tutorial, we will begin building on Stage 2. We will talk about A20, and look at Protected Mode in alot more detail...
See you there!
Until next time,