How the PC behaves in real mode
When you reboot the PC it enters a mode known as real mode. This mode
gives maximum compatibility with the 8086, plus some extra features (such
as extended registers, faster instructions and additional instructions).
In this mode memory is divided into segments of 64 KB (a 16-bit offset) with a
total addressable space of 2<sup>20</sup> bytes = 1024 KB. Memory locations are accessed
through a segment:offset address (the so-called logical address).
The physical address (the actual byte number in memory)
is calculated in the following way:
$\text{physical address} = \text{10h} \times \text{segment} + \text{offset}$
For example if we take segment 9000h and offset 8000h (logical address
9000:8000h) we get physical address:
$\text{9000h} \times \text{10h} + \text{8000h} = \text{90000h} + \text{8000h} = \text{98000h}$
This address refers to the same physical memory location as, for instance, 9300:5000h,
so segments overlap in real mode.
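The calculation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function name `physical_address` is made up for this example and does not appear in any of the boot sources.

```python
def physical_address(segment: int, offset: int) -> int:
    """Real-mode address translation: segment * 10h + offset."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    return segment * 0x10 + offset


# Two different logical addresses that name the same byte in memory.
print(hex(physical_address(0x9000, 0x8000)))  # 0x98000
print(hex(physical_address(0x9300, 0x5000)))  # 0x98000
```

Because the segment is only shifted by four bits, every physical address below 1 MB can be written as many different segment:offset pairs.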
To access different segments, 16-bit
segment registers (such as cs, ds and es) are used. The highest
address that still fits in 20 bits is FFFF:000Fh = FFFFFh physical = $2^{20}-1$.
The highest address that can be written as segment:offset is FFFF:FFFFh = 10FFEFh
physical, but this can’t be expressed in 20 bits. However, if we find a
way to use an additional address line (the famous A20 line) we
can also reach the extra FFFF:FFFFh-FFFF:000Fh = FFF0h = 65520 bytes,
the so-called High Memory Area (HMA). But why do we have to enable this
A20 line? Why isn’t it enabled at boot-up?
Suppose the A20 line were enabled. If we take the highest 20-bit
address, FFFF:000Fh = FFFFFh, and go one byte further
(FFFF:0010h), we access physical address 100000h (1 0000 0000
0000 0000 0000b). The 8086, however, has no A20 (this is the 21st
address line, because counting starts at A0), so there FFFF:000Fh+1 = 0000:0000h:
the carry is dropped. Because some programs rely on this memory wrap
of the 8086, A20 has to be disabled at boot-up for complete backward
compatibility.
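The wrap described above can be modelled by clearing bit 20 of the computed address whenever A20 is disabled. The sketch below is a simplified model under that assumption, not a description of any particular chipset; `bus_address` is a name invented for this example.

```python
A20_BIT = 1 << 20  # the 21st address line, counting from A0


def bus_address(segment: int, offset: int, a20_enabled: bool) -> int:
    """Physical address that reaches memory for a real-mode segment:offset."""
    linear = segment * 0x10 + offset   # can be as large as 10FFEFh (21 bits)
    if not a20_enabled:
        linear &= ~A20_BIT             # A20 forced to 0: the carry is dropped
    return linear


print(hex(bus_address(0xFFFF, 0x0010, a20_enabled=False)))  # 0x0
print(hex(bus_address(0xFFFF, 0x0010, a20_enabled=True)))   # 0x100000
print(hex(bus_address(0xFFFF, 0xFFFF, a20_enabled=True)))   # 0x10ffef
```

With A20 disabled the address one byte past FFFF:000Fh lands on 0000:0000h again, which is exactly the 8086 behaviour that old programs depend on.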
There is, however, a way to enable the A20 address line (this is what
himem.sys does on MS-DOS computers, giving an additional memory block of
almost 64K for device drivers and so on). We can use the keyboard
controller for this, because the A20 line is logically
ANDed with a keyboard controller output, which is disabled at boot-up.
This means that the 21st bit of an address is always 0 AND x = 0. So
all we have to do is enable this keyboard controller output to get 1
AND x = x. (Code to do this will be presented in a later chapter.)
Now how can we access A31-A21 to get the 4 GB addressable memory space?
You guessed it: by switching to protected mode (PM). However, in PM memory
management is quite a different ball game, so let’s check it out.
Segment Selectors
A segment selector is a 16-bit value used to select a segment in the
GDT. First let’s take a look at the segment selector’s format:
</p><center> <table border="1">
<tbody><tr> <td align="CENTER" colspan="2">
<b>Segment selector overview</b> </td> </tr>
<tr> <td align="CENTER" colspan="2"> <pre>
 15                                 3   2    1   0
---------------------------------------------------
|               Index               | TI |  RPL   |
---------------------------------------------------
</pre> </td> </tr> </tbody></table> </center>
</p><ul> <li>Index: this is the index of the descriptor
to be used in the GDT or LDT. In our previous example of a GDT, the null
selector would have an index of 0h, the code segment selector an index
of 1h and so on. This is why a descriptor table can hold at most
8192 descriptors (the index field is 13 bits wide and
2<sup>13</sup> = 8192 = 2000h). </li><li>TI (table indicator): this
tells the processor whether the descriptor should be taken from the GDT (TI = 0)
or the LDT (TI = 1; the Local Descriptor Table can be defined for every
separate process in a multitasking environment). In our case TI = 0 so
that we’ll use the GDT. </li><li>RPL: the requested
privilege level must be numerically smaller than or equal to the descriptor privilege
level (so the same or a higher privilege) to be able to access the segment. If
this is not the case a general protection exception (\#GP) will be
generated. In our case we’ll use RPL = 0. </li></ul>
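The three fields can be packed and unpacked with a few shifts and masks. The following Python sketch only illustrates the bit layout; `make_selector` and `split_selector` are hypothetical helper names, not part of the boot code.

```python
def make_selector(index: int, ti: int = 0, rpl: int = 0) -> int:
    """Pack Index (13 bits), TI (1 bit) and RPL (2 bits) into a 16-bit selector."""
    if not 0 <= index < 8192:
        raise ValueError("index must fit in 13 bits")
    if ti not in (0, 1) or not 0 <= rpl <= 3:
        raise ValueError("TI is one bit, RPL is two bits")
    return (index << 3) | (ti << 2) | rpl


def split_selector(selector: int):
    """Return the tuple (index, ti, rpl) for a 16-bit selector."""
    return selector >> 3, (selector >> 2) & 1, selector & 3


print(hex(make_selector(1)))   # 0x8
print(hex(make_selector(2)))   # 0x10
print(split_selector(0x18))    # (3, 0, 0)
```

With TI = 0 and RPL = 0 a selector is simply the descriptor index multiplied by eight, which is also the byte offset of that descriptor inside the GDT.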
Suppose we want to access the data segment from the GDT, with RPL =
0. We would then load, for example, DS with 10h (0000 0000 0001
0000b: index 2, TI = 0, RPL = 0). If we now want to place a white-on-black 'A' (character code
41h, colour attribute 07h) in the first position of the video memory we
could say:</p>
mov word \[0xB8000\],0x0741</p>
We could also load, for instance, gs with 18h (selecting the video segment)
and say:</p>
mov word \[gs:0\],0x0741 ;remember segment base = 0xB8000 so offset = 0h</p>
Now the only thing left to mention is how to set up the GDTR.
Luckily there is a special instruction which does this for us:
<b>lgdt (Load Global Descriptor Table register)</b>. The limit loaded
in the GDTR is the offset of the last valid byte, so a limit of 0 results
in exactly one valid byte. In our
case the limit is therefore gdt\_end-gdt-1, because the label gdt\_end is
one byte past the last byte of the GDT; this is exactly the value I’ve put
at label gdtr. The base address of our GDT is 0000:16-bit offset of
gdt, or simply gdt. Again, I have put that at gdtr as well. So all we have to
do is load the GDTR with the value specified at gdtr:</p>
o32 lgdt \[gdtr\]</p>
o32 is a NASM keyword which tells the assembler to emit an operand-size
prefix, so that the instruction uses a 32-bit operand size. I don’t know whether this is absolutely necessary.
(Any suggestions?) </p>
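To make the layout of the value at gdtr concrete, here is a small Python sketch that builds the six bytes lgdt reads: a 16-bit limit followed by a 32-bit base, both little-endian. The base address 7C40h and the number of descriptors (four) are assumptions chosen for the example, and `gdtr_image` is a made-up name.

```python
import struct


def gdtr_image(gdt_base: int, gdt_size: int) -> bytes:
    """Build the 6 bytes lgdt reads: 16-bit limit, then 32-bit base."""
    if gdt_size <= 0 or gdt_size % 8 != 0 or gdt_size > 0x10000:
        raise ValueError("a GDT holds 1 to 8192 descriptors of 8 bytes each")
    limit = gdt_size - 1  # offset of the last valid byte: gdt_end - gdt - 1
    return struct.pack("<HI", limit, gdt_base)


# Assumed example: four descriptors at the made-up address 7C40h.
image = gdtr_image(gdt_base=0x7C40, gdt_size=4 * 8)
print(image.hex())  # 1f00407c0000
```

The limit comes out as 1Fh rather than 20h, because it names the last valid byte and not the size of the table.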
This is all we need to know about memory access in PM for the moment.
Now the time has come to do the actual switch.
</p><center><h2>8. Switching from real to Protected
Mode</h2></center><b>The operation mode of the
processor is controlled by the least significant bit of the 32-bit
control register 0 (CR0), also called the protection enable (PE)
bit.</b> Because it’s paramount to leave the other bits unchanged
this is done in the following way:
</p><pre>mov eax,cr0 ;load eax with the contents of cr0
or eax,1    ;set the least significant bit, leave the other bits unchanged
mov cr0,eax ;switch to PM
</pre>
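The read-modify-write pattern used here can be shown with plain integers. In this Python sketch the starting value of CR0 is hypothetical; the point is only that OR-ing with 1 changes the PE bit and nothing else.

```python
CR0_PE = 1 << 0  # protection enable, the least significant bit of CR0


def set_pe(cr0: int) -> int:
    """Model of 'or eax,1': set PE and keep the other 31 bits as they were."""
    return (cr0 | CR0_PE) & 0xFFFFFFFF


before = 0x60000010          # hypothetical CR0 value read in real mode
after = set_pe(before)
print(hex(after))            # 0x60000011
print(hex(before ^ after))   # 0x1
```

Writing a constant such as 1 straight into CR0 would also set PE, but it would clear every other control bit at the same time, which is why the value is read first.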
To switch to PM, the following steps have to be carried out:
</p><ol> <li>cli: disable interrupts, because the
installed interrupt handlers are all written for real mode; if an interrupt
occurred after the mode switch, your system would probably reboot.
</li><li>Load the GDTR using lgdt, to set up the GDT.
</li><li>Execute a mov cr0 instruction to set the PE bit of
control register 0. </li><li>Immediately after the mov cr0
instruction perform a far jump to clear the instruction prefetch queue,
because it’s still filled with real mode instructions and addresses.
</li><li>Reload all the segment registers except CS (which
is reloaded by the far jump). </li><li>Load the Interrupt
Descriptor Table to make interrupts possible. </li><li>sti:
re-enable interrupts. </li><li>Enable the A20 line to
prevent memory wrap. </li></ol>
In the following source, I am only going to load the GDT and switch to
PM. So I will not set up a stack or an IDT, which is fine as long as you
don’t POP or PUSH and leave interrupts disabled. When you boot this
example the following actions will be taken:</p>
</p><ol> <li>The screen will be erased.
</li><li>A brown 'a' will be printed in the left corner of
the screen. </li><li>The system will wait for a keypress.
</li><li>The switch to PM will be made.
</li><li>A white 'a' will be printed in the left corner of
the screen. </li><li>The system will go into an infinite
loop (note that CTRL+ALT+DEL will no longer function, because interrupts
are still disabled). </li></ol>
</p><center><h2>9. Enable the A20 address
line</h2></center>In order to use the full amount of RAM
installed in your computer you have to enable the A20 address line. As
mentioned earlier, this can be done by enabling an output line of the keyboard
controller. The state of this line can be changed by setting the
appropriate bit. This bit is the second bit (bit 1, mask 02h) of the AT keyboard
controller output port, which is written by sending command 0D1h to port
64h followed by the new value to port 60h. So in theory we can enable the A20
address line by simply setting this second bit.
There are however some things to be taken into account. The keyboard
buffer (that is the buffer on the keyboard, not the BIOS buffer) can
still contain some bytes which have to be handled first. </p>
Once we have completely cleared the keyboard buffer we try to set the A20
line. This should then enable us to use the additional 64K HMA. So we
can test whether the A20 gate is enabled by writing a byte to
FFFF:000Fh+1 (FFFF:0010h) and checking whether the byte at
0000:0000h changes. If A20 is enabled, FFFF:0010h = 100000h physical;
if A20 is not enabled a wrap occurs and the byte is written to 000000h
physical. </p>
To be able to see whether the byte at physical address 00000h
has really changed, we write the bit-inverted (NOT) value of the original byte
at 00000h. That way the written value always differs from the original, so it is always possible to
see whether 00000h has changed (which would imply that A20 is not enabled).
</p>
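The test can be tried out on a toy memory model before reading the assembly. The Python sketch below assumes a very simple machine: a flat byte array behind an A20 gate that wraps addresses at 1 MB when disabled. The class `Memory` and the function `a20_is_enabled` are invented for this example.

```python
class Memory:
    """Toy model of the first 1 MB + 64 KB of RAM behind an A20 gate."""

    def __init__(self, a20_enabled: bool) -> None:
        self.a20_enabled = a20_enabled
        self.ram = bytearray(0x110000)

    def _physical(self, segment: int, offset: int) -> int:
        linear = segment * 0x10 + offset
        return linear if self.a20_enabled else linear & 0xFFFFF

    def read(self, segment: int, offset: int) -> int:
        return self.ram[self._physical(segment, offset)]

    def write(self, segment: int, offset: int, value: int) -> None:
        self.ram[self._physical(segment, offset)] = value & 0xFF


def a20_is_enabled(mem: Memory) -> bool:
    """Write the inverted byte of 0000:0000h to FFFF:0010h and watch 0000:0000h."""
    original_low = mem.read(0x0000, 0x0000)
    original_high = mem.read(0xFFFF, 0x0010)
    mem.write(0xFFFF, 0x0010, ~original_low)   # NOT guarantees a different value
    unchanged = mem.read(0x0000, 0x0000) == original_low
    mem.write(0xFFFF, 0x0010, original_high)   # restore whatever we overwrote
    return unchanged


print(a20_is_enabled(Memory(a20_enabled=True)))    # True
print(a20_is_enabled(Memory(a20_enabled=False)))   # False
```

Restoring the overwritten byte matters in the wrapped case, because there the write has landed on 0000:0000h, the first byte of the real-mode interrupt vector table.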
The code below was not written by me (although I have added
some comments). I think Tran originally wrote this code for use in his
PMode protected mode wrapper. The piece of code contains a routine
EnableA20 which should do exactly that. So here we go: </p>
<pre>
enablea20kbwait:                 ;wait until it is safe to write to the 8042
    xor cx,cx                    ;loop a maximum of 10000h times (cx wraps from 0)
enablea20kbwaitl0:
    jmp short $+2                ;these three jumps are inserted to wait some clock cycles
    jmp short $+2                ;for the port to settle down
    jmp short $+2
    in al,64h                    ;read 8042 status
    test al,2                    ;buffer full? zero flag is set if bit 1 (mask 02h) of 64h is not set
    loopnz enablea20kbwaitl0     ;if yes (the bit is set), loop until cx=0
    ret
;while the above loop is executing keyboard interrupts will occur which will empty the buffer
;so be sure to have interrupts still enabled when you execute this code

enablea20test:                   ;test for enabled A20
    mov al,byte [fs:0]           ;get byte from 0:0
    mov ah,al                    ;preserve old byte
    not al                       ;modify byte
    xchg al,byte [gs:10h]        ;put modified byte to 0ffffh:10h,
                                 ;which is either 0h or 100000h depending on the a20 state
    cmp ah,byte [fs:0]           ;set zero flag if byte at 0:0 still equals the preserved byte,
                                 ;which means a20 is enabled
    mov [gs:10h],al              ;put back old byte at 0ffffh:10h (mov leaves the flags alone)
    ret                          ;return, zero flag is set if A20 is enabled

EnableA20:                       ;hardware enable gate A20 (entry point of routine)
    xor ax,ax                    ;set A20 test segments 0 and 0ffffh
    mov fs,ax                    ;fs=0000h
    dec ax
    mov gs,ax                    ;gs=0ffffh
    call enablea20test           ;is A20 already enabled?
    jz short enablea20done       ;if yes (zf is set), done
;if the system is PS/2 then bit 1 (mask 02h) of port 92h (System Control Port A)
;controls the state of the a20 gate
    in al,92h                    ;PS/2 A20 enable
    or al,2                      ;set bit 1 without changing the rest
    jmp short $+2                ;allow port to settle down
    jmp short $+2
    jmp short $+2
    out 92h,al                   ;write the new value back to port 92h
    call enablea20test           ;is A20 enabled?
    jz short enablea20done       ;if yes, done
    call enablea20kbwait         ;AT A20 enable using the 8042:
                                 ;wait for buffer empty (giving zf set)
    jnz short enablea20f0        ;if failed to clear buffer jump
    mov al,0d1h                  ;keyboard controller command 0d1h: the next byte written to
    out 64h,al                   ;60h will go to the 8042 output port
    call enablea20kbwait         ;clear buffer and let line settle
    jnz short enablea20f0        ;if failed to clear buffer jump
    mov al,0dfh                  ;write 11011111b to the 8042 output port
    out 60h,al                   ;(bit 1 is ANDed with A20, so we set it)
    call enablea20kbwait         ;clear buffer and let line settle
enablea20f0:                     ;wait for A20 to enable
    mov cx,800h                  ;do 800h tries
enablea20l0:
    call enablea20test           ;is A20 enabled?
    jz enablea20done             ;if yes, done
    in al,40h                    ;get current tick counter (first read, discarded)
    jmp short $+2
    jmp short $+2
    jmp short $+2
    in al,40h                    ;get current tick counter (second read)
    mov ah,al                    ;save this byte of the clock in ah
enablea20l1:                     ;wait a single tick
    in al,40h                    ;get current tick counter (first read, discarded)
    jmp short $+2
    jmp short $+2
    jmp short $+2
    in al,40h                    ;get current tick counter (second read)
    cmp al,ah                    ;compare clock tick to the one saved in ah
    je enablea20l1               ;if equal wait a bit longer
    loop enablea20l0             ;try again, to give a20 a chance to enable
    stc                          ;a20 hasn’t been enabled so set carry to indicate failure
    ret
enablea20done:
    clc                          ;a20 has been enabled successfully so clear carry
    ret
</pre>
As you can see it requires quite a few lines of assembly to enable the
A20 gate. This can pose a problem because a bootsector can be at most
512 bytes (and we still have to add code to load our kernel and
place it in memory). </p> In order to make some room we will remove
the layout area DOS uses to identify the disk. This forces us to write a
program with which we can write a file to the bootsector of our bootdisk.
</p><center><h2>10. Writing a bootsector to a non-DOS
disk</h2></center>In contrast to all those lucky Linux users
who have dd at their disposal, a DOS or Windows user doesn’t have an
easy way of writing a binary image to a floppy if it is not recognizable
by DOS. Because our bootsector is getting a bit full I really wanted to
remove the block with disk info that DOS uses to recognize the disk. The
problem is that it’s then impossible to use debug to write the
bootsector to the floppy. So I decided to write my very own WBS (Write
BootSector).
So what has to be done to write an arbitrary file to the bootsector of a
floppy disk? First of all the bootimage has to be read from the hard
disk and stored in memory. Then the buffer containing the bootsector has
to be written to the floppy disk.</p>
</p><pre>;------------------------------------------------------------------------------------------
; wbs.asm Write Boot Sector
;
; writes a binary file from harddisk to the bootsector of floppy 0 (a:)
;
; compile with NASM to binary file (nasm is assumed to be in your path)
; nasm wbs.asm -o wbs.com
;
; written by emJay (c) 1999 last updated 18-06-99
;
;------------------------------------------------------------------------------------------
org 0x100                        ;a .com file is loaded at offset 100h
    jmp Start                    ;skip over the data below

Welcome:      db "WBS Write Boot Sector v1.0 (c)1999 emJay.",10,13,'$'
AskInfile:    db "What is the location of the bootsector on your harddisk?",10,13,":$"
ErrorOpen:    db "An error has occurred.....quitting.",10,13,'$'
OpenSuccess:  db "File opened successfully.",10,13,'$'
InitSuccess:  db "Floppy initialised successfully.",10,13,'$'
WriteSuccess: db "Bootsector written successfully.",10,13,'$'
Counter:      db 3               ;number of write attempts

Start:                           ;entry point (label name assumed)
    mov ah,0x09                  ;DOS function 09h: print '$'-terminated string
    mov dx,Welcome
    int 0x21
    mov dx,AskInfile
    int 0x21
    xor si,si
InputLoop:
    mov ah,0x01                  ;read one character with echo
    int 0x21
    cmp al,13                    ;Enter pressed?
    je InputDone
    mov byte [Infile+si],al
    inc si
    jmp InputLoop
InputDone:
    mov byte [Infile+si],0       ;zero-terminate the file name
    mov ax,0x3d00                ;open the file read-only
    mov dx,Infile
    int 0x21
    jc Error
    mov [Handle],ax              ;save the file handle returned in ax
    mov ah,0x09
    mov dx,OpenSuccess
    int 0x21
    mov ah,0x3f                  ;read 200h bytes into FileBuffer
    mov bx,[Handle]
    mov cx,0x200
    mov dx,FileBuffer
    int 0x21
    mov bx,[Handle]
    mov ah,0x3e                  ;close the file
    int 0x21
    xor ax,ax                    ;reset the disk system for drive 0 (a:)
    mov dl,0
    int 0x13
    jc Error
    mov ah,0x09
    mov dx,InitSuccess
    int 0x21
loop1:
    mov ah,0                     ;reset the drive before every attempt
    mov dl,0
    int 0x13
    mov al,1                     ;write 1 sector
    mov ah,3
    mov cx,1                     ;cylinder 0, sector 1
    mov dx,0                     ;head 0, drive 0
    mov bx,FileBuffer            ;es:bx points to the data (es = ds in a .com file)
    int 0x13
    jnc WriteOK
    dec byte [Counter]           ;give up after 3 failed attempts
    jz Error
    jmp loop1
WriteOK:
    mov ah,0x09
    mov dx,WriteSuccess
    int 0x21
Exit:
    mov ah,1                     ;get the status of the last disk operation
    mov dl,0
    int 0x13
    mov al,ah                    ;and use it as the exit code
    mov ah,0x4c
    int 0x21
Error:
    mov ah,0x09
    mov dx,ErrorOpen
    int 0x21
    jmp Exit

section .bss
Infile:     resb 80
Handle:     resw 1               ;a DOS file handle is 16 bits wide
FileBuffer: resb 0x200
</pre>
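For readers who want to try the same read-then-write flow without a real floppy, the Python sketch below applies it to a disk image file. It is not a port of wbs.asm: the function name `write_boot_sector` is an assumption of this example, and it pads a short boot file with zeros, whereas WBS writes whatever is left in its buffer.

```python
from pathlib import Path

SECTOR_SIZE = 0x200  # 512 bytes, the same size as FileBuffer in wbs.asm


def write_boot_sector(boot_file: str, target: str) -> None:
    """Copy the first sector of boot_file over the first sector of target."""
    sector = Path(boot_file).read_bytes()[:SECTOR_SIZE]
    sector = sector.ljust(SECTOR_SIZE, b"\x00")   # pad short files with zeros
    with open(target, "r+b") as disk:             # update in place, do not truncate
        disk.seek(0)
        disk.write(sector)
```

A call such as `write_boot_sector("pmboot.bin", "floppy.img")` (both file names are made up) replaces only the first 512 bytes of the image and leaves the rest of it untouched.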
</p><center><h2>11. All
sources</h2></center><ul> <li><a
href="http://web.archive.org/web/20010424064833/http://www.phys.uu.nl/~mjanssen/osdev/dumbboot.asm">dumbboot.asm</a>
</li><li><a
href="http://web.archive.org/web/20010424064833/http://www.phys.uu.nl/~mjanssen/osdev/dosboot.asm">dosboot.asm</a>
</li><li><a
href="http://web.archive.org/web/20010424064833/http://www.phys.uu.nl/~mjanssen/osdev/pmboot.asm">pmboot.asm</a>
</li></ul> <center><h2>12.
Bibliography</h2></center><ol type="1">
<li>Michael Tischer, PC Intern, ISBN 1-55755-145-6 <br> A
great book on all PC-related stuff; it really takes you in depth on a
large number of subjects. </li><li>Lance Leventhal, Lance
Leventhal’s 80386 programming guide, ISBN 90-6233-440-7 <br> The
most important parts of the Intel 80386 manual. I don’t know whether the
ISBN is for the English book or the Dutch translation.
</li><li>Intel Architecture Software Developer’s Manual,
Volume 1: Basic Architecture, Volume 2: Instruction Set Reference,
Volume 3: System Programming Guide<br> The manual for using Intel
processors; it covers everything from registers to instruction set and
Protected Mode. These manuals are downloadable from <a
href="http://web.archive.org/web/20010424064833/http://www.intel.com/">Intel’s
web site</a> (approximately 10 MB including addenda).
</li><li>Ralf Brown’s Interrupt List<br>A complete
description of all the PC’s interrupts (including BIOS and DOS) and a
description of all hardware ports. A must-have for every assembly
programmer. </li></ol> <center><h2>13.
Links</h2></center><ol> <li><a
href="http://web.archive.org/web/20010424064833/http://www.webring.org/cgi-bin/webring?ring=os&list"
target="_top">The OS webring</a>: Links to sites which are
part of the Operating System webring. It contains a lot of good links.
</li><li><a
href="http://web.archive.org/web/20010424064833/http://www.intel.com/"
target="_top">Intel’s web site</a>: for all information about
Intel processors and chipsets, including datasheets and manuals. It is also
possible to order a free CD-ROM with the processor manuals and a lot of
other stuff. </li><li><a
href="http://web.archive.org/web/20010424064833/http://www.pobox.com/~ralf/files.html"
target="_top">Ralf Brown’s Home Page</a>: here you can
download the Ralf Brown Interrupt List, which contains all known (and
unknown) interrupts and a description of their
usage.</li></ol> <center><h2>14.
Warranty</h2></center>I exclude any and all implied
warranties, including warranties of merchantability and fitness for a
particular purpose. I make no warranty or representation, either express
or implied, with respect to this source code, its quality, performance,
merchantability, or fitness for a particular purpose. I shall have no
liability for special, incidental, or consequential damages arising out
of or resulting from the use or modification of this source code.
Anyway I will by no means accept warranty for any damage caused by using
information and / or sources found on this web page. So if you f\*\*k
up, kick yourself!!! </p><center><h2>15. Who am
I</h2></center>I am a twenty-four year old physics student
from Utrecht in the Netherlands. My name is emJay (AKA Mark Janssen).
Contact me at <a
href="mailto:mjanssen@phys.uu.nl">mjanssen@phys.uu.nl</a>
<center><h2>16. Update
history</h2></center><center> <table
width="90%"> <tbody><tr><td>28 March 2000:
</td><td> Added link to OS webring in the links section.
</td> </tr> <tr><td>14 March 2000:
</td><td> Used PHP3 to make navigation between pages
possible and create the contents (yes, it is completely automated).
</td> </tr> </tbody></table> </center>