Develop our own operating system! #part07
Virtual Memory with Paging
Hello again! Welcome to the seventh step of building our own operating system. Before we get into today’s lesson, I would like to encourage you to refresh your knowledge of our week 4 work on segmentation. You can do that by revisiting the week 4 article here. As we discussed there, today we are going to study the ‘paging’ technique. Paging is a technique used to achieve virtual memory. Before going into more detail about paging, let’s get a good idea of what virtual memory is.
Introduction to Virtual Memory
Did you know that a computer program can access more memory than the actual physical memory (RAM) capacity? Well, it is possible! That is because a computer program does not get direct access to the physical memory. It uses a concept that we call ‘virtual memory’.
Virtual memory is all about ‘indirection’. As the name suggests, there is no actual memory of that kind. It is virtual, and it is a layer of indirection. Let’s see how this works.
Without virtual memory there is no indirection: a program address is equal to a physical address. If the program loads from address 1024, it goes straight to RAM address 1024. See the picture below. What happens when the program tries to use an address beyond 1 GB on a machine with 1 GB of RAM? The program crashes! That’s why we use a layer of indirection to MAP those two address spaces.
Virtual memory solves the above problem by adding a layer between the program’s address space and the physical address space. Instead of going one-to-one as before, it performs a mapping, and this mapping provides the indirection we discussed. Rather than going directly to a physical address, a program address is looked up in the map, and the map decides which physical address corresponds to it. When the program tries to access an address beyond the 1 GB of RAM, the mapping knows the data is not in RAM and has to be found on the hard disk instead. So with the virtual memory concept, the program is not going to crash!
To achieve virtual memory, we use paging and segmentation. We have already seen how to use segmentation in our operating system. This week, we are going to implement virtual memory with paging.
Virtual Memory Through Segmentation?
So why do we need paging when we have already done this with segmentation? Well, you could actually skip this part completely and just use segmentation to achieve virtual memory. Each user mode process would then get its own segment, with its base address and limit properly set up. But segmentation has some drawbacks. Each process is isolated and needs contiguous memory, so we would have to know in advance exactly how much memory a process requires, which is not practical. Alternatively, we could move memory segments to places where they can grow when the limit is reached. Both solutions are either impractical or expensive, but paging can solve both problems! (Note that in x86_64, the 64-bit version of the x86 architecture, segmentation is almost completely removed.)
Paging — Overview
In operating systems, paging is a storage technique used to bring processes from secondary storage into main memory in the form of pages.
The main idea behind paging is to divide each process into pages. Main memory is likewise divided into frames.
One page of a process is stored in one frame of memory. The pages can be stored at different locations in memory, but the preference is always to find contiguous frames (holes).
Pages of a process are brought into main memory only when they are required; otherwise, they reside in secondary storage.
Paging — How does it work?
As mentioned before, when we achieve virtual memory through paging, each process gets the impression that the available memory range is 0x00000000–0xFFFFFFFF, even though the actual size of the physical memory might be much smaller. It also means that when a process addresses a byte of memory, it uses a virtual (linear) address instead of a physical one (see the picture above). The code in the user process won’t notice any difference, apart from execution delays. The linear address gets translated to a physical address by the Memory Management Unit (MMU) and the page table. If the virtual address isn’t mapped to a physical address, the CPU raises a page fault interrupt.
Paging in x86
On x86 there are a few things we need to know when studying paging. The MMU maps memory through a series of tables, two to be exact: the page directory (PD) and the page table (PT).
Both tables contain 1024 4-byte entries, making them 4 KB each. In the page directory, each entry points to a page table. In the page table, each entry points to a physical address (the address of a page frame). Then, using the offset, it is possible to find the exact physical memory location (see the diagram above).
In a virtual address, the highest 10 bits specify the offset of a page directory entry (PDE) in the current PDT, the next 10 bits specify the offset of a page table entry (PTE) in the page table that the PDE points to, and the lowest 12 bits are the offset within the page frame to be addressed (to find the exact byte within the page frame).
All page directories, page tables and page frames need to be aligned on 4096-byte boundaries. This makes it possible to address a PDT, PT or PF with just the highest 20 bits of a 32-bit address, since the lowest 12 bits must be zero.
The PDE and PTE structures are very similar to each other: 32 bits (4 bytes), where the highest 20 bits point to a PT or page frame, and the lowest 12 bits control access rights and other configuration (see the diagrams above). 4 bytes times 1024 equals 4096 bytes, so a page directory and a page table both fit in a page frame themselves.
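To make this bit layout concrete, here is a small sketch in C of how a 32-bit virtual address can be split into its three parts, together with a few of the common PDE/PTE flag bits. The names used here (DIR_INDEX, PTE_PRESENT and so on) are just illustrative; they are not taken from our kernel’s code.
#include <stdint.h>
#include <stdio.h>
/* Split a 32-bit virtual address into its three parts. */
#define DIR_INDEX(addr)   (((addr) >> 22) & 0x3FF)  /* highest 10 bits: index into the PDT */
#define TABLE_INDEX(addr) (((addr) >> 12) & 0x3FF)  /* next 10 bits: index into the PT     */
#define PAGE_OFFSET(addr) ((addr) & 0xFFF)          /* lowest 12 bits: offset in the frame */
/* Some of the configuration bits in the low 12 bits of a PDE/PTE. */
#define PTE_PRESENT  0x001  /* the page is present in memory        */
#define PTE_WRITABLE 0x002  /* R/W: the page is read-write          */
#define PTE_USER     0x004  /* U/S: the page is accessible from PL3 */
int main(void)
{
    uint32_t vaddr = 0xC0101234;
    printf("PDE index: %u, PTE index: %u, offset: 0x%03X\n",
           DIR_INDEX(vaddr), TABLE_INDEX(vaddr), PAGE_OFFSET(vaddr));
    /* prints: PDE index: 768, PTE index: 257, offset: 0x234 */
    return 0;
}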
While pages are normally 4096 bytes, it is also possible to use 4 MB pages. A PDE then points directly to a 4 MB page frame, which needs to be aligned on a 4 MB address boundary. The address translation is almost the same as in the figure, with just the page table step removed. It is possible to mix 4 MB and 4 KB pages.
The physical address of the current PDT is stored in the register cr3; since the PDT is 4 KB-aligned, only the highest 20 bits matter, and the lower 12 bits of cr3 are used for configuration. For PDEs and PTEs, the most interesting configuration bits are U/S, which determines what privilege levels can access the page (PL0 or PL3), and R/W, which makes the memory in the page read-write or read-only.
Why do we need both a PDT and a PT?
A page table is an array of 1024 32-bit entries, conveniently fitting into a single 4 KB page. Each entry points to the physical address of a page. Because a single page table cannot represent the entire address space on its own (1024 entries * 4 KB covers only 22 bits of address space, i.e. 4 MB), we need a second level: the page directory. A page directory also consists of 1024 32-bit entries (again fitting into a single page), each pointing to a page table. With 1024 * 1024 * 4 KB we cover the full 32 bits, so this two-level structure can map the entire 4 GB virtual address space.
Identity Paging
Identity paging, identity mapped paging and 1:1 paging are terms used to describe a design choice where a portion of virtual addresses are mapped to physical addresses with the same value. This means that when paging is enabled with identity mapping, virtual address 0xb8000 refers to physical address 0xb8000, as long as that area is identity mapped.
This can be done at compile time by creating a page directory where each entry points to its corresponding 4 MB frame. In NASM this can be done with macros and commands (%rep, times and dd). It can of course also be done at run-time by using ordinary assembly code instructions.
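The article above describes doing this in NASM at compile time; as an alternative illustration, here is a minimal run-time sketch in C that builds a page directory where every entry identity maps its corresponding 4 MB frame. The names (page_directory, PDE_*) are my own, not from the series’ code, and using 4 MB pages assumes the PSE bit is set (see the next section).
#include <stdint.h>
#define PDE_PRESENT  0x00000001u  /* the entry is present                */
#define PDE_WRITABLE 0x00000002u  /* the page is read-write              */
#define PDE_4MB      0x00000080u  /* PS bit: the entry maps a 4 MB frame */
/* The page directory must be aligned on a 4 KB boundary. */
static uint32_t page_directory[1024] __attribute__((aligned(4096)));
/* Identity map the whole 4 GB address space with 4 MB pages:
   entry i maps virtual address i * 4 MB to physical address i * 4 MB. */
void setup_identity_paging(void)
{
    for (uint32_t i = 0; i < 1024; i++) {
        page_directory[i] = (i << 22) | PDE_PRESENT | PDE_WRITABLE | PDE_4MB;
    }
}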
Enabling Paging
Paging is enabled by first writing the address of a page directory to cr3 and then setting bit 31 (the PG “paging-enable” bit) of cr0 to 1. To use 4 MB pages, set the PSE bit (Page Size Extensions, bit 4) of cr4. The following assembly code shows an example:
;eax has the address of the page directory
mov cr3, eax
mov ebx, cr4 ; read current cr4
or ebx, 0x00000010 ; set PSE
mov cr4, ebx ; update cr4
mov ebx, cr0 ; read current cr0
or ebx, 0x80000000 ; set PG
mov cr0, ebx ; update cr0
;now paging is enabled
It is important to note that all addresses within the page directory, page tables and in cr3 need to be physical addresses to the structures, never virtual. This will be more relevant in later sections where we dynamically update the paging structures (see the chapter “User Mode”).
An instruction that is useful when updating a PDT or PT is invlpg. It invalidates the Translation Lookaside Buffer (TLB) entry for a virtual address. The TLB is a cache of translated addresses, mapping virtual addresses to their corresponding physical addresses. Invalidation is only required when changing a PDE or PTE that was previously mapped to something else; if the PDE or PTE had previously been marked as not present (bit 0 set to 0), executing invlpg is unnecessary. Changing the value of cr3 causes all entries in the TLB to be invalidated.
An example of invalidating a TLB entry is shown below:
; invalidate any TLB references to virtual address 0
invlpg [0]
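In C, invlpg is usually wrapped in a small inline-assembly helper. This is just a minimal sketch using GCC/Clang inline assembly, not code from the series:
/* Invalidate the TLB entry for the page containing the given virtual address. */
static inline void invlpg(void *virtual_addr)
{
    __asm__ volatile ("invlpg (%0)" : : "r" (virtual_addr) : "memory");
}
Calling invlpg((void *) 0) then has the same effect as the assembly example above.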
Paging and the Kernel
Reasons to Not Identity Map the Kernel
A problem arises when linking the user mode process code if the kernel is located at the beginning of the virtual address space, that is, the range (0x00000000, “size of kernel”). Standard practice is to link a program so that it is loaded at address 0x00000000, which then becomes the base address for resolving absolute references. But the user mode process can’t be loaded at this virtual address, because the kernel is already mapped to the virtual address range (0x00000000, “size of kernel”).
One workaround is to use a linker script that instructs the linker to assume a different starting address, but that is a difficult solution for the users of the operating system: they would have to take it into account when building programs for it.
We’re also assuming here that we want the kernel’s address space to be part of the user mode process’s address space. The nice thing about this is that we don’t have to alter any paging structures in order to access the kernel’s code and data, as we’ll see later. A user process still cannot read or write kernel memory, since accessing those pages requires privilege level 0.
The Virtual Address for the Kernel
A high virtual memory address is preferable, for example 0xC0000000 (3 GB). The user mode process is unlikely to ever reach 3 GB in size, which is the only way it could then collide with the kernel. A kernel that uses virtual addresses of 3 GB and above is referred to as a higher-half kernel. The exact address 0xC0000000 isn’t required to get these benefits; choosing the right address depends, for example, on how much virtual memory should be available to the kernel (it’s simplest if all memory above the kernel’s virtual address belongs to the kernel) and how much should be available to a process.
Placing the Kernel at 0xC0000000
The kernel should be placed at 0xC0100000 rather than at 0xC0000000, because then the virtual range (0xC0000000, 0xC0100000) maps to the physical range (0x00000000, 0x00100000). In this way, the physical memory (0x00000000, “size of kernel”) is mapped to the virtual range (0xC0000000, 0xC0000000 + “size of kernel”).
Placing the kernel there isn’t difficult, but it does require a little thinking. Once again, this is a linking issue. If we simply keep the linker script we used in “Linking the Kernel”, the linker will assume that our kernel runs at 0x00100000, not at the virtual address 0xC0100000. If jumps are then resolved using 0x00100000 as the base address, a kernel jump will leap straight into the user mode process code (remember that the user mode process is loaded into virtual memory starting at 0x00000000).
But we can’t just tell the linker to assume that the kernel starts (is loaded) at 0xC0100000, because we want it to be loaded at physical address 0x00100000. The kernel is loaded at 1 MB because it can’t be loaded at 0x00000000: BIOS and GRUB code is loaded below 1 MB. We also can’t trust that we can load the kernel at physical address 0xC0100000, since the computer might have less than 3 GB of physical memory.
As a workaround, the linker script can make use of both the relocation instruction (. = 0xC0100000) and the AT instruction. Relocation specifies that non-relative memory references should use the relocation address as the base in address calculations, while AT specifies where the kernel should actually be loaded into memory. Relocation is done at link time by GNU ld, and the load address specified by AT is handled by GRUB when loading the kernel.
Higher-half Linker Script
The linker script can be modified to do this:
ENTRY(loader)                   /* the name of the entry symbol */

SECTIONS {
    . = 0xC0100000;             /* the code should be relocated to 3 GB + 1 MB */

    /* align at 4 KB and load at 1 MB */
    .text ALIGN (0x1000) : AT(ADDR(.text) - 0xC0000000)
    {
        *(.text)                /* all text sections from all files */
    }

    /* align at 4 KB and load at 1 MB + . */
    .rodata ALIGN (0x1000) : AT(ADDR(.rodata) - 0xC0000000)
    {
        *(.rodata*)             /* all read-only data sections from all files */
    }

    /* align at 4 KB and load at 1 MB + . */
    .data ALIGN (0x1000) : AT(ADDR(.data) - 0xC0000000)
    {
        *(.data)                /* all data sections from all files */
    }

    /* align at 4 KB and load at 1 MB + . */
    .bss ALIGN (0x1000) : AT(ADDR(.bss) - 0xC0000000)
    {
        *(COMMON)               /* all COMMON sections from all files */
        *(.bss)                 /* all bss sections from all files */
    }
}
Entering the Higher Half
When GRUB jumps to the kernel code, paging is not yet enabled and there is no page table. Since the mapping is not in place, any access to 0xC0100000 + X will at best result in a general protection exception (GPE); otherwise the machine will simply crash, or, if it has more than 3 GB of memory, end up accessing the wrong physical memory.
It is thus necessary to do the following tasks using assembly code that does not rely on absolute jumps or absolute memory references, since those would point into the not-yet-mapped higher half (a sketch follows after this list):
1. Set up a page table.
2. Add identity mapping for the first 4 MB of the virtual address space.
3. Add an entry that maps 0xC0100000 to 0x00100000.
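To make these three steps concrete, here is a sketch in C of what such a boot page directory could look like when 4 MB pages are used. The names (boot_page_directory, PDE_*) are illustrative only; in the actual kernel this is set up in assembly, before paging is enabled, using physical addresses.
#include <stdint.h>
#define PDE_PRESENT  0x00000001u  /* the entry is present                */
#define PDE_WRITABLE 0x00000002u  /* the page is read-write              */
#define PDE_4MB      0x00000080u  /* PS bit: the entry maps a 4 MB frame */
/* 1024 entries, aligned on a 4 KB boundary. */
uint32_t boot_page_directory[1024] __attribute__((aligned(4096))) = {
    /* Step 2: identity map the first 4 MB (virtual 0x00000000 -> physical 0x00000000),
       so the code currently executing around 0x00100000 keeps working. */
    [0]   = 0x00000000 | PDE_PRESENT | PDE_WRITABLE | PDE_4MB,
    /* Step 3: entry 768 (0xC0000000 >> 22) maps virtual 0xC0000000-0xC03FFFFF to
       physical 0x00000000-0x003FFFFF, which places the kernel at 0xC0100000. */
    [768] = 0x00000000 | PDE_PRESENT | PDE_WRITABLE | PDE_4MB,
};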
If we skipped the identity mapping of the first 4 MB, a page fault would be generated as soon as paging was enabled, because eip would still point at a low (physical) address. Once the table has been constructed, we can enter the higher half by making eip point at a virtual address in the upper half, using a jump to a label:
; assembly code executing at around 0x00100000
; enable paging for both actual location of kernel
; and its higher-half virtual location
lea ebx, [higher_half] ; load the address of the label in ebx
jmp ebx ; jump to the label
higher_half:
; code here executes in the higher half kernel
; eip is larger than 0xC0000000
; can continue kernel initialisation, calling C code, etc.
After the jump, eip points at a memory location somewhere just above 0xC0100000, so all code now executes as if it were located in the higher half. The identity mapping of the first 4 MB is no longer needed; its entry can be removed from the page directory and the corresponding TLB entry invalidated with invlpg [0].
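Dropping the identity mapping can be sketched in C as well; this reuses the illustrative boot_page_directory and invlpg helper from the earlier sketches:
#include <stdint.h>
extern uint32_t boot_page_directory[1024];   /* the illustrative boot page directory */
static inline void invlpg(void *virtual_addr)
{
    __asm__ volatile ("invlpg (%0)" : : "r" (virtual_addr) : "memory");
}
/* Remove the identity mapping of the first 4 MB once we run in the higher half. */
void drop_identity_mapping(void)
{
    boot_page_directory[0] = 0;   /* clearing the entry also clears its present bit */
    invlpg((void *) 0);           /* invalidate the stale TLB entry                 */
}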
Running in the Higher Half
With a higher-half kernel, there are a few extra things to consider. Memory-mapped I/O, which uses specific memory addresses, needs special care. For example, the frame buffer is no longer reachable at 0x000B8000, since there is no longer an entry in the page table for that address; the address 0xC00B8000 must be used instead, because virtual address 0xC0000000 now maps to physical address 0x00000000.
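As a small illustration (assuming physical address 0x00000000 is mapped at 0xC0000000 as described above; KERNEL_VIRTUAL_BASE is an illustrative name), the frame buffer pointer can be computed like this:
#include <stdint.h>
#define KERNEL_VIRTUAL_BASE  0xC0000000u  /* physical 0x00000000 is mapped here */
#define FRAMEBUFFER_PHYSICAL 0x000B8000u  /* where the frame buffer really is   */
/* With a higher-half kernel, the frame buffer must be accessed through the
   kernel's virtual mapping: 0x000B8000 + 0xC0000000 = 0xC00B8000. */
static inline volatile uint16_t *framebuffer(void)
{
    return (volatile uint16_t *) (uintptr_t) (FRAMEBUFFER_PHYSICAL + KERNEL_VIRTUAL_BASE);
}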
Any explicit references to addresses within the multiboot structure also need to be updated to use the new virtual addresses.
Mapping the kernel with 4 MB pages is easy, but it wastes memory (unless you have a really big kernel). If you want to save memory, you can map the higher-half kernel with 4 KB pages instead, but then memory must be reserved for a page directory and at least one page table, and the mappings from virtual to physical addresses must be set up at run-time. The size of the kernel can be determined by exporting labels from the linker script, which we will need to do later anyway when constructing the page frame allocator.
Virtual Memory Through Paging
Paging enables two things that are good for virtual memory. First, it allows fine-grained control over memory access: a page can be marked read-only, accessible only from PL0, and so on. Second, it creates the illusion of contiguous memory: the memory seen by a process appears contiguous and can be extended without moving data around. We can also let user mode applications access all memory below 3 GB, but unless they actually use it, we don’t have to allocate page frames. This allows a process to have its code placed near 0x00000000 and its stack just below 0xC0000000 while only requiring two actual pages.
So that’s all about paging! I hope this article gave you a good understanding of paging. Let’s meet again with another exciting step next week. Until then, be motivated and keep learning! :)
Written by,
R.A.W. Lalendra
Undergraduate in BSc (Hons) Software Engineering
University of Kelaniya, Sri Lanka.