W4118

W4118 Fall 2025 - Homework 5

DUE: Wednesday 11/19/2025 at 11:59pm ET

Modified: Fri Nov 14, 12:00pm (part 1, question 3)

General Instructions

All homework submissions are to be made via Git. You must submit a detailed list of references as part of your homework submission indicating clearly what sources you referenced for each homework problem. You do not need to cite the course textbooks and instructional staff. All other sources must be cited. Please edit and include this file in the top-level directory of your homework submission in the main branch of your team repo. Be aware that commits pushed after the deadline will not be considered. Refer to the homework policy section on the class website for further details.

Group programming problems are to be done in your assigned groups. We will let you know when the Git repository for your group has been set up on GitHub. It can be cloned using the following command. Replace teamN with your team number, e.g. team0. You can find your group number here.

$ git clone git@github.com:W4118/f25-hmwk5-teamN.git

IMPORTANT: You should clone the repository directly in the terminal of your VM, instead of cloning it on your local machine and then copying it into your VM. The filesystem in your VM is case-sensitive, but your local machine might use a case-insensitive filesystem (for instance, this is the default setting on macs). Cloning the repository to a case-insensitive filesystem might end up clobbering some kernel source code files. See this post for some examples.

This repository will be accessible to all members of your team, and all team members are expected to make local commits and push changes or contributions to GitHub equally. You should become familiar with team-based shared repository Git commands such as git-pull, git-merge, git-fetch. For more information, see this guide.

There should be at least five commits per member in the team’s Git repository. The point is to make incremental changes and use an iterative development cycle. Follow the Linux kernel coding style. You must check your commits with the run_checkpatch.sh script provided as part of your team repository. Errors from the script in your submission will cause a deduction of points. (Note that the script only checks the changes up to your latest commit. Changes in the working tree or staging area will not be checked.)

For students on Arm computers (e.g. macs with M1/M2/M3 CPU): if you want your submission to be built/tested for Arm, you must create and submit a file called .armpls in the top-level directory of your repo; feel free to use the following one-liner:

$ cd "$(git rev-parse --show-toplevel)" && touch .armpls && git add .armpls && git commit -m "Arm pls"

You should do this first so that this file is present in any code you submit for grading.

For all programming problems, you should submit your source code as well as a single README file documenting your files and code for each part. Please do NOT submit kernel images. The README should explain any way in which your solution differs from what was assigned, and any assumptions you made. You are welcome to include a test run in your README showing how your system call works. It should also state explicitly how each group member contributed to the submission and how much time each member spent on the homework. The README should be placed in the top level directory of the main branch of your team repo (on the same level as the linux/ and user/ directories).

Assignment Overview

A core feature of memory management in Linux (and most other operating systems) is the concept of virtual address spaces. To each process running in userspace, it appears as if it is the only process using the memory. It can allocate, read, write, and deallocate memory at any allowed ‘virtual’ address without worrying if some other process happens to be using that same address. However, in reality multiple virtual address spaces coexist in physical memory.

The Linux kernel is responsible for providing this abstraction to userspace processes; it keeps track of every virtual address for each running process, and maps those addresses to physical memory. However, storing an exact, one-to-one mapping from every virtual address of every process to a corresponding physical address is clearly impossible given physical memory limitations. Instead, the kernel uses some smart data structures (in combination with hardware support) to provide the virtual memory abstraction transparently to processes, while in reality storing only a tiny fraction of all possible mappings.

In this assignment, you will explore the two main data structures the kernel uses to manage memory mappings: virtual memory areas and page tables.

Virtual memory areas (VMAs), also called memory regions, are how the kernel tracks the memory that a process thinks it has allocated, along with some associated state. They represent regions in the virtual address space, but are independent of physical pages in memory.

Page tables, on the other hand, map those virtual memory addresses to physical memory addresses once real memory is actually required. In addition to mappings, the page tables also include state information (e.g. whether pages are read-only or dirty).

In this assignment, you will see how VMAs and page tables are managed and updated by tracing these updates for a specified process. We will create system calls that allow an ‘inspector’ process to access a continuously updating ‘shadow’ page table describing a small segment of the virtual address space of some other target process.

For simplicity, our shadow page table will only have a single level, and will store information about both the real page tables and the virtual memory areas of the target process. Think of the shadow page table as a single-level map from a page in the virtual address space to a frame of physical memory.

This assignment follows an incremental development and testing approach. As you implement kernel features for tracking memory management events in Parts 2 and 3, you will simultaneously build userspace inspector and target programs. This allows you to test each piece of your kernel implementation immediately by writing target code that triggers the specific events (like mmap or page faults) and verifying that your inspector correctly reports the changes in the shadow page table.

PART 1: Creating shared kernel-user memory

In homework 2, you created a process tree by having the kernel allocate memory to store the process tree, then copied the tree to memory allocated by the user program. This has two shortcomings:

You have to repeatedly copy the entire tree using copy_to_user, which is (relatively) slow and might sleep
You don’t get real-time updates of the tree as processes are created, you only get the updated tree when you call the system call.

For this assignment, we want to share memory between the kernel and a user program, so that updates to the memory by the kernel are immediately reflected in the memory that the user program sees, without needing to call a system call each time. We will use this shared memory to store a version of a process’s page table that is live updated by the kernel as the process triggers page faults.

PART 1.1: Opening shared memory

Implement the following system calls in linux/kernel/shadowpt.c:

/*
 * Syscall No. 467
 * Allocate some memory in kernel space,
 * and remap the memory into the indicated userspace range.
 *
 * The `target` argument will be put to use in the next part.
 */
long shadowpt_open(pid_t target, void *dest, size_t size);

Before calling shadowpt_open, the calling program should have mmap-ed size bytes of memory at dest. The syscall should then allocate some kernel-space memory of the same size, then link it to the userspace memory using remap_pfn_range (more details below). Afterwards, any changes made to that area of memory by kernel code should be immediately reflected in the linked userspace memory.

Take a look at the remap_pfn_range function to implement this remapping. The function takes several arguments, including the vma representing the user-space region to use, the starting virtual address to use, and the pfn representing the starting page frame number of the physical memory region to map. It is well documented for use in kernel drivers, but it can also be used elsewhere. Look at how it is implemented and try to understand in broad terms how it works (this will be useful for the next parts of the assignment). Also look at usages across the kernel, for example here (called here).

Note that after remapping, you will end up with two virtual addresses to the same set of physical pages: one in kernel space, which kernel space code will use to write to it, and one in userspace, which your program will use to read it. This means that any memory updates in the kernel effectively happen instantly for the userspace process. To test this, after remapping the memory, try changing some bytes in the shared memory from the kernel, and check you can see those changes from userspace.

It may seem a bit weird that we allocate the memory twice, once in userspace and once in kernel space. This is correct though, because what we are really doing with remap_pfn_range is updating the userspace page table entries so that they point to the same physical pages that we allocated in kernel space. Remember, the virtual memory allocated in userspace does not have any physical memory associated initially.

Although your syscall is given the userspace address and size to map, you need to get the associated VMA with that address, as you need to pass it as the first argument of remap_pfn_range. You also need to ensure the VMA is not split, and that none of its PTEs have been allocated yet.

Consider the following steps to implement remapping:

Allocate kernel pages used for the remapping (alloc_pages_exact may be useful).
Find the VMA (vma_lookup/find_vma) containing the memory to be used in remap_pfn_range, and tailor it for remapping (vma_modify_flags) so that the VMA contains exactly the range of addresses to be remapped. See notes below on how to use vma_modify_flags. vm_flags_clear could also be useful to clear flags.
Check that no PTEs have been allocated in the VMA so that our remapping does not corrupt any existing user mappings. You may find it helpful to review kernel functions such as __handle_mm_fault, __do_page_fault, show_pte(arm64) for walking page tables. If any PTEs for the VMA have been allocated, you should return an error as remap_pfn_range will not work in that case.
Modify the VMA flags so that it cannot be written from userspace. You may also want to restore the original VMA flags later so the userspace process can use the memory normally after tracking ends.
virt_to_phys could be useful for converting virtual to physical addresses.
Above changes to the memory mappings should be within a mmap_write_lock.

Input Checking

If any of the provided arguments are invalid (for example, if the target process does not exist, your system call should return -EINVAL). In addition to basic sanity checking of the inputs, there are some additional limitations you should impose on callers:

Only one process can have mapped memory at a time. If a process calls the system call when another process has already opened some shared memory, it should receive -EBUSY.
Only a superuser should be able to make the system call, otherwise, it should give -EPERM.
dest should be page aligned. Otherwise, return -EINVAL.

Notes:

Any errors or failures in your handler should be handled appropriately. On failure, anything allocated or referenced should be cleaned up and the appropriate errno should be returned.
Make sure your system call handles race conditions between multiple callers and in general handles concurrency gracefully. Under any condition, when any number of callers make the system call, at most one should return successfully and be unaffected by the others.
remap_pfn_range is done at page granularity, so the actual required memory size needs to be rounded up to the next page boundary.
The vma_modify function takes in attributes (flags, start address, end address, etc.) about what some section of a vma will look like after you modify it, and either merges it with its neighbors (if attributes match) or splits it based on start and end to create an isolated vma with the new attributes. In this case, focus on how to parameterize vma_modify to guarantee that the memory you will remap is isolated in exactly one vma. The vma_modify_flags function is particularly useful for as it calls vma_modify and sets specific attributes for the vma (look at VM_SPECIAL).

PART 1.2: Unlinking your shared memory

Implement the following syscall:

/*
 * Syscall No. 468
 * Unmap and release the shared memory, and
 * reset any global state set in shadowpt_open.
 */
long shadowpt_close(void);

Since you don’t want to leave the inspector process with access to kernel pages, you should undo the remap_pfn_range to disconnect the virtual addresses of the process from these physical pages. There are multiple ways to do this, but a good place to begin is by looking through the munmap system call to see how it breaks down a regular mmap’d region. You may also find zap_vma_ptes helpful.

Once the close syscall is made, the entries range should behave as if shadowpt_open were never called, and the region was just mmap’d.

PART 1.3: Track page faults in a given process

Now that your shared memory is working, you should use it to track page faults in real-time for a given process.

You should modify the kernel code at the appropriate location to call a hook function whenever a page fault happens. An example ‘hook’ function prototype might look something like this, defined in include/linux/shadowpt.h:

/*
 * Log a page fault for the given mm and address.
 */
void page_fault_happened(struct mm_struct *mm, unsigned long vaddr);	

When a page fault occurs in the target process (first argument of shadowpt_open), and the shared memory region is currently mapped, you should record the faulting virtual address in the shared memory ring buffer. Treat the shared memory as an array of unsigned long entries with total byte length size. Compute the slot index as

index = (vaddr % size) / sizeof(unsigned long);

and then store the address with

shared_memory[index] = vaddr;

You should be able to see this modification in your userspace program. Make sure to use an appropriate locking mechanism to prevent multiple page faults from causing simultaneous writes. You should overwrite any entries for a given index if one is already present.

To test this mechanism, spawn two programs: one which periodically mmaps a new memory region and writes to it, and another which tracks the first program’s page faults and periodically prints them. You should also see an output if the target program reads an invalid address and segfaults.

Tasks:

Implement shadowpt_open and shadowpt_close.
Test your system call to see that you can perform remapping, and that when you make some writes in the kernel they appear in userspace.
Modify any relevant files in the kernel to reflect the newly added system call.
In addition to mapping the shared memory, this system call will likely need to store some information globally so that later on we can write to it from various locations in the kernel.

PART 2: Creating the shadow page table

Now that your shared memory implementation works and you have some experience with how linux page faults work, you will begin implementing the shadow page table. We refer to it as a shadow page table because it shadows the real page table of a process to reflect updates to the real page table, though it is not actually used for the actual address translation for the process. What you are essentially going to do is provide some structure to the shared memory implementation you already completed so that it can be used as a shadow page table by defining the data structures that will be written and read from the shared memory.

PART 2.1: Defining the shadow page table

First, set up the data structures for the shadow page table that you will provide to inspecting processes. Since both your internal kernel logic and the userspace processes that use your system call need to know how these are structured, you should place them in a UAPI header. Create the file include/uapi/linux/shadowpt.h, and add the following components:

struct shadow_pte, which represents a single entry in the shadow page table. You should store one for each page in the target virtual address range. Each entry includes a page-aligned virtual address at the start of the virtual page, the physical frame number (pfn) of the mapped physical page (or 0 if it is unmapped), and flags to describe the state of the entry, which we refer to as its mapping state.

/* The data structure representing a single entry in the page table */
struct shadow_pte {
	unsigned long vaddr; /* Virtual address of the base of this page */
	unsigned long pfn; /* Physical frame number of the associated physical page */
	unsigned long state_flags; /* Flags describing the mapping state */
};

Some constant flag values that will be packed in the state_flags field.

#define SPTE_VADDR_ALLOCATED 1    /* the page is allocated */
#define SPTE_VADDR_ANONYMOUS 2    /* the page is for anonymous memory */

#define SPTE_PTE_MAPPED 4         /* a physical frame of memory is mapped */

A small number of helper macros to read whether a particular state flag in a shadow_pte is set.

/* Macros for testing whether each state flag is set */
#define is_vaddr_allocated(state) (state & SPTE_VADDR_ALLOCATED)
#define is_vaddr_anonymous(state) (state & SPTE_VADDR_ANONYMOUS)

#define is_pte_mapped(state) (state & SPTE_PTE_MAPPED)

struct user_shadow_pt to represent the full shadow page table for the target range, which includes the start and end virtual addresses, and an array of all the shadow page table entries in that range:

/* Max range 2048 4K pages */
#define MAX_SPT_RANGE (0x800000)

/* The data structure representing the full page table */
struct user_shadow_pt {
	unsigned long start_vaddr; /* first virtual address in the target range */
	unsigned long end_vaddr; /* last virtual address in the target range + 1 */
	struct shadow_pte *entries; /* array of all page table entries */
};

Note that end_vaddr should be the address after the last virtual address in the target range. i.e., if we wish to examine addresses 0 - 4095, we would set start_vaddr to 0 and end_vaddr to 4096.
You may limit the maximum range (i.e. end_vaddr - start_vaddr) a shadow page table can contain to MAX_SPT_RANGE, since this implementation isn’t particularly space efficient.

As you write your code, you may find it helpful to add some helper functions or macros to this header file to more easily access information related to these structures. However, do not modify the provided data structures or add additional data structures to pass information between userspace and the kernel. The provided data structures make up the minimal and complete API for your system call; any additions you make should be for convenience only.

In include/linux/shadowpt.h, you’ll want to both include the UAPI header (with #include <uapi/linux/shadowpt.h>) and define any additional data structures, short helper functions, or prototypes that might need to be accessible from within the kernel, but shouldn’t be exposed to userspace. When you do this, make sure you have appropriate include guards in both header files.

Tasks:

Add a UAPI header including the provided data structures (with the exact provided definitions) and any other necessary information to be used in the shadow page table in a header file.
Add an internal header including any data structures/prototypes you will use to internally manage the shadow page table.

Notes:

To make your UAPI header accessible in userspace, don’t forget to run make headers_install INSTALL_HDR_PATH=/usr after compiling and booting into your new kernel. This command will install the headers found under include/uapi/ in your Linux source tree into /usr/include/. Now you should be able to #include <linux/shadowpt.h> from userspace. Make sure that you are able to run make headers_install with no compilation errors. We will consider an error when running make headers_install as your kernel failing to compile.
If you are curious about the motivation behind separating UAPI and internal kernel headers, check out this article explaining why the separation was introduced.

PART 2.2: Creating the shadow page table

Next you will modify the shadowpt_open system call to create our shadow page table and make it available to the inspector process. It should now have the following interface:

/*
 * Syscall No. 467
 * Set up the shadow page table in kernel space,
 * and remap the memory for that page table into the indicated userspace range.
 * Target should be a process, not a thread.
 */
long shadowpt_open(pid_t target, struct user_shadow_pt *dest);

System Call Usage/Behavior

An inspector process will call this system call with the pid of the process it would like to inspect and a struct user_shadow_pt. This struct should be populated to contain the target start and end addresses, and a linear, preallocated entries array. entries should point to a virtual memory region that has been allocated by the inspector process. Remember, the vma corresponding to the memory you want to share between userspace and the kernel, which now corresponds to the entries array, cannot have any physical memory already allocated when you call shadowpt_open.

In the system call function, the kernel should take these inputs, validate them, and build a corresponding shadow page table in kernel space. Once this has been done, your function should remap the pages storing the shadow page table into the inspector’s virtual memory using remap_pfn_range, now mapping to the dest->entries userspace pointer. For now you do not need to populate the shadow page table memory, you can leave it empty. You should remove your page fault tracking behaviour if it is still active.

Input checking

In addition to the checks in Part 1.1, implement the following:

If the target range spans more than MAX_SPT_RANGE, truncate end_vaddr while keeping start_vaddr unchanged. Update the user_shadow_pt at dest so the user will be notified.
If the address range at entries is already mapped to physical pages, return -EINVAL.
If the VMA at entries is not large enough to hold the shadow page table, return -EINVAL. Note remap_pfn_range is done at page granularity, so the actual required memory size needs to be rounded up to the next page boundary.

PART 2.3: Destroying the shadow page table

You should also update shadowpt_close to close the shadow page tables, both intentionally and automatically when something goes wrong. It should now reset any global state set up by shadowpt_open. Only the exact process which successfully called shadowpt_open should be able to directly call shadowpt_close.

Inspector Process Exit

An important corner case you need to handle is if the inspector process exits before making the close system call, particularly since we only allow the inspector process itself to successfully make that call.

You should make sure that no matter how the inspector process exits, your kernel behaves as if the close system call was made and cleans up properly.

Notes:

As with before, consider concurrency bugs. Make sure that your program behaves correctly for any set of open/close calls made at any time.
Think carefully about how you detect the inspector process exiting. You should make sure that you catch any exit for any reason.

Tasks:

Implement and test the shadowpt_close call.
Modify any relevant files in the kernel to reflect the newly added system call.
Implement handling for cleaning up when the inspector process exits.

PART 2.4: Building user programs for testing

With the core data structures and system calls for creating and destroying the shadow page table in place, you will now create the initial versions of the userspace programs that will be used for testing throughout the rest of the assignment.

Implement these programs in the user/part4 directory. Your Makefile should be configured to build two executables in this directory, inspector and target, when make is called.

1. The inspector Program:

Write the initial version of your inspector program. It should be called as follows:

$ ./inspector <target_pid> <start_address> <end_address>

When called, it should:

Initialize a struct user_shadow_pt.
Allocate a memory region for the entries array using mmap. This is the memory that the kernel will remap.
- Note: it’s a bad idea to use malloc to prepare a memory section for mapping page table entries, because malloc cannot allocate memory with more than M_MMAP_THRESHOLD (128KB by default). Instead you should consider using the mmap system call; take a look at do_mmap in mm/mmap.c to set up the proper flags to pass to mmap.
- Once you’ve mmap’ed a memory area for remapping page table entries, the kernel will create a Virtual Memory Area for it. The flags for the memory area should include MAP_ANONYMOUS and MAP_PRIVATE.
Call shadowpt_open with the target PID and the configured user_shadow_pt struct.
Upon success, enter a loop that prints the contents of the shadow page table once per second.
Implement a signal handler for SIGINT (ctrl+c) that ensures shadowpt_close is called before the program exits, so kernel resources are cleaned up properly.

2. The target Program:

Write a simple target program that prints its own PID and then sleeps forever. This will give you a stable process to aim your inspector at for initial testing. Later on, you will modify your target process to do more advanced memory operations to be inspected.

At this stage, your inspector should be able to successfully track the target, open the shared memory, and display an empty shadow page table.

PART 3: Initialization and tracking mmap updates

Now that you can create/destroy the shadow page table (Part 2), you’ll initialize it to mirror the existing VMA status and keep it roughly up to date, limited to mmap-based allocations. In Part 4, you will add closer tracking of various other events.

PART 3.1: Begin/stop tracking

Add the following syscalls:

/*
 * Syscall No. 469
 * Begin tracking and initialize shadow page tables to
 * the state of the tracked process's VMAs and PTEs.
 */
long shadowpt_start(void);

/*
 * Syscall No. 470
 * Stop tracking additional changes.
 */
long shadowpt_stop(void);

shadowpt_start should walk the page tables of the target process, copying over any entries in the address range passed to shadowpt_open. Specifically, it should iterate over the VMAs and page table entries of your target process, copying them over to your shadow page table.

There should only be updates to the shadow page table between the start and stop system calls. To prevent tracking from falling into a corrupted state, shadowpt_start should always reset the shadow page table to a state that matches the current state.

Like shadowpt_close, shadowpt_start and shadowpt_stop should only be called by the process that set up the tracking via shadowpt_open.

You do not need to return an error if shadowpt_close is called when tracking is still enabled (i.e. called shadowpt_start but not shadowpt_stop).

PART 3.2: mmap-based VMA updates

When the virtual address space representation (i.e. the VMA structs) is updated for an address (or some range of addresses) within the given range, you should update the shadow page table in kernel space for that address.

An example ‘hook’ function prototype might look something like this:

/*
 * Update shadow page table entries in the given range.
 * 
 * This function should log properties (allocated, anonymous, etc.) for virtual addresses.
 */
void update_shadow_vaddr(struct mm_struct *mm, unsigned long vstart, unsigned long vend);

You should track the following pieces of information about each page in the target range, and update its struct shadow_pte accordingly using the constant flag values we defined above:

Set SPTE_VADDR_ALLOCATED in the state_flags field when the address is allocated (i.e. after an mmap call). The bit should not be set when the address is not allocated.
Set SPTE_VADDR_ANONYMOUS in the state_flags field when the address is anonymous. The bit should not be set if the address is not for anonymous memory (i.e. for file-backed mapping).

For now, you should place hooks to track allocation changes that happen for one of the following reasons:

Full initialization to existing VMA status when shadowpt_start is called.
Allocation by mmap.

Notes:

When reading from components of the memory management system, remember that you are not the only one using them. Make sure that you grab the appropriate locks and increment reference counts as necessary while accessing shared kernel data structures.
Spend some time looking through the kernel code to understand when and how virtual memory areas (VMAs) are allocated and deallocated, and when and how their state changes.

Tasks

Implement update_shadow_vaddr, or some similar update function.
Add update_shadow_vaddr calls in various kernel locations to track relevant changes.

PART 3.3: mmap-based PTE updates

Just like VMA structs, page table entries (PTEs) have some state flags associated with them, and can either be mapped or not. Your shadow page table should track the current state of PTEs in addition to VMA flags for particular pages.

Again, it makes sense to create some kind of update function that you can call to track changes. An example could look something like this:

/*
 * Update the shadow page table when PTEs are updated
 */
void update_shadow_pte(struct mm_struct *mm, unsigned long vstart, unsigned long vend);

You should track the following pieces of information for each virtual page in the target range:

Set SPTE_PTE_MAPPED when the virtual page is associated with a physical page (via a new PTE). The bit should not be set when there is no mapping.
When SPTE_PTE_MAPPED is set, save the physical frame number of the associated page in the pfn field of the shadow_pte struct.

The key to this part is identifying where in the kernel a PTE is modified. Again, for now we should just track PTE updates triggered by mmap, which would be any time a new PTE is inserted into the page table (for example, this could happen when a user process tries to write to a particular address for the first time). For simplicity, you do not need to track updates that involve huge pages, but your shadow page table should work correctly to track standard 4 KB pages even in the presence of huge pages.

Notes:

You will need to walk the page tables at some point. While there are many functions that already exist in the kernel to do this, they are pretty much all far too complicated for our purposes. You should implement your own function for walking the page table rather than trying to use an existing function. You may find it helpful to review kernel functions such as __handle_mm_fault, __do_page_fault, show_pte(arm64).
You might want more than just the provided example update_shadow_pte function to behave in different contexts (with different locks held, for example).
A nonexistent PTE can be detected either if the pte_none function returns true, or if some containing table higher in the page tables does not exist. Note that this is different from the PTE_PRESENT bit, which indicates whether the page has been swapped to disk (which we are not asking you to track).
Once again, make sure you keep concurrency in mind. Many processes can access the kernel’s memory-related data structures at once, so be sure to grab the appropriate locks and references when you read and write kernel shared data.

Tasks

Implement update_shadow_pte to update your shadow page table.
Implement your own function for walking the page table.
Place calls to update_shadow_pte in as many locations as required to cover the mentioned scenarios.

PART 3.4: Testing

To test your VMA and PTE tracking, modify your target program to allocate and access a region of memory using mmap.

Have the target call mmap to allocate a region within the range your inspector is watching and print the mapped address.
Sleep briefly to allow the inspector to observe the allocation.
Write to the mapped region to trigger a page fault on first access.

Run your inspector while the target executes. You should observe:

After mmap: SPTE_VADDR_ALLOCATED is set (and SPTE_VADDR_ANONYMOUS if anonymous); SPTE_PTE_MAPPED is unset and pfn is 0.
After first-touch write: SPTE_PTE_MAPPED becomes set and pfn is populated.

This confirms that both your VMA hooks (update_shadow_vaddr) and PTE hooks (update_shadow_pte) are functioning correctly.

Tasks:

Update your target binary to mmap some memory for testing and generate a page fault

PART 4: Full life cycle tracking

Now extend your tracker to handle a broader range of memory allocations and keep the shadow page table synchronized, including clearing entries when mappings are torn down or invalidated, for example as a result of memory deallocations.

PART 4.1: Extending VMA updates

Place VMA hooks that you already implemented in Part 3 so that you can now track changes that happen for one of the following reasons:

(DONE in PART3) Full initialization to existing VMA status when shadowpt_start is called.
(DONE in PART3) Allocation by mmap.
Allocation by malloc.
Stack allocation via expansion of a VMA marked VM_GROWSDOWN/VM_GROWSUP.
Deallocation by free and munmap.
Full deallocation due to process exit.
Replacement by the execve system call.

Notes:

Make sure to unset SPTE_VADDR_ALLOCATED in the state_flags field when the address is deallocated (i.e. after an free call).
VMAs that are marked VM_GROWSDOWN/VM_GROWSUP can be expanded by the kernel without an explicit mmap. For example, the VMA for the stack is marked either VM_GROWSDOWN or VM_GROWSUP so that its start address can continue to change as the stack size increases without repeatedly making system calls. See expand_stack_locked for more details.
When the target process exits, you should not unmap the shadow page table, nor interfere with that process’ exit. Instead, you should mark all the virtual pages in your target range as deallocated (and stay that way until shadowpt_close is called or the inspector process exits).

PART 4.2: Extending PTE updates

Similarly, you should now make sure that you track PTE updates for the following scenarios:

(DONE in PART3) Full initialization to existing PTE status when shadowpt_start is called.
(DONE in PART3) Any time a new PTE is inserted into the page table (for example, this could happen when a user process tries to write to a particular address for the first time).
Any time a PTE is invalidated, or marked to be removed from the page table (see the MMU notifier hint below for suggestions on how to implement this).

Notes:

Make sure to unset SPTE_PTE_MAPPED when the mapping for a virtual page to a physical page is removed.
There is a huge number of places in the kernel where PTEs are invalidated, and placing an update after each one would be extremely time consuming. Luckily, we’re not the only ones interested in page tables, and as a result the kernel provides the MMU notifier, which allows us to register our own callback functions. These callback functions will be called whenever a particular range of PTEs is going to be invalidated.
Read this article for information about when the MMU notifier was first implemented, and this one for an update about recent changes that could provide some context for how to use it. A good idea would be to add an mmu_notifier to your kernel shadow page table and include something like the following in your code:
```
static const struct mmu_notifier_ops shadowpt_mmu_notifier_ops = {
    .invalidate_range_end = shadowpt_invalidate_range_end
};

/*
 * Register mmu notifier with mm struct.
 * Needs pinned mm struct. Can't hold mm lock or spinlock when
 * called. Returns 0 on success or -errno on failure.
 */
static int shadowpt_notifier_register(struct mm_struct *target_mm,
    struct mmu_notifier *dest_notifier)
{
    dest_notifier->ops = &shadowpt_mmu_notifier_ops;
    return mmu_notifier_register(dest_notifier, target_mm);
}
```
You can then register/unregister your notifier on calls to shadowpt_start/stop.
- For an example of a kernel component that uses the MMU notifier, check out KVM, the hypervisor built into Linux.
- You will need to register and unregister for callbacks with the notifier when you start and stop tracking. Think carefully about concurrency implications of each of these calls, and about where you place them in relation to the rest of your code.

PART 4.3: Testing

Test that your shadow page table correctly tracks all lifecycle events described in this part.

Run your updated target and inspector to verify updates for:

mmap/munmap allocations
malloc/free (heap growth via brk)
Stack expansion (VM_GROWSDOWN/VM_GROWSUP)
execve replacement
Process exit
MMU notifier invalidations

Your inspector should show flags and PFNs updating correctly as each event occurs, and clearing when mappings are removed. This confirms your shadow page table remains synchronized across allocation, deallocation, and invalidation events.

PART 5: Written questions

Write answers to the following questions in user/part5.txt, following the provided template exactly. Make sure to include any references you use in your references.txt file, and that you are referencing the correct kernel version in bootlin (v6.14). For reference, the URLs you answer with should be in the following format: https://elixir.bootlin.com/linux/v6.14/source/kernel/sched/core.c#L1234

Suppose we have a 32-bit system whose hardware only supports two levels of paging and the page size is 4KB. All levels of the page table for process 1 are allocated in physical memory, and they are stored contiguously starting at physical frame number (pfn) 10. In other words, the pgd is stored starting in pfn 10, and the ptes are stored starting in the physical frame of memory immediately after those used to store the pgd. For process 1, if virtual address 0x3c0ffff maps to physical address 0x7d0fff, what pfn will be stored in the pgd and pte entries of process 1’s page table that are used to translate this virtual address to its physical address? Write your answer using decimal numbers (not binary or hexadecimal).
Suppose you have a system that uses a TLB and a 4-level page table, and a virtually addressed cache. The TLB access time is 10 ns, the virtually addressed cache access time is 40 ns, and RAM access time is 100 ns. The TLB has a 95% hit rate, and the virtually addressed cache has a 90% hit rate. What is the average access time?
For a typical modern 64-bit Linux system which uses a 48-bit virtual address space for processes and 4KB sized pages, if a single frame of physical memory is mapped to a process’s page table, what is the minimum amount of memory that would be required to store the process’s page table when using a one-level page table, two-level page table, three-level page table, and four-level page table?
Specify the URL in bootlin indicating the file and line number at which the physical address is computed from the pgd, and the file and line number at which that address is loaded to the page table base register for the user space address space on a context switch, for both arm64 and x86.
Consider the following C program, run on a single CPU on x86 architecture:
```
int main() {
    int *ptr = NULL;
    *ptr = 5;
}
```
Identify the kernel functions from those listed below that will get executed, and put them in the order in which they will be called. Start your trace at the time when the process begins executing the first instruction of main(), and end your trace when the process will no longer run anymore. Functions may be called multiple times. Not all functions need to be used. Also, not all functions that are executed are listed below – limit your answer to include only these functions. In your answer, you should write each function exactly how it appears below (no extra tabs, bullets, spaces, uppercase letters, etc.), with each function on a separate line. We will be grading your answers based on a diff – if you do not follow the formatting specifications, you will not receive credit.
- arch_do_signal_or_restart
- bad_area_nosemaphore
- context_switch
- deactivate_task
- dequeue_task
- do_exit
- do_fault
- do_group_exit
- do_kern_addr_fault
- do_user_addr_fault
- exc_page_fault
- exit_to_user_mode_loop
- exit_to_user_mode_prepare
- find_vma
- force_sig_fault
- force_sig_fault_to_task
- get_signal
- handle_mm_fault
- handle_page_fault
- handle_pte_fault
- handle_signal
- irqentry_exit
- irqentry_exit_to_user_mode
- kernelmode_fixup_or_oops
- lock_mm_and_find_vma
- lock_vma_under_rcu
- page_fault_oops
- pgtable_bad
- __schedule
- send_signal_locked
- sigaddset
- sys_call_table
- vma->vm_ops->fault

Submission Checklist

Include the following in your main branch. Only include source code (ie *.c,*.h) and text files, do not include compiled objects.

.armpls file for teams on Arm
README file
references.txt file
Implementation of system calls and additional functions added to linux/kernel/shadowpt.c
Implementation of your inspector and target programs in user/part4
Answers to written questions in user/part5.txt