最后讨论一个有趣的问题:如果多个VMA的虚拟页面同时映射了同一个匿名页面,那么page->index应该等于多少?
虽然匿名页面和KSM页面可以通过PageAnon()和PageKsm()宏来分区。但是这两种页面究竟有什么区别呢?是不是多个VMA的虚拟页面共享一个匿名页面的情况就一定是KSM页面呢?这是一个非常好的问题,可以从中窥探出匿名页面和KSM页面的区别。这个问题要分两种情况,一是父子进程的VMA共享同一个匿名页面,二是不相干的进程VMA共享同一个匿名页面。
(1) 第一种情况在RMAP反向映射机制时已经介绍过。父进程在VMA映射匿名页面时会创建属于这个VMA的RMAP反向映射的设施,在__page_set_anon_rmap()里会设置page->index值为虚拟地址在VMA中的offet。子进程fork时,复制了父进程的VMA内容到子进程的VMA中,并且复制父进程的页表到子进程中,因此对于父子进程来说,page->index 值是一致的。
当需要从page找到所有映射page的虚拟地址时,在rmap_walk_anon()函数中,父子进程都使用page->index值来计算在VMA中的虚拟地址,详见rmap_walk_anon()->vma_address()函数。
static int rmap_walk_anon(struct page *page, struct rmap_walk_control *rwc)
{
struct anon_vma *anon_vma;
pgoff_t pgoff;
struct anon_vma_chain *avc;
int ret = SWAP_AGAIN;
anon_vma = rmap_walk_anon_lock(page, rwc);
if (!anon_vma)
return ret;
pgoff = page_to_pgoff(page);
anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root, pgoff, pgoff) {
struct vm_area_struct *vma = avc->vma;
unsigned long address = vma_address(page, vma);
if (rwc->invalid_vma && rwc->invalid_vma(vma, rwc->arg))
continue;
ret = rwc->rmap_one(page, vma, address, rwc->arg);
if (ret != SWAP_AGAIN)
break;
if (rwc->done && rwc->done(page))
break;
}
anon_vma_unlock_read(anon_vma);
return ret;
}
(2) 第二种情况是KSM页面,KSM页面由内容相同的两个匿名页面合并而成,它们可以是不相干的进程的VMA,也可以是父子进程的VMA,那么它的page->index值应该等于多少呢?
void do_page_add_anon_rmap(struct page *page,
struct vm_area_struct *vma, unsigned long address, int exclusive)
{
int first = atomic_inc_and_test(&page->_mapcount);
if (first) {
/*
* We use the irq-unsafe __{inc|mod}_zone_page_stat because
* these counters are not modified in interrupt context, and
* pte lock(a spinlock) is held, which implies preemption
* disabled.
*/
if (PageTransHuge(page))
__inc_zone_page_state(page,
NR_ANON_TRANSPARENT_HUGEPAGES);
__mod_zone_page_state(page_zone(page), NR_ANON_PAGES,
hpage_nr_pages(page));
}
if (unlikely(PageKsm(page)))
return;
VM_BUG_ON_PAGE(!PageLocked(page), page);
/* address might be in next vma when migration races vma_adjust */
if (first)
__page_set_anon_rmap(page, vma, address, exclusive);
else
__page_check_anon_rmap(page, vma, address);
}
在do_page_add_anon_rmap()函数中有这样一个判断,只有当_mapcount等于-1时才会调用__page_set_anon_rmap()去设置page->index值,那就是第一次映射该页面的用于pte才会去设置page->index值。
当需要从page中找到所有映射page的虚拟地址时,因为page是KSM页面,所以使用rmap_walk_ksm()函数,如下:
int rmap_walk_ksm(struct page *page, struct rmap_walk_control *rwc)
{
struct stable_node *stable_node;
struct rmap_item *rmap_item;
int ret = SWAP_AGAIN;
int search_new_forks = 0;
VM_BUG_ON_PAGE(!PageKsm(page), page);
/*
* Rely on the page lock to protect against concurrent modifications
* to that page's node of the stable tree.
*/
VM_BUG_ON_PAGE(!PageLocked(page), page);
stable_node = page_stable_node(page);
if (!stable_node)
return ret;
again:
hlist_for_each_entry(rmap_item, &stable_node->hlist, hlist) {
struct anon_vma *anon_vma = rmap_item->anon_vma;
struct anon_vma_chain *vmac;
struct vm_area_struct *vma;
anon_vma_lock_read(anon_vma);
anon_vma_interval_tree_foreach(vmac, &anon_vma->rb_root,
0, ULONG_MAX) {
vma = vmac->vma;
if (rmap_item->address < vma->vm_start ||
rmap_item->address >= vma->vm_end)
continue;
/*
* Initially we examine only the vma which covers this
* rmap_item; but later, if there is still work to do,
* we examine covering vmas in other mms: in case they
* were forked from the original since ksmd passed.
*/
if ((rmap_item->mm == vma->vm_mm) == search_new_forks)
continue;
if (rwc->invalid_vma && rwc->invalid_vma(vma, rwc->arg))
continue;
/*这里使用rmap_item->address来获取虚拟地址,其实rmap_item->address就是在
整个虚拟地址空间中的偏移地址,也就是某个页面的虚拟地址*/
ret = rwc->rmap_one(page, vma,
rmap_item->address, rwc->arg);
if (ret != SWAP_AGAIN) {
anon_vma_unlock_read(anon_vma);
goto out;
}
if (rwc->done && rwc->done(page)) {
anon_vma_unlock_read(anon_vma);
goto out;
}
}
anon_vma_unlock_read(anon_vma);
}
if (!search_new_forks++)
goto again;
out:
return ret;
}
这里使用rmap_item->address来获取每个VMA对应的虚拟地址,而不是像父子进程共享的匿名页面那样使用page->index来计算虚拟地址。因此对于KSM页面来说,page->index等于第一次映射该页的VMA中的offset