What is RSS and VSZ in Linux memory management
What are RSS and VSZ in Linux memory management? In a multithreaded environment, how can both of these be managed and tracked?
RSS is the Resident Set Size and is used to show how much memory is allocated to that process and is in RAM. It does not include memory that is swapped out. It does include memory from shared libraries as long as the pages from those libraries are actually in memory. It does include all stack and heap memory.
VSZ is the Virtual Memory Size. It includes all memory that the process can access, including memory that is swapped out, memory that is allocated but not used, and memory that is from shared libraries.
So if process A has a 500K binary and is linked to 2500K of shared libraries, has 200K of stack/heap allocations of which 100K is actually in memory (rest is swapped or unused), and it has only actually loaded 1000K of the shared libraries and 400K of its own binary, then:
RSS: 400K + 1000K + 100K = 1500K
VSZ: 500K + 2500K + 200K = 3200K
Since part of the memory is shared, many processes may use it, so if you add up all of the RSS values you can easily end up with more space than your system has.
The memory that is allocated also may not be in RSS until it is actually used by the program. So if your program allocated a bunch of memory up front, then uses it over time, you could see RSS going up and VSZ staying the same.
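A minimal sketch of that effect (a hypothetical touch_over_time.c, not from the original answer): one big malloc up front fixes VSZ, then filling the buffer in steps drives RSS up while VSZ stays put. Watch it with ps -o vsz,rss -p <pid>:

touch_over_time.c
/* Allocate up front, touch gradually. With glibc, a large malloc is
   backed by mmap and counts toward VSZ immediately, but pages enter
   RSS only when written. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    size_t total = 512 * 1024 * 1024; /* 512 MiB, all in VSZ from now on */
    size_t step = 32 * 1024 * 1024;   /* touch 32 MiB per second */
    char *buf = malloc(total);
    if (buf == NULL) {
        perror("malloc");
        return EXIT_FAILURE;
    }
    for (size_t used = 0; used < total; used += step) {
        memset(buf + used, 1, step); /* faulting pages in moves them into RSS */
        printf("touched %zu MiB\n", (used + step) / (1024 * 1024));
        sleep(1); /* leave time to observe with ps/top */
    }
    free(buf);
    return EXIT_SUCCESS;
}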
There is also PSS (proportional set size). This is a newer measure which tracks the shared memory as a proportion used by the current process. So if there were two processes using the same shared library from before:
PSS: 400K + (1000K/2) + 100K = 400K + 500K + 100K = 1000K
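On kernels new enough to have /proc/<pid>/smaps_rollup (Linux 4.14+), PSS can be read directly; a minimal sketch (a hypothetical pss.c) that prints the Pss: line for the current process:

pss.c
/* Print the Pss: line of /proc/self/smaps_rollup.
   Assumes Linux 4.14+, where smaps_rollup exists; values are in kB,
   with shared pages already divided among the processes sharing them. */
#include <stdio.h>
#include <string.h>

int main(void) {
    FILE *f = fopen("/proc/self/smaps_rollup", "r");
    char line[256];
    if (f == NULL) {
        perror("/proc/self/smaps_rollup");
        return 1;
    }
    while (fgets(line, sizeof(line), f)) {
        if (strncmp(line, "Pss:", 4) == 0)
            fputs(line, stdout);
    }
    fclose(f);
    return 0;
}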
Threads all share the same address space, so the RSS, VSZ and PSS for each thread is identical to all of the other threads in the process. Use ps or top to view this information in Linux/Unix; the sketch below shows the same thing programmatically.
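As a quick check, a minimal sketch (a hypothetical threads_statm.c, compiled with gcc -pthread) in which two threads read /proc/self/statm: both print identical numbers, because statm describes the process, not the thread:

threads_statm.c
/* Both threads print the same size/resident values, because the
   address space (and therefore /proc/self/statm) is per-process. */
#include <pthread.h>
#include <stdio.h>

static void print_statm(const char *who) {
    unsigned long size, resident;
    FILE *f = fopen("/proc/self/statm", "r");
    if (f != NULL && fscanf(f, "%lu %lu", &size, &resident) == 2)
        printf("%s: size %lu resident %lu (pages)\n", who, size, resident);
    if (f != NULL)
        fclose(f);
}

static void *thread_main(void *arg) {
    (void)arg;
    print_statm("thread");
    return NULL;
}

int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, thread_main, NULL);
    pthread_join(t, NULL);
    print_statm("main");
    return 0;
}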
There is way more to it than this.
I think much has already been said about RSS vs VSZ. From an administrator/programmer/user perspective, when I design/code applications I am more concerned about the RSS (resident memory): as you keep pulling more and more variables onto the heap, you will see this value shoot up. Try a simple program that mallocs space in a loop, and make sure you fill data into that malloc'd space: RSS keeps moving up (a minimal sketch of such a program follows). As far as VSZ is concerned, it is more about the virtual memory mapping that Linux does, a core feature derived from conventional operating system concepts. VSZ management is done by the kernel's virtual memory management; for more info on VSZ, see Robert Love's description of mm_struct and vm_area_struct, which are part of the basic task_struct data structure in the kernel.
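Something like this (a hypothetical rss_grow.c, not from the original answer); the memset is the important part, since untouched allocations stay out of RSS, and note that each malloc here also grows VSZ:

rss_grow.c
/* malloc in a loop and fill each chunk; RSS keeps moving up.
   Observe with: watch ps -o vsz,rss -p <pid> */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    enum { CHUNK = 16 * 1024 * 1024 }; /* 16 MiB per iteration */
    for (int i = 0; i < 64; i++) {
        char *p = malloc(CHUNK);
        if (p == NULL) {
            perror("malloc");
            return EXIT_FAILURE;
        }
        memset(p, 1, CHUNK); /* fill data in the malloc'd space */
        printf("allocated and touched %d MiB\n", (i + 1) * 16);
        sleep(1);
        /* intentionally not freed, so RSS keeps climbing */
    }
    return EXIT_SUCCESS;
}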
Minimal runnable example
For this to make sense, you have to understand the basics of paging (How does x86 paging work?), and in particular that the OS can allocate virtual memory via page tables / its internal memory bookkeeping (VSZ, virtual memory) before it actually has backing storage in RAM or on disk (RSS, resident memory).
Now to observe this in action, let's create a program that:
- allocates more RAM than our physical memory with mmap
- writes one byte on each page and checks the memory usage as it goes
main.c
#define _GNU_SOURCE
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>
typedef struct {
unsigned long size,resident,share,text,lib,data,dt;
} ProcStatm;
/* https://stackoverflow.com/questions/1558402/memory-usage-of-current-process-in-c/7212248#7212248 */
void ProcStat_init(ProcStatm *result) {
const char* statm_path = "/proc/self/statm";
FILE *f = fopen(statm_path, "r");
if(!f) {
perror(statm_path);
abort();
}
if(7 != fscanf(
f,
"%lu %lu %lu %lu %lu %lu %lu",
&(result->size),
&(result->resident),
&(result->share),
&(result->text),
&(result->lib),
&(result->data),
&(result->dt)
)) {
perror(statm_path);
abort();
}
fclose(f);
}
int main(int argc, char **argv) {
ProcStatm proc_statm;
char *base, *p;
char system_cmd[1024];
long page_size;
size_t i, nbytes, print_interval, bytes_since_last_print;
int snprintf_return;
/* Decide how many bytes to allocate. */
if (argc < 2) {
nbytes = 0x10000;
} else {
nbytes = strtoull(argv[1], NULL, 0);
}
if (argc < 3) {
print_interval = 0x1000;
} else {
print_interval = strtoull(argv[2], NULL, 0);
}
page_size = sysconf(_SC_PAGESIZE);
/* Allocate the memory. */
base = mmap(
NULL,
nbytes,
PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_ANONYMOUS,
-1,
0
);
if (base == MAP_FAILED) {
perror("mmap");
exit(EXIT_FAILURE);
}
/* Write to all the allocated pages. */
i = 0;
p = base;
bytes_since_last_print = 0;
/* Produce the ps command that lists only our VSZ and RSS. */
snprintf_return = snprintf(
system_cmd,
sizeof(system_cmd),
"ps -o pid,vsz,rss | awk '{if (NR == 1 || $1 == \"%ju\") print}'",
(uintmax_t)getpid()
);
assert(snprintf_return >= 0);
assert((size_t)snprintf_return < sizeof(system_cmd));
bytes_since_last_print = print_interval;
do {
/* Modify a byte in the page. */
*p = i;
p += page_size;
bytes_since_last_print += page_size;
/* Print process memory usage every print_interval bytes.
* We count memory using a few techniques from:
* https://stackoverflow.com/questions/1558402/memory-usage-of-current-process-in-c */
if (bytes_since_last_print > print_interval) {
bytes_since_last_print -= print_interval;
printf("extra_memory_committed %lu KiB\n", (i * page_size) / 1024);
ProcStat_init(&proc_statm);
/* Check /proc/self/statm */
printf(
"/proc/self/statm size resident %lu %lu KiB\n",
(proc_statm.size * page_size) / 1024,
(proc_statm.resident * page_size) / 1024
);
/* Check ps. */
puts(system_cmd);
system(system_cmd);
puts("");
}
i++;
} while (p < base + nbytes);
/* Cleanup. */
munmap(base, nbytes);
return EXIT_SUCCESS;
}
Compile and run:
gcc -ggdb3 -O0 -std=c99 -Wall -Wextra -pedantic -o main.out main.c
echo 1 | sudo tee /proc/sys/vm/overcommit_memory
sudo dmesg -c
./main.out 0x1000000000 0x200000000
echo $?
sudo dmesg
where:
echo 1 | sudo tee /proc/sys/vm/overcommit_memory: required for Linux to allow us to make an mmap call larger than physical RAM (see also: maximum memory which malloc can allocate).
Program output:
extra_memory_committed 0 KiB
/proc/self/statm size resident 67111332 768 KiB
ps -o pid,vsz,rss | awk '{if (NR == 1 || $1 == "29827") print}'
PID VSZ RSS
29827 67111332 1648
extra_memory_committed 8388608 KiB
/proc/self/statm size resident 67111332 8390244 KiB
ps -o pid,vsz,rss | awk '{if (NR == 1 || $1 == "29827") print}'
PID VSZ RSS
29827 67111332 8390256
extra_memory_committed 16777216 KiB
/proc/self/statm size resident 67111332 16778852 KiB
ps -o pid,vsz,rss | awk '{if (NR == 1 || $1 == "29827") print}'
PID VSZ RSS
29827 67111332 16778864
extra_memory_committed 25165824 KiB
/proc/self/statm size resident 67111332 25167460 KiB
ps -o pid,vsz,rss | awk '{if (NR == 1 || $1 == "29827") print}'
PID VSZ RSS
29827 67111332 25167472
Killed
Exit status:
137
which by the 128 + signal number rule means we got signal number 9, which man 7 signal says is SIGKILL, which is sent by the Linux out-of-memory killer.
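The 128 + N convention is applied by the shell; the underlying wait status carries the raw signal number, as this sketch (a hypothetical exit_status.c) shows by SIGKILLing a child and decoding the status:

exit_status.c
/* The child kills itself with SIGKILL (9), as the OOM killer would;
   the parent decodes the wait status, and a shell would report
   128 + 9 = 137 as the exit status. */
#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();
    if (pid == 0) {
        raise(SIGKILL); /* never returns: SIGKILL cannot be caught */
    }
    int status;
    waitpid(pid, &status, 0);
    if (WIFSIGNALED(status))
        printf("killed by signal %d, shell reports %d\n",
               WTERMSIG(status), 128 + WTERMSIG(status));
    return 0;
}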
Output interpretation:
- VSZ virtual memory remains constant at 67111332 KiB == 0x40009A4 KiB ~= 64 GiB (ps values are in KiB) after the mmap.
- RSS resident memory increases lazily, only as we touch the pages:
  - On the first print, we have extra_memory_committed 0, which means we haven't yet touched any pages. RSS is a small 1648 KiB, which has been allocated for normal program startup: text area, globals, etc.
  - On the second print, we have written to 8388608 KiB == 8 GiB worth of pages. As a result, RSS increased by exactly 8 GiB, to 8390256 KiB == 8388608 KiB + 1648 KiB.
See also: https://unix.stackexchange.com/questions/35129/need-explanation-on-resident-set-size-virtual-size
OOM killer logs
Our dmesg commands have shown the OOM killer logs.
An exact interpretation of those logs has been asked about elsewhere.
The very first line of the log was:
[ 7283.479087] mongod invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
So we see that, interestingly, it was the MongoDB daemon that always runs in the background on my laptop that first triggered the OOM killer, presumably when the poor thing was trying to allocate some memory.
However, the OOM killer does not necessarily kill the process that awoke it.
After the invocation, the kernel prints a table of processes, including the oom_score_adj:
[ 7283.479292] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[ 7283.479303] [ 496] 0 496 16126 6 172032 484 0 systemd-journal
[ 7283.479306] [ 505] 0 505 1309 0 45056 52 0 blkmapd
[ 7283.479309] [ 513] 0 513 19757 0 57344 55 0 lvmetad
[ 7283.479312] [ 516] 0 516 4681 1 61440 444 -1000 systemd-udevd
and further ahead we see that our own little main.out actually got killed on the previous invocation:
[ 7283.479871] Out of memory: Kill process 15665 (main.out) score 865 or sacrifice child
[ 7283.479879] Killed process 15665 (main.out) total-vm:67111332kB, anon-rss:92kB, file-rss:4kB, shmem-rss:30080832kB
[ 7283.479951] oom_reaper: reaped process 15665 (main.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:30080832kB
This log mentions the score 865 which that process had, presumably the highest (worst) OOM killer score, as discussed at: https://unix.stackexchange.com/questions/153585/how-does-the-oom-killer-decide-which-process-to-kill-first
Also interestingly, everything apparently happened so fast that, before the freed memory was accounted for, the oom was awoken again by the DeadlineMonitor process:
[ 7283.481043] DeadlineMonitor invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
and this time that killed some Chromium process, which is usually my computer's normal memory hog:
[ 7283.481773] Out of memory: Kill process 11786 (chromium-browse) score 306 or sacrifice child
[ 7283.481833] Killed process 11786 (chromium-browse) total-vm:1813576kB, anon-rss:208804kB, file-rss:0kB, shmem-rss:8380kB
[ 7283.497847] oom_reaper: reaped process 11786 (chromium-browse), now anon-rss:0kB, file-rss:0kB, shmem-rss:8044kB
Tested in Ubuntu 19.04, Linux kernel 5.0.0.
They are not managed, but measured and possibly limited (see the getrlimit system call, documented in getrlimit(2)).
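For example, RLIMIT_AS caps the virtual address space, which is what VSZ measures; a minimal sketch (a hypothetical limit_as.c) that lowers the limit to 128 MiB and then watches a larger mmap fail:

limit_as.c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <sys/resource.h>

int main(void) {
    /* Cap the address space at 128 MiB. */
    struct rlimit rl = { .rlim_cur = 128 * 1024 * 1024,
                         .rlim_max = 128 * 1024 * 1024 };
    if (setrlimit(RLIMIT_AS, &rl) != 0) {
        perror("setrlimit");
        return 1;
    }
    /* Ask for 256 MiB of address space: more than RLIMIT_AS allows,
       so this fails even though no physical page would be touched yet. */
    void *p = mmap(NULL, (size_t)256 * 1024 * 1024, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        perror("mmap"); /* expected: Cannot allocate memory */
    return 0;
}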
RSS means resident set size (the part of your virtual address space sitting in RAM).
You can query the virtual address space of process 1234 using proc(5) with cat /proc/1234/maps, and its status (including memory consumption) through cat /proc/1234/status.
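The same numbers are easy to read programmatically; a minimal sketch (a hypothetical vm_status.c) that prints the VmSize (VSZ) and VmRSS (RSS) lines of /proc/self/status:

vm_status.c
/* Print the VmSize and VmRSS lines of /proc/self/status, which
   correspond to VSZ and RSS; both are reported in kB. */
#include <stdio.h>
#include <string.h>

int main(void) {
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    if (f == NULL) {
        perror("/proc/self/status");
        return 1;
    }
    while (fgets(line, sizeof(line), f)) {
        if (strncmp(line, "VmSize:", 7) == 0 ||
            strncmp(line, "VmRSS:", 6) == 0)
            fputs(line, stdout);
    }
    fclose(f);
    return 0;
}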
RSS is Resident Set Size (physically resident memory - this is currently occupying space in the machine's physical memory), and VSZ is Virtual Memory Size (address space allocated - this has addresses allocated in the process's memory map, but there isn't necessarily any actual memory behind it all right now).
Note that in these days of commonplace virtual machines, physical memory from the machine's viewpoint may not really be actual physical memory.