What is RSS and VSZ in Linux memory management
What are RSS and VSZ in Linux memory management? In a multithreaded environment, how can both of these be managed and tracked?
RSS is the Resident Set Size and is used to show how much memory is allocated to that process and is in RAM. It does not include memory that is swapped out. It does include memory from shared libraries as long as the pages from those libraries are actually in memory. It does include all stack and heap memory.
VSZ is the Virtual Memory Size. It includes all memory that the process can access, including memory that is swapped out, memory that is allocated but not used, and memory that is from shared libraries.
So if process A has a 500K binary and is linked to 2500K of shared libraries, has 200K of stack/heap allocations of which 100K is actually in memory (rest is swapped or unused), and it has only actually loaded 1000K of the shared libraries and 400K of its own binary, then:
RSS: 400K + 1000K + 100K = 1500K
VSZ: 500K + 2500K + 200K = 3200K
Since part of the memory is shared, many processes may use it, so if you add up all of the RSS values you can easily end up with more space than your system has.
The memory that is allocated also may not be in RSS until it is actually used by the program. So if your program allocated a bunch of memory up front, then uses it over time, you could see RSS going up and VSZ staying the same.
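A minimal sketch of that effect (a hypothetical touch_over_time.c, not from the original answer): one big malloc up front fixes VSZ, then filling the buffer in steps drives RSS up while VSZ stays put. Watch it with ps -o vsz,rss -p <pid>:

touch_over_time.c
/* Allocate up front, touch gradually. With glibc, a large malloc is
   backed by mmap and counts toward VSZ immediately, but pages enter
   RSS only when written. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    size_t total = 512 * 1024 * 1024; /* 512 MiB, all in VSZ from now on */
    size_t step = 32 * 1024 * 1024;   /* touch 32 MiB per second */
    char *buf = malloc(total);
    if (buf == NULL) {
        perror("malloc");
        return EXIT_FAILURE;
    }
    for (size_t used = 0; used < total; used += step) {
        memset(buf + used, 1, step); /* faulting pages in moves them into RSS */
        printf("touched %zu MiB\n", (used + step) / (1024 * 1024));
        sleep(1); /* leave time to observe with ps/top */
    }
    free(buf);
    return EXIT_SUCCESS;
}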
There is also PSS (proportional set size). This is a newer measure which tracks the shared memory as a proportion used by the current process. So if there were two processes using the same shared library from before:
PSS: 400K + (1000K/2) + 100K = 400K + 500K + 100K = 1000K
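On kernels new enough to have /proc/<pid>/smaps_rollup (Linux 4.14+), PSS can be read directly; a minimal sketch (a hypothetical pss.c) that prints the Pss: line for the current process:

pss.c
/* Print the Pss: line of /proc/self/smaps_rollup.
   Assumes Linux 4.14+, where smaps_rollup exists; values are in kB,
   with shared pages already divided among the processes sharing them. */
#include <stdio.h>
#include <string.h>

int main(void) {
    FILE *f = fopen("/proc/self/smaps_rollup", "r");
    char line[256];
    if (f == NULL) {
        perror("/proc/self/smaps_rollup");
        return 1;
    }
    while (fgets(line, sizeof(line), f)) {
        if (strncmp(line, "Pss:", 4) == 0)
            fputs(line, stdout);
    }
    fclose(f);
    return 0;
}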
Threads all share the same address space, so the RSS, VSZ and PSS for each thread is identical to all of the other threads in the process. Use ps or top to view this information in Linux/Unix; the sketch below shows the same thing programmatically.
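As a quick check, a minimal sketch (a hypothetical threads_statm.c, compiled with gcc -pthread) in which two threads read /proc/self/statm: both print identical numbers, because statm describes the process, not the thread:

threads_statm.c
/* Both threads print the same size/resident values, because the
   address space (and therefore /proc/self/statm) is per-process. */
#include <pthread.h>
#include <stdio.h>

static void print_statm(const char *who) {
    unsigned long size, resident;
    FILE *f = fopen("/proc/self/statm", "r");
    if (f != NULL && fscanf(f, "%lu %lu", &size, &resident) == 2)
        printf("%s: size %lu resident %lu (pages)\n", who, size, resident);
    if (f != NULL)
        fclose(f);
}

static void *thread_main(void *arg) {
    (void)arg;
    print_statm("thread");
    return NULL;
}

int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, thread_main, NULL);
    pthread_join(t, NULL);
    print_statm("main");
    return 0;
}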
There is way more to it than this.
I think much has already been said about RSS vs VSZ. From an administrator/programmer/user perspective, when I design/code applications I am more concerned about the RSS (resident memory): as you keep pulling more and more variables onto the heap, you will see this value shoot up. Try a simple program that mallocs space in a loop, and make sure you fill data into that malloc'd space: RSS keeps moving up (a minimal sketch of such a program follows). As far as VSZ is concerned, it is more about the virtual memory mapping that Linux does, a core feature derived from conventional operating system concepts. VSZ management is done by the kernel's virtual memory management; for more info on VSZ, see Robert Love's description of mm_struct and vm_area_struct, which are part of the basic task_struct data structure in the kernel.
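Something like this (a hypothetical rss_grow.c, not from the original answer); the memset is the important part, since untouched allocations stay out of RSS, and note that each malloc here also grows VSZ:

rss_grow.c
/* malloc in a loop and fill each chunk; RSS keeps moving up.
   Observe with: watch ps -o vsz,rss -p <pid> */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    enum { CHUNK = 16 * 1024 * 1024 }; /* 16 MiB per iteration */
    for (int i = 0; i < 64; i++) {
        char *p = malloc(CHUNK);
        if (p == NULL) {
            perror("malloc");
            return EXIT_FAILURE;
        }
        memset(p, 1, CHUNK); /* fill data in the malloc'd space */
        printf("allocated and touched %d MiB\n", (i + 1) * 16);
        sleep(1);
        /* intentionally not freed, so RSS keeps climbing */
    }
    return EXIT_SUCCESS;
}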
Minimal runnable example
For this to make sense, you have to understand the basics of paging (How does x86 paging work?), and in particular that the OS can allocate virtual memory via page tables / its internal memory bookkeeping (VSZ, virtual memory) before it actually has backing storage in RAM or on disk (RSS, resident memory).
Now to observe this in action, let's create a program that:
- allocates more RAM than our physical memory with mmap
- writes one byte on each page and checks the memory usage as it goes
main.c
#define _GNU_SOURCE
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>
typedef struct {
unsigned long size,resident,share,text,lib,data,dt;
} ProcStatm;
/* https://stackoverflow.com/questions/1558402/memory-usage-of-current-process-in-c/7212248#7212248 */
void ProcStat_init(ProcStatm *result) {
const char* statm_path = "/proc/self/statm";
FILE *f = fopen(statm_path, "r");
if(!f) {
perror(statm_path);
abort();
}
if(7 != fscanf(
f,
"%lu %lu %lu %lu %lu %lu %lu",
&(result->size),
&(result->resident),
&(result->share),
&(result->text),
&(result->lib),
&(result->data),
&(result->dt)
)) {
perror(statm_path);
abort();
}
fclose(f);
}
int main(int argc, char **argv) {
ProcStatm proc_statm;
char *base, *p;
char system_cmd[1024];
long page_size;
size_t i, nbytes, print_interval, bytes_since_last_print;
int snprintf_return;
/* Decide how many bytes to allocate. */
if (argc < 2) {
nbytes = 0x10000;
} else {
nbytes = strtoull(argv[1], NULL, 0);
}
if (argc < 3) {
print_interval = 0x1000;
} else {
print_interval = strtoull(argv[2], NULL, 0);
}
page_size = sysconf(_SC_PAGESIZE);
/* Allocate the memory. */
base = mmap(
NULL,
nbytes,
PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_ANONYMOUS,
-1,
0
);
if (base == MAP_FAILED) {
perror("mmap");
exit(EXIT_FAILURE);
}
/* Write to all the allocated pages. */
i = 0;
p = base;
bytes_since_last_print = 0;
/* Produce the ps command that lists only our VSZ and RSS. */
snprintf_return = snprintf(
system_cmd,
sizeof(system_cmd),
"ps -o pid,vsz,rss | awk '{if (NR == 1 || $1 == \"%ju\") print}'",
(uintmax_t)getpid()
);
assert(snprintf_return >= 0);
assert((size_t)snprintf_return < sizeof(system_cmd));
bytes_since_last_print = print_interval;
do {
/* Modify a byte in the page. */
*p = i;
p += page_size;
bytes_since_last_print += page_size;
/* Print process memory usage every print_interval bytes.
* We count memory using a few techniques from:
* https://stackoverflow.com/questions/1558402/memory-usage-of-current-process-in-c */
if (bytes_since_last_print > print_interval) {
bytes_since_last_print -= print_interval;
printf("extra_memory_committed %lu KiB\n", (i * page_size) / 1024);
ProcStat_init(&proc_statm);
/* Check /proc/self/statm */
printf(
"/proc/self/statm size resident %lu %lu KiB\n",
(proc_statm.size * page_size) / 1024,
(proc_statm.resident * page_size) / 1024
);
/* Check ps. */
puts(system_cmd);
system(system_cmd);
puts("");
}
i++;
} while (p < base + nbytes);
/* Cleanup. */
munmap(base, nbytes);
return EXIT_SUCCESS;
}
Compile and run:
gcc -ggdb3 -O0 -std=c99 -Wall -Wextra -pedantic -o main.out main.c
echo 1 | sudo tee /proc/sys/vm/overcommit_memory
sudo dmesg -c
./main.out 0x1000000000 0x200000000
echo $?
sudo dmesg
where:
echo 1 | sudo tee /proc/sys/vm/overcommit_memory: required for Linux to allow us to make an mmap call larger than physical RAM (see also: maximum memory which malloc can allocate).
Program output:
extra_memory_committed 0 KiB
/proc/self/statm size resident 67111332 768 KiB
ps -o pid,vsz,rss | awk '{if (NR == 1 || $1 == "29827") print}'
PID VSZ RSS
29827 67111332 1648
extra_memory_committed 8388608 KiB
/proc/self/statm size resident 67111332 8390244 KiB
ps -o pid,vsz,rss | awk '{if (NR == 1 || $1 == "29827") print}'
PID VSZ RSS
29827 67111332 8390256
extra_memory_committed 16777216 KiB
/proc/self/statm size resident 67111332 16778852 KiB
ps -o pid,vsz,rss | awk '{if (NR == 1 || $1 == "29827") print}'
PID VSZ RSS
29827 67111332 16778864
extra_memory_committed 25165824 KiB
/proc/self/statm size resident 67111332 25167460 KiB
ps -o pid,vsz,rss | awk '{if (NR == 1 || $1 == "29827") print}'
PID VSZ RSS
29827 67111332 25167472
Killed
Exit status:
137
which by the 128 + signal number rule means we got signal number 9, which man 7 signal says is SIGKILL, which is sent by the Linux out-of-memory killer.
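The 128 + N convention is applied by the shell; the underlying wait status carries the raw signal number, as this sketch (a hypothetical exit_status.c) shows by SIGKILLing a child and decoding the status:

exit_status.c
/* The child kills itself with SIGKILL (9), as the OOM killer would;
   the parent decodes the wait status, and a shell would report
   128 + 9 = 137 as the exit status. */
#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();
    if (pid == 0) {
        raise(SIGKILL); /* never returns: SIGKILL cannot be caught */
    }
    int status;
    waitpid(pid, &status, 0);
    if (WIFSIGNALED(status))
        printf("killed by signal %d, shell reports %d\n",
               WTERMSIG(status), 128 + WTERMSIG(status));
    return 0;
}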
Output interpretation:
- VSZ virtual memory remains constant at 67111332 KiB == 0x40009A4 KiB ~= 64 GiB (ps values are in KiB) after the mmap.
- RSS resident memory increases lazily, only as we touch the pages:
  - On the first print, we have extra_memory_committed 0, which means we haven't yet touched any pages. RSS is a small 1648 KiB, which has been allocated for normal program startup: text area, globals, etc.
  - On the second print, we have written to 8388608 KiB == 8 GiB worth of pages. As a result, RSS increased by exactly 8 GiB, to 8390256 KiB == 8388608 KiB + 1648 KiB.
See also: https://unix.stackexchange.com/questions/35129/need-explanation-on-resident-set-size-virtual-size
OOM killer logs
Our dmesg commands have shown the OOM killer logs.
An exact interpretation of those logs has been asked about elsewhere.
The very first line of the log was:
[ 7283.479087] mongod invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
So we see that, interestingly, it was the MongoDB daemon that always runs in the background on my laptop that first triggered the OOM killer, presumably when the poor thing was trying to allocate some memory.
However, the OOM killer does not necessarily kill the process that awoke it.
After the invocation, the kernel prints a table of processes, including the oom_score_adj:
[ 7283.479292] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[ 7283.479303] [ 496] 0 496 16126 6 172032 484 0 systemd-journal
[ 7283.479306] [ 505] 0 505 1309 0 45056 52 0 blkmapd
[ 7283.479309] [ 513] 0 513 19757 0 57344 55 0 lvmetad
[ 7283.479312] [ 516] 0 516 4681 1 61440 444 -1000 systemd-udevd
and further ahead we see that our own little main.out actually got killed on the previous invocation:
[ 7283.479871] Out of memory: Kill process 15665 (main.out) score 865 or sacrifice child
[ 7283.479879] Killed process 15665 (main.out) total-vm:67111332kB, anon-rss:92kB, file-rss:4kB, shmem-rss:30080832kB
[ 7283.479951] oom_reaper: reaped process 15665 (main.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:30080832kB
This log mentions the score 865 which that process had, presumably the highest (worst) OOM killer score, as discussed at: https://unix.stackexchange.com/questions/153585/how-does-the-oom-killer-decide-which-process-to-kill-first
Also interestingly, everything apparently happened so fast that, before the freed memory was accounted for, the oom was awoken again by the DeadlineMonitor process:
[ 7283.481043] DeadlineMonitor invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
and this time that killed some Chromium process, which is usually my computer's normal memory hog:
[ 7283.481773] Out of memory: Kill process 11786 (chromium-browse) score 306 or sacrifice child
[ 7283.481833] Killed process 11786 (chromium-browse) total-vm:1813576kB, anon-rss:208804kB, file-rss:0kB, shmem-rss:8380kB
[ 7283.497847] oom_reaper: reaped process 11786 (chromium-browse), now anon-rss:0kB, file-rss:0kB, shmem-rss:8044kB
Tested in Ubuntu 19.04, Linux kernel 5.0.0.
They are not managed, but measured and possibly limited (see the getrlimit system call, documented in getrlimit(2)).
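For example, RLIMIT_AS caps the virtual address space, which is what VSZ measures; a minimal sketch (a hypothetical limit_as.c) that lowers the limit to 128 MiB and then watches a larger mmap fail:

limit_as.c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <sys/resource.h>

int main(void) {
    /* Cap the address space at 128 MiB. */
    struct rlimit rl = { .rlim_cur = 128 * 1024 * 1024,
                         .rlim_max = 128 * 1024 * 1024 };
    if (setrlimit(RLIMIT_AS, &rl) != 0) {
        perror("setrlimit");
        return 1;
    }
    /* Ask for 256 MiB of address space: more than RLIMIT_AS allows,
       so this fails even though no physical page would be touched yet. */
    void *p = mmap(NULL, (size_t)256 * 1024 * 1024, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        perror("mmap"); /* expected: Cannot allocate memory */
    return 0;
}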
RSS means resident set size (the part of your virtual address space sitting in RAM).
You can query the virtual address space of process 1234 using proc(5) with cat /proc/1234/maps, and its status (including memory consumption) through cat /proc/1234/status.
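The same numbers are easy to read programmatically; a minimal sketch (a hypothetical vm_status.c) that prints the VmSize (VSZ) and VmRSS (RSS) lines of /proc/self/status:

vm_status.c
/* Print the VmSize and VmRSS lines of /proc/self/status, which
   correspond to VSZ and RSS; both are reported in kB. */
#include <stdio.h>
#include <string.h>

int main(void) {
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    if (f == NULL) {
        perror("/proc/self/status");
        return 1;
    }
    while (fgets(line, sizeof(line), f)) {
        if (strncmp(line, "VmSize:", 7) == 0 ||
            strncmp(line, "VmRSS:", 6) == 0)
            fputs(line, stdout);
    }
    fclose(f);
    return 0;
}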
RSS is Resident Set Size (physically resident memory - this is currently occupying space in the machine's physical memory), and VSZ is Virtual Memory Size (address space allocated - this has addresses allocated in the process's memory map, but there isn't necessarily any actual memory behind it all right now).
Note that in these days of commonplace virtual machines, physical memory from the machine's viewpoint may not really be actual physical memory.