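I) Overview of swap
1) In short, swap works like this:
When memory runs short, blocks of data are moved from DRAM to swap space on disk, freeing memory for the current process.
When that data is needed again, it is moved from swap back into memory, and blocks that are no longer in use are moved out to swap in turn.
2) Moving data from memory to the swap area is called paging out. It happens in the background, with no intervention from the application.
3) Swap space is divided into pages, and each page is the same size as a memory page.
4) Not every system needs swap; most embedded systems, for example, have none.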
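II) How much memory can swap let the system use?
A 32-bit Linux system without PAE enabled can address at most 4GB of physical memory, and at most 3GB is available in user space.
Does adding swap increase the amount of memory that can be used?
The following test answers the question.
Check the kernel version: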
[root@test1 tmp]# uname -r
2.6.18-8.el5
Create a 10GB file:
[root@test1 tmp]# dd if=/dev/zero of=data.dat bs=10M count=1000
[root@test1 tmp]# mkswap data.dat
Setting up swapspace version 1, size = 10485755 kB
[root@test1 tmp]# free -m
total used free shared buffers cached
Mem: 503 43 460 0 13 9
-/+ buffers/cache: 20 482
Swap: 1027 22 1005
[root@test1 tmp]# swapon data.dat
[root@test1 tmp]# free -m
total used free shared buffers cached
Mem: 503 46 457 0 15 9
-/+ buffers/cache: 20 482
Swap: 11027 22 11005
[root@test1 tmp]# swapon -s
Filename Type Size Used Priority
/dev/sda2 partition 1052248 23040 -1
/tmp/data.dat file 10239992 0 -3
Physical memory plus swap now exceeds 4GB. How much of it can we actually use?
The program below keeps requesting memory and writing to it until malloc fails.
Source (test1.c):
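#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>   /* pause() */
int main (int argc, char *argv[])
{
    void *ptr;
    int n = 0;
    while (1){
        /* request 1MB at a time */
        ptr = malloc(0x100000);
        if (ptr == NULL)
            break;
        /* write to every page so it must be backed by RAM or swap */
        memset(ptr, 1, 0x100000);
        printf("malloced %d MB\n", ++n);
    }
    /* keep the process alive so its memory can be inspected */
    pause();
    return 0;
}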
[root@test1 tmp]# gcc test1.c -o callmem
[root@test1 tmp]# ./callmem
malloced 3039 MB
malloced 3040 MB
malloced 3041 MB
malloced 3042 MB
malloced 3043 MB
malloced 3044 MB
malloced 3045 MB
malloced 3046 MB
malloced 3047 MB
malloced 3048 MB
malloced 3049 MB
malloced 3050 MB
malloced 3051 MB
malloced 3052 MB
malloced 3053 MB
malloced 3054 MB
malloced 3055 MB
malloced 3056 MB
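After 3056MB no further allocation succeeds. Even with 10GB of extra swap, a single process can use only 3056MB (its user-space share of the address space); swap cannot lift it past the 4GB limit of a 32-bit system.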
III) What happens when a process exceeds the memory the system can provide?
We reuse the example above, but first turn off the swap file data.dat:
[root@test1 tmp]# swapoff data.dat
Check the current swap space:
[root@test1 tmp]# swapon -s
Filename Type Size Used Priority
/dev/sda2 partition 1052248 22760 -1
Run callmem:
[root@test1 tmp]# ./callmem
malloced 1465 MB
malloced 1466 MB
malloced 1467 MB
malloced 1468 MB
malloced 1469 MB
malloced 1470 MB
malloced 1471 MB
malloced 1472 MB
malloced 1473 MB
malloced 1474 MB
malloced 1475 MB
malloced 1476 MB
malloced 1477 MB
malloced 1478 MB
malloced 1479 MB
malloced 1480 MB
malloced 1481 MB
malloced 1482 MB
malloced 1483 MB
malloced 1484 MB
malloced 1485 MB
Killed
After allocating 1485MB the program was killed.
The error recorded in /var/log/messages is:
May 30 10:40:52 test1 kernel: automount invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0
May 30 10:40:52 test1 kernel: [<c0452c4e>] out_of_memory+0x3b/0x179
May 30 10:40:52 test1 kernel: [<c0454081>] __alloc_pages+0x1fe/0x27e
May 30 10:40:52 test1 kernel: [<c04552cb>] __do_page_cache_readahead+0xc4/0x1c6
May 30 10:40:52 test1 kernel: [<c044f9ad>] sync_page+0x0/0x3b
May 30 10:40:52 test1 kernel: [<c044c74d>] __delayacct_blkio_end+0x32/0x35
May 30 10:40:52 test1 kernel: [<c05fb48c>] __wait_on_bit_lock+0x4b/0x52
May 30 10:40:52 test1 kernel: [<c044f930>] __lock_page+0x51/0x57
May 30 10:40:52 test1 kernel: [<c0452284>] filemap_nopage+0x151/0x315
May 30 10:40:52 test1 kernel: [<c045a8de>] __handle_mm_fault+0x172/0x87b
May 30 10:40:52 test1 kernel: [<c041d3f9>] __activate_task+0x1c/0x29
May 30 10:40:52 test1 kernel: [<c042109f>] wake_up_new_task+0x1be/0x1c6
May 30 10:40:52 test1 kernel: [<c05fd48f>] do_page_fault+0x20a/0x4b8
May 30 10:40:52 test1 kernel: [<c05fd285>] do_page_fault+0x0/0x4b8
May 30 10:40:52 test1 kernel: [<c0404a71>] error_code+0x39/0x40
May 30 10:40:52 test1 kernel: =======================
May 30 10:40:52 test1 kernel: Mem-info:
May 30 10:40:52 test1 kernel: DMA per-cpu:
May 30 10:40:52 test1 kernel: cpu 0 hot: high 0, batch 1 used:0
May 30 10:40:52 test1 kernel: cpu 0 cold: high 0, batch 1 used:0
May 30 10:40:52 test1 kernel: DMA32 per-cpu: empty
May 30 10:40:52 test1 kernel: Normal per-cpu:
May 30 10:40:52 test1 kernel: cpu 0 hot: high 186, batch 31 used:30
May 30 10:40:52 test1 kernel: cpu 0 cold: high 62, batch 15 used:14
May 30 10:40:52 test1 kernel: HighMem per-cpu: empty
May 30 10:40:52 test1 kernel: Free pages: 4844kB (0kB HighMem)
May 30 10:40:52 test1 kernel: Active:62495 inactive:62136 dirty:0 writeback:4 unstable:0 free:1211 slab:1350 mapped:0 pagetables:693
May 30 10:40:52 test1 kernel: DMA free:2072kB min:88kB low:108kB high:132kB active:5108kB inactive:5084kB present:16384kB pages_scanned:15907 all_unreclaimable? yes
May 30 10:40:52 test1 kernel: lowmem_reserve[]: 0 0 496 496
May 30 10:40:52 test1 kernel: DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
May 30 10:40:52 test1 kernel: lowmem_reserve[]: 0 0 496 496
May 30 10:40:52 test1 kernel: Normal free:2772kB min:2804kB low:3504kB high:4204kB active:244872kB inactive:243460kB present:507904kB pages_scanned:990188 all_unreclaimable? yes
May 30 10:40:52 test1 kernel: lowmem_reserve[]: 0 0 0 0
May 30 10:40:52 test1 kernel: HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
May 30 10:40:52 test1 kernel: lowmem_reserve[]: 0 0 0 0
May 30 10:40:52 test1 kernel: DMA: 0*4kB 1*8kB 1*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 2072kB
May 30 10:40:52 test1 kernel: DMA32: empty
May 30 10:40:52 test1 kernel: Normal: 1*4kB 0*8kB 3*16kB 1*32kB 0*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 2772kB
May 30 10:40:52 test1 kernel: HighMem: empty
May 30 10:40:52 test1 kernel: Swap cache: add 2790571, delete 2790571, find 1739/3390, race 2+1
May 30 10:40:53 test1 kernel: Free swap = 0kB
May 30 10:40:53 test1 kernel: Total swap = 1052248kB
May 30 10:40:53 test1 kernel: Free swap: 0kB
May 30 10:40:53 test1 kernel: 131072 pages of RAM
May 30 10:40:53 test1 kernel: 0 pages of HIGHMEM
May 30 10:40:53 test1 kernel: 2188 reserved pages
May 30 10:40:53 test1 kernel: 59 pages shared
May 30 10:40:53 test1 kernel: 0 pages swap cached
May 30 10:40:53 test1 kernel: 0 pages dirty
May 30 10:40:53 test1 kernel: 4 pages writeback
May 30 10:40:53 test1 kernel: 0 pages mapped
May 30 10:40:53 test1 kernel: 1350 pages slab
May 30 10:40:53 test1 kernel: 693 pages pagetables
May 30 10:40:53 test1 kernel: Out of memory: Killed process 2720 (callmem)
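The test shows that when a program asks for more memory than the system can provide, it is killed and the kernel reclaims the memory it had allocated.
IV) Summary of memory exhaustion:
1) Before a process is hit by the OOM killer, the kernel flushes file system caches to free memory.
2) It pages memory out to the swap area on disk.
3) As memory gets scarce, virtual memory means that even an ordinary context switch between processes has to go through the swap area.
4) Only when swap is exhausted as well does the kernel raise out-of-memory (OOM) and kill processes.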
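V) Dealing with memory exhaustion
There are three ways to rein in a process that consumes memory without restraint.
1) overcommit
Once RAM and swap are used up, the kernel kills the offending process. Until then, malloc keeps returning pointers to virtual addresses far beyond what the system can actually back; Linux calls this overcommit.
/proc/sys/vm/overcommit_memory defaults to 0, which leaves overcommit enabled: a process may allocate more than is currently available, and it is terminated by the OOM (out-of-memory) killer once it exhausts usable memory.
Setting it to 2 disables overcommit. The kernel then accounts every allocation against a fixed commit limit (swap plus the percentage of physical RAM given by vm.overcommit_ratio), so malloc fails instead of overcommitting.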
The next program tests malloc under overcommit. The key difference from the previous one is that it only allocates (malloc) and never writes to the memory with memset, so no page faults occur.
Source (test2.c):
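#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>   /* pause() */
int
main (int argc, char *argv[])
{
    void *ptr;
    int n = 0;
    while (1){
        /* reserve 1MB but never write to it, so no page is faulted in */
        ptr = malloc(0x100000);
        if (ptr == NULL)
            break;
        n++;
    }
    printf("malloced %d MB\n", n);
    /* keep the process alive so its mappings can be inspected */
    pause();
    return 0;
}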
overcommit is currently enabled:
[root@test1 ~]# cat /proc/sys/vm/overcommit_memory
0
To simplify the test, turn off the swap partition:
[root@test1 tmp]# swapoff -a
Compile test2.c:
[root@test1 tmp]# gcc test2.c -o callmem1
Run callmem1. It is granted 3056MB although there is only 512MB of physical memory.
[root@test1 tmp]# ./callmem1
malloced 3056 MB
Now disable overcommit:
[root@test1 tmp]# echo 2 > /proc/sys/vm/overcommit_memory
[root@test1 tmp]# free
total used free shared buffers cached
Mem: 515600 49624 465976 0 1464 14908
-/+ buffers/cache: 33252 482348
Swap: 0 0 0
Run callmem1 again. This time it obtains only 182MB.
[root@test1 tmp]# ./callmem1
malloced 182 MB
malloc only reserves address space. Because the program never writes to it, no physical memory is consumed, and free shows no significant change.
[root@test1 tmp]# free
total used free shared buffers cached
Mem: 515600 50244 465356 0 1472 14908
-/+ buffers/cache: 33864 481736
Swap: 0 0 0
2)ulimit
Use the shell built-in ulimit to limit the current shell to 10MB of memory:
[root@test1 tmp]# ulimit -d 10240 -m 10240 -v 10240
-d The maximum size of a process data segment
-m The maximum resident set size
-v The maximum amount of virtual memory available to the shell
Of these, -v is the decisive one: it caps the virtual memory available to the current shell and the processes it starts.
Check the result:
[root@test1 tmp]# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) 10240
max nice (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 8192
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) 10240
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
max rt priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 8192
virtual memory (kbytes, -v) 10240
file locks (-x) unlimited
Finally, run callmem1:
[root@test1 tmp]# ./callmem1&
[1] 2525
malloced 8 MB
The limit is 10MB, so why could it allocate only 8MB?
Use pmap to look at the address mappings:
[root@test1 tmp]# jobs -x pmap %1
2525: ./callmem1
00110000 1244K r-x-- /lib/libc-2.5.so
00247000 8K r-x-- /lib/libc-2.5.so
00249000 4K rwx-- /lib/libc-2.5.so
0024a000 12K rwx-- [ anon ]
00b51000 100K r-x-- /lib/ld-2.5.so
00b6a000 4K r-x-- /lib/ld-2.5.so
00b6b000 4K rwx-- /lib/ld-2.5.so
00c68000 4K r-x-- [ anon ]
08048000 4K r-x-- /tmp/callmem1
08049000 4K rw--- /tmp/callmem1
b7777000 8228K rw--- [ anon ]
b7f93000 8K rw--- [ anon ]
bfe56000 88K rw--- [ stack ]
total 9712K
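Besides the memory obtained through malloc, the process also maps the shared libraries libc-2.5.so and ld-2.5.so.
With 8228KB already mapped for malloc and 9712KB mapped in total, there is no room left under the 10240KB limit for another 1MB.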
3)setrlimit
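In a real program, sysconf(_SC_AVPHYS_PAGES) returns the number of physical memory pages currently available.
It does not count memory that could be reclaimed by flushing caches or by swapping to disk. The value is close to MemFree in /proc/meminfo.
It is therefore a very conservative figure, since it excludes the space that flushing caches would free.
With the setrlimit system call we can make sure the current process uses no more than this conservative value obtained from sysconf(_SC_AVPHYS_PAGES).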
First, remove the limits left over from the previous experiment:
[root@test1 tmp]# ulimit -d unlimited -m unlimited -v unlimited
Compile test3.c:
[root@test1 tmp]# gcc test3.c -o callmem2
Source (test3.c):
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <limits.h>
#include <signal.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/resource.h>
int
main (int argc, char *argv[])
{
    void *ptr;
    int n = 0;
    int r = 0;
    struct rlimit rl;
    u_long pages, max_pages, max_bytes;
    /* number of physical pages currently available */
    pages = sysconf(_SC_AVPHYS_PAGES);
    /* clamp so that pages * page size cannot overflow u_long */
    max_pages = ULONG_MAX / sysconf(_SC_PAGE_SIZE);
    if (pages > max_pages)
        pages = max_pages;
    max_bytes = pages * sysconf(_SC_PAGE_SIZE);
    r = getrlimit (RLIMIT_AS, &rl);
    printf("crrent hard limit is %lu MB\n",
           (u_long) rl.rlim_max/0x100000);
    /* lower only the soft limit; the hard limit is left unchanged */
    rl.rlim_cur = max_bytes;
    r = setrlimit (RLIMIT_AS, &rl);
    if (r){
        perror("setrlimit");
    }
    printf("limit set to %lu MB\n", max_bytes / 0x100000);
    while(1){
        ptr = malloc(0x100000);
        if (ptr == NULL){
            perror("malloc");
            break;
        }
        /* touch the memory so it is really consumed */
        memset(ptr, 1, 0x100000);
        printf("malloced %d MB\n", ++n);
    }
    printf("paused\n");
    /* stop instead of exiting so the mappings can be inspected with pmap */
    raise(SIGSTOP);
    return 0;
}
First check current memory usage; 296MB is free.
[root@test1 ~]# free -m
total used free shared buffers cached
Mem: 503 206 296 0 28 141
-/+ buffers/cache: 36 466
Swap: 1027 0 1027
Run callmem2:
[root@test1 tmp]# ./callmem2
crrent hard limit is 4095 MB
limit set to 296 MB
malloced 1 MB
malloced 2 MB
malloced 3 MB
malloced 4 MB
malloced 5 MB
......
malloced 294 MB
malloc: Cannot allocate memory
paused
[1]+ Stopped ./callmem2
crrent hard limit is 4095 MB
The hard limit is 4095MB, that is, the 4GB virtual address space limit.
limit set to 296 MB
The process is limited to 296MB. malloc obtained 294MB; the remaining 2MB went to the system libraries.
Check memory again:
[root@test1 ~]# free -m
total used free shared buffers cached
Mem: 503 497 5 0 28 137
-/+ buffers/cache: 331 171
Swap: 1027 0 1027
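Why are 5MB still free? Practically all free memory went to callmem2 (the leftover is too small to hand out), and 4MB of cached memory was released back to the system.
Can other processes still allocate memory in this state?
Yes, but only about 5MB. With the 5MB shown as free above, a new process spends 1MB to 2MB on system libraries at startup and takes as much of the rest as it can.
Run callmem2 again; this time it obtains only 4MB.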
[root@test1 tmp]# ./callmem2
crrent hard limit is 4095 MB
limit set to 6 MB
malloced 1 MB
malloced 2 MB
malloced 3 MB
malloced 4 MB
malloc: Cannot allocate memory
paused
[2]+ Stopped ./callmem2
pmap shows that its mappings total 5728KB:
[root@test1 tmp]# jobs -x pmap %2
2739: ./callmem2
00110000 1244K r-x-- /lib/libc-2.5.so
00247000 8K r-x-- /lib/libc-2.5.so
00249000 4K rwx-- /lib/libc-2.5.so
0024a000 12K rwx-- [ anon ]
00b51000 100K r-x-- /lib/ld-2.5.so
00b6a000 4K r-x-- /lib/ld-2.5.so
00b6b000 4K rwx-- /lib/ld-2.5.so
00ba3000 4K r-x-- [ anon ]
08048000 4K r-x-- /tmp/callmem2
08049000 4K rw--- /tmp/callmem2
08259000 132K rw--- [ anon ]
b7bd8000 4116K rw--- [ anon ]
b7ff0000 8K rw--- [ anon ]
bf938000 84K rw--- [ stack ]
total 5728K
Check memory usage again. free is still 5MB; this time cached gave up 3MB and buffers 1MB.
[root@test1 ~]# free -m
total used free shared buffers cached
Mem: 503 497 5 0 27 134
-/+ buffers/cache: 335 167
Swap: 1027 0 1027
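4) Summary of the memory limits:
4.1) All three methods can prevent OOM.
4.2) overcommit: only the superuser can change it. Once it is set to 2, no user's process can commit more address space than the memory actually available, so the OOM killer is not triggered. This approach offers no per-process control, however.
4.3) ulimit: available to every user and effective for the current shell. It has hard and soft limits and can cap the memory a process may use.
4.4) setrlimit: available to every user and effective for the calling process. It also has hard and soft limits and can cap the memory a process may use.
4.5) setrlimit is usually combined with sysconf, but the page count sysconf returns is only approximate; on a busy system it changes constantly.
4.6) Note that sysconf(_SC_AVPHYS_PAGES) reports available physical memory only; swap is not included.
4.7) If ulimit restricts processes started from the current shell to 10MB, they can use only 10MB even when sysconf(_SC_AVPHYS_PAGES) reports more, as the following experiment shows: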
[root@test1 tmp]# ulimit -d 10240 -m 10240 -v 10240
[root@test1 tmp]# ./callmem2
crrent hard limit is 10 MB
setrlimit: Invalid argument
limit set to 328 MB
malloced 1 MB
malloced 2 MB
malloced 3 MB
malloced 4 MB
malloced 5 MB
malloced 6 MB
malloced 7 MB
malloc: Cannot allocate memory
paused
[1]+ Stopped ./callmem2
[root@test1 tmp]# jobs -x pmap %1
2514: ./callmem2
0067b000 4K r-x-- [ anon ]
00b51000 100K r-x-- /lib/ld-2.5.so
00b6a000 4K r-x-- /lib/ld-2.5.so
00b6b000 4K rwx-- /lib/ld-2.5.so
00b6e000 1244K r-x-- /lib/libc-2.5.so
00ca5000 8K r-x-- /lib/libc-2.5.so
00ca7000 4K rwx-- /lib/libc-2.5.so
00ca8000 12K rwx-- [ anon ]
08048000 4K r-x-- /tmp/callmem2
08049000 4K rw--- /tmp/callmem2
b7729000 8224K rw--- [ anon ]
b7f44000 8K rw--- [ anon ]
bfd46000 84K rw--- [ stack ]
total 9704K