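I) Core files
A core file is usually produced when a program crashes: the kernel writes the program's current memory image to the core file. For example, when a program hits a segmentation fault, the kernel sends it SIGSEGV, which terminates the program, and dumps the program's memory to a core file.
So a core file contains only the memory image of the process (plus its register state). Debugging information, if the program was compiled with it, stays in the executable, which the debugger reads alongside the core file.
Below we go through the topics related to core files: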
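1) ulimit
ulimit -c sets the maximum size of a core file and thereby controls whether one is produced at all. If the limit is 0, no core file is produced; if the limit is smaller than the core file would be, the core file is truncated.
So to debug with core files, this limit must be raised, as follows:
Check the core file limit; here it is 0, so no core file is produced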
ulimit -c
0
Remove the limit on the core file size so that programs can produce core files
ulimit -c unlimited
ulimit -c
unlimited
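2) Core file name and location
/proc/sys/kernel/core_uses_pid controls whether the PID is appended to the core file name as an extension. If the file contains 1, the PID is appended and the core file is named core.PID; if it contains 0, every core file is simply named core.
For example:
Check core_uses_pid; on this system it is 1, so the PID is appended to the core file name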
cat /proc/sys/kernel/core_uses_pid
1
Set it to 0 so that the core file is named just core
echo "0" > /proc/sys/kernel/core_uses_pid
List the current directory; there is no core file yet
ls
test test.c
Run ./test, which produces a core file.
./test
*** glibc detected *** ./test: double free or corruption (fasttop): 0x09f29008 ***
======= Backtrace: =========
/lib/libc.so.6[0xbd3f7d]
/lib/libc.so.6(cfree+0x90)[0xbd75d0]
./test[0x80483dc]
/lib/libc.so.6(__libc_start_main+0xdc)[0xb83dec]
./test[0x8048301]
======= Memory map: ========
00430000-00431000 r-xp 00430000 00:00 0 [vdso]
00b51000-00b6a000 r-xp 00000000 08:01 3704501 /lib/ld-2.5.so
00b6a000-00b6b000 r-xp 00018000 08:01 3704501 /lib/ld-2.5.so
00b6b000-00b6c000 rwxp 00019000 08:01 3704501 /lib/ld-2.5.so
00b6e000-00ca5000 r-xp 00000000 08:01 3704502 /lib/libc-2.5.so
00ca5000-00ca7000 r-xp 00137000 08:01 3704502 /lib/libc-2.5.so
00ca7000-00ca8000 rwxp 00139000 08:01 3704502 /lib/libc-2.5.so
00ca8000-00cab000 rwxp 00ca8000 00:00 0
00dab000-00db6000 r-xp 00000000 08:01 3704511 /lib/libgcc_s-4.1.1-20070105.so.1
00db6000-00db7000 rwxp 0000a000 08:01 3704511 /lib/libgcc_s-4.1.1-20070105.so.1
08048000-08049000 r-xp 00000000 08:01 327681 /root/test
08049000-0804a000 rw-p 00000000 08:01 327681 /root/test
09f29000-09f4a000 rw-p 09f29000 00:00 0
b7e00000-b7e21000 rw-p b7e00000 00:00 0
b7e21000-b7f00000 ---p b7e21000 00:00 0
b7f0c000-b7f0d000 rw-p b7f0c000 00:00 0
b7f21000-b7f22000 rw-p b7f21000 00:00 0
bfd58000-bfd6d000 rw-p bfd58000 00:00 0 [stack]
Aborted (core dumped)
List the directory again; a core file has been produced, and it is named core
ls
core test test.c
/proc/sys/kernel/core_pattern controls where core files are saved and how they are named, for example:
echo "/tmp/core-%e-%u-%s" > /proc/sys/kernel/core_pattern
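%e inserts the executable name
%u inserts the real UID of the dumped process
%s inserts the number of the signal that caused the dump
Run the small program above again: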
./test
*** glibc detected *** ./test: double free or corruption (fasttop): 0x09dbe008 ***
======= Backtrace: =========
/lib/libc.so.6[0xbd3f7d]
/lib/libc.so.6(cfree+0x90)[0xbd75d0]
./test[0x80483dc]
/lib/libc.so.6(__libc_start_main+0xdc)[0xb83dec]
./test[0x8048301]
======= Memory map: ========
004e3000-004e4000 r-xp 004e3000 00:00 0 [vdso]
00b51000-00b6a000 r-xp 00000000 08:01 3704501 /lib/ld-2.5.so
00b6a000-00b6b000 r-xp 00018000 08:01 3704501 /lib/ld-2.5.so
00b6b000-00b6c000 rwxp 00019000 08:01 3704501 /lib/ld-2.5.so
00b6e000-00ca5000 r-xp 00000000 08:01 3704502 /lib/libc-2.5.so
00ca5000-00ca7000 r-xp 00137000 08:01 3704502 /lib/libc-2.5.so
00ca7000-00ca8000 rwxp 00139000 08:01 3704502 /lib/libc-2.5.so
00ca8000-00cab000 rwxp 00ca8000 00:00 0
00dab000-00db6000 r-xp 00000000 08:01 3704511 /lib/libgcc_s-4.1.1-20070105.so.1
00db6000-00db7000 rwxp 0000a000 08:01 3704511 /lib/libgcc_s-4.1.1-20070105.so.1
08048000-08049000 r-xp 00000000 08:01 327681 /root/test
08049000-0804a000 rw-p 00000000 08:01 327681 /root/test
09dbe000-09ddf000 rw-p 09dbe000 00:00 0
b7e00000-b7e21000 rw-p b7e00000 00:00 0
b7e21000-b7f00000 ---p b7e21000 00:00 0
b7f77000-b7f78000 rw-p b7f77000 00:00 0
b7f8c000-b7f8d000 rw-p b7f8c000 00:00 0
bf8cc000-bf8e2000 rw-p bf8cc000 00:00 0 [stack]
Aborted (core dumped)
ls -l /tmp
total 212
-rw------- 1 root root 413696 Apr 10 10:51 core-test-0-6.2663
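Note: test is the program name, 0 means the program was run as root, and 6 means the signal received was SIGABRT (6). The trailing .2663 is the PID, which the kernel appends when core_uses_pid is nonzero and the pattern contains no %p.
Next we change the core file name format to include the time, the PID and the hostname: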
echo "/tmp/core-%t-%p-%h" > /proc/sys/kernel/core_pattern
%t inserts the time of the dump as a Unix timestamp
%p inserts the PID
%h inserts the hostname
Run test again (output omitted).
List the core files again
ls -l /tmp
total 372
-rw------- 1 root root 413696 Apr 10 10:52 core-1302447166-2677-test1
Note: 1302447166 is the Unix time, 2677 is the PID and test1 is the hostname. This time no extra PID suffix from core_uses_pid appears: once the pattern itself contains %p, the core_uses_pid setting no longer has any effect.
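3) setuid and core files
If a program has the setuid bit set, then by default it does not produce a core file when an ordinary user runs it. This only changes when /proc/sys/fs/suid_dumpable is modified, as follows:
Check suid_dumpable; the default is 0, so no core file is produced.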
cat /proc/sys/fs/suid_dumpable
0
To make the following experiment easier to read, switch core_pattern back to the format that includes the UID:
echo "/tmp/core-%e-%u-%s" > /proc/sys/kernel/core_pattern
Check the permissions of /tmp/test (the crashing program) and add the setuid and setgid bits (chmod +s sets both):
ls -l /tmp/test
-rwxr-xr-x 1 root root 4811 Apr 10 10:41 /tmp/test
chmod +s /tmp/test
ls -l /tmp/test
-rwsr-sr-x 1 root root 4811 Apr 10 10:41 /tmp/test
Run test as an ordinary user:
su - test
./test
*** glibc detected *** ./test: double free or corruption (fasttop): 0x08279008 ***
======= Backtrace: =========
/lib/libc.so.6[0x175f7d]
/lib/libc.so.6(cfree+0x90)[0x1795d0]
./test[0x80483dc]
/lib/libc.so.6(__libc_start_main+0xdc)[0x125dec]
./test[0x8048301]
======= Memory map: ========
00110000-00247000 r-xp 00000000 08:01 3704502 /lib/libc-2.5.so
00247000-00249000 r-xp 00137000 08:01 3704502 /lib/libc-2.5.so
00249000-0024a000 rwxp 00139000 08:01 3704502 /lib/libc-2.5.so
0024a000-0024d000 rwxp 0024a000 00:00 0
00b51000-00b6a000 r-xp 00000000 08:01 3704501 /lib/ld-2.5.so
00b6a000-00b6b000 r-xp 00018000 08:01 3704501 /lib/ld-2.5.so
00b6b000-00b6c000 rwxp 00019000 08:01 3704501 /lib/ld-2.5.so
00bc3000-00bc4000 r-xp 00bc3000 00:00 0 [vdso]
00dab000-00db6000 r-xp 00000000 08:01 3704511 /lib/libgcc_s-4.1.1-20070105.so.1
00db6000-00db7000 rwxp 0000a000 08:01 3704511 /lib/libgcc_s-4.1.1-20070105.so.1
08048000-08049000 r-xp 00000000 08:01 327681 /tmp/test
08049000-0804a000 rw-p 00000000 08:01 327681 /tmp/test
08279000-0829a000 rw-p 08279000 00:00 0
b7e00000-b7e21000 rw-p b7e00000 00:00 0
b7e21000-b7f00000 ---p b7e21000 00:00 0
b7fd0000-b7fd1000 rw-p b7fd0000 00:00 0
b7fe5000-b7fe6000 rw-p b7fe5000 00:00 0
bfcb0000-bfcc5000 rw-p bfcb0000 00:00 0 [stack]
Aborted
Confirm that no core file was produced
ls
ssh-jioUnx2695 ssh-psRqMA2559 ssh-TbaOaU2593 test
Switch to root and set suid_dumpable to 1:
echo 1 > /proc/sys/fs/suid_dumpable
Run the program again; this time a core file is produced:
./test
*** glibc detected *** ./test: double free or corruption (fasttop): 0x0816a008 ***
======= Backtrace: =========
/lib/libc.so.6[0xbd3f7d]
/lib/libc.so.6(cfree+0x90)[0xbd75d0]
./test[0x80483dc]
/lib/libc.so.6(__libc_start_main+0xdc)[0xb83dec]
./test[0x8048301]
======= Memory map: ========
0022f000-00230000 r-xp 0022f000 00:00 0 [vdso]
00b51000-00b6a000 r-xp 00000000 08:01 3704501 /lib/ld-2.5.so
00b6a000-00b6b000 r-xp 00018000 08:01 3704501 /lib/ld-2.5.so
00b6b000-00b6c000 rwxp 00019000 08:01 3704501 /lib/ld-2.5.so
00b6e000-00ca5000 r-xp 00000000 08:01 3704502 /lib/libc-2.5.so
00ca5000-00ca7000 r-xp 00137000 08:01 3704502 /lib/libc-2.5.so
00ca7000-00ca8000 rwxp 00139000 08:01 3704502 /lib/libc-2.5.so
00ca8000-00cab000 rwxp 00ca8000 00:00 0
00dab000-00db6000 r-xp 00000000 08:01 3704511 /lib/libgcc_s-4.1.1-20070105.so.1
00db6000-00db7000 rwxp 0000a000 08:01 3704511 /lib/libgcc_s-4.1.1-20070105.so.1
08048000-08049000 r-xp 00000000 08:01 327681 /tmp/test
08049000-0804a000 rw-p 00000000 08:01 327681 /tmp/test
0816a000-0818b000 rw-p 0816a000 00:00 0
b7e00000-b7e21000 rw-p b7e00000 00:00 0
b7e21000-b7f00000 ---p b7e21000 00:00 0
b7fcb000-b7fcc000 rw-p b7fcb000 00:00 0
b7fe0000-b7fe1000 rw-p b7fe0000 00:00 0
bf983000-bf999000 rw-p bf983000 00:00 0 [stack]
Aborted (core dumped)
ls
core-test-500-6.2945 ssh-jioUnx2695 ssh-psRqMA2559 ssh-TbaOaU2593 test
4) Forcing a core file
There are two techniques for generating a core file on demand:
The first is to send a signal to the process:
kill -s SIGSEGV $$
Note: this produces a core file, but it also kills the current process (here the shell itself), so bash has to be restarted.
The second is to use gcore:
gcore $$
(no debugging symbols found)
Using host libthread_db library "/lib/libthread_db.so.1".
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
0x003aa402 in __kernel_vsyscall ()
Saved corefile core.2561
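II) Building your own black box
During development you can build a black box into the application. Compared with relying on a debugger alone, it has the following advantages:
1) It gives you a detailed log of the program's control flow, so useful debugging information is preserved at little cost in performance.
2) It is especially useful when debugging stack overflows. A typical stack overflow invalidates the backtrace, which leaves the debugger nearly useless.
The following program demonstrates how to build a black box: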
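#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <stdarg.h>

/* The "black box": a global trace buffer that survives stack corruption. */
char tracebuf[4096] = "";
char *mstart = tracebuf;

int dbgprintf(const char *fmt, ...)
    __attribute__((__format__(__printf__, 1, 2)));

/* Append one NUL-terminated record to tracebuf. */
int dbgprintf(const char *fmt, ...)
{
    int n = 0;
    va_list ap;
    va_start(ap, fmt);
    int nchars = sizeof(tracebuf) - (mstart - tracebuf);
    /* Wrap around to the start when the buffer is (nearly) full. */
    if (nchars <= 2) {
        mstart = tracebuf;
        nchars = sizeof(tracebuf);
    }
    n = vsnprintf(mstart, nchars, fmt, ap);
    va_end(ap);
    /* vsnprintf returns the untruncated length; clamp it so that
       mstart never moves past the end of the buffer. */
    if (n < 0)
        n = 0;
    else if (n >= nchars)
        n = nchars - 1;
    mstart += n + 1;
    return n;
}

int defective(int x)
{
    int y = 1;
    dbgprintf("defective(%d)", x);
    if (x == 10) {
        dbgprintf("time to corrupt the stack!");
        /* Deliberate bug: writes 128 bytes past y and smashes the stack. */
        memset(&y, 0xa5, sizeof(y) + 128);
        dbgprintf("I'm still here; returning now.");
        return 0;
    }
    return defective(x + 1);
}

int main(int argc, char *argv[])
{
    defective(1);
    dbgprintf("exiting...");
    return 0;
}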
Compile:
gcc -g trace-buffer.c -o trace-buffer
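Note: the program recurses, and on the 10th call it overruns the local variable y and corrupts the stack. Before and during that, every step is recorded through dbgprintf in a global memory block, which the stack corruption does not touch.
Run the program: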
./trace-buffer
Segmentation fault (core dumped)
Now debug the crash with gdb:
gdb ./trace-buffer core.2627
GNU gdb Red Hat Linux (6.5-16.el5rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db library "/lib/libthread_db.so.1".
warning: Can't read pathname for load map: Input/output error.
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Core was generated by `./trace-buffer'.
Program terminated with signal 11, Segmentation fault. /* the core shows that a segmentation fault occurred */
#0 0x0804847d in defective (x=Cannot access memory at address 0xa5a5a5ad
) at trace-buffer.c:45
45 }
(gdb) bt /* print the stack backtrace with bt */
#0 0x0804847d in defective (x=Cannot access memory at address 0xa5a5a5ad
) at trace-buffer.c:45
Cannot access memory at address 0xa5a5a5a9
(gdb) x/15s &tracebuf /* dump the strings stored at tracebuf with x */
0x8049720 <tracebuf>: "defective(1)"
0x804972d <tracebuf+13>: "defective(2)"
0x804973a <tracebuf+26>: "defective(3)"
0x8049747 <tracebuf+39>: "defective(4)"
0x8049754 <tracebuf+52>: "defective(5)"
0x8049761 <tracebuf+65>: "defective(6)"
0x804976e <tracebuf+78>: "defective(7)"
0x804977b <tracebuf+91>: "defective(8)"
0x8049788 <tracebuf+104>: "defective(9)"
0x8049795 <tracebuf+117>: "defective(10)"
0x80497a3 <tracebuf+131>: "time to corrupt the stack!"
0x80497be <tracebuf+158>: "I'm still here; returning now."
0x80497dd <tracebuf+189>: ""
0x80497de <tracebuf+190>: ""
0x80497df <tracebuf+191>: ""
(gdb) quit
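Note: bt shows nothing beyond the segmentation fault itself, whereas the black box shows the whole course of the program, including everything that happened around the moment of the overflow.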
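III) Getting a stack trace at run time (gstack)
The simplest way to get a program's stack trace (backtrace) is to use gdb. The drawback is that gdb has to attach to the running process: the debugger stops the process, you type some commands, and then you let the process continue. If you need to do this often, repeatedly interrupting the running program wastes a lot of time.
The advantage of gstack is that it takes little execution time and is non-interactive.
Here is an example of using gstack:
Terminal 1)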
top
Terminal 2)
gstack `pgrep top`
#0 0x004eb402 in __kernel_vsyscall ()
#1 0x00c33f7d in ___newselect_nocancel () from /lib/libc.so.6
#2 0x08051187 in ?? ()
#3 0x00b83dec in __libc_start_main () from /lib/libc.so.6
#4 0x08049741 in ?? ()