if和switch效率的再研究
昨天發(fā)現了一本叫做CSAPP的書(shū),終于找到了關(guān)于switch問(wèn)題的解答。
這是一段C代碼:
/* $begin switch-c */
int switch_eg(int x)
{
int result = x;
switch (x) {
case 100:
result *= 13;
break;
case 102:
result += 10;
/* Fall through */
case 103:
result += 11;
break;
case 104:
case 106:
result *= result;
break;
default:
result = 0;
}
return result;
}
/* $end switch-c */
用GCC匯編出來(lái)的代碼如下:
.file "switch.c"
.version "01.01"
gcc2_compiled.:
.text
.align 4
.globl switch_eg
.type switch_eg,@function
switch_eg:
pushl %ebp
movl %esp,%ebp
movl 8(%ebp),%edx
leal -100(%edx),%eax
cmpl ,%eax
ja .L9
jmp *.L10(,%eax,4)
.p2align 4,,7
.section .rodata
.align 4
.align 4
.L10:
.long .L4
.long .L9
.long .L5
.long .L6
.long .L8
.long .L9
.long .L8
.text
.p2align 4,,7
.L4:
leal (%edx,%edx,2),%eax
leal (%edx,%eax,4),%edx
jmp .L3
.p2align 4,,7
.L5:
addl ,%edx
.L6:
addl ,%edx
jmp .L3
.p2align 4,,7
.L8:
imull %edx,%edx
jmp .L3
.p2align 4,,7
.L9:
xorl %edx,%edx
.L3:
movl %edx,%eax
movl %ebp,%esp
popl %ebp
ret
.Lfe1:
.size switch_eg,.Lfe1-switch_eg
.ident "GCC: (GNU) 2.95.3 20010315 (release)"
在上面的匯編代碼中我們可以很清楚的看到switch部分被分配了一個(gè)連續的查找表,switch case中不連續的部分也被添加上了相應的條目,switch表的大小不是根據case語(yǔ)句的多少,而是case的最大值的最小值之間的間距。在選擇相應的分支時(shí),會(huì )先有一個(gè)cmp子句,如果大于查找表的最大值,則跳轉到default子句。而其他所有的case語(yǔ)句的耗時(shí)都回事O(1)。
相比于if-else結構,switch的效率絕對是要高很多的,但是switch使用查找表的方式?jīng)Q定了case的條件必須是一個(gè)連續的常量。而if-else則可以靈活的多。
發(fā)表于: 2008-10-27,修改于: 2008-10-27 21:28 已瀏覽141次,有評論2條
推薦投訴網(wǎng)友評論
vfdff 時(shí)間:2008-10-31 22:58:52 IP地址:219.245.118.★
你把 case 106 修改為case 119,你就會(huì )發(fā)現查找表沒(méi)有你想象中的大,好像這種優(yōu)化是不確定的》
我把它改成 case 119 (其余不變)后,IDA的反匯編結果:
.text:00401020 switch_eg proc near ; CODE XREF: j_switch_egj
.text:00401020
.text:00401020 var_48 = dword ptr -48h
.text:00401020 var_8 = dword ptr -8
.text:00401020 var_4 = dword ptr -4
.text:00401020 arg_0 = dword ptr 8
.text:00401020
.text:00401020 push ebp
.text:00401021 mov ebp, esp
.text:00401023 sub esp, 48h
.text:00401026 push ebx
.text:00401027 push esi
.text:00401028 push edi
.text:00401029 lea edi, [ebp+var_48]
.text:0040102C mov ecx, 12h
.text:00401031 mov eax, 0CCCCCCCCh
.text:00401036 rep stosd
.text:00401038 mov eax, [ebp+arg_0]
.text:0040103B mov [ebp+var_4], eax
.text:0040103E mov ecx, [ebp+arg_0]
.text:00401041 mov [ebp+var_8], ecx
.text:00401044 mov edx, [ebp+var_8]
.text:00401047 sub edx, 64h
.text:0040104A mov [ebp+var_8], edx
.text:0040104D cmp [ebp+var_8], 13h
.text:00401051 ja short loc_401090
.text:00401053 mov ecx, [ebp+var_8]
.text:00401056 xor eax, eax
.text:00401058 mov al, ds:byte_4010B5[ecx]
.text:0040105E jmp ds:off_4010A1[eax*4]
.text:00401065
.text:00401065 loc_401065: ; DATA XREF: .text:off_4010A1o
.text:00401065 mov edx, [ebp+var_4]
.text:00401068 imul edx, 0Dh
.text:0040106B mov [ebp+var_4], edx
.text:0040106E jmp short loc_401097
.text:00401070 ; 哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪?
.text:00401070
.text:00401070 loc_401070: ; CODE XREF: switch_eg+3Ej
.text:00401070 ; DATA XREF: .text:004010A5o
.text:00401070 mov eax, [ebp+var_4]
.text:00401073 add eax, 0Ah
.text:00401076 mov [ebp+var_4], eax
.text:00401079
.text:00401079 loc_401079: ; CODE XREF: switch_eg+3Ej
.text:00401079 ; DATA XREF: .text:004010A9o
.text:00401079 mov ecx, [ebp+var_4]
.text:0040107C add ecx, 0Bh
.text:0040107F mov [ebp+var_4], ecx
.text:00401082 jmp short loc_401097
.text:00401084 ; 哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪?
.text:00401084
.text:00401084 loc_401084: ; CODE XREF: switch_eg+3Ej
.text:00401084 ; DATA XREF: .text:004010ADo
.text:00401084 mov edx, [ebp+var_4]
.text:00401087 imul edx, [ebp+var_4]
.text:0040108B mov [ebp+var_4], edx
.text:0040108E jmp short loc_401097
.text:00401090 ; 哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪?
.text:00401090
.text:00401090 loc_401090: ; CODE XREF: switch_eg+31j
.text:00401090 ; switch_eg+3Ej
.text:00401090 ; DATA XREF: ...
.text:00401090 mov [ebp+var_4], 0
.text:00401097
.text:00401097 loc_401097: ; CODE XREF: switch_eg+4Ej
.text:00401097 ; switch_eg+62j ...
.text:00401097 mov eax, [ebp+var_4]
.text:0040109A pop edi
.text:0040109B pop esi
.text:0040109C pop ebx
.text:0040109D mov esp, ebp
.text:0040109F pop ebp
.text:004010A0 retn
.text:004010A0 switch_eg endp
.text:004010A0
.text:004010A0 ; 哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪哪?
.text:004010A1 off_4010A1 dd offset loc_401065 ; DATA XREF: switch_eg+3Er
.text:004010A5 dd offset loc_401070
.text:004010A9 dd offset loc_401079
.text:004010AD dd offset loc_401084
.text:004010B1 dd offset loc_401090
.text:004010B5 byte_4010B5 db 0 ; DATA XREF: switch_eg+38r
.text:004010B6 dw 104h
.text:004010B8 dd 4040302h, 3 dup(4040404h), 0CCCCCC03h, 0Dh dup(0CCCCCCCCh)
.text:00401100
ubuntuer 時(shí)間:2008-11-01 10:31:42 IP地址:58.19.58.★
確實(shí),不過(guò)現在編譯器優(yōu)化了吧,那本書(shū)有點(diǎn)老了的.gcc version 4.2.3
switch確實(shí)是查表是可以確定的^_^