admin 管理员组

文章数量: 1086019

CSAPP lab2 经典的bomblab二次学习

csapp上的实验一直收到广泛好评,其中第二个实验bomblab则是一个学习C语言及其转化汇编指令的非常好的小项目,前段时间刚刚借鉴了很多教程完成了整个lab的6个小实验(还有一个secret phase实在太忙一直没顾得上),于是现在打算自己重新做一遍整个实验,好好回忆一下当初学到的知识。

电脑系统是Ubuntu20.04,采用objdump对二进制文件进行反汇编

首先把bomb.c文件贴上来

/**************************************************************************** Dr. Evil's Insidious Bomb, Version 1.1* Copyright 2011, Dr. Evil Incorporated. All rights reserved.** LICENSE:** Dr. Evil Incorporated (the PERPETRATOR) hereby grants you (the* VICTIM) explicit permission to use this bomb (the BOMB).  This is a* time limited license, which expires on the death of the VICTIM.* The PERPETRATOR takes no responsibility for damage, frustration,* insanity, bug-eyes, carpal-tunnel syndrome, loss of sleep, or other* harm to the VICTIM.  Unless the PERPETRATOR wants to take credit,* that is.  The VICTIM may not distribute this bomb source code to* any enemies of the PERPETRATOR.  No VICTIM may debug,* reverse-engineer, run "strings" on, decompile, decrypt, or use any* other technique to gain knowledge of and defuse the BOMB.  BOMB* proof clothing may not be worn when handling this program.  The* PERPETRATOR will not apologize for the PERPETRATOR's poor sense of* humor.  This license is null and void where the BOMB is prohibited* by law.***************************************************************************/#include <stdio.h>
#include <stdlib.h>
#include "support.h"
#include "phases.h"/* * Note to self: Remember to erase this file so my victims will have no* idea what is going on, and so they will all blow up in a* spectaculary fiendish explosion. -- Dr. Evil */FILE *infile;int main(int argc, char *argv[])
{char *input;/* Note to self: remember to port this bomb to Windows and put a * fantastic GUI on it. *//* When run with no arguments, the bomb reads its input lines * from standard input. */if (argc == 1) {  infile = stdin;} /* When run with one argument <file>, the bomb reads from <file> * until EOF, and then switches to standard input. Thus, as you * defuse each phase, you can add its defusing string to <file> and* avoid having to retype it. */else if (argc == 2) {if (!(infile = fopen(argv[1], "r"))) {printf("%s: Error: Couldn't open %s\n", argv[0], argv[1]);exit(8);}}/* You can't call the bomb with more than 1 command line argument. */else {printf("Usage: %s [<input_file>]\n", argv[0]);exit(8);}/* Do all sorts of secret stuff that makes the bomb harder to defuse. */initialize_bomb();printf("Welcome to my fiendish little bomb. You have 6 phases with\n");printf("which to blow yourself up. Have a nice day!\n");/* Hmm...  Six phases must be more secure than one phase! */input = read_line();             /* Get input                   */phase_1(input);                  /* Run the phase               */phase_defused();                 /* Drat!  They figured it out!* Let me know how they did it. */// Border relations with Canada have never been better.printf("Phase 1 defused. How about the next one?\n");/* The second phase is harder.  No one will ever figure out* how to defuse this... */input = read_line();phase_2(input);phase_defused();//1 2 4 8 16 32printf("That's number 2.  Keep going!\n");/* I guess this is too easy so far.  Some more complex code will* confuse people. */input = read_line();phase_3(input);phase_defused();printf("Halfway there!\n");// [0, 207] [1, 311] [2, 707] [3, 256] [4, 389] [5, 206] [6, 682]/* Oh yeah?  Well, how good is your math?  Try on this saucy problem! */input = read_line();phase_4(input);phase_defused();printf("So you got that one.  Try this one.\n");//[7, 0]/* Round and 'round in memory we go, where we stop, the bomb blows! */input = read_line();phase_5(input);phase_defused();printf("Good work!  On to the next...\n");//ionefg/* This phase will never be used, since no one will get past the* earlier ones.  But just in case, make this one extra hard. */input = read_line();phase_6(input);phase_defused();//4 3 2 1 6 5/* Wow, they got it!  But isn't something... missing?  Perhaps* something they overlooked?  Mua ha ha ha ha! */return 0;
}

简要阅读一下代码可以看到有6个phase,每一个会检索你从终端的输入,如果符合要求就进入下一个phase,如果不符合就会错误。我们需要通过阅读反汇编出的汇编代码以及利用gdb调试功能来对输入进行推导。

直接来看phase_1的汇编代码

0000000000400ee0 <phase_1>:400ee0:    48 83 ec 08              sub    $0x8,%rsp400ee4:    be 00 24 40 00           mov    $0x402400,%esi400ee9:    e8 4a 04 00 00           call   401338 <strings_not_equal>400eee:    85 c0                    test   %eax,%eax400ef0:    74 05                    je     400ef7 <phase_1+0x17>400ef2:    e8 43 05 00 00           call   40143a <explode_bomb>400ef7:    48 83 c4 08              add    $0x8,%rsp400efb:    c3                       ret    

从其它教程中学到的非常有利于阅读汇编代码的方法,就是先把带有汇编指令的代码转化为无需指令的方式,变量名称可以用寄存器名称代替。于是phase_1可以转化为:

rsp -= 8;
esi = 0x402400;
call strings_not_equal;
test eax, eax
je 400ef7;
explode_bomb();400ef7:rsp += 8;

首先这里test je指令的用法是:

x86平台上使用汇编如何判断一个值是否为0?
一般会使用该指令:
test %rax %rax
je xxx
test指令会判断后面两个操作数执行AND操作,结果为0就设置zero flag,然后搭配je跳转指令从而实现对一个值是否为0的判断。
如果%rax值为0,那么他们相与才会等于0,否则该值不会为0.
————————————————
版权声明:本文为CSDN博主「程序猿Ricky的日常干货」的原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接及本声明。
原文链接:

还有一些需要加入的知识是,对于函数来说,一些寄存器有其独特的作用,比如edi寄存器用来保存函数的第一个参数,esi寄存器用来保存第二个参数,并且函数的(如果有)前六个参数都有其独特的寄存器保存,超过6个的参数则会保存在程序栈中,而函数的返回值会保存在eax寄存器中,我们将上述转化成c风格的代码:

phase_1(edi) {rsp -= 8;esi = 0x402400;eax = strings_not_equal(edi,esi);if(eax != 0) explode_bomb();rsp += 8;    
}    

比较的字符串首地址就是0x402400这个首地址,我们可以在gdb中对这个地址进行查看

gdb中 x/s addr指令能以字符流的形式查看addr内存地址,由此我们可以得出第一题的答案。

马不停蹄我们来看phase_2的汇编:

0000000000400efc <phase_2>:400efc:    55                       push   %rbp400efd:    53                       push   %rbx400efe:    48 83 ec 28              sub    $0x28,%rsp400f02:    48 89 e6                 mov    %rsp,%rsi400f05:    e8 52 05 00 00           call   40145c <read_six_numbers>400f0a:    83 3c 24 01              cmpl   $0x1,(%rsp)400f0e:    74 20                    je     400f30 <phase_2+0x34>400f10:    e8 25 05 00 00           call   40143a <explode_bomb>400f15:    eb 19                    jmp    400f30 <phase_2+0x34>400f17:    8b 43 fc                 mov    -0x4(%rbx),%eax400f1a:    01 c0                    add    %eax,%eax400f1c:    39 03                    cmp    %eax,(%rbx)400f1e:    74 05                    je     400f25 <phase_2+0x29>400f20:    e8 15 05 00 00           call   40143a <explode_bomb>400f25:    48 83 c3 04              add    $0x4,%rbx400f29:    48 39 eb                 cmp    %rbp,%rbx400f2c:    75 e9                    jne    400f17 <phase_2+0x1b>400f2e:    eb 0c                    jmp    400f3c <phase_2+0x40>400f30:    48 8d 5c 24 04           lea    0x4(%rsp),%rbx400f35:    48 8d 6c 24 18           lea    0x18(%rsp),%rbp400f3a:    eb db                    jmp    400f17 <phase_2+0x1b>400f3c:    48 83 c4 28              add    $0x28,%rsp400f40:    5b                       pop    %rbx400f41:    5d                       pop    %rbp400f42:    c3                       ret    

代码长度一下子起来了,但是没关系,先按照上面的情况来,先把它变成能看得懂的方式(省略开头和结尾处的运行状态保存与复位代码):

rsp -= 40;
rsi = rsp;
call read_six_numbers;
if(*(rsp) == 1) goto 400f30
else explode_bomb();400f30:rbx = rsp + 4;rbp = rsp + 24goto 400f17;400f17:eax = *(rbx - 4);eax += eax;if(*(rbx) == eax) goto 400f25;else explode_bomb();400f25:rbx += 4;if(rbx != rbp) goto 400f17;else goto 400f3c;  //即运行正常退出进入下一个阶段

将它转化为c风格代码后为:

phase_2() {//栈顶指针移动rsi = rsp;read_six_number();if(*(rsp) != 1) explode_bomb();rbp = rsp + 24;for(rbx = rsp + 4;; rbx != rbp; rbx += 4) {eax = *(rbx - 4);eax += eax;if(*(rbx) == eax) continue;else explode_bomb();}return ;
}

这样一看就很容易明白了,首先一看函数名称read_six_number可见是读入了6个数字,循环明显是一种遍历方式,通过rbx += 4可以看出是一个数占4个字节,大概率就是int了,第五行的判断说明如果第一个读的数不是1就会直接bomb,所以第一个数要是1

细看for循环,rbx初始的rsp+4的值实际上是所存入的第二个数的地址,而循环第一次中eax则是第一个数的值,然后加一个本身,再进行判断,如果eax中的值和rbx所存地址处的值不一样就会bomb,这样一看就很明显,实际上就是要求6个数,第一个数是1,其余每个数都是前一个数加上自己,所以最后答案就是1 2 4 8 16 32

光速飞到phase_3上,老规矩先贴汇编代码:

0000000000400f43 <phase_3>:400f43:    48 83 ec 18              sub    $0x18,%rsp400f47:    48 8d 4c 24 0c           lea    0xc(%rsp),%rcx400f4c:    48 8d 54 24 08           lea    0x8(%rsp),%rdx400f51:    be cf 25 40 00           mov    $0x4025cf,%esi400f56:    b8 00 00 00 00           mov    $0x0,%eax400f5b:    e8 90 fc ff ff           call   400bf0 <__isoc99_sscanf@plt>400f60:    83 f8 01                 cmp    $0x1,%eax400f63:    7f 05                    jg     400f6a <phase_3+0x27>400f65:    e8 d0 04 00 00           call   40143a <explode_bomb>400f6a:    83 7c 24 08 07           cmpl   $0x7,0x8(%rsp)400f6f:    77 3c                    ja     400fad <phase_3+0x6a>400f71:    8b 44 24 08              mov    0x8(%rsp),%eax400f75:    ff 24 c5 70 24 40 00     jmp    *0x402470(,%rax,8)400f7c:    b8 cf 00 00 00           mov    $0xcf,%eax400f81:    eb 3b                    jmp    400fbe <phase_3+0x7b>400f83:    b8 c3 02 00 00           mov    $0x2c3,%eax400f88:    eb 34                    jmp    400fbe <phase_3+0x7b>400f8a:    b8 00 01 00 00           mov    $0x100,%eax400f8f:    eb 2d                    jmp    400fbe <phase_3+0x7b>400f91:    b8 85 01 00 00           mov    $0x185,%eax400f96:    eb 26                    jmp    400fbe <phase_3+0x7b>400f98:    b8 ce 00 00 00           mov    $0xce,%eax400f9d:    eb 1f                    jmp    400fbe <phase_3+0x7b>400f9f:    b8 aa 02 00 00           mov    $0x2aa,%eax400fa4:    eb 18                    jmp    400fbe <phase_3+0x7b>400fa6:    b8 47 01 00 00           mov    $0x147,%eax400fab:    eb 11                    jmp    400fbe <phase_3+0x7b>400fad:    e8 88 04 00 00           call   40143a <explode_bomb>400fb2:    b8 00 00 00 00           mov    $0x0,%eax400fb7:    eb 05                    jmp    400fbe <phase_3+0x7b>400fb9:    b8 37 01 00 00           mov    $0x137,%eax400fbe:    3b 44 24 0c              cmp    0xc(%rsp),%eax400fc2:    74 05                    je     400fc9 <phase_3+0x86>400fc4:    e8 71 04 00 00           call   40143a <explode_bomb>400fc9:    48 83 c4 18              add    $0x18,%rsp400fcd:    c3                       ret    

真行啊,原来还能更长,老规矩先翻译一下

rsp -= 0x18;
rcx = rsp + 12;
rdx = rsp + 8;
esi = 0x4025cf;
eax = 0;
call __isoc99_sscanf@plt;
if(eax > 1) goto 400f6a;
call explode_bomb;400f6a:if(*(rsp + 8) > 7) call explode_bomb;eax = *(rsp + 8);goto 8*eax + 0x402470;
400f7c:eax = 0xcf;    //207goto 400fbe;
400f83:eax = 0x2c3;   //707goto 400fbe;
400f8a:eax = 0x100;   //256goto 400fbe;
400f91:eax = 0x185;   //389goto 400fbe;
400f98:eax = 0xce;    //206goto 400fbe;
400f9f:eax = 0x2aa;   //682goto 400fbe;
400fa6:eax = 0x147;   //327goto 400fbe;
400fb2:eax = 0;goto 400fbe;
400fb9:eax = 0x137;if(eax == *(rsp + 12)) goto 400fc9;  //成功返回explode_bomb;

这个乍一看可能有点奇怪,感觉和我们熟知的一些程序逻辑有点不同,但其实敏感一些的可能会猜到这是一个跳转表,switch case逻辑的一个汇编代码实现。

还可以看到在跳转表逻辑之前,调用了一个C99版本的scanf函数,它的第一个参数存放在esi中,在地址0x4025cf中,在scanf函数中第一个参数一般是输入数据的类型模式format,我们通过gdb调试看一下这里放了什么

所以在这里输入了两个整型数据,scanf函数的格式应该就是

scanf("%d %d",rdx,rcx); //rdx和rcx分别是存储两个整型数据的寄存器

因此从上面的rdx = rsp + 8, rcx = rsp + 12可知,在栈的rsp + 8处存放了存储的第一个数,在rsp + 12处存放了第二个。那么转到4006fa处,if判断所得出的应该是,如果输入第一个数的值大于7,就直接bomb,否则eax = 第一个数,然后转到跳转表。

因此,我们可以发现,0x402470应该就是跳转表的首地址,我们通过gdb调试看一下。

打印指定内存地址的值
使用x命令来打印内存的值,格式为x/nfu addr,以f格式打印从addr开始的n个长度单元为u的内存值。
n:输出单元的个数
f:输出格式,如x表示以16进制输出,o表示以8进制输出,默认为x
u:一个单元的长度,b表示1个byte,h表示2个byte(half word),w表示4个byte,g表示8个byte(giant word)
————————————————
版权声明:本文为CSDN博主「高性能架构探索」的原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接及本声明。
原文链接:

对照汇编指令,可以看出这正是后面汇编指令运行的地址,也就是说当运行到跳转表时,通过跳到指定的地址处,该处存放着下一条指令的地址。因此从这里可以得出跳转表实际上为:

0

1

2

3

4

5

6

207

311

707

256

389

206

682

只要输入一组跳转表索引和对应的值即可。

终于过半了,我们来看phase_4:

000000000040100c <phase_4>:40100c:    48 83 ec 18              sub    $0x18,%rsp401010:    48 8d 4c 24 0c           lea    0xc(%rsp),%rcx401015:    48 8d 54 24 08           lea    0x8(%rsp),%rdx40101a:    be cf 25 40 00           mov    $0x4025cf,%esi40101f:    b8 00 00 00 00           mov    $0x0,%eax401024:    e8 c7 fb ff ff           call   400bf0 <__isoc99_sscanf@plt>401029:    83 f8 02                 cmp    $0x2,%eax40102c:    75 07                    jne    401035 <phase_4+0x29>40102e:    83 7c 24 08 0e           cmpl   $0xe,0x8(%rsp)401033:    76 05                    jbe    40103a <phase_4+0x2e>401035:    e8 00 04 00 00           call   40143a <explode_bomb>40103a:    ba 0e 00 00 00           mov    $0xe,%edx40103f:    be 00 00 00 00           mov    $0x0,%esi401044:    8b 7c 24 08              mov    0x8(%rsp),%edi401048:    e8 81 ff ff ff           call   400fce <func4>40104d:    85 c0                    test   %eax,%eax40104f:    75 07                    jne    401058 <phase_4+0x4c>401051:    83 7c 24 0c 00           cmpl   $0x0,0xc(%rsp)401056:    74 05                    je     40105d <phase_4+0x51>401058:    e8 dd 03 00 00           call   40143a <explode_bomb>40105d:    48 83 c4 18              add    $0x18,%rsp401061:    c3                       ret    

看着很短哈,但其实是里面另插了一个函数func4

先不管func4,先来翻译一下:

rsp -= 24;
rcx = rsp + 12;
rdx = rsp + 8;
esi = 0x4025cf;
eax = 0;
call __isoc99_sscanf@plt;
if(eax != 2) call explode_bomb;40102e:if(*(rsp + 8) <= 15) goto 40103a;call explode_bomb;
40103a:edx = 15;esi = 0;edi = *(rsp + 8);call func4;if(eax != 0) explode_bomb();if(*(rsp + 12) == 0) goto 40105d;  //成功返回

很经典的scanf函数,再看一下它的输入是啥:

依然是两个整型数据,第一个保存在rsp + 8处,第二个保存在rsp + 12处。根据第10行可以看出第一个值要小于等于15,要往后看就必须了解func4的内容了,但目前知道的有func4的第一个参数就是输入的第一个数,第二个参数是0,第三个参数是15(如果有这么多参数的话),上代码:

0000000000400fce <func4>:400fce:    48 83 ec 08              sub    $0x8,%rsp400fd2:    89 d0                    mov    %edx,%eax400fd4:    29 f0                    sub    %esi,%eax400fd6:    89 c1                    mov    %eax,%ecx400fd8:    c1 e9 1f                 shr    $0x1f,%ecx400fdb:    01 c8                    add    %ecx,%eax400fdd:    d1 f8                    sar    %eax400fdf:    8d 0c 30                 lea    (%rax,%rsi,1),%ecx400fe2:    39 f9                    cmp    %edi,%ecx400fe4:    7e 0c                    jle    400ff2 <func4+0x24>400fe6:    8d 51 ff                 lea    -0x1(%rcx),%edx400fe9:    e8 e0 ff ff ff           call   400fce <func4>400fee:    01 c0                    add    %eax,%eax400ff0:    eb 15                    jmp    401007 <func4+0x39>400ff2:    b8 00 00 00 00           mov    $0x0,%eax400ff7:    39 f9                    cmp    %edi,%ecx400ff9:    7d 0c                    jge    401007 <func4+0x39>400ffb:    8d 71 01                 lea    0x1(%rcx),%esi400ffe:    e8 cb ff ff ff           call   400fce <func4>401003:    8d 44 00 01              lea    0x1(%rax,%rax,1),%eax401007:    48 83 c4 08              add    $0x8,%rsp40100b:    c3                       ret    

依旧翻译:

400fce:rsp -= 8;eax = edx;eax -= esi;ecx = eax;ecx >> 31;  //逻辑右移eax += ecx;eax >> 1;   //算数右移ecx = rax + rsi;if(edi <= ecx) goto 400ff2;edx = rcx - 1;goto 400fce;
400fee:eax += eax;goto 401007;   //成功
400ff2:eax = 0;if(edi >= ecx) goto 401007;  //成功esi = rcx + 1;goto 400fce;

很明显这是一个递归的过程,我们把它变成c风格:

func4(edi, esi, edx) {eax = edx;eax -= esi;ecx = eax;ecx >> 31;eax += ecx;eax /= 2;ecx = rax + rsi;if(edi > ecx) {edx = rcx - 1;func4(edi, esi, edx);} else {eax = 0;if(edi >= ecx) return ;else {esi = rcx + 1;func4(edi, esi, edx);}}
}    

这道题有个非常简单的做法,第一次调用func4(第一个数,0,15),从第8行的if判断可以看出,只要第一个数比ecx大,那就不可能返回,必须使得第一个数小于等于ecx,进入else分支中,并且同时满足edi>=ecx才行,也就是说。当第一个输入的数等于ecx时,就可以使函数正确返回,那么在第一次调用func4的第8行,此时ecx = 7,也就是说,第一个数需要为7.

再返回去看phase_4整个函数,此时func4()返回回到phase_4的17、18行如下:

rsp -= 24;
rcx = rsp + 12;
rdx = rsp + 8;
esi = 0x4025cf;
eax = 0;
call __isoc99_sscanf@plt;
if(eax != 2) call explode_bomb;40102e:if(*(rsp + 8) <= 15) goto 40103a;call explode_bomb;
40103a:edx = 15;esi = 0;edi = *(rsp + 8);call func4;if(eax != 0) explode_bomb();if(*(rsp + 12) == 0) goto 40105d;  //成功返回

17行处eax是func4的返回值,成功返回0,因此第17行不会bomb,18行的rsp + 12的位置正是存放的第二个数的地址,也就是说第二个数为0即可使phase_4成功返回,因此最后的答案是7 0

还剩最后两题,最后的都是非常有难度的,我们先来看phase_5:

0000000000401062 <phase_5>:401062:    53                       push   %rbx401063:    48 83 ec 20              sub    $0x20,%rsp401067:    48 89 fb                 mov    %rdi,%rbx40106a:    64 48 8b 04 25 28 00     mov    %fs:0x28,%rax401071:    00 00 401073:    48 89 44 24 18           mov    %rax,0x18(%rsp)401078:    31 c0                    xor    %eax,%eax40107a:    e8 9c 02 00 00           call   40131b <string_length>40107f:    83 f8 06                 cmp    $0x6,%eax401082:    74 4e                    je     4010d2 <phase_5+0x70>401084:    e8 b1 03 00 00           call   40143a <explode_bomb>401089:    eb 47                    jmp    4010d2 <phase_5+0x70>40108b:    0f b6 0c 03              movzbl (%rbx,%rax,1),%ecx40108f:    88 0c 24                 mov    %cl,(%rsp)401092:    48 8b 14 24              mov    (%rsp),%rdx401096:    83 e2 0f                 and    $0xf,%edx   401099:    0f b6 92 b0 24 40 00     movzbl 0x4024b0(%rdx),%edx4010a0:    88 54 04 10              mov    %dl,0x10(%rsp,%rax,1)4010a4:    48 83 c0 01              add    $0x1,%rax4010a8:    48 83 f8 06              cmp    $0x6,%rax4010ac:    75 dd                    jne    40108b <phase_5+0x29>4010ae:    c6 44 24 16 00           movb   $0x0,0x16(%rsp)4010b3:    be 5e 24 40 00           mov    $0x40245e,%esi4010b8:    48 8d 7c 24 10           lea    0x10(%rsp),%rdi4010bd:    e8 76 02 00 00           call   401338 <strings_not_equal>4010c2:    85 c0                    test   %eax,%eax4010c4:    74 13                    je     4010d9 <phase_5+0x77>4010c6:    e8 6f 03 00 00           call   40143a <explode_bomb>4010cb:    0f 1f 44 00 00           nopl   0x0(%rax,%rax,1)4010d0:    eb 07                    jmp    4010d9 <phase_5+0x77>4010d2:    b8 00 00 00 00           mov    $0x0,%eax4010d7:    eb b2                    jmp    40108b <phase_5+0x29>4010d9:    48 8b 44 24 18           mov    0x18(%rsp),%rax4010de:    64 48 33 04 25 28 00     xor    %fs:0x28,%rax4010e5:    00 00 4010e7:    74 05                    je     4010ee <phase_5+0x8c>4010e9:    e8 42 fa ff ff           call   400b30 <__stack_chk_fail@plt>4010ee:    48 83 c4 20              add    $0x20,%rsp4010f2:    5b                       pop    %rbx4010f3:    c3                       ret    

长度上可以说根本不让咱失望,我们来转化一下:

push rbx;
rsp -= 32;
rbx = rdi;
rax = %fs:0x28;
*(rsp + 24) = rax;
eax = 0;     // xor %eax, %eax 是快速置0的指令
call string_length;
if(eax == 6) goto 4010d2;
call explode_bomb;
goto 4010d2;40108b:ecx = *(rbx + rax);*(rsp) = cl;rdx = *(rsp);edx &= 0xf;  //去edx寄存器的后四位edx = *(rdx + 0x4024b0);*(rsp + rax + 16) = dl;rax += 1;if(rax != 6) goto 40108b;*(rsp + 22) = 0;esi = 0x40245e;rdi = rsp + 10;call strings_not_equal;if(eax == 0) goto 4010d9;call explode_bomb;
4010d2:eax = 0;goto 40108b;
4010d9:rax = *(rsp + 24);if(rax == %fs:0x28) goto 4010ee;  //成功返回call __stack_chk_fail;

一下子看到可能最难理解的就是第4、5行,实际上这部分和31、32行是对应的,是放置栈缓冲区不应该被修改的地方被修改所放置的金丝雀机制,具体可以去看csapp第三章栈缓冲部分,这里和整个题目关系不大,看不明白实际也不影响。

通过第7、8行的汇编指令,我们可以猜测输入的应该是string类型,并且长度应该为6,否则就会bomb,然后经由4010d2处再进入40108b处的循环最后执行了一个字符串比较并返回,我们将指令继续转成c风格代码(栈相关代码直接忽略):

phase_5(rdi) {rbx = rdi;eax = 0;eax = string_length();if(eax != 6) explode_bomb();eax = 0;for(; eax != 6; eax += 1) {ecx = *(rbx + rax);*(rsp) = cl;     //cl是ecx寄存器的后8位形式,刚好可以保存一个char字符rdx = *(rsp);edx &= 0xf;  //去edx寄存器的后四位edx = *(rdx + 0x4024b0);*(rsp + rax + 16) = dl;   //dl是edx寄存器的后8位形式}*(rsp + 22) = 0;esi = 0x40245e;rdi = rsp + 10;eax = strings_not_equal(rdi, esi);if(eax == 0) return ;else explode_bomb();

不如从后往前先看,看见有固定内存地址的先看看里面有啥,strings_not_equal()里有一个0x40245e,让我们打印一下看看

也刚好是6位字符哈,我当时直接迫不及待输入了,当然是直接bomb,这是因为在前面的循环中对每个字符进行了处理,所以肯定不是这6个了,所以研究的重点还是前面的那个循环

可以看到在每次循环中,8-11行的操作实际上就是在取这个当前字符的位数,然后还和0x4024b0这个地址求和并执行取地址操作,我们来看看0x4024b0里存了啥:

从代码中可以看出,原先长度为6的string经过处理之后,成为了现在的flyers,也就是我们需要通过逆推上去找到原先的string,

从第12行可以看出,flyers需要在0x4024b0的char数组处找到对应的索引:

f

l

y

e

r

s

9

15

14

5

6

7

1001

1111

1110

0011

0110

0111

而这些值又是与0x00001111 &过的结果,也就是说,我们只要找到和符合条件的字符都可以成功,经查询ASCII码表后,可以得出一组小写字母的解(不唯一),为ionefg.

终于!!!来到了最后一关(不算隐藏关卡)老规矩先上汇编!!!

00000000004010f4 <phase_6>:4010f4:    41 56                    push   %r144010f6:    41 55                    push   %r134010f8:    41 54                    push   %r124010fa:    55                       push   %rbp4010fb:    53                       push   %rbx4010fc:    48 83 ec 50              sub    $0x50,%rsp401100:    49 89 e5                 mov    %rsp,%r13401103:    48 89 e6                 mov    %rsp,%rsi401106:    e8 51 03 00 00           call   40145c <read_six_numbers>40110b:    49 89 e6                 mov    %rsp,%r1440110e:    41 bc 00 00 00 00        mov    $0x0,%r12d401114:    4c 89 ed                 mov    %r13,%rbp401117:    41 8b 45 00              mov    0x0(%r13),%eax40111b:    83 e8 01                 sub    $0x1,%eax40111e:    83 f8 05                 cmp    $0x5,%eax401121:    76 05                    jbe    401128 <phase_6+0x34>401123:    e8 12 03 00 00           call   40143a <explode_bomb>401128:    41 83 c4 01              add    $0x1,%r12d40112c:    41 83 fc 06              cmp    $0x6,%r12d401130:    74 21                    je     401153 <phase_6+0x5f>401132:    44 89 e3                 mov    %r12d,%ebx401135:    48 63 c3                 movslq %ebx,%rax401138:    8b 04 84                 mov    (%rsp,%rax,4),%eax40113b:    39 45 00                 cmp    %eax,0x0(%rbp)40113e:    75 05                    jne    401145 <phase_6+0x51>401140:    e8 f5 02 00 00           call   40143a <explode_bomb>401145:    83 c3 01                 add    $0x1,%ebx401148:    83 fb 05                 cmp    $0x5,%ebx40114b:    7e e8                    jle    401135 <phase_6+0x41>40114d:    49 83 c5 04              add    $0x4,%r13401151:    eb c1                    jmp    401114 <phase_6+0x20>401153:    48 8d 74 24 18           lea    0x18(%rsp),%rsi401158:    4c 89 f0                 mov    %r14,%rax40115b:    b9 07 00 00 00           mov    $0x7,%ecx401160:    89 ca                    mov    %ecx,%edx401162:    2b 10                    sub    (%rax),%edx401164:    89 10                    mov    %edx,(%rax)401166:    48 83 c0 04              add    $0x4,%rax40116a:    48 39 f0                 cmp    %rsi,%rax40116d:    75 f1                    jne    401160 <phase_6+0x6c>40116f:    be 00 00 00 00           mov    $0x0,%esi401174:    eb 21                    jmp    401197 <phase_6+0xa3>401176:    48 8b 52 08              mov    0x8(%rdx),%rdx40117a:    83 c0 01                 add    $0x1,%eax40117d:    39 c8                    cmp    %ecx,%eax40117f:    75 f5                    jne    401176 <phase_6+0x82>401181:    eb 05                    jmp    401188 <phase_6+0x94>401183:    ba d0 32 60 00           mov    $0x6032d0,%edx401188:    48 89 54 74 20           mov    %rdx,0x20(%rsp,%rsi,2)40118d:    48 83 c6 04              add    $0x4,%rsi401191:    48 83 fe 18              cmp    $0x18,%rsi401195:    74 14                    je     4011ab <phase_6+0xb7>401197:    8b 0c 34                 mov    (%rsp,%rsi,1),%ecx40119a:    83 f9 01                 cmp    $0x1,%ecx40119d:    7e e4                    jle    401183 <phase_6+0x8f>40119f:    b8 01 00 00 00           mov    $0x1,%eax4011a4:    ba d0 32 60 00           mov    $0x6032d0,%edx4011a9:    eb cb                    jmp    401176 <phase_6+0x82>4011ab:    48 8b 5c 24 20           mov    0x20(%rsp),%rbx4011b0:    48 8d 44 24 28           lea    0x28(%rsp),%rax4011b5:    48 8d 74 24 50           lea    0x50(%rsp),%rsi4011ba:    48 89 d9                 mov    %rbx,%rcx4011bd:    48 8b 10                 mov    (%rax),%rdx4011c0:    48 89 51 08              mov    %rdx,0x8(%rcx)4011c4:    48 83 c0 08              add    $0x8,%rax4011c8:    48 39 f0                 cmp    %rsi,%rax4011cb:    74 05                    je     4011d2 <phase_6+0xde>4011cd:    48 89 d1                 mov    %rdx,%rcx4011d0:    eb eb                    jmp    4011bd <phase_6+0xc9>4011d2:    48 c7 42 08 00 00 00     movq   $0x0,0x8(%rdx)4011d9:    00 4011da:    bd 05 00 00 00           mov    $0x5,%ebp4011df:    48 8b 43 08              mov    0x8(%rbx),%rax4011e3:    8b 00                    mov    (%rax),%eax4011e5:    39 03                    cmp    %eax,(%rbx)4011e7:    7d 05                    jge    4011ee <phase_6+0xfa>4011e9:    e8 4c 02 00 00           call   40143a <explode_bomb>4011ee:    48 8b 5b 08              mov    0x8(%rbx),%rbx4011f2:    83 ed 01                 sub    $0x1,%ebp4011f5:    75 e8                    jne    4011df <phase_6+0xeb>4011f7:    48 83 c4 50              add    $0x50,%rsp4011fb:    5b                       pop    %rbx4011fc:    5d                       pop    %rbp4011fd:    41 5c                    pop    %r124011ff:    41 5d                    pop    %r13401201:    41 5e                    pop    %r14401203:    c3                       ret    

不愧是最后一题,长度上确实刷新记录了,难度也相当高。

再难也一样,开始翻译:

push r14, r13, r12, rbp, rbx;
rsp -= 80;
r13 = rsp;
rsi = rsp;
call read_six_number;
r14 = rsp;
r12d = 0;401114:rbp = r13;rax = *r13;eax -= 1;;if(eax <= 5) goto 401128;call explode_bomb;401128:r12d += 1;if(r12d == 6) goto 401153;ebx = r12d;rax = ebx;eax = *(rsp + 4*rax);if(*rbp != eax) goto 401145;call explode_bomb;
401145:ebx += 1;if(ebx <= 5) goto 401135;r13 += 4;goto 401114;
401153:rsi = rsp + 24;rax = r14;ecx = 7;
401160:edx = ecx;edx -= *(rax);*(rax) = edx;rax += 4;if(rax != rsi) goto 401160;esi = 0;goto 401197;
401176:rdx = *(rdx + 8);eax += 1;if(eax != ecx) goto 401176;goto 401188;
401183:edx = 0x6032d0;
401188:*(rsp + 2*rsi + 32) = rdx;rsi += 4;if(rsi == 24) goto 4010ab;
400197:    ecx = *(rsp + rsi);if(ecx <= 1) goto 401183;eax = 1;edx = 0x6032d0;goto 401176;
4010ab:  //***rbx = *(rsp + 32);rax = rsp + 40;rsi = rsp + 80;rcx = rbx;
4011bd:rdx = *rax;*(rcx + 8) = rdx;rax += 8;if(rax == rsi) goto 4011d2;rcx = rdx;goto 4011bd;
4011d2:*(rdx + 8) = 0;ebp = 5;
4011df:rax = *(rbx + 8);eax = *rax;if(*rbx >= eax) goto 4011ee;call explode_bomb;
4011ee:rbx = *(rbx + 8);ebp -= 1;if(ebp != 0) goto 4011df;rsp += 80;pop rbx, rbp, r12, r13, r14;

翻译过来也很不好办,代码很长,而且跳转很多,但是都是部分代码的跳转,我们可以划分成一部分一部分来,起码我们看见调用了read_six_number(),知道输入是6个数字

先硬着头皮开始转c风格:

phase_6() {rsp -= 80;r13 = rsi = rsp;read_six_number();r14 = rsp;for(r12d = 0; r12d != 6; r12d += 1) {rbp = r13;eax = *r13;eax -= 1;if(eax > 5) explode_bomb();for(ebx = r12d; ebx <= 5; ebx += 1) {rax = ebx;eax = *(rsp + 4*rax);if(eax == *rbp) explode_bomb();}r13 += 4;}...

这一部分代码可以看出,输入的6个整型数,每一个应该都小于等于6,且各不相等。

继续往下(从401153地址开始):

    ...rsi = rsp + 24;rax = r14;ecx = 7;for(; rax != rsi; rax += 4) {edx = ecx;edx -= *(rax);*(rax) = edx;}for(esi = 0; esi != 24; esi += 4) {ecx = *(rsp + rsi);if(ecx <= 1) {edx = 0x6032d0;} else {edx = 0x6032d0;for(eax = 1; eax != ecx; eax += 1) {rdx = *(rdx + 8);}}*(rsp + 2*rsi + 32) = rdx;}...

第一个for循环很好看懂,把输入的6个数都改成7和本身的差,进行了一次操作

第二个for循环就比较困难了,ecx寄存器中存放了之前读入的每个数,还根据它的值有一个处理逻辑

1

2

3

4

5

6

rsp + 32

rsp + 40

rsp + 48

rsp + 56

rsp + 64

rsp + 72

0x6032d0

*(6032d8)

...

...

...

...

第二行是存放这些值的地址,可以看出一个值占了8个字节,第三行是说此地址存了一些我们不太看懂的东西。

老规矩,我们这里应该打印一下看看0x6032d0这里存了些啥:

可能敏感的朋友能够发现,实际上这是一个链表,前四个字节存储了一个值,中间四个字节存储了当前节点在链表中的序号,然后四个字节存储了下一个节点的位置,后面的0x00000000应该是为了对齐。

如果能把这里看出来,这个phase基本上可以说是解决了,后面的部分就比较简单了。

    ...rbx = *(rsp + 32);rsi = rsp + 80;rcx = rbx;for(rax = rsp + 40; rax != rsi; rax += 8) {rdx = *rax;*(rcx + 8) = rdx;rcx = rdx;}*(rdx + 8) = 0;for(ebp = 5; ebp != 0; ebp -= 1) {rax = *(rbx + 8);eax = *rax;if(*rbx >= eax) rbx = *(rbx + 8);else explode_bomb();}rsp += 80;pop rbx, rbp, r12, r13, r14;
}

rsp + 32是链表节点的开始地址,而rax = rsp + 40是链表开始节点所存放的下一个节点地址的地址,因此*rax也就是当前节点的下一个节点,而rcx和rbx在第一个循环开始前存放着第一个节点的开始地址,*(rcx + 8) = rdx(*rax)也就是在当前节点的指向下一个节点的地址处存放下一个节点的地址,也就是构造链表,连接节点的过程。

第二个循环则是遍历链表,并在每个循环中进行判断,如果当前节点的值大于等于下一个节点的值,就继续,如果小于则bomb,因此要求此时链表是降序排列。

根据我们查看地址得知,链表节点情况如下:

1

2

3

4

5

6

0x14c

0x0a8

0x39c

0x2b3

0x1dd

0x1bb

若要求为降序,则节点索引应该为345612

但之前每个索引被7减过,因此最后答案应该为最开始输入的6个数,分别是432165

phase_6完整代码如下:

phase_6() {rsp -= 80;r13 = rsi = rsp;read_six_number();r14 = rsp;for(r12d = 0; r12d != 6; r12d += 1) {rbp = r13;eax = *r13;eax -= 1;if(eax > 5) explode_bomb();for(ebx = r12d; ebx <= 5; ebx += 1) {rax = ebx;eax = *(rsp + 4*rax);if(eax == *rbp) explode_bomb();}r13 += 4;}rsi = rsp + 24;rax = r14;ecx = 7;for(; rax != rsi; rax += 4) {edx = ecx;edx -= *(rax);*(rax) = edx;}for(esi = 0; esi != 24; esi += 4) {ecx = *(rsp + rsi);if(ecx <= 1) {edx = 0x6032d0;} else {edx = 0x6032d0;for(eax = 1; eax != ecx; eax += 1) {rdx = *(rdx + 8);}}*(rsp + 2*rsi + 32) = rdx;}rbx = *(rsp + 32);rsi = rsp + 80;rcx = rbx;for(rax = rsp + 40; rax != rsi; rax += 8) {rdx = *rax;*(rcx + 8) = rdx;rcx = rdx;}*(rdx + 8) = 0;for(ebp = 5; ebp != 0; ebp -= 1) {rax = *(rbx + 8);eax = *rax;if(*rbx >= eax) rbx = *(rbx + 8);else explode_bomb();}rsp += 80;
}

翻译成c风格都这么长,怪不得汇编这么难懂了。

总结:

能够做完这个实验真的很不容易,但是通过对汇编代码的阅读也学到了很多东西,可以看到从汇编层面上来说如何构造一个复杂数据结构,如何构造链表,可以看到递归是如何实现的,可以看到switch case的跳转表如何实现,还可以学到一些放置栈溢出的手段(比如金丝雀等等),最最基本的就是如何通过gdb对程序进行简单调试(虽然基本不涉及分支等方面),深深体验到了从看侯捷老师的视频中学到的:“源码面前,没有困难”这句话的真正含义,也希望自己越来越好,大家都越来越强!

本文标签: CSAPP lab2 经典的bomblab二次学习