SSD6 Exercise 1: An In-Depth Analysis

Problem

Take Assessment: Exercise 1: Decoding Lab

Decoding Lab: Understanding a Secret Message

You have just intercepted an encoded message. The message is a sequence of bits which reads as follows in hexadecimal:

6363636363636363724646636F6D6F72

466D203A65693A7243646E206F54540A

5920453A54756F0A6F6F470A21643A6F

594E2020206F776F797275744563200A

6F786F686E6963736C206765796C656B

2C3365737420346E20216F74726F5966

7565636F202061206C61676374206C6F

20206F74747865656561727632727463

6E617920680A64746F69766120646E69

21687467630020656C6C786178742078

6578206F727478787863617800783174

You have no idea how to decode it, but you know that your grade depends on it, so you are willing to do anything to extract the message. Fortunately, one of your many agents on the field has stolen the source code for the decoder. This agent (007) has put the code and the message in the file secret.cpp, which you can download from the laboratory of your technical staff (Q).

Q has noticed that the decoder takes four integers as arguments. Executing the decoder with various arguments seems to either crash the program or produce unintelligible output. It seems that the correct four integers have to be chosen in order for the program to produce the decoded message. These four integers are the “secret keys.”

007 has been unable to find the keys, but from the desk of the encrypting personnel he was able to cunningly retrieve the first five characters of the unencoded message. These characters are:
From:

Assignment

Your assignment is to decode the message, and find the keys.

Reminders

This exercise is not extremely difficult. However, the strategy of trying things until something works will be ineffective. Try to understand the material in the course, particularly the following:

• Memory contains nothing but bits. Bits are interpreted as integers, characters, or instructions by the compiler, but they have no intrinsic type in memory.
• The compiler can be strong-armed into interpreting integers as characters, or even as instructions, and vice versa.
• Every group of 8 bits (a byte) has an address.
• A pointer in C is merely a stored memory address.
• The activation records for each function call are all together in memory, and they are organized in a stack that grows downwards and shrinks upwards on function calls and returns respectively.
• The return address of one function as well as the addresses of all of its local variables are allocated within one activation record.

Strategy

The designers of this decoder weren’t very good. They made it possible for us to attack the keys in two independent parts. Try to break the first two keys first, and do not try to break the third and fourth keys until you have succeeded with the first two.

You can do the first part by specifying only two integer arguments when you execute the decoder. If you get the first and second keys right, a message that starts with From: will appear. This message is not the true message, but a decoy. It is useful, however, to let you know that you have indeed broken the first two keys.

In breaking the first two keys, realize that the function process_keys12 must be somehow changing the value of the dummy variable. This must be so, because the variables start and stride control the extraction of the message, and they are calculated from the value of dummy.

In breaking the third and fourth keys, try to get the code to invoke extract_message2 instead of extract_message1. This modification must somehow be controlled from within the function process_keys34.

Files

When you are done, write a brief report that includes at least the following:

1. The secret message.
2. The secret keys.
3. One paragraph describing, in your own prose, what process_keys12 does. For example, you might say that it modifies a specific program variable.
4. The meaning of the first two keys in terms of variables and addresses in the decoder program. For example, you might describe key2 by saying that its X-Y bits contain the value to which variable start is set. Or you might describe key1 by saying, for example, that it must be set equal to the number of memory addresses separating the address of two specific variables. These are only examples.
5. One paragraph describing, in your own prose, what process_keys34 does.
6. One paragraph describing the line of source code that is executed when the first call to process_keys34 returns.
7. The meaning of the third and fourth keys in terms of variables and addresses in the decoder program.

Be precise, clear, and brief in each of the points above. Your report should not, in any case, be longer than one page. Do not get frustrated if this takes a little longer than you expected: brief and clear text often requires more time to write than rambling prose.

Your teacher can tell you what word processors you may use to write your report. Chances are that you can write your report in a number of formats, and for simplicity’s sake, you might even want to write it using Notepad.

Enjoy!

Solution Approach

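Reading the problem statement gives us the following hints:

  • Solve key1 and key2 first, then move on to key3 and key4.
  • Once key1 and key2 are correct, the output starts with From:.
  • The function process_keys12 changes the value of the variable dummy ("the function process_keys12 must be somehow changing the value of the dummy variable").
  • When solving key3 and key4, try to make the code call extract_message2 instead of extract_message1 ("In breaking the third and fourth keys, try to get the code to invoke extract_message2 instead of extract_message1").
  • That switch is controlled from inside process_keys34 ("This modification must somehow be controlled from within the function process_keys34").

These are the hints the problem itself provides, so the solution should stay as close to them as possible.

Now let's analyze the code, starting with the main program.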

int main(int argc, char *argv[])
{
    int dummy = 1;
    int start, stride;
    int key1, key2, key3, key4;
    char * msg1, *msg2;

    key3 = key4 = 0;
    if (argc < 3) {
        usage_and_exit(argv[0]);
    }
    key1 = strtol(argv[1], NULL, 0);
    key2 = strtol(argv[2], NULL, 0);
    if (argc > 3) key3 = strtol(argv[3], NULL, 0);
    if (argc > 4) key4 = strtol(argv[4], NULL, 0);

    process_keys12(&key1, &key2);

    start = (int)(*(((char *)&dummy)));
    stride = (int)(*(((char *)&dummy) + 1));

    if (key3 != 0 && key4 != 0) {
        process_keys34(&key3, &key4);
    }

    msg1 = extract_message1(start, stride);

    if (*msg1 == '\0') {
        process_keys34(&key3, &key4);
        msg2 = extract_message2(start, stride);
        printf("%s\n", msg2);
    }
    else {
        printf("%s\n", msg1);
    }

    return 0;
}

Since key1 and key2 have to be solved first, here is only the code that runs when exactly two command-line arguments are given:

int main(int argc, char *argv[])
{
    int dummy = 1;
    int start, stride;
    int key1, key2, key3, key4;
    char * msg1, *msg2;
    key1 = strtol(argv[1], NULL, 0);
    key2 = strtol(argv[2], NULL, 0);
    process_keys12(&key1, &key2);
    start = (int)(*(((char *)&dummy)));
    stride = (int)(*(((char *)&dummy) + 1));
    msg1 = extract_message1(start, stride);
    printf("%s\n", msg1);
    return 0;
}

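Let's get our bearings. Whether key1 and key2 are correct is judged by what is finally printed to the console, and the printed msg1 is the return value of extract_message1(start, stride). So what does extract_message1(start, stride) actually do?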

char * extract_message1(int start, int stride) {
    int i, j, k;
    int done = 0;
    for (i = 0, j = start + 1; !done; j++) {
        for (k = 1; k < stride; k++, j++, i++) {
            if (*(((char *)data) + j) == '\0') {
                done = 1;
                break;
            }
            message[i] = *(((char *)data) + j);
        }
    }
    message[i] = '\0';
    return message;
}

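There is not much code; it mainly tests your understanding of pointers, addresses, and reinterpreting int as char.
The function treats the int array data as a sequence of bytes (each int yields 4 characters), copies some of those characters into the array message, and stops when it reads a '\0' character.
What the parameters do:

  • start: reading begins at index start+1 of the resulting char array.
  • stride: when stride > 1, read stride-1 characters, skip one, and repeat. When stride <= 1, the inner loop never runs, so nothing is copied and, because done is never set, the outer loop does not terminate normally.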

With that settled, we know that the result of extract_message1 is determined by start and stride, and that in the code start and stride are in turn derived from dummy.

start = (int)(*(((char *)&dummy)));
stride = (int)(*(((char *)&dummy) + 1));

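Here:
start is the value of the byte at the lowest address of dummy.
stride is the value of the byte at the second-lowest address of dummy.

So the call process_keys12(&key1, &key2) must be what changes the value of dummy.
Let's look at the code of process_keys12: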

void process_keys12(int * key1, int * key2) {

    *((int *)(key1 + *key1)) = *key2;
}

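Clearly, (int *)(key1 + *key1) has to equal the address of dummy.
Note: inside the function, key1 is the address of main's key1, and *key1 is the value of main's key1.
In pointer terms, main's key1 must therefore equal &dummy - &key1, measured in int-sized steps. If the compiler lays the locals out contiguously in declaration order, the value is easy to read off: start and stride sit between dummy and key1, so the two are 3 ints apart and key1 = 3.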

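Alternatively, look at the addresses of the two variables in a debugger, take the difference, and divide by 4 (the size of an int) to get the same key1.

Note: when debugging in VS, I found that these consecutively declared int variables were not adjacent in memory. They were 12 bytes apart rather than 4. I did not work out what the extra 8 bytes are used for; it appears to be a consequence of VS's default compiler options, which can be changed as described in this post: http://blog.csdn.net/pngfiwang/article/details/49624845. You can also leave the options alone, in which case key1 has to be 9 (3 variables × 12 bytes / 4 bytes per int).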

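Next comes key2. We know that key2 becomes the value of dummy, but we can only say what dummy has to look like once start and stride are known. So we need to work out which start and stride make the output begin with From:.
I wrote a short program that prints the data array as a sequence of characters so that we can look at it.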


#include <stdio.h>

int main()
{
    int data[] = {
    0x63636363, 0x63636363, 0x72464663, 0x6F6D6F72,
    0x466D203A, 0x65693A72, 0x43646E20, 0x6F54540A,
    0x5920453A, 0x54756F0A, 0x6F6F470A, 0x21643A6F,
    0x594E2020, 0x206F776F, 0x79727574, 0x4563200A,
    0x6F786F68, 0x6E696373, 0x6C206765, 0x796C656B,
    0x2C336573, 0x7420346E, 0x20216F74, 0x726F5966,
    0x7565636F, 0x20206120, 0x6C616763, 0x74206C6F,
    0x20206F74, 0x74786565, 0x65617276, 0x32727463,
    0x6E617920, 0x680A6474, 0x6F697661, 0x20646E69,
    0x21687467, 0x63002065, 0x6C6C7861, 0x78742078,
    0x6578206F, 0x72747878, 0x78636178, 0x00783174
    };

    /* Print every byte of data as a character (44 ints * 4 bytes each). */
    for(int i=0;i<44*4;i++){
        char a=*((((char *)(data))+i));
        printf("%c",a);
    }

    return 0;
}

Output:

cccccccccFFrromo: mFr:ie ndC
TTo:E Y
ouT
Gooo:d!  NYowo tury
 cEhoxoscineg lkelyse3,n4 tto! fYoroceu a  cgalol tto  eextvraectr2 yantd
havioind gth!e

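Given what extract_message1 does and what its parameters mean, it is easy to see that start = 9 and stride = 3 fit: read the two characters Fr, skip one, read om, skip one, read the colon and the space after it, and so on.

So now we know what dummy has to look like:
highest address ...... lowest address
xx  xx  03  09

In other words, dummy only needs its two lowest-addressed bytes to be 09 and 03; the other two bytes can be anything. On a little-endian machine those two bytes are the low 16 bits of the integer, so any key2 whose low 16 bits equal 0x0309 works. Many values therefore qualify (65,536 of them for a 32-bit int), for example 777, 66313, and 131849.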

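PS: while debugging the key1 and key2 step, I found that the default capacity of message (100) is not enough.

// During debugging a size of 100 turned out to be too small and caused an error, so it was changed to 200
char message[200];

The output is as follows: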

From: Friend
To: You
Good! Now try choosing keys3,4 to force a call to extract2 and
avoid the call to extract1

Next we need to crack key3 and key4.
Here is the relevant code:

if (key3 != 0 && key4 != 0) {
        process_keys34(&key3, &key4);
    }

    msg1 = extract_message1(start, stride);

    if (*msg1 == '\0') {
        process_keys34(&key3, &key4);
        msg2 = extract_message2(start, stride);
        printf("%s\n", msg2);
    }
    else {
        printf("%s\n", msg1);
    }

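First, it is clear that the following two statements must end up being executed, because the only extract_message1 output that starts with From: is the decoy, not the real message.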

msg2 = extract_message2(start, stride);
printf("%s\n", msg2);

So, as tradition demands (said with a perfectly straight face), let's first see what on earth extract_message2 does.

char * extract_message2(int start, int stride) {
    int i, j;

    for (i = 0, j = start; 
         *(((char *) data) + j) != '\0';
         i++, j += stride) 
         {
             message[i] = *(((char *) data) + j);
         }
    message[i] = '\0';
    return message;
}

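It is a bit more straightforward than extract_message1:
start: begin at the element with index start in the char array.
stride: read one character out of every stride, that is, skip stride-1 characters between reads.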

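Since extract_message2(start, stride) is executed, does that necessarily mean the if condition was true?
Don't panic; let's work through it.

First, let's make a guess at what process_keys34(&key3, &key4) does: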

void process_keys34 (int * key3, int * key4) {

    *(((int *)&key3) + *key3) += *key4;
}

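It takes the address of the formal parameter key3, casts it to an int pointer, and adds the offset *key3, which moves it to some other location in memory. It then adds *key4 to the value stored there. Odd enough that I confused myself while explaining it...

Could it be modifying start or stride?
The answer is: no.
Bear with me while I explain why.
Suppose it modified start. To get *msg1 == '\0', extraction would have to begin right on a '\0' byte. The ASCII code of '\0' is 0, and a glance at the data array shows zero bytes in only two elements: 0x63002065, and the last element 0x00783174, whose top byte is the final character of the char array. Starting from such a position, extract_message2 would not give us a meaningful string either.

Now consider stride. The only option would be stride <= 1, and with such a stride extract_message2 certainly cannot produce the result we want.

From this analysis, if the program executed strictly in order we would run into a contradiction.

So the program must not be executing in order. When does it jump, then, and where to? (Forgive my wordiness.)

It is obvious that process_keys34 is what makes the program jump, and where it jumps to is also easy to see: it must land past the process_keys34(&key3, &key4); statement inside the if block, otherwise the program would be stuck in an endless loop.

So how does process_keys34 make the program jump?
A guess: it modifies the function's return address.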

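This follows from how the call stack works (if that is unfamiliar, see http://blog.chinaunix.net/uid-23069658-id-3981406.html). Look at the stack layout first:

[Figure: stack frame layout for a function call]

Applied to this program, it looks like this:
[Figure: stack layout when process_keys34 is called]

From the layout it is easy to see that key3 = -1: on 32-bit x86 the return address is pushed right after the arguments, so it sits one int below the first parameter, at ((int *)&key3) - 1.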

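That leaves only key4, which is actually simple: it is the difference between the return address of the second process_keys34(&key3, &key4); call and the return address of the first one.

While debugging, I opened the VS disassembly window and pulled out the following code: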



if (key3 != 0 && key4 != 0) {
003F1709  cmp         dword ptr [key3],0  
003F170D  je          main+105h (03F1725h)  
003F170F  cmp         dword ptr [key4],0  
003F1713  je          main+105h (03F1725h)  
        process_keys34(&key3, &key4);
003F1715  lea         eax,[key4]  
003F1718  push        eax  
003F1719  lea         ecx,[key3]  
003F171C  push        ecx  
003F171D  call        process_keys34 (03F11B3h)  
003F1722  add         esp,8  
    }

    msg1 = extract_message1(start, stride);
003F1725  mov         eax,dword ptr [stride]  
003F1728  push        eax  
003F1729  mov         ecx,dword ptr [start]  
003F172C  push        ecx  
003F172D  call        extract_message1 (03F1118h)  
003F1732  add         esp,8  
003F1735  mov         dword ptr [msg1],eax  

    if (*msg1 == '\0') {
003F1738  mov         eax,dword ptr [msg1]  
003F173B  movsx       ecx,byte ptr [eax]  
003F173E  test        ecx,ecx  
003F1740  jne         main+159h (03F1779h)  
        process_keys34(&key3, &key4);
003F1742  lea         eax,[key4]  
003F1745  push        eax  
003F1746  lea         ecx,[key3]  
003F1749  push        ecx  
003F174A  call        process_keys34 (03F11B3h)  
003F174F  add         esp,8  

This shows that key4 = 0x003F174F - 0x003F1722 = 0x2D = 45.

That's it; I'm off.
If you spot any mistakes, corrections are welcome (0.0)
