Parsing instructions and bytecode with Androguard


Parsing bytecode is a common requirement. The parsed bytecode can serve many purposes, such as numerical analysis or machine learning.

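What is bytecode? Java introduced the concept of a virtual machine: an abstract, virtual machine that sits between the real machine and the compiler. On every platform this virtual machine offers the compiler the same interface. The compiler only needs to target the virtual machine and generate code the virtual machine understands; an interpreter then converts that code into machine code for the specific system and executes it. In Java, the code that the virtual machine understands is called bytecode (stored in files with the .class extension). It does not target any particular processor, only the virtual machine. Each platform has its own interpreter, but the virtual machine they implement is the same. Java source is compiled into bytecode, and the bytecode is executed by the virtual machine: the virtual machine hands each bytecode instruction to the interpreter, the interpreter translates it into machine code for the specific machine, and that machine runs it. This explains why Java is described as both compiled and interpreted.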

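Why use bytecode? Bytecode goes some way toward solving the low execution efficiency of traditional interpreted languages while keeping their portability. Java programs therefore run fairly efficiently, and because bytecode is not tied to one specific machine, a Java program can run on many different computers without being recompiled.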

The bytecode of every method is stored in the Dalvik executable (DEX) file. Androguard offers three different ways to obtain it.

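Dalvik bytecode is built from 16-bit units, but Androguard displays it in 8-bit units (bytes). Offsets that appear in the bytecode are likewise given in bytes, and all indices are given as byte lengths as well.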

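Because Dalvik is closely related to Java, all integer values are represented as a signed "int" (32-bit) or "long" (64-bit).
Values are shown in decimal or hexadecimal. A hexadecimal value carries the suffix "h", for example "f7a0h", which is "63392" in decimal.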
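
As a small illustration of the notation just described, the sketch below parses a printed value in either form and reinterprets a 32-bit pattern as a signed int. The helper names `parse_value` and `to_signed32` are made up for this example and are not part of Androguard.

```python
def parse_value(text):
    """Parse a printed value: decimal ("63392") or hex with an "h" suffix ("f7a0h")."""
    text = text.strip()
    if text.lower().endswith("h"):
        # Drop the suffix and read the rest as base 16; a leading sign is allowed.
        return int(text[:-1], 16)
    return int(text, 10)


def to_signed32(value):
    """Reinterpret the low 32 bits of value as a signed, Java-style int."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


print(parse_value("f7a0h"))     # 63392
print(parse_value("63392"))     # 63392
print(parse_value("+00dh"))     # 13
print(to_signed32(0xFFFFFFFF))  # -1
```

Signed operands such as "+00dh" show up in the disassembly later in this article, which is why the parser accepts a leading sign.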

Getting the raw bytecode

To get the raw bytes of a method's bytecode, the first step is, as always, to load the APK under test:

ubuntu@ubuntu:~$ androguard analyze /home/ubuntu/Desktop/meeting.apk 
Please be patient, this might take a while.
Found the provided file is of type 'APK'
[INFO    ] androguard.apk: Starting analysis on AndroidManifest.xml
[INFO    ] androguard.apk: APK file was successfully validated!
[INFO    ] androguard.analysis: Adding DEX file version 35
[INFO    ] androguard.analysis: Reading bytecode took : 0min 00s
[INFO    ] androguard.analysis: Adding DEX file version 35
[INFO    ] androguard.analysis: Reading bytecode took : 0min 00s
[INFO    ] androguard.analysis: End of creating cross references (XREF) run time: 0min 00s
Added file to session: SHA256::689673bed0f4d6121a63f3c9fd88efb538ec316561d426120c440d8be89f6256
Loaded APK file...
>>> a
<androguard.core.bytecodes.apk.APK object at 0x7f3b27bb8390>
>>> d
[<androguard.core.bytecodes.dvm.DalvikVMFormat object at 0x7f3b1ae9d978>, <androguard.core.bytecodes.dvm.DalvikVMFormat object at 0x7f3b1ae36b00>]
>>> dx
<analysis.Analysis VMs: 2, Classes: 85, Methods: 340, Strings: 122>

Androguard version 3.4.0a1 started

Then use Python to get the raw byte representation of every method:

In [1]: for method in dx.get_methods():
   ...:     if method.is_external():
   ...:         continue
   ...:     m = method.get_method()
   ...:     if m.get_code():
   ...:         print(m.get_code().get_bc().get_raw())

The output contains a lot of binary data, for example:

bytearray(b'p\x10\t\x00\x00\x00\x0e\x00')
bytearray(b'p\x10\t\x00\x00\x00\x0e\x00')
bytearray(b'p\x10\xf3\x00\x00\x00\x0e\x00')
bytearray(b'\x12\x01i\x01E\x00\x1a\x00C\x01i\x00C\x00\x1a\x00\xa1\x02i\x00F\x00i\x01D\x00\x0e\x00')
bytearray(b'p\x10\x00\x00\x00\x00\x0e\x00')
bytearray(b'\x1d\x02b\x00C\x00\x1a\x01\x08\x00n \x03\x01\x10\x00\n\x008\x00\x1e\x00"\x00G\x00o\x10\x03\x00\x02\x00\x0c\x01q\x10\x06\x01\x01\x00\x0c\x01p \t\x01\x10\x00b\x01C\x00n \x0f\x01\x10\x00\x0c\x00n\x10\x11\x01\x00\x00\x0c\x00i\x00C\x00\x12\x10\x1e\x02\x0f\x00b\x00C\x00\x1a\x01\x08\x00n \xfe\x00\x10\x00\n\x00;\x00\xf5\xff"\x00G\x00o\x10\x03\x00\x02\x00\x0c\x01q\x10\x06\x01\x01\x00\x0c\x01p \t\x01\x10\x00\x1a\x01\x08\x00n \x0f\x01\x10\x00\x0c\x00b\x01C\x00n \x0f\x01\x10\x00\x0c\x00n\x10\x11\x01\x00\x00\x0c\x00i\x00C\x00(\xd4\r\x00\x1e\x02\'\x00')
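
To make the relationship between bytes and 16-bit code units concrete, here is a minimal sketch that decodes the first bytearray above by hand, using only the standard library. It assumes the little-endian byte order of DEX files and the 35c instruction layout (A|G|op BBBB F|E|D|C) from the Dalvik instruction format documentation; the helper `code_units` is a name invented for this example.

```python
import struct

# First method from the output above: a trivial constructor.
raw = bytearray(b'p\x10\t\x00\x00\x00\x0e\x00')


def code_units(data):
    """Split raw bytecode into little-endian 16-bit code units."""
    count = len(data) // 2
    return list(struct.unpack("<%dH" % count, bytes(data[:count * 2])))


units = code_units(raw)
print([hex(u) for u in units])  # ['0x1070', '0x9', '0x0', '0xe']

# Unit 0 of a 35c instruction holds A|G|op: the opcode is the low byte,
# the argument count A is the top nibble.
opcode = units[0] & 0xFF        # 112 (0x70, invoke-direct)
arg_count = units[0] >> 12      # 1
method_index = units[1]         # BBBB, index into the method table
first_register = units[2] & 0xF # C, i.e. v0

# The next instruction starts at code unit 3, which is byte offset 6.
next_byte_offset = 3 * 2
next_opcode = units[3] & 0xFF   # 14 (0x0e, return-void)

print(opcode, arg_count, method_index, first_register)
print(next_byte_offset, next_opcode)
```

The values 112 at byte 0 and 14 at byte 6 are the same ones the disassembler reports for this method in the next section.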

Getting disassembled instructions

The following code prints the disassembled form of every method:

In [2]: for method in dx.get_methods():
   ...:      if method.is_external():
   ...:          continue
   ...:      m = method.get_method()
   ...:      for idx, ins in m.get_instructions_idx():
   ...:          print(idx, ins.get_op_value(), ins.get_name(), ins.get_output())
   ...: 

The output is:

0 112 invoke-direct v0, Ljava/lang/Object;-><init>()V
6 14 return-void 
0 112 invoke-direct v0, Ljava/lang/Object;-><init>()V
6 14 return-void 
0 18 const/4 v1, 0
2 105 sput-object v1, Lcom/wrapper/proxyapplication/WrapperProxyApplication;->shellApp Landroid/app/Application;
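
Since the introduction mentions machine learning as one use of parsed bytecode, here is a hedged sketch of one simple feature: an opcode histogram. To stay self-contained it parses the printed lines shown above rather than calling Androguard; with a live `dx` object you would more likely count `ins.get_name()` directly. The helpers `parse_line` and `opcode_histogram` are names invented for this example.

```python
from collections import Counter

# The six lines of output shown above, copied verbatim.
listing = """\
0 112 invoke-direct v0, Ljava/lang/Object;-><init>()V
6 14 return-void
0 112 invoke-direct v0, Ljava/lang/Object;-><init>()V
6 14 return-void
0 18 const/4 v1, 0
2 105 sput-object v1, Lcom/wrapper/proxyapplication/WrapperProxyApplication;->shellApp Landroid/app/Application;
"""


def parse_line(line):
    """Split one printed line into (byte offset, opcode value, mnemonic, operands)."""
    parts = line.split(None, 3)
    operands = parts[3] if len(parts) > 3 else ""
    return int(parts[0]), int(parts[1]), parts[2], operands


def opcode_histogram(text):
    """Count how often each mnemonic occurs in a printed listing."""
    return Counter(parse_line(l)[2] for l in text.splitlines() if l.strip())


hist = opcode_histogram(listing)
for name, count in sorted(hist.items()):
    print(name, count)
```

A histogram like this discards instruction order, which is acceptable for a first feature vector but loses control-flow information.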

To get the disassembly for a specific class or method name, you can do the following:

In [3]: for m in dx.find_methods("Lcom/tencent/wemeet/app/MyWrapperProxyApplication;"):
   ...:      print(m.full_name)
   ...:      for idx, ins in m.get_method().get_instructions_idx():
   ...:         print(idx, ins.get_op_value(), ins.get_name(), ins.get_output())
   ...:         

The output looks like this:

Lcom/tencent/wemeet/app/MyWrapperProxyApplication; <init> ()V
0 112 invoke-direct v0, Lcom/wrapper/proxyapplication/WrapperProxyApplication;-><init>()V
6 14 return-void 
Lcom/tencent/wemeet/app/MyWrapperProxyApplication; initProxyApplication (Landroid/content/Context;)V
0 110 invoke-virtual v7, Landroid/content/Context;->getApplicationInfo()Landroid/content/pm/ApplicationInfo;
6 12 move-result-object v4
8 84 iget-object v0, v4, Landroid/content/pm/ApplicationInfo;->sourceDir Ljava/lang/String;
12 18 const/4 v1, 0
14 34 new-instance v2, Ljava/util/zip/ZipFile;
18 112 invoke-direct v2, v0, Ljava/util/zip/ZipFile;-><init>(Ljava/lang/String;)V
24 7 move-object v1, v2
26 57 if-nez v1, +00dh
30 113 invoke-static Landroid/os/Process;->myPid()I
36 10 move-result v4
38 113 invoke-static v4, Landroid/os/Process;->killProcess(I)V
44 18 const/4 v4, 0
46 113 invoke-static v4, Ljava/lang/System;->exit(I)V
52 113 invoke-static v7, v1, Lcom/wrapper/proxyapplication/Util;->PrepareSecurefiles(Landroid/content/Context; Ljava/util/zip/ZipFile;)I
58 110 invoke-virtual v1, Ljava/util/zip/ZipFile;->close()V
64 98 sget-object v4, Lcom/wrapper/proxyapplication/Util;->CPUABI Ljava/lang/String;
68 26 const-string v5, "x86"
72 51 if-ne v4, v5, +031h
76 34 new-instance v4, Ljava/lang/StringBuilder;
80 110 invoke-virtual v7, Landroid/content/Context;->getFilesDir()Ljava/io/File;
86 12 move-result-object v5
88 110 invoke-virtual v5, Ljava/io/File;->getAbsolutePath()Ljava/lang/String;
94 12 move-result-object v5
96 113 invoke-static v5, Ljava/lang/String;->valueOf(Ljava/lang/Object;)Ljava/lang/String;
102 12 move-result-object v5
104 112 invoke-direct v4, v5, Ljava/lang/StringBuilder;-><init>(Ljava/lang/String;)V
110 26 const-string v5, "/prodexdir/"
114 110 invoke-virtual v4, v5, Ljava/lang/StringBuilder;->append(Ljava/lang/String;)Ljava/lang/StringBuilder;
120 12 move-result-object v4
122 98 sget-object v5, Lcom/wrapper/proxyapplication/Util;->libname Ljava/lang/String;
126 110 invoke-virtual v4, v5, Ljava/lang/StringBuilder;->append(Ljava/lang/String;)Ljava/lang/StringBuilder;
132 12 move-result-object v4
134 110 invoke-virtual v4, Ljava/lang/StringBuilder;->toString()Ljava/lang/String;
140 12 move-result-object v4
142 113 invoke-static v4, Ljava/lang/System;->load(Ljava/lang/String;)V
148 14 return-void 
150 13 move-exception v3
152 110 invoke-virtual v3, Ljava/io/IOException;->printStackTrace()V
158 40 goto -42h
160 13 move-exception v3
162 110 invoke-virtual v3, Ljava/io/IOException;->printStackTrace()V
168 40 goto -34h
170 98 sget-object v4, Lcom/wrapper/proxyapplication/Util;->libname Ljava/lang/String;
174 113 invoke-static v4, Ljava/lang/System;->loadLibrary(Ljava/lang/String;)V
180 40 goto -10h
Lcom/tencent/wemeet/app/MyWrapperProxyApplication; onCreate ()V
0 111 invoke-super v0, Lcom/wrapper/proxyapplication/WrapperProxyApplication;->onCreate()V
6 14 return-void 
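
Judging from the listing above, the first column is a byte offset, while the printed branch operands (for example "+00dh" or "-42h") appear to count 16-bit code units relative to the branch instruction itself. That is an inference from this output, not something confirmed by the Androguard documentation. Under that assumption, the sketch below computes where each branch in initProxyApplication lands; `branch_target` is a helper invented for this example.

```python
# (byte offset, mnemonic, printed operand) copied from the listing above.
branches = [
    (26, "if-nez", "+00dh"),
    (72, "if-ne", "+031h"),
    (158, "goto", "-42h"),
    (168, "goto", "-34h"),
    (180, "goto", "-10h"),
]


def branch_target(idx, operand):
    """Byte offset of the branch target, assuming the operand counts 16-bit units."""
    units = int(operand.rstrip("h"), 16)  # signed hex, "h" suffix removed
    return idx + 2 * units


targets = [branch_target(idx, operand) for idx, _, operand in branches]
for (idx, name, operand), target in zip(branches, targets):
    print(f"{idx:>3} {name:<7} {operand:<6} -> {target}")
```

All five results (52, 170, 26, 64, 148) coincide with the start of an instruction in the listing, which supports the assumption: for instance, the last goto lands on the return-void at offset 148.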

Getting processed bytecode from the decompiler

In [9]: for method in dx.get_methods():
   ...:     if method.is_external():
   ...:         continue
   ...:     m = method.get_method()
   ...:     print(m.source())
   ...:     

The code above prints the decompiled source of every method. Sample output:

	private declared_synchronized boolean Fixappname()
    {
        try {
            if (!com.wrapper.proxyapplication.WrapperProxyApplication.className.startsWith(.)) {
                if (com.wrapper.proxyapplication.WrapperProxyApplication.className.indexOf(.) < 0) {
                    com.wrapper.proxyapplication.WrapperProxyApplication.className = new StringBuilder(String.valueOf(super.getPackageName())).append(.).append(com.wrapper.proxyapplication.WrapperProxyApplication.className).toString();
                }
            } else {
                com.wrapper.proxyapplication.WrapperProxyApplication.className = new StringBuilder(String.valueOf(super.getPackageName())).append(com.wrapper.proxyapplication.WrapperProxyApplication.className).toString();
            }
        } catch (String v0_11) {
            throw v0_11;
        }
        return 1;
    }

You can also use DAD to generate an abstract syntax tree (AST), which makes it easy to analyze the code itself. This is done as follows:

from pprint import pprint
from androguard.decompiler.dad.decompile import DvMethod

for method in dx.get_methods():
    if method.is_external():
        continue
    dv = DvMethod(method)
    dv.process(doAST=True)
    pprint(dv.get_ast())

The output has the following form:

{'body': ['BlockStatement',
          None,
          [['LocalDeclarationStatement',
            None,
            [['TypeName', ('.int', 0)], ['Local', 'v1_0']]],
           ['LocalDeclarationStatement',
            ['ClassInstanceCreation',
             (java/io/File, <init>, (Ljava/lang/String;)V),
             [['Local', 'p5']],
             ['TypeName', (java/io/File, 0)]],
            [['TypeName', (java/io/File, 0)], ['Local', 'v0_1']]],
           ['IfStatement',
            None,
            ['BinaryInfix',
             [['Parenthesis',
               [['Unary',
                 [['MethodInvocation',
                   [['Local', 'v0_1']],
                   (java/io/File, exists, ()Z),
                   exists,
                   True]],
                 '!',
                 False]]],
              ['Parenthesis',
               [['BinaryInfix',
                 [['MethodInvocation',
                   [['Local', 'v0_1']],
                   (java/io/File, length, ()J),
                   length,
                   True],
                  ['Local', 'p6']],
                 '!=']]]],
             '||'],
            [['BlockStatement',
              None,
              [['ExpressionStatement',
                ['Assignment',
                 [['Local', 'v1_0'], ['Literal', '0', ('.int', 0)]],
                 '']]]],
             ['BlockStatement',
              None,
              [['ExpressionStatement',
                ['Assignment',
                 [['Local', 'v1_0'], ['Literal', '1', ('.int', 0)]],
                 '']]]]]],
           ['ReturnStatement', ['Local', 'v1_0']]]],
 'comments': [],
 'flags': ['private', 'static'],
 'params': [[['TypeName', (java/lang/String, 0)], ['Local', 'p5']],
            [['TypeName', ('.long', 0)], ['Local', 'p6']]],
 'ret': ['TypeName', ('.boolean', 0)],
 'triple': (com/wrapper/proxyapplication/Util,
            isFileValid,
            (Ljava/lang/String;J)Z)}

The AST above is equivalent to the following source code:

private static boolean isFileValid(String p5, long p6)
    {
        int v1_0;
        java.io.File v0_1 = new java.io.File(p5);
        if ((!v0_1.exists()) || (v0_1.length() != p6)) {
            v1_0 = 0;
        } else {
            v1_0 = 1;
        }
        return v1_0;
    }
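
Because the AST is made of nested lists whose first element names the node type, a short recursive walk is enough to query it. The sketch below is a minimal example on a hand-copied fragment (the condition of the if statement in isFileValid). It assumes the triples and names can be treated as plain strings, which may not hold exactly for the objects Androguard returns; `find_nodes` is a helper invented for this example.

```python
def find_nodes(node, kind):
    """Yield every list in the tree whose first element equals kind, depth first."""
    if isinstance(node, dict):
        for value in node.values():
            yield from find_nodes(value, kind)
    elif isinstance(node, (list, tuple)):
        if isinstance(node, list) and node and node[0] == kind:
            yield node
        for child in node:
            yield from find_nodes(child, kind)


# Condition of the IfStatement shown above, with triples written as string tuples.
condition = [
    'BinaryInfix',
    [['Parenthesis',
      [['Unary',
        [['MethodInvocation',
          [['Local', 'v0_1']],
          ('java/io/File', 'exists', '()Z'),
          'exists',
          True]],
        '!',
        False]]],
     ['Parenthesis',
      [['BinaryInfix',
        [['MethodInvocation',
          [['Local', 'v0_1']],
          ('java/io/File', 'length', '()J'),
          'length',
          True],
         ['Local', 'p6']],
        '!=']]]],
    '||',
]

# In a MethodInvocation node, index 3 holds the method name.
called = [n[3] for n in find_nodes(condition, 'MethodInvocation')]
local_names = sorted({n[1] for n in find_nodes(condition, 'Local')})

print(called)       # ['exists', 'length']
print(local_names)  # ['p6', 'v0_1']
```

The same walk works on the full dictionary returned by get_ast(), since the function also descends into dict values.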
