Learn a little Java bytecode every day


Java bytecode

For class methods (i.e. static methods) the method parameters start from slot zero; for instance methods, slot zero is reserved for this.

A local variable can be one of the eight primitive types, a reference, or a returnAddress:

  • boolean
  • byte
  • char
  • long
  • short
  • int
  • float
  • double
  • reference
  • returnAddress

All types take a single slot in the local variable array except long and double, which both take two consecutive slots because these types are double width (64-bit instead of 32-bit).
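The slot rules above can be sketched in a few lines of Java. This is only an illustration: the class LocalSlots and its methods are hypothetical helpers, not part of any JDK API.

```java
import java.util.Arrays;

final class LocalSlots {
    /** Number of local variable slots a value of the given type occupies. */
    static int width(Class<?> type) {
        return (type == long.class || type == double.class) ? 2 : 1;
    }

    /** Slot index of each parameter, in declaration order. */
    static int[] parameterSlots(boolean isStatic, Class<?>... parameterTypes) {
        int[] slots = new int[parameterTypes.length];
        int next = isStatic ? 0 : 1;   // slot 0 holds `this` in instance methods
        for (int i = 0; i < parameterTypes.length; i++) {
            slots[i] = next;
            next += width(parameterTypes[i]);   // long and double skip an extra slot
        }
        return slots;
    }

    public static void main(String[] args) {
        // Instance method m(int, long, Object): int -> 1, long -> 2 and 3, Object -> 4
        System.out.println(Arrays.toString(
                parameterSlots(false, int.class, long.class, Object.class)));
    }
}
```

For the same parameter list on a static method every index shifts down by one, because no slot is reserved for this.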


The value of the new variable is then stored into the local variable array in the correct slot. If the variable is not a primitive value then the local variable slot only stores a reference. The reference points to the object stored in the heap.


For example:

int i = 5;

Is compiled to:

0: bipush      5

2: istore_0


bipush is used to push a byte onto the operand stack as an integer; in this case 5 is pushed onto the operand stack.


istore_0 is one of a group of opcodes with the format istore_<n>; they all store an integer into a local variable. The <n> refers to the location in the local variable array that is being stored to and can only be 0, 1, 2 or 3 (istore_0, istore_1, istore_2 or istore_3). For locations higher than 3 another opcode, istore, is used; it takes an operand giving the location in the local variable array.
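As a rough sketch of how a compiler might choose between these forms, the hypothetical helper below maps an int constant and a slot number to a mnemonic. It assumes the usual javac behaviour, which emits the one-byte iconst_<n> form for values from -1 to 5 and falls back to bipush, sipush or ldc for larger ones; the class and method names are made up for illustration.

```java
final class IntOpcodes {
    /** Mnemonic a compiler would typically pick to push an int constant. */
    static String push(int value) {
        if (value >= -1 && value <= 5) {
            return value == -1 ? "iconst_m1" : "iconst_" + value;
        }
        if (value >= Byte.MIN_VALUE && value <= Byte.MAX_VALUE) {
            return "bipush " + value;            // one-byte operand
        }
        if (value >= Short.MIN_VALUE && value <= Short.MAX_VALUE) {
            return "sipush " + value;            // two-byte operand
        }
        return "ldc";                            // value comes from the constant pool
    }

    /** Mnemonic for storing an int into the given local variable slot. */
    static String store(int slot) {
        if (slot < 0) {
            throw new IllegalArgumentException("negative slot: " + slot);
        }
        return slot <= 3 ? "istore_" + slot : "istore " + slot;
    }

    public static void main(String[] args) {
        System.out.println(push(100) + " / " + store(0));
        System.out.println(push(5) + " / " + store(4));
    }
}
```

The short forms exist purely to save space: istore_0 is a single byte, while istore 4 needs an opcode byte plus an operand byte.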




A field (or class variable) is stored on the heap as part of a class instance (or object). Information about the field is added into the field_info array in the class file as shown below.



ClassFile {
    u4 magic;                       // magic number (0xCAFEBABE)
    u2 minor_version;
    u2 major_version;
    u2 constant_pool_count;
    cp_info constant_pool[constant_pool_count - 1];
    u2 access_flags;                // class access flags, e.g. ACC_PUBLIC, ACC_FINAL, ACC_ABSTRACT
    u2 this_class;                  // the current class
    u2 super_class;                 // the superclass
    u2 interfaces_count;            // number of interfaces
    u2 interfaces[interfaces_count];
    u2 fields_count;                // number of fields
    field_info fields[fields_count];
    u2 methods_count;               // number of methods
    method_info methods[methods_count];
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}
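To make the layout concrete, here is a minimal sketch that reads the first four items of the structure (magic, minor_version, major_version, constant_pool_count) from a real class file. The class ClassFileHeader is a hypothetical helper; it assumes the class is a top-level class whose .class file can be loaded as a resource next to it, which holds for java.lang.Object.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class ClassFileHeader {
    /** Returns { magic, minor_version, major_version, constant_pool_count }. */
    static int[] headerOf(Class<?> type) {
        String resource = type.getSimpleName() + ".class";   // assumes a top-level class
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalArgumentException("no class file for " + type);
            }
            DataInputStream data = new DataInputStream(in);  // class files are big-endian
            int magic = data.readInt();                      // u4 magic
            int minor = data.readUnsignedShort();            // u2 minor_version
            int major = data.readUnsignedShort();            // u2 major_version
            int constantPoolCount = data.readUnsignedShort();// u2 constant_pool_count
            return new int[] { magic, minor, major, constantPoolCount };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] h = headerOf(Object.class);
        System.out.printf("magic=0x%08X minor=%d major=%d constant_pool_count=%d%n",
                h[0], h[1], h[2], h[3]);
    }
}
```

The magic value is always 0xCAFEBABE; the version numbers and the constant pool size depend on the JDK that compiled the class.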

The code that initializes an instance field is automatically added to the body of the constructor (<init>).

public class SimpleClass {


    public int simpleField = 100;


}

An extra section appears when you run javap demonstrating the field added to the field_info array:

public int simpleField;

    Signature: I

    flags: ACC_PUBLIC

The byte code for the initialization is added into the constructor (offsets 4 to 7 below), as follows:

public SimpleClass();

  Signature: ()V

  flags: ACC_PUBLIC

  Code:

    stack=2, locals=1, args_size=1

       0: aload_0

       1: invokespecial #1                  // Method java/lang/Object."<init>":()V

       4: aload_0

       5: bipush        100

       7: putfield      #2                  // Field simpleField:I

      10: return
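One visible consequence of this ordering is that the field initializer runs only after the invokespecial call to the superclass constructor has returned. The sketch below illustrates this; the classes Base and InitOrder are invented for the example and are not part of the original code.

```java
class Base {
    final int seenDuringSuperConstructor;

    Base() {
        // Runs before the subclass field initializers have executed.
        seenDuringSuperConstructor = read();
    }

    int read() {
        return -1;
    }
}

final class InitOrder extends Base {
    public int simpleField = 100;   // compiled into InitOrder.<init>, after super()

    @Override
    int read() {
        return simpleField;         // still the default value 0 while Base() is running
    }

    public static void main(String[] args) {
        InitOrder o = new InitOrder();
        System.out.println(o.seenDuringSuperConstructor + " then " + o.simpleField);
        // prints "0 then 100"
    }
}
```

This is why calling overridable methods from a constructor is usually discouraged: the subclass state has not been initialized yet.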


The data referenced by the byte code is stored in the constant pool:

Constant pool:

   #1 = Methodref          #4.#16         //  java/lang/Object."<init>":()V

   #2 = Fieldref           #3.#17         //  SimpleClass.simpleField:I

   #3 = Class              #13            //  SimpleClass

   #4 = Class              #19            //  java/lang/Object

   #5 = Utf8               simpleField

   #6 = Utf8               I

   #7 = Utf8               <init>

   #8 = Utf8               ()V

   #9 = Utf8               Code

  #10 = Utf8               LineNumberTable

  #11 = Utf8               LocalVariableTable

  #12 = Utf8               this

  #13 = Utf8               SimpleClass

  #14 = Utf8               SourceFile

  #15 = Utf8               SimpleClass.java

  #16 = NameAndType        #7:#8          //  "<init>":()V

  #17 = NameAndType        #5:#6          //  simpleField:I

  #18 = Utf8               LSimpleClass;

  #19 = Utf8               java/lang/Object

Defining constant (final) fields

For example:

public class SimpleClass {


    public final int simpleField = 100;


}

The field description is augmented with ACC_FINAL:

public final int simpleField;

    Signature: I

    flags: ACC_PUBLIC, ACC_FINAL

    ConstantValue: int 100

The initialization in the constructor is however unaffected:

4: aload_0

5: bipush        100

7: putfield      #2       


Static Variables

Static variables are initialized in the class initializer, <clinit>, rather than in the constructor.

A static class variable with the static modifier is flagged as ACC_STATIC in the class file as follows:

public static int simpleField;

    Signature: I

    flags: ACC_PUBLIC, ACC_STATIC

The byte code for initialization of static variables is not found in the instance constructor <init>. Instead static fields are initialized as part of the class initializer <clinit>, using the putstatic opcode instead of the putfield opcode.

static {};

  Signature: ()V

  flags: ACC_STATIC

  Code:

    stack=1, locals=0, args_size=0

       0: bipush         100

       2: putstatic      #2                  // Field simpleField:I

       5: return
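The class initializer runs once, the first time the class is actively used. A minimal sketch of that behaviour follows; StaticInitDemo, Holder and LOG are names made up for the illustration.

```java
final class StaticInitDemo {
    /** Records when the nested class initializer runs. */
    static final StringBuilder LOG = new StringBuilder();

    static final class Holder {
        static int simpleField = 100;          // becomes bipush 100 / putstatic in <clinit>

        static {
            LOG.append("clinit;");             // compiled into the same <clinit>
        }
    }

    static int read() {
        return Holder.simpleField;             // first use triggers Holder's <clinit>
    }

    public static void main(String[] args) {
        System.out.println(read() + " " + read() + " " + LOG);
        // prints "100 100 clinit;"
    }
}
```

Both the field initializer and the static block end up in one <clinit> method, in source order, and the second call to read() does not run it again.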


Conditional statements

if-else

The following code example shows a simple if-else comparing two integer parameters.

public int greaterThen(int intOne, int intTwo) {

    if (intOne > intTwo) {

        return 0;

    } else {

        return 1;

    }

}

This method results in the following byte code:

0: iload_1

1: iload_2

2: if_icmple     7

5: iconst_0

6: ireturn

7: iconst_1

8: ireturn

First the two parameters are loaded onto the operand stack using iload_1 and iload_2. if_icmple then compares the top two values on the operand stack. This opcode branches to byte code 7 if intOne is less than or equal to intTwo.


A slightly more complex example:

public int greaterThen(float floatOne, float floatTwo) {

    int result;

    if (floatOne > floatTwo) {

        result = 1;

    } else {

        result = 2;

    }

    return result;

}

This method results in the following byte code:

 0: fload_1

 1: fload_2

 2: fcmpl

 3: ifle          11

 6: iconst_1

 7: istore_3

 8: goto          13

11: iconst_2

12: istore_3

13: iload_3

14: ireturn


fcmpl is first used to compare floatOne and floatTwo and push the result onto the operand stack as follows:

  • floatOne > floatTwo –> 1
  • floatOne = floatTwo –> 0
  • floatOne < floatTwo –> -1
  • floatOne or floatTwo = NaN –> -1

Next ifle is used to branch to byte code 11 if the result from fcmpl is <= 0.

iload_3 is then used to push the result stored in local variable slot 3 onto the top of the operand stack so that it can be returned by the ireturn instruction.


Comparison operations

if_icmp<cond>

        eq

        ne

        lt

        le

        gt

        ge

This group of opcodes is used to compare the top two integers on the operand stack and branch to a new byte code location. The <cond> can be:

  • eq - equal
  • ne - not equal
  • lt - less than
  • le - less than or equal
  • gt - greater than
  • ge - greater than or equal

if_acmp<cond>

        eq

        ne

These two opcodes are used to test whether two references are equal (eq) or not equal (ne) and branch to a new byte code location as specified by the operand.

ifnonnull

ifnull

These two opcodes are used to test whether a reference is null or not null and branch to a new byte code location as specified by the operand.

lcmp

This opcode is used to compare the top two long values on the operand stack and push an int value onto the operand stack as follows:

  • if value1 > value2 –> push 1
  • if value1 = value2 –> push 0
  • if value1 < value2 –> push -1

fcmp<cond>

     l

     g

dcmp<cond>

     l

     g

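This group of opcodes is used to compare two float or double values and push a value onto the operand stack as follows (the l and g variants differ only when either value is NaN: fcmpl and dcmpl push -1, fcmpg and dcmpg push 1):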

  • if value1 > value2 –> push 1
  • if value1 = value2 –> push 0
  • if value1 < value2 –> push -1
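The difference between the l and g variants only shows up with NaN. The sketch below models both opcodes in plain Java and replays the fcmpl / ifle sequence from the float example earlier; the class FloatCompare and the Java methods named fcmpl and fcmpg are illustrative stand-ins for the opcodes, not JDK methods.

```java
final class FloatCompare {
    /** Models fcmpl: NaN yields -1. */
    static int fcmpl(float value1, float value2) {
        if (value1 > value2) return 1;
        if (value1 == value2) return 0;
        return -1;   // value1 < value2, or at least one operand is NaN
    }

    /** Models fcmpg: NaN yields 1. */
    static int fcmpg(float value1, float value2) {
        if (value1 < value2) return -1;
        if (value1 == value2) return 0;
        return 1;    // value1 > value2, or at least one operand is NaN
    }

    /** Replays "fcmpl; ifle 11" from the greaterThen(float, float) listing. */
    static int greaterThen(float floatOne, float floatTwo) {
        return fcmpl(floatOne, floatTwo) <= 0 ? 2 : 1;
    }

    public static void main(String[] args) {
        System.out.println(fcmpl(Float.NaN, 1f) + " " + fcmpg(Float.NaN, 1f));
        // prints "-1 1"
    }
}
```

Choosing fcmpl for a > comparison means a NaN operand takes the ifle branch, so floatOne > floatTwo is false for NaN, as Java requires.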


instanceof

This opcode pushes an int result of 1 onto the operand stack if the object at the top of the operand stack is an instance of the class specified. The operand for this opcode is used to specify the class by providing an index into the constant pool. If the object is null or not an instance of the specified class then the int result 0 is added to the operand stack.
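A small Java sketch of the same rule, using a hypothetical helper that returns the int the opcode would push:

```java
final class InstanceOfDemo {
    /** Returns 1 if ref is a non-null String, otherwise 0, mirroring the instanceof opcode. */
    static int instanceOfString(Object ref) {
        return (ref instanceof String) ? 1 : 0;
    }

    public static void main(String[] args) {
        System.out.println(instanceOfString("text"));   // prints 1
        System.out.println(instanceOfString(42));       // prints 0 (an Integer)
        System.out.println(instanceOfString(null));     // prints 0 (null is never an instance)
    }
}
```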


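switch statements: before JDK 7, switch only supported types that can be promoted to int, such as byte, char, short and int (enums have also been allowed since Java 5). String is supported from JDK 7 onwards.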

public int simpleSwitch(int intOne) {

    switch (intOne) {

        case 0:

            return 3;

        case 1:

            return 2;

        case 4:

            return 1;

        default:

            return -1;

    }

}

This produces the following byte code:

 0: iload_1

 1: tableswitch   {

         default: 42

             min: 0     // the compiler works out the smallest and largest case values; any value outside min..max jumps straight to default

             max: 4

               0: 36

               1: 38

               2: 42      // tableswitch also has entries for 2 and 3; they are not case labels in the Java code, so both point to the default block

               3: 42

               4: 40

    }

36: iconst_3

37: ireturn

38: iconst_2

39: ireturn

40: iconst_1

41: ireturn

42: iconst_m1

43: ireturn


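When the case values are sparse, a tableswitch would waste too much space, so lookupswitch is used to find the branch instead. It is generally slower than tableswitch because the matching key has to be searched for rather than indexed directly.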


public int simpleSwitch(int intOne) {
    switch (intOne) {
        case 10:
            return 1;
        case 20:
            return 2;
        case 30:
            return 3;
        default:
            return -1;
    }
}

This produces the following byte code:

 0: iload_1
 1: lookupswitch  {
         default: 42
           count: 3
              10: 36
              20: 38
              30: 40
    }
36: iconst_1
37: ireturn
38: iconst_2
39: ireturn
40: iconst_3
41: ireturn
42: iconst_m1
43: ireturn
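The two dispatch strategies can be modelled in Java as an array index versus a search over sorted keys. This is only a sketch of the idea: SwitchDispatch is a hypothetical class, and a real JVM is free to implement lookupswitch differently (the keys are stored sorted, which makes a binary search possible).

```java
import java.util.Arrays;

final class SwitchDispatch {
    /** tableswitch: one range check, then a direct index into the jump table. */
    static int tableSwitch(int key, int min, int[] targets, int defaultTarget) {
        long offset = (long) key - min;              // long avoids int overflow
        if (offset < 0 || offset >= targets.length) {
            return defaultTarget;
        }
        return targets[(int) offset];
    }

    /** lookupswitch: search the sorted match keys for the value. */
    static int lookupSwitch(int key, int[] sortedKeys, int[] targets, int defaultTarget) {
        int index = Arrays.binarySearch(sortedKeys, key);
        return index >= 0 ? targets[index] : defaultTarget;
    }

    public static void main(String[] args) {
        // Jump tables copied from the two listings above.
        int[] table = { 36, 38, 42, 42, 40 };        // entries for 0..4
        System.out.println(tableSwitch(4, 0, table, 42));                                   // prints 40
        System.out.println(lookupSwitch(20, new int[] { 10, 20, 30 }, new int[] { 36, 38, 40 }, 42)); // prints 38
    }
}
```

With case values 10, 20 and 30 a tableswitch would need 21 entries for three real targets, which is why the compiler prefers lookupswitch there.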

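Since JDK 7 a String can be used as the switch condition. This works by comparing hash codes and takes two stages.

First, the hash code of the string on top of the operand stack is matched against the hash codes of the case labels, and equals confirms the match and records the index of the matching case. A second tableswitch on that index then jumps to the correct branch.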

public int simpleSwitch(String stringOne) {
    switch (stringOne) {
        case "a":
            return 0;
        case "b":
            return 2;
        case "c":
            return 3;
        default:
            return 4;
    }
}

This String switch statement will produce the following byte code:

 0: aload_1
 1: astore_2
 2: iconst_m1
 3: istore_3
 4: aload_2
 5: invokevirtual #2                  // Method java/lang/String.hashCode:()I
 8: tableswitch   {
         default: 75
             min: 97
             max: 99
              97: 36
              98: 50
              99: 64
       }
36: aload_2
37: ldc           #3                  // String a
39: invokevirtual #4                  // Method java/lang/String.equals:(Ljava/lang/Object;)Z
42: ifeq          75
45: iconst_0
46: istore_3
47: goto          75
50: aload_2
51: ldc           #5                  // String b
53: invokevirtual #4                  // Method java/lang/String.equals:(Ljava/lang/Object;)Z
56: ifeq          75
59: iconst_1
60: istore_3
61: goto          75
64: aload_2
65: ldc           #6                  // String c
67: invokevirtual #4                  // Method java/lang/String.equals:(Ljava/lang/Object;)Z
70: ifeq          75
73: iconst_2
74: istore_3
75: iload_3
76: tableswitch   {
         default: 110
             min: 0
             max: 2
               0: 104
               1: 106
               2: 108
       }
104: iconst_0
105: ireturn
106: iconst_2
107: ireturn
108: iconst_3
109: ireturn
110: iconst_4
111: ireturn
The constant pool for this class is as follows:

Constant pool:
  #2 = Methodref          #25.#26        //  java/lang/String.hashCode:()I
  #3 = String             #27            //  a
  #4 = Methodref          #25.#28        //  java/lang/String.equals:(Ljava/lang/Object;)Z
  #5 = String             #29            //  b
  #6 = String             #30            //  c

 #25 = Class              #33            //  java/lang/String
 #26 = NameAndType        #34:#35        //  hashCode:()I
 #27 = Utf8               a
 #28 = NameAndType        #36:#37        //  equals:(Ljava/lang/Object;)Z
 #29 = Utf8               b
 #30 = Utf8               c

 #33 = Utf8               java/lang/String
 #34 = Utf8               hashCode
 #35 = Utf8               ()I
 #36 = Utf8               equals
 #37 = Utf8               (Ljava/lang/Object;)Z

No wonder String switches were not supported earlier; it really is rather involved.
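Written back as Java source, the two stages look roughly like the sketch below. The method expanded is a hand-written approximation of what the compiler generates for this example, and the local names tmp and index are invented; the original simpleSwitch is included for comparison.

```java
final class StringSwitchExpanded {
    /** The original method from the text. */
    static int simpleSwitch(String stringOne) {
        switch (stringOne) {
            case "a": return 0;
            case "b": return 2;
            case "c": return 3;
            default:  return 4;
        }
    }

    /** Hand-expanded equivalent following the byte code listing. */
    static int expanded(String stringOne) {
        String tmp = stringOne;          // aload_1, astore_2
        int index = -1;                  // iconst_m1, istore_3
        switch (tmp.hashCode()) {        // first tableswitch, on the hash code
            case 97: if (tmp.equals("a")) index = 0; break;
            case 98: if (tmp.equals("b")) index = 1; break;
            case 99: if (tmp.equals("c")) index = 2; break;
        }
        switch (index) {                 // second tableswitch, on the case index
            case 0:  return 0;
            case 1:  return 2;
            case 2:  return 3;
            default: return 4;
        }
    }

    public static void main(String[] args) {
        for (String s : new String[] { "a", "b", "c", "z" }) {
            System.out.println(s + " -> " + simpleSwitch(s) + " / " + expanded(s));
        }
    }
}
```

The equals call is needed because different strings can share a hash code, so a hash match alone does not prove the case matches.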


If two of the case strings have the same hash code, the byte code changes slightly:


public int simpleSwitch(String stringOne) {
    switch (stringOne) {
        case "FB":
            return 0;
        case "Ea":
            return 2;
        default:
            return 4;
    }
}

This generates the following byte code:

 0: aload_1
 1: astore_2
 2: iconst_m1
 3: istore_3
 4: aload_2
 5: invokevirtual #2                  // Method java/lang/String.hashCode:()I
 8: lookupswitch  {
         default: 53
           count: 1
            2236: 28
    }
28: aload_2
29: ldc           #3                  // String Ea
31: invokevirtual #4                  // Method java/lang/String.equals:(Ljava/lang/Object;)Z
34: ifeq          42
37: iconst_1
38: istore_3
39: goto          53
42: aload_2
43: ldc           #5                  // String FB
45: invokevirtual #4                  // Method java/lang/String.equals:(Ljava/lang/Object;)Z
48: ifeq          53
51: iconst_0
52: istore_3
53: iload_3
54: lookupswitch  {
         default: 84
           count: 2
               0: 80
               1: 82
    }
80: iconst_0
81: ireturn
82: iconst_2
83: ireturn
84: iconst_4
85: ireturn
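The collision can be checked directly, since String.hashCode is specified as s[0]*31^(n-1) + ... + s[n-1]: "FB" gives 70*31 + 66 = 2236 and "Ea" gives 69*31 + 97 = 2236. The sketch below verifies this and hand-expands the switch the way the listing does; HashCollision and expanded are illustrative names only.

```java
final class HashCollision {
    /** Hand-expanded equivalent of the "FB" / "Ea" switch, following the listing. */
    static int expanded(String stringOne) {
        int index = -1;
        switch (stringOne.hashCode()) {
            case 2236:                                   // shared by "Ea" and "FB"
                if (stringOne.equals("Ea")) {
                    index = 1;
                } else if (stringOne.equals("FB")) {
                    index = 0;
                }
                break;
        }
        switch (index) {
            case 0:  return 0;
            case 1:  return 2;
            default: return 4;
        }
    }

    public static void main(String[] args) {
        System.out.println("FB".hashCode() + " " + "Ea".hashCode());   // prints "2236 2236"
        System.out.println(expanded("FB") + " " + expanded("Ea") + " " + expanded("xx"));
        // prints "0 2 4"
    }
}
```

With a collision the first switch has a single entry, and the equals calls are chained inside it to tell the two strings apart.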


Loops

while, do-while and for

A loop is typically built from a conditional branch instruction, such as if_icmpge or if_icmplt, and a goto instruction.


For example:

public void whileLoop() {
    int i = 0;
    while (i < 2) {
        i++;
    }
}

Is compiled to:

 0: iconst_0
 1: istore_1
 2: iload_1
 3: iconst_2
 4: if_icmpge     13
 7: iinc          1, 1 // increment local variable 1 by 1
10: goto          2
13: return

The iinc instruction is one of the few instructions that update a local variable directly without having to load or store values on the operand stack. In this example the iinc instruction increases the local variable in slot 1 (i.e. i) by 1.
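To see how the operand stack and the local variable array interact, the listing can be executed by hand. The sketch below is a toy interpreter for exactly these eight instructions, not a general JVM model; the class WhileLoopTrace and its arrays are assumptions made for the illustration.

```java
final class WhileLoopTrace {
    /** Steps through the whileLoop listing and returns the final value of local variable 1. */
    static int run() {
        int[] locals = new int[2];   // slot 0 would hold `this`, slot 1 holds i
        int[] stack = new int[2];    // operand stack, maximum depth 2
        int sp = 0;                  // number of values currently on the stack
        int pc = 0;                  // byte code offset of the next instruction
        while (true) {
            switch (pc) {
                case 0:  stack[sp++] = 0;         pc = 1;  break;  // iconst_0
                case 1:  locals[1] = stack[--sp]; pc = 2;  break;  // istore_1
                case 2:  stack[sp++] = locals[1]; pc = 3;  break;  // iload_1
                case 3:  stack[sp++] = 2;         pc = 4;  break;  // iconst_2
                case 4: {                                           // if_icmpge 13
                    int value2 = stack[--sp];
                    int value1 = stack[--sp];
                    pc = (value1 >= value2) ? 13 : 7;
                    break;
                }
                case 7:  locals[1] += 1;          pc = 10; break;  // iinc 1, 1 (no stack use)
                case 10: pc = 2;                           break;  // goto 2
                case 13: return locals[1];                          // return
                default: throw new IllegalStateException("unexpected pc " + pc);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints 2
    }
}
```

Note that the iinc case touches only the locals array, while every other instruction that changes i goes through the stack.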


A for loop compiles to much the same byte code as a while loop.

A do-while loop:

public void doWhileLoop() {
    int i = 0;
    do {
        i++;
    } while (i < 2);
}

Results in the following byte code:

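 0: iconst_0              // push the constant 0 onto the operand stack
 1: istore_1              // pop the top of the stack into local variable 1
 2: iinc          1, 1
 5: iload_1               // push local variable 1 onto the operand stack
 6: iconst_2              // push the constant 2
 7: if_icmplt     2       // pop two values and compare them; branch back to 2 if the first is less than the second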
10: return
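The practical difference between the two listings is where the test sits: the while loop checks the condition before the body, the do-while loop after it. A minimal sketch that counts how often each body runs (LoopShapes and the start parameter are added for illustration):

```java
final class LoopShapes {
    /** Number of times a while body runs when i starts at the given value. */
    static int whileIterations(int start) {
        int i = start;
        int iterations = 0;
        while (i < 2) {          // test first: if_icmpge jumps past the body
            i++;
            iterations++;
        }
        return iterations;
    }

    /** Number of times a do-while body runs when i starts at the given value. */
    static int doWhileIterations(int start) {
        int i = start;
        int iterations = 0;
        do {                     // body first: if_icmplt jumps back to it
            i++;
            iterations++;
        } while (i < 2);
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println(whileIterations(0) + " " + doWhileIterations(0));   // prints "2 2"
        System.out.println(whileIterations(5) + " " + doWhileIterations(5));   // prints "0 1"
    }
}
```

Because the do-while test is at the bottom, its byte code needs only one branch per iteration and no goto.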




To Be Continued...

