Why Are short and byte Promoted to int? And the Real Size of the Primitive Types

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":1,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" In Java, values of type short, byte, and char are promoted to int by default whenever they take part in arithmetic. In the code below, the right-hand side of the assignment must be explicitly cast to short before the code compiles.","attrs":{}}]},{"type":"codeblock","attrs":{"lang":"java"},"content":[{"type":"text","text":"public static void main(String[] args) {\n short a = 1;\n short b = 2;\n a = a + b; // does not compile: a + b has type int\n short c = a + b; // does not compile: a + b has type int\n short d = (short) (a+b); // compiles\n }","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" Why does adding two shorts produce an int? One common explanation is that the sum of two shorts may overflow, so an int is used to hold it. But by that logic, shouldn't adding two ints produce a long? The real reason starts with the bytecode.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This article mentions several bytecode instructions. For details, see the Java Virtual Machine Specification, ","attrs":{}},{"type":"link","attrs":{"href":"https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-6.html","title":""},"content":[{"type":"text","text":"Chapter 6. 
The Java Virtual Machine Instruction Set","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Why are short and byte promoted to int?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" A Java Virtual Machine instruction consists of a one-byte number that denotes a specific operation (the opcode), followed by zero or more arguments that this operation needs (the operands).","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" Most instructions in the Java Virtual Machine instruction set encode the data type of the operation they perform. For example, the ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"iload","attrs":{}}],"attrs":{}},{"type":"text","text":" instruction loads a local variable onto the operand stack, and that local variable must be of type int. Each such instruction accepts only data of its corresponding type.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 
However, because an opcode is one byte long, ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"the instruction set can contain at most 256 opcodes","attrs":{}},{"type":"text","text":". This puts considerable pressure on the design of type-specific opcodes: if every type-dependent instruction supported all of the Java Virtual Machine's run-time data types, there would be more instructions than a single byte can represent.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" According to the table below (from ","attrs":{}},{"type":"link","attrs":{"href":"https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-2.html#jvms-2.11.1","title":""},"content":[{"type":"text","text":"Table 2.11.1-A. Type support in the Java Virtual Machine instruction set","attrs":{}}]},{"type":"text","text":"), most instructions have no variant for the byte, char, and short types, and no instruction at all is dedicated to the boolean type. ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"So to add two short values, they must be converted to int, added with the iadd instruction, and the result converted back to short.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/a3/a36ef38430b2a40cb9f6ccbd953f9cbd.png","alt":"Type support in the Java Virtual Machine instruction set","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" ","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"How large are the primitive types?","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"The size of byte, short, and char","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"It is commonly said that a byte takes 1 byte and that short and char take 2 bytes each. Is that really true?","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As an example, take the following code:","attrs":{}}]},{"type":"codeblock","attrs":{"lang":"java"},"content":[{"type":"text","text":"public class Test {\n public static void main(String[] args) {\n short a = 100;\n short b = 200;\n short c = (short) (a + b);\n }\n}","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"codeinline","content":[{"type":"text","text":"javac Test.java","attrs":{}}],"attrs":{}},{"type":"text","text":" compiles the file, and ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"javap -verbose Test","attrs":{}}],"attrs":{}},{"type":"text","text":" prints the contents of the class file. The section for the main method is excerpted below:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":""},"content":[{"type":"text","text":"public static void main(java.lang.String[]);\n descriptor: ([Ljava/lang/String;)V\n flags: ACC_PUBLIC, ACC_STATIC\n Code:\n stack=2, locals=4, args_size=1\n 0: bipush 100 // 100 fits in the byte range (-128 to 127), so the compiler uses bipush to push it onto the operand stack as a byte immediate\n 2: istore_1 // store the int on top of the stack into local variable 1 (a)\n 3: sipush 200 // 200 does not fit in a byte, so sipush pushes it as a short immediate\n 6: istore_2 // store the int on top of the stack into local variable 2 (b)\n 7: iload_1 // push local variable 1 onto the stack\n 8: iload_2 // push local variable 2 onto the stack\n 9: iadd // pop the two ints on top of the stack and push their sum\n 10: i2s // pop the int on top of the stack, convert it to short, and push it back\n 11: istore_3 // store the int on top of the stack into local variable 3 (c)\n 12: 
return","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The listing uses bipush to push a byte value. Here is the description of that instruction (sipush is similar):","attrs":{}}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The immediate byte is sign-extended to an int value. That value is pushed onto the operand stack.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In other words, the byte operand embedded in the instruction is sign-extended to an int, and it is that int that lands on the operand stack.","attrs":{}}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/09/092e6c6fe5a3e29caa64722ae3777d80.png","alt":"Description of bipush","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Next, consider the i2s instruction used above, ","attrs":{}},{"type":"link","attrs":{"href":"https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-6.html#jvms-6.5.i2s","title":""},"content":[{"type":"text","text":"jvms-6.5.i2s","attrs":{}}]},{"type":"text","text":", which is described as follows:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The value on the top of the operand stack must be of type int. It is popped from the operand stack, truncated to a short, then sign-extended to an int result. 
That result is pushed onto the operand stack.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In short:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"i2s keeps only the low 16 bits of the int, then sign-extends those 16 bits back to an int, so the value left on the operand stack is still an int.","attrs":{}}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" So the above shows that, whether at compile time or at run time, a short value is sign-extended to int. byte is handled the same way, while char and boolean values are zero-extended to int. ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Consequently, most operations on boolean, byte, char, and short values are actually carried out on values promoted to int, with int as the computational type, so each of them takes 4 bytes. In effect, the virtual machine works with only 4-byte types and 8-byte types (long, double); boolean, byte, char, and short all take 4 bytes.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Does this remind you that the smallest unit of allocation on the Java stack, the slot, is 4 bytes? Even a byte takes up a whole slot on the stack.","attrs":{}}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Java stack is a last-in, first-out stack of 32-bit slots. Because each slot in the stack occupies 32 bits, all local variables occupy at least 32 bits. Local variables of type long and double, which are 64-bit quantities, occupy two slots on the stack. Local variables of type byte or short are stored as local variables of type int, but with a value that is valid for the smaller type. For example, an int local variable which represents a byte type will always contain a value valid for a byte (-128 <= value <= 127).","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"See: ","attrs":{}},{"type":"link","attrs":{"href":"https://www.infoworld.com/article/2077233/bytecode-basics.html#:~:text=The%20Java%20stack%20is%20a,two%20slots%20on%20the%20stack.","title":""},"content":[{"type":"text","text":"Bytecode basics","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" Of course, there are several reasons why a slot is 4 bytes. For one, the virtual machine was designed mainly with 32-bit architectures in mind, and on a 32-bit system 4 bytes is the most economical unit because the CPU handles data one 32-bit word at a time.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" Does that make the usual statement that a short is 2 bytes wrong? Not entirely. ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"byte, char, and short values on the stack (local variables) do take 4 bytes in memory, but this does not hold for array objects.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" Looking back at Table 2.11.1-A, the byte type is supported by only four instructions, and most operations on it go through int. So what is the byte type for? If it occupies a slot anyway, why not use int everywhere? The answer is array objects: two of those four instructions, baload and bastore, operate on arrays.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" A byte variable needs only one byte to represent its value, yet in a stack frame (as a local variable) it takes 4 bytes. If every element of a byte array also took 4 bytes, a great deal of space would be wasted. The elements of an array object are allocated on the heap, where allocation works differently from the stack, so each element on the heap can be packed into 1 
byte (2 bytes per element in a short array). However, when an element is read from a byte array and placed in a stack frame, it is converted to int again and takes 4 bytes.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"To summarize: when used as local variables, byte, short, and char values each take one slot, that is, 4 bytes. Arrays are laid out more compactly: each element of a byte array takes 1 byte, and each element of a char or short array takes 2 bytes.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"See the Stack Overflow question","attrs":{}},{"type":"link","attrs":{"href":"https://stackoverflow.com/questions/229886/size-of-a-byte-in-memory-java#:~:text=Yes,%20a%20byte%20variable%20in%20Java%20is%20in%20fact%204%20bytes%20in%20memory.%20However%20this%20doesn't%20hold%20true%20for%20arrays.%20The%20storage%20of%20a%20byte%20array%20of%2020%20bytes%20is%20in%20fact%20on","title":""},"content":[{"type":"text","text":" Size of a byte in memory - Java","attrs":{}}]},{"type":"text","text":", and note the highlighted passage.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For more on the primitive types, see ","attrs":{}},{"type":"link","attrs":{"href":"https://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html","title":""},"content":[{"type":"text","text":"Primitive Data 
Types","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"horizontalrule","attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Even less support: boolean","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Having covered byte, char, and short, let's turn to what the specification says about boolean. The following is excerpted from ","attrs":{}},{"type":"link","attrs":{"href":"https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-2.html#jvms-2.3.4","title":""},"content":[{"type":"text","text":"2.3.4. The boolean Type","attrs":{}}]},{"type":"text","text":":","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Although the Java Virtual Machine defines a boolean type, it only provides very limited support for it. There are no Java Virtual Machine instructions solely dedicated to operations on boolean values. Instead, expressions in the Java programming language that operate on boolean values are compiled to use values of the Java Virtual Machine int data type.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Java Virtual Machine does directly support boolean arrays. Its newarray instruction (§newarray) enables creation of boolean arrays. 
Arrays of type boolean are accessed and modified using the byte array instructions baload and bastore (§baload, §bastore).","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In Oracle’s Java Virtual Machine implementation, boolean arrays in the Java programming language are encoded as Java Virtual Machine byte arrays, using 8 bits per boolean element.","attrs":{}}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In summary:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Java Virtual Machine defines a boolean type but supports it only minimally: no instruction operates solely on boolean values, and Java expressions that work with boolean values are compiled to use int values instead.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The virtual machine does support boolean arrays directly: the ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"newarray","attrs":{}}],"attrs":{}},{"type":"text","text":" instruction creates them, and ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"baload","attrs":{}}],"attrs":{}},{"type":"text","text":" and ","attrs":{}},{"type":"codeinline","content":[{"type":"text","text":"bastore","attrs":{}}],"attrs":{}},{"type":"text","text":" read and write their elements.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In Oracle's Java Virtual Machine implementation, a boolean array is encoded as a byte array, with 8 bits for each 
boolean element.","attrs":{}}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" ","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"So a boolean used as a local variable also takes 4 bytes, and a boolean stored in an array takes 1 byte per element in Oracle's implementation.","attrs":{}},{"type":"text","text":" The specification describes this as Oracle's choice rather than a requirement, so how boolean arrays are ultimately laid out depends on each virtual machine implementation.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}
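The promotion and truncation rules discussed in the article can be checked with a small program. This is only a minimal sketch: the class name `PromotionDemo` and its helper methods are hypothetical names chosen for illustration and do not come from the article. It also shows that a compound assignment such as `a += b` compiles because the Java language defines it with an implicit narrowing cast, unlike `a = a + b`.

```java
public class PromotionDemo {
    // a + b is evaluated as an int; the explicit cast narrows the result
    // back to short, keeping only the low 16 bits.
    static short addShorts(short a, short b) {
        return (short) (a + b);
    }

    // Reading an element of a byte[] yields a value that is widened to int
    // with sign extension when assigned to an int.
    static int firstElementAsInt(byte[] data) {
        return data[0];
    }

    public static void main(String[] args) {
        short a = 100;
        short b = 200;
        System.out.println(addShorts(a, b)); // 300

        // 32767 + 1 = 32768 as an int; narrowing to short wraps around.
        System.out.println(addShorts(Short.MAX_VALUE, (short) 1)); // -32768

        // Compound assignment carries an implicit cast to short.
        a += b;
        System.out.println(a); // 300

        // 0xFF stored in a byte is -1, and it stays -1 when widened to int.
        byte[] data = { (byte) 0xFF };
        System.out.println(firstElementAsInt(data)); // -1
    }
}
```

Running `javap -c PromotionDemo` on the compiled class is one way to look for the `iadd`, `i2s`, and `baload` instructions that the article describes.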