Class file
This walk-through mainly follows the parse_stream method in classFileParser.cpp.
MagicNumber
classFileParser.cpp
u4: the well-known CAFEBABE. If the value is anything else, the virtual machine refuses to load the class. The constant is defined in classFileParser.cpp:
#define JAVA_CLASSFILE_MAGIC 0xCAFEBABE
It is used in parse_stream, which is called from the ClassFileParser constructor:
void ClassFileParser::parse_stream(const ClassFileStream* const stream,
TRAPS) {
assert(stream != NULL, "invariant");
assert(_class_name != NULL, "invariant");
// BEGIN STREAM PARSING
stream->guarantee_more(8, CHECK); // magic, major, minor
// Magic value
const u4 magic = stream->get_u4_fast();
guarantee_property(magic == JAVA_CLASSFILE_MAGIC,
"Incompatible magic value %u in class file %s",
magic, CHECK);
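Big-endian and little-endian
Little-Endian: the low-order byte is stored at the lowest memory address and the high-order byte at the highest (in written notation the leftmost digit is the high-order one, e.g. the 1 in 12).
Big-Endian: the reverse. The high-order byte is stored at the lowest memory address, so the bytes sit in memory in the order they are written.
Network byte order: the TCP/IP layers use Big-Endian.
JVM: Big-Endian, so class files are Big-Endian as well. C/C++ code normally follows the host platform, so values have to be converted when a class file is read.
Platform-dependent issues come up throughout the reading of the stream. Taking the read of CAFEBABE as the example, we need to look into get_u4_fast(). The code below is from classFileStream.hpp. It carries the fast suffix because it does no bounds check of its own; the caller is expected to have called guarantee_more first. (In this version of the source, u1 and u2 also have checked, non-fast variants, while u4 only has the fast one.)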
// Read u4 from stream
u4 get_u4_fast() const {
u4 res = Bytes::get_Java_u4((address)_current);
_current += 4;
return res;
}
get_Java_u4 itself differs from CPU to CPU. Here are a few examples (taken from different .hpp files):
// arm
static inline u4 get_Java_u4(address p) {
return u4(p[0]) << 24 |
u4(p[1]) << 16 |
u4(p[2]) << 8 |
u4(p[3]);
}
//aarch64
static inline u4 get_Java_u4(address p) { return swap_u4(get_native_u4(p)); }
static inline u4 get_native_u4(address p) { return *(u4*)p; }
static inline u4 swap_u4(u4 x); // compiler-dependent implementation
//linux
inline u4 Bytes::swap_u4(u4 x) {
return bswap_32(x);
}
inline static u4 bswap_32(u4 x) {
return ((x & 0xFF) << 24) |
((x & 0xFF00) << 8) |
((x >> 8) & 0xFF00) |
((x >> 24) & 0xFF);
}
//bytes_linux_x86.inline.hpp
inline u4 Bytes::swap_u4(u4 x) {
#ifdef AMD64
return bswap_32(x);
#else
u4 ret;
__asm__ __volatile__ (
"bswap %0"
:"=r" (ret) // output : register 0 => ret
:"0" (x) // input : x => register 0
:"0" // clobbered register
);
return ret;
#endif // AMD64
}
//x86
static inline u4 get_Java_u4(address p) { return get_Java<u4>(p); }
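Note that every step of this call chain depends on the platform (both CPU and OS); different platforms end up running different code. Personally I think the final bswap_32 could still be optimised a little further.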
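To make the byte-order handling concrete, here is a minimal, portable sketch of the two approaches seen above: assembling the value byte by byte (the arm style) and swapping a natively loaded word (the bswap_32 style). The names read_be_u4, swap_u4 and has_class_magic are my own and are not HotSpot functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: assemble a big-endian u4 from four bytes.
// It behaves the same on little- and big-endian hosts because it only
// shifts byte values and never reinterprets memory.
inline std::uint32_t read_be_u4(const std::uint8_t* p) {
  return (std::uint32_t(p[0]) << 24) |
         (std::uint32_t(p[1]) << 16) |
         (std::uint32_t(p[2]) << 8)  |
          std::uint32_t(p[3]);
}

// Portable byte swap with the same effect as the bswap_32 excerpt.
inline std::uint32_t swap_u4(std::uint32_t x) {
  return ((x & 0x000000FFu) << 24) |
         ((x & 0x0000FF00u) << 8)  |
         ((x & 0x00FF0000u) >> 8)  |
         ((x & 0xFF000000u) >> 24);
}

// Returns true when the buffer starts with the class-file magic number.
inline bool has_class_magic(const std::uint8_t* buf, std::size_t len) {
  assert(buf != nullptr);
  return len >= 4 && read_be_u4(buf) == 0xCAFEBABEu;
}
```

The byte-by-byte version needs no knowledge of the host byte order, which is why it is a convenient default when no intrinsic is available.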
Version
classFileParser.cpp
Next come two u2 values: minor_version first, then major_version.
// Version numbers
_minor_version = stream->get_u2_fast();
_major_version = stream->get_u2_fast();
if (DumpSharedSpaces && _major_version < JAVA_6_VERSION) {
ResourceMark rm;
warning("Pre JDK 6 class not supported by CDS: %u.%u %s",
_major_version, _minor_version, _class_name->as_C_string());
Exceptions::fthrow(
THREAD_AND_LOCATION,
vmSymbols::java_lang_UnsupportedClassVersionError(),
"Unsupported major.minor version for dump time %u.%u",
_major_version,
_minor_version);
}
// Check version numbers - we check this even with verifier off
verify_class_version(_major_version, _minor_version, _class_name, CHECK);
The actual check is performed by verify_class_version:
// A legal major_version.minor_version must be one of the following:
// Major_version >= 45 and major_version < 56, any minor_version.
// Major_version >= 56 and major_version <= JVM_CLASSFILE_MAJOR_VERSION and minor_version = 0.
// Major_version = JVM_CLASSFILE_MAJOR_VERSION and minor_version = 65535 and --enable-preview is present.
static void verify_class_version(u2 major, u2 minor, Symbol* class_name, TRAPS){
ResourceMark rm(THREAD);
const u2 max_version = JVM_CLASSFILE_MAJOR_VERSION;
if (major < JAVA_MIN_SUPPORTED_VERSION) {
Exceptions::fthrow(
THREAD_AND_LOCATION,
vmSymbols::java_lang_UnsupportedClassVersionError(),
"%s (class file version %u.%u) was compiled with an invalid major version",
class_name->as_C_string(), major, minor);
return;
}
if (major > max_version) {
Exceptions::fthrow(
THREAD_AND_LOCATION,
vmSymbols::java_lang_UnsupportedClassVersionError(),
"%s has been compiled by a more recent version of the Java Runtime (class file version %u.%u), "
"this version of the Java Runtime only recognizes class file versions up to %u.0",
class_name->as_C_string(), major, minor, JVM_CLASSFILE_MAJOR_VERSION);
return;
}
if (major < JAVA_12_VERSION || minor == 0) {
return;
}
if (minor == JAVA_PREVIEW_MINOR_VERSION) {
if (major != max_version) {
Exceptions::fthrow(
THREAD_AND_LOCATION,
vmSymbols::java_lang_UnsupportedClassVersionError(),
"%s (class file version %u.%u) was compiled with preview features that are unsupported. "
"This version of the Java Runtime only recognizes preview features for class file version %u.%u",
class_name->as_C_string(), major, minor, JVM_CLASSFILE_MAJOR_VERSION, JAVA_PREVIEW_MINOR_VERSION);
return;
}
if (!Arguments::enable_preview()) {
Exceptions::fthrow(
THREAD_AND_LOCATION,
vmSymbols::java_lang_UnsupportedClassVersionError(),
"Preview features are not enabled for %s (class file version %u.%u). Try running with '--enable-preview'",
class_name->as_C_string(), major, minor);
return;
}
} else { // minor != JAVA_PREVIEW_MINOR_VERSION
Exceptions::fthrow(
THREAD_AND_LOCATION,
vmSymbols::java_lang_UnsupportedClassVersionError(),
"%s (class file version %u.%u) was compiled with an invalid non-zero minor version",
class_name->as_C_string(), major, minor);
}
}
Common versions
- 49:1.5
- 50:1.6
- 51:1.7
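The rules enforced by verify_class_version can be condensed into a small predicate. The sketch below is my own simplification, not HotSpot code: it returns a bool instead of throwing, and takes the maximum supported major version as a parameter because that depends on the JDK build. The constant values 45, 56 and 65535 follow the comment block at the top of the excerpt; is_legal_class_version and java_release_for_major are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint16_t kMinSupportedMajor = 45;  // JAVA_MIN_SUPPORTED_VERSION
constexpr std::uint16_t kJava12Major = 56;        // JAVA_12_VERSION
constexpr std::uint16_t kPreviewMinor = 65535;    // JAVA_PREVIEW_MINOR_VERSION

// Hypothetical helper: true when (major, minor) would pass the checks above.
inline bool is_legal_class_version(std::uint16_t major, std::uint16_t minor,
                                   std::uint16_t max_major, bool preview_enabled) {
  if (major < kMinSupportedMajor || major > max_major) return false;
  if (major < kJava12Major || minor == 0) return true;   // any minor before 56
  if (minor != kPreviewMinor) return false;              // non-zero, non-preview
  return major == max_major && preview_enabled;          // preview class file
}

// Maps a major version to the Java feature release (49 -> 5, 50 -> 6, ...).
// Only meaningful from major 49 onward, where the numbering is regular.
inline int java_release_for_major(std::uint16_t major) {
  assert(major >= 49);
  return major - 44;
}
```

Extending the list above with this mapping gives 52 for Java 8 and 55 for Java 11.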
Constant_pool
classFileParser.cpp
First a u2 is read as the constant pool size (cp_size in the code):
stream->guarantee_more(3, CHECK); // length, first cp tag
u2 cp_size = stream->get_u2_fast();
guarantee_property(
cp_size >= 1, "Illegal constant pool size %u in class file %s",
cp_size, CHECK);
_orig_cp_size = cp_size;
if (is_hidden()) { // Add a slot for hidden class name.
assert(_max_num_patched_klasses == 0, "Sanity check");
cp_size++;
} else {
if (int(cp_size) + _max_num_patched_klasses > 0xffff) {
THROW_MSG(vmSymbols::java_lang_InternalError(), "not enough space for patched classes");
}
cp_size += _max_num_patched_klasses;
}
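is_hidden() reads the _is_hidden field declared in classFileParser.hpp. As far as I can tell, that field is set in the ClassFileParser constructor from information supplied by ClassLoader.
Only after this is constant_pool[cp_size] itself read: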
_cp = ConstantPool::allocate(_loader_data,cp_size,CHECK);
ConstantPool* const cp = _cp;
parse_constant_pool(stream, cp, _orig_cp_size, CHECK);
assert(cp_size == (const u2)cp->length(), "invariant");
As the code shows, ConstantPool::allocate is called to obtain cp, and parse_constant_pool then reads the entries one by one. The relevant code:
ConstantPool* ConstantPool::allocate(ClassLoaderData* loader_data, int length, TRAPS) {
Array<u1>* tags = MetadataFactory::new_array<u1>(loader_data, length, 0, CHECK_NULL);
int size = ConstantPool::size(length);
return new (loader_data, size, MetaspaceObj::ConstantPoolType, THREAD) ConstantPool(tags);
}
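The size method above mainly takes care of alignment.
Note: the biggest pitfall here is that the class file holds at most cp_size - 1 entries. Valid indices run from 1 to cp_size - 1, because entry 0 does not exist.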
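The off-by-one is easier to see with the index arithmetic written out. The following sketch is an illustration of my own, not HotSpot code. It also applies one further rule from the JVM specification that is not mentioned above: Long and Double entries occupy two consecutive indices. assign_cp_indices and constant_pool_count are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint8_t kTagLong = 5;    // JVM_CONSTANT_Long
constexpr std::uint8_t kTagDouble = 6;  // JVM_CONSTANT_Double

// Number of constant-pool indices an entry with this tag consumes.
inline int slots_for_tag(std::uint8_t tag) {
  return (tag == kTagLong || tag == kTagDouble) ? 2 : 1;
}

// Hypothetical helper: given the tags of the entries physically present in
// a class file, return the constant-pool index assigned to each of them.
inline std::vector<int> assign_cp_indices(const std::vector<std::uint8_t>& tags) {
  std::vector<int> indices;
  int next = 1;  // index 0 is never used
  for (std::uint8_t tag : tags) {
    indices.push_back(next);
    next += slots_for_tag(tag);
  }
  return indices;
}

// The count stored in the file (cp_size) is one past the last used index.
inline int constant_pool_count(const std::vector<std::uint8_t>& tags) {
  const std::vector<int> idx = assign_cp_indices(tags);
  assert(idx.size() == tags.size());
  if (idx.empty()) return 1;
  return idx.back() + slots_for_tag(tags.back());
}
```

For three entries tagged Utf8, Long and Class the indices are 1, 2 and 4, and the stored count is 5, so a loop over the pool has to start at 1 and step by the slot width.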
Constant Pool Entries
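These constants are all named JVM_CONSTANT_*; the prefix is dropped below. tag is a u1, length is a u2, each index is a u2, and the size of bytes is given in the table (the definitions come from classfile_constants.h.template). injector.h contains similar definitions, but according to its comments that header is used for class file transformation.
Type | tag | length | bytes | first index (points to) | second index (points to) |
---|---|---|---|---|---|
Utf8 | 1 | u2 | string data of length bytes | | |
Unicode | 2 | | | | |
Integer | 3 | | u4, big-endian int | | |
Float | 4 | | u4, big-endian float | | |
Long | 5 | | u8, big-endian long | | |
Double | 6 | | u8, big-endian double | | |
Class | 7 | | | fully qualified name constant | |
String | 8 | | | string literal | |
Fieldref | 9 | | | Class declaring the field | NameAndType (field descriptor) |
Methodref | 10 | | | Class declaring the method | NameAndType (name and type) |
InterfaceMethodref | 11 | | | interface Class declaring the method | NameAndType |
NameAndType | 12 | | | field/method name constant | field/method descriptor constant |
MethodHandle | 15 | | | | |
MethodType | 16 | | | | |
Dynamic | 17 | | | | |
InvokeDynamic | 18 | | | | |
Module | 19 | | | | |
Package | 20 | | | | |
ExternalMax | 20 | | | | |
For a variable declared as int in Java, the type is likewise stored in the class file as a string, and its value is I (0x49 in hexadecimal). I did not expect the JVM to represent primitive types here no differently from object types. Class types, for their part, are mostly written as strings that start with L.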
Access_flag
classFileParser.cpp
// ACCESS FLAGS
stream->guarantee_more(8, CHECK); // flags, this_class, super_class, infs_len
// Access flags
jint flags;
// JVM_ACC_MODULE is defined in JDK-9 and later.
if (_major_version >= JAVA_9_VERSION) {
flags = stream->get_u2_fast() & (JVM_RECOGNIZED_CLASS_MODIFIERS | JVM_ACC_MODULE);
} else {
flags = stream->get_u2_fast() & JVM_RECOGNIZED_CLASS_MODIFIERS;
}
if ((flags & JVM_ACC_INTERFACE) && _major_version < JAVA_6_VERSION) {
// Set abstract bit for old class files for backward compatibility
flags |= JVM_ACC_ABSTRACT;
}
verify_legal_class_modifiers(flags, CHECK);
short bad_constant = class_bad_constant_seen();
if (bad_constant != 0) {
// Do not throw CFE until after the access_flags are checked because if
// ACC_MODULE is set in the access flags, then NCDFE must be thrown, not CFE.
classfile_parse_error("Unknown constant tag %u in class file %s", bad_constant, CHECK);
}
_access_flags.set_flags(flags);
JVM_RECOGNIZED_CLASS_MODIFIERS, used above, comes from jvm.h:
#define JVM_RECOGNIZED_CLASS_MODIFIERS (JVM_ACC_PUBLIC | \
JVM_ACC_FINAL | \
JVM_ACC_SUPER | \
JVM_ACC_INTERFACE | \
JVM_ACC_ABSTRACT | \
JVM_ACC_ANNOTATION | \
JVM_ACC_ENUM | \
JVM_ACC_SYNTHETIC)
JVM_ACC_SYNTHETIC marks a class that was not produced from user code. In addition, compilers since JDK 1.0.2 always set JVM_ACC_SUPER.
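The masking logic is compact enough to reproduce in isolation. The sketch below is a simplified illustration, not the HotSpot implementation: the flag values are the ones listed in the JVM specification's class access_flags table, the version thresholds 53 (Java 9) and 50 (Java 6) stand in for JAVA_9_VERSION and JAVA_6_VERSION, and normalize_class_flags and describe_class_flags are names I made up.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

constexpr std::uint16_t ACC_PUBLIC = 0x0001, ACC_FINAL = 0x0010, ACC_SUPER = 0x0020,
    ACC_INTERFACE = 0x0200, ACC_ABSTRACT = 0x0400, ACC_SYNTHETIC = 0x1000,
    ACC_ANNOTATION = 0x2000, ACC_ENUM = 0x4000, ACC_MODULE = 0x8000;

constexpr std::uint16_t kRecognized = ACC_PUBLIC | ACC_FINAL | ACC_SUPER | ACC_INTERFACE |
    ACC_ABSTRACT | ACC_ANNOTATION | ACC_ENUM | ACC_SYNTHETIC;

// Hypothetical helper mirroring the masking step: drop unrecognised bits,
// keep ACC_MODULE only from Java 9 on, and set ACC_ABSTRACT on interfaces
// that come from pre-Java-6 class files.
inline std::uint16_t normalize_class_flags(std::uint16_t raw, std::uint16_t major) {
  std::uint16_t mask = kRecognized;
  if (major >= 53) mask |= ACC_MODULE;
  std::uint16_t flags = raw & mask;
  if ((flags & ACC_INTERFACE) && major < 50) flags |= ACC_ABSTRACT;
  return flags;
}

// Renders the set bits as space-separated lower-case names.
inline std::string describe_class_flags(std::uint16_t flags) {
  struct Entry { std::uint16_t bit; const char* name; };
  static const Entry table[] = {
      {ACC_PUBLIC, "public"}, {ACC_FINAL, "final"}, {ACC_SUPER, "super"},
      {ACC_INTERFACE, "interface"}, {ACC_ABSTRACT, "abstract"},
      {ACC_SYNTHETIC, "synthetic"}, {ACC_ANNOTATION, "annotation"},
      {ACC_ENUM, "enum"}, {ACC_MODULE, "module"}};
  std::string out;
  for (const Entry& e : table) {
    if (flags & e.bit) {
      if (!out.empty()) out += ' ';
      out += e.name;
    }
  }
  return out;
}
```

A typical public class compiled by javac carries 0x0021, which this sketch renders as "public super".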
This_class
First a u2 is read; it is an index into constant_pool. After that comes a series of checks. The full code:
// This class and superclass
_this_class_index = stream->get_u2_fast();
check_property(
valid_cp_range(_this_class_index, cp_size) &&
cp->tag_at(_this_class_index).is_unresolved_klass(),
"Invalid this class index %u in constant pool in class file %s",
_this_class_index, CHECK);
Symbol* const class_name_in_cp = cp->klass_name_at(_this_class_index);
assert(class_name_in_cp != NULL, "class_name can't be null");
// Don't need to check whether this class name is legal or not.
// It has been checked when constant pool is parsed.
// However, make sure it is not an array type.
if (_need_verify) {
guarantee_property(class_name_in_cp->char_at(0) != JVM_SIGNATURE_ARRAY,
"Bad class name in class file %s",
CHECK);
}
#ifdef ASSERT
// Basic sanity checks
assert(!(_is_hidden && (_unsafe_anonymous_host != NULL)), "mutually exclusive variants");
if (_unsafe_anonymous_host != NULL) {
assert(_class_name == vmSymbols::unknown_class_name(), "A named anonymous class???");
}
if (_is_hidden) {
assert(_class_name != vmSymbols::unknown_class_name(), "hidden classes should have a special name");
}
#endif
// Update the _class_name as needed depending on whether this is a named,
// un-named, hidden or unsafe-anonymous class.
if (_is_hidden) {
assert(_class_name != NULL, "Unexpected null _class_name");
#ifdef ASSERT
if (_need_verify) {
verify_legal_class_name(_class_name, CHECK);
}
#endif
// NOTE: !_is_hidden does not imply "findable" as it could be an old-style
// "hidden" unsafe-anonymous class
// If this is an anonymous class fix up its name if it is in the unnamed
// package. Otherwise, throw IAE if it is in a different package than
// its host class.
} else if (_unsafe_anonymous_host != NULL) {
update_class_name(class_name_in_cp);
fix_unsafe_anonymous_class_name(CHECK);
} else {
// Check if name in class file matches given name
if (_class_name != class_name_in_cp) {
if (_class_name != vmSymbols::unknown_class_name()) {
ResourceMark rm(THREAD);
Exceptions::fthrow(THREAD_AND_LOCATION,
vmSymbols::java_lang_NoClassDefFoundError(),
"%s (wrong name: %s)",
class_name_in_cp->as_C_string(),
_class_name->as_C_string()
);
return;
} else {
// The class name was not known by the caller so we set it from
// the value in the CP.
update_class_name(class_name_in_cp);
}
// else nothing to do: the expected class name matches what is in the CP
}
}
// Verification prevents us from creating names with dots in them, this
// asserts that that's the case.
assert(is_internal_format(_class_name), "external class name format used internally");
if (!is_internal()) {
LogTarget(Debug, class, preorder) lt;
if (lt.is_enabled()){
ResourceMark rm(THREAD);
LogStream ls(lt);
ls.print("%s", _class_name->as_klass_external_name());
if (stream->source() != NULL) {
ls.print(" source: %s", stream->source());
}
ls.cr();
}
#if INCLUDE_CDS
if (DumpLoadedClassList != NULL && stream->source() != NULL && classlist_file->is_open()) {
if (!ClassLoader::has_jrt_entry()) {
warning("DumpLoadedClassList and CDS are not supported in exploded build");
DumpLoadedClassList = NULL;
} else if (SystemDictionaryShared::is_sharing_possible(_loader_data) &&
!_is_hidden &&
_unsafe_anonymous_host == NULL) {
// Only dump the classes that can be stored into CDS archive.
// Hidden and unsafe anonymous classes such as generated LambdaForm classes are also not included.
oop class_loader = _loader_data->class_loader();
ResourceMark rm(THREAD);
bool skip = false;
if (class_loader == NULL || SystemDictionary::is_platform_class_loader(class_loader)) {
// For the boot and platform class loaders, skip classes that are not found in the
// java runtime image, such as those found in the --patch-module entries.
// These classes can't be loaded from the archive during runtime.
if (!stream->from_boot_loader_modules_image() && strncmp(stream->source(), "jrt:", 4) != 0) {
skip = true;
}
if (class_loader == NULL && ClassLoader::contains_append_entry(stream->source())) {
// .. but don't skip the boot classes that are loaded from -Xbootclasspath/a
// as they can be loaded from the archive during runtime.
skip = false;
}
}
if (skip) {
tty->print_cr("skip writing class %s from source %s to classlist file",
_class_name->as_C_string(), stream->source());
} else {
classlist_file->print_cr("%s", _class_name->as_C_string());
classlist_file->flush();
}
}
}
#endif
}
Super_class
This is similar to the code above: again a u2 gives the position in constant_pool.
// SUPERKLASS
_super_class_index = stream->get_u2_fast();
_super_klass = parse_super_class(cp,
_super_class_index,
_need_verify,
CHECK);
The real work is done by parse_super_class:
const InstanceKlass* ClassFileParser::parse_super_class(ConstantPool* const cp,
const int super_class_index,const bool need_verify,TRAPS) {
assert(cp != NULL, "invariant");
const InstanceKlass* super_klass = NULL;
if (super_class_index == 0) {
check_property(_class_name == vmSymbols::java_lang_Object(),
"Invalid superclass index %u in class file %s",
super_class_index,
CHECK_NULL);
} else {
check_property(valid_klass_reference_at(super_class_index),
"Invalid superclass index %u in class file %s",
super_class_index,
CHECK_NULL);
// The class name should be legal because it is checked when parsing constant pool.
// However, make sure it is not an array type.
bool is_array = false;
if (cp->tag_at(super_class_index).is_klass()) {
super_klass = InstanceKlass::cast(cp->resolved_klass_at(super_class_index));
if (need_verify)
is_array = super_klass->is_array_klass();
} else if (need_verify) {
is_array = (cp->klass_name_at(super_class_index)->char_at(0) == JVM_SIGNATURE_ARRAY);
}
if (need_verify) {
guarantee_property(!is_array,
"Bad superclass name in class file %s", CHECK_NULL);
}
}
return super_klass;
}
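The first if branch above is special handling for java.lang.Object, the only class allowed to have no superclass (super_class_index == 0). Every other class must have its superclass validated. The remaining code checks that the superclass is not an array type.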
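Stripped of the constant-pool plumbing, the rule reads as follows. This is a deliberately simplified model of my own: plain strings stand in for constant-pool entries, the array check is always applied (the real code only does so when need_verify is set), and super_class_is_acceptable is a hypothetical name.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified model of the checks in parse_super_class.
// super_name is ignored when super_index is 0.
inline bool super_class_is_acceptable(const std::string& this_name,
                                      int super_index,
                                      const std::string& super_name) {
  assert(super_index >= 0);
  if (super_index == 0) {
    // Only java/lang/Object may omit a superclass.
    return this_name == "java/lang/Object";
  }
  // '[' is JVM_SIGNATURE_ARRAY: an array type can never be a superclass.
  return !super_name.empty() && super_name[0] != '[';
}
```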
Interfaces
// Interfaces
_itfs_len = stream->get_u2_fast();
parse_interfaces(stream,_itfs_len,cp,&_has_nonstatic_concrete_methods,CHECK);
assert(_local_interfaces != NULL, "invariant");
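The parse_interfaces function that does the actual work is shown below. I was rather taken aback by the line assert(itfs_len > 0, "only called for len>0"); in it, since it sits in the else branch of an itfs_len == 0 test.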
// Side-effects: populates the _local_interfaces field
void ClassFileParser::parse_interfaces(const ClassFileStream* const stream,
const int itfs_len,
ConstantPool* const cp,
bool* const has_nonstatic_concrete_methods,
TRAPS) {
assert(stream != NULL, "invariant");
assert(cp != NULL, "invariant");
assert(has_nonstatic_concrete_methods != NULL, "invariant");
if (itfs_len == 0) {
_local_interfaces = Universe::the_empty_instance_klass_array();
} else {
assert(itfs_len > 0, "only called for len>0");
_local_interfaces = MetadataFactory::new_array<InstanceKlass*>(_loader_data, itfs_len, NULL, CHECK);
int index;
for (index = 0; index < itfs_len; index++) {
const u2 interface_index = stream->get_u2(CHECK);
Klass* interf;
check_property(
valid_klass_reference_at(interface_index),
"Interface name has bad constant pool index %u in class file %s",
interface_index, CHECK);
if (cp->tag_at(interface_index).is_klass()) {
interf = cp->resolved_klass_at(interface_index);
} else {
Symbol* const unresolved_klass = cp->klass_name_at(interface_index);
// Don't need to check legal name because it's checked when parsing constant pool.
// But need to make sure it's not an array type.
guarantee_property(unresolved_klass->char_at(0) != JVM_SIGNATURE_ARRAY,
"Bad interface name in class file %s", CHECK);
// Call resolve_super so classcircularity is checked
interf = SystemDictionary::resolve_super_or_fail(
_class_name,
unresolved_klass,
Handle(THREAD, _loader_data->class_loader()),
_protection_domain,
false,
CHECK);
}
if (!interf->is_interface()) {
THROW_MSG(vmSymbols::java_lang_IncompatibleClassChangeError(),
err_msg("class %s can not implement %s, because it is not an interface (%s)",
_class_name->as_klass_external_name(),
interf->external_name(),
interf->class_in_module_of_loader()));
}
if (InstanceKlass::cast(interf)->has_nonstatic_concrete_methods()) {
*has_nonstatic_concrete_methods = true;
}
_local_interfaces->at_put(index, InstanceKlass::cast(interf));
}
if (!_need_verify || itfs_len <= 1) {
return;
}
// Check if there's any duplicates in interfaces
ResourceMark rm(THREAD);
NameSigHash** interface_names = NEW_RESOURCE_ARRAY_IN_THREAD(THREAD,
NameSigHash*,
HASH_ROW_SIZE);
initialize_hashtable(interface_names);
bool dup = false;
const Symbol* name = NULL;
{
debug_only(NoSafepointVerifier nsv;)
for (index = 0; index < itfs_len; index++) {
const InstanceKlass* const k = _local_interfaces->at(index);
name = k->name();
// If no duplicates, add (name, NULL) in hashtable interface_names.
if (!put_after_lookup(name, NULL, interface_names)) {
dup = true;
break;
}
}
}
if (dup) {
classfile_parse_error("Duplicate interface name \"%s\" in class file %s",
name->as_C_string(), CHECK);
}
}
}
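The duplicate check at the end of parse_interfaces boils down to "insert each name into a set and stop at the first repeat". HotSpot uses its own NameSigHash table with put_after_lookup for this; the sketch below swaps in std::unordered_set purely for illustration, and first_duplicate_index is a name I made up.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Returns the position of the first name that repeats an earlier one,
// or -1 when every name is unique.
inline int first_duplicate_index(const std::vector<std::string>& names) {
  std::unordered_set<std::string> seen;
  for (std::size_t i = 0; i < names.size(); ++i) {
    // insert() reports through .second whether the name was new.
    if (!seen.insert(names[i]).second) return static_cast<int>(i);
  }
  return -1;
}
```

The same pattern shows up again further down for fields and methods, where the key is the (name, signature) pair instead of the name alone.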
Fields
classFileParser.cpp
// Fields (offsets are filled in later)
_fac = new FieldAllocationCount();
parse_fields(stream,
_access_flags.is_interface(),
_fac,
cp,
cp_size,
&_java_fields_count,
CHECK);
assert(_fields != NULL, "invariant");
FieldAllocationCount is shown below. Since each counter is a u2, it looks as though there can be at most 65535 fields per allocation type.
class ClassFileParser::FieldAllocationCount : public ResourceObj {
public:
u2 count[MAX_FIELD_ALLOCATION_TYPE];
FieldAllocationCount() {
for (int i = 0; i < MAX_FIELD_ALLOCATION_TYPE; i++) {
count[i] = 0;
}
}
FieldAllocationType update(bool is_static, BasicType type) {
FieldAllocationType atype = basic_type_to_atype(is_static, type);
if (atype != BAD_ALLOCATION_TYPE) {
// Make sure there is no overflow with injected fields.
assert(count[atype] < 0xFFFF, "More than 65535 fields");
count[atype]++;
}
return atype;
}
};
Below is the code of parse_fields. It contains a good deal of defensive code.
// Side-effects: populates the _fields, _fields_annotations,
// _fields_type_annotations fields
void ClassFileParser::parse_fields(const ClassFileStream* const cfs,
bool is_interface,
FieldAllocationCount* const fac,
ConstantPool* cp,
const int cp_size,
u2* const java_fields_count_ptr,
TRAPS) {
assert(cfs != NULL, "invariant");
assert(fac != NULL, "invariant");
assert(cp != NULL, "invariant");
assert(java_fields_count_ptr != NULL, "invariant");
assert(NULL == _fields, "invariant");
assert(NULL == _fields_annotations, "invariant");
assert(NULL == _fields_type_annotations, "invariant");
cfs->guarantee_more(2, CHECK); // length
const u2 length = cfs->get_u2_fast();
*java_fields_count_ptr = length;
int num_injected = 0;
const InjectedField* const injected = JavaClasses::get_injected(_class_name,
&num_injected);
const int total_fields = length + num_injected;
// The field array starts with tuples of shorts
// [access, name index, sig index, initial value index, byte offset].
// A generic signature slot only exists for field with generic
// signature attribute. And the access flag is set with
// JVM_ACC_FIELD_HAS_GENERIC_SIGNATURE for that field. The generic
// signature slots are at the end of the field array and after all
// other fields data.
//
// f1: [access, name index, sig index, initial value index, low_offset, high_offset]
// f2: [access, name index, sig index, initial value index, low_offset, high_offset]
// ...
// fn: [access, name index, sig index, initial value index, low_offset, high_offset]
// [generic signature index]
// [generic signature index]
// ...
//
// Allocate a temporary resource array for field data. For each field,
// a slot is reserved in the temporary array for the generic signature
// index. After parsing all fields, the data are copied to a permanent
// array and any unused slots will be discarded.
ResourceMark rm(THREAD);
u2* const fa = NEW_RESOURCE_ARRAY_IN_THREAD(THREAD,
u2,
total_fields * (FieldInfo::field_slots + 1));
// The generic signature slots start after all other fields' data.
int generic_signature_slot = total_fields * FieldInfo::field_slots;
int num_generic_signature = 0;
for (int n = 0; n < length; n++) {
// access_flags, name_index, descriptor_index, attributes_count
cfs->guarantee_more(8, CHECK);
AccessFlags access_flags;
const jint flags = cfs->get_u2_fast() & JVM_RECOGNIZED_FIELD_MODIFIERS;
verify_legal_field_modifiers(flags, is_interface, CHECK);
access_flags.set_flags(flags);
const u2 name_index = cfs->get_u2_fast();
check_property(valid_symbol_at(name_index),
"Invalid constant pool index %u for field name in class file %s",
name_index, CHECK);
const Symbol* const name = cp->symbol_at(name_index);
verify_legal_field_name(name, CHECK);
const u2 signature_index = cfs->get_u2_fast();
check_property(valid_symbol_at(signature_index),
"Invalid constant pool index %u for field signature in class file %s",
signature_index, CHECK);
const Symbol* const sig = cp->symbol_at(signature_index);
verify_legal_field_signature(name, sig, CHECK);
u2 constantvalue_index = 0;
bool is_synthetic = false;
u2 generic_signature_index = 0;
const bool is_static = access_flags.is_static();
FieldAnnotationCollector parsed_annotations(_loader_data);
const u2 attributes_count = cfs->get_u2_fast();
if (attributes_count > 0) {
parse_field_attributes(cfs,
attributes_count,
is_static,
signature_index,
&constantvalue_index,
&is_synthetic,
&generic_signature_index,
&parsed_annotations,
CHECK);
if (parsed_annotations.field_annotations() != NULL) {
if (_fields_annotations == NULL) {
_fields_annotations = MetadataFactory::new_array<AnnotationArray*>(
_loader_data, length, NULL,
CHECK);
}
_fields_annotations->at_put(n, parsed_annotations.field_annotations());
parsed_annotations.set_field_annotations(NULL);
}
if (parsed_annotations.field_type_annotations() != NULL) {
if (_fields_type_annotations == NULL) {
_fields_type_annotations =
MetadataFactory::new_array<AnnotationArray*>(_loader_data,
length,
NULL,
CHECK);
}
_fields_type_annotations->at_put(n, parsed_annotations.field_type_annotations());
parsed_annotations.set_field_type_annotations(NULL);
}
if (is_synthetic) {
access_flags.set_is_synthetic();
}
if (generic_signature_index != 0) {
access_flags.set_field_has_generic_signature();
fa[generic_signature_slot] = generic_signature_index;
generic_signature_slot ++;
num_generic_signature ++;
}
}
FieldInfo* const field = FieldInfo::from_field_array(fa, n);
field->initialize(access_flags.as_short(),
name_index,
signature_index,
constantvalue_index);
const BasicType type = cp->basic_type_for_signature_at(signature_index);
// Remember how many oops we encountered and compute allocation type
const FieldAllocationType atype = fac->update(is_static, type);
field->set_allocation_type(atype);
// After field is initialized with type, we can augment it with aux info
if (parsed_annotations.has_any_annotations()) {
parsed_annotations.apply_to(field);
if (field->is_contended()) {
_has_contended_fields = true;
}
}
}
int index = length;
if (num_injected != 0) {
for (int n = 0; n < num_injected; n++) {
// Check for duplicates
if (injected[n].may_be_java) {
const Symbol* const name = injected[n].name();
const Symbol* const signature = injected[n].signature();
bool duplicate = false;
for (int i = 0; i < length; i++) {
const FieldInfo* const f = FieldInfo::from_field_array(fa, i);
if (name == cp->symbol_at(f->name_index()) &&
signature == cp->symbol_at(f->signature_index())) {
// Symbol is desclared in Java so skip this one
duplicate = true;
break;
}
}
if (duplicate) {
// These will be removed from the field array at the end
continue;
}
}
// Injected field
FieldInfo* const field = FieldInfo::from_field_array(fa, index);
field->initialize(JVM_ACC_FIELD_INTERNAL,
injected[n].name_index,
injected[n].signature_index,
0);
const BasicType type = Signature::basic_type(injected[n].signature());
// Remember how many oops we encountered and compute allocation type
const FieldAllocationType atype = fac->update(false, type);
field->set_allocation_type(atype);
index++;
}
}
assert(NULL == _fields, "invariant");
_fields =
MetadataFactory::new_array<u2>(_loader_data,
index * FieldInfo::field_slots + num_generic_signature,
CHECK);
// Sometimes injected fields already exist in the Java source so
// the fields array could be too long. In that case the
// fields array is trimed. Also unused slots that were reserved
// for generic signature indexes are discarded.
{
int i = 0;
for (; i < index * FieldInfo::field_slots; i++) {
_fields->at_put(i, fa[i]);
}
for (int j = total_fields * FieldInfo::field_slots;
j < generic_signature_slot; j++) {
_fields->at_put(i++, fa[j]);
}
assert(_fields->length() == i, "");
}
if (_need_verify && length > 1) {
// Check duplicated fields
ResourceMark rm(THREAD);
NameSigHash** names_and_sigs = NEW_RESOURCE_ARRAY_IN_THREAD(
THREAD, NameSigHash*, HASH_ROW_SIZE);
initialize_hashtable(names_and_sigs);
bool dup = false;
const Symbol* name = NULL;
const Symbol* sig = NULL;
{
debug_only(NoSafepointVerifier nsv;)
for (AllFieldStream fs(_fields, cp); !fs.done(); fs.next()) {
name = fs.name();
sig = fs.signature();
// If no duplicates, add name/signature in hashtable names_and_sigs.
if (!put_after_lookup(name, sig, names_and_sigs)) {
dup = true;
break;
}
}
}
if (dup) {
classfile_parse_error("Duplicate field name \"%s\" with signature \"%s\" in class file %s",
name->as_C_string(), sig->as_klass_external_name(), CHECK);
}
}
}
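Structure of a field
Type | Name | Count | Description |
---|---|---|---|
u2 | access_flags | 1 | access flags |
u2 | name_index | 1 | reference to the simple name |
u2 | descriptor_index | 1 | reference to the type descriptor |
u2 | attributes_count | 1 | |
attribute_info | attributes | attributes_count | |
Descriptor characters
Character | Meaning |
---|---|
B | byte |
C | char |
D | double |
F | float |
I | int |
J | long |
S | short |
Z | boolean |
V | void |
L<classname>; | object reference |
An array type is written with a left square bracket in front for each dimension; for example, [I denotes int[].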
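As a quick way to check one's reading of the table, here is a small decoder that turns a field descriptor back into Java source notation. It is a sketch under my own assumptions: descriptor_to_java is a hypothetical name, malformed input yields an empty string, and V is accepted only without array dimensions even though it is strictly legal only as a method return type.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: "[I" -> "int[]", "Ljava/lang/String;" -> "java.lang.String".
// Returns "" when the descriptor is malformed.
inline std::string descriptor_to_java(const std::string& d) {
  std::size_t i = 0;
  std::string suffix;
  while (i < d.size() && d[i] == '[') { suffix += "[]"; ++i; }  // one "[]" per dimension
  if (i >= d.size()) return "";
  std::string base;
  switch (d[i]) {
    case 'B': base = "byte"; break;
    case 'C': base = "char"; break;
    case 'D': base = "double"; break;
    case 'F': base = "float"; break;
    case 'I': base = "int"; break;
    case 'J': base = "long"; break;
    case 'S': base = "short"; break;
    case 'Z': base = "boolean"; break;
    case 'V':
      if (!suffix.empty()) return "";  // there is no array of void
      base = "void";
      break;
    case 'L': {
      const std::size_t end = d.find(';', i);
      // The ';' must exist, be the last character, and follow a non-empty name.
      if (end == std::string::npos || end != d.size() - 1 || end == i + 1) return "";
      base = d.substr(i + 1, end - i - 1);
      for (char& c : base) if (c == '/') c = '.';
      return base + suffix;
    }
    default: return "";
  }
  if (i != d.size() - 1) return "";  // trailing characters after a primitive
  return base + suffix;
}
```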
Methods
classFileParser.cpp
// Methods
AccessFlags promoted_flags;
parse_methods(stream,
_access_flags.is_interface(),
&promoted_flags,
&_has_final_method,
&_declares_nonstatic_concrete_methods,
CHECK);
assert(_methods != NULL, "invariant");
// promote flags from parse_methods() to the klass' flags
_access_flags.add_promoted_flags(promoted_flags.as_int());
if (_declares_nonstatic_concrete_methods) {
_has_nonstatic_concrete_methods = true;
}
// The promoted_flags parameter is used to pass relevant access_flags
// from the methods back up to the containing klass. These flag values
// are added to klass's access_flags.
// Side-effects: populates the _methods field in the parser
void ClassFileParser::parse_methods(const ClassFileStream* const cfs,
bool is_interface,
AccessFlags* promoted_flags,
bool* has_final_method,
bool* declares_nonstatic_concrete_methods,
TRAPS) {
assert(cfs != NULL, "invariant");
assert(promoted_flags != NULL, "invariant");
assert(has_final_method != NULL, "invariant");
assert(declares_nonstatic_concrete_methods != NULL, "invariant");
assert(NULL == _methods, "invariant");
cfs->guarantee_more(2, CHECK); // length
const u2 length = cfs->get_u2_fast();
if (length == 0) {
_methods = Universe::the_empty_method_array();
} else {
_methods = MetadataFactory::new_array<Method*>(_loader_data,
length,
NULL,
CHECK);
for (int index = 0; index < length; index++) {
Method* method = parse_method(cfs,
is_interface,
_cp,
promoted_flags,
CHECK);
if (method->is_final()) {
*has_final_method = true;
}
// declares_nonstatic_concrete_methods: declares concrete instance methods, any access flags
// used for interface initialization, and default method inheritance analysis
if (is_interface && !(*declares_nonstatic_concrete_methods)
&& !method->is_abstract() && !method->is_static()) {
*declares_nonstatic_concrete_methods = true;
}
_methods->at_put(index, method);
}
if (_need_verify && length > 1) {
// Check duplicated methods
ResourceMark rm(THREAD);
NameSigHash** names_and_sigs = NEW_RESOURCE_ARRAY_IN_THREAD(
THREAD, NameSigHash*, HASH_ROW_SIZE);
initialize_hashtable(names_and_sigs);
bool dup = false;
const Symbol* name = NULL;
const Symbol* sig = NULL;
{
debug_only(NoSafepointVerifier nsv;)
for (int i = 0; i < length; i++) {
const Method* const m = _methods->at(i);
name = m->name();
sig = m->signature();
// If no duplicates, add name/signature in hashtable names_and_sigs.
if (!put_after_lookup(name, sig, names_and_sigs)) {
dup = true;
break;
}
}
}
if (dup) {
classfile_parse_error("Duplicate method name \"%s\" with signature \"%s\" in class file %s",
name->as_C_string(), sig->as_klass_external_name(), CHECK);
}
}
}
}
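Structure
A method has much the same structure as a field, but the access_flags that may apply to it are obviously different.
Parameters and return value
A method descriptor lists the parameter types first and the return type last. The parameter descriptors are written in strict declaration order inside "()" with no separator between them. For example, the method "String getAll(int id, String name)" has the descriptor "(ILjava/lang/String;)Ljava/lang/String;".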
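Because there is no separator between parameters, splitting a method descriptor means reading one field descriptor at a time. The sketch below shows one way to do it; MethodDescriptor, field_descriptor_length and parse_method_descriptor are names of my own, and on malformed input the result simply has ok set to false.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct MethodDescriptor {
  std::vector<std::string> params;  // one field descriptor per parameter
  std::string ret;                  // return descriptor, "V" for void
  bool ok = false;                  // false when the input was malformed
};

// Length of the field descriptor starting at pos, or 0 if it is malformed.
inline std::size_t field_descriptor_length(const std::string& s, std::size_t pos) {
  std::size_t i = pos;
  while (i < s.size() && s[i] == '[') ++i;  // skip array dimensions
  if (i >= s.size()) return 0;
  if (s[i] == 'L') {
    const std::size_t end = s.find(';', i);
    if (end == std::string::npos || end == i + 1) return 0;
    return end - pos + 1;
  }
  return std::string("BCDFIJSZ").find(s[i]) != std::string::npos ? i - pos + 1 : 0;
}

inline MethodDescriptor parse_method_descriptor(const std::string& d) {
  MethodDescriptor out;
  if (d.empty() || d[0] != '(') return out;
  std::size_t i = 1;
  while (i < d.size() && d[i] != ')') {
    const std::size_t n = field_descriptor_length(d, i);
    if (n == 0) return out;
    out.params.push_back(d.substr(i, n));
    i += n;
  }
  if (i >= d.size()) return out;  // missing ')'
  const std::string ret = d.substr(i + 1);
  if (ret.empty()) return out;
  if (ret != "V" && field_descriptor_length(ret, 0) != ret.size()) return out;
  out.ret = ret;
  out.ok = true;
  return out;
}
```

Feeding it a descriptor with a comma between the parameters fails at the comma, which is a handy reminder that descriptors contain no separators.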
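void <clinit>()
This method is added automatically by the compiler; it initialises static variables and runs static blocks.
<init>
Constructors are methods too. The compiler emits them under the name <init> and generates a default one when the class declares none.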
Attributes
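In a class file, the class itself, the field table and the method table can each carry their own set of attributes, used to describe information specific to certain scenarios.
Unlike the other items in a class file, which have strict requirements on length, order and format, the attribute set does not require its attributes to appear in any particular order. As long as the name does not clash with an existing attribute, any compiler implementation may write attributes of its own into the table. At run time the virtual machine ignores attributes it does not recognise.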
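What makes it possible to ignore unknown attributes is that every attribute starts with the same six-byte header: a u2 name index followed by a u4 length. A reader can therefore step over a payload without interpreting it. The sketch below illustrates that idea on an in-memory buffer; AttributeHeader and scan_attributes are hypothetical names, not HotSpot code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct AttributeHeader {
  std::uint16_t name_index;  // constant-pool index of the attribute name
  std::uint32_t length;      // number of payload bytes after the header
};

// Hypothetical reader: records each attribute header and skips its payload.
// Returns false when the buffer ends before the declared data does.
inline bool scan_attributes(const std::vector<std::uint8_t>& buf,
                            std::uint16_t attributes_count,
                            std::vector<AttributeHeader>* out) {
  assert(out != nullptr);
  std::size_t pos = 0;
  for (std::uint16_t n = 0; n < attributes_count; ++n) {
    if (buf.size() - pos < 6) return false;  // header truncated
    AttributeHeader h;
    h.name_index = static_cast<std::uint16_t>((buf[pos] << 8) | buf[pos + 1]);
    h.length = (std::uint32_t(buf[pos + 2]) << 24) | (std::uint32_t(buf[pos + 3]) << 16) |
               (std::uint32_t(buf[pos + 4]) << 8)  |  std::uint32_t(buf[pos + 5]);
    pos += 6;
    if (buf.size() - pos < h.length) return false;  // payload truncated
    pos += h.length;  // step over the payload without interpreting it
    out->push_back(h);
  }
  return true;
}
```

A real parser would look up name_index in the constant pool and dispatch on the name, falling back to this skip for anything it does not know.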
Common attributes
The attributes below are some of those defined by the JVM specification; judging from vmSymbols, OpenJDK currently supports more.
Code
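Location: method table
Meaning: the bytecode instructions compiled from Java source; only methods that have a body have a Code attribute
Structure of the Code attribute
Type | Name | Count | Meaning |
---|---|---|---|
u2 | attribute_name_index | 1 | |
u4 | attribute_length | 1 | |
u2 | max_stack | 1 | maximum operand stack depth; at run time the VM sizes the frame's operand stack from this value |
u2 | max_locals | 1 | number of slots needed for local variables; different variables can reuse a slot |
u4 | code_length | 1 | length of the compiled bytecode |
u1 | code | code_length | bytecode instructions |
u2 | exception_table_length | 1 | |
exception_info | exception_table | exception_table_length | |
u2 | attributes_count | 1 | |
attribute_info | attributes | attributes_count | |
Although code_length is declared as a u4, the JVM specification does not allow a method's code to exceed 65535 bytes.
Code also involves the instruction set; see the later article in this series, "揭祕java虛擬機(三)JVM指令集" (the JVM instruction set).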
ConstantValue
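Location: field table
Meaning: the constant value of a field declared static (static is what the JVM requires; javac additionally imposes final)
Structure of the ConstantValue attribute
Type | Name | Count |
---|---|---|
u2 | attribute_name_index | 1 |
u4 | attribute_length | 1 |
u2 | constantvalue_index | 1 |
attribute_length is always 0x00000002. ConstantValue only supports the primitive types and String.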
Deprecated
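Location: class file, field table, method table
Meaning: a class, field or method declared as deprecated
Exceptions
Location: method table
Meaning: the checked exceptions declared by the method
Type | Name | Count |
---|---|---|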
u2 | attribute_name_index | 1 |
u4 | attribute_length | 1 |
u2 | number_of_exceptions | 1 |
u2 | exception_index_table | number_of_exceptions |
InnerClasses
Location: class file
Meaning: inner classes
Structure of the InnerClasses attribute
Type | Name | Count |
---|---|---|
u2 | attribute_name_index | 1 |
u4 | attribute_length | 1 |
u2 | number_of_classes | 1 |
inner_classes_info | inner_classes | number_of_classes |
Structure of inner_classes_info
Type | Name | Count |
---|---|---|
u2 | inner_class_info_index | 1 |
u2 | outer_class_index | 1 |
u2 | inner_name_index | 1 |
u2 | inner_class_access_flags | 1 |
The book gets this wrong. Judging by the source code and by the meaning, the last row should be inner_class_access_flags, whereas the book writes inner_name_access_flags and then repeats the mistake further down.
LineNumberTable
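Location: Code
Meaning: the mapping between Java source line numbers and bytecode instructions
The book is wrong here once more. This time it is only a spelling mistake, but one that has been copied and pasted, or batch-replaced, many times over.
Structure of the LineNumberTable attribute
Type | Name | Count |
---|---|---|
u2 | attribute_name_index | 1 |
u4 | attribute_length | 1 |
u2 | line_number_table_length | 1 |
line_number_info | line_number_table | line_number_table_length |
Structure of line_number_info
Type | Name | Count | Description |
---|---|---|---|
u2 | start_pc | 1 | bytecode offset |
u2 | line_number | 1 | Java source line number |
This attribute is used for breakpoint debugging and when printing exception stack traces.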
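To show how such a table is consulted when a stack trace is printed, here is a minimal lookup: given a bytecode offset, pick the entry with the greatest start_pc that does not exceed it. This is a sketch of the idea only; LineNumberEntry and line_for_pc are hypothetical names, and a linear scan is used so that the entries do not have to be sorted.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct LineNumberEntry {
  std::uint16_t start_pc;     // bytecode offset where the source line begins
  std::uint16_t line_number;  // line in the Java source file
};

// Returns the source line for a bytecode offset, or -1 if no entry covers it.
inline int line_for_pc(const std::vector<LineNumberEntry>& table, std::uint16_t pc) {
  int best_pc = -1;
  int line = -1;
  for (const LineNumberEntry& e : table) {
    // Keep the entry with the largest start_pc that is still <= pc.
    if (e.start_pc <= pc && static_cast<int>(e.start_pc) > best_pc) {
      best_pc = e.start_pc;
      line = e.line_number;
    }
  }
  return line;
}
```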
LocalVariableTable
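Location: Code
Meaning: the local variables of the method
According to Option.java, whether LocalVariableTable is emitted into the class file can be controlled in javac with -g:none (off) and -g:vars (on); by default it is not emitted.
Structure of the LocalVariableTable attribute
Type | Name | Count |
---|---|---|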
u2 | attribute_name_index | 1 |
u4 | attribute_length | 1 |
u2 | local_variable_table_length | 1 |
local_variable_info | local_variable_table | local_variable_table_length |
SourceFile
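Location: class file
Meaning: the name of the source file
Again according to Option.java, emission of this attribute is controlled with -g:none or -g:source.
Structure of the SourceFile attribute
Type | Name | Count |
---|---|---|
u2 | attribute_name_index | 1 |
u4 | attribute_length | 1 |
u2 | sourcefile_index | 1 |
Synthetic
Location: class file, method table, field table
Meaning: the class, method or field was generated by the compiler; clinit and init are exceptions
classFileParser.cpp
Finally the class-level attributes are read: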
// Additional attributes/annotations
_parsed_annotations = new ClassAnnotationCollector();
parse_classfile_attributes(stream, cp, _parsed_annotations, CHECK);
assert(_inner_classes != NULL, "invariant");
// Finalize the Annotations metadata object,
// now that all annotation arrays have been created.
create_combined_annotations(CHECK);
Last of all, the parser makes sure the stream has been fully consumed:
// Make sure this is the end of class file stream
guarantee_property(stream->at_eos(),
"Extra bytes at the end of class file %s",
CHECK);
// all bytes in stream read and parsed
oop-klass
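When programs written in C, C++, Delphi and similar languages are compiled to native binaries, the high-level data structures they defined cease to exist. When an operating system such as Windows or Linux (the host) loads such a binary, it does not load any of those structures; the host has no idea which data structures or classes were originally defined, because all of them have been turned into offsets into particular memory segments. A C struct, for example, disappears after compilation: assembly and machine language have no corresponding concept, and the CPU certainly does not know what a struct is. Classes in C++ and Delphi likewise disappear after compilation; a class ends up as nothing more than a starting address in memory. The JVM, by contrast, records the original information (metadata) of every type defined in the bytecode when it loads a program. It knows which classes the program contains and which fields, methods, superclass and so on belong to each of them. This is the biggest difference between the JVM and an operating system.
oop: ordinary object pointer; describes an object instance and is generally stored in the heap
klass: describes a Java class and is the VM-internal counterpart of a Java type; generally stored in Metaspace (PermGen before JDK 8)
According to the comments in the latest oopsHierarchy.hpp, the overall hierarchy is split into the following three parts.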
OBJECT hierarchy
The declarations are as follows:
// OBJECT hierarchy
// This hierarchy is a representation hierarchy, i.e. if A is a superclass
// of B, A's representation is a prefix of B's representation.
typedef class oopDesc* oop;
typedef class instanceOopDesc* instanceOop;
typedef class arrayOopDesc* arrayOop;
typedef class objArrayOopDesc* objArrayOop;
typedef class typeArrayOopDesc* typeArrayOop;
- oopDesc
This class is defined in oop.hpp. It has many methods but very few fields, as shown below:
class oopDesc {
friend class VMStructs;
friend class JVMCIVMStructs;
private:
volatile markWord _mark;
union _metadata {
Klass* _klass;
narrowKlass _compressed_klass;
} _metadata;
......
}
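The _mark field records the various states of the object, such as locking, generational age and thread information. _metadata, as can be seen, points to the Klass. The class also has a large number of methods, omitted here. The narrowKlass above is used for pointer compression; the related definitions in oopsHierarchy.hpp are: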
typedef juint narrowOop; // Offset instead of address for an oop within a java object
// If compressed klass pointers then use narrowKlass.
typedef juint narrowKlass;
juint in turn is a 32-bit unsigned integer, the same type that globalDefinitions.hpp uses for u4.
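The comment "Offset instead of address" is the key to how a 32-bit value can stand in for a 64-bit pointer. The general technique is to store a scaled offset from a base address and to rebuild the pointer as base + (narrow << shift). The sketch below illustrates only that arithmetic; NarrowEncoding, encode_narrow and decode_narrow are hypothetical names, and the base and shift HotSpot actually picks are decided at startup and are not shown in this article.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical encoding parameters, assumed for illustration.
struct NarrowEncoding {
  std::uintptr_t base;  // start of the address range being compressed
  unsigned shift;       // log2 of the alignment of the addresses
};

// Compresses an aligned address inside the range into a 32-bit value.
inline std::uint32_t encode_narrow(const NarrowEncoding& enc, std::uintptr_t addr) {
  assert(addr >= enc.base);
  const std::uintptr_t offset = addr - enc.base;
  assert((offset & ((std::uintptr_t(1) << enc.shift) - 1)) == 0);  // must be aligned
  assert((offset >> enc.shift) <= 0xFFFFFFFFu);                    // must fit in 32 bits
  return static_cast<std::uint32_t>(offset >> enc.shift);
}

// Rebuilds the full address from the 32-bit value.
inline std::uintptr_t decode_narrow(const NarrowEncoding& enc, std::uint32_t narrow) {
  return enc.base + (static_cast<std::uintptr_t>(narrow) << enc.shift);
}
```

With a shift of 3, a 32-bit value can address 32 GB worth of 8-byte-aligned locations above the base, which is why alignment and compression go hand in hand.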
metadata hierarchy
// The metadata hierarchy is separate from the oop hierarchy
// class MetaspaceObj
class ConstMethod;
class ConstantPoolCache;
class MethodData;
// class Metadata
class Method;
class ConstantPool;
// class CHeapObj
class CompiledICHolder;
klass hierarchy
// The klass hierarchy is separate from the oop hierarchy.
class Klass;
class InstanceKlass;
class InstanceMirrorKlass;
class InstanceClassLoaderKlass;
class InstanceRefKlass;
class ArrayKlass;
class ObjArrayKlass;
class TypeArrayKlass;