Demystifying the Java Virtual Machine (Part 2): Bytecode

The class file

Most of the work happens in the parse_stream method in classFileParser.cpp.

MagicNumber

classFileParser.cpp

u4: the well-known 0xCAFEBABE. If the first four bytes are anything else, the VM refuses to load the class. The constant is defined in classFileParser.cpp:

#define JAVA_CLASSFILE_MAGIC              0xCAFEBABE

It is used in parse_stream, which is called from the ClassFileParser constructor:

void ClassFileParser::parse_stream(const ClassFileStream* const stream,
                                   TRAPS) {
  assert(stream != NULL, "invariant");
  assert(_class_name != NULL, "invariant");
  // BEGIN STREAM PARSING
  stream->guarantee_more(8, CHECK);  // magic, major, minor
  // Magic value
  const u4 magic = stream->get_u4_fast();
  guarantee_property(magic == JAVA_CLASSFILE_MAGIC,
                     "Incompatible magic value %u in class file %s",
                     magic, CHECK);

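Big-endian and little-endian

Little-endian: the least significant byte is stored at the lowest memory address and the most significant byte at the highest (in written notation the leftmost digit is the most significant; in 12, the 1 is the high-order digit).
Big-endian: the reverse; the most significant byte is stored at the lowest memory address.
Network byte order: every layer of TCP/IP uses big-endian.
JVM: big-endian, so class files are big-endian as well. C/C++, however, generally follows the host platform, so the bytes must be converted when a class file is read.
Platform dependence shows up throughout the reading of the stream. Taking the read of CAFEBABE as the example, we need to look inside get_u4_fast(). The code below is from classFileStream.hpp. The _fast suffix means that no bounds check is performed (u1 and u2 also have checked, non-fast versions, but u4 has only the fast version).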

// Read u4 from stream
  u4 get_u4_fast() const {
    u4 res = Bytes::get_Java_u4((address)_current);
    _current += 4;
    return res;
  }

get_Java_u4 itself differs from CPU to CPU. Here are a few examples (taken from different .hpp files):

// arm
static inline u4 get_Java_u4(address p) {
    return u4(p[0]) << 24 |
           u4(p[1]) << 16 |
           u4(p[2]) <<  8 |
           u4(p[3]);
  }

//aarch64
static inline u4   get_Java_u4(address p)           { return swap_u4(get_native_u4(p)); }
static inline u4   get_native_u4(address p)         { return *(u4*)p; }
static inline u4   swap_u4(u4 x);                   // compiler-dependent implementation
//linux
inline u4   Bytes::swap_u4(u4 x) {
  return bswap_32(x);
}
inline static u4 bswap_32(u4 x) {
    return ((x & 0xFF) << 24) |
           ((x & 0xFF00) << 8) |
           ((x >> 8) & 0xFF00) |
           ((x >> 24) & 0xFF);
}
//bytes_linux_x86.inline.hpp
inline u4   Bytes::swap_u4(u4 x) {
#ifdef AMD64
  return bswap_32(x);
#else
  u4 ret;
  __asm__ __volatile__ (
    "bswap %0"
    :"=r" (ret)      // output : register 0 => ret
    :"0"  (x)        // input  : x => register 0
    :"0"             // clobbered register
  );
  return ret;
#endif // AMD64
}

//x86
static inline u4   get_Java_u4(address p)           { return get_Java<u4>(p); }

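Note that every call in this chain is platform-dependent (CPU and OS): different platforms run different code. Personally, I think the final bswap_32 could still be optimized a little.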
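To make the byte-order point concrete, here is a minimal portable sketch in the style of the arm variant above: it assembles the value with shifts, so the result does not depend on the host's byte order. The names read_u4_be and has_class_magic are hypothetical and do not exist in HotSpot.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not HotSpot code): build a big-endian u4 from four
// bytes using shifts only, so the result is the same on any host.
inline std::uint32_t read_u4_be(const std::uint8_t* p) {
    assert(p != nullptr);
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8)  |
            static_cast<std::uint32_t>(p[3]);
}

// Returns true when the buffer holds at least four bytes and starts with
// the class file magic number.
inline bool has_class_magic(const std::uint8_t* data, std::size_t len) {
    constexpr std::uint32_t kMagic = 0xCAFEBABEu;
    return len >= 4 && read_u4_be(data) == kMagic;
}
```

Unlike get_u4_fast, this sketch checks the length before reading, which is roughly the job that guarantee_more does for the caller in parse_stream.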

Version

classFileParser.cpp

Next come two u2 values, in this order: minor_version, then major_version.

// Version numbers
  _minor_version = stream->get_u2_fast();
  _major_version = stream->get_u2_fast();
  if (DumpSharedSpaces && _major_version < JAVA_6_VERSION) {
    ResourceMark rm;
    warning("Pre JDK 6 class not supported by CDS: %u.%u %s",
            _major_version,  _minor_version, _class_name->as_C_string());
    Exceptions::fthrow(
      THREAD_AND_LOCATION,
      vmSymbols::java_lang_UnsupportedClassVersionError(),
      "Unsupported major.minor version for dump time %u.%u",
      _major_version,
      _minor_version);
  }
  // Check version numbers - we check this even with verifier off
  verify_class_version(_major_version, _minor_version, _class_name, CHECK);

The actual check is performed by verify_class_version:

// A legal major_version.minor_version must be one of the following:
//  Major_version >= 45 and major_version < 56, any minor_version.
//  Major_version >= 56 and major_version <= JVM_CLASSFILE_MAJOR_VERSION and minor_version = 0.
//  Major_version = JVM_CLASSFILE_MAJOR_VERSION and minor_version = 65535 and --enable-preview is present.
static void verify_class_version(u2 major, u2 minor, Symbol* class_name, TRAPS){
  ResourceMark rm(THREAD);
  const u2 max_version = JVM_CLASSFILE_MAJOR_VERSION;
  if (major < JAVA_MIN_SUPPORTED_VERSION) {
    Exceptions::fthrow(
      THREAD_AND_LOCATION,
      vmSymbols::java_lang_UnsupportedClassVersionError(),
      "%s (class file version %u.%u) was compiled with an invalid major version",
      class_name->as_C_string(), major, minor);
    return;
  }
  if (major > max_version) {
    Exceptions::fthrow(
      THREAD_AND_LOCATION,
      vmSymbols::java_lang_UnsupportedClassVersionError(),
      "%s has been compiled by a more recent version of the Java Runtime (class file version %u.%u), "
      "this version of the Java Runtime only recognizes class file versions up to %u.0",
      class_name->as_C_string(), major, minor, JVM_CLASSFILE_MAJOR_VERSION);
    return;
  }
  if (major < JAVA_12_VERSION || minor == 0) {
    return;
  }
  if (minor == JAVA_PREVIEW_MINOR_VERSION) {
    if (major != max_version) {
      Exceptions::fthrow(
        THREAD_AND_LOCATION,
        vmSymbols::java_lang_UnsupportedClassVersionError(),
        "%s (class file version %u.%u) was compiled with preview features that are unsupported. "
        "This version of the Java Runtime only recognizes preview features for class file version %u.%u",
        class_name->as_C_string(), major, minor, JVM_CLASSFILE_MAJOR_VERSION, JAVA_PREVIEW_MINOR_VERSION);
      return;
    }
    if (!Arguments::enable_preview()) {
      Exceptions::fthrow(
        THREAD_AND_LOCATION,
        vmSymbols::java_lang_UnsupportedClassVersionError(),
        "Preview features are not enabled for %s (class file version %u.%u). Try running with '--enable-preview'",
        class_name->as_C_string(), major, minor);
      return;
    }
  } else { // minor != JAVA_PREVIEW_MINOR_VERSION
    Exceptions::fthrow(
        THREAD_AND_LOCATION,
        vmSymbols::java_lang_UnsupportedClassVersionError(),
        "%s (class file version %u.%u) was compiled with an invalid non-zero minor version",
        class_name->as_C_string(), major, minor);
  }
}

Common versions

  • 49:1.5
  • 50:1.6
  • 51:1.7
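The three rules in the comment above verify_class_version can be condensed into one predicate. The following is only a sketch under assumptions: is_legal_version and the k-prefixed constants are hypothetical names, max_major stands in for JVM_CLASSFILE_MAJOR_VERSION, and preview_enabled stands in for Arguments::enable_preview().

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constants; the values follow the comment on
// verify_class_version (45 is the oldest major version, 56 is Java 12).
constexpr std::uint16_t kMinMajor     = 45;
constexpr std::uint16_t kFirstStrict  = 56;     // from here on minor must be 0 or 65535
constexpr std::uint16_t kPreviewMinor = 65535;  // class compiled with preview features

// Returns true if major.minor would pass the checks shown above.
bool is_legal_version(std::uint16_t major, std::uint16_t minor,
                      std::uint16_t max_major, bool preview_enabled) {
    assert(max_major >= kFirstStrict);
    if (major < kMinMajor || major > max_major) return false;  // too old or too new
    if (major < kFirstStrict || minor == 0) return true;       // any minor before 56
    if (minor != kPreviewMinor) return false;                  // invalid non-zero minor
    return major == max_major && preview_enabled;              // preview class
}
```

For example, with max_major set to 60, version 52.0 is accepted, 57.1 is rejected, and 60.65535 is accepted only when preview features are enabled.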

Constant_pool

classFileParser.cpp

First a u2 is read as the constant pool size (constant_pool_count in the JVM specification, cp_size in the code):

  stream->guarantee_more(3, CHECK); // length, first cp tag
  u2 cp_size = stream->get_u2_fast();
  guarantee_property(
    cp_size >= 1, "Illegal constant pool size %u in class file %s",
    cp_size, CHECK);
  _orig_cp_size = cp_size;
  if (is_hidden()) { // Add a slot for hidden class name.
    assert(_max_num_patched_klasses == 0, "Sanity check");
    cp_size++;
  } else {
    if (int(cp_size) + _max_num_patched_klasses > 0xffff) {
      THROW_MSG(vmSymbols::java_lang_InternalError(), "not enough space for patched classes");
    }
    cp_size += _max_num_patched_klasses;
  }

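is_hidden() reads the _is_hidden field declared in classFileParser.hpp; that field appears to be set in the ClassFileParser constructor from information supplied by the class loading code.
Only after this is constant_pool[cp_size] itself read: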

  _cp = ConstantPool::allocate(_loader_data,cp_size,CHECK);
  ConstantPool* const cp = _cp;
  parse_constant_pool(stream, cp, _orig_cp_size, CHECK);
  assert(cp_size == (const u2)cp->length(), "invariant");

As shown, ConstantPool::allocate is called to obtain cp, and parse_constant_pool then reads the entries one by one. The relevant code:

ConstantPool* ConstantPool::allocate(ClassLoaderData* loader_data, int length, TRAPS) {
  Array<u1>* tags = MetadataFactory::new_array<u1>(loader_data, length, 0, CHECK_NULL);
  int size = ConstantPool::size(length);
  return new (loader_data, size, MetaspaceObj::ConstantPoolType, THREAD) ConstantPool(tags);
}

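The size method above mainly performs alignment.
Note: the biggest pitfall here is that the class file actually contains cp_size - 1 entries, because entry 0 does not exist; valid indices run from 1 to cp_size - 1.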

Constant Pool Entries

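The constants are all named JVM_CONSTANT_*; the prefix is dropped below. tag is a u1, length is a u2, each index is a u2, and the size of bytes is given in the table (the definitions come from classfile_constants.h.template). injector.h contains similar definitions, but according to its description it is used for class file transformation.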

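Type                tag  length  bytes                    first index (points to)                    second index (points to)
Utf8                1    u2      string of length bytes   -                                          -
Unicode             2
Integer             3    -       u4, big-endian int       -                                          -
Float               4    -       u4, big-endian float     -                                          -
Long                5    -       u8, big-endian long      -                                          -
Double              6    -       u8, big-endian double    -                                          -
Class               7    -       -                        fully qualified name constant              -
String              8    -       -                        string literal constant                    -
Fieldref            9    -       -                        Class that declares the field              NameAndType (field name and descriptor)
Methodref           10   -       -                        Class that declares the method             NameAndType (name and type)
InterfaceMethodref  11   -       -                        interface (Class) that declares the method NameAndType
NameAndType         12   -       -                        field/method name constant                 field/method descriptor constant
MethodHandle        15
MethodType          16
Dynamic             17
InvokeDynamic       18
Module              19
Package             20
ExternalMax         20   (not an entry type; marks the largest tag value)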
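The table and the "entry 0 does not exist" pitfall can be tied together in a small scanner that walks the raw entries and records one tag per slot. This is a sketch, not HotSpot's parse_constant_pool: scan_constant_pool and fixed_payload_size are hypothetical names, the payload sizes for tags 15 to 20 are taken from the JVM specification rather than from the table above, and the fact that Long and Double occupy two slots also comes from the specification.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes that follow the u1 tag for fixed-size entries.
// Returns -1 for Utf8 (variable length) and for unknown tags.
int fixed_payload_size(std::uint8_t tag) {
    switch (tag) {
        case 3: case 4:           return 4;  // Integer, Float
        case 5: case 6:           return 8;  // Long, Double
        case 7: case 8:           return 2;  // Class, String
        case 9: case 10: case 11: return 4;  // Fieldref, Methodref, InterfaceMethodref
        case 12:                  return 4;  // NameAndType
        case 15:                  return 3;  // MethodHandle
        case 16:                  return 2;  // MethodType
        case 17: case 18:         return 4;  // Dynamic, InvokeDynamic
        case 19: case 20:         return 2;  // Module, Package
        default:                  return -1;
    }
}

// Walks the entries that follow the u2 count. The result has cp_size slots;
// slot 0 is unused and stays 0. Returns an empty vector on malformed input.
std::vector<std::uint8_t> scan_constant_pool(const std::uint8_t* p, std::size_t len,
                                             std::uint16_t cp_size) {
    assert(p != nullptr && cp_size >= 1);
    std::vector<std::uint8_t> tags(cp_size, 0);
    std::size_t pos = 0;
    for (std::size_t index = 1; index < cp_size; ++index) {  // starts at 1, not 0
        if (pos >= len) return {};
        const std::uint8_t tag = p[pos++];
        std::size_t payload = 0;
        if (tag == 1) {  // Utf8: u2 length, then that many bytes
            if (len - pos < 2) return {};
            payload = 2 + ((static_cast<std::size_t>(p[pos]) << 8) | p[pos + 1]);
        } else {
            const int fixed = fixed_payload_size(tag);
            if (fixed < 0) return {};
            payload = static_cast<std::size_t>(fixed);
        }
        if (len - pos < payload) return {};
        pos += payload;
        tags[index] = tag;
        if (tag == 5 || tag == 6) {  // Long and Double take two slots
            if (index + 1 >= cp_size) return {};
            ++index;
        }
    }
    return tags;
}
```

With cp_size equal to 5 and the entries Utf8 "I", a Long and a Class, the result has five slots: slot 0 is empty, the Long sits in slot 2, slot 3 is unusable, and the Class lands in slot 4.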

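For a variable declared as int in Java, the type is also recorded in the class file as a string (a Utf8 constant) whose value is I, that is, 0x49 in hexadecimal. I did not expect the JVM to make no distinction here between primitive types and objects. Also, the strings that denote classes mostly begin with L.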

Access_flag

classFileParser.cpp

  // ACCESS FLAGS
  stream->guarantee_more(8, CHECK);  // flags, this_class, super_class, infs_len
  // Access flags
  jint flags;
  // JVM_ACC_MODULE is defined in JDK-9 and later.
  if (_major_version >= JAVA_9_VERSION) {
    flags = stream->get_u2_fast() & (JVM_RECOGNIZED_CLASS_MODIFIERS | JVM_ACC_MODULE);
  } else {
    flags = stream->get_u2_fast() & JVM_RECOGNIZED_CLASS_MODIFIERS;
  }
  if ((flags & JVM_ACC_INTERFACE) && _major_version < JAVA_6_VERSION) {
    // Set abstract bit for old class files for backward compatibility
    flags |= JVM_ACC_ABSTRACT;
  }
  verify_legal_class_modifiers(flags, CHECK);
  short bad_constant = class_bad_constant_seen();
  if (bad_constant != 0) {
    // Do not throw CFE until after the access_flags are checked because if
    // ACC_MODULE is set in the access flags, then NCDFE must be thrown, not CFE.
    classfile_parse_error("Unknown constant tag %u in class file %s", bad_constant, CHECK);
  }
  _access_flags.set_flags(flags);

JVM_RECOGNIZED_CLASS_MODIFIERS, used above, comes from jvm.h:

#define JVM_RECOGNIZED_CLASS_MODIFIERS (JVM_ACC_PUBLIC | \
                                        JVM_ACC_FINAL | \
                                        JVM_ACC_SUPER | \
                                        JVM_ACC_INTERFACE | \
                                        JVM_ACC_ABSTRACT | \
                                        JVM_ACC_ANNOTATION | \
                                        JVM_ACC_ENUM | \
                                        JVM_ACC_SYNTHETIC)

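JVM_ACC_SYNTHETIC marks a class that was not generated from user source code. In addition, classes produced by compilers since JDK 1.0.2 always have JVM_ACC_SUPER set.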
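The masking logic for access flags can be reproduced in a few lines. This is a sketch under assumptions: the flag values are the ones the JVM specification gives for ClassFile.access_flags, the version numbers 50 and 53 stand in for JAVA_6_VERSION and JAVA_9_VERSION, and normalize_class_flags is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Class-level flag values from the JVM specification.
constexpr std::uint16_t ACC_PUBLIC     = 0x0001;
constexpr std::uint16_t ACC_FINAL      = 0x0010;
constexpr std::uint16_t ACC_SUPER      = 0x0020;
constexpr std::uint16_t ACC_INTERFACE  = 0x0200;
constexpr std::uint16_t ACC_ABSTRACT   = 0x0400;
constexpr std::uint16_t ACC_SYNTHETIC  = 0x1000;
constexpr std::uint16_t ACC_ANNOTATION = 0x2000;
constexpr std::uint16_t ACC_ENUM       = 0x4000;
constexpr std::uint16_t ACC_MODULE     = 0x8000;

constexpr std::uint16_t kRecognized =
    ACC_PUBLIC | ACC_FINAL | ACC_SUPER | ACC_INTERFACE |
    ACC_ABSTRACT | ACC_ANNOTATION | ACC_ENUM | ACC_SYNTHETIC;

// Drops unrecognized bits and applies the backward-compatibility rule for
// interfaces in pre-Java-6 class files, as in the excerpt above.
std::uint16_t normalize_class_flags(std::uint16_t raw, std::uint16_t major) {
    assert(major >= 45);
    std::uint16_t mask = kRecognized;
    if (major >= 53) {  // ACC_MODULE is only recognized from Java 9 on
        mask = static_cast<std::uint16_t>(mask | ACC_MODULE);
    }
    std::uint16_t flags = static_cast<std::uint16_t>(raw & mask);
    if ((flags & ACC_INTERFACE) != 0 && major < 50) {
        flags = static_cast<std::uint16_t>(flags | ACC_ABSTRACT);
    }
    return flags;
}
```

A typical public class has raw flags 0x0021 (ACC_PUBLIC | ACC_SUPER), which pass through unchanged; a bit such as 0x0008, which has no meaning at class level, is silently masked off rather than rejected.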

This_class

First a u2 is read as an index into constant_pool; everything after that is validation. The full code:


  // This class and superclass
  _this_class_index = stream->get_u2_fast();
  check_property(
    valid_cp_range(_this_class_index, cp_size) &&
      cp->tag_at(_this_class_index).is_unresolved_klass(),
    "Invalid this class index %u in constant pool in class file %s",
    _this_class_index, CHECK);

  Symbol* const class_name_in_cp = cp->klass_name_at(_this_class_index);
  assert(class_name_in_cp != NULL, "class_name can't be null");

  // Don't need to check whether this class name is legal or not.
  // It has been checked when constant pool is parsed.
  // However, make sure it is not an array type.
  if (_need_verify) {
    guarantee_property(class_name_in_cp->char_at(0) != JVM_SIGNATURE_ARRAY,
                       "Bad class name in class file %s",
                       CHECK);
  }

#ifdef ASSERT
  // Basic sanity checks
  assert(!(_is_hidden && (_unsafe_anonymous_host != NULL)), "mutually exclusive variants");

  if (_unsafe_anonymous_host != NULL) {
    assert(_class_name == vmSymbols::unknown_class_name(), "A named anonymous class???");
  }
  if (_is_hidden) {
    assert(_class_name != vmSymbols::unknown_class_name(), "hidden classes should have a special name");
  }
#endif

  // Update the _class_name as needed depending on whether this is a named,
  // un-named, hidden or unsafe-anonymous class.

  if (_is_hidden) {
    assert(_class_name != NULL, "Unexpected null _class_name");
#ifdef ASSERT
    if (_need_verify) {
      verify_legal_class_name(_class_name, CHECK);
    }
#endif

  // NOTE: !_is_hidden does not imply "findable" as it could be an old-style
  //       "hidden" unsafe-anonymous class

  // If this is an anonymous class fix up its name if it is in the unnamed
  // package.  Otherwise, throw IAE if it is in a different package than
  // its host class.
  } else if (_unsafe_anonymous_host != NULL) {
    update_class_name(class_name_in_cp);
    fix_unsafe_anonymous_class_name(CHECK);

  } else {
    // Check if name in class file matches given name
    if (_class_name != class_name_in_cp) {
      if (_class_name != vmSymbols::unknown_class_name()) {
        ResourceMark rm(THREAD);
        Exceptions::fthrow(THREAD_AND_LOCATION,
                           vmSymbols::java_lang_NoClassDefFoundError(),
                           "%s (wrong name: %s)",
                           class_name_in_cp->as_C_string(),
                           _class_name->as_C_string()
                           );
        return;
      } else {
        // The class name was not known by the caller so we set it from
        // the value in the CP.
        update_class_name(class_name_in_cp);
      }
      // else nothing to do: the expected class name matches what is in the CP
    }
  }
  // Verification prevents us from creating names with dots in them, this
  // asserts that that's the case.
  assert(is_internal_format(_class_name), "external class name format used internally");
  if (!is_internal()) {
    LogTarget(Debug, class, preorder) lt;
    if (lt.is_enabled()){
      ResourceMark rm(THREAD);
      LogStream ls(lt);
      ls.print("%s", _class_name->as_klass_external_name());
      if (stream->source() != NULL) {
        ls.print(" source: %s", stream->source());
      }
      ls.cr();
    }
#if INCLUDE_CDS
    if (DumpLoadedClassList != NULL && stream->source() != NULL && classlist_file->is_open()) {
      if (!ClassLoader::has_jrt_entry()) {
        warning("DumpLoadedClassList and CDS are not supported in exploded build");
        DumpLoadedClassList = NULL;
      } else if (SystemDictionaryShared::is_sharing_possible(_loader_data) &&
                 !_is_hidden &&
                 _unsafe_anonymous_host == NULL) {
        // Only dump the classes that can be stored into CDS archive.
        // Hidden and unsafe anonymous classes such as generated LambdaForm classes are also not included.
        oop class_loader = _loader_data->class_loader();
        ResourceMark rm(THREAD);
        bool skip = false;
        if (class_loader == NULL || SystemDictionary::is_platform_class_loader(class_loader)) {
          // For the boot and platform class loaders, skip classes that are not found in the
          // java runtime image, such as those found in the --patch-module entries.
          // These classes can't be loaded from the archive during runtime.
          if (!stream->from_boot_loader_modules_image() && strncmp(stream->source(), "jrt:", 4) != 0) {
            skip = true;
          }

          if (class_loader == NULL && ClassLoader::contains_append_entry(stream->source())) {
            // .. but don't skip the boot classes that are loaded from -Xbootclasspath/a
            // as they can be loaded from the archive during runtime.
            skip = false;
          }
        }
        if (skip) {
          tty->print_cr("skip writing class %s from source %s to classlist file",
            _class_name->as_C_string(), stream->source());
        } else {
          classlist_file->print_cr("%s", _class_name->as_C_string());
          classlist_file->flush();
        }
      }
    }
#endif
  }

Super_class

As with the code above, a u2 gives the position in constant_pool:

  // SUPERKLASS
  _super_class_index = stream->get_u2_fast();
  _super_klass = parse_super_class(cp,
                                   _super_class_index,
                                   _need_verify,
                                   CHECK);

The real work is done by parse_super_class:

const InstanceKlass* ClassFileParser::parse_super_class(ConstantPool* const cp,
       const int super_class_index,const bool need_verify,TRAPS) {
  assert(cp != NULL, "invariant");
  const InstanceKlass* super_klass = NULL;
  if (super_class_index == 0) {
    check_property(_class_name == vmSymbols::java_lang_Object(),
                   "Invalid superclass index %u in class file %s",
                   super_class_index,
                   CHECK_NULL);
  } else {
    check_property(valid_klass_reference_at(super_class_index),
                   "Invalid superclass index %u in class file %s",
                   super_class_index,
                   CHECK_NULL);
    // The class name should be legal because it is checked when parsing constant pool.
    // However, make sure it is not an array type.
    bool is_array = false;
    if (cp->tag_at(super_class_index).is_klass()) {
      super_klass = InstanceKlass::cast(cp->resolved_klass_at(super_class_index));
      if (need_verify)
        is_array = super_klass->is_array_klass();
    } else if (need_verify) {
      is_array = (cp->klass_name_at(super_class_index)->char_at(0) == JVM_SIGNATURE_ARRAY);
    }
    if (need_verify) {
      guarantee_property(!is_array,
                        "Bad superclass name in class file %s", CHECK_NULL);
    }
  }
  return super_klass;
}

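The first if branch is a special case for java.lang.Object, which is allowed to have no superclass (super_class_index == 0). For every other class the superclass reference must be validated. There are also checks that the superclass is not an array type.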

Interfaces

  // Interfaces
  _itfs_len = stream->get_u2_fast();
  parse_interfaces(stream,_itfs_len,cp,&_has_nonstatic_concrete_methods,CHECK);
  assert(_local_interfaces != NULL, "invariant");

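The function that does the actual work is parse_interfaces, shown below. I was rather astonished by the line assert(itfs_len > 0, "only called for len>0"); which sits in the else branch of the itfs_len == 0 test.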

// Side-effects: populates the _local_interfaces field
void ClassFileParser::parse_interfaces(const ClassFileStream* const stream,
                                       const int itfs_len,
                                       ConstantPool* const cp,
                                       bool* const has_nonstatic_concrete_methods,
                                       TRAPS) {
  assert(stream != NULL, "invariant");
  assert(cp != NULL, "invariant");
  assert(has_nonstatic_concrete_methods != NULL, "invariant");
  if (itfs_len == 0) {
    _local_interfaces = Universe::the_empty_instance_klass_array();
  } else {
    assert(itfs_len > 0, "only called for len>0");
    _local_interfaces = MetadataFactory::new_array<InstanceKlass*>(_loader_data, itfs_len, NULL, CHECK);
    int index;
    for (index = 0; index < itfs_len; index++) {
      const u2 interface_index = stream->get_u2(CHECK);
      Klass* interf;
      check_property(
        valid_klass_reference_at(interface_index),
        "Interface name has bad constant pool index %u in class file %s",
        interface_index, CHECK);
      if (cp->tag_at(interface_index).is_klass()) {
        interf = cp->resolved_klass_at(interface_index);
      } else {
        Symbol* const unresolved_klass  = cp->klass_name_at(interface_index);

        // Don't need to check legal name because it's checked when parsing constant pool.
        // But need to make sure it's not an array type.
        guarantee_property(unresolved_klass->char_at(0) != JVM_SIGNATURE_ARRAY,
                           "Bad interface name in class file %s", CHECK);

        // Call resolve_super so classcircularity is checked
        interf = SystemDictionary::resolve_super_or_fail(
                                                  _class_name,
                                                  unresolved_klass,
                                                  Handle(THREAD, _loader_data->class_loader()),
                                                  _protection_domain,
                                                  false,
                                                  CHECK);
      }

      if (!interf->is_interface()) {
        THROW_MSG(vmSymbols::java_lang_IncompatibleClassChangeError(),
                  err_msg("class %s can not implement %s, because it is not an interface (%s)",
                          _class_name->as_klass_external_name(),
                          interf->external_name(),
                          interf->class_in_module_of_loader()));
      }

      if (InstanceKlass::cast(interf)->has_nonstatic_concrete_methods()) {
        *has_nonstatic_concrete_methods = true;
      }
      _local_interfaces->at_put(index, InstanceKlass::cast(interf));
    }
    if (!_need_verify || itfs_len <= 1) {
      return;
    }
    // Check if there's any duplicates in interfaces
    ResourceMark rm(THREAD);
    NameSigHash** interface_names = NEW_RESOURCE_ARRAY_IN_THREAD(THREAD,
                                                                 NameSigHash*,
                                                                 HASH_ROW_SIZE);
    initialize_hashtable(interface_names);
    bool dup = false;
    const Symbol* name = NULL;
    {
      debug_only(NoSafepointVerifier nsv;)
      for (index = 0; index < itfs_len; index++) {
        const InstanceKlass* const k = _local_interfaces->at(index);
        name = k->name();
        // If no duplicates, add (name, NULL) in hashtable interface_names.
        if (!put_after_lookup(name, NULL, interface_names)) {
          dup = true;
          break;
        }
      }
    }
    if (dup) {
      classfile_parse_error("Duplicate interface name \"%s\" in class file %s",
                             name->as_C_string(), CHECK);
    }
  }
}
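The duplicate check at the end of parse_interfaces boils down to inserting each name into a hash set and stopping at the first collision. Here is a minimal sketch with the standard library, assuming interface names are plain strings; HotSpot works on Symbol* with its own NameSigHash table, and first_duplicate is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Returns the index of the first name that already appeared earlier in the
// list, or -1 if every name is unique.
int first_duplicate(const std::vector<std::string>& names) {
    assert(names.size() <= 0xFFFF);  // interfaces_count is a u2
    std::unordered_set<std::string> seen;
    for (std::size_t i = 0; i < names.size(); ++i) {
        // insert() reports through .second whether the element was new
        if (!seen.insert(names[i]).second) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

As in the HotSpot code, a list with zero or one element can never contain a duplicate, so a caller may skip the check entirely in that case.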

Fields

classFileParser.cpp

  // Fields (offsets are filled in later)
  _fac = new FieldAllocationCount();
  parse_fields(stream,
               _access_flags.is_interface(),
               _fac,
               cp,
               cp_size,
               &_java_fields_count,
               CHECK);

  assert(_fields != NULL, "invariant");

FieldAllocationCount is shown below. Each counter is a u2, so it looks as though at most 65535 fields of each allocation type are possible.

class ClassFileParser::FieldAllocationCount : public ResourceObj {
 public:
  u2 count[MAX_FIELD_ALLOCATION_TYPE];

  FieldAllocationCount() {
    for (int i = 0; i < MAX_FIELD_ALLOCATION_TYPE; i++) {
      count[i] = 0;
    }
  }
  FieldAllocationType update(bool is_static, BasicType type) {
    FieldAllocationType atype = basic_type_to_atype(is_static, type);
    if (atype != BAD_ALLOCATION_TYPE) {
      // Make sure there is no overflow with injected fields.
      assert(count[atype] < 0xFFFF, "More than 65535 fields");
      count[atype]++;
    }
    return atype;
  }
};

Below is parse_fields; note how much defensive code it contains.


// Side-effects: populates the _fields, _fields_annotations,
// _fields_type_annotations fields
void ClassFileParser::parse_fields(const ClassFileStream* const cfs,
                                   bool is_interface,
                                   FieldAllocationCount* const fac,
                                   ConstantPool* cp,
                                   const int cp_size,
                                   u2* const java_fields_count_ptr,
                                   TRAPS) {

  assert(cfs != NULL, "invariant");
  assert(fac != NULL, "invariant");
  assert(cp != NULL, "invariant");
  assert(java_fields_count_ptr != NULL, "invariant");

  assert(NULL == _fields, "invariant");
  assert(NULL == _fields_annotations, "invariant");
  assert(NULL == _fields_type_annotations, "invariant");

  cfs->guarantee_more(2, CHECK);  // length
  const u2 length = cfs->get_u2_fast();
  *java_fields_count_ptr = length;

  int num_injected = 0;
  const InjectedField* const injected = JavaClasses::get_injected(_class_name,
                                                                  &num_injected);
  const int total_fields = length + num_injected;

  // The field array starts with tuples of shorts
  // [access, name index, sig index, initial value index, byte offset].
  // A generic signature slot only exists for field with generic
  // signature attribute. And the access flag is set with
  // JVM_ACC_FIELD_HAS_GENERIC_SIGNATURE for that field. The generic
  // signature slots are at the end of the field array and after all
  // other fields data.
  //
  //   f1: [access, name index, sig index, initial value index, low_offset, high_offset]
  //   f2: [access, name index, sig index, initial value index, low_offset, high_offset]
  //       ...
  //   fn: [access, name index, sig index, initial value index, low_offset, high_offset]
  //       [generic signature index]
  //       [generic signature index]
  //       ...
  //
  // Allocate a temporary resource array for field data. For each field,
  // a slot is reserved in the temporary array for the generic signature
  // index. After parsing all fields, the data are copied to a permanent
  // array and any unused slots will be discarded.
  ResourceMark rm(THREAD);
  u2* const fa = NEW_RESOURCE_ARRAY_IN_THREAD(THREAD,
                                              u2,
                                              total_fields * (FieldInfo::field_slots + 1));
  // The generic signature slots start after all other fields' data.
  int generic_signature_slot = total_fields * FieldInfo::field_slots;
  int num_generic_signature = 0;
  for (int n = 0; n < length; n++) {
    // access_flags, name_index, descriptor_index, attributes_count
    cfs->guarantee_more(8, CHECK);

    AccessFlags access_flags;
    const jint flags = cfs->get_u2_fast() & JVM_RECOGNIZED_FIELD_MODIFIERS;
    verify_legal_field_modifiers(flags, is_interface, CHECK);
    access_flags.set_flags(flags);

    const u2 name_index = cfs->get_u2_fast();
    check_property(valid_symbol_at(name_index),
      "Invalid constant pool index %u for field name in class file %s",
      name_index, CHECK);
    const Symbol* const name = cp->symbol_at(name_index);
    verify_legal_field_name(name, CHECK);

    const u2 signature_index = cfs->get_u2_fast();
    check_property(valid_symbol_at(signature_index),
      "Invalid constant pool index %u for field signature in class file %s",
      signature_index, CHECK);
    const Symbol* const sig = cp->symbol_at(signature_index);
    verify_legal_field_signature(name, sig, CHECK);

    u2 constantvalue_index = 0;
    bool is_synthetic = false;
    u2 generic_signature_index = 0;
    const bool is_static = access_flags.is_static();
    FieldAnnotationCollector parsed_annotations(_loader_data);

    const u2 attributes_count = cfs->get_u2_fast();
    if (attributes_count > 0) {
      parse_field_attributes(cfs,
                             attributes_count,
                             is_static,
                             signature_index,
                             &constantvalue_index,
                             &is_synthetic,
                             &generic_signature_index,
                             &parsed_annotations,
                             CHECK);
      if (parsed_annotations.field_annotations() != NULL) {
        if (_fields_annotations == NULL) {
          _fields_annotations = MetadataFactory::new_array<AnnotationArray*>(
                                             _loader_data, length, NULL,
                                             CHECK);
        }
        _fields_annotations->at_put(n, parsed_annotations.field_annotations());
        parsed_annotations.set_field_annotations(NULL);
      }
      if (parsed_annotations.field_type_annotations() != NULL) {
        if (_fields_type_annotations == NULL) {
          _fields_type_annotations =
            MetadataFactory::new_array<AnnotationArray*>(_loader_data,
                                                         length,
                                                         NULL,
                                                         CHECK);
        }
        _fields_type_annotations->at_put(n, parsed_annotations.field_type_annotations());
        parsed_annotations.set_field_type_annotations(NULL);
      }
      if (is_synthetic) {
        access_flags.set_is_synthetic();
      }
      if (generic_signature_index != 0) {
        access_flags.set_field_has_generic_signature();
        fa[generic_signature_slot] = generic_signature_index;
        generic_signature_slot ++;
        num_generic_signature ++;
      }
    }
    FieldInfo* const field = FieldInfo::from_field_array(fa, n);
    field->initialize(access_flags.as_short(),
                      name_index,
                      signature_index,
                      constantvalue_index);
    const BasicType type = cp->basic_type_for_signature_at(signature_index);
    // Remember how many oops we encountered and compute allocation type
    const FieldAllocationType atype = fac->update(is_static, type);
    field->set_allocation_type(atype);
    // After field is initialized with type, we can augment it with aux info
    if (parsed_annotations.has_any_annotations()) {
      parsed_annotations.apply_to(field);
      if (field->is_contended()) {
        _has_contended_fields = true;
      }
    }
  }
  int index = length;
  if (num_injected != 0) {
    for (int n = 0; n < num_injected; n++) {
      // Check for duplicates
      if (injected[n].may_be_java) {
        const Symbol* const name      = injected[n].name();
        const Symbol* const signature = injected[n].signature();
        bool duplicate = false;
        for (int i = 0; i < length; i++) {
          const FieldInfo* const f = FieldInfo::from_field_array(fa, i);
          if (name      == cp->symbol_at(f->name_index()) &&
              signature == cp->symbol_at(f->signature_index())) {
            // Symbol is desclared in Java so skip this one
            duplicate = true;
            break;
          }
        }
        if (duplicate) {
          // These will be removed from the field array at the end
          continue;
        }
      }
      // Injected field
      FieldInfo* const field = FieldInfo::from_field_array(fa, index);
      field->initialize(JVM_ACC_FIELD_INTERNAL,
                        injected[n].name_index,
                        injected[n].signature_index,
                        0);

      const BasicType type = Signature::basic_type(injected[n].signature());

      // Remember how many oops we encountered and compute allocation type
      const FieldAllocationType atype = fac->update(false, type);
      field->set_allocation_type(atype);
      index++;
    }
  }
  assert(NULL == _fields, "invariant");
  _fields =
    MetadataFactory::new_array<u2>(_loader_data,
                                   index * FieldInfo::field_slots + num_generic_signature,
                                   CHECK);
  // Sometimes injected fields already exist in the Java source so
  // the fields array could be too long.  In that case the
  // fields array is trimed. Also unused slots that were reserved
  // for generic signature indexes are discarded.
  {
    int i = 0;
    for (; i < index * FieldInfo::field_slots; i++) {
      _fields->at_put(i, fa[i]);
    }
    for (int j = total_fields * FieldInfo::field_slots;
         j < generic_signature_slot; j++) {
      _fields->at_put(i++, fa[j]);
    }
    assert(_fields->length() == i, "");
  }
  if (_need_verify && length > 1) {
    // Check duplicated fields
    ResourceMark rm(THREAD);
    NameSigHash** names_and_sigs = NEW_RESOURCE_ARRAY_IN_THREAD(
      THREAD, NameSigHash*, HASH_ROW_SIZE);
    initialize_hashtable(names_and_sigs);
    bool dup = false;
    const Symbol* name = NULL;
    const Symbol* sig = NULL;
    {
      debug_only(NoSafepointVerifier nsv;)
      for (AllFieldStream fs(_fields, cp); !fs.done(); fs.next()) {
        name = fs.name();
        sig = fs.signature();
        // If no duplicates, add name/signature in hashtable names_and_sigs.
        if (!put_after_lookup(name, sig, names_and_sigs)) {
          dup = true;
          break;
        }
      }
    }
    if (dup) {
      classfile_parse_error("Duplicate field name \"%s\" with signature \"%s\" in class file %s",
                             name->as_C_string(), sig->as_klass_external_name(), CHECK);
    }
  }
}

Structure of a field entry

Type            Name              Count             Description
u2              access_flags      1                 access flags
u2              name_index        1                 reference to the simple name
u2              descriptor_index  1                 reference to the type descriptor
u2              attributes_count  1
attribute_info  attributes        attributes_count
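Read straight from the table, parsing one field entry means reading four u2 values and then skipping attributes_count attributes. The sketch below assumes the attribute_info layout from the JVM specification (u2 name index, u4 length, then the body); FieldHeader, read_u2_be and parse_field_info are hypothetical names and are far simpler than HotSpot's FieldInfo handling.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct FieldHeader {  // hypothetical type, not HotSpot's FieldInfo
    std::uint16_t access_flags;
    std::uint16_t name_index;
    std::uint16_t descriptor_index;
    std::uint16_t attributes_count;
};

inline std::uint16_t read_u2_be(const std::uint8_t* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Parses one field entry starting at data[pos]. On success it fills out,
// moves pos past the field's attributes and returns true. On failure pos is
// left unchanged (out may be partially written).
bool parse_field_info(const std::uint8_t* data, std::size_t len,
                      std::size_t& pos, FieldHeader& out) {
    assert(data != nullptr);
    if (pos > len || len - pos < 8) return false;
    out.access_flags     = read_u2_be(data + pos);
    out.name_index       = read_u2_be(data + pos + 2);
    out.descriptor_index = read_u2_be(data + pos + 4);
    out.attributes_count = read_u2_be(data + pos + 6);
    std::size_t cur = pos + 8;
    for (std::uint16_t i = 0; i < out.attributes_count; ++i) {
        if (len - cur < 6) return false;  // u2 name index + u4 length
        const std::size_t body =
            (static_cast<std::size_t>(data[cur + 2]) << 24) |
            (static_cast<std::size_t>(data[cur + 3]) << 16) |
            (static_cast<std::size_t>(data[cur + 4]) << 8)  |
             static_cast<std::size_t>(data[cur + 5]);
        cur += 6;
        if (len - cur < body) return false;
        cur += body;  // skip the attribute body without interpreting it
    }
    pos = cur;
    return true;
}
```

Calling the function in a loop, once per field, with the same pos variable walks the whole fields table; the real parse_fields additionally validates every index against the constant pool.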

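Descriptor characters

Character  Meaning
B          byte
C          char
D          double
F          float
I          int
J          long
S          short
Z          boolean
V          void
L*         object (the class name follows and ends with ;, e.g. Ljava/lang/String;)

Arrays are prefixed with a left square bracket; for example, [I denotes int[].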
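The descriptor rules are easy to check with a small decoder that turns a field descriptor back into Java source notation. This is a sketch: describe_field_type is a hypothetical name, and it rejects V because void is only valid as a method return type, not as a field type.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Converts a field descriptor such as "[I" or "Ljava/lang/String;" into
// Java source notation. Returns an empty string if the descriptor is malformed.
std::string describe_field_type(const std::string& desc) {
    std::size_t dims = 0;  // number of leading '[' characters
    while (dims < desc.size() && desc[dims] == '[') ++dims;
    if (dims == desc.size()) return "";

    std::string base;
    const char c = desc[dims];
    switch (c) {
        case 'B': base = "byte";    break;
        case 'C': base = "char";    break;
        case 'D': base = "double";  break;
        case 'F': base = "float";   break;
        case 'I': base = "int";     break;
        case 'J': base = "long";    break;
        case 'S': base = "short";   break;
        case 'Z': base = "boolean"; break;
        case 'L': {
            // Needs at least "L", one name character and the closing ';'
            if (desc.size() < dims + 3 || desc.back() != ';') return "";
            base = desc.substr(dims + 1, desc.size() - dims - 2);
            if (base.find(';') != std::string::npos) return "";
            for (char& ch : base) {
                if (ch == '/') ch = '.';  // internal form uses '/' as separator
            }
            break;
        }
        default: return "";  // includes 'V'
    }
    // A primitive descriptor is exactly one character after the brackets.
    if (c != 'L' && desc.size() != dims + 1) return "";
    assert(!base.empty());
    for (std::size_t i = 0; i < dims; ++i) base += "[]";
    return base;
}
```

For instance, "[I" decodes to int[] and "[[Ljava/lang/String;" decodes to java.lang.String[][].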

Methods

classFileParser.cpp

  // Methods
  AccessFlags promoted_flags;
  parse_methods(stream,
                _access_flags.is_interface(),
                &promoted_flags,
                &_has_final_method,
                &_declares_nonstatic_concrete_methods,
                CHECK);

  assert(_methods != NULL, "invariant");

  // promote flags from parse_methods() to the klass' flags
  _access_flags.add_promoted_flags(promoted_flags.as_int());

  if (_declares_nonstatic_concrete_methods) {
    _has_nonstatic_concrete_methods = true;
  }

// The promoted_flags parameter is used to pass relevant access_flags
// from the methods back up to the containing klass. These flag values
// are added to klass's access_flags.
// Side-effects: populates the _methods field in the parser
void ClassFileParser::parse_methods(const ClassFileStream* const cfs,
                                    bool is_interface,
                                    AccessFlags* promoted_flags,
                                    bool* has_final_method,
                                    bool* declares_nonstatic_concrete_methods,
                                    TRAPS) {
  assert(cfs != NULL, "invariant");
  assert(promoted_flags != NULL, "invariant");
  assert(has_final_method != NULL, "invariant");
  assert(declares_nonstatic_concrete_methods != NULL, "invariant");

  assert(NULL == _methods, "invariant");

  cfs->guarantee_more(2, CHECK);  // length
  const u2 length = cfs->get_u2_fast();
  if (length == 0) {
    _methods = Universe::the_empty_method_array();
  } else {
    _methods = MetadataFactory::new_array<Method*>(_loader_data,
                                                   length,
                                                   NULL,
                                                   CHECK);

    for (int index = 0; index < length; index++) {
      Method* method = parse_method(cfs,
                                    is_interface,
                                    _cp,
                                    promoted_flags,
                                    CHECK);

      if (method->is_final()) {
        *has_final_method = true;
      }
      // declares_nonstatic_concrete_methods: declares concrete instance methods, any access flags
      // used for interface initialization, and default method inheritance analysis
      if (is_interface && !(*declares_nonstatic_concrete_methods)
        && !method->is_abstract() && !method->is_static()) {
        *declares_nonstatic_concrete_methods = true;
      }
      _methods->at_put(index, method);
    }

    if (_need_verify && length > 1) {
      // Check duplicated methods
      ResourceMark rm(THREAD);
      NameSigHash** names_and_sigs = NEW_RESOURCE_ARRAY_IN_THREAD(
        THREAD, NameSigHash*, HASH_ROW_SIZE);
      initialize_hashtable(names_and_sigs);
      bool dup = false;
      const Symbol* name = NULL;
      const Symbol* sig = NULL;
      {
        debug_only(NoSafepointVerifier nsv;)
        for (int i = 0; i < length; i++) {
          const Method* const m = _methods->at(i);
          name = m->name();
          sig = m->signature();
          // If no duplicates, add name/signature in hashtable names_and_sigs.
          if (!put_after_lookup(name, sig, names_and_sigs)) {
            dup = true;
            break;
          }
        }
      }
      if (dup) {
        classfile_parse_error("Duplicate method name \"%s\" with signature \"%s\" in class file %s",
                               name->as_C_string(), sig->as_klass_external_name(), CHECK);
      }
    }
  }
}

Structure

A method_info entry has the same layout as a field_info entry, but the set of access_flags it may carry is different.
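To make that difference concrete, here is a minimal sketch that decodes an access_flags value either as a field or as a method. The bit values follow the access flag tables in the JVM specification; describe_access_flags and FlagName are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

struct FlagName {  // hypothetical type for illustration
  std::uint16_t bit;
  const char* name;
};

// Hypothetical helper: the same bit can mean different things for a field
// and for a method (0x0040 is volatile on a field, bridge on a method).
std::string describe_access_flags(std::uint16_t flags, bool is_method) {
  static const FlagName kShared[] = {
      {0x0001, "public"}, {0x0002, "private"}, {0x0004, "protected"},
      {0x0008, "static"}, {0x0010, "final"},   {0x1000, "synthetic"}};
  static const FlagName kFieldOnly[] = {
      {0x0040, "volatile"}, {0x0080, "transient"}, {0x4000, "enum"}};
  static const FlagName kMethodOnly[] = {
      {0x0020, "synchronized"}, {0x0040, "bridge"},   {0x0080, "varargs"},
      {0x0100, "native"},       {0x0400, "abstract"}, {0x0800, "strict"}};

  std::string out;
  auto append = [&](const FlagName* table, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
      if (flags & table[i].bit) {
        if (!out.empty()) out += ' ';
        out += table[i].name;
      }
    }
  };
  append(kShared, sizeof kShared / sizeof kShared[0]);
  if (is_method) {
    append(kMethodOnly, sizeof kMethodOnly / sizeof kMethodOnly[0]);
  } else {
    append(kFieldOnly, sizeof kFieldOnly / sizeof kFieldOnly[0]);
  }
  return out;
}
```

The value 0x0041 decodes to "public volatile" for a field but to "public bridge" for a method.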

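Parameters and return type

A method descriptor lists the parameter types first and the return type second. The parameter descriptors appear in declaration order inside one pair of parentheses, with no separator between them. For example, the descriptor of the method String getAll(int id, String name) is (ILjava/lang/String;)Ljava/lang/String;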
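As a rough sketch of how such a descriptor can be split into its parts, assuming only the grammar described here: MethodDescriptor, skip_field_type, and parse_method_descriptor are hypothetical names and do not correspond to HotSpot's own signature parsing code.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

struct MethodDescriptor {            // hypothetical type for illustration
  std::vector<std::string> params;   // raw parameter descriptors, in order
  std::string return_type;           // raw return descriptor, "V" for void
};

// Returns the index just past the field descriptor that starts at pos.
static std::size_t skip_field_type(const std::string& d, std::size_t pos) {
  while (pos < d.size() && d[pos] == '[') ++pos;  // array dimensions
  if (pos >= d.size()) throw std::invalid_argument("truncated descriptor");
  if (d[pos] == 'L') {
    const std::size_t semi = d.find(';', pos);
    if (semi == std::string::npos) throw std::invalid_argument("missing ';'");
    return semi + 1;
  }
  if (std::string("BCDFIJSZ").find(d[pos]) == std::string::npos) {
    throw std::invalid_argument("unknown type character");
  }
  return pos + 1;
}

MethodDescriptor parse_method_descriptor(const std::string& d) {
  if (d.empty() || d[0] != '(') throw std::invalid_argument("expected '('");
  MethodDescriptor md;
  std::size_t pos = 1;
  // Parameters follow each other directly, without commas.
  while (pos < d.size() && d[pos] != ')') {
    const std::size_t end = skip_field_type(d, pos);
    md.params.push_back(d.substr(pos, end - pos));
    pos = end;
  }
  if (pos >= d.size()) throw std::invalid_argument("missing ')'");
  ++pos;  // skip ')'
  if (d.compare(pos, std::string::npos, "V") == 0) {
    md.return_type = "V";  // void is only legal as a return type
  } else {
    if (skip_field_type(d, pos) != d.size()) {
      throw std::invalid_argument("trailing characters");
    }
    md.return_type = d.substr(pos);
  }
  return md;
}
```

Parsing (ILjava/lang/String;)Ljava/lang/String; gives the two parameters I and Ljava/lang/String; and the return type Ljava/lang/String;.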

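void <clinit>()

The compiler adds this method automatically; it initializes static variables and runs static blocks.

<init>

Constructors are methods too: each one is compiled to a method named <init>, and the compiler generates a default constructor when the source declares none.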

Attributes

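In a class file, the class itself, the field table, and the method table can each carry their own attribute table, which describes information specific to certain scenarios.

Unlike the other items in a class file, which have strict length, order, and format requirements, an attribute table imposes no order on the attributes it contains. As long as the name does not clash with an existing attribute name, any compiler implementation may write its own attributes into the table; at run time the virtual machine ignores attributes it does not recognize.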
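Unknown attributes can be ignored safely because every attribute starts with the same header: a u2 attribute_name_index and a u4 attribute_length, followed by exactly attribute_length bytes. The sketch below walks an attribute table using only that header. Reader is a hypothetical stand-in for HotSpot's ClassFileStream, and scan_attributes is an illustrative name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical big-endian cursor, a simplified stand-in for ClassFileStream.
class Reader {
 public:
  Reader(const std::uint8_t* data, std::size_t size)
      : p_(data), end_(data + size) {}

  std::uint16_t u2() {
    need(2);
    const std::uint16_t v = static_cast<std::uint16_t>(p_[0] << 8 | p_[1]);
    p_ += 2;
    return v;
  }
  std::uint32_t u4() {
    need(4);
    const std::uint32_t v = std::uint32_t(p_[0]) << 24 |
                            std::uint32_t(p_[1]) << 16 |
                            std::uint32_t(p_[2]) << 8 | std::uint32_t(p_[3]);
    p_ += 4;
    return v;
  }
  void skip(std::size_t n) {
    need(n);
    p_ += n;
  }
  bool at_end() const { return p_ == end_; }

 private:
  void need(std::size_t n) const {
    if (static_cast<std::size_t>(end_ - p_) < n) {
      throw std::runtime_error("truncated class file");
    }
  }
  const std::uint8_t* p_;
  const std::uint8_t* end_;
};

// Returns (attribute_name_index, attribute_length) for every attribute,
// skipping each body without interpreting it.
std::vector<std::pair<std::uint16_t, std::uint32_t>> scan_attributes(Reader& r) {
  std::vector<std::pair<std::uint16_t, std::uint32_t>> out;
  const std::uint16_t count = r.u2();  // attributes_count
  for (std::uint16_t i = 0; i < count; ++i) {
    const std::uint16_t name_index = r.u2();
    const std::uint32_t length = r.u4();
    r.skip(length);  // the body is opaque here, so unknown attributes cost nothing
    out.emplace_back(name_index, length);
  }
  return out;
}
```

A real parser would look up name_index in the constant pool and dispatch on the attribute name, falling back to the skip for names it does not know.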

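Common attributes

The attributes below are some of those defined by the JVM specification; judging from vmSymbols, OpenJDK currently recognizes more.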

Code

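Location: method table
Meaning: the bytecode instructions compiled from the Java source; only methods with a body have a Code attribute
Structure of the Code attribute

Type Name Count Meaning
u2 attribute_name_index 1
u4 attribute_length 1
u2 max_stack 1 maximum operand stack depth; the VM sizes the frame's operand stack from this value
u2 max_locals 1 number of slots needed for local variables; different variables can reuse a slot
u4 code_length 1 length of the compiled bytecode in bytes
u1 code code_length bytecode instructions
u2 exception_table_length 1
exception_info exception_table exception_table_length
u2 attributes_count 1
attribute_info attributes attributes_count

Although code_length is declared as u4, the JVM specification requires it to be greater than 0 and less than 65536, so the bytecode of one method cannot exceed 65535 bytes.
The instruction set used inside code is covered in the next part, Demystifying the Java Virtual Machine (3): the JVM instruction set.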
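The fixed part of the Code attribute and the code_length limit can be sketched as follows. This assumes the caller has already consumed attribute_name_index and attribute_length; CodeHeader, be_u2, be_u4, and read_code_header are hypothetical names, not HotSpot functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

struct CodeHeader {  // hypothetical type for illustration
  std::uint16_t max_stack;
  std::uint16_t max_locals;
  std::uint32_t code_length;
};

// Class files are big-endian, so assemble values byte by byte.
static std::uint16_t be_u2(const std::uint8_t* p) {
  return static_cast<std::uint16_t>(p[0] << 8 | p[1]);
}
static std::uint32_t be_u4(const std::uint8_t* p) {
  return std::uint32_t(p[0]) << 24 | std::uint32_t(p[1]) << 16 |
         std::uint32_t(p[2]) << 8 | std::uint32_t(p[3]);
}

// `info` points at max_stack, i.e. just past attribute_length;
// `size` is the number of bytes available from there.
CodeHeader read_code_header(const std::uint8_t* info, std::size_t size) {
  if (size < 8) throw std::runtime_error("truncated Code attribute");
  const CodeHeader h{be_u2(info), be_u2(info + 2), be_u4(info + 4)};
  // The field is a u4, but the specification restricts its range.
  if (h.code_length == 0 || h.code_length >= 65536) {
    throw std::runtime_error("invalid code_length");
  }
  if (size - 8 < h.code_length) throw std::runtime_error("truncated code array");
  return h;
}
```

The bytecode itself starts at info + 8 and is followed by exception_table_length.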

ConstantValue

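Location: field table
Meaning: the constant value of a static field (static is all the JVM requires, but javac additionally restricts the attribute to final fields)
Structure of the ConstantValue attribute

Type Name Count
u2 attribute_name_index 1
u4 attribute_length 1
u2 constantvalue_index 1

attribute_length is always 0x00000002. constantvalue_index must refer to a constant pool entry that matches the field type: CONSTANT_Integer for int, short, char, byte, and boolean; CONSTANT_Long, CONSTANT_Float, and CONSTANT_Double for long, float, and double; and CONSTANT_String for String. Fields of other reference types cannot carry a ConstantValue.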

Deprecated

Location: class file, field table, method table
Meaning: a class, field, or method that is declared deprecated

Exceptions

Location: method table
Meaning: the checked exceptions the method declares

Type Name Count
u2 attribute_name_index 1
u4 attribute_length 1
u2 number_of_exceptions 1
u2 exception_index_table number_of_exceptions

InnerClasses

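Location: class file
Meaning: inner classes
Structure of the InnerClasses attribute

Type Name Count
u2 attribute_name_index 1
u4 attribute_length 1
u2 number_of_classes 1
inner_classes_info inner_classes number_of_classes

Structure of inner_classes_info

Type Name Count
u2 inner_class_info_index 1
u2 outer_class_info_index 1
u2 inner_name_index 1
u2 inner_class_access_flags 1

The book gets this table wrong: the source code and the semantics both show that the last row should be inner_class_access_flags, but the book prints inner_name_access_flags and then repeats the mistake further down.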

LineNumberTable

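Location: Code
Meaning: the mapping between Java source line numbers and bytecode instructions
The book makes another mistake here. This one is only a spelling error, but it has been copied and pasted (or batch-replaced) in many places.
Structure of the LineNumberTable attribute

Type Name Count
u2 attribute_name_index 1
u4 attribute_length 1
u2 line_number_table_length 1
line_number_info line_number_table line_number_table_length

Structure of line_number_info

Type Name Count Description
u2 start_pc 1 bytecode offset at which the source line begins
u2 line_number 1 Java source line number

This table is used for breakpoint debugging and for printing line numbers in exception stack traces.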
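A stack trace needs the reverse lookup: given a bytecode offset, find the source line. A minimal sketch, assuming the table has already been read into memory, picks the entry with the largest start_pc that does not exceed the offset. LineNumberEntry and line_for_bci are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct LineNumberEntry {  // mirrors one line_number_info entry
  std::uint16_t start_pc;
  std::uint16_t line_number;
};

// Returns the source line covering bytecode offset `bci`, or -1 if no entry
// starts at or before it. The table is scanned in full because entries are
// not required to be sorted by start_pc.
int line_for_bci(const std::vector<LineNumberEntry>& table, std::uint16_t bci) {
  int best_pc = -1;
  int best_line = -1;
  for (const LineNumberEntry& e : table) {
    if (e.start_pc <= bci && static_cast<int>(e.start_pc) > best_pc) {
      best_pc = e.start_pc;
      best_line = e.line_number;
    }
  }
  return best_line;
}
```

With entries (0, 10), (4, 11), and (9, 13), offset 5 falls in the range that starts at 4 and therefore maps to line 11.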

LocalVariableTable

Location: Code
Meaning: the local variables of the method
According to Option.java, javac's -g:none and -g:vars options turn generation of LocalVariableTable off or on; by default it is not generated.
Structure of the LocalVariableTable attribute

Type Name Count
u2 attribute_name_index 1
u4 attribute_length 1
u2 local_variable_table_length 1
local_variable_info local_variable_table local_variable_table_length

SourceFile

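Location: class file
Meaning: the name of the source file
Again according to Option.java, -g:none and -g:source control whether this attribute is generated.
Structure of the SourceFile attribute

Type Name Count
u2 attribute_name_index 1
u4 attribute_length 1
u2 sourcefile_index 1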

Synthetic

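Location: class file, method table, field table
Meaning: the class, method, or field was generated by the compiler; <clinit> and <init> are exceptions and are not marked Synthetic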

classFileParser.cpp

The last steps read the class-level attributes and make sure the stream has been fully consumed.

  // Additional attributes/annotations
  _parsed_annotations = new ClassAnnotationCollector();
  parse_classfile_attributes(stream, cp, _parsed_annotations, CHECK);

  assert(_inner_classes != NULL, "invariant");

  // Finalize the Annotations metadata object,
  // now that all annotation arrays have been created.
  create_combined_annotations(CHECK);

Finally, the parser makes sure the stream has been read to the end.

  // Make sure this is the end of class file stream
  guarantee_property(stream->at_eos(),
                     "Extra bytes at the end of class file %s",
                     CHECK);

  // all bytes in stream read and parsed

oop-klass

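When a program written in C, C++, or Delphi is compiled to a native binary, the high-level data structures defined in the source no longer exist. When a host operating system such as Windows or Linux loads the binary, it does not load those data structures; the host has no idea which structures or classes were originally defined, because all of them have been reduced to offsets into particular memory segments. A C struct, for example, disappears after compilation: assembly and machine language have no corresponding concept, and the CPU certainly does not know what a struct is. Classes in C++ and Delphi disappear in the same way, and what remains of a class is a starting address in memory. The JVM, in contrast, records the original information (metadata) of every type defined in the bytecode when it loads a program. It knows which classes the program contains and the fields, methods, superclass, and other information associated with each one. This is the biggest difference between the JVM and an operating system.

oop: ordinary object pointer; describes an object instance and generally lives in the Java heap
klass: describes a Java class and is the VM-internal counterpart of a Java type; it lived in PermGen before JDK 8 and lives in Metaspace since
According to the comments in the latest oopsHierarchy.hpp, the types are organized into the following three hierarchies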

OBJECT hierarchy

The hierarchy is declared as follows

// OBJECT hierarchy
// This hierarchy is a representation hierarchy, i.e. if A is a superclass
// of B, A's representation is a prefix of B's representation.

typedef class oopDesc*                    oop;
typedef class   instanceOopDesc*            instanceOop;
typedef class   arrayOopDesc*               arrayOop;
typedef class     objArrayOopDesc*            objArrayOop;
typedef class     typeArrayOopDesc*           typeArrayOop;
  • oopDesc
    This class is defined in oop.hpp. It has many methods but very few fields, as shown below

class oopDesc {
  friend class VMStructs;
  friend class JVMCIVMStructs;
 private:
  volatile markWord _mark;
  union _metadata {
    Klass*      _klass;
    narrowKlass _compressed_klass;
  } _metadata;
......
}

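The _mark field records the object's various states, such as lock state, GC age, and thread-related locking information. _metadata, as its members show, points to the object's Klass. The class also declares many methods, which are omitted here. narrowKlass is used for compressed class pointers; the related definitions in oopsHierarchy.hpp are as follows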

typedef juint narrowOop; // Offset instead of address for an oop within a java object

// If compressed klass pointers then use narrowKlass.
typedef juint  narrowKlass;

juint is a 32-bit unsigned integer; in globalDefinitions.hpp, u4 is a typedef of juint, so the two are the same type.
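Because a narrowKlass is only 32 bits wide, it cannot hold a full 64-bit address and has to be expanded before use. The sketch below assumes the usual base-plus-shifted-offset scheme; KlassEncoding, decode_klass, and encode_klass are hypothetical names, and the base and shift values are made up for illustration rather than taken from HotSpot.

```cpp
#include <cassert>
#include <cstdint>

using narrowKlass = std::uint32_t;  // same width as the juint typedef above

struct KlassEncoding {   // hypothetical type for illustration
  std::uint64_t base;    // assumed start of the encodable address range
  int shift;             // assumed log2 of the Klass alignment
};

// Expands a 32-bit compressed value into a full 64-bit address.
std::uint64_t decode_klass(KlassEncoding enc, narrowKlass nk) {
  return enc.base + (static_cast<std::uint64_t>(nk) << enc.shift);
}

// Compresses an address; it must lie above the base and be aligned.
narrowKlass encode_klass(KlassEncoding enc, std::uint64_t addr) {
  assert(addr >= enc.base);
  assert(((addr - enc.base) & ((std::uint64_t(1) << enc.shift) - 1)) == 0);
  return static_cast<narrowKlass>((addr - enc.base) >> enc.shift);
}
```

With a shift of 3, a 32-bit value can address a 32 GB range above the base, which is why the compressed form fits in the object header.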

metadata hierarchy

// The metadata hierarchy is separate from the oop hierarchy
//      class MetaspaceObj
class   ConstMethod;
class   ConstantPoolCache;
class   MethodData;
//      class Metadata
class   Method;
class   ConstantPool;
//      class CHeapObj
class   CompiledICHolder;

klass hierarchy

// The klass hierarchy is separate from the oop hierarchy.
class Klass;
class   InstanceKlass;
class     InstanceMirrorKlass;
class     InstanceClassLoaderKlass;
class     InstanceRefKlass;
class   ArrayKlass;
class     ObjArrayKlass;
class     TypeArrayKlass;