A First Look at the CPU L2 Cache

Purpose

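When designing a high-performance in-memory database, you often have to think about the CPU cache hit rate. From what I have read, Intel processors have three levels of cache: L1 and L2 are small, while L3 is larger but shared by all cores. On the i7-4712MQ I use, dmidecode reports the cache sizes: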

inszva@inszva-Aspire-E5-572G:~$ sudo dmidecode
[sudo] password for inszva: 
# dmidecode 2.12
SMBIOS 2.8 present.
28 structures occupying 1951 bytes.
Table at 0x000E6F30.

Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
    Vendor: Insyde Corp.
    Version: V1.05
    Release Date: 03/17/2015
    Address: 0xE0000
    Runtime Size: 128 kB
    ROM Size: 6144 kB
    Characteristics:
        PCI is supported
        BIOS is upgradeable
        BIOS shadowing is allowed
        Boot from CD is supported
        Selectable boot is supported
        EDD is supported
        Japanese floppy for NEC 9800 1.2 MB is supported (int 13h)
        Japanese floppy for Toshiba 1.2 MB is supported (int 13h)
        5.25"/360 kB floppy services are supported (int 13h)
        5.25"/1.2 MB floppy services are supported (int 13h)
        3.5"/720 kB floppy services are supported (int 13h)
        3.5"/2.88 MB floppy services are supported (int 13h)
        8042 keyboard services are supported (int 9h)
        CGA/mono video services are supported (int 10h)
        ACPI is supported
        USB legacy is supported
        BIOS boot specification is supported
        Targeted content distribution is supported
        UEFI is supported
    BIOS Revision: 1.5
    Firmware Revision: 1.5

Handle 0x0001, DMI type 1, 27 bytes
System Information
    Manufacturer: Acer
    Product Name: Aspire E5-572G
    Version: V1.05
    Serial Number: NXMV2CN01252001C033400
    UUID: F54CECF0-8EF4-E411-85C7-F0761CC0B25C
    Wake-up Type: Power Switch
    SKU Number: Aspire E5-572G_0922_V1.05
    Family: SharkBay System

Handle 0x0002, DMI type 2, 16 bytes
Base Board Information
    Manufacturer: Acer
    Product Name: EA50_HWS  
    Version: V1.05
    Serial Number: NBMV211001520310DC3400
    Asset Tag: Type2 - Board Asset Tag
    Features:
        Board is a hosting board
        Board is replaceable
    Location In Chassis: Type2 - Board Chassis Location
    Chassis Handle: 0x0003
    Type: Motherboard
    Contained Object Handles: 0

Handle 0x0003, DMI type 3, 23 bytes
Chassis Information
    Manufacturer: Acer
    Type: Notebook
    Lock: Not Present
    Version: Chassis Version
    Serial Number: Chassis Serial Number
    Asset Tag: Chassis Asset Tag
    Boot-up State: Safe
    Power Supply State: Safe
    Thermal State: Safe
    Security Status: None
    OEM Information: 0x00000000
    Height: Unspecified
    Number Of Power Cords: 1
    Contained Elements: 0
    SKU Number: Not Specified

Handle 0x0004, DMI type 4, 42 bytes
Processor Information
    Socket Designation: U3E1
    Type: Central Processor
    Family: Core i7
    Manufacturer: Intel(R) Corporation
    ID: C3 06 03 00 FF FB EB BF
    Signature: Type 0, Family 6, Model 60, Stepping 3
    Flags:
        FPU (Floating-point unit on-chip)
        VME (Virtual mode extension)
        DE (Debugging extension)
        PSE (Page size extension)
        TSC (Time stamp counter)
        MSR (Model specific registers)
        PAE (Physical address extension)
        MCE (Machine check exception)
        CX8 (CMPXCHG8 instruction supported)
        APIC (On-chip APIC hardware supported)
        SEP (Fast system call)
        MTRR (Memory type range registers)
        PGE (Page global enable)
        MCA (Machine check architecture)
        CMOV (Conditional move instruction supported)
        PAT (Page attribute table)
        PSE-36 (36-bit page size extension)
        CLFSH (CLFLUSH instruction supported)
        DS (Debug store)
        ACPI (ACPI supported)
        MMX (MMX technology supported)
        FXSR (FXSAVE and FXSTOR instructions supported)
        SSE (Streaming SIMD extensions)
        SSE2 (Streaming SIMD extensions 2)
        SS (Self-snoop)
        HTT (Multi-threading)
        TM (Thermal monitor supported)
        PBE (Pending break enabled)
    Version: Intel(R) Core(TM) i7-4712MQ CPU @ 2.30GHz
    Voltage: 0.9 V
    External Clock: 100 MHz
    Max Speed: 2300 MHz
    Current Speed: 2900 MHz
    Status: Populated, Enabled
    Upgrade: <OUT OF SPEC>
    L1 Cache Handle: 0x0006
    L2 Cache Handle: 0x0007
    L3 Cache Handle: 0x0008
    Serial Number: To Be Filled By O.E.M.
    Asset Tag: To Be Filled By O.E.M.
    Part Number: To Be Filled By O.E.M.
    Core Count: 4
    Core Enabled: 4
    Thread Count: 8
    Characteristics:
        64-bit capable
        Multi-Core
        Hardware Thread
        Execute Protection
        Enhanced Virtualization
        Power/Performance Control

Handle 0x0005, DMI type 7, 19 bytes
Cache Information
    Socket Designation: L1 Cache
    Configuration: Enabled, Not Socketed, Level 1
    Operational Mode: Write Back
    Location: Internal
    Installed Size: 32 kB
    Maximum Size: 32 kB
    Supported SRAM Types:
        Synchronous
    Installed SRAM Type: Synchronous
    Speed: Unknown
    Error Correction Type: Single-bit ECC
    System Type: Data
    Associativity: 8-way Set-associative

Handle 0x0006, DMI type 7, 19 bytes
Cache Information
    Socket Designation: L1 Cache
    Configuration: Enabled, Not Socketed, Level 1
    Operational Mode: Write Back
    Location: Internal
    Installed Size: 32 kB
    Maximum Size: 32 kB
    Supported SRAM Types:
        Synchronous
    Installed SRAM Type: Synchronous
    Speed: Unknown
    Error Correction Type: Single-bit ECC
    System Type: Instruction
    Associativity: 8-way Set-associative

Handle 0x0007, DMI type 7, 19 bytes
Cache Information
    Socket Designation: L2 Cache
    Configuration: Enabled, Not Socketed, Level 2
    Operational Mode: Write Back
    Location: Internal
    Installed Size: 256 kB
    Maximum Size: 256 kB
    Supported SRAM Types:
        Synchronous
    Installed SRAM Type: Synchronous
    Speed: Unknown
    Error Correction Type: Single-bit ECC
    System Type: Unified
    Associativity: 8-way Set-associative

Handle 0x0008, DMI type 7, 19 bytes
Cache Information
    Socket Designation: L3 Cache
    Configuration: Enabled, Not Socketed, Level 3
    Operational Mode: Write Back
    Location: Internal
    Installed Size: 6144 kB
    Maximum Size: 6144 kB
    Supported SRAM Types:
        Synchronous
    Installed SRAM Type: Synchronous
    Speed: Unknown
    Error Correction Type: Single-bit ECC
    System Type: Unified
    Associativity: 12-way Set-associative

Handle 0x0009, DMI type 10, 6 bytes
On Board Device Information
    Type: Video
    Status: Enabled
    Description: Video Graphics Controller

Handle 0x000A, DMI type 10, 6 bytes
On Board Device Information
    Type: Ethernet
    Status: Enabled
    Description: Realtek Lan Controller

Handle 0x000B, DMI type 11, 5 bytes
OEM Strings
    String 1: Acer System
    String 2: String2 for Original Equipment Manufacturer
    String 3: String3 for Original Equipment Manufacturer
    String 4: String4 for Original Equipment Manufacturer
    String 5: String5 for Original Equipment Manufacturer

Handle 0x000C, DMI type 12, 5 bytes
System Configuration Options
    Option 1: String1 for Type12 Equipment Manufacturer
    Option 2: String2 for Type12 Equipment Manufacturer
    Option 3: String3 for Type12 Equipment Manufacturer
    Option 4: String4 for Type12 Equipment Manufacturer
    Option 5: String5 for Type12 Equipment Manufacturer
    Option 6: String6 for Type12 Equipment Manufacturer
    Option 7: String7 for Type12 Equipment Manufacturer

Handle 0x000D, DMI type 16, 23 bytes
Physical Memory Array
    Location: System Board Or Motherboard
    Use: System Memory
    Error Correction Type: None
    Maximum Capacity: 32 GB
    Error Information Handle: Not Provided
    Number Of Devices: 4

Handle 0x000E, DMI type 17, 40 bytes
Memory Device
    Array Handle: 0x000D
    Error Information Handle: Not Provided
    Total Width: Unknown
    Data Width: Unknown
    Size: No Module Installed
    Form Factor: DIMM
    Set: None
    Locator: DIMM0
    Bank Locator: BANK 0
    Type: Unknown
    Type Detail: Unknown
    Speed: Unknown
    Manufacturer: Empty
    Serial Number: Empty
    Asset Tag: Unknown
    Part Number: Empty
    Rank: Unknown
    Configured Clock Speed: Unknown
    Minimum voltage:  Unknown
    Maximum voltage:  Unknown
    Configured voltage:  Unknown

Handle 0x000F, DMI type 17, 40 bytes
Memory Device
    Array Handle: 0x000D
    Error Information Handle: Not Provided
    Total Width: Unknown
    Data Width: Unknown
    Size: No Module Installed
    Form Factor: DIMM
    Set: None
    Locator: DIMM1
    Bank Locator: BANK 1
    Type: Unknown
    Type Detail: Unknown
    Speed: Unknown
    Manufacturer: Empty
    Serial Number: Empty
    Asset Tag: Unknown
    Part Number: Empty
    Rank: Unknown
    Configured Clock Speed: Unknown
    Minimum voltage:  Unknown
    Maximum voltage:  Unknown
    Configured voltage:  Unknown

Handle 0x0010, DMI type 17, 40 bytes
Memory Device
    Array Handle: 0x000D
    Error Information Handle: Not Provided
    Total Width: 64 bits
    Data Width: 64 bits
    Size: 8192 MB
    Form Factor: SODIMM
    Set: None
    Locator: DIMM2
    Bank Locator: BANK 2
    Type: DDR3
    Type Detail: Synchronous
    Speed: 1600 MHz
    Manufacturer: Hynix
    Serial Number: 282C5AB7
    Asset Tag: Unknown
    Part Number: HMT41GS6BFR8A-PB  
    Rank: 2
    Configured Clock Speed: 1600 MHz
    Minimum voltage:  Unknown
    Maximum voltage:  Unknown
    Configured voltage:  Unknown

Handle 0x0011, DMI type 17, 40 bytes
Memory Device
    Array Handle: 0x000D
    Error Information Handle: Not Provided
    Total Width: Unknown
    Data Width: Unknown
    Size: No Module Installed
    Form Factor: DIMM
    Set: None
    Locator: DIMM3
    Bank Locator: BANK 3
    Type: Unknown
    Type Detail: Unknown
    Speed: Unknown
    Manufacturer: Empty
    Serial Number: Empty
    Asset Tag: Unknown
    Part Number: Empty
    Rank: Unknown
    Configured Clock Speed: Unknown
    Minimum voltage:  Unknown
    Maximum voltage:  Unknown
    Configured voltage:  Unknown

Handle 0x0012, DMI type 19, 31 bytes
Memory Array Mapped Address
    Starting Address: 0x00000000000
    Ending Address: 0x001FFFFFFFF
    Range Size: 8 GB
    Physical Array Handle: 0x000D
    Partition Width: 4

Handle 0x0013, DMI type 20, 35 bytes
Memory Device Mapped Address
    Starting Address: 0x00000000000
    Ending Address: 0x001FFFFFFFF
    Range Size: 8 GB
    Physical Device Handle: 0x0010
    Memory Array Mapped Address Handle: 0x0012
    Partition Row Position: Unknown
    Interleave Position: 2
    Interleaved Data Depth: 1

Handle 0x0014, DMI type 24, 5 bytes
Hardware Security
    Power-On Password Status: Disabled
    Keyboard Password Status: Disabled
    Administrator Password Status: Disabled
    Front Panel Reset Status: Disabled

Handle 0x0015, DMI type 170, 98 bytes
OEM-specific Type
    Header and Data:
        AA 62 15 00 01 08 00 00 7F 00 0F 00 06 00 03 02
        01 08 21 02 00 00 23 02 00 00 25 02 10 00 41 02
        04 00 42 02 20 00 43 02 40 00 44 02 08 00 45 02
        10 00 48 02 01 00 49 02 02 00 61 02 08 00 62 02
        01 00 63 02 02 00 64 02 04 00 81 02 04 00 83 02
        02 00 85 02 00 00 86 02 00 00 87 02 00 00 88 02
        00 04

Handle 0x0016, DMI type 171, 44 bytes
OEM-specific Type
    Header and Data:
        AB 2C 16 00 01 86 80 16 04 02 EC 10 68 81 03 DE
        10 47 13 04 F2 04 8A B4 05 86 80 20 8C 07 8C 16
        36 00 08 CA 04 0B 30 11 CB 06 70 29

Handle 0x0017, DMI type 172, 24 bytes
OEM-specific Type
    Header and Data:
        AC 18 17 00 02 1E 01 FF 00 02 01 00 03 FF 00 04
        01 00 05 0F 00 06 FF 00

Handle 0x0018, DMI type 173, 9 bytes
OEM-specific Type
    Header and Data:
        AD 09 18 00 00 00 00 00 00

Handle 0x0019, DMI type 221, 12 bytes
OEM-specific Type
    Header and Data:
        DD 0C 19 00 01 01 00 01 07 00 00 00
    Strings:
        Reference Code - ACPI

Handle 0x001A, DMI type 221, 12 bytes
OEM-specific Type
    Header and Data:
        DD 0C 1A 00 01 01 00 01 07 00 00 00
    Strings:
        Reference Code - Intel Rapid Start

Handle 0x001B, DMI type 127, 4 bytes
End Of Table

As the output shows, Handle 0x0004 describes the CPU and contains the following lines:

L1 Cache Handle: 0x0006
L2 Cache Handle: 0x0007
L3 Cache Handle: 0x0008

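Looking up the corresponding handles gives the sizes of the three cache levels: L1 32 kB, L2 256 kB, L3 6144 kB. Some sources claim that L1 mostly stores indexes into L2 and L3, so that the CPU obtains an address from L1 and then reads the data from L2 or L3 (in practice each level simply holds copies of recently used cache lines, and a miss in one level falls through to the next), and that the large L3 is considerably slower than L2. When designing a database, the index should therefore be kept in L2 as far as possible. [1]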
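As a cross-check that does not need root, the same sizes can usually be read from sysfs on Linux. The sketch below assumes the cache description for CPU 0 lives under /sys/devices/system/cpu/cpu0/cache/index*; the helper names parseCacheSize and readTrimmed are made up for this example and do not come from any library.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// parseCacheSize converts a sysfs size string such as "256K" into bytes.
// (Hypothetical helper for this sketch.)
func parseCacheSize(s string) (int, error) {
	s = strings.TrimSpace(s)
	mult := 1
	switch {
	case strings.HasSuffix(s, "K"):
		mult, s = 1024, strings.TrimSuffix(s, "K")
	case strings.HasSuffix(s, "M"):
		mult, s = 1024*1024, strings.TrimSuffix(s, "M")
	}
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, err
	}
	return n * mult, nil
}

// readTrimmed returns the content of a small file without surrounding
// whitespace, or "?" if the file cannot be read.
func readTrimmed(path string) string {
	b, err := os.ReadFile(path)
	if err != nil {
		return "?"
	}
	return strings.TrimSpace(string(b))
}

func main() {
	// Assumed sysfs location; prints nothing if it does not exist.
	dirs, _ := filepath.Glob("/sys/devices/system/cpu/cpu0/cache/index*")
	for _, d := range dirs {
		level := readTrimmed(filepath.Join(d, "level"))
		kind := readTrimmed(filepath.Join(d, "type"))
		size, err := parseCacheSize(readTrimmed(filepath.Join(d, "size")))
		if err != nil {
			continue // skip entries whose size cannot be parsed
		}
		fmt.Printf("L%s %-11s %d bytes\n", level, kind, size)
	}
}
```

On a machine like the one above, this would be expected to list two L1 entries (data and instruction), one L2 entry and one L3 entry, matching the four cache handles in the dmidecode output.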

Experiment

To check whether [1] holds, I wrote the following two programs:

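// test.go
package main

import "fmt"

// num holds 60000 int32 values: 60000 * 4 = 240000 bytes.
var num [60000]int32

// find locates n in the sorted array num. It starts at index 30000 and
// halves the step c after every comparison, never letting it drop below 1.
func find(n int32) int32 {
    c := int32(15000)
    pos := int32(30000)
    for {
        if num[pos] == n {
            break
        }
        if num[pos] > n {
            pos -= c
        } else {
            pos += c
        }
        c /= 2
        if c == 0 {
            c = 1
        }
    }
    return pos
}

func main() {
    var sum int64
    // Fill the array so that num[i] == i.
    for i := int32(0); i < 60000; i++ {
        num[i] = i
    }
    // Look up every value once and sum the indices found.
    for i := int32(0); i < 60000; i++ {
        sum += int64(find(i))
    }
    fmt.Println(sum)
}

// test2.go
package main

import "fmt"

// num holds 70000 int32 values: 70000 * 4 = 280000 bytes.
var num [70000]int32

// find is identical to the one in test.go; the start index (30000) and the
// initial step (15000) are unchanged even though the array is larger.
func find(n int32) int32 {
    c := int32(15000)
    pos := int32(30000)
    for {
        if num[pos] == n {
            break
        }
        if num[pos] > n {
            pos -= c
        } else {
            pos += c
        }
        c /= 2
        if c == 0 {
            c = 1
        }
    }
    return pos
}

func main() {
    var sum int64
    // Fill the array so that num[i] == i.
    for i := int32(0); i < 70000; i++ {
        num[i] = i
    }
    // Look up every value once and sum the indices found.
    for i := int32(0); i < 70000; i++ {
        sum += int64(find(i))
    }
    fmt.Println(sum)
}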

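Each program first builds an increasing array and then runs one binary search on it for every element. In the first program the array has 60000 elements and occupies 60000 × 4 = 240000 bytes, which is less than the 256 kB (262144-byte) L2 cache; in the second it occupies 70000 × 4 = 280000 bytes, which is more than the L2 cache. Compared with the more expensive binary searches, the O(n) time spent initializing the array is negligible, and n binary searches have a theoretical time complexity of O(n log n). In theory, then, the second program should take (7/6) × log 70000 / log 60000 ≈ 1.183 times as long as the first. The measured results, however, were: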

inszva@inszva-Aspire-E5-572G:~/cache$ time ./test
1799970000

real    0m0.016s
user    0m0.017s
sys 0m0.000s
inszva@inszva-Aspire-E5-572G:~/cache$ time ./test2
2449965000

real    0m0.097s
user    0m0.098s
sys 0m0.000s

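0.097 / 0.016 ≈ 6.06 times, far above the predicted 1.18. Following [1], the explanation is that the second program's array no longer fits in L2, so its lookups end up being served from L3, which greatly reduces the program's efficiency. One caveat: test2.go keeps the starting index 30000 and the initial step 15000 from test.go, so for targets above 59993 the search advances one element at a time. That adds iterations that have nothing to do with the cache, so the measured ratio cannot be attributed to L2 misses alone.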
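One way to see how much of the gap is algorithmic rather than cache-related is to count loop iterations instead of measuring time. The sketch below reuses the search loop from the two programs with the same fixed start index and step; countSteps is a name invented for this example, and the array is a slice so that one function can handle both sizes.

```go
package main

import "fmt"

// countSteps runs the search from test.go for every value in an array of the
// given size and returns the total number of loop iterations.
// The start index (30000) and first step (15000) are fixed, as in both programs.
func countSteps(size int32) int64 {
	num := make([]int32, size)
	for i := range num {
		num[i] = int32(i)
	}
	var steps int64
	for n := int32(0); n < size; n++ {
		c, pos := int32(15000), int32(30000)
		for num[pos] != n {
			if num[pos] > n {
				pos -= c
			} else {
				pos += c
			}
			c /= 2
			if c == 0 {
				c = 1
			}
			steps++
		}
	}
	return steps
}

func main() {
	a, b := countSteps(60000), countSteps(70000)
	fmt.Printf("60000 elements: %d iterations\n", a)
	fmt.Printf("70000 elements: %d iterations\n", b)
	fmt.Printf("iteration ratio: %.1f\n", float64(b)/float64(a))
}
```

If the iteration ratio is much larger than 1.18, a fairer cache comparison would derive the start index and step from the array size (for example size/2 and size/4) and repeat the timing.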
