Nvidia GPU architecture

Original

minggoddess

2020-08-14 13:47

TPC texture/processor cluster

SM streaming multiprocessor

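SP streaming processor -- the ordinary arithmetic unit, for operations such as MAD (multiply-add)

SFU special function unit -- transcendental functions: trigonometric functions, log, exponentials

ROP raster operation processor -- does much of the work of the OM (output merger) stage: tests, blending, AA, and so on

L2 L2 cache

DRAM memory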

I-cache instruction cache

C-cache constant cache

MT issue: multithreaded instruction fetch and issue unit -- judging by the name, this looks like the unit that schedules work for the SPs

https://www.cc.gatech.edu/fac/hyesoon/gputhread.pdf

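The MT issue unit fetches one instruction for the SM and hands it to all the SPs for the threads of one warp; that instruction takes several cycles to finish executing.

Over a given period an SM runs several warps in parallel; because of resource constraints and for performance reasons, they have to be scheduled.

Instructions are also a kind of resource; one instruction generally takes 4 cycles.

The above is my version 1.0 understanding and may need another iteration.

The fetch policy is essentially load balancing to maximize throughput: the instruction that gets picked should be one that can execute immediately without stalling.

This is the same job as PowerVR's thread scheduling. The desired result is the same, though the means here may be more advanced: the former resembles coarse-grained multithreading, while the latter is SIMT, concurrent multithreading.

This part needs another iteration.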

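There are various fetch policies:

for example, round-robin;

for example, picking the warp that has gone longest without being fetched;

for example, picking the warp with the most work remaining first;

or picking the warp that occupies the most memory.

All of these policies aim at better balance.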
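The policies listed above can be sketched as selection functions over a pool of warps. This is only a minimal illustration in Python, not a description of what the hardware does. The `Warp` fields (`last_fetch_cycle`, `remaining_instructions`, `memory_in_use`, `ready`) and the function names are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Warp:
    wid: int
    last_fetch_cycle: int        # cycle at which this warp was last fetched
    remaining_instructions: int  # instructions this warp still has to run
    memory_in_use: int           # hypothetical measure of memory occupied
    ready: bool = True           # False if its next instruction would stall


def ready_warps(warps):
    """Only warps whose next instruction can execute without stalling."""
    return [w for w in warps if w.ready]


def pick_round_robin(warps, last_wid):
    """Next ready warp after last_wid, wrapping around to the lowest id."""
    candidates = ready_warps(warps)
    if not candidates:
        return None
    later = [w for w in candidates if w.wid > last_wid]
    return min(later or candidates, key=lambda w: w.wid)


def pick_least_recently_fetched(warps):
    """Ready warp that has waited longest since its last fetch."""
    candidates = ready_warps(warps)
    return min(candidates, key=lambda w: w.last_fetch_cycle) if candidates else None


def pick_most_remaining(warps):
    """Ready warp with the most instructions left."""
    candidates = ready_warps(warps)
    return max(candidates, key=lambda w: w.remaining_instructions) if candidates else None


def pick_most_memory(warps):
    """Ready warp that occupies the most memory."""
    candidates = ready_warps(warps)
    return max(candidates, key=lambda w: w.memory_in_use) if candidates else None


pool = [
    Warp(0, last_fetch_cycle=12, remaining_instructions=5, memory_in_use=256),
    Warp(1, last_fetch_cycle=4, remaining_instructions=40, memory_in_use=128),
    Warp(2, last_fetch_cycle=8, remaining_instructions=20, memory_in_use=1024, ready=False),
    Warp(3, last_fetch_cycle=10, remaining_instructions=30, memory_in_use=512),
]

print(pick_round_robin(pool, last_wid=1).wid)
print(pick_least_recently_fetched(pool).wid)
print(pick_most_remaining(pool).wid)
print(pick_most_memory(pool).wid)
```

Every policy filters on `ready` first, which reflects the goal stated earlier: whichever warp is chosen should be able to execute right away.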

Look at the three distribution units in the figure above: the TPCs handle all three kinds of work (vertex, pixel, compute), so this is a unified architecture.

warp The set of parallel threads that execute the same instruction together in a SIMT architecture.

SIMD--vector

SIMT--scalar

Single instruction, multiple thread (SIMT) is an execution model used in parallel computing where single instruction, multiple data (SIMD) is combined with multithreading.
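To make the vector/scalar contrast concrete, here is a rough Python sketch. It is a conceptual model only. `Thread`, `scalar_add`, `issue_to_warp` and `simd_add` are hypothetical names, and the warp is shortened to 4 threads for readability.

```python
def simd_add(vec_a, vec_b):
    """SIMD view: one thread issues one vector instruction over whole vectors."""
    return [a + b for a, b in zip(vec_a, vec_b)]


class Thread:
    """SIMT view: every thread owns its own scalar registers."""

    def __init__(self, tid, a, b):
        self.tid = tid
        self.regs = {"a": a, "b": b, "out": 0}


def scalar_add(thread):
    """A scalar instruction, written from the point of view of one thread."""
    thread.regs["out"] = thread.regs["a"] + thread.regs["b"]


def issue_to_warp(instruction, warp):
    """One fetched instruction is applied to every thread of the warp."""
    for thread in warp:
        instruction(thread)


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]

warp = [Thread(i, a[i], b[i]) for i in range(4)]
issue_to_warp(scalar_add, warp)

simt_result = [t.regs["out"] for t in warp]
simd_result = simd_add(a, b)
print(simt_result, simd_result)
```

Both paths compute the same values. The difference is that the SIMT program is written as scalar code for a single thread, and the grouping into a warp happens at issue time.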

A key point for understanding a SIMT warp is those four cycles:

one instruction takes 4 cycles;

8 SP cores x 4 cycles

covers 32 threads over that period.

One SP holds 64 threads; one thread can run one scalar instruction per cycle.
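The arithmetic above (8 SP cores x 4 cycles = 32 threads) can be written out as a small Python sketch. The mapping of consecutive thread ids to cycles is an assumption chosen for illustration. The notes only say that one warp instruction occupies the 8 SPs for 4 cycles.

```python
SP_CORES = 8          # SP cores per SM, as in the notes
CYCLES_PER_INSTR = 4  # cycles one warp instruction occupies the SPs
WARP_SIZE = SP_CORES * CYCLES_PER_INSTR


def lane_schedule(warp_size=WARP_SIZE, sp_cores=SP_CORES):
    """Return {cycle: [thread ids]} for one warp instruction.

    Assumes threads are served in consecutive groups of `sp_cores`,
    one group per cycle (an illustrative mapping, not a hardware spec).
    """
    schedule = {}
    for tid in range(warp_size):
        cycle = tid // sp_cores
        schedule.setdefault(cycle, []).append(tid)
    return schedule


schedule = lane_schedule()
for cycle, tids in schedule.items():
    print(f"cycle {cycle}: threads {tids[0]}..{tids[-1]} on SP 0..{SP_CORES - 1}")
```

Each cycle the 8 SPs execute the same instruction for a different group of 8 threads, so after 4 cycles all 32 threads of the warp have executed it.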
