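This part covers GPU-based view-frustum visibility culling and the selection of a model's LOD from its distance to the camera. A compute shader modifies the draw commands stored in an indirect draw command buffer, toggling each model's visibility and choosing its level of detail from the camera distance, without doing any of that computation on the CPU and without syncing data with the CPU for rendering.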
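1. Theory
Occlusion Culling
Occlusion culling is a rendering optimization that skips objects which lie within the camera's field of view but cannot actually be seen.
In Vulkan, every object within the camera's field of view gets rendered. This raises a problem: some objects are inside the field of view but the camera cannot see them because other objects block them, yet they are still rendered. That rendering is wasted work.
Occlusion culling removes these objects so that they are never rendered, which lowers the rendering cost.
It has a downside, though: while preparing the data, the CPU has to decide which objects' vertices need processing and which do not. Occlusion culling is therefore a technique that adds CPU overhead in exchange for lower GPU overhead.
It suits scenes with a high object density well. In scenes that are not so dense, it can end up costing more than it saves.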
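LOD (Levels of Detail)
LOD is also a rendering optimization: objects close to the camera are shown with a detailed mesh, and objects far from it with a simplified mesh.
When a detailed model is far from the camera (but still within the field of view), the extra detail is barely noticeable on screen. Having the GPU render the full-detail mesh in that situation achieves very little.
With LOD, a model shows its detailed mesh when it is near the camera and its simplified mesh when it is far away, which lowers the rendering cost.
LOD has a downside too: developers have to prepare a high-detail, a medium-detail and a low-detail version of each object (possibly more), which makes the technique memory-hungry.
LOD therefore suits scenes that contain detailed models whose distance to the camera changes (this part uses a model that ships with several LODs).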
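The distance-based choice described above can be sketched on the CPU as follows. This is a minimal illustration under stated assumptions, not the code used later in this part: `selectLod` and `lodDistances` are hypothetical names, the thresholds are assumed to be sorted in ascending order, and the last entry stands for the coarsest mesh.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the index of the first LOD whose threshold
// is greater than the object's distance to the camera. If no threshold
// matches, the coarsest LOD (the last entry) is used.
// Assumption: lodDistances is non-empty and sorted in ascending order.
std::size_t selectLod(float distanceToCamera, const std::vector<float>& lodDistances)
{
    assert(!lodDistances.empty());
    const std::size_t coarsest = lodDistances.size() - 1;
    for (std::size_t i = 0; i < coarsest; ++i)
    {
        if (distanceToCamera < lodDistances[i])
        {
            return i; // close enough for detail level i
        }
    }
    return coarsest; // beyond every threshold: use the simplest mesh
}
```

With thresholds of 5, 10, 15, 20, 25 and 30 units, an object 7 units away gets level 1, and anything at or beyond 25 units falls through to level 5.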
2. Vulkan implementation
2.1 Scene data definitions
First, the custom data structures the scene needs:
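// Number of objects per axis (the total is this value cubed)
#define OBJECT_COUNT 16
#define MAX_LOD_LEVEL 5
// When true, the culling frustum is frozen and only the models inside it are shown
bool fixedFrustum = false;
// Per-instance data block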
struct InstanceData {
glm::vec3 pos;
float scale;
};
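// Contains the instance data
vks::Buffer instanceBuffer;
// Contains the indirect draw commands
vks::Buffer indirectCommandsBuffer;
vks::Buffer indirectDrawCountBuffer;
// Indirect draw statistics (updated by the compute shader)
struct {
uint32_t drawCount; // Total number of indirect draws issued
uint32_t lodCount[MAX_LOD_LEVEL + 1]; // Number of draws per LOD level (written by the compute shader)
} indirectStats;
// Indirect draw commands holding the index offset and instance count of each object
std::vector<VkDrawIndexedIndirectCommand> indirectCommands;
// UBO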
struct {
glm::mat4 projection;
glm::mat4 modelview;
glm::vec4 cameraPos;
glm::vec4 frustumPlanes[6]; // View frustum planes
} uboScene;
// Scene uniform data
struct {
vks::Buffer scene;
} uniformData;
struct {
VkPipeline plants;
} pipelines;
VkPipelineLayout pipelineLayout;
VkDescriptorSet descriptorSet;
VkDescriptorSetLayout descriptorSetLayout;
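// Resources for the compute pipeline
struct {
vks::Buffer lodLevelsBuffers; // Contains the index start and index count of each LOD level
VkQueue queue; // Separate queue for compute commands (the queue family may differ from the graphics one)
VkCommandPool commandPool; // Separate command pool (the queue family may differ from the graphics one)
VkCommandBuffer commandBuffer; // Command buffer storing the dispatch commands and barriers
VkFence fence; // Fence that prevents rewriting the compute command buffer while it is still in use
VkSemaphore semaphore; // Used as a wait semaphore for the graphics submission
VkDescriptorSetLayout descriptorSetLayout; // Compute shader binding layout
VkDescriptorSet descriptorSet; // Compute shader bindings
VkPipelineLayout pipelineLayout; // Layout of the compute pipeline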
VkPipeline pipeline; // Compute pipeline that performs the culling and LOD selection
} compute;
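// Frustum used to cull invisible objects
vks::Frustum frustum;
// Number of models in the scene (OBJECT_COUNT cubed)
uint32_t objectCount = 0;
For this we define a new frustum class that handles the frustum-related operations: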
#include <array>
#include <math.h>
#include <glm/glm.hpp>
namespace vks
{
class Frustum
{
public:
enum side { LEFT = 0, RIGHT = 1, TOP = 2, BOTTOM = 3, BACK = 4, FRONT = 5 };
std::array<glm::vec4, 6> planes;
// Update the six planes from the given matrix
void update(glm::mat4 matrix)
{
planes[LEFT].x = matrix[0].w + matrix[0].x;
planes[LEFT].y = matrix[1].w + matrix[1].x;
planes[LEFT].z = matrix[2].w + matrix[2].x;
planes[LEFT].w = matrix[3].w + matrix[3].x;
planes[RIGHT].x = matrix[0].w - matrix[0].x;
planes[RIGHT].y = matrix[1].w - matrix[1].x;
planes[RIGHT].z = matrix[2].w - matrix[2].x;
planes[RIGHT].w = matrix[3].w - matrix[3].x;
planes[TOP].x = matrix[0].w - matrix[0].y;
planes[TOP].y = matrix[1].w - matrix[1].y;
planes[TOP].z = matrix[2].w - matrix[2].y;
planes[TOP].w = matrix[3].w - matrix[3].y;
planes[BOTTOM].x = matrix[0].w + matrix[0].y;
planes[BOTTOM].y = matrix[1].w + matrix[1].y;
planes[BOTTOM].z = matrix[2].w + matrix[2].y;
planes[BOTTOM].w = matrix[3].w + matrix[3].y;
planes[BACK].x = matrix[0].w + matrix[0].z;
planes[BACK].y = matrix[1].w + matrix[1].z;
planes[BACK].z = matrix[2].w + matrix[2].z;
planes[BACK].w = matrix[3].w + matrix[3].z;
planes[FRONT].x = matrix[0].w - matrix[0].z;
planes[FRONT].y = matrix[1].w - matrix[1].z;
planes[FRONT].z = matrix[2].w - matrix[2].z;
planes[FRONT].w = matrix[3].w - matrix[3].z;
for (auto i = 0; i < planes.size(); i++)
{
float length = sqrtf(planes[i].x * planes[i].x + planes[i].y * planes[i].y + planes[i].z * planes[i].z);
planes[i] /= length;
}
}
// Test a bounding sphere against all six planes
bool checkSphere(glm::vec3 pos, float radius)
{
for (auto i = 0; i < planes.size(); i++)
{
if ((planes[i].x * pos.x) + (planes[i].y * pos.y) + (planes[i].z * pos.z) + planes[i].w <= -radius)
{
return false;
}
}
return true;
}
};
}
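To see how the plane extraction and the sphere test behave, here is a dependency-free sketch of the same arithmetic. It is an illustration only: `Vec4`, `Mat4`, `extractPlanes`, `sphereVisible` and `identity` are hypothetical stand-ins for the glm types and the class above, and the matrix is assumed to be column-major like glm (`m[column][row]`). With an identity matrix the planes describe the cube from -1 to 1 on every axis, which makes the results easy to check by hand.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec4 = std::array<float, 4>; // x, y, z, w
using Mat4 = std::array<Vec4, 4>;  // column-major like glm: m[column][row]

// Same plane arithmetic as Frustum::update, in the same order:
// left, right, top, bottom, back, front.
std::array<Vec4, 6> extractPlanes(const Mat4& m)
{
    std::array<Vec4, 6> planes{};
    for (int c = 0; c < 4; ++c)
    {
        planes[0][c] = m[c][3] + m[c][0]; // left
        planes[1][c] = m[c][3] - m[c][0]; // right
        planes[2][c] = m[c][3] - m[c][1]; // top
        planes[3][c] = m[c][3] + m[c][1]; // bottom
        planes[4][c] = m[c][3] + m[c][2]; // back
        planes[5][c] = m[c][3] - m[c][2]; // front
    }
    // Normalize by the length of the plane normal (x, y, z)
    for (Vec4& p : planes)
    {
        const float length = std::sqrt(p[0] * p[0] + p[1] * p[1] + p[2] * p[2]);
        assert(length > 0.0f);
        for (float& v : p) v /= length;
    }
    return planes;
}

// Same test as Frustum::checkSphere: the sphere is rejected as soon as it
// lies entirely behind (or exactly touches from behind) one of the planes.
bool sphereVisible(const std::array<Vec4, 6>& planes, float x, float y, float z, float radius)
{
    for (const Vec4& p : planes)
    {
        if (p[0] * x + p[1] * y + p[2] * z + p[3] <= -radius) return false;
    }
    return true;
}

// Identity matrix: the resulting planes bound the cube [-1, 1] on each axis
Mat4 identity()
{
    Mat4 m{};
    for (int i = 0; i < 4; ++i) m[i][i] = 1.0f;
    return m;
}
```

A unit sphere centred at x = 1.5 still overlaps the cube and passes the test, while the same sphere at x = 3 lies completely beyond the right plane and is rejected.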
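2.2 Vertex input attributes and graphics pipeline
After loading the model that carries the LOD data, we create setupVertexDescriptions to set up the vertex input attributes: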
void setupVertexDescriptions()
{
vertices.bindingDescriptions.resize(2);
// Binding 0: per-vertex
vertices.bindingDescriptions[0] =
vks::initializers::vertexInputBindingDescription(VERTEX_BUFFER_BIND_ID, vertexLayout.stride(), VK_VERTEX_INPUT_RATE_VERTEX);
// Binding 1: per-instance
vertices.bindingDescriptions[1] =
vks::initializers::vertexInputBindingDescription(INSTANCE_BUFFER_BIND_ID, sizeof(InstanceData), VK_VERTEX_INPUT_RATE_INSTANCE);
// Attribute descriptions
// Describe the memory layout and the shader locations
vertices.attributeDescriptions.clear();
// Per-vertex attributes
// Location 0 : Position
vertices.attributeDescriptions.push_back(
vks::initializers::vertexInputAttributeDescription(
VERTEX_BUFFER_BIND_ID,
0,
VK_FORMAT_R32G32B32_SFLOAT,
0)
);
// Location 1 : Normal
vertices.attributeDescriptions.push_back(
vks::initializers::vertexInputAttributeDescription(
VERTEX_BUFFER_BIND_ID,
1,
VK_FORMAT_R32G32B32_SFLOAT,
sizeof(float) * 3)
);
// Location 2 : Color
vertices.attributeDescriptions.push_back(
vks::initializers::vertexInputAttributeDescription(
VERTEX_BUFFER_BIND_ID,
2,
VK_FORMAT_R32G32B32_SFLOAT,
sizeof(float) * 6)
);
// Per-instance attributes
// Location 4: Position
vertices.attributeDescriptions.push_back(
vks::initializers::vertexInputAttributeDescription(
INSTANCE_BUFFER_BIND_ID, 4, VK_FORMAT_R32G32B32_SFLOAT, offsetof(InstanceData, pos))
);
// Location 5: Scale
vertices.attributeDescriptions.push_back(
vks::initializers::vertexInputAttributeDescription(
INSTANCE_BUFFER_BIND_ID, 5, VK_FORMAT_R32_SFLOAT, offsetof(InstanceData, scale))
);
vertices.inputState = vks::initializers::pipelineVertexInputStateCreateInfo();
vertices.inputState.vertexBindingDescriptionCount = static_cast<uint32_t>(vertices.bindingDescriptions.size());
vertices.inputState.pVertexBindingDescriptions = vertices.bindingDescriptions.data();
vertices.inputState.vertexAttributeDescriptionCount = static_cast<uint32_t>(vertices.attributeDescriptions.size());
vertices.inputState.pVertexAttributeDescriptions = vertices.attributeDescriptions.data();
}
Where the graphics pipeline is created, this information is passed to VkGraphicsPipelineCreateInfo. The vertex and fragment shaders are simple:
Vertex shader (indirectdraw.vert):
#version 450
// Vertex attributes
layout (location = 0) in vec4 inPos;
layout (location = 1) in vec3 inNormal;
layout (location = 2) in vec3 inColor;
// Instanced attributes
layout (location = 4) in vec3 instancePos;
layout (location = 5) in float instanceScale;
layout (binding = 0) uniform UBO
{
mat4 projection;
mat4 modelview;
} ubo;
layout (location = 0) out vec3 outNormal;
layout (location = 1) out vec3 outColor;
layout (location = 2) out vec3 outViewVec;
layout (location = 3) out vec3 outLightVec;
out gl_PerVertex
{
vec4 gl_Position;
};
void main()
{
outColor = inColor;
outNormal = inNormal;
// Build the position from the instance data
vec4 pos = vec4((inPos.xyz * instanceScale) + instancePos, 1.0);
gl_Position = ubo.projection * ubo.modelview * pos;
vec4 wPos = ubo.modelview * vec4(pos.xyz, 1.0);
vec4 lPos = vec4(0.0, 10.0, 50.0, 1.0);
outLightVec = lPos.xyz - pos.xyz;
outViewVec = -pos.xyz;
}
Fragment shader (indirectdraw.frag):
#version 450
layout (location = 0) in vec3 inNormal;
layout (location = 1) in vec3 inColor;
layout (location = 2) in vec3 inViewVec;
layout (location = 3) in vec3 inLightVec;
layout (location = 0) out vec4 outFragColor;
void main()
{
vec3 N = normalize(inNormal);
vec3 L = normalize(inLightVec);
vec3 ambient = vec3(0.25);
vec3 diffuse = vec3(max(dot(N, L), 0.0));
outFragColor = vec4((ambient + diffuse) * inColor, 1.0);
}
2.3 Filling the buffers
With the vertex input and the graphics shaders covered, we fill the buffers. First, create a prepareBuffers function:
void prepareBuffers()
{
objectCount = OBJECT_COUNT * OBJECT_COUNT * OBJECT_COUNT;
vks::Buffer stagingBuffer;
std::vector<InstanceData> instanceData(objectCount);
indirectCommands.resize(objectCount);
// Indirect draw commands
for (uint32_t x = 0; x < OBJECT_COUNT; x++)
{
for (uint32_t y = 0; y < OBJECT_COUNT; y++)
{
for (uint32_t z = 0; z < OBJECT_COUNT; z++)
{
uint32_t index = x + y * OBJECT_COUNT + z * OBJECT_COUNT * OBJECT_COUNT;
indirectCommands[index].instanceCount = 1;
indirectCommands[index].firstInstance = index;
// firstIndex and indexCount are written by the compute shader
}
}
}
indirectStats.drawCount = static_cast<uint32_t>(indirectCommands.size());
VK_CHECK_RESULT(vulkanDevice->createBuffer(
VK_BUFFER_USAGE_TRANSFER_SRC_BIT,
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT,
&stagingBuffer,
indirectCommands.size() * sizeof(VkDrawIndexedIndirectCommand),
indirectCommands.data()));
VK_CHECK_RESULT(vulkanDevice->createBuffer(
VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT | VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT,
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT,
&indirectCommandsBuffer,
stagingBuffer.size));
vulkanDevice->copyBuffer(&stagingBuffer, &indirectCommandsBuffer, queue);
stagingBuffer.destroy();
VK_CHECK_RESULT(vulkanDevice->createBuffer(
VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_TRANSFER_SRC_BIT,
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT,
&indirectDrawCountBuffer,
sizeof(indirectStats)));
// Map for host access
VK_CHECK_RESULT(indirectDrawCountBuffer.map());
// Instance data
for (uint32_t x = 0; x < OBJECT_COUNT; x++)
{
for (uint32_t y = 0; y < OBJECT_COUNT; y++)
{
for (uint32_t z = 0; z < OBJECT_COUNT; z++)
{
uint32_t index = x + y * OBJECT_COUNT + z * OBJECT_COUNT * OBJECT_COUNT;
instanceData[index].pos = glm::vec3((float)x, (float)y, (float)z) - glm::vec3((float)OBJECT_COUNT / 2.0f);
instanceData[index].scale = 2.0f;
}
}
}
VK_CHECK_RESULT(vulkanDevice->createBuffer(
VK_BUFFER_USAGE_TRANSFER_SRC_BIT,
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT,
&stagingBuffer,
instanceData.size() * sizeof(InstanceData),
instanceData.data()));
VK_CHECK_RESULT(vulkanDevice->createBuffer(
VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT,
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT,
&instanceBuffer,
stagingBuffer.size));
vulkanDevice->copyBuffer(&stagingBuffer, &instanceBuffer, queue);
stagingBuffer.destroy();
// Shader storage buffer containing the index offsets and index counts of the LODs
struct LOD
{
uint32_t firstIndex;
uint32_t indexCount;
float distance;
float _pad0;
};
std::vector<LOD> LODLevels;
uint32_t n = 0;
for (auto modelPart : models.lodObject.parts)
{
LOD lod;
lod.firstIndex = modelPart.indexBase; // First index of this LOD
lod.indexCount = modelPart.indexCount; // Index count of this LOD
lod.distance = 5.0f + n * 5.0f; // Start distance of this LOD (measured from the camera position), used by the compute shader to pick the LOD
n++;
LODLevels.push_back(lod);
}
VK_CHECK_RESULT(vulkanDevice->createBuffer(
VK_BUFFER_USAGE_TRANSFER_SRC_BIT,
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT,
&stagingBuffer,
LODLevels.size() * sizeof(LOD),
LODLevels.data()));
VK_CHECK_RESULT(vulkanDevice->createBuffer(
VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT,
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT,
&compute.lodLevelsBuffers,
stagingBuffer.size));
vulkanDevice->copyBuffer(&stagingBuffer, &compute.lodLevelsBuffers, queue);
stagingBuffer.destroy();
// Scene uniform buffer
VK_CHECK_RESULT(vulkanDevice->createBuffer(
VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT,
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT,
&uniformData.scene,
sizeof(uboScene)));
VK_CHECK_RESULT(uniformData.scene.map());
updateUniformBuffer(true);
}
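The index arithmetic used in both loops above maps a 3D grid cell to a slot in the flat command and instance arrays. The following sketch isolates that mapping and its inverse; `kObjectsPerAxis`, `GridCoord`, `flattenIndex`, `unflattenIndex` and `centredCoordinate` are hypothetical names introduced here for illustration, with `kObjectsPerAxis` mirroring `OBJECT_COUNT`.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kObjectsPerAxis = 16; // mirrors OBJECT_COUNT in the source

struct GridCoord
{
    std::uint32_t x, y, z;
};

// Same formula as in prepareBuffers: x varies fastest, z slowest
std::uint32_t flattenIndex(GridCoord c)
{
    assert(c.x < kObjectsPerAxis && c.y < kObjectsPerAxis && c.z < kObjectsPerAxis);
    return c.x + c.y * kObjectsPerAxis + c.z * kObjectsPerAxis * kObjectsPerAxis;
}

// Inverse mapping, handy when inspecting a single draw command by index
GridCoord unflattenIndex(std::uint32_t index)
{
    assert(index < kObjectsPerAxis * kObjectsPerAxis * kObjectsPerAxis);
    return { index % kObjectsPerAxis,
             (index / kObjectsPerAxis) % kObjectsPerAxis,
             index / (kObjectsPerAxis * kObjectsPerAxis) };
}

// Coordinate of a cell along one axis after centring the grid on the origin,
// matching the subtraction of OBJECT_COUNT / 2 in the instance loop
float centredCoordinate(std::uint32_t cell)
{
    return static_cast<float>(cell) - static_cast<float>(kObjectsPerAxis) / 2.0f;
}
```

Because firstInstance is set to the same flat index, each draw command reads exactly the instance entry that was written for its grid cell.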
The updateUniformBuffer function keeps the scene data up to date:
void updateUniformBuffer(bool viewChanged)
{
if (viewChanged)
{
uboScene.projection = camera.matrices.perspective;
uboScene.modelview = camera.matrices.view;
// Skipping the frustum update keeps the frustum fixed, so the same set of models stays visible
if (!fixedFrustum)
{
uboScene.cameraPos = glm::vec4(camera.position, 1.0f) * -1.0f;
frustum.update(uboScene.projection * uboScene.modelview);
memcpy(uboScene.frustumPlanes, frustum.planes.data(), sizeof(glm::vec4) * 6);
}
}
memcpy(uniformData.scene.mapped, &uboScene, sizeof(uboScene));
}
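After that come the usual steps: creating the descriptor set layout, the pipeline setup from 2.2, the descriptor pool, the descriptor sets, and so on.
2.4 Creating and running the compute pipeline
Now that all the scene data has been created, we set up the compute pipeline and its commands: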
void prepareCompute()
{
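// Get a compute-capable device queue
vkGetDeviceQueue(device, vulkanDevice->queueFamilyIndices.compute, 0, &compute.queue);
// Create the compute pipeline
// Compute pipelines are created separately from graphics pipelines, even if they use the same queue (family index)
std::vector<VkDescriptorSetLayoutBinding> setLayoutBindings = {
// Binding 0: Instance input data buffer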
vks::initializers::descriptorSetLayoutBinding(
VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
VK_SHADER_STAGE_COMPUTE_BIT,
0),
// Binding 1: Indirect draw command buffer (written by the compute shader)
vks::initializers::descriptorSetLayoutBinding(
VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
VK_SHADER_STAGE_COMPUTE_BIT,
1),
// Binding 2: Uniform buffer with global matrices (input)
vks::initializers::descriptorSetLayoutBinding(
VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER,
VK_SHADER_STAGE_COMPUTE_BIT,
2),
// Binding 3: Indirect draw stats (output)
vks::initializers::descriptorSetLayoutBinding(
VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
VK_SHADER_STAGE_COMPUTE_BIT,
3),
// Binding 4: LOD info (input)
vks::initializers::descriptorSetLayoutBinding(
VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
VK_SHADER_STAGE_COMPUTE_BIT,
4),
};
VkDescriptorSetLayoutCreateInfo descriptorLayout =
vks::initializers::descriptorSetLayoutCreateInfo(
setLayoutBindings.data(),
static_cast<uint32_t>(setLayoutBindings.size()));
VK_CHECK_RESULT(vkCreateDescriptorSetLayout(device, &descriptorLayout, nullptr, &compute.descriptorSetLayout));
VkPipelineLayoutCreateInfo pPipelineLayoutCreateInfo =
vks::initializers::pipelineLayoutCreateInfo(
&compute.descriptorSetLayout,
1);
VK_CHECK_RESULT(vkCreatePipelineLayout(device, &pPipelineLayoutCreateInfo, nullptr, &compute.pipelineLayout));
VkDescriptorSetAllocateInfo allocInfo =
vks::initializers::descriptorSetAllocateInfo(
descriptorPool,
&compute.descriptorSetLayout,
1);
VK_CHECK_RESULT(vkAllocateDescriptorSets(device, &allocInfo, &compute.descriptorSet));
std::vector<VkWriteDescriptorSet> computeWriteDescriptorSets =
{
// Binding 0: Instance input data buffer
vks::initializers::writeDescriptorSet(
compute.descriptorSet,
VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
0,
&instanceBuffer.descriptor),
// Binding 1: Indirect draw command output buffer
vks::initializers::writeDescriptorSet(
compute.descriptorSet,
VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
1,
&indirectCommandsBuffer.descriptor),
// Binding 2: Uniform buffer with global matrices
vks::initializers::writeDescriptorSet(
compute.descriptorSet,
VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER,
2,
&uniformData.scene.descriptor),
// Binding 3: Atomic counters (written by the shader)
vks::initializers::writeDescriptorSet(
compute.descriptorSet,
VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
3,
&indirectDrawCountBuffer.descriptor),
// Binding 4: LOD info
vks::initializers::writeDescriptorSet(
compute.descriptorSet,
VK_DESCRIPTOR_TYPE_STORAGE_BUFFER,
4,
&compute.lodLevelsBuffers.descriptor)
};
vkUpdateDescriptorSets(device, static_cast<uint32_t>(computeWriteDescriptorSets.size()), computeWriteDescriptorSets.data(), 0, NULL);
// Create the pipeline
VkComputePipelineCreateInfo computePipelineCreateInfo = vks::initializers::computePipelineCreateInfo(compute.pipelineLayout, 0);
computePipelineCreateInfo.stage = loadShader(getAssetPath() + "shaders/computecullandlod/cull.comp.spv", VK_SHADER_STAGE_COMPUTE_BIT);
// Use a specialization constant to pass the maximum level of detail (determined by the number of meshes)
VkSpecializationMapEntry specializationEntry{};
specializationEntry.constantID = 0;
specializationEntry.offset = 0;
specializationEntry.size = sizeof(uint32_t);
uint32_t specializationData = static_cast<uint32_t>(models.lodObject.parts.size()) - 1;
VkSpecializationInfo specializationInfo;
specializationInfo.mapEntryCount = 1;
specializationInfo.pMapEntries = &specializationEntry;
specializationInfo.dataSize = sizeof(specializationData);
specializationInfo.pData = &specializationData;
computePipelineCreateInfo.stage.pSpecializationInfo = &specializationInfo;
VK_CHECK_RESULT(vkCreateComputePipelines(device, pipelineCache, 1, &computePipelineCreateInfo, nullptr, &compute.pipeline));
// Separate command pool, as the compute queue family may differ from the graphics one
VkCommandPoolCreateInfo cmdPoolInfo = {};
cmdPoolInfo.sType = VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO;
cmdPoolInfo.queueFamilyIndex = vulkanDevice->queueFamilyIndices.compute;
cmdPoolInfo.flags = VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT;
VK_CHECK_RESULT(vkCreateCommandPool(device, &cmdPoolInfo, nullptr, &compute.commandPool));
// Create a command buffer for the compute operations
VkCommandBufferAllocateInfo cmdBufAllocateInfo =
vks::initializers::commandBufferAllocateInfo(
compute.commandPool,
VK_COMMAND_BUFFER_LEVEL_PRIMARY,
1);
VK_CHECK_RESULT(vkAllocateCommandBuffers(device, &cmdBufAllocateInfo, &compute.commandBuffer));
// Fence for compute command buffer synchronization
VkFenceCreateInfo fenceCreateInfo = vks::initializers::fenceCreateInfo(VK_FENCE_CREATE_SIGNALED_BIT);
VK_CHECK_RESULT(vkCreateFence(device, &fenceCreateInfo, nullptr, &compute.fence));
VkSemaphoreCreateInfo semaphoreCreateInfo = vks::initializers::semaphoreCreateInfo();
VK_CHECK_RESULT(vkCreateSemaphore(device, &semaphoreCreateInfo, nullptr, &compute.semaphore));
// Build a single command buffer containing the compute dispatch commands
buildComputeCommandBuffer();
}
The single command buffer containing the compute dispatch commands is built in buildComputeCommandBuffer:
void buildComputeCommandBuffer()
{
VkCommandBufferBeginInfo cmdBufInfo = vks::initializers::commandBufferBeginInfo();
VK_CHECK_RESULT(vkBeginCommandBuffer(compute.commandBuffer, &cmdBufInfo));
// Add a memory barrier to ensure that the indirect commands have been consumed before the compute shader updates them
VkBufferMemoryBarrier bufferBarrier = vks::initializers::bufferMemoryBarrier();
bufferBarrier.buffer = indirectCommandsBuffer.buffer;
bufferBarrier.size = indirectCommandsBuffer.descriptor.range;
bufferBarrier.srcAccessMask = VK_ACCESS_INDIRECT_COMMAND_READ_BIT;
bufferBarrier.dstAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
bufferBarrier.srcQueueFamilyIndex = vulkanDevice->queueFamilyIndices.graphics;
bufferBarrier.dstQueueFamilyIndex = vulkanDevice->queueFamilyIndices.compute;
vkCmdPipelineBarrier(
compute.commandBuffer,
VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT,
VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
VK_FLAGS_NONE,
0, nullptr,
1, &bufferBarrier,
0, nullptr);
vkCmdBindPipeline(compute.commandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, compute.pipeline);
vkCmdBindDescriptorSets(compute.commandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, compute.pipelineLayout, 0, 1, &compute.descriptorSet, 0, 0);
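// Dispatch the compute job
// The compute shader does the frustum culling and adjusts the indirect draw calls depending on object visibility.
// It also determines the LOD to use depending on the distance to the viewer.
vkCmdDispatch(compute.commandBuffer, objectCount / 16, 1, 1);
// Add a memory barrier to ensure that the compute shader has finished writing the indirect command buffer before it is consumed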
bufferBarrier.srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
bufferBarrier.dstAccessMask = VK_ACCESS_INDIRECT_COMMAND_READ_BIT;
bufferBarrier.buffer = indirectCommandsBuffer.buffer;
bufferBarrier.size = indirectCommandsBuffer.descriptor.range;
bufferBarrier.srcQueueFamilyIndex = vulkanDevice->queueFamilyIndices.compute;
bufferBarrier.dstQueueFamilyIndex = vulkanDevice->queueFamilyIndices.graphics;
vkCmdPipelineBarrier(
compute.commandBuffer,
VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT,
VK_FLAGS_NONE,
0, nullptr,
1, &bufferBarrier,
0, nullptr);
vkEndCommandBuffer(compute.commandBuffer);
}
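The dispatch above divides objectCount by the workgroup size of 16, which is exact here because 4096 is a multiple of 16. For other object counts the integer division would leave the trailing objects unprocessed. A common alternative is to round up, sketched below with the hypothetical helpers `workgroupCount` and `surplusInvocations`. Note that rounding up is an assumption on top of the source: the compute shader would then also need to know the object count and return early for indices past the end, which the shader in this part does not do.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kLocalSizeX = 16; // must match local_size_x in the compute shader

// Number of workgroups needed so that every object is covered at least once
std::uint32_t workgroupCount(std::uint32_t objectCount)
{
    assert(objectCount > 0);
    return (objectCount + kLocalSizeX - 1) / kLocalSizeX;
}

// Invocations in the last workgroup that fall past the end of the object list.
// Each of them would need a bounds check in the shader before touching any buffer.
std::uint32_t surplusInvocations(std::uint32_t objectCount)
{
    return workgroupCount(objectCount) * kLocalSizeX - objectCount;
}
```

For the 4096 objects of this scene both approaches give 256 workgroups and no surplus invocations; with 4097 objects the rounded-up version gives 257 workgroups, 15 of whose invocations have nothing to process.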
Now for the key part: how the compute shader is implemented. Here is the code:
Compute shader (cull.comp):
#version 450
layout (constant_id = 0) const int MAX_LOD_LEVEL = 5;
struct InstanceData
{
vec3 pos;
float scale;
};
// Binding 0: Instance input data for culling
layout (binding = 0, std140) buffer Instances
{
InstanceData instances[ ];
};
// Same layout as VkDrawIndexedIndirectCommand
struct IndexedIndirectCommand
{
uint indexCount;
uint instanceCount;
uint firstIndex;
uint vertexOffset;
uint firstInstance;
};
// Binding 1: Multi-draw output
layout (binding = 1, std430) writeonly buffer IndirectDraws
{
IndexedIndirectCommand indirectDraws[ ];
};
// Binding 2: Uniform block object with matrices
layout (binding = 2) uniform UBO
{
mat4 projection;
mat4 modelview;
vec4 cameraPos;
vec4 frustumPlanes[6];
} ubo;
// Binding 3: Indirect draw stats
layout (binding = 3) buffer UBOOut
{
uint drawCount;
uint lodCount[MAX_LOD_LEVEL + 1];
} uboOut;
// Binding 4: Level-of-detail information
struct LOD
{
uint firstIndex;
uint indexCount;
float distance;
float _pad0;
};
layout (binding = 4) readonly buffer LODs
{
LOD lods[ ];
};
layout (local_size_x = 16) in;
bool frustumCheck(vec4 pos, float radius)
{
// Test the bounding sphere against each frustum plane
for (int i = 0; i < 6; i++)
{
if (dot(pos, ubo.frustumPlanes[i]) + radius < 0.0)
{
return false;
}
}
return true;
}
void main()
{
uint idx = gl_GlobalInvocationID.x + gl_GlobalInvocationID.y * gl_NumWorkGroups.x * gl_WorkGroupSize.x;
// Clear the stats on the first invocation
if (idx == 0)
{
atomicExchange(uboOut.drawCount, 0); // Reset the draw count at the start of a frame
for (uint i = 0; i < MAX_LOD_LEVEL + 1; i++)
{
atomicExchange(uboOut.lodCount[i], 0);
}
}
vec4 pos = vec4(instances[idx].pos.xyz, 1.0);
// Check if the object is within the current viewing frustum
if (frustumCheck(pos, 1.0))
{
indirectDraws[idx].instanceCount = 1;
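// Increase the number of indirect draws
atomicAdd(uboOut.drawCount, 1); // One more visible object in this frame
// Select the appropriate LOD level based on the distance to the camera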
uint lodLevel = MAX_LOD_LEVEL;
for (uint i = 0; i < MAX_LOD_LEVEL; i++)
{
if (distance(instances[idx].pos.xyz, ubo.cameraPos.xyz) < lods[i].distance)
{
lodLevel = i;
break;
}
}
indirectDraws[idx].firstIndex = lods[lodLevel].firstIndex;
indirectDraws[idx].indexCount = lods[lodLevel].indexCount;
// Update the stats
atomicAdd(uboOut.lodCount[lodLevel], 1);
}
else
{
indirectDraws[idx].instanceCount = 0;
}
}
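The shader and the C++ side share three structures through raw buffer memory, so their byte layouts have to agree. The sketch below states those expectations as compile-time checks. It is an illustration under assumptions: the `Mirror` structs are hypothetical stand-ins that avoid the Vulkan and glm headers, and `std140Stride` is a simplified helper that only covers structs made of 4-byte scalars. One plausible reason the draw command buffer is declared std430 in the shader is visible here: under std140 the array stride of a 20-byte struct would be padded to 32 bytes and would no longer match the tightly packed commands.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mirrors of InstanceData, VkDrawIndexedIndirectCommand and LOD
struct InstanceDataMirror
{
    float pos[3];
    float scale;
};

struct DrawCommandMirror
{
    std::uint32_t indexCount;
    std::uint32_t instanceCount;
    std::uint32_t firstIndex;
    std::int32_t  vertexOffset;
    std::uint32_t firstInstance;
};

struct LodMirror
{
    std::uint32_t firstIndex;
    std::uint32_t indexCount;
    float distance;
    float pad0;
};

static_assert(sizeof(InstanceDataMirror) == 16, "vec3 + float packs into 16 bytes");
static_assert(offsetof(InstanceDataMirror, scale) == 12, "scale follows pos directly");
static_assert(sizeof(DrawCommandMirror) == 20, "five 32-bit fields, tightly packed");
static_assert(sizeof(LodMirror) == 16, "padded to 16 bytes by the explicit pad field");

// Simplified std140 rule for a struct of 4-byte scalars:
// the array stride is the struct size rounded up to a multiple of 16.
constexpr std::size_t std140Stride(std::size_t structSize)
{
    return (structSize + 15) / 16 * 16;
}

// Byte offset of one object's draw command inside the indirect buffer
std::size_t commandOffset(std::size_t objectIndex, std::size_t objectCount)
{
    assert(objectIndex < objectCount);
    return objectIndex * sizeof(DrawCommandMirror);
}
```

The same 20-byte stride is what the draw calls pass as the stride argument when they read the commands back.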
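2.5 Drawing with the graphics pipeline
With the compute pass in place, we first record the regular graphics command buffers together with the indirect draw commands. The draws then use the culling and LOD data to render the scene: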
void buildCommandBuffers()
{
VkCommandBufferBeginInfo cmdBufInfo = vks::initializers::commandBufferBeginInfo();
VkClearValue clearValues[2];
clearValues[0].color = { { 0.18f, 0.27f, 0.5f, 0.0f } };
clearValues[1].depthStencil = { 1.0f, 0 };
VkRenderPassBeginInfo renderPassBeginInfo = vks::initializers::renderPassBeginInfo();
renderPassBeginInfo.renderPass = renderPass;
renderPassBeginInfo.renderArea.extent.width = width;
renderPassBeginInfo.renderArea.extent.height = height;
renderPassBeginInfo.clearValueCount = 2;
renderPassBeginInfo.pClearValues = clearValues;
for (int32_t i = 0; i < drawCmdBuffers.size(); ++i)
{
// Set the target framebuffer
renderPassBeginInfo.framebuffer = frameBuffers[i];
VK_CHECK_RESULT(vkBeginCommandBuffer(drawCmdBuffers[i], &cmdBufInfo));
vkCmdBeginRenderPass(drawCmdBuffers[i], &renderPassBeginInfo, VK_SUBPASS_CONTENTS_INLINE);
VkViewport viewport = vks::initializers::viewport((float)width, (float)height, 0.0f, 1.0f);
vkCmdSetViewport(drawCmdBuffers[i], 0, 1, &viewport);
VkRect2D scissor = vks::initializers::rect2D(width, height, 0, 0);
vkCmdSetScissor(drawCmdBuffers[i], 0, 1, &scissor);
VkDeviceSize offsets[1] = { 0 };
vkCmdBindDescriptorSets(drawCmdBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, pipelineLayout, 0, 1, &descriptorSet, 0, NULL);
// Mesh containing the LODs
vkCmdBindPipeline(drawCmdBuffers[i], VK_PIPELINE_BIND_POINT_GRAPHICS, pipelines.plants);
vkCmdBindVertexBuffers(drawCmdBuffers[i], VERTEX_BUFFER_BIND_ID, 1, &models.lodObject.vertices.buffer, offsets);
vkCmdBindVertexBuffers(drawCmdBuffers[i], INSTANCE_BUFFER_BIND_ID, 1, &instanceBuffer.buffer, offsets);
vkCmdBindIndexBuffer(drawCmdBuffers[i], models.lodObject.indices.buffer, 0, VK_INDEX_TYPE_UINT32);
if (vulkanDevice->features.multiDrawIndirect)
{
vkCmdDrawIndexedIndirect(drawCmdBuffers[i], indirectCommandsBuffer.buffer, 0, indirectCommands.size(), sizeof(VkDrawIndexedIndirectCommand));
}
else
{
// If multi draw is not available, we must issue separate draw commands
for (auto j = 0; j < indirectCommands.size(); j++)
{
vkCmdDrawIndexedIndirect(drawCmdBuffers[i], indirectCommandsBuffer.buffer, j * sizeof(VkDrawIndexedIndirectCommand), 1, sizeof(VkDrawIndexedIndirectCommand));
}
}
drawUI(drawCmdBuffers[i]);
vkCmdEndRenderPass(drawCmdBuffers[i]);
VK_CHECK_RESULT(vkEndCommandBuffer(drawCmdBuffers[i]));
}
}
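When submitting the commands, we also have to synchronize the compute work and the graphics work with a semaphore and a fence, so that the data is written and read in the right order: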
void draw()
{
VulkanExampleBase::prepareFrame();
// Submit the compute shader for frustum culling
// Wait for the fence to ensure that the compute buffer writes have finished
vkWaitForFences(device, 1, &compute.fence, VK_TRUE, UINT64_MAX);
vkResetFences(device, 1, &compute.fence);
VkSubmitInfo computeSubmitInfo = vks::initializers::submitInfo();
computeSubmitInfo.commandBufferCount = 1;
computeSubmitInfo.pCommandBuffers = &compute.commandBuffer;
computeSubmitInfo.signalSemaphoreCount = 1;
computeSubmitInfo.pSignalSemaphores = &compute.semaphore;
VK_CHECK_RESULT(vkQueueSubmit(compute.queue, 1, &computeSubmitInfo, VK_NULL_HANDLE));
// Submit the graphics command buffer
submitInfo.commandBufferCount = 1;
submitInfo.pCommandBuffers = &drawCmdBuffers[currentBuffer];
// Wait on the present and compute semaphores
std::array<VkPipelineStageFlags,2> stageFlags = {
VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
};
std::array<VkSemaphore,2> waitSemaphores = {
semaphores.presentComplete, // Wait for presentation to finish
compute.semaphore // Wait for compute to finish
};
submitInfo.pWaitSemaphores = waitSemaphores.data();
submitInfo.waitSemaphoreCount = static_cast<uint32_t>(waitSemaphores.size());
submitInfo.pWaitDstStageMask = stageFlags.data();
// Submit to the queue
VK_CHECK_RESULT(vkQueueSubmit(queue, 1, &submitInfo, compute.fence));
VulkanExampleBase::submitFrame();
// Read the draw counts written by the compute shader
memcpy(&indirectStats, indirectDrawCountBuffer.mapped, sizeof(indirectStats));
}
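Next, run the example to see the result:
As the screenshot above shows, when the camera is far from the models they are drawn at LOD levels 3/4/5. Moving the camera in close gives the screenshot below:
The models now become progressively more detailed, at LOD levels 0/1/2. Enabling the fixed frustum at this point and moving the camera gives the screenshot below:
With the frustum fixed, only the 486 models inside its range (out of 4096) are rendered; all the others are culled.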