More articles: implementing a darknet prediction/classification dynamic library from scratch
The error:
[5956] MPTLOG 12256 cuDNN status Error in: file: convolutional_layer.c : cudnn_convolutional_setup() : line: 237 : build time: Dec 13 2019 - 11:54:32 status:3
status = 3 means CUDNN_STATUS_BAD_PARAM.
The offending code:
#if((CUDNN_MAJOR*10 + CUDNN_MINOR) >= 72) // cuDNN >= 7.2
CHECK_CUDNN(cudnnSetConvolutionMathType(l->convDesc, CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION));
#endif
What the code does:
For the supported GPUs, the Tensor Core operations will be triggered for convolution functions only when cudnnSetConvolutionMathType() is called on the appropriate convolution descriptor by setting the mathType to CUDNN_TENSOR_OP_MATH or CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION.
3.180. cudnnSetConvolutionMathType()
cudnnStatus_t cudnnSetConvolutionMathType(
cudnnConvolutionDescriptor_t convDesc,
cudnnMathType_t mathType)
This function allows the user to specify whether or not the use of tensor op is permitted in the library routines associated with a given convolution descriptor.
Returns
CUDNN_STATUS_SUCCESS
The math type was set successfully.
CUDNN_STATUS_BAD_PARAM
Either an invalid convolution descriptor was provided or an invalid math type was specified.
A new mode CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION is added to cudnnMathType_t. The computation time for FP32 tensors can be reduced by selecting this mode.
The functions cudnnRNNForwardInference(), cudnnRNNForwardTraining(), cudnnRNNBackwardData(), and cudnnRNNBackwardWeights() will now perform down conversion of FP32 input/output only when CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION is set.
Improved the heuristics for cudnnGet*Algorithm() functions.
Following issues and limitations exist in this release:
- When tensor cores are enabled in cuDNN 7.3.0, the wgrad calculations will perform an illegal memory access when K and C values are both non-integral multiples of 8. This will not likely produce incorrect results, but may corrupt other memory depending on the user buffer locations. This issue is present on Volta & Turing architectures.
- Using cudnnGetConvolution*_v7 routines with cudnnConvolutionDescriptor_t set to CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION leads to incorrect outputs. These incorrect outputs will consist only of CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION cases, instead of also returning the performance results for both DEFAULT_MATH and CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION cases.
In Visual Studio, removing CUDNN from the C/C++ preprocessor settings makes the program run without the error;
but with cuDNN disabled, the predictions come back as NaN.
Since the offending code only accelerates the convolution, commenting it out resolves the problem:
#if((CUDNN_MAJOR*10 + CUDNN_MINOR) >= 72) // cuDNN >= 7.2
//CHECK_CUDNN(cudnnSetConvolutionMathType(l->convDesc, CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION));
#endif
Possible causes:
1. Another device or process was using the GPU at the same time.
2. A bad parameter, i.e. invalid data was accessed.
Official documentation: https://docs.nvidia.com/deeplearning/sdk/cudnn-archived/cudnn_701/cudnn-user-guide/index.html
CUDNN_STATUS_BAD_PARAM
An incorrect value or parameter was passed to the function.
To correct: ensure that all the parameters being passed have valid values.
The error does not occur when tested standalone, nor when tested side by side with the company software; it only occurs when called from inside the company software. So the parameters are probably fine, and the likely cause is another component using the GPU at the same time.
References:
https://devblogs.nvidia.com/tensor-ops-made-easier-in-cudnn/
Caffe compatibility issue with cuDNN 6.0: CUDNN_STATUS_BAD_PARAM
Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0) CUDNN_STATUS_BAD_PARAM