More articles: implementing a darknet prediction/classification dynamic library from scratch
The error:
[5956] MPTLOG 12256 cuDNN status Error in: file: convolutional_layer.c : cudnn_convolutional_setup() : line: 237 : build time: Dec 13 2019 - 11:54:32 status:3
status = 3 means CUDNN_STATUS_BAD_PARAM.
The offending code:
#if((CUDNN_MAJOR*10 + CUDNN_MINOR) >= 72) // cuDNN >= 7.2
CHECK_CUDNN(cudnnSetConvolutionMathType(l->convDesc, CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION));
#endif
What the code does:
For the supported GPUs, the Tensor Core operations will be triggered for convolution functions only when cudnnSetConvolutionMathType() is called on the appropriate convolution descriptor by setting the mathType to CUDNN_TENSOR_OP_MATH or CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION.
3.180. cudnnSetConvolutionMathType()
cudnnStatus_t cudnnSetConvolutionMathType(
cudnnConvolutionDescriptor_t convDesc,
cudnnMathType_t mathType)
This function allows the user to specify whether or not the use of tensor op is permitted in the library routines associated with a given convolution descriptor.
Returns
CUDNN_STATUS_SUCCESS
The math type was set successfully.
CUDNN_STATUS_BAD_PARAM
Either an invalid convolution descriptor was provided or an invalid math type was specified.
A new mode CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION is added to cudnnMathType_t. The computation time for FP32 tensors can be reduced by selecting this mode.
The functions cudnnRNNForwardInference(), cudnnRNNForwardTraining(), cudnnRNNBackwardData(), and cudnnRNNBackwardWeights() will now perform down conversion of FP32 input/output only when CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION is set.
Improved the heuristics for cudnnGet*Algorithm() functions.
Following issues and limitations exist in this release:
- When tensor cores are enabled in cuDNN 7.3.0, the wgrad calculations will perform an illegal memory access when K and C values are both non-integral multiples of 8. This will not likely produce incorrect results, but may corrupt other memory depending on the user buffer locations. This issue is present on Volta & Turing architectures.
- Using cudnnGetConvolution*_v7 routines with cudnnConvolutionDescriptor_t set to CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION leads to incorrect outputs. These incorrect outputs will consist only of CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION cases, instead of also returning the performance results for both DEFAULT_MATH and CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION cases.
In Visual Studio, removing CUDNN from the C/C++ preprocessor settings makes the program run without the error;
but with cuDNN disabled, the predictions come back as NaN.
Since the offending code only accelerates the convolution, commenting it out resolves the problem:
#if((CUDNN_MAJOR*10 + CUDNN_MINOR) >= 72) // cuDNN >= 7.2
//CHECK_CUDNN(cudnnSetConvolutionMathType(l->convDesc, CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION));
#endif
Possible causes:
1. Another device or process was using the GPU at the same time.
2. A bad parameter, i.e. invalid data was accessed.
Official documentation: https://docs.nvidia.com/deeplearning/sdk/cudnn-archived/cudnn_701/cudnn-user-guide/index.html
CUDNN_STATUS_BAD_PARAM
An incorrect value or parameter was passed to the function.
To correct: ensure that all the parameters being passed have valid values.
The error does not occur when tested standalone, nor when tested side by side with the company software; it only occurs when called from inside the company software. So the parameters are probably fine, and the likely cause is another component using the GPU at the same time.
References:
https://devblogs.nvidia.com/tensor-ops-made-easier-in-cudnn/
Caffe compatibility issue with cuDNN 6.0: CUDNN_STATUS_BAD_PARAM
Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0) CUDNN_STATUS_BAD_PARAM