CUDNN_STATUS_BAD_PARAM error when integrating the darknet dynamic library into company software

See also: implementing a darknet prediction/classification dynamic library by hand

The error:

[5956]  MPTLOG 12256 cuDNN status Error in: file: convolutional_layer.c : cudnn_convolutional_setup() : line: 237 : build time: Dec 13 2019 - 11:54:32 status:3

status = 3 corresponds to CUDNN_STATUS_BAD_PARAM

The offending code:

#if((CUDNN_MAJOR*10 + CUDNN_MINOR) >= 72)   // cuDNN >= 7.2
    CHECK_CUDNN(cudnnSetConvolutionMathType(l->convDesc, CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION));
#endif

Understanding the code:

For the supported GPUs, the Tensor Core operations will be triggered for convolution functions only when cudnnSetConvolutionMathType() is called on the appropriate convolution descriptor by setting the mathType to CUDNN_TENSOR_OP_MATH or CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION.

3.180. cudnnSetConvolutionMathType()
cudnnStatus_t cudnnSetConvolutionMathType(
    cudnnConvolutionDescriptor_t    convDesc,
    cudnnMathType_t                 mathType)
This function allows the user to specify whether or not the use of tensor op is permitted in the library routines associated with a given convolution descriptor.

Returns
CUDNN_STATUS_SUCCESS
The math type was set successfully.

CUDNN_STATUS_BAD_PARAM
Either an invalid convolution descriptor was provided or an invalid math type was specified.

From the cuDNN 7.2 release notes:

  • A new mode CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION is added to cudnnMathType_t. The computation time for FP32 tensors can be reduced by selecting this mode.
  • The functions cudnnRNNForwardInference(), cudnnRNNForwardTraining(), cudnnRNNBackwardData(), and cudnnRNNBackwardWeights() will now perform down conversion of FP32 input/output only when CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION is set.
  • Improved the heuristics for cudnnGet*Algorithm() functions.

Following issues and limitations exist in this release:

  • When tensor cores are enabled in cuDNN 7.3.0, the wgrad calculations will perform an illegal memory access when K and C values are both non-integral multiples of 8. This will not likely produce incorrect results, but may corrupt other memory depending on the user buffer locations. This issue is present on Volta & Turing architectures.
  • Using cudnnGetConvolution*_v7 routines with cudnnConvolutionDescriptor_t set to CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION leads to incorrect outputs. These incorrect outputs will consist only of CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION cases, instead of also returning the performance results for both DEFAULT_MATH and CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION cases.


If, in Visual Studio, CUDNN is removed under C/C++ → Preprocessor Definitions, the program runs without the error;

but with CUDNN removed, the predictions come back as NaN.

Since the offending code only speeds up the convolution computation, commenting it out resolves the problem:

#if((CUDNN_MAJOR*10 + CUDNN_MINOR) >= 72)   // cuDNN >= 7.2
    //CHECK_CUDNN(cudnnSetConvolutionMathType(l->convDesc, CUDNN_TENSOR_OP_MATH_ALLOW_CONVERSION));
#endif

Possible causes:

1. Another device was using the GPU at the time of the call

2. A bad parameter, i.e. invalid data was accessed

Official documentation: https://docs.nvidia.com/deeplearning/sdk/cudnn-archived/cudnn_701/cudnn-user-guide/index.html

CUDNN_STATUS_BAD_PARAM

An incorrect value or parameter was passed to the function.

To correct: ensure that all the parameters being passed have valid values.

Since the error does not occur when tested standalone, nor when tested alongside the company software, but only when called from inside the company software, the parameters are probably fine; most likely another device was using the GPU when the error occurred.

References:

https://devblogs.nvidia.com/tensor-ops-made-easier-in-cudnn/

Caffe and cuDNN 6.0 compatibility issue: CUDNN_STATUS_BAD_PARAM

Cause of the Caffe error: cudnn.hpp:86] Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0) CUDNN_STATUS_BAD_PARAM
