Summary of problems using Caffe + GPU on Windows 10

  • Environment:

GTX 1080 Ti

i7

32 GB RAM

Samsung SM951 256 GB NVMe SSD

VS2013 + Windows 10 + CUDA 9.0 + cuDNN 5.1

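  • On this machine I could only get Windows 10 installed on the NVMe SSD and was unable to install Linux, so all of the problems below were encountered on Windows.

  • x64 builds cannot be debugged; the error "The debug monitor (MSVSMON.EXE) failed to start" appears.

Workaround: after launching VS2013, reopen the Caffe project via File --> Open --> Project. This is only a temporary fix.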


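  • I used the Community edition of VS2013. With the Professional edition, NuGet would not work for reasons I could not determine, and during compilation the protobuf step failed with error code 1083 together with a DLL read error. I do not know the cause.

  • OpenCV, glog, and gflags produced the following errors when configuring the GPU build:

      error: NuGet Error: Unknown command: "overlay"

      error: MSB4062 The "NuGetPackageOverlay" task could not be loaded

      No matter how I configured NuGet, something went wrong.

      Solution: configure the libraries manually. The packages.config file in the project lists the versions of the dependency libraries; download the matching libraries, then set them up in the property file CommonSettings.props: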

<PropertyGroup Label="UserMacros">
        <!-- opencv 2.4.10 supported -->
        <OpencvPath>D:\wxf\opencv\build</OpencvPath>
        <OpencvDependencies></OpencvDependencies>

        <!-- Glog supported -->
        <GlogPath>F:\caffe\NugetPackages\glog.0.3.3.0\build\native</GlogPath>
        <!-- Gflags supported -->
        <GflagsPath>F:\caffe\NugetPackages\gflags.2.1.2.1\build\native</GflagsPath>
</PropertyGroup>

<PropertyGroup>
        <LibraryPath>$(OpencvPath)\x64\vc12\lib;$(LibraryPath)</LibraryPath>
        <IncludePath>$(OpencvPath)\include;$(OpencvPath)\include\opencv;$(OpencvPath)\include\opencv2;$(IncludePath)</IncludePath>
</PropertyGroup>

<PropertyGroup>
        <LibraryPath>$(GflagsPath)\x64\v120\dynamic\Lib;$(LibraryPath)</LibraryPath>
        <IncludePath>$(GflagsPath)\include;$(IncludePath)</IncludePath>
</PropertyGroup>

<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
      
      <LibraryPath>$(GlogPath)\lib\x64\v120\Release\dynamic;$(LibraryPath)</LibraryPath>
      <IncludePath>$(GlogPath)\include;$(IncludePath)</IncludePath>
      
      <CudaDependencies>
        opencv_objdetect2410.lib;
        opencv_ts2410.lib;
        opencv_video2410.lib;
        opencv_nonfree2410.lib;
        opencv_ocl2410.lib;
        opencv_photo2410.lib;
        opencv_stitching2410.lib;
        opencv_superres2410.lib;
        opencv_videostab2410.lib;
        opencv_calib3d2410.lib;
        opencv_contrib2410.lib;
        opencv_core2410.lib;
        opencv_features2d2410.lib;
        opencv_flann2410.lib;
        opencv_gpu2410.lib;
        opencv_highgui2410.lib;
        opencv_imgproc2410.lib;
        opencv_legacy2410.lib;
        opencv_ml2410.lib;
        libglog.lib;
        gflags.lib;
        gflags_nothreads.lib;
        $(CudaDependencies)
        </CudaDependencies>
</PropertyGroup>

<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
      <LibraryPath>$(GlogPath)\lib\x64\v120\Debug\dynamic;$(LibraryPath)</LibraryPath>
      <IncludePath>$(GlogPath)\include;$(IncludePath)</IncludePath>
      <CudaDependencies>
        opencv_ml2410d.lib;
        opencv_calib3d2410d.lib;
        opencv_contrib2410d.lib;
        opencv_core2410d.lib;
        opencv_features2d2410d.lib;
        opencv_flann2410d.lib;
        opencv_gpu2410d.lib;
        opencv_highgui2410d.lib;
        opencv_imgproc2410d.lib;
        opencv_legacy2410d.lib;
        opencv_objdetect2410d.lib;
        opencv_ts2410d.lib;
        opencv_video2410d.lib;
        opencv_nonfree2410d.lib;
        opencv_ocl2410d.lib;
        opencv_photo2410d.lib;
        opencv_stitching2410d.lib;
        opencv_superres2410d.lib;
        opencv_videostab2410d.lib;
        libglog.lib;
        gflagsd.lib;
        gflags_nothreadsd.lib;
        $(CudaDependencies)
      </CudaDependencies>
 </PropertyGroup>

Then add $(OpencvDependencies) to the project's Additional Dependencies. After that, the build mostly goes through without problems.
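The manual setup starts from the versions recorded in packages.config. The sketch below lists them with Python's standard XML parser. It assumes the usual NuGet layout of `<package id="..." version="..." />` entries; the sample text is a hypothetical excerpt that reuses only the glog and gflags versions visible in the paths above, and `list_packages` is an illustrative helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt of a packages.config file (not the real Caffe file).
SAMPLE_PACKAGES_CONFIG = """<packages>
  <package id="glog" version="0.3.3.0" targetFramework="native" />
  <package id="gflags" version="2.1.2.1" targetFramework="native" />
</packages>"""


def list_packages(xml_text):
    """Return (id, version) pairs for every <package> element."""
    root = ET.fromstring(xml_text)
    return [(pkg.get("id"), pkg.get("version")) for pkg in root.iter("package")]


for name, version in list_packages(SAMPLE_PACKAGES_CONFIG):
    # These are the versions to download by hand.
    print(name, version)
```

To run it against the real file, read its contents and pass the text to `list_packages`.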

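  • Data type mismatch: I converted the data to LMDB but still got an error. When everything else checks out and the error persists, the path is the likely culprit. '/' and '\' are not interchangeable, so it is best to use '/' in all paths.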
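A minimal sketch of that advice, assuming the paths are assembled in Python before being written into a prototxt file or passed to a tool. The LMDB location is hypothetical and used only for illustration.

```python
def to_forward_slashes(path: str) -> str:
    """Replace every backslash in a Windows path with a forward slash."""
    return path.replace("\\", "/")


# Hypothetical LMDB location; substitute the real one.
lmdb_path = to_forward_slashes(r"F:\caffe\examples\mnist\mnist_train_lmdb")
print(lmdb_path)  # F:/caffe/examples/mnist/mnist_train_lmdb
```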
 
  • With the data layer set to:
              name: "LeNet"
              layer {
                name: "data"
                type: "Input"
                top: "data"
                input_param { shape: { dim: 1 dim: 1 dim: 28 dim: 28 } }
              }

       Error: Check failed: labels_.size() == output_layer->channels() (1 vs. 10) Number of labels  is different from the output layer dimension.

       Cause: the class label file was wrong. label.txt contained only 0, but it should list all ten labels: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.
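The check compares the number of labels with the size of the network output, so the label file needs one entry per class. The sketch below assumes one label per line; `make_label_text` and `count_labels` are illustrative helper names, not part of Caffe.

```python
NUM_OUTPUTS = 10  # LeNet on MNIST predicts the ten digits 0..9


def make_label_text(num_classes: int) -> str:
    """Build label-file content with one label per line: 0, 1, ..."""
    return "".join(f"{i}\n" for i in range(num_classes))


def count_labels(text: str) -> int:
    """Count the lines, i.e. the labels, under the one-per-line assumption."""
    return len(text.splitlines())


broken = "0\n"                       # what label.txt contained originally
fixed = make_label_text(NUM_OUTPUTS)

print(count_labels(broken), count_labels(fixed))  # 1 10
```

Writing `fixed` to label.txt gives a file whose label count matches the ten-way output.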


  • With input_param { shape: { dim: 1 dim: 3 dim: 28 dim: 28 } } the following error appears:
    Cannot copy param 0 weights from layer 'conv1'; shape mismatch.  Source param shape is 20 1 5 5 (500); target param shape is 20 3 5 5 (1500). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.

     I have not yet found where the number of channels is set when training the model; I will come back and add that later.
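The numbers in the message can be reproduced by hand. The sketch below assumes a convolution weight blob laid out as (num_output, input channels, kernel height, kernel width) with no grouping; the helper names are illustrative. The saved weights have one input channel, which suggests the model was trained on single-channel images, so a 3-channel input cannot reuse them.

```python
def conv_weight_shape(num_output, in_channels, kernel_size):
    """Shape of a square-kernel convolution weight blob (assumes group = 1)."""
    return (num_output, in_channels, kernel_size, kernel_size)


def num_elements(shape):
    """Product of all dimensions in the shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


saved = conv_weight_shape(20, 1, 5)   # weights stored in the trained model
target = conv_weight_shape(20, 3, 5)  # what a 3-channel input would require

print(saved, num_elements(saved))    # (20, 1, 5, 5) 500
print(target, num_elements(target))  # (20, 3, 5, 5) 1500
```

Under these assumptions, feeding a single-channel input (dim: 1 dim: 1 dim: 28 dim: 28) keeps the target shape equal to the saved one.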


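  • Inaccurate test results on the MNIST dataset

     At first I suspected the mean file, because LeNet was trained without a mean-subtraction step, but that alone should not cause such a high error rate. I then learned from a Zhihu question, https://www.zhihu.com/question/52047327, that the training set is white digits on a black background! After converting the test samples to white-on-black, the results were OK.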
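A minimal sketch of the conversion, assuming the test sample is an 8-bit grayscale image held as rows of integers in 0..255, drawn as dark digits on a light background. With an image library the same inversion would be applied to the pixel array.

```python
def invert_gray(image):
    """Invert an 8-bit grayscale image: black becomes white and vice versa."""
    return [[255 - pixel for pixel in row] for row in image]


# Tiny hypothetical sample: 255 = white background, 0 = black stroke.
sample = [[255, 255, 0],
          [255, 0, 255]]

print(invert_gray(sample))  # [[0, 0, 255], [0, 255, 0]]
```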


