Pitfalls of building and installing MXNet from source on the TX2

Original

TurtleMeow

2020-06-24 06:22

Installing MXNet with the cpp package on a TX2 is not that easy.

I am recording my experience here.

Steps:

Follow the official documentation, Install MXNet on a Jetson, with a few differences described below.

First, clone mxnet from GitHub and cd into mxnet:

git clone --recursive https://github.com/apache/incubator-mxnet.git mxnet
cd mxnet
git submodule init
git submodule update
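
Since `git clone --recursive` already fetches submodules, the two `git submodule` commands mainly act as a safety net. A small sketch for checking that nothing was left uninitialized follows; it relies on `git submodule status` prefixing uninitialized entries with `-`, and `count_uninitialized` is a hypothetical helper name, not part of git or MXNet.

```shell
# count_uninitialized: read `git submodule status` output on stdin and count
# the lines starting with "-", the marker git uses for "not initialized".
count_uninitialized() {
  # grep -c exits non-zero when the count is 0, so mask that status.
  grep -c '^-' || true
}

# Only query git when the current directory is inside a work tree.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
  echo "uninitialized submodules: $(git submodule status --recursive | count_uninitialized)"
fi
```

Run inside the mxnet directory, a count of 0 suggests every submodule under 3rdparty was checked out.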

Configure CUDA:

nvcc --version

On my TX2 this reports CUDA 9.0, so point /usr/local/cuda at that version:

sudo rm /usr/local/cuda
sudo ln -s /usr/local/cuda-9.0 /usr/local/cuda

My cuDNN version is v7.1.3.
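
To double-check that the symlink and the toolkit agree, something like the following sketch may help. It assumes nvcc is either on PATH or at /usr/local/cuda/bin/nvcc, and `cuda_release` is a hypothetical helper name.

```shell
# cuda_release: read `nvcc --version` output on stdin and print the
# "release X.Y" number, e.g. 9.0.
cuda_release() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Prefer nvcc from PATH; fall back to the symlinked toolkit location.
nvcc_bin=$(command -v nvcc || echo /usr/local/cuda/bin/nvcc)
if [ -x "$nvcc_bin" ]; then
  echo "nvcc reports CUDA $("$nvcc_bin" --version | cuda_release)"
  echo "/usr/local/cuda resolves to $(readlink -f /usr/local/cuda)"
fi
```

If the two lines disagree, the symlink probably still points at a different toolkit directory.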

Copy the Jetson template to config.mk:

  cp make/crosscompile.jetson.mk config.mk

Edit config.mk and modify these settings:
1. USE_CUDA_PATH = /usr/local/cuda
2. USE_OPENCV = 1
3. USE_JEMALLOC = 0 (differs from the official guide, but VERY IMPORTANT)
4. USE_GPERFTOOLS = 0 (differs from the official guide, but VERY IMPORTANT)
5. USE_CPP_PACKAGE = 1 (enables the cpp package)
6. Update the NVCC settings: NVCCFLAGS := -m64
Settings 3 and 4 matter: without them, once the build finishes and you call the MXNet API, you might get an error like:
```
src/tcmalloc.cc:284] Attempt to free invalid pointer
```
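
The six settings can also be applied with a script instead of by hand. This is a minimal sketch assuming GNU sed and a config.mk in the current directory; `set_mk_var` is a hypothetical helper, not something MXNet ships.

```shell
# set_mk_var FILE NAME VALUE: rewrite an existing "NAME = ..." or "NAME := ..."
# line in FILE, keeping its assignment operator; append "NAME = VALUE" if absent.
set_mk_var() {
  local file=$1 name=$2 value=$3
  if grep -qE "^[[:space:]]*${name}[[:space:]]*:?=" "$file"; then
    # "|" is the sed delimiter because values such as /usr/local/cuda contain "/".
    sed -i -E "s|^[[:space:]]*${name}[[:space:]]*(:?=).*|${name} \1 ${value}|" "$file"
  else
    printf '%s = %s\n' "$name" "$value" >> "$file"
  fi
}

# Apply the settings from the list above, only if config.mk is present.
if [ -f config.mk ]; then
  set_mk_var config.mk USE_CUDA_PATH /usr/local/cuda
  set_mk_var config.mk USE_OPENCV 1
  set_mk_var config.mk USE_JEMALLOC 0
  set_mk_var config.mk USE_GPERFTOOLS 0
  set_mk_var config.mk USE_CPP_PACKAGE 1
  set_mk_var config.mk NVCCFLAGS -m64
fi
```

Commented-out assignments are not touched, so it is worth reading config.mk afterwards to confirm the values took effect.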
In 3rdparty/mshadow/make/mshadow.mk, change this setting as follows:
```
MSHADOW_CFLAGS += -DMSHADOW_USE_PASCAL=1
```

Something else:

In the Makefile, limit the CUDA arch list for the TX2, which is important. The changed line, followed by the block it sits in:
KNOWN_CUDA_ARCHS := 62 # limit arch for tx2 here 62 or 53

ifeq ($(USE_CUDA), 1)
ifeq ($(CUDA_ARCH),)
	# KNOWN_CUDA_ARCHS := 30 35 50 52 60 61 70 75 
	KNOWN_CUDA_ARCHS := 62    # limit arch for tx2 here
	# Run nvcc on a zero-length file to check architecture-level support.
	# Create args to include SASS in the fat binary for supported levels.
	CUDA_ARCH := $(foreach arch,$(KNOWN_CUDA_ARCHS), \
				$(shell $(NVCC) -arch=sm_$(arch) -E --x cu /dev/null >/dev/null 2>&1 && \
						echo -gencode arch=compute_$(arch),code=sm_$(arch)))
	# Convert a trailing "code=sm_NN" to "code=[sm_NN,compute_NN]" to also
	# include the PTX of the most recent arch in the fat-binaries for
	# forward compatibility with newer GPUs.
	CUDA_ARCH := $(shell echo $(CUDA_ARCH) | sed 's/sm_\([0-9]*\)$$/[sm_\1,compute_\1]/')
	# Add fat binary compression if supported by nvcc.
	COMPRESS := --fatbin-options -compress-all
	CUDA_ARCH += $(shell $(NVCC) -cuda $(COMPRESS) --x cu /dev/null -o /dev/null >/dev/null 2>&1 && \
						 echo $(COMPRESS))
endif
$(info Running CUDA_ARCH: $(CUDA_ARCH))
endif
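
To see what that block yields once KNOWN_CUDA_ARCHS is limited, here is a rough shell sketch of the same string construction. It mirrors only the foreach and the sed rewrite from the Makefile excerpt; it skips the nvcc probe and the fatbin compression check, and `gencode_flags` is a hypothetical helper name.

```shell
# gencode_flags ARCH...: build the -gencode list for the given arch numbers,
# then rewrite the trailing "code=sm_NN" to "code=[sm_NN,compute_NN]" so the
# last arch also carries PTX, as the Makefile's sed call does.
gencode_flags() {
  local flags="" arch
  for arch in "$@"; do
    flags="${flags:+$flags }-gencode arch=compute_${arch},code=sm_${arch}"
  done
  printf '%s\n' "$flags" | sed 's/sm_\([0-9]*\)$/[sm_\1,compute_\1]/'
}

# With the arch list limited to 62:
gencode_flags 62
```

Judging only from the `ifeq ($(CUDA_ARCH),)` guard in the excerpt, a CUDA_ARCH value that is already set would skip this detection, so passing the printed string as CUDA_ARCH might be an alternative to editing the Makefile. That is an inference from the excerpt, not something the steps above tested.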

Without this limit, the build targets every known arch and may be killed partway through, with output like:

INFO: nvcc was not found on your path
INFO: Using /usr/local/cuda-9.0/bin/nvcc as nvcc path
Running CUDA_ARCH: -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=[sm_70,compute_70] --fatbin-options -compress-all

...

DMXNET_USE_LIBJPEG_TURBO=0" src/operator/tensor/broadcast_reduce_op_value.cu
Killed
Makefile:471: recipe for target 'build/src/operator/tensor/ordering_op_gpu.o' failed
make: *** [build/src/operator/tensor/ordering_op_gpu.o] Error 137
make: *** Waiting for unfinished jobs....
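
The `Error 137` in that log can be decoded: by shell convention, an exit status above 128 means the process died from signal (status - 128), and 137 - 128 = 9 is SIGKILL. A small sketch, with `describe_status` as a hypothetical helper name:

```shell
# describe_status STATUS: interpret a shell exit status. Values above 128
# conventionally mean "terminated by signal (STATUS - 128)".
describe_status() {
  local status=$1
  if [ "$status" -gt 128 ]; then
    echo "killed by signal $((status - 128))"
  else
    echo "exited with code $status"
  fi
}

describe_status 137
```

A SIGKILL during compilation on a memory-constrained board is typically the kernel's out-of-memory killer, and `dmesg` usually shows whether that was the case. This is consistent with the fix above, since building for fewer archs gives nvcc less work per file.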

After the build finishes, using the cpp API may produce an error like this:

  terminate called after throwing an instance of 'dmlc::Error'
      what(): [01:20:54] /usr/include/mxnet-cpp/ndarray.hpp:236: Check failed: MXNDArrayWaitToRead(blob_ptr_->handle_) == 0 (-1 vs. 0)

This happens in GPU mode and is caused by the TX2 running out of memory; switching to a smaller input solves it.
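
On the TX2 the CPU and GPU share the same physical RAM, so the MemAvailable figure in /proc/meminfo is a rough proxy for how much room the GPU has left. A sketch for checking it before launching a program, with `mem_available_mb` as a hypothetical helper name:

```shell
# mem_available_mb: read /proc/meminfo-formatted text on stdin and print
# MemAvailable converted from kB to whole MiB.
mem_available_mb() {
  awk '/^MemAvailable:/ { print int($2 / 1024) }'
}

# Report the current value when /proc/meminfo is readable.
if [ -r /proc/meminfo ]; then
  echo "MemAvailable: $(mem_available_mb < /proc/meminfo) MiB"
fi
```

How much memory a given input needs depends on the model, so treat the number as a hint for when to shrink the input rather than a hard threshold.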
