Running a job that uses Breeze on Spark 1.2 in standalone mode produces the following warnings:
15/07/28 02:49:32 WARN netlib.BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
15/07/28 02:49:32 WARN netlib.BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
The official Spark 1.2 documentation says:
MLlib uses the linear algebra package Breeze, which depends on netlib-java, and jblas. netlib-java and jblas depend on native Fortran routines. You need to install the gfortran runtime library if it is not already present on your nodes. MLlib will throw a linking error if it cannot detect these libraries automatically. Due to license issues, we do not include netlib-java’s native libraries in MLlib’s dependency set under default settings. If no native library is available at runtime, you will see a warning message. To use native libraries from netlib-java, please build Spark with -Pnetlib-lgpl or include com.github.fommil.netlib:all:1.1.2 as a dependency of your project. If you want to use optimized BLAS/LAPACK libraries such as OpenBLAS, please link its shared libraries to /usr/lib/libblas.so.3 and /usr/lib/liblapack.so.3, respectively. BLAS/LAPACK libraries on worker nodes should be built without multithreading.
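In short: the official Spark 1.2 release is built without the com.github.fommil.netlib dependency. To use the native implementations you have to add that dependency yourself and rebuild (either Spark itself with -Pnetlib-lgpl, or your own project with the dependency included).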
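For a project built with sbt, including the dependency named in the quoted documentation might look like the fragment below. This is only a sketch: it assumes an sbt build (the original note does not say which build tool is used), and the coordinates are taken verbatim from the documentation quoted above. The artifact is published as a POM-only aggregate, which is why the `pomOnly()` qualifier is used.

```scala
// build.sbt (sketch, assuming an sbt-based project)
// Coordinates come from the Spark 1.2 documentation quoted above:
//   com.github.fommil.netlib:all:1.1.2
// The "all" artifact is a POM that pulls in the native wrappers,
// so it is declared with pomOnly() rather than as a regular jar.
libraryDependencies += "com.github.fommil.netlib" % "all" % "1.1.2" pomOnly()
```

Even with this dependency on the classpath, the native wrappers still need a system BLAS/LAPACK and the gfortran runtime to be present on each node, as the documentation notes.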
In Spark 1.3 and later, however, this has been addressed: you no longer have to go through the trouble of adding the dependency and recompiling.
The official Spark 1.3 documentation says:
MLlib uses the linear algebra package Breeze, which depends on netlib-java for optimised numerical processing. If natives are not available at runtime, you will see a warning message and a pure JVM implementation will be used instead.
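In 1.3 you can still configure the dependency and build with it yourself. Where no native implementation is available, Spark 1.3 falls back to a pure JVM (Java) implementation in place of the native libraries.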
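To see which implementation was actually picked up at runtime, one option is to ask netlib-java directly. The sketch below is not from the original note: `BlasCheck`, `loadedBlas`, and `describe` are hypothetical names chosen for illustration. It calls `com.github.fommil.netlib.BLAS.getInstance()` through reflection so that the snippet still compiles and runs when netlib-java is not on the classpath.

```scala
object BlasCheck {
  /** Class name of the BLAS backend selected by netlib-java, or None if
    * netlib-java is not on the classpath. Reflection is used only so this
    * sketch compiles without the dependency. */
  def loadedBlas(): Option[String] =
    try {
      val blas = Class.forName("com.github.fommil.netlib.BLAS")
      // getInstance() is a static method, so the receiver is null.
      val instance = blas.getMethod("getInstance").invoke(null)
      Some(instance.getClass.getName)
    } catch {
      case _: ClassNotFoundException => None
    }

  /** Maps a backend class name to a short human-readable description. */
  def describe(backend: Option[String]): String = backend match {
    case Some(n) if n.endsWith("NativeSystemBLAS") => "native system BLAS"
    case Some(n) if n.endsWith("NativeRefBLAS")    => "native reference BLAS"
    case Some(n) if n.endsWith("F2jBLAS")          => "pure JVM fallback (F2J)"
    case Some(n)                                   => s"other backend: $n"
    case None                                      => "netlib-java not on the classpath"
  }

  def main(args: Array[String]): Unit =
    println(describe(loadedBlas()))
}
```

In a spark-shell session, pasting the object and evaluating `BlasCheck.describe(BlasCheck.loadedBlas())` should report the pure JVM fallback whenever the two "Failed to load implementation" warnings shown at the top have been logged.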