KenLM Installation: A Record of Pitfalls and Fixes

Background

To count word frequencies quickly and efficiently, I chose KenLM. For details on KenLM itself, see the source code: https://github.com/kpu/kenlm

Installation

The author provides an installation guide: https://kheafield.com/code/kenlm/ . Provided that every other dependency is already in place, installation really is just:

wget -O - https://kheafield.com/code/kenlm.tar.gz |tar xz
mkdir kenlm/build
cd kenlm/build
cmake ..
make -j4

PS: the installation described here was done on CentOS 7 with gcc 5.2.
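
Before building, it can help to confirm what toolchain is actually installed. The following is a minimal sketch, not part of the official guide: version_ge is a hypothetical helper name, and it relies on GNU sort -V for version comparison. The 1.41.0 threshold in the comment is the minimum Boost version reported by the cmake log later in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if version $1 is greater than or equal to version $2.
# Relies on GNU "sort -V" (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Report the first line of --version for each build tool, or note that it is missing.
for tool in gcc g++ cmake make; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "$tool: $("$tool" --version | head -n 1)"
    else
        echo "$tool: not found"
    fi
done

# Example comparison: is Boost 1.53.0 at least the 1.41.0 that cmake asks for?
if version_ge 1.53.0 1.41.0; then
    echo "Boost 1.53.0 satisfies the 1.41.0 minimum"
fi
```

A Boost version that passes this check can still fail at the cmake step if individual components such as thread are missing, which is what happens below.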

Boost

When the installed Boost is too old, the cmake step will most likely fail.
The fix is to install or update the Boost packages:

yum install -y boost boost-devel boost-doc

Running cmake again then fails with:

CMake Error at /usr/local/share/cmake-3.15/Modules/FindPackageHandleStandardArgs.cmake:137 (message):
  Could NOT find Boost (missing: thread) (found suitable version "1.55.0",
  minimum required is "1.41.0")
Call Stack (most recent call first):
  /usr/local/share/cmake-3.15/Modules/FindPackageHandleStandardArgs.cmake:378 (_FPHSA_FAILURE_MESSAGE)
  /usr/local/share/cmake-3.15/Modules/FindBoost.cmake:2142 (find_package_handle_standard_args)
  CMakeLists.txt:66 (find_package)
CMake Warning (dev) in /usr/local/share/cmake-3.15/Modules/FindBoost.cmake:
  Policy CMP0011 is not set: Included scripts do automatic cmake_policy PUSH
  and POP.  Run "cmake --help-policy CMP0011" for policy details.  Use the
  cmake_policy command to set the policy and suppress this warning.
  The included script
    /usr/local/share/cmake-3.15/Modules/FindBoost.cmake
  affects policy settings.  CMake is implying the NO_POLICY_SCOPE option for
  compatibility, so the effects are applied to the including context.
Call Stack (most recent call first):
  CMakeLists.txt:66 (find_package)
This warning is for project developers.  Use -Wno-dev to suppress it.
-- Configuring incomplete, errors occurred!
See also "/home/data1/devtools/kenlm/build/CMakeFiles/CMakeOutput.log".
See also "/home/data1/devtools/kenlm/build/CMakeFiles/CMakeError.log".

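The error shows that CMake could not locate the installed Boost libraries (the thread component is reported missing). So where did Boost get installed?
First, list the Boost-related packages that are installed: rpm -qa | grep boost
Then check where a specific package put its files, for example boost-thread-1.53.0-27.el7.x86_64: rpm -ql boost-thread-1.53.0-27.el7.x86_64
Doing the same for boost-devel-1.53.0-27.el7.x86_64 reveals its include and lib directories.
In summary, the Boost include and lib directories are: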

/usr/include/boost/
/usr/lib64/
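
The manual lookup above can be scripted. This is a rough sketch under a few assumptions: boost_lib_dirs is a hypothetical helper name, the packages were installed through rpm/yum as in this post, and the shared libraries follow the usual libboost_*.so* naming.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a list of file paths on stdin and print the unique
# directories that contain libboost_*.so* files.
boost_lib_dirs() {
    sed -n 's|/libboost_[^/]*\.so[^/]*$||p' | sort -u
}

# On an rpm-based system, feed it the file lists of all installed Boost packages.
if command -v rpm >/dev/null 2>&1; then
    rpm -qa | grep boost | while read -r pkg; do
        rpm -ql "$pkg"
    done | boost_lib_dirs
fi
```

On the machine described in this post, this would be expected to print the same lib directory found by hand.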

Add these two directories to CMakeLists.txt, before the find_package call for Boost (line 66 in the log above):

SET(BOOST_INCLUDEDIR "/usr/include/boost/")
SET(BOOST_LIBRARYDIR "/usr/lib64/")
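
If you would rather not edit CMakeLists.txt, the same hints can be passed on the cmake command line, since BOOST_INCLUDEDIR and BOOST_LIBRARYDIR are hint variables read by CMake's FindBoost module. The sketch below is an alternative to the edit above, not what this post actually did: configure_kenlm and DRY_RUN are hypothetical names, and it passes /usr/include (the directory that contains boost/) as the include hint, which is the form the FindBoost documentation describes.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run cmake on source dir $1 with Boost include dir $2
# and Boost library dir $3 as hints. With DRY_RUN=1 it only prints the command.
configure_kenlm() {
    local src=$1 inc=$2 lib=$3
    set -- cmake "$src" "-DBOOST_INCLUDEDIR=$inc" "-DBOOST_LIBRARYDIR=$lib"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Show the command that would be run from inside kenlm/build.
DRY_RUN=1 configure_kenlm .. /usr/include /usr/lib64
```

To actually configure, run configure_kenlm .. /usr/include /usr/lib64 from kenlm/build without DRY_RUN set.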


Specifying the compiler

Building again fails with:

CMakeFiles/tokenize_piece_test.dir/tokenize_piece_test.cc.o: In function `boost::unit_test::make_test_case(boost::unit_test::callback0<boost::unit_test::ut_detail::unused> const&, boost::unit_test::basic_cstring<char const>)':
tokenize_piece_test.cc:(.text._ZN5boost9unit_test14make_test_caseERKNS0_9callback0INS0_9ut_detail6unusedEEENS0_13basic_cstringIKcEE[_ZN5boost9unit_test14make_test_caseERKNS0_9callback0INS0_9ut_detail6unusedEEENS0_13basic_cstringIKcEE]+0x11): undefined reference to `boost::unit_test::ut_detail::normalize_test_case_name[abi:cxx11](boost::unit_test::basic_cstring<char const>)'
collect2: error: ld returned 1 exit status
make[2]: *** [tests/tokenize_piece_test] Error 1
make[1]: *** [util/CMakeFiles/tokenize_piece_test.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[ 34%] Linking CXX static library ../../lib/libkenlm_interpolate.a
[ 34%] Built target kenlm_interpolate
[ 35%] Linking CXX executable ../tests/string_stream_test
CMakeFiles/string_stream_test.dir/string_stream_test.cc.o: In function `boost::unit_test::make_test_case(boost::unit_test::callback0<boost::unit_test::ut_detail::unused> const&, boost::unit_test::basic_cstring<char const>)':
string_stream_test.cc:(.text._ZN5boost9unit_test14make_test_caseERKNS0_9callback0INS0_9ut_detail6unusedEEENS0_13basic_cstringIKcEE[_ZN5boost9unit_test14make_test_caseERKNS0_9callback0INS0_9ut_detail6unusedEEENS0_13basic_cstringIKcEE]+0x11): undefined reference to `boost::unit_test::ut_detail::normalize_test_case_name[abi:cxx11](boost::unit_test::basic_cstring<char const>)'
collect2: error: ld returned 1 exit status
make[2]: *** [tests/string_stream_test] Error 1
make[1]: *** [util/CMakeFiles/string_stream_test.dir/all] Error 2
[ 36%] Linking CXX executable ../tests/sorted_uniform_test
CMakeFiles/sorted_uniform_test.dir/sorted_uniform_test.cc.o: In function `boost::unit_test::make_test_case(boost::unit_test::callback0<boost::unit_test::ut_detail::unused> const&, boost::unit_test::basic_cstring<char const>)':
sorted_uniform_test.cc:(.text._ZN5boost9unit_test14make_test_caseERKNS0_9callback0INS0_9ut_detail6unusedEEENS0_13basic_cstringIKcEE[_ZN5boost9unit_test14make_test_caseERKNS0_9callback0INS0_9ut_detail6unusedEEENS0_13basic_cstringIKcEE]+0x11): undefined reference to `boost::unit_test::ut_detail::normalize_test_case_name[abi:cxx11](boost::unit_test::basic_cstring<char const>)'
collect2: error: ld returned 1 exit status
make[2]: *** [tests/sorted_uniform_test] Error 1
make[1]: *** [util/CMakeFiles/sorted_uniform_test.dir/all] Error 2
make: *** [all] Error 2

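Solution:
Adjust the C++ compiler flags. The [abi:cxx11] tag in the undefined references shows that the tests were compiled against the new libstdc++ ABI introduced with gcc 5, while the Boost libraries being linked evidently do not export symbols with that tag, so the link step fails. Defining _GLIBCXX_USE_CXX11_ABI=0 makes gcc use the old ABI instead. Add the following line near the top of CMakeLists.txt: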
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -D_GLIBCXX_USE_CXX11_ABI=0")
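
To check which ABI the Boost test library actually exports before changing any flags, one option is to inspect its symbols with nm. This is only a diagnostic sketch: abi_of_symbol is a hypothetical helper name, and the library path is an assumption based on the lib directory found earlier, so adjust it to your system.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read symbol names on stdin and report whether the
# function named in $1 appears with the cxx11 ABI tag, without it, or not at all.
abi_of_symbol() {
    local matches
    matches=$(grep -F "$1")
    if [ -z "$matches" ]; then
        echo "not-found"
    elif printf '%s\n' "$matches" | grep -q -e 'abi:cxx11' -e 'B5cxx11'; then
        echo "cxx11"
    else
        echo "old"
    fi
}

# Assumed library location; boost-devel normally provides this symlink on CentOS 7.
lib=/usr/lib64/libboost_unit_test_framework.so
if [ -e "$lib" ] && command -v nm >/dev/null 2>&1; then
    nm -D -C "$lib" | abi_of_symbol normalize_test_case_name
fi
```

A result of "old" would be consistent with the link errors above and with the -D_GLIBCXX_USE_CXX11_ABI=0 fix.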

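Once make succeeds, you can add the bin directory to your PATH.
Add the export line below, with kenlm's bin directory appended, to ~/.bashrc, then run source to reload it: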

export PATH=$PATH:/usr/local/cuda-9.0/bin:/home/data1/devtools/kenlm/build/bin
source ~/.bashrc 

Alternatively, simply copy the compiled binaries you need into the directory where they will be used and run them from there.
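
Either way, a quick check that the main executables are present can save confusion later. The sketch below assumes the three tools most commonly used from a KenLM build (lmplz, build_binary and query); missing_kenlm_tools is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of each expected KenLM tool that is
# missing or not executable in directory $1. Prints nothing if all are present.
missing_kenlm_tools() {
    local dir=$1 tool
    for tool in lmplz build_binary query; do
        [ -x "$dir/$tool" ] || echo "$tool"
    done
}
```

For the build in this post, the call would be missing_kenlm_tools /home/data1/devtools/kenlm/build/bin; empty output means all three tools were built.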
