References:
http://blog.csdn.net/freewebsys/article/details/47259413
https://www.zhihu.com/question/23003213
http://www.swig.org/papers/PyTutorial98/PyTutorial98.pdf (more detailed SWIG material)
https://stackoverflow.com/questions/27149849/python-importerror-undefined-symbol-g-utf8-skip
https://stackoverflow.com/questions/9098980/swig-error-undefined-symbol
http://blog.csdn.net/stpeace/article/details/51416297
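1. Motivation
While building a voiceprint recognition service, the script had to load the classifier on every recognition request. Loading is very slow, so a single recognition took about 7 s. I considered two options:
Option 1: use a daemon process to preload the classifier and keep everything else as scripts, since loading is the main cost. The web interface would call the daemon to run recognition and return the result. The advantage is that seemingly little needs to change, but the many intermediate files it generates make it messy and not well suited to production.
Option 2: reassemble the recognition pipeline and wrap it as an interface callable from Python, then rely on an existing web service to handle concurrent clients. The drawback is that rewriting the recognition pipeline is a lot of work, time-consuming, and full of details, but it should solve the problem once and for all, so I settled on this option for now.
What follows records my tests of exposing a C++ interface to Python through SWIG. There are two main questions: how to use it, and whether the classifier stays loaded afterwards so that it does not have to be reloaded every time.
...This took nearly two weeks: one week to reassemble the recognition code and one week to extend it into a Python interface. It runs, but there is still plenty to optimize. It felt very hard before I knew how, and very simple once I did...
2. Environment
Linux, CentOS 7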
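3. Installing SWIG
Official site: http://www.swig.org/download.html
Latest version at the time of writing: 3.0.12
yum install swig
This installs swig-2.0.10-5.el7.x86_64.rpm.
To install the latest version instead, first remove the packaged one: yum remove swig
After downloading the source package:
tar -zxvf swig-3.0.12.tar.gz
cd swig-3.0.12
./configure
make -j 8
make install
That completes the installation.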
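Since the yum package (2.0.10) and the source build (3.0.12) can both end up on a machine, it helps to confirm which SWIG is actually found on the PATH. The following is a minimal sketch, not part of the original setup: the helper names `parse_swig_version` and `installed_swig_version` are made up for illustration, and it assumes the `swig -version` banner contains a line of the form `SWIG Version X.Y.Z`.

```python
import re
import shutil
import subprocess


def parse_swig_version(banner):
    """Extract (major, minor, patch) from `swig -version` output, or None."""
    match = re.search(r"SWIG Version (\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def installed_swig_version():
    """Return the version of the first `swig` on PATH, or None if absent."""
    exe = shutil.which("swig")
    if exe is None:
        return None
    result = subprocess.run([exe, "-version"], stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    return parse_swig_version(result.stdout)


if __name__ == "__main__":
    print(installed_swig_version())
```

If this prints a 2.0.x tuple after a source install, the old package is probably still earlier on the PATH.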
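4. Writing the C test source
4.1 Writing the wrapper by hand, without SWIG
#include <Python.h>

/* The plain C function to expose. */
int great_function(int a) {
    return a + 1;
}

/* Wrapper: unpack the Python arguments, call the C function, box the result. */
static PyObject * _great_function(PyObject *self, PyObject *args)
{
    int _a;
    int res;
    if (!PyArg_ParseTuple(args, "i", &_a))
        return NULL;
    res = great_function(_a);
    return PyLong_FromLong(res);
}

/* Method table: Python-visible name, wrapper, calling convention, docstring. */
static PyMethodDef GreateModuleMethods[] = {
    {
        "great_function",
        _great_function,
        METH_VARARGS,
        ""
    },
    {NULL, NULL, 0, NULL}  /* sentinel */
};

/* Python 2 module initialisation (Python 3 uses PyInit_<name> and PyModule_Create). */
PyMODINIT_FUNC initgreat_module(void) {
    (void) Py_InitModule("great_module", GreateModuleMethods);
}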
Compile directly:
gcc -fPIC -shared great_module.c -o great_module.so -I/root/anaconda2/include/python2.7/ -L/root/anaconda2/lib -lpython2.7
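The include path in the command above is hard-coded to one Anaconda install. As a hedged alternative, the interpreter itself can report where its Python.h lives through the standard `sysconfig` module. The sketch below only builds the command line; `build_command` is an illustrative name, and it should be run with the same interpreter the module targets (Python 2.7 for the hand-written wrapper above).

```python
import sysconfig


def build_command(source="great_module.c", output="great_module.so"):
    """Return a gcc command that compiles against the running interpreter."""
    include_dir = sysconfig.get_paths()["include"]  # directory holding Python.h
    return ["gcc", "-fPIC", "-shared", source, "-o", output, "-I" + include_dir]


if __name__ == "__main__":
    print(" ".join(build_command()))
```

The printed line can be pasted into the shell, which avoids guessing which of several installed Pythons owns a given include directory.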
4.2 Using SWIG
Reference: http://www.swig.org/translations/chinese/tutorial.html
swig_example.c
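#include <time.h>

double My_variable = 3.0;

/* Factorial of n (returns 1 for n <= 1). */
int fact(int n) {
    if (n <= 1) return 1;
    else return n*fact(n-1);
}

/* Remainder of x divided by y. */
int my_mod(int x, int y) {
    return (x%y);
}

/* Current local time as a string. */
char *get_time()
{
    time_t ltime;
    time(&ltime);
    return ctime(&ltime);
}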
swig_example.i (interface file)
/* example.i */
%module example   /* module name */
%{
/* Put header files here or function declarations like below */
extern double My_variable;
extern int fact(int n);
extern int my_mod(int x, int y);
extern char *get_time();
%}
/* declarations to wrap */
extern double My_variable;
extern int fact(int n);
extern int my_mod(int x, int y);
extern char *get_time();
Generate the Python wrapper and build the shared library:
swig -python swig_example.i
gcc -c -fPIC swig_example.c swig_example_wrap.c -I/root/anaconda2/include/python2.7/
ld -shared swig_example.o swig_example_wrap.o -o _example.so
Test: open a terminal in the directory containing the generated files.
[root@hadoop-0 cpython]# python
Python 2.7.5 (default, Nov 20 2015, 02:00:19)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import example
>>> example.fact(5)
120
>>> example.my_mod(10,30)
10
>>> example.get_time()
'Fri Jul 28 11:43:30 2017\n'
>>> example.cvar.My_variable
3.0
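Global variables have to be accessed this way; cvar is the default object that holds the C globals.
To change the name cvar, replace the command swig -python swig_example.i with swig -python -globals myvar swig_example.i
So far, extending the voiceprint recognition pipeline into a Python interface looks feasible.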
5. Writing the C++ test source
//---example.h
#include <iostream>
using namespace std;
class Example {
public:
    void say_hello();
};
//---example.cpp
#include "example.h"
void Example::say_hello() {
    cout << "hello" << endl;
}
//---example.i
%module example
%{
#include "example.h"
%}
%include "example.h"
#---setup.py
#!/usr/bin/env python
"""
setup.py file for SWIG C++/Python example
"""
from distutils.core import setup, Extension

example_module = Extension('_example',
                           sources=['example.cpp', 'example_wrap.cxx'],
                           )

setup(name='example',
      version='0.1',
      author="www.99fang.com",
      description="""Simple swig C++/Python example""",
      ext_modules=[example_module],
      py_modules=["example"],
      )
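Run:
swig -c++ -python example.i
python setup.py build_ext --inplace
I got stuck here for a while. I had the system default Python 2.7, anaconda2, and anaconda3 all installed, and the header and library paths of the default Python 2.7 were not set up correctly, so using it directly failed with Python.h not found. I added anaconda2 and anaconda3 to the environment: PATH=$PATH:/root/anaconda2/bin:/root/anaconda3/bin, then ran python3 setup.py build_ext --inplace, which builds against the Python 3 headers and libraries.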
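With several interpreters installed side by side, it is easy to lose track of which one `python`, `python2.7`, or `python3` resolves to and which headers each would build against. The sketch below is one way to check, under the assumption that each interpreter ships the standard `sysconfig` module; the function names and the list of interpreter names are illustrative, not taken from the original setup.

```python
import shutil
import subprocess
import sys

# One-liner executed inside each interpreter to print its header directory.
QUERY = "import sysconfig; print(sysconfig.get_paths()['include'])"


def include_dir_of(python_exe):
    """Ask the given interpreter where its Python.h lives."""
    output = subprocess.check_output([python_exe, "-c", QUERY],
                                     universal_newlines=True)
    return output.strip()


def report(names=("python", "python2.7", "python3")):
    """Map each interpreter name found on PATH to (path, include dir)."""
    found = {}
    for name in names:
        exe = shutil.which(name)
        if exe is not None:
            found[name] = (exe, include_dir_of(exe))
    return found


if __name__ == "__main__":
    for name, (exe, include_dir) in report().items():
        print(name, exe, include_dir)
```

If the include directory printed for an interpreter does not contain Python.h, building an extension with that interpreter will fail in the way described above.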
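6. Extending the voiceprint recognition interface
First rewrite the scripts as C++ files; once the C++ code is tested and runs correctly, the extension work can begin. Since this post mainly records extending a C++ interface to Python, the rewriting itself is not covered in detail.
For the relevant gcc compile options on Linux, see: http://blog.csdn.net/wuxianfeng1987/article/details/76528254
If you do not understand what the options mean, you will have little idea how to fix things when they go wrong.
File layout: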
/* recognition.i */
%module recognition
%{
#define SWIG_FILE_WITH_INIT
#include "recognition.h"
%}
int init();
int recognation();
/* recognition.h */
#include "base/kaldi-common.h"
#include "util/common-utils.h"
#include "util/kaldi-thread.h"
#include "feat/feature-mfcc.h"
#include "feat/wave-reader.h"
#include "matrix/kaldi-matrix.h"
#include "ivector/voice-activity-detection.h"
#include "ivector/ivector-extractor.h"
#include "ivector/plda.h"
#include "gmm/full-gmm.h"
#include "gmm/diag-gmm.h"
#include "gmm/mle-full-gmm.h"
#include "gmm/am-diag-gmm.h"
#include "hmm/transition-model.h"
#include "hmm/posterior.h"
int init();
int recognation();
/* recognition.cc */
#include <iostream>
using namespace std;
using namespace kaldi;
// global var
IvectorExtractor extractor;
FullGmm fgmm;
DiagGmm gmm;
Plda plda; ...
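Keeping the models in globals on the C++ side means they live as long as the Python process that imported the module, which is what makes a load-once, recognize-many service possible. The following is a minimal sketch of how a web handler might use that, not the author's service code. It assumes the wrapped module exposes `init()` and `recognation()` as declared in recognition.i and that `init()` returns 0 on success (an assumption); `RecognizerService` and `FakeBackend` are hypothetical names, with the fake standing in for the real extension so the sketch runs anywhere.

```python
import threading


class RecognizerService:
    """Loads the models once and reuses them for every request."""

    def __init__(self, backend):
        # `backend` is any object exposing init() and recognation(),
        # e.g. the SWIG-generated `recognition` module (assumed here).
        self._backend = backend
        self._lock = threading.Lock()
        self._ready = False

    def recognize(self):
        # The lock serializes requests; whether the real backend can run
        # concurrently is not known, so this sketch stays conservative.
        with self._lock:
            if not self._ready:
                if self._backend.init() != 0:  # assumed: 0 means success
                    raise RuntimeError("model loading failed")
                self._ready = True
            return self._backend.recognation()


class FakeBackend:
    """Hypothetical stand-in so the sketch runs without the real extension."""

    def __init__(self):
        self.init_calls = 0

    def init(self):
        self.init_calls += 1
        return 0

    def recognation(self):
        return 0


if __name__ == "__main__":
    backend = FakeBackend()
    service = RecognizerService(backend)
    for _ in range(3):
        service.recognize()
    print(backend.init_calls)
```

With the real module, `RecognizerService(recognition)` would pay the loading cost on the first request only, as long as the web server keeps the process alive.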
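Below is the whole troubleshooting process, starting from one simple function and gradually adding more until everything ran successfully:
Building with distutils:
swig -c++ -python recognition.i
python3.6 setup.py build_ext --inplace
I have not studied distutils closely, and changing its build commands might cause problems, so I tested with a manual build instead.
Manual build:
swig -c++ -python recognition.i
gcc -O2 -fPIC -c recognition.cc
gcc -O2 -fPIC -c recognition_wrap.cxx -I.. -I/root/anaconda3/include/python3.6m/
gcc -shared recognition.o recognition_wrap.o -o _recognition.so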
error:---------------------------------------------------------------------------------
import recognition
Traceback (most recent call last):
  File "/usr/wxf/kaldi/src/featbin/recognition.py", line 14, in swig_import_helper
    return importlib.import_module(mname)
  File "/root/anaconda3/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 978, in _gcd_import
  File "<frozen importlib._bootstrap>", line 961, in _find_and_load
  File "<frozen importlib._bootstrap>", line 950, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 648, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 560, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 922, in create_module
  File "<frozen importlib._bootstrap>", line 205, in _call_with_frames_removed
ImportError: /usr/wxf/kaldi/src/featbin/_recognition.so: undefined symbol: __gxx_personality_v0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/wxf/kaldi/src/featbin/recognition.py", line 17, in <module>
    _recognition = swig_import_helper()
  File "/usr/wxf/kaldi/src/featbin/recognition.py", line 16, in swig_import_helper
    return importlib.import_module('_recognition')
  File "/root/anaconda3/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ImportError: /usr/wxf/kaldi/src/featbin/_recognition.so: undefined symbol: __gxx_personality_v0
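Analysis-------------------------------------------------------------------------
Because these are C++ sources, they must be compiled and linked with g++. Linking with gcc does not pull in the C++ runtime library (libstdc++), which is where __gxx_personality_v0 is defined. The commands become:
swig -c++ -python recognition.i
g++ -O2 -fPIC -c recognition.cc
g++ -O2 -fPIC -c recognition_wrap.cxx -I.. -I/root/anaconda3/include/python3.6m/
g++ -shared recognition.o recognition_wrap.o -o _recognition.so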
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Add a header as a test: #include "base/kaldi-common.h"
error:-------------------------------------------------------------------
[root@hadoop-0 featbin]# g++ -O2 -fPIC -c recognition.cc
In file included from recognition.cc:2:0:
../base/kaldi-common.h:34:30: fatal error: base/kaldi-utils.h: No such file or directory
#include "base/kaldi-utils.h"
Analysis---------------------------------------------------------------------
Add the parent directory to the include path: -I..
g++ -O2 -fPIC -c recognition.cc -I..
error:------------------------------------------------------------------
In file included from ../base/kaldi-error.h:32:0,
from ../base/kaldi-common.h:35,
from recognition.cc:2:
../base/kaldi-types.h:44:23: fatal error: fst/types.h: No such file or directory
#include <fst/types.h>
g++ -O2 -fPIC -c recognition.cc -I.. -I/usr/wxf/kaldi/tools/openfst/include
error: #error This file requires compiler and library support for the ISO C++ 2011 standard. This support is currently experimental, and must be enabled with the -std=c++11 or -std=gnu++11 compiler options.
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
swig -c++ -python recognition.i
g++ -std=c++11 -O2 -fPIC -c recognition.cc -I.. -I/usr/wxf/kaldi/tools/openfst/include
g++ -O2 -fPIC -c recognition_wrap.cxx -I/root/anaconda3/include/python3.6m/
g++ -shared recognition.o recognition_wrap.o -o _recognition.so
Add KALDI_LOG as a test: runs normally for now.
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Add a header as a test: #include "util/common-utils.h"
error:------------------------------------------------------------
#error "You need to define (using the preprocessor) either HAVE_CLAPACK or HAVE_ATLAS or HAVE_MKL (but not more than one)"
Analysis--------------------------------------------------------------
g++ -std=c++11 -DHAVE_ATLAS -O2 -fPIC -c recognition.cc -I.. -I/usr/wxf/kaldi/tools/openfst/include
error:----------------------------------------------------------
cblas.h: No such file or directory
#include <cblas.h>
Analysis--------------------------------------------------------------
g++ -std=c++11 -DHAVE_ATLAS -O2 -fPIC -c recognition.cc -I.. -I/usr/wxf/kaldi/tools/openfst/include -I/usr/wxf/kaldi/tools/ATLAS/include
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Add the headers:
#include "base/kaldi-common.h"
#include "util/common-utils.h"
#include "util/kaldi-thread.h"
#include "feat/feature-mfcc.h"
#include "feat/wave-reader.h"
#include "matrix/kaldi-matrix.h"
#include "ivector/voice-activity-detection.h"
#include "ivector/ivector-extractor.h"
#include "ivector/plda.h"
#include "gmm/full-gmm.h"
#include "gmm/diag-gmm.h"
#include "gmm/mle-full-gmm.h"
#include "gmm/am-diag-gmm.h"
#include "hmm/transition-model.h"
#include "hmm/posterior.h"
#include <iostream>
using namespace std;
using namespace kaldi;
Works.
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Add a function:
int LoadIvector(std::string ivector_extractor_rxfilename, IvectorExtractor &extractor) {
    try {
        ReadKaldiObject(ivector_extractor_rxfilename, &extractor);
    } catch (const std::exception &e) {
        std::cerr << e.what();
        return -1;
    }
    return 0;
}
Test:
error:----------------------------------------------------------
Importing the module in Python (import recognition) fails:
ImportError: /usr/wxf/kaldi/src/featbin/_recognition.so: undefined symbol: _ZN5kaldi5InputC1ERKSsPb
ImportError: /usr/wxf/kaldi/src/featbin/_recognition.so: undefined symbol: _ZN5kaldi7DiagGmm15CopyFromFullGmmERKNS_7FullGmmE
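Analysis--------------------------------------------------------------
Compilation succeeds, but the import reports undefined symbols, so the libraries that define them are not being picked up.
·> If dynamic libraries are used, add: -rdynamic -ldl
·> Add the static libraries: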
../hmm/kaldi-hmm.a
../ivector/kaldi-ivector.a
../feat/kaldi-feat.a
../transform/kaldi-transform.a
../gmm/kaldi-gmm.a
../tree/kaldi-tree.a
../util/kaldi-util.a
../matrix/kaldi-matrix.a
../base/kaldi-base.a
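The undefined symbols reported at import time are mangled C++ names, and the namespace and class inside them hint at which of the archives above should provide the definition. A full demangler such as `c++filt` from binutils gives the complete signature; the sketch below is a deliberately tiny stand-in that decodes only the leading scope of an Itanium-ABI nested name. The function name `nested_name` is made up for illustration, and it ignores constructors, argument types, templates, and const-qualified members.

```python
def nested_name(symbol):
    """Return the '::'-joined scope of an Itanium-ABI nested name, or None.

    Only the leading run of length-prefixed identifiers after `_ZN` is
    decoded; anything after that (constructor tags, argument types) is skipped.
    """
    prefix = "_ZN"
    if not symbol.startswith(prefix):
        return None
    pos = len(prefix)
    parts = []
    while pos < len(symbol) and symbol[pos].isdigit():
        end = pos
        while end < len(symbol) and symbol[end].isdigit():
            end += 1
        length = int(symbol[pos:end])       # decimal length prefix
        parts.append(symbol[end:end + length])
        pos = end + length
    return "::".join(parts) if parts else None


if __name__ == "__main__":
    for sym in ("_ZN5kaldi5InputC1ERKSsPb",
                "_ZN5kaldi7DiagGmm15CopyFromFullGmmERKNS_7FullGmmE"):
        print(sym, "->", nested_name(sym))
```

Here the second symbol decodes to a member of `kaldi::DiagGmm`, which is declared in gmm/diag-gmm.h, so kaldi-gmm.a is the likely provider; the first refers to `kaldi::Input`, which most likely comes from the util library.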
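Modify the compile command (compile only, no linking: -c):
g++ -std=c++11 -rdynamic -DHAVE_ATLAS -O2 -fPIC -c recognition.cc -I.. -I/usr/wxf/kaldi/tools/openfst/include -I/usr/wxf/kaldi/tools/ATLAS/include ../hmm/kaldi-hmm.a ../ivector/kaldi-ivector.a ../feat/kaldi-feat.a ../transform/kaldi-transform.a ../gmm/kaldi-gmm.a ../tree/kaldi-tree.a ../util/kaldi-util.a ../matrix/kaldi-matrix.a ../base/kaldi-base.a -ldl
The problem persists.
It did not seem likely that the cause was a library missing at compile time, but I tried anyway. Test: link and generate output with -o
g++ -std=c++11 -rdynamic -DHAVE_ATLAS -O2 -fPIC -o recognition.cc -I.. -I/usr/wxf/kaldi/tools/openfst/include -I/usr/wxf/kaldi/tools/ATLAS/include ../hmm/kaldi-hmm.a ../ivector/kaldi-ivector.a ../feat/kaldi-feat.a ../transform/kaldi-transform.a ../gmm/kaldi-gmm.a ../tree/kaldi-tree.a ../util/kaldi-util.a ../matrix/kaldi-matrix.a ../base/kaldi-base.a -ldl
g++: warning: ../hmm/kaldi-hmm.a: linker input file unused because linking not done
This shows that the static libraries are not linked in while the object files are being produced.
The static libraries should therefore be linked in the last step, when the shared library is generated. Test: link the static libraries in that final step. Comparing several versions, only 2.7 worked.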
python3.6m-------------
g++ -std=c++11 -DHAVE_ATLAS -O2 -fPIC -c recognition_wrap.cxx -I.. -I/usr/wxf/kaldi/tools/openfst/include -I/usr/wxf/kaldi/tools/ATLAS/include -I/root/anaconda3/include/python3.6m/
g++ -std=c++11 -Wall -shared -DKALDI_DOUBLEPRECISION=0 -DHAVE_EXECINFO_H=1 -DHAVE_CXXABI_H -DHAVE_ATLAS -O2 recognition.o recognition_wrap.o -o _recognition.so -fPIC -L/root/anaconda3/lib -lpython3.6m -I.. -rdynamic ../hmm/kaldi-hmm.a ../ivector/kaldi-ivector.a ../feat/kaldi-feat.a ../transform/kaldi-transform.a ../gmm/kaldi-gmm.a ../tree/kaldi-tree.a ../util/kaldi-util.a ../matrix/kaldi-matrix.a ../base/kaldi-base.a -ldl
python3---------------
g++ -std=c++11 -DHAVE_ATLAS -O2 -fPIC -c recognition_wrap.cxx -I.. -I/usr/wxf/kaldi/tools/openfst/include -I/usr/wxf/kaldi/tools/ATLAS/include -I/root/anaconda3/include/python3.6m/
g++ -std=c++11 -Wall -shared -DKALDI_DOUBLEPRECISION=0 -DHAVE_EXECINFO_H=1 -DHAVE_CXXABI_H -DHAVE_ATLAS -O2 recognition.o recognition_wrap.o -o _recognition.so -fPIC -L/root/anaconda3/lib -lpython3 -I.. -rdynamic ../hmm/kaldi-hmm.a ../ivector/kaldi-ivector.a ../feat/kaldi-feat.a ../transform/kaldi-transform.a ../gmm/kaldi-gmm.a ../tree/kaldi-tree.a ../util/kaldi-util.a ../matrix/kaldi-matrix.a ../base/kaldi-base.a -ldl
python2.7-------------------
g++ -std=c++11 -DHAVE_ATLAS -O2 -fPIC -c recognition_wrap.cxx -I.. -I/usr/wxf/kaldi/tools/openfst/include -I/usr/wxf/kaldi/tools/ATLAS/include -I/root/anaconda2/include/python2.7/
g++ -std=c++11 -Wall -shared -DKALDI_DOUBLEPRECISION=0 -DHAVE_EXECINFO_H=1 -DHAVE_CXXABI_H -DHAVE_ATLAS -O2 recognition.o recognition_wrap.o -o _recognition.so -fPIC -L/root/anaconda2/lib -lpython2.7 -I.. -rdynamic ../hmm/kaldi-hmm.a ../ivector/kaldi-ivector.a ../feat/kaldi-feat.a ../transform/kaldi-transform.a ../gmm/kaldi-gmm.a ../tree/kaldi-tree.a ../util/kaldi-util.a ../matrix/kaldi-matrix.a ../base/kaldi-base.a -ldl
The same problem still occurs.
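Later I looked into it: static libraries that are linked into a shared library must themselves be compiled with -fPIC. In other words, the static libraries produced when I built Kaldi were the problem; Kaldi has to be configured to build with -fPIC. After rebuilding Kaldi that way, everything worked.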
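Putting the pieces together, the manual build comes down to four commands: run SWIG, compile the two sources with -fPIC, and link once with the static archives listed after the object files. The sketch below only assembles those command lines so they can be inspected or passed to `subprocess`; it is not the author's build script. The paths and archive order are the ones quoted in this post, `build_commands` is an illustrative name, and the link line leaves out `-lpython...` and any BLAS/LAPACK or OpenFst libraries that a particular Kaldi configuration may also need.

```python
KALDI_SRC = ".."  # the post builds inside kaldi/src/featbin
INCLUDE_DIRS = [
    KALDI_SRC,
    "/usr/wxf/kaldi/tools/openfst/include",
    "/usr/wxf/kaldi/tools/ATLAS/include",
]
# Archive order as given in the post; all must be built with -fPIC.
STATIC_LIBS = [
    "../hmm/kaldi-hmm.a",
    "../ivector/kaldi-ivector.a",
    "../feat/kaldi-feat.a",
    "../transform/kaldi-transform.a",
    "../gmm/kaldi-gmm.a",
    "../tree/kaldi-tree.a",
    "../util/kaldi-util.a",
    "../matrix/kaldi-matrix.a",
    "../base/kaldi-base.a",
]
COMMON = ["-std=c++11", "-DHAVE_ATLAS", "-O2", "-fPIC"]


def include_flags(dirs):
    """Turn directory names into -I flags."""
    return ["-I" + d for d in dirs]


def build_commands(python_include, module="recognition"):
    """Return [swig, compile source, compile wrapper, link] command lists."""
    wrap = module + "_wrap"
    swig = ["swig", "-c++", "-python", module + ".i"]
    compile_src = (["g++"] + COMMON + ["-c", module + ".cc"]
                   + include_flags(INCLUDE_DIRS))
    compile_wrap = (["g++"] + COMMON + ["-c", wrap + ".cxx"]
                    + include_flags(INCLUDE_DIRS + [python_include]))
    # Archives appear only here, after the objects that reference them.
    link = (["g++", "-shared", module + ".o", wrap + ".o"]
            + STATIC_LIBS + ["-ldl", "-o", "_" + module + ".so"])
    return [swig, compile_src, compile_wrap, link]


if __name__ == "__main__":
    for cmd in build_commands("/root/anaconda2/include/python2.7"):
        print(" ".join(cmd))
```

Keeping the archives out of the `-c` steps avoids the "linker input file unused" warning seen earlier, since static libraries are only consumed at link time.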
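7. Changing the input parameters of the interface API
Earlier, to reduce the number of variables while debugging, the parameters were hard-coded. The relevant parameters are added below.
First, a look at how SWIG supports some C++ types:
I mainly pass in model paths and model configuration parameters, which use std::string, int, double, and so on.
When I changed the interface directly to:
int init(std::string ivector_extractor_rxfilename,
         std::string fgmm_rxfilename,
         std::string plda_rxfilename);
it compiled, but calling it from Python raised: TypeError: in method 'init', argument 1 of type 'std::string'. The fix is to add %include "std_string.i" to the *.i file.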