Using SWIG on CentOS 7 to extend Python with an interface that calls C++ for speaker recognition


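1. Motivation

While building a speaker recognition service, I found that the script reloads the classifier on every recognition request. Loading is very slow, so a single recognition takes about 7 s. I considered two options.

Option 1: use a daemon to preload the classifier and keep everything else as scripts, since loading is the main cost. At recognition time the web interface calls the daemon, which runs the recognition and returns the result. The advantage is that fewer things seem to need changing, but the many intermediate files that get generated make it messy, so it is not well suited to production.

Option 2: reassemble the recognition pipeline, wrap it as an interface callable from Python, and use an existing web service to handle concurrent access from multiple clients. The drawback is that rewriting the pipeline is a lot of work, takes a long time, and involves many details, but it should solve the problem once and for all.

I have tentatively settled on option 2. Below I record my tests of exposing a C++ interface to Python through SWIG. There are two main questions: how to use it, and whether the classifier persists once loaded, so that it does not have to be reloaded every time.

...It took nearly two weeks: one week to reassemble the recognition code and one week to expose it as a Python interface. It runs, but there is still plenty to optimize. It felt very hard when I did not know how, and very simple once I did...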

2. Environment

Linux, CentOS 7

3. Installing SWIG

Official site: http://www.swig.org/download.html

Latest version at the time of writing: 3.0.12

yum install swig

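This installs swig-2.0.10-5.el7.x86_64.rpm.

To install the latest version instead, first remove the packaged one: yum remove swig

After downloading the source tarball, unpack it and build from the extracted directory:

tar -zxvf swig-3.0.12.tar.gz

cd swig-3.0.12

./configure

make -j 8

make install

SWIG is now installed.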

4. Writing the C test code

4.1 Writing the wrapper by hand, without SWIG

#include <Python.h>

int great_function(int a) {
    return a + 1;
}

static PyObject * _great_function(PyObject *self, PyObject *args)
{
    int _a;
    int res;

    if (!PyArg_ParseTuple(args, "i", &_a))
        return NULL;
    res = great_function(_a);
    return PyLong_FromLong(res);
}

static PyMethodDef GreateModuleMethods[] = {
    {
        "great_function",
        _great_function,
        METH_VARARGS,
        ""
    },
    {NULL, NULL, 0, NULL}
};

PyMODINIT_FUNC initgreat_module(void) {
    (void) Py_InitModule("great_module", GreateModuleMethods);
}

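The wrapper function _great_function converts the Python arguments into C arguments (PyArg_ParseTuple), calls the real great_function, and converts its return value into an object that is handed back to Python.

The export table GreateModuleMethods tells Python which functions in the module can be called from Python. The table can have any name. Each entry has four fields: the first is the function name exposed to Python; the second is the wrapper function, here _great_function; the third is the calling convention, where METH_VARARGS means the arguments arrive as a tuple of positional arguments; the fourth is a descriptive docstring. The table always ends with {NULL, NULL, 0, NULL}.

The init function initgreat_module cannot be named freely: it must be the module name with the prefix init. It binds the module name to the export table. This naming and Py_InitModule apply to Python 2; Python 3 uses PyInit_great_module together with PyModule_Create instead.

Compile directly:
gcc -fPIC -shared great_module.c -o great_module.so -I/root/anaconda2/include/python2.7/ -L/root/anaconda2/lib -lpython2.7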

4.2 Using SWIG

Reference: http://www.swig.org/translations/chinese/tutorial.html

swig_example.c

#include <time.h>
 double My_variable = 3.0;
 
 int fact(int n) {
     if (n <= 1) return 1;
     else return n*fact(n-1);
 }
 
 int my_mod(int x, int y) {
     return (x%y);
 }
 
 char *get_time()
 {
     time_t ltime;
     time(&ltime);
     return ctime(&ltime);
 }

swig_example.i (interface file)

/* swig_example.i */
 %module example    /* module name */
 %{
 /* Put header files here or function declarations like below */
 extern double My_variable;
 extern int fact(int n);
 extern int my_mod(int x, int y);
 extern char *get_time();
 %}
 
 /* declarations to wrap */
 extern double My_variable;
 extern int fact(int n);
 extern int my_mod(int x, int y);
 extern char *get_time();

Generate the Python wrapper files, then build the shared library

swig -python swig_example.i

gcc -c -fPIC swig_example.c swig_example_wrap.c -I/root/anaconda2/include/python2.7/

ld -shared swig_example.o swig_example_wrap.o -o _example.so

Test: change to the directory that contains the generated files and start Python in a terminal

[root@hadoop-0 cpython]# python
Python 2.7.5 (default, Nov 20 2015, 02:00:19)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import example
>>> example.fact(5)
120
>>> example.my_mod(10,30)
10
>>> example.get_time()
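'Fri Jul 28 11:43:30 2017\n'
>>> example.cvar.My_variable
3.0
Global variables have to be accessed this way; cvar is the default object that holds global variables.

To change the name cvar, replace the command swig -python swig_example.i with swig -python -globals myvar swig_example.i

So far, it looks like exposing the speaker recognition pipeline as a Python interface should not be a problem.

5. Writing the C++ test code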

//---example.h
#include <iostream>
using namespace std;
class Example{
public:
void say_hello();
};

//---example.cpp
#include "example.h"

void Example::say_hello(){
cout<<"hello"<<endl;
}

//---example.i
%module example
%{
#include "example.h"
%}
%include "example.h"

#---setup.py
#!/usr/bin/env python

"""
setup.py file for SWIG C++/Python example
"""
from distutils.core import setup, Extension
example_module = Extension('_example',
sources=['example.cpp', 'example_wrap.cxx',],
)
setup (name = 'example',
version = '0.1',
author = "www.99fang.com",
description = """Simple swig C++/Python example""",
ext_modules = [example_module],
py_modules = ["example"],
)

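Run:
swig -c++ -python example.i
python setup.py build_ext --inplace

I got stuck here for a while. I had the default python2.7, anaconda2, and anaconda3 all installed, and the header and shared-library paths of the default python2.7 were misconfigured, so using it directly failed because Python.h could not be found. I added anaconda2 and anaconda3 to the environment variable: PATH=$PATH:/root/anaconda2/bin:/root/anaconda3/bin, then ran python3 setup.py build_ext --inplace, after which the build used the python3 headers and libraries.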
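When several interpreters are installed, one way to avoid guessing the -I and -L paths is to ask the interpreter itself. The following is a minimal sketch using the standard-library sysconfig module; the helper name python_build_flags is made up for illustration and is not part of the original build. It reports paths only for the interpreter that runs it, so it should be run with the same python that will later import the module.

```python
import sysconfig


def python_build_flags():
    """Return (include_flag, libdir_flag) for the interpreter running this code.

    Hypothetical helper: the name and the returned format are illustrative.
    """
    include_dir = sysconfig.get_paths()["include"]  # directory expected to hold Python.h
    lib_dir = sysconfig.get_config_var("LIBDIR")    # may be None on some platforms
    include_flag = "-I" + include_dir
    libdir_flag = "-L" + lib_dir if lib_dir else ""
    return include_flag, libdir_flag


if __name__ == "__main__":
    inc, lib = python_build_flags()
    print(inc, lib)
```

Running this with python3 and with python2.7 should show two different include directories, which makes it easier to see which installation a manual g++ command is actually pointing at.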

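6. Extending the speaker recognition interface

First rewrite the scripts as C++ files. Once the C++ code runs correctly, the extension work can begin. Because this post mainly records the process of exposing a C++ interface to Python, the rewriting itself is not described in detail.

For the relevant Linux gcc compile options, see: http://blog.csdn.net/wuxianfeng1987/article/details/76528254

If you do not understand what the options mean, you will have almost no way to reason about a problem when one comes up.

File layout: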

/* recognition.i */

%module recognition

%{
#define SWIG_FILE_WITH_INIT
#include "recognition.h"
%}

int init();
int recognation();

/* recognition.h */

#include "base/kaldi-common.h"
#include "util/common-utils.h"
#include "util/kaldi-thread.h"
#include "feat/feature-mfcc.h"
#include "feat/wave-reader.h"
#include "matrix/kaldi-matrix.h"
#include "ivector/voice-activity-detection.h"
#include "ivector/ivector-extractor.h"
#include "ivector/plda.h"
#include "gmm/full-gmm.h"
#include "gmm/diag-gmm.h"
#include "gmm/mle-full-gmm.h"
#include "gmm/am-diag-gmm.h"
#include "hmm/transition-model.h"
#include "hmm/posterior.h"
int init();
int recognation();

/* recognition.cc */

#include <iostream>
using namespace std;
using namespace kaldi;

// global var
IvectorExtractor extractor;
FullGmm fgmm;
DiagGmm gmm;
Plda plda;   ...

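Below is the whole troubleshooting process, starting from one simple function and gradually adding code until everything ran successfully:

Building with distutils: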

swig -c++ -python recognition.i

python3.6 setup.py build_ext --inplace

I have not studied distutils closely and changing its commands might cause problems, so I tested by compiling manually.
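For readers who do want to stay with setup.py, the flags that end up in the manual g++ commands later in this post can in principle be passed through the Extension arguments. The following is only a sketch, not the build the author used: it assumes the directory layout and paths that appear in this post (run from kaldi/src/featbin), it assumes build_ext's built-in handling of .i sources, and the function names extension_kwargs and run_setup are made up for illustration.

```python
# Sketch of a setup.py that mirrors the hand-written g++ flags.
KALDI_SRC = ".."  # assumption: setup.py lives in kaldi/src/featbin
OPENFST_INC = "/usr/wxf/kaldi/tools/openfst/include"
ATLAS_INC = "/usr/wxf/kaldi/tools/ATLAS/include"
# Same static libraries, in the same order, as listed in this post.
KALDI_LIBS = ["hmm", "ivector", "feat", "transform", "gmm",
              "tree", "util", "matrix", "base"]


def extension_kwargs():
    """Keyword arguments for setuptools.Extension (hypothetical helper)."""
    return dict(
        name="_recognition",
        sources=["recognition.i", "recognition.cc"],  # build_ext runs swig on the .i file
        swig_opts=["-c++"],
        include_dirs=[KALDI_SRC, OPENFST_INC, ATLAS_INC],
        define_macros=[("HAVE_ATLAS", None)],          # becomes -DHAVE_ATLAS
        extra_compile_args=["-std=c++11"],
        extra_objects=["%s/%s/kaldi-%s.a" % (KALDI_SRC, d, d) for d in KALDI_LIBS],
        libraries=["dl"],                              # becomes -ldl
    )


def run_setup():
    """Call this at the bottom of setup.py to perform the actual build."""
    from setuptools import setup, Extension
    setup(name="recognition", version="0.1",
          ext_modules=[Extension(**extension_kwargs())],
          py_modules=["recognition"])
```

With run_setup() called at the end of the file, python3 setup.py build_ext --inplace would be the entry point. Keeping the flags in one dictionary makes it easier to compare them against a manual command line when the two builds behave differently.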

Manual build:
swig -c++ -python recognition.i
gcc -O2 -fPIC -c recognition.cc
gcc -O2 -fPIC -c recognition_wrap.cxx -I.. -I/root/anaconda3/include/python3.6m/
gcc -shared recognition.o recognition_wrap.o -o _recognition.so

error:---------------------------------------------------------------------------------
import recognition
Traceback (most recent call last):
File"/usr/wxf/kaldi/src/featbin/recognition.py", line 14,inswig_import_helper
   return importlib.import_module(mname)
File"/root/anaconda3/lib/python3.6/importlib/__init__.py",line 126, inimport_module
   return _bootstrap._gcd_import(name[level:], package, level)
File"<frozen importlib._bootstrap>", line 978, in_gcd_import
File"<frozen importlib._bootstrap>", line 961, in_find_and_load
File"<frozen importlib._bootstrap>", line 950,in_find_and_load_unlocked
File"<frozen importlib._bootstrap>", line 648, in_load_unlocked
File"<frozen importlib._bootstrap>", line 560, inmodule_from_spec
File"<frozen importlib._bootstrap_external>", line 922,increate_module
File"<frozen importlib._bootstrap>", line 205,in_call_with_frames_removed
ImportError: /usr/wxf/kaldi/src/featbin/_recognition.so: undefinedsymbol:__gxx_personality_v0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File"<stdin>", line 1, in <module>
File"/usr/wxf/kaldi/src/featbin/recognition.py", line 17,in<module>
   _recognition = swig_import_helper()
File"/usr/wxf/kaldi/src/featbin/recognition.py", line 16,inswig_import_helper
   return importlib.import_module('_recognition')
File"/root/anaconda3/lib/python3.6/importlib/__init__.py",line 126, inimport_module
   return _bootstrap._gcd_import(name[level:], package, level)
ImportError: /usr/wxf/kaldi/src/featbin/_recognition.so: undefinedsymbol:__gxx_personality_v0
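Analysis-------------------------------------------------------------------------
Because these are C++ sources (.cc), they must be compiled and linked with g++. The missing symbol __gxx_personality_v0 comes from the C++ runtime library libstdc++, which gcc does not link by default. The commands become:
swig -c++ -python recognition.i
g++ -O2 -fPIC -c recognition.cc
g++ -O2 -fPIC -c recognition_wrap.cxx -I.. -I/root/anaconda3/include/python3.6m/
g++ -shared recognition.o recognition_wrap.o -o _recognition.so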
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Test after adding a header: #include "base/kaldi-common.h"
error:-------------------------------------------------------------------
[root@hadoop-0 featbin]# g++ -O2 -fPIC -c recognition.cc
In file included from recognition.cc:2:0:
../base/kaldi-common.h:34:30: fatal error: base/kaldi-utils.h: No such file or directory
#include "base/kaldi-utils.h"
Analysis---------------------------------------------------------------------
Add the parent directory to the include path: -I..
g++ -O2 -fPIC -c recognition.cc -I..
error:------------------------------------------------------------------
In file included from ../base/kaldi-error.h:32:0,
                from ../base/kaldi-common.h:35,
                from recognition.cc:2:
../base/kaldi-types.h:44:23: fatal error: fst/types.h: No such file or directory
#include <fst/types.h>

g++ -O2 -fPIC -c recognition.cc -I.. -I/usr/wxf/kaldi/tools/openfst/include
error: #error This file requires compiler and library support for the ISO C++ 2011 standard.
This support is currently experimental, and must be enabled with the -std=c++11 or -std=gnu++11 compiler options.
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
swig -c++ -python recognition.i
g++ -std=c++11 -O2 -fPIC -c recognition.cc -I.. -I/usr/wxf/kaldi/tools/openfst/include
g++ -O2 -fPIC -c recognition_wrap.cxx -I/root/anaconda3/include/python3.6m/
g++ -shared recognition.o recognition_wrap.o -o _recognition.so

Added KALDI_LOG; the test runs normally for now.
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Test after adding the header: #include "util/common-utils.h"
error:------------------------------------------------------------
#error "You need to define (using the preprocessor) either HAVE_CLAPACK or HAVE_ATLAS or HAVE_MKL (but not more than one)"
   #error "You need to define (using the preprocessor) either HAVE_CLAPACK or HAVE_ATLAS or HAVE_MKL (but not more than one)"
Analysis--------------------------------------------------------------
g++ -std=c++11 -DHAVE_ATLAS -O2 -fPIC -c recognition.cc -I.. -I/usr/wxf/kaldi/tools/openfst/include
error:----------------------------------------------------------
cblas.h: No such file or directory
#include <cblas.h>
Analysis--------------------------------------------------------------
g++ -std=c++11 -DHAVE_ATLAS -O2 -fPIC -c recognition.cc -I.. -I/usr/wxf/kaldi/tools/openfst/include -I/usr/wxf/kaldi/tools/ATLAS/include
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Add the headers:
#include "base/kaldi-common.h"
#include "util/common-utils.h"
#include "util/kaldi-thread.h"
#include "feat/feature-mfcc.h"
#include "feat/wave-reader.h"
#include "matrix/kaldi-matrix.h"
#include "ivector/voice-activity-detection.h"
#include "ivector/ivector-extractor.h"
#include "ivector/plda.h"
#include "gmm/full-gmm.h"
#include "gmm/diag-gmm.h"
#include "gmm/mle-full-gmm.h"
#include "gmm/am-diag-gmm.h"
#include "hmm/transition-model.h"
#include "hmm/posterior.h"
#include <iostream>
using namespace std;
using namespace kaldi;
This compiles normally.
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Add a function:
int LoadIvector(std::string ivector_extractor_rxfilename, IvectorExtractor &extractor) {
   try { ReadKaldiObject(ivector_extractor_rxfilename, &extractor); }
   catch (const std::exception &e) {
       std::cerr << e.what();
       return -1;
   }
   return 0;
}
Test:
error:----------------------------------------------------------
Importing the module in Python (import recognition) fails:
ImportError: /usr/wxf/kaldi/src/featbin/_recognition.so: undefined symbol: _ZN5kaldi5InputC1ERKSsPb
ImportError: /usr/wxf/kaldi/src/featbin/_recognition.so: undefined symbol: _ZN5kaldi7DiagGmm15CopyFromFullGmmERKNS_7FullGmmE
Analysis--------------------------------------------------------------
Compilation succeeds, but undefined symbols appear at import time, so the libraries were not linked in.
·> If shared libraries are used, add: -rdynamic -ldl
·> Add the static libraries:
../hmm/kaldi-hmm.a
../ivector/kaldi-ivector.a
../feat/kaldi-feat.a
../transform/kaldi-transform.a
../gmm/kaldi-gmm.a
../tree/kaldi-tree.a
../util/kaldi-util.a
../matrix/kaldi-matrix.a
../base/kaldi-base.a
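The undefined symbols above are C++ names mangled under the Itanium ABI, which encodes each identifier with a length prefix. Reading those prefixes is often enough to see which namespace and class are missing, and therefore which kaldi-*.a archive is likely involved. The following is a rough sketch that handles only nested names starting with _ZN; the function name nested_name_parts is made up for illustration, and a tool such as c++filt gives a full demangling.

```python
import re


def nested_name_parts(mangled):
    """Return the length-prefixed identifiers of an Itanium-mangled nested name.

    Hypothetical helper: it stops at the first component that is not
    length-prefixed (constructors, templates, the closing 'E', and so on).
    """
    match = re.match(r"_ZN[rVK]*", mangled)
    if not match:
        raise ValueError("not a nested Itanium-mangled name: %r" % mangled)
    pos = match.end()
    parts = []
    while pos < len(mangled) and mangled[pos].isdigit():
        end = pos
        while end < len(mangled) and mangled[end].isdigit():
            end += 1
        length = int(mangled[pos:end])      # decimal length of the next identifier
        parts.append(mangled[end:end + length])
        pos = end + length
    return parts


if __name__ == "__main__":
    print(nested_name_parts("_ZN5kaldi5InputC1ERKSsPb"))
    # ['kaldi', 'Input']
    print(nested_name_parts("_ZN5kaldi7DiagGmm15CopyFromFullGmmERKNS_7FullGmmE"))
    # ['kaldi', 'DiagGmm', 'CopyFromFullGmm']
```

Here the first symbol points at the constructor of kaldi::Input and the second at kaldi::DiagGmm::CopyFromFullGmm, which is consistent with the util and gmm archives appearing in the list of static libraries.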

Change the compile command (compile only, no linking: -c):
g++ -std=c++11 -rdynamic -DHAVE_ATLAS -O2 -fPIC -c recognition.cc -I.. -I/usr/wxf/kaldi/tools/openfst/include -I/usr/wxf/kaldi/tools/ATLAS/include ../hmm/kaldi-hmm.a ../ivector/kaldi-ivector.a ../feat/kaldi-feat.a ../transform/kaldi-transform.a ../gmm/kaldi-gmm.a ../tree/kaldi-tree.a ../util/kaldi-util.a ../matrix/kaldi-matrix.a ../base/kaldi-base.a -ldl
The problem remains.

I did not think the libraries should be linked during compilation, but I tried a linking build with -o anyway:
g++ -std=c++11 -rdynamic -DHAVE_ATLAS -O2 -fPIC -o recognition.cc -I.. -I/usr/wxf/kaldi/tools/openfst/include -I/usr/wxf/kaldi/tools/ATLAS/include ../hmm/kaldi-hmm.a ../ivector/kaldi-ivector.a ../feat/kaldi-feat.a ../transform/kaldi-transform.a ../gmm/kaldi-gmm.a ../tree/kaldi-tree.a ../util/kaldi-util.a ../matrix/kaldi-matrix.a ../base/kaldi-base.a -ldl

g++: warning: ../hmm/kaldi-hmm.a: linker input file unused because linking not done. This shows that the libraries are not linked while the object files are being generated.

So the static libraries have to be linked in the last step. Test: link the static libraries in the final step, when the shared library is generated. I compared several Python versions and found that only 2.7 worked.

python3.6m-------------

g++ -std=c++11 -DHAVE_ATLAS -O2 -fPIC -c recognition_wrap.cxx -I.. -I/usr/wxf/kaldi/tools/openfst/include -I/usr/wxf/kaldi/tools/ATLAS/include -I/root/anaconda3/include/python3.6m/

g++ -std=c++11 -Wall -shared -DKALDI_DOUBLEPRECISION=0 -DHAVE_EXECINFO_H=1 -DHAVE_CXXABI_H -DHAVE_ATLAS -O2 recognition.o recognition_wrap.o -o _recognition.so -fPIC -L/root/anaconda3/lib -lpython3.6m -I.. -rdynamic ../hmm/kaldi-hmm.a ../ivector/kaldi-ivector.a ../feat/kaldi-feat.a ../transform/kaldi-transform.a ../gmm/kaldi-gmm.a ../tree/kaldi-tree.a ../util/kaldi-util.a ../matrix/kaldi-matrix.a ../base/kaldi-base.a -ldl

python3---------------

g++ -std=c++11 -DHAVE_ATLAS -O2 -fPIC -c recognition_wrap.cxx -I.. -I/usr/wxf/kaldi/tools/openfst/include -I/usr/wxf/kaldi/tools/ATLAS/include -I/root/anaconda3/include/python3.6m/

g++ -std=c++11 -Wall -shared -DKALDI_DOUBLEPRECISION=0 -DHAVE_EXECINFO_H=1 -DHAVE_CXXABI_H -DHAVE_ATLAS -O2 recognition.o recognition_wrap.o -o _recognition.so -fPIC -L/root/anaconda3/lib -lpython3 -I.. -rdynamic ../hmm/kaldi-hmm.a ../ivector/kaldi-ivector.a ../feat/kaldi-feat.a ../transform/kaldi-transform.a ../gmm/kaldi-gmm.a ../tree/kaldi-tree.a ../util/kaldi-util.a ../matrix/kaldi-matrix.a ../base/kaldi-base.a -ldl

python2.7-------------------

g++ -std=c++11 -DHAVE_ATLAS -O2 -fPIC -c recognition_wrap.cxx -I.. -I/usr/wxf/kaldi/tools/openfst/include -I/usr/wxf/kaldi/tools/ATLAS/include -I/root/anaconda2/include/python2.7/

g++ -std=c++11 -Wall -shared -DKALDI_DOUBLEPRECISION=0 -DHAVE_EXECINFO_H=1 -DHAVE_CXXABI_H -DHAVE_ATLAS -O2 recognition.o recognition_wrap.o -o _recognition.so -fPIC -L/root/anaconda2/lib -lpython2.7 -I.. -rdynamic ../hmm/kaldi-hmm.a ../ivector/kaldi-ivector.a ../feat/kaldi-feat.a ../transform/kaldi-transform.a ../gmm/kaldi-gmm.a ../tree/kaldi-tree.a ../util/kaldi-util.a ../matrix/kaldi-matrix.a ../base/kaldi-base.a -ldl

The same problem remains.

Later I looked into it: when static libraries are linked into a shared library, they must be built with the -fPIC option. In other words, the static libraries generated when I installed Kaldi were the problem, and they needed to be configured with -fPIC. After recompiling Kaldi that way, everything worked.

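7. Changing the input parameters of the interface API

Earlier, to reduce the number of variables while debugging, the parameters were hard-coded. The relevant parameters are added below.

First, a look at how SWIG supports some C++ types:

I mainly pass in model paths and model configuration parameters, so the types involved are std::string, int, double, and so on.

When I changed the interface directly to:

int init(std::string ivector_extractor_rxfilename,
         std::string fgmm_rxfilename,
         std::string plda_rxfilename);

it compiled, but calling it from Python produced the error: TypeError: in method 'init', argument 1 of type 'std::string'. The fix is to add %include "std_string.i" to the *.i file.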
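Once init accepts the model paths, the original goal of loading the classifier a single time can be handled on the Python side. The following is a minimal sketch of that pattern, not the author's service code. The class names RecognizerService and FakeBackend are made up for illustration, and the model paths in the usage example are placeholders. It also assumes that init returns 0 on success, which the post does not state. FakeBackend stands in for the compiled recognition module so that the sketch runs without Kaldi.

```python
import threading


class RecognizerService:
    """Call backend.init(...) once, then reuse the loaded models for every request."""

    def __init__(self, backend, ivector_path, fgmm_path, plda_path):
        self._backend = backend  # e.g. the SWIG-generated `recognition` module
        self._paths = (ivector_path, fgmm_path, plda_path)
        self._lock = threading.Lock()
        self._loaded = False

    def _ensure_loaded(self):
        # The lock only guards the one-time load, not the recognition call itself.
        with self._lock:
            if not self._loaded:
                # Assumption: init returns 0 on success.
                if self._backend.init(*self._paths) != 0:
                    raise RuntimeError("model loading failed")
                self._loaded = True

    def recognize(self):
        self._ensure_loaded()
        return self._backend.recognation()  # spelled as declared in recognition.i


class FakeBackend:
    """Stand-in for the compiled module; counts how often init is called."""

    def __init__(self):
        self.init_calls = 0

    def init(self, ivector_path, fgmm_path, plda_path):
        self.init_calls += 1
        return 0

    def recognation(self):
        return 0


if __name__ == "__main__":
    backend = FakeBackend()
    service = RecognizerService(backend, "final.ie", "final.ubm", "plda")  # placeholder paths
    for _ in range(3):
        service.recognize()
    print(backend.init_calls)
```

In a real web process the backend argument would be the imported recognition module, and the service object would be created once at startup. Whether recognation() can safely be called from several threads at once depends on the global C++ state in recognition.cc, which this sketch does not address.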

wuxianfeng1987
