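I recently needed to evaluate speech recognition and tested Microsoft's offline and online options. I ran into a few problems while testing the Microsoft Cognitive Services speech recognition service, so I am recording them here.
Step 1: subscribe for a trial key on the Microsoft Cognitive Services website.
Step 2: obtain a token.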
POST https://api.cognitive.microsoft.com/sts/v1.0/issueToken
Content-Length: 0
Ocp-Apim-Subscription-Key: <your subscription key>
The main code is as follows:
// Creates the network manager and requests the first token.
Authentication::Authentication()
{
    m_nmNetAccess = new QNetworkAccessManager(this);
    connect(m_nmNetAccess, SIGNAL(finished(QNetworkReply*)), this, SLOT(slotFinishReply(QNetworkReply*)));
    init();
}
void Authentication::init()
{
    QString urlAdress = "https://api.cognitive.microsoft.com/sts/v1.0/issueToken";
    QSslConfiguration config;
    // VerifyNone disables certificate verification; acceptable for a quick test only.
    config.setPeerVerifyMode(QSslSocket::VerifyNone);
    config.setProtocol(QSsl::TlsV1_0OrLater);
    m_authenticationRequest.setSslConfiguration(config);
    m_authenticationRequest.setUrl(QUrl(urlAdress));
    // The value must be the string "0"; a literal 0 is treated as a null pointer.
    m_authenticationRequest.setRawHeader("Content-Length", "0");
    m_authenticationRequest.setRawHeader("Content-type", "application/x-www-form-urlencoded");
    m_authenticationRequest.setRawHeader("Ocp-Apim-Subscription-Key", "your key");
    QByteArray array; // empty request body
    m_nmNetAccess->post(m_authenticationRequest, array);
    // Refresh the token every 9 minutes.
    connect(&m_timerExpired, SIGNAL(timeout()), this, SLOT(updateToken()));
    m_timerExpired.start(540000);
}
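void Authentication::slotFinishReply(QNetworkReply *reply)
{
    // Only store the response body as a token when the request succeeded.
    if (reply->error() == QNetworkReply::NoError) {
        m_token = reply->readAll();
        qDebug() << "TOKEN: " << m_token;
    } else {
        qDebug() << "Token request failed: " << reply->errorString();
    }
    reply->deleteLater();
}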
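The response body is a Base64-encoded string. Save it as-is without any processing; it is used in the next step. The token must be refreshed before it expires (the official documentation gives a validity of 10 minutes, and the code above refreshes every 9 minutes); otherwise it becomes invalid.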
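The timer-based refresh can be complemented by tracking when the token was stored, so a caller can check freshness right before each request. The following is a minimal sketch in standard C++ (no Qt), assuming a 9-minute refresh interval as in the code above. TokenCache, store and needsRefresh are hypothetical names that do not appear in the original project.

```cpp
#include <chrono>
#include <string>

// Hypothetical helper: remembers a token and when it was stored.
class TokenCache {
public:
    using Clock = std::chrono::steady_clock;

    // Assumption: refresh after 9 minutes, matching the timer interval above.
    explicit TokenCache(std::chrono::seconds refreshAfter = std::chrono::minutes(9))
        : m_refreshAfter(refreshAfter) {}

    // Save the token exactly as received, together with the time it was stored.
    void store(const std::string& token, Clock::time_point now) {
        m_token = token;
        m_storedAt = now;
        m_hasToken = true;
    }

    // True when no token is stored yet or the stored one is too old.
    bool needsRefresh(Clock::time_point now) const {
        return !m_hasToken || (now - m_storedAt) >= m_refreshAfter;
    }

    const std::string& token() const { return m_token; }

private:
    std::chrono::seconds m_refreshAfter;
    std::string m_token;
    Clock::time_point m_storedAt{};
    bool m_hasToken = false;
};
```

A caller would invoke needsRefresh(TokenCache::Clock::now()) before posting audio and request a new token first when it returns true. This also covers the case where the very first token request failed.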
Step 3: POST the audio data.
POST /recognize?scenarios=catsearch&appid=f84e364c-ec34-4773-a783-73707bd9a585&locale=en-US&device.os=wp7&version=3.0&format=xml&requestid=1d4b6030-9099-11e0-91e4-0800200c9a66&instanceid=1d4b6030-9099-11e0-91e4-0800200c9a66 HTTP/1.1
Host: speech.platform.bing.com
Content-Type: audio/wav; samplerate=16000
Authorization: Bearer [Base64 access_token]
(audio data)
See the official documentation for a detailed explanation. The implementation is as follows:
RSPluginOnLinePrivate::RSPluginOnLinePrivate(RSPluginMSOnLine* parent)
{
    m_parent = parent;
    m_nmNetAccess = new QNetworkAccessManager(this);
    connect(m_nmNetAccess, SIGNAL(finished(QNetworkReply*)), this, SLOT(slotFinishReply(QNetworkReply*)));
    m_authentication = new Authentication();
    QString urlAdress = "https://speech.platform.bing.com/recognize";
    QSslConfiguration config;
    // VerifyNone disables certificate verification; acceptable for a quick test only.
    config.setPeerVerifyMode(QSslSocket::VerifyNone);
    config.setProtocol(QSsl::TlsV1_0OrLater);
    m_recogniseRequest.setSslConfiguration(config);
    // Query parameters for sending the recording.
    // QUuid::toString() wraps the id in braces, so mid(1, 36) strips them.
    // Note: the requestid is generated once here and reused for every request.
    QString _tmpString = QString("?scenarios=smd&appid=D4D52672-91D7-4C74-8AD8-42B1D98141A5&instanceid=565D69FF-E928-4B7E-87DA-9A750B96D9E3&locale=zh-CN&format=json&version=3.0&device.os=wp7&requestid=%1").arg(QUuid::createUuid().toString().mid(1, 36));
    urlAdress += _tmpString;
    m_recogniseRequest.setUrl(QUrl(urlAdress));
    m_recogniseRequest.setRawHeader("Accept", "application/json;text/xml");
    // Inner quotes must be escaped; "" only concatenates adjacent string literals.
    m_recogniseRequest.setRawHeader("Content-type", "audio/wav; codec=\"audio/pcm\"; samplerate=8000");
    m_recogniseRequest.setRawHeader("Host", "speech.platform.bing.com");
    m_recogniseRequest.setRawHeader("Connection", "keep-alive");
    m_recogniseRequest.setRawHeader("SendChunked", "true");
}
void RSPluginOnLinePrivate::recognize(const QByteArray& array)
{
    // The Authorization header has the form "Bearer <token>".
    QString token = m_authentication->token();
    token = "Bearer " + token;
    m_recogniseRequest.setRawHeader("Authorization", token.toUtf8());
    m_nmNetAccess->post(m_recogniseRequest, array);
}
void RSPluginOnLinePrivate::slotFinishReply(QNetworkReply *reply)
{
    QJsonDocument doc = QJsonDocument::fromJson(reply->readAll());
    QJsonObject object = doc.object();
    QString result;
    bool bSuccess = false;
    // header.status reports whether recognition succeeded.
    if (object["header"].isObject()) {
        QJsonObject obj = object["header"].toObject();
        bSuccess = (obj["status"].toString() == "success");
    }
    if (bSuccess) {
        // Keep the "name" field of the last entry in results.
        if (object["results"].isArray()) {
            QJsonArray array = object["results"].toArray();
            for (auto val : array) {
                if (val.isObject()) {
                    QJsonObject obj = val.toObject();
                    result = obj["name"].toString();
                }
            }
        }
        m_parent->notice(result);
    }
    reply->deleteLater();
}
In void RSPluginOnLinePrivate::recognize(const QByteArray& array), array is the speech data captured from the microphone; a WAV header must be prepended to it.
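The WAV header itself is not shown in the post. As a rough illustration, the sketch below builds the canonical 44-byte RIFF/WAVE header for uncompressed PCM in standard C++ (no Qt). makeWavHeader, appendLE and appendTag are hypothetical helpers, and the sample values (8000 Hz to match the samplerate in the Content-type header, mono, 16-bit) are assumptions about the microphone capture format.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Append the low `bytes` bytes of value in little-endian order, as RIFF requires.
static void appendLE(std::vector<std::uint8_t>& out, std::uint32_t value, int bytes) {
    for (int i = 0; i < bytes; ++i) {
        out.push_back(static_cast<std::uint8_t>((value >> (8 * i)) & 0xFF));
    }
}

// Append a four-character chunk tag such as "RIFF" or "data".
static void appendTag(std::vector<std::uint8_t>& out, const std::string& tag) {
    out.insert(out.end(), tag.begin(), tag.end());
}

// Hypothetical helper: build a 44-byte header for PCM audio of dataBytes length.
std::vector<std::uint8_t> makeWavHeader(std::uint32_t sampleRate,
                                        std::uint16_t channels,
                                        std::uint16_t bitsPerSample,
                                        std::uint32_t dataBytes) {
    const std::uint32_t blockAlign = channels * bitsPerSample / 8u; // bytes per frame
    const std::uint32_t byteRate = sampleRate * blockAlign;         // bytes per second
    std::vector<std::uint8_t> h;
    h.reserve(44);
    appendTag(h, "RIFF");
    appendLE(h, 36 + dataBytes, 4);  // size of everything after this field
    appendTag(h, "WAVE");
    appendTag(h, "fmt ");
    appendLE(h, 16, 4);              // fmt chunk size for PCM
    appendLE(h, 1, 2);               // audio format 1 = PCM
    appendLE(h, channels, 2);
    appendLE(h, sampleRate, 4);
    appendLE(h, byteRate, 4);
    appendLE(h, blockAlign, 2);
    appendLE(h, bitsPerSample, 2);
    appendTag(h, "data");
    appendLE(h, dataBytes, 4);       // size of the raw samples that follow
    return h;
}
```

For example, makeWavHeader(8000, 1, 16, pcmSize) gives the bytes to place in front of the raw samples before passing the combined buffer to recognize(). In the Qt code these bytes could be copied into a QByteArray and the PCM data appended after them.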
For a detailed description of the HTTP request parameters, see the official Microsoft documentation at: https://www.azure.cn/cognitive-services/en-us/Speech-api/documentation/API-Reference-REST/BingVoiceRecognition
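The query parameters used in the request URL (scenarios, locale, format, device.os, requestid and so on) can also be assembled from key/value pairs instead of one hand-written string, which makes it harder to leave an unescaped character in the URL. Below is a minimal sketch in standard C++; percentEncode and buildQuery are hypothetical helpers, not part of the original project, and in Qt code QUrlQuery serves a similar purpose.

```cpp
#include <string>
#include <utility>
#include <vector>

// Characters that may appear unescaped in a query component (RFC 3986 "unreserved").
static bool isUnreserved(unsigned char c) {
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_' || c == '~';
}

// Hypothetical helper: percent-encode every byte that is not unreserved.
std::string percentEncode(const std::string& s) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : s) {
        if (isUnreserved(c)) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Hypothetical helper: join the pairs as "?k1=v1&k2=v2"; empty input gives "".
std::string buildQuery(const std::vector<std::pair<std::string, std::string>>& params) {
    std::string q;
    for (const auto& p : params) {
        q += q.empty() ? '?' : '&';
        q += percentEncode(p.first);
        q += '=';
        q += percentEncode(p.second);
    }
    return q;
}
```

Calling buildQuery with the pairs locale/zh-CN, format/json and a freshly generated requestid for each call would also avoid reusing one request ID across all requests.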
The complete code is uploaded at http://download.csdn.net/detail/stafniejay/9713476