Implementing iFlytek online text-to-speech (TTS) in the BAE Tomcat environment

Original article: http://blog.csdn.net/luhanglei/article/details/73246146

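A colleague in marketing suddenly asked whether a skill for the DingDong (叮咚) smart speaker could reply to users in a dialect. Since the DingDong speaker supports replying with a media file, this seemed feasible. After some searching, iFlytek was the only TTS provider I found that supports dialects. However, its Java SDK offers only two modes, playback and download, and the download is in PCM format. The iFlytek SDK therefore needs to be wrapped so that it can be called through a URL.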

Hidden catch: in the BAE environment only specific paths are writable, so the temporary files must be placed under one of them.
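The writable-path restriction can be checked before wiring up the SDK. The following is a minimal sketch that tries to create and delete a temporary file in a given directory. The class and method names are made up for illustration, and `/home/bae` is simply the directory used by the servlet later in this post; the actual set of writable paths should be confirmed against Baidu's documentation.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class WritableDirCheck {
    /** Returns true if a temporary file can actually be created in dir. */
    public static boolean canWriteTo(String dir) {
        Path d = Paths.get(dir);
        if (!Files.isDirectory(d)) {
            return false;
        }
        try {
            // Creating a real file is more reliable than only checking permissions.
            Path tmp = Files.createTempFile(d, "tts-", ".pcm");
            Files.delete(tmp);
            return true;
        } catch (IOException | SecurityException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "/home/bae" is the directory used in this post (an assumption elsewhere).
        String dir = args.length > 0 ? args[0] : "/home/bae";
        System.out.println(dir + " writable: " + canWriteTo(dir));
    }
}
```

Running this once inside the deployed application (for example from a throwaway servlet) shows quickly whether the chosen directory is usable.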

Final result: opening http://***.duapp.com/tts?text=<the text to speak> returns a WAV audio clip.
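Because the text is usually Chinese, a client has to percent-encode it before putting it in the query string. Here is a minimal sketch of building such a request URL. `TtsUrlBuilder` and `buildUrl` are hypothetical names, and `example.invalid` is a placeholder host, since the real host is masked in this post.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class TtsUrlBuilder {
    /** Appends the UTF-8 percent-encoded text as the "text" query parameter. */
    public static String buildUrl(String endpoint, String text)
            throws UnsupportedEncodingException {
        return endpoint + "?text=" + URLEncoder.encode(text, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        // "\u4f60\u597d" is the two-character greeting "ni hao" (hello).
        // The host below is a placeholder, not the real deployment.
        System.out.println(buildUrl("http://example.invalid/tts", "\u4f60\u597d"));
        // prints http://example.invalid/tts?text=%E4%BD%A0%E5%A5%BD
    }
}
```

Whether the servlet decodes the parameter correctly also depends on the URI encoding configured in the container, so it is worth testing with a non-ASCII string early.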

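1. Import the iFlytek SDK

Put the two jar files from the SDK's lib directory into the project's lib directory;

upload the dll and so files via git or svn to the folder that contains ROOT.war;

then configure Tomcat's path according to Baidu's official instructions.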
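If the native library cannot be found at runtime, it helps to see where the JVM is actually looking. The sketch below assumes the SDK loads its native library through the standard `java.library.path` mechanism, which this post does not confirm. The class name is hypothetical, and `libmsc64.so` is only an assumed file name; use whatever file ships with your copy of the SDK.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class NativeLibLocator {
    /** Returns the directories in libraryPath that contain a file named fileName. */
    public static List<String> findOnPath(String libraryPath, String fileName) {
        List<String> hits = new ArrayList<>();
        if (libraryPath == null) {
            return hits;
        }
        for (String dir : libraryPath.split(File.pathSeparator)) {
            if (!dir.isEmpty() && new File(dir, fileName).isFile()) {
                hits.add(dir);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String path = System.getProperty("java.library.path");
        // Assumed file name; replace with the .so or .dll from your SDK package.
        String name = args.length > 0 ? args[0] : "libmsc64.so";
        System.out.println("java.library.path = " + path);
        System.out.println(name + " found in: " + findOnPath(path, name));
    }
}
```

An empty result suggests that either the file was uploaded to a different folder or the Tomcat path setting has not taken effect.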

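2. The servlet code is as follows

The idea is to use iFlytek's Java API to write the generated PCM file to a temporary path that BAE allows writing to, convert it to WAV, and send it in the response.

There is only one request parameter, text, whose value is the text to convert.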

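/**
	 * @see HttpServlet#doGet(HttpServletRequest request, HttpServletResponse
	 *      response)
	 */
	protected void doGet(HttpServletRequest request, HttpServletResponse response)
			throws ServletException, IOException {
		String text = request.getParameter("text");// the text to speak
		if (text == null || text.isEmpty()) {
			response.sendError(HttpServletResponse.SC_BAD_REQUEST, "Missing parameter: text");
			return;
		}
		// Temporary files go into a directory BAE allows writing to. A random name is
		// used instead of the raw text, which may contain "/" or ".." or be too long.
		String base = "/home/bae/" + java.util.UUID.randomUUID();
		File pcm = new File(base + ".pcm");
		File f = new File(base + ".wav");

		SpeechUtility.createUtility(SpeechConstant.APPID + "=******** ");// iFlytek APPID
		// 1. Create the SpeechSynthesizer object
		SpeechSynthesizer mTts = SpeechSynthesizer.createSynthesizer();
		// 2. Set the synthesis parameters; see the SpeechSynthesizer class in the
		// "MSC Reference Manual"
		mTts.setParameter(SpeechConstant.VOICE_NAME, "xiaokun");// voice name
		mTts.setParameter(SpeechConstant.SPEED, "50");// speed, range 0~100
		mTts.setParameter(SpeechConstant.PITCH, "50");// pitch, range 0~100
		mTts.setParameter(SpeechConstant.VOLUME, "100");// volume, range 0~100
		// 3. Start synthesis: the audio is saved to the PCM path and the listener
		// is notified when it is done
		MySynthesizeToUriListener synthesizeToUriListener = new MySynthesizeToUriListener();
		mTts.synthesizeToUri(text, pcm.getPath(), synthesizeToUriListener);
		try {
			// Poll until the listener reports completion; 30 s is an arbitrary upper bound
			long deadline = System.currentTimeMillis() + 30000;
			while (!synthesizeToUriListener.isFinish()) {
				if (System.currentTimeMillis() > deadline) {
					response.sendError(HttpServletResponse.SC_GATEWAY_TIMEOUT, "TTS timed out");
					return;
				}
				try {
					Thread.sleep(200);
				} catch (InterruptedException e) {
					Thread.currentThread().interrupt();// keep the interrupt status
					response.sendError(HttpServletResponse.SC_INTERNAL_SERVER_ERROR, "Interrupted");
					return;
				}
			}
			try {
				convertAudioFiles(pcm.getPath(), f.getPath());
			} catch (Exception e) {
				e.printStackTrace();
				response.sendError(HttpServletResponse.SC_INTERNAL_SERVER_ERROR, "PCM to WAV conversion failed");
				return;
			}
			if (!f.exists()) {
				response.sendError(HttpServletResponse.SC_INTERNAL_SERVER_ERROR, "No audio was generated");
				return;
			}

			// Stream the WAV file to the client
			response.setContentType("audio/x-wav");
			response.setContentLength((int) f.length());
			try (FileInputStream fis = new FileInputStream(f);
					ServletOutputStream out = response.getOutputStream()) {
				byte[] b = new byte[4096];
				int n;
				while ((n = fis.read(b)) != -1) {
					out.write(b, 0, n);
				}
				out.flush();
			}
		} finally {
			// Remove the temporary files so they do not pile up
			pcm.delete();
			f.delete();
		}
	}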

	/**
	 * @see HttpServlet#doPost(HttpServletRequest request, HttpServletResponse
	 *      response)
	 */
	protected void doPost(HttpServletRequest request, HttpServletResponse response)
			throws ServletException, IOException {
		// POST requests are handled the same way as GET requests
		doGet(request, response);
	}

	class MySynthesizeToUriListener implements SynthesizeToUriListener {
		// Set from the SDK's callback and read by the request thread, hence volatile
		public volatile boolean isFinish = false;

		// progress is the synthesis progress, 0~100
		public void onBufferProgress(int progress) {
			if (progress == 100)
				isFinish = true;
		}

		// Callback invoked when the synthesis session completes.
		// uri is the path the audio was saved to; error is null if the session succeeded
		@Override
		public void onSynthesizeCompleted(String uri, SpeechError error) {
			if (error != null)
				System.err.println("TTS synthesis failed: " + error);
			isFinish = true;
		}

		@Override
		public void onEvent(int arg0, int arg1, int arg2, int arg3, Object arg4, Object arg5) {
			// Not used here
		}

		public boolean isFinish() {
			return isFinish;
		}

		public void setFinish(boolean isFinish) {
			this.isFinish = isFinish;
		}

	}

	private void convertAudioFiles(String src, String target) throws Exception {
		// Size of the raw PCM data in bytes
		int PCMSize = (int) new File(src).length();

		// Fill in the header fields (bit depth, sample rate, and so on).
		// The audio here is 16-bit, mono, 16000 Hz
		WaveHeader header = new WaveHeader();
		// Length field = size of the data (PCMSize) + size of the header, excluding
		// the 4-byte "RIFF" identifier and the 4 bytes of fileLength itself
		header.fileLength = PCMSize + (44 - 8);
		header.FmtHdrLeth = 16;
		header.BitsPerSample = 16;
		header.Channels = 1;
		header.FormatTag = 0x0001;
		header.SamplesPerSec = 16000;
		header.BlockAlign = (short) (header.Channels * header.BitsPerSample / 8);
		header.AvgBytesPerSec = header.BlockAlign * header.SamplesPerSec;
		header.DataHdrLeth = PCMSize;

		byte[] h = header.getHeader();

		assert h.length == 44; // a standard PCM WAV header is 44 bytes

		byte[] buf = new byte[1024 * 4];
		// try-with-resources closes both streams even if copying fails
		try (FileInputStream fis = new FileInputStream(src);
				FileOutputStream fos = new FileOutputStream(target)) {
			// write header
			fos.write(h, 0, h.length);
			// write data stream
			int size = fis.read(buf);
			while (size != -1) {
				fos.write(buf, 0, size);
				size = fis.read(buf);
			}
		}
	}
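The servlet waits for synthesis by polling a flag every 200 ms. One alternative, shown here only as a sketch, is to block on a `CountDownLatch` with a timeout. `CompletionSignal` is a hypothetical stand-in for the SDK listener: in a real listener, `complete` would be called from `onSynthesizeCompleted`. Nothing below touches the iFlytek SDK, so the completion is simulated with a plain thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LatchWaitSketch {
    /** Hypothetical stand-in for the listener: records completion and an optional error. */
    public static class CompletionSignal {
        private final CountDownLatch done = new CountDownLatch(1);
        private volatile String error; // null means success

        /** Would be called from the completion callback of the real listener. */
        public void complete(String errorOrNull) {
            error = errorOrNull;
            done.countDown();
        }

        /** Waits up to timeoutMs; true only if completion arrived in time without error. */
        public boolean awaitSuccess(long timeoutMs) throws InterruptedException {
            return done.await(timeoutMs, TimeUnit.MILLISECONDS) && error == null;
        }
    }

    public static void main(String[] args) throws Exception {
        CompletionSignal signal = new CompletionSignal();
        // Simulate the SDK finishing on another thread after 100 ms.
        new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException ignored) {
                // nothing to clean up in this demo
            }
            signal.complete(null);
        }).start();
        System.out.println("finished in time: " + signal.awaitSuccess(5000));
    }
}
```

Compared with polling, the request thread wakes up as soon as the callback fires, and a stuck synthesis cannot hold the request forever.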

The PCM-to-WAV conversion code (convertAudioFiles) is adapted, with thanks, from: http://blog.csdn.net/xiunai78/article/details/6867331
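convertAudioFiles relies on a WaveHeader class that is not shown in this post. For readers who want to see what the 44 bytes contain, here is a self-contained sketch of the standard PCM WAV header layout, written with `ByteBuffer` in little-endian order. It is not the WaveHeader class from the credited article; the class and method names are made up, and the parameter values in `main` mirror the 16-bit, mono, 16000 Hz settings used above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class WavHeaderSketch {
    /** Builds the 44-byte header for uncompressed PCM data of pcmSize bytes. */
    public static byte[] build(int pcmSize, int sampleRate, int channels, int bitsPerSample) {
        int blockAlign = channels * bitsPerSample / 8; // bytes per sample frame
        int byteRate = sampleRate * blockAlign;        // bytes per second
        ByteBuffer b = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        b.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        b.putInt(pcmSize + 36);            // total size minus "RIFF" and this field
        b.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        b.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        b.putInt(16);                      // length of the fmt chunk for PCM
        b.putShort((short) 1);             // format tag 1 = PCM
        b.putShort((short) channels);
        b.putInt(sampleRate);
        b.putInt(byteRate);
        b.putShort((short) blockAlign);
        b.putShort((short) bitsPerSample);
        b.put("data".getBytes(StandardCharsets.US_ASCII));
        b.putInt(pcmSize);                 // length of the audio data that follows
        return b.array();
    }

    public static void main(String[] args) {
        // One second of 16-bit mono audio at 16000 Hz is 32000 bytes.
        byte[] h = build(32000, 16000, 1, 16);
        System.out.println("header length: " + h.length); // prints header length: 44
    }
}
```

If the sample rate in the header does not match the rate the audio was synthesized at, the file still plays but at the wrong speed and pitch, which is why the header values have to agree with the synthesizer's output format.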
