Python in Practice: An Audiobook Reader (聽書小子)

Original

2020-05-03 13:53

A while ago I came across this article on the WeChat official account 《Python愛好者社區》:

https://mp.weixin.qq.com/s?__biz=MzI5NDY1MjQzNA==&mid=2247489917&idx=5&sn=4e5aa7626d480368c3edaa543a474b2f&chksm=ec5ec600db294f16a7d490f6a24e8296acea2069903a8749faafd8c4212872d4a6c6b3392454&mpshare=1&scene=1&srcid=0428PUNAL4ubKu9LSuT9YVft&sharer_sharetime=1588077632124&sharer_shareid=ac3ae07c07d7eeaa77ecab49b86cc99e&key=b057c75bc90186ba1d681e6a3712efacfd6535366253fadb64c9750f9936979c2542da3451446b54c51a2cf4e6c1f0c093045a3dda0b93923a066ee13f55d37c804ac434662b53cdf4168ec1942e22aa&ascene=1&uin=MjU2NzUxMzExMQ%3D%3D&devicetype=Windows+10+x64&version=62090070&lang=zh_CN&exportkey=A6XkASyz8tDXvDktPa1WOgY%3D&pass_ticket=ffVVHy6oiS17%2BgZMUytcIopIS5vv%2FOPxyRRLXt21%2FS1spi6t%2BQw%2FjBmA4F3kbWKg

It looked fun, so a friend and I wrote our own version as a Python learning exercise.

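Speech synthesis

iFLYTEK (科大訊飛) limits its speech synthesis service to 500 free calls. Baidu's speech synthesis has no limit on the number of calls, but its QPS quota is 5, i.e. at most 5 calls per second. So we went straight for the Baidu speech synthesis API.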
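
The QPS quota of 5 starts to matter once several requests are sent in a row. Below is a minimal sketch of a client-side throttle, assuming a fixed spacing between calls is acceptable. The class name `RateLimiter` and its parameters are hypothetical and not part of the Baidu SDK.

```python
import time


class RateLimiter:
    """Space out calls so that at most `max_per_second` start per second.

    Hypothetical helper for illustration; not part of the Baidu SDK.
    """

    def __init__(self, max_per_second=5, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
        return delay


limiter = RateLimiter(max_per_second=5)
```

Calling `limiter.wait()` before each synthesis request keeps consecutive requests at least 0.2 seconds apart.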

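Register a Baidu account, log in to the Baidu AI platform, and create an application.

Note down the AppID, API Key, and Secret Key; they are needed when calling the API.

For the calling code, see: https://ai.baidu.com/ai-doc/SPEECH/Gk4nlz8tc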

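# client is an AipSpeech instance created with the AppID, API Key, and Secret Key
result = client.synthesis('你好百度', 'zh', 1, {
    'vol': 5,
})

# On success the call returns the audio as bytes; on error it returns a dict
# (see the error codes in the documentation)
if not isinstance(result, dict):
    with open('audio.mp3', 'wb') as f:
        f.write(result)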

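'zh' selects Chinese. 'vol' sets the volume; the voice itself is chosen with the 'per' option (according to the Baidu documentation, 0 is the standard female voice, 1 is the standard male voice, 3 is 度逍遙, and 4 is 度丫丫).

Other options adjust the volume, pitch, speaking rate, and so on.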
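
As a sketch of how these options fit together, the helper below builds the options dictionary passed as the fourth argument of `synthesis`. The function name `build_options` is hypothetical, and both the 0 to 15 range check and the default value of 5 are assumptions that should be checked against the Baidu documentation linked above. The keys `vol`, `pit`, `spd`, and `per` are the option names for volume, pitch, speaking rate, and voice.

```python
def build_options(vol=5, pit=5, spd=5, per=0):
    """Return an options dict for AipSpeech.synthesis.

    vol: volume, pit: pitch, spd: speaking rate, per: voice.
    The 0-15 range is an assumption; check the Baidu documentation.
    """
    for name, value in (("vol", vol), ("pit", pit), ("spd", spd)):
        if not 0 <= value <= 15:
            raise ValueError(f"{name} must be between 0 and 15, got {value}")
    return {"vol": vol, "pit": pit, "spd": spd, "per": per}


# Louder than the default, using voice 4 as in the player code of this post.
options = build_options(vol=8, per=4)
```

The resulting dictionary can be passed in place of the literal `{'vol': 5}` in the call above.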

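Fetching the novel content

The novel text comes from 筆趣閣 (biquge), because it is free and has no anti-scraping measures. First pick a novel and find its URL.

Here we use 《我真沒想重生啊》 as the example; its URL is http://www.biquge.info/69_69102/.

Looking at the page source, each chapter entry looks like this: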

<dd><a href="12893198.html" title="1、喝酒不開車">1、喝酒不開車</a></dd>

So the chapter name and chapter link can be extracted from the dd tags.
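
To illustrate the extraction on the sample line above without any network access, here is a minimal sketch that uses the standard library's `html.parser` instead of BeautifulSoup (a swapped-in technique, chosen so the example runs with no third-party packages). The class name `ChapterLinkParser` is hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ChapterLinkParser(HTMLParser):
    """Collect (title, href) pairs from <a> tags nested inside <dd> tags."""

    def __init__(self):
        super().__init__()
        self.in_dd = False
        self.current_href = None
        self.current_text = []
        self.chapters = []

    def handle_starttag(self, tag, attrs):
        if tag == "dd":
            self.in_dd = True
        elif tag == "a" and self.in_dd:
            self.current_href = dict(attrs).get("href")
            self.current_text = []

    def handle_data(self, data):
        # Only keep text that appears inside a chapter link.
        if self.current_href is not None:
            self.current_text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.current_href is not None:
            self.chapters.append(("".join(self.current_text), self.current_href))
            self.current_href = None
        elif tag == "dd":
            self.in_dd = False


sample = '<dd><a href="12893198.html" title="1、喝酒不開車">1、喝酒不開車</a></dd>'
parser = ChapterLinkParser()
parser.feed(sample)
title, href = parser.chapters[0]

# The href is relative, so resolve it against the novel's index URL.
full_url = urljoin("http://www.biquge.info/69_69102/", href)
```

Resolving the relative link with `urljoin` gives the same address as concatenating the index URL and the file name, but it also copes with absolute links.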

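def getChapters(self):
    """Fetch information for all chapters (chapter name and chapter URL)."""
    r = requests.get(self.main_url)
    r.encoding = 'utf-8'
    soup = BeautifulSoup(r.text, 'html.parser')

    # One <dd> per chapter
    cpts = soup.find_all('dd')
    num = 1
    for cpt in tqdm(cpts):
        # Skip entries whose text does not contain the expected chapter number
        if str(num) not in cpt.text:
            continue
        # Chapter name
        self.chapters_name[num] = cpt.text

        # Chapter URL (file name without the .html extension)
        url = cpt.find('a')
        self.chapters_url[num] = url.get('href').split('.')[0]

        num += 1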
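
The `str(num) not in cpt.text` check is a substring test, so it also accepts any title that merely contains the digits somewhere. A stricter variant, sketched here under the assumption that every real chapter title starts with its number followed by `、` as in the sample entry, matches the leading number with a regular expression. The function name `chapter_number` is hypothetical, and the second and third titles below are made-up test data.

```python
import re

# Assumed title format: "<number>、<name>", e.g. "1、喝酒不開車".
CHAPTER_RE = re.compile(r"^\s*(\d+)、")


def chapter_number(title):
    """Return the leading chapter number, or None if the title has none."""
    match = CHAPTER_RE.match(title)
    return int(match.group(1)) if match else None


titles = ["1、喝酒不開車", "最新章節 第10章", "2、第二章"]
numbers = [chapter_number(t) for t in titles]

# The substring test would wrongly accept the second title as chapter 1.
loose_match = "1" in titles[1]
```

With this helper, the loop could compare `chapter_number(cpt.text)` to `num` instead of searching for the digits anywhere in the text.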

Then use BeautifulSoup's find(attrs={"id":"content"}) to get the text of a given chapter.

def getContents(self, chapter):
    """Fetch the content of the given chapter."""
    r = requests.get(self.main_url + self.chapters_url[chapter] + '.html')
    r.encoding = 'utf-8'
    soup = BeautifulSoup(r.text, 'html.parser')
    # The chapter text lives in the element with id="content"
    content = soup.find(attrs={"id": "content"})
    soup_text = BeautifulSoup(str(content), 'lxml')
    # Strip non-breaking spaces
    text = soup_text.div.text.replace('\xa0', '')
    self.chapters_contens[chapter] = text
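
The `replace('\xa0','')` step strips non-breaking spaces (`&nbsp;` in the HTML) from the chapter text. A slightly fuller cleanup is sketched below. The function name `clean_chapter_text` is hypothetical, and dropping blank lines is an extra step that is not in the original code.

```python
def clean_chapter_text(raw):
    """Remove non-breaking spaces and blank lines from chapter text."""
    lines = (line.replace("\xa0", "").strip() for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


# Made-up sample text with non-breaking-space indentation and a blank line.
raw = "\xa0\xa0\xa0\xa0First paragraph.\n\n\xa0\xa0\xa0\xa0Second paragraph.\n"
cleaned = clean_chapter_text(raw)
```

Less whitespace also means fewer wasted characters when the text is later cut to a fixed length for synthesis.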

Playing the audio

We use the pygame library, following the same approach as the article mentioned above.

def play(self, chapter):
    """Play the given chapter."""
    # Download the chapter content first if it has not been fetched yet
    if chapter not in self.chapters_contens:
        self.getContents(chapter)

    # Call the Baidu speech synthesis API to turn the text into audio
    result = self.client.synthesis(self.chapters_contens[chapter][:1024], 'zh', 1, {"per": 4})
    if isinstance(result, dict):
        # A dict means an error; stop here instead of trying to play it
        print('Synthesis failed:', result)
        return

    # Play the audio
    pygame_mixer = pygame.mixer
    pygame_mixer.init(frequency=16000)
    byte_obj = BytesIO(result)
    pygame_mixer.music.load(byte_obj)
    pygame_mixer.music.play()
    while pygame_mixer.music.get_busy():
        time.sleep(0.1)
    pygame_mixer.music.stop()
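
The `[:1024]` slice means only the beginning of a chapter is read aloud. One way to cover the whole chapter, sketched here under the assumption that synthesizing and playing consecutive pieces is acceptable, is to split the text into chunks no longer than the limit, preferring to break right after sentence-ending punctuation. The function name `split_text` is hypothetical, and the limit of 1024 characters is simply the value used in the slice above, not a confirmed service limit.

```python
def split_text(text, limit=1024, breakers="。！？!?\n"):
    """Split text into chunks of at most `limit` characters.

    Each chunk ends just after the last sentence-ending character found
    within the limit; if there is none, the chunk is cut at the limit.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + limit, len(text))
        if end < len(text):
            # Look for the last sentence break inside this window.
            cut = max(text.rfind(ch, start, end) for ch in breakers)
            if cut >= start:
                end = cut + 1
        chunks.append(text[start:end])
        start = end
    return chunks


chunks = split_text("第一句。第二句！第三句？", limit=5)
```

Each chunk could then be passed to the synthesis call and played in turn, so the listener hears the full chapter rather than only its first 1024 characters.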

Final code

from bs4 import BeautifulSoup
import requests
from aip import AipSpeech
from tqdm import tqdm
import time
import pygame
from io import BytesIO

class listenWebNovel():
    def __init__(self):
        """Set up the novel URL, the caches, and the Baidu speech client."""
        self.main_url = 'http://www.biquge.info/69_69102/'
        self.chapters_name = dict()
        self.chapters_url = dict()
        self.chapters_contens = dict()
        self.APP_ID = ''      # replace with your own APP_ID
        self.API_KEY = ''     # replace with your own API_KEY
        self.SECRET_KEY = ''  # replace with your own SECRET_KEY
        self.client = AipSpeech(self.APP_ID, self.API_KEY, self.SECRET_KEY)
        self.getChapters()

    def getChapters(self):
        """Fetch information for all chapters (chapter name and chapter URL)."""
        r = requests.get(self.main_url)
        r.encoding = 'utf-8'
        soup = BeautifulSoup(r.text, 'html.parser')

        # One <dd> per chapter
        cpts = soup.find_all('dd')
        num = 1
        for cpt in tqdm(cpts):
            # Skip entries whose text does not contain the expected chapter number
            if str(num) not in cpt.text:
                continue
            # Chapter name
            self.chapters_name[num] = cpt.text

            # Chapter URL (file name without the .html extension)
            url = cpt.find('a')
            self.chapters_url[num] = url.get('href').split('.')[0]

            num += 1

    def getContents(self, chapter):
        """Fetch the content of the given chapter."""
        r = requests.get(self.main_url + self.chapters_url[chapter] + '.html')
        r.encoding = 'utf-8'
        soup = BeautifulSoup(r.text, 'html.parser')
        # The chapter text lives in the element with id="content"
        content = soup.find(attrs={"id": "content"})
        soup_text = BeautifulSoup(str(content), 'lxml')
        # Strip non-breaking spaces
        text = soup_text.div.text.replace('\xa0', '')
        self.chapters_contens[chapter] = text

    def play(self, chapter):
        """Play the given chapter."""
        # Download the chapter content first if it has not been fetched yet
        if chapter not in self.chapters_contens:
            self.getContents(chapter)

        # Call the Baidu speech synthesis API to turn the text into audio
        result = self.client.synthesis(self.chapters_contens[chapter][:1024], 'zh', 1, {"per": 4})
        if isinstance(result, dict):
            # A dict means an error; stop here instead of trying to play it
            print('Synthesis failed:', result)
            return

        # Play the audio
        pygame_mixer = pygame.mixer
        pygame_mixer.init(frequency=16000)
        byte_obj = BytesIO(result)
        pygame_mixer.music.load(byte_obj)
        pygame_mixer.music.play()
        while pygame_mixer.music.get_busy():
            time.sleep(0.1)
        pygame_mixer.music.stop()


if __name__ == '__main__':
    # Create the reader (this also fetches the chapter list)
    lw = listenWebNovel()
    # Play chapter 1
    lw.play(1)
