Recording with Audio Queue Services: A Hands-On Guide
1. Introduction
After opening several study-note series on this blog and never updating them, here I am starting something new. As for the ones I promised to finish: I will get to them. Definitely!
Audio Queue Services is Apple's software-object API for recording and playing audio. It lets you record and play audio in linear PCM, in compressed formats such as Apple Lossless and AAC, and in any other format for which the user has an installed codec. It also supports scheduled playback and synchronization of multiple audio queues, as well as synchronizing audio with video.
The above is adapted from the Apple documentation. In short, this API lets you process audio data in real time as it streams, which is extremely useful in some scenarios, because an ordinary audio player waits for the whole resource to finish loading before doing anything with it.
Why write this up? Because I just built it, and there seems to be surprisingly little material on how to actually use this API to record and play audio. Since I happened to need the recording side, here we go.
2. Main Content
2.1 About Audio Queue Services
I won't dwell on the background; both Baidu and Google will turn up plenty of explanations of what it is. My focus is on how to write the code, and that means talking about pointers. Yes, even in Swift we have to touch pointers, because this API is implemented in C and sits fairly low in the stack. In Swift every pointer type carries the Unsafe prefix, which tells you how the language feels about them, but there's no way around it 🤷♂️; we have to push through.
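Before the real code, here is a minimal, self-contained sketch (not from the recorder itself; the `Box` type is a stand-in I made up) of the bridging pattern the callback below relies on: a class instance crosses the C boundary as an UnsafeMutableRawPointer and is recovered on the other side with unsafeBitCast.

```swift
// A class instance travels through a C API as a raw pointer and comes back.
final class Box {
    var value = 0
}

let box = Box()
box.value = 42

// What AudioQueueNewInput's user-data parameter would carry:
let raw = unsafeBitCast(box, to: UnsafeMutableRawPointer.self)

// What the callback does to get the instance back:
let recovered = unsafeBitCast(raw, to: Box.self)
print(recovered.value) // 42 — same instance, round-tripped through a pointer
```

Note that unsafeBitCast performs no retain, so the original instance must be kept alive elsewhere for as long as the pointer is in use; in the recorder below, the service object owns the queue, so its lifetime covers the callbacks.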
The following widely circulated diagram reveals how Audio Queue Services works.
The buffers are the heart of it: our job is to fill a buffer, pull the data out of it, hand it back to the queue, and repeat.
2.2 Overall structure
As the callback in the diagram suggests, the whole thing consists of the recording queue, the buffers, and a callback; the callback is where we process each buffer. Here is the callback function I wrote.
2.2.1 The callback
func AQAudioQueueInputCallback(inUserData: UnsafeMutableRawPointer?,
                               inAQ: AudioQueueRef,
                               inBuffer: AudioQueueBufferRef,
                               inStartTime: UnsafePointer<AudioTimeStamp>,
                               inNumberPacketDescriptions: UInt32,
                               inPacketDescs: UnsafePointer<AudioStreamPacketDescription>?) {
    // Recover the AudioService instance passed in as the user-data pointer.
    let audioService = unsafeBitCast(inUserData!, to: AudioService.self)
    // Nothing was captured into this buffer; skip it.
    if inBuffer.pointee.mAudioDataByteSize == 0 {
        return
    }
    // Copy the captured packets out, then hand the buffer back to the queue.
    audioService.writePackets(inBuffer: inBuffer)
    AudioQueueEnqueueBuffer(inAQ, inBuffer, 0, nil)
    // Stop once the preallocated storage is full.
    if audioService.maxPacketCount <= audioService.startingPacketCount {
        audioService.stopRecord()
    }
}
Many of the declared parameters go unused here, and their types mostly speak for themselves, so I'll only call out two. The first is inUserData: this is the context pointer we hand to the queue when we create it, and here it carries our AudioService instance. The second is inBuffer, the buffer that actually contains the captured audio data.
The unsafeBitCast at the top converts that raw context pointer back into our AudioService object. This may look abrupt now, but it will make sense once you see the initialization. Then comes the check: if the buffer holds no data, the callback simply returns; otherwise we process it and call AudioQueueEnqueueBuffer to put the buffer back on the queue. The trailing 0 and nil are the packet-description count and array; they are only needed for variable-bit-rate formats, so for re-enqueuing a PCM recording buffer we pass zero and nil.
As everyone knows, the size of a raw allocation must be requested up front, which means our recording has a time limit. Perhaps it could be made unlimited, but I haven't worked that out, and my project didn't need unbounded recording, so I left it. That final check asks whether I've already filled my big preallocated block; once it is full, further audio is no longer stored.
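For concreteness, the arithmetic behind that limit, using the constants from the init below, works out as follows (this is just the capacity calculation, not code from the recorder):

```swift
// Capacity math behind the fixed-length recording buffer.
let sampleRate: UInt32 = 44100     // packets (frames) captured per second
let bytesPerPacket: UInt32 = 2     // 16-bit mono linear PCM
let seconds: UInt32 = 200          // chosen maximum duration

let maxPacketCount = sampleRate * seconds               // total packets to store
let bufferBytes = Int(maxPacketCount) * Int(bytesPerPacket)
// 44100 * 200 * 2 = 17_640_000 bytes, roughly 16.8 MB malloc'd up front
print(bufferBytes)
```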
2.2.2 init
var buffer: UnsafeMutableRawPointer
var audioQueueObject: AudioQueueRef?
var numPacketsToWrite: UInt32 = 44100 * 3
var startingPacketCount: UInt32
var maxPacketCount: UInt32
let bytesPerPacket: UInt32 = 2
let seconds: UInt32 = 200

var audioFormat: AudioStreamBasicDescription {
    return AudioStreamBasicDescription(mSampleRate: 44100.0,
                                       mFormatID: kAudioFormatLinearPCM,
                                       mFormatFlags: AudioFormatFlags(kLinearPCMFormatFlagIsSignedInteger | kLinearPCMFormatFlagIsPacked),
                                       mBytesPerPacket: 2,
                                       mFramesPerPacket: 1,
                                       mBytesPerFrame: 2,
                                       mChannelsPerFrame: 1,
                                       mBitsPerChannel: 16,
                                       mReserved: 0)
}

var data: NSData? {
    didSet {
        NotificationCenter.default.post(name: .audioServiceDidUpdateData, object: self)
    }
}

init(_ obj: Any?) {
    startingPacketCount = 0
    maxPacketCount = 44100 * seconds
    buffer = UnsafeMutableRawPointer(malloc(Int(maxPacketCount * bytesPerPacket)))
}
The properties, one by one:
buffer: a raw pointer that stores the entire recording
audioQueueObject: a reference to the audio queue, used to start and stop recording and so on
numPacketsToWrite: the capacity of each queue buffer; 44100 is the sample rate, and * 3 means one buffer holds three seconds of audio
startingPacketCount: a running count of packets written so far, so we never exceed the maximum recording length
maxPacketCount: the packet count corresponding to the maximum recording length
bytesPerPacket: the number of bytes per packet
seconds: the maximum recording length
audioFormat is the usual pre-recording setup: sample rate, channel count, and so on.
data: a property of my own for temporarily pulling some of the audio out for processing
The init function then initializes these properties.
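One thing worth double-checking whenever you fill in an AudioStreamBasicDescription is that its size fields agree with each other. For interleaved linear PCM they are related as follows (a quick sanity check on the values hard-coded above, not part of the recorder):

```swift
// For interleaved linear PCM the ASBD size fields must satisfy:
//   mBytesPerFrame  = mChannelsPerFrame * mBitsPerChannel / 8
//   mBytesPerPacket = mFramesPerPacket  * mBytesPerFrame
let channelsPerFrame: UInt32 = 1   // mono
let bitsPerChannel: UInt32 = 16
let framesPerPacket: UInt32 = 1    // always 1 for linear PCM

let bytesPerFrame = channelsPerFrame * bitsPerChannel / 8
let bytesPerPacket = framesPerPacket * bytesPerFrame
// Both come out to 2, matching mBytesPerFrame and mBytesPerPacket above.
```

If you change the channel count or bit depth, these derived fields (and the bytesPerPacket constant used for the buffer math) must change with them, or the queue will misinterpret the data.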
2.2.3 start & stop
func startRecord() {
    // Ignore the call if a queue is already running.
    guard audioQueueObject == nil else { return }
    // Reset the storage and discard any previously captured data.
    buffer = UnsafeMutableRawPointer(malloc(Int(maxPacketCount * bytesPerPacket)))
    data = nil
    prepareForRecord()
    let err: OSStatus = AudioQueueStart(audioQueueObject!, nil)
    print("err: \(err)")
}
startRecord mostly just reinitializes the state to guard against stale data; the important work happens in prepareForRecord. Once preparation is done, recording begins.
private func prepareForRecord() {
    print("prepareForRecord")
    var audioFormat = self.audioFormat
    // Create a recording (input) audio queue with our format and callback.
    AudioQueueNewInput(&audioFormat,
                       AQAudioQueueInputCallback,
                       unsafeBitCast(self, to: UnsafeMutableRawPointer.self),
                       CFRunLoopGetCurrent(),
                       CFRunLoopMode.commonModes.rawValue,
                       0,
                       &audioQueueObject)
    startingPacketCount = 0
    // Allocate three buffers and enqueue them so the queue can start filling them.
    var buffers = Array<AudioQueueBufferRef?>(repeating: nil, count: 3)
    let bufferByteSize: UInt32 = numPacketsToWrite * audioFormat.mBytesPerPacket
    for bufferIndex in 0 ..< buffers.count {
        AudioQueueAllocateBuffer(audioQueueObject!, bufferByteSize, &buffers[bufferIndex])
        AudioQueueEnqueueBuffer(audioQueueObject!, buffers[bufferIndex]!, 0, nil)
    }
}
Here's a small but important point. AudioQueueNewInput creates a new recording audio queue object. The first argument is the audio format; the second is the callback function; the third is the user data handed to that callback, which here is a raw pointer to self; the fourth and fifth select the run loop and mode the callback is invoked on (I pass the current run loop and the common modes; passing NULL for the run loop would instead use one of the queue's own internal threads); the sixth is reserved and must be 0; and the last is the output parameter that receives the newly created recording queue.
The remaining lines create three three-second buffers; the loop allocates each one and immediately enqueues it, so the queue always has a buffer ready to fill and nothing gets dropped.
func stopRecord() {
    AudioQueueStop(audioQueueObject!, true)
    AudioQueueDispose(audioQueueObject!, true)
    audioQueueObject = nil
    // Wrap everything written so far into a single NSData object.
    data = NSData(bytesNoCopy: buffer, length: Int(startingPacketCount * bytesPerPacket))
}
This stops the queue first (passing true makes AudioQueueStop immediate and synchronous, and true to AudioQueueDispose disposes the queue and its buffers right away), and then data ends up holding the entire recording captured so far.
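What you do with data next is up to you. As one illustration (the file name and the use of the temporary directory are my own choices, not from the project; stand-in bytes play the role of the captured PCM), the raw capture could simply be persisted to disk:

```swift
import Foundation

// Stand-in bytes play the role of the PCM held by `data` after stopRecord().
let pcm = Data([0x01, 0x02, 0x03, 0x04])
let url = FileManager.default.temporaryDirectory
    .appendingPathComponent("capture.pcm")

try pcm.write(to: url)                   // persist the capture
let readBack = try Data(contentsOf: url) // verify it round-trips
```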
2.2.4 Per-buffer custom processing
I call it custom because everyone processes the buffers according to their own project's needs. In my case, each segment gets wrapped in a WAV container and uploaded over the network.
func writePackets(inBuffer: AudioQueueBufferRef) {
    // How many packets this buffer delivered.
    var numPackets: UInt32 = inBuffer.pointee.mAudioDataByteSize / bytesPerPacket
    // Clamp so we never write past the preallocated storage.
    if (maxPacketCount - startingPacketCount) < numPackets {
        numPackets = maxPacketCount - startingPacketCount
    }
    /*
    ** do what you wanna do
    */
    if 0 < numPackets {
        // Append this buffer's bytes at the current write offset.
        memcpy(buffer.advanced(by: Int(bytesPerPacket * startingPacketCount)),
               inBuffer.pointee.mAudioData,
               Int(bytesPerPacket * numPackets))
        startingPacketCount += numPackets
    }
}
This one is fairly direct: the first half computes how much to copy, and the second half moves the bytes into buffer (the raw pointer declared earlier).
PS: what recording gives us is raw PCM data; to turn it into a WAV file we still need to prepend a header. Here is the code for that.
public func writewaveFileHeader(input: NSData, totalAudioLen: Int64, totalDataLen: Int64, longSampleRate: Int64, channels: Int, byteRate: Int64) -> NSMutableData {
    var header: [UInt8] = Array(repeating: 0, count: 44)
    let postData: NSMutableData = NSMutableData()
    // RIFF chunk descriptor
    header[0] = UInt8(ascii: "R")
    header[1] = UInt8(ascii: "I")
    header[2] = UInt8(ascii: "F")
    header[3] = UInt8(ascii: "F")
    // Chunk size: total file length minus 8, little-endian
    header[4] = UInt8(totalDataLen & 0xff)
    header[5] = UInt8((totalDataLen >> 8) & 0xff)
    header[6] = UInt8((totalDataLen >> 16) & 0xff)
    header[7] = UInt8((totalDataLen >> 24) & 0xff)
    // WAVE
    header[8] = UInt8(ascii: "W")
    header[9] = UInt8(ascii: "A")
    header[10] = UInt8(ascii: "V")
    header[11] = UInt8(ascii: "E")
    // "fmt " sub-chunk
    header[12] = UInt8(ascii: "f")
    header[13] = UInt8(ascii: "m")
    header[14] = UInt8(ascii: "t")
    header[15] = UInt8(ascii: " ")
    // Size of the "fmt " sub-chunk: 16 for PCM
    header[16] = 16
    header[17] = 0
    header[18] = 0
    header[19] = 0
    // Audio format: 1 = linear PCM
    header[20] = 1
    header[21] = 0
    // Number of channels
    header[22] = UInt8(channels)
    header[23] = 0
    // Sample rate, little-endian
    header[24] = UInt8(longSampleRate & 0xff)
    header[25] = UInt8((longSampleRate >> 8) & 0xff)
    header[26] = UInt8((longSampleRate >> 16) & 0xff)
    header[27] = UInt8((longSampleRate >> 24) & 0xff)
    // Byte rate = sampleRate * channels * bitsPerSample / 8
    header[28] = UInt8(byteRate & 0xff)
    header[29] = UInt8((byteRate >> 8) & 0xff)
    header[30] = UInt8((byteRate >> 16) & 0xff)
    header[31] = UInt8((byteRate >> 24) & 0xff)
    // Block align = channels * bitsPerSample / 8
    header[32] = UInt8(channels * 16 / 8)
    header[33] = 0
    // Bits per sample
    header[34] = 16
    header[35] = 0
    // "data" sub-chunk
    header[36] = UInt8(ascii: "d")
    header[37] = UInt8(ascii: "a")
    header[38] = UInt8(ascii: "t")
    header[39] = UInt8(ascii: "a")
    // Size of the audio payload in bytes, little-endian
    header[40] = UInt8(totalAudioLen & 0xff)
    header[41] = UInt8((totalAudioLen >> 8) & 0xff)
    header[42] = UInt8((totalAudioLen >> 16) & 0xff)
    header[43] = UInt8((totalAudioLen >> 24) & 0xff)
    postData.append(header, length: header.count)
    postData.append(input as Data)
    return postData
}
Just fill in the arguments against this function body and you're done. (Note that the block-align field at header[32] must be channels * bitsPerSample / 8; the original hard-coded 2 * 16 / 8, which is wrong for the mono format recorded here, so the version above derives it from channels.)
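As a worked example of what filling in those arguments looks like for the mono, 44.1 kHz, 16-bit format recorded above (the stand-in Data replaces a real capture, since the values only depend on its length):

```swift
import Foundation

// Stand-in for the PCM that stopRecord() places in `data`.
let pcm = Data(repeating: 0, count: 88_200)            // 1 second of mono 16-bit PCM

let audioLen = Int64(pcm.count)                        // "data" payload bytes
let dataLen = audioLen + 36                            // RIFF chunk size = file size - 8
let sampleRate: Int64 = 44100
let channels = 1
let byteRate = sampleRate * Int64(channels) * 16 / 8   // bytes per second

// With a real AudioService instance these feed straight into:
// writewaveFileHeader(input: pcm as NSData, totalAudioLen: audioLen,
//                     totalDataLen: dataLen, longSampleRate: sampleRate,
//                     channels: channels, byteRate: byteRate)
```

The + 36 comes from the 44-byte header minus the 8 bytes of the RIFF tag and size field themselves, which the chunk size does not count.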
3. Summary
Audio Queue Services lives in the AudioToolbox framework, so you need to import AudioToolbox. None of this looks hard, but once you start scrutinizing the pointer operations, that's a different story.
I hope I can get around to filling in those earlier blog posts soon. Sigh.