This article is a follow-up to Processing Video with AVFoundation. The previous article covered the limitations of AVAssetExportSession; a better approach is to re-encode the video with AVAssetWriter.
Compared with AVAssetExportSession, AVAssetWriter's advantage is much finer-grained control over compression when encoding the output: you can specify settings such as the key-frame interval, video bit rate, pixel aspect ratio, clean aperture, and H.264 profile level.
Basics
- AVAssetReader — reads an asset (think of it as decoding)
- AVAssetReaderOutput — configures how the asset is read
  - AVAssetReaderTrackOutput
  - AVAssetReaderVideoCompositionOutput
  - AVAssetReaderAudioMixOutput
- AVAssetWriter — writes an asset (think of it as encoding)
- AVAssetWriterInput — configures the encoder's input
- CMSampleBuffer — the buffered media data
AVAssetReader and AVAssetReaderOutput are used together; they determine how the asset is decoded into buffer data.
AVAssetWriter and AVAssetWriterInput are used together; they determine how that data is encoded back into a video file.
CMSampleBuffer carries the sample data: an AVAssetReader outputs CMSampleBuffers, and an AVAssetWriter re-encodes those CMSampleBuffers into a video.
AVAssetReader
AVAssetReader provides services for obtaining media data from an asset.
AVAssetReader reads media data from an asset. Each AVAssetReader is associated with a single AVAsset; since an AVAsset may contain multiple tracks, one AVAssetReader can read data from multiple tracks.
If you need an AVAssetReader to read from multiple AVAssets, combine them into a single AVComposition and associate the AVAssetReader with that composition.
To read data, an AVAssetReader must have outputs (AVAssetReaderOutput) added to configure how the media data is read; you can add a different output for each track:
open var outputs: [AVAssetReaderOutput] { get }
open func add(_ output: AVAssetReaderOutput)
Once the outputs are set up, call startReading to begin reading:
open func startReading() -> Bool
AVAssetReaderOutput
The base class for read configuration; in practice you use one of its subclasses:
- AVAssetReaderTrackOutput — reads samples from a single track
- AVAssetReaderVideoCompositionOutput — reads video through an AVVideoComposition; like AVAssetExportSession's videoComposition, it can adjust the video's size, background, and so on
- AVAssetReaderAudioMixOutput — reads audio through an AVAudioMix; like AVAssetExportSession's audioMix, it can adjust the audio
Call copyNextSampleBuffer to fetch the decoded sample data:
open func copyNextSampleBuffer() -> CMSampleBuffer?
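For illustration, a minimal drain loop might look like the sketch below (`reader` and `output` are assumed to be configured as described above; the function name is illustrative):

```swift
import AVFoundation

// Sketch: pull every sample from one configured output.
func drainSamples(reader: AVAssetReader, output: AVAssetReaderOutput) {
    guard reader.startReading() else { return }
    while let sample = output.copyNextSampleBuffer() {
        let pts = CMSampleBufferGetPresentationTimeStamp(sample)
        print("sample at \(CMTimeGetSeconds(pts))s")
    }
    // copyNextSampleBuffer returning nil can mean either finished or failed —
    // check reader.status / reader.error to tell them apart.
}
```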
AVAssetWriter
AVAssetWriter provides services for writing media data to a new file
AVAssetWriter writes media data to a single new file, in a specified container format and with specified settings.
Unlike AVAssetReader, an AVAssetWriter is not tied to a single AVAsset; it can write data coming from multiple sources.
To write data, an AVAssetWriter must have inputs (AVAssetWriterInput) added to configure how the media data is written:
open var inputs: [AVAssetWriterInput] { get }
open func add(_ input: AVAssetWriterInput)
Once the inputs are set up, call startWriting to begin writing:
open func startWriting() -> Bool
Then start a writing session:
open func startSession(atSourceTime startTime: CMTime)
When writing is complete, end the session:
// Marks writing as finished; this also ends the session
open func finishWriting() async
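Putting those calls in order, a minimal sketch of the writer lifecycle looks like this (the `input` and `buffers` parameters are placeholders; real code drives appending via requestMediaDataWhenReady, shown below):

```swift
import AVFoundation

// Sketch of the AVAssetWriter lifecycle, under the assumption that
// `input` is configured and `buffers` already holds the samples to write.
func write(to url: URL, input: AVAssetWriterInput, buffers: [CMSampleBuffer]) throws {
    let writer = try AVAssetWriter(outputURL: url, fileType: .mp4)
    writer.add(input)
    writer.startWriting()                     // 1. begin writing
    writer.startSession(atSourceTime: .zero)  // 2. open the session
    for buffer in buffers where input.isReadyForMoreMediaData {
        input.append(buffer)                  // 3. append samples
    }
    input.markAsFinished()                    // 4. finish this input
    writer.finishWriting {                    // 5. finish (also ends the session)
        print("completed: \(writer.status == .completed)")
    }
}
```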
AVAssetWriterInput
The input configuration; you create a separate input for each media type.
// Whether the input is ready to accept more media data
open var isReadyForMoreMediaData: Bool { get }
// Invokes the block whenever the input is ready for data
open func requestMediaDataWhenReady(on queue: DispatchQueue, using block: @escaping () -> Void)
// Append a sample buffer
open func append(_ sampleBuffer: CMSampleBuffer) -> Bool
// Mark this input as finished
open func markAsFinished()
Note that AVAssetWriter and AVAssetReader do not have to be used as a pair. An AVAssetWriter only needs sample buffers, and those can come from many places: from an AVAssetReader, from the live stream while the camera records video, or converted from image data. Converting images into a video is implemented in detail below.
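For instance, when the sample buffers come from the camera instead of an AVAssetReader, the capture delegate can feed the writer input directly. A sketch (the capture-session setup is omitted and the property names are illustrative):

```swift
import AVFoundation

final class CaptureRecorder: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    var writer: AVAssetWriter!        // assumed already configured and started
    var videoInput: AVAssetWriterInput!

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // Append each live frame straight into the writer input
        if videoInput.isReadyForMoreMediaData {
            videoInput.append(sampleBuffer)
        }
    }
}
```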
outputSettings
Both AVAssetReaderOutput and AVAssetWriterInput take an outputSettings dictionary; these settings are the real core of controlling how video is decoded and encoded.
AVVideoSettings
- AVVideoCodecKey — codec
- AVVideoWidthKey — width in pixels
- AVVideoHeightKey — height in pixels
- AVVideoCompressionPropertiesKey — compression settings:
  - AVVideoAverageBitRateKey — bits per second; about 3,000,000 suits 720×1280
  - AVVideoProfileLevelKey — H.264 profile, from low to high: BP (Baseline), EP (Extended), MP (Main), HP (High)
  - AVVideoMaxKeyFrameIntervalKey — maximum key-frame interval
AVAudioSettings
- AVFormatIDKey — audio format
- AVNumberOfChannelsKey — number of channels
- AVSampleRateKey — sample rate
- AVEncoderBitRateKey — encoder bit rate
Code Implementation
As usual, the UML diagram first.
Merging multiple videos into one
// Create the composition and its editable tracks
let composition = AVMutableComposition()
// Video track
let videoCompositionTrack = composition.addMutableTrack(withMediaType: .video, preferredTrackID: kCMPersistentTrackID_Invalid)
// Audio track
let audioCompositionTrack = composition.addMutableTrack(withMediaType: .audio, preferredTrackID: kCMPersistentTrackID_Invalid)
var insertTime = CMTime.zero
for url in urls {
    autoreleasepool {
        // Load the asset and pick out its video and audio tracks
        let asset = AVURLAsset(url: url)
        let videoTrack = asset.tracks(withMediaType: .video).first
        let audioTrack = asset.tracks(withMediaType: .audio).first
        let videoTimeRange = videoTrack?.timeRange
        let audioTimeRange = audioTrack?.timeRange
        // Append each asset's video track onto the single composition track (AVMutableCompositionTrack)
        if let insertVideoTrack = videoTrack, let insertVideoTime = videoTimeRange {
            do {
                // Insert the track's time range at the current insertion point
                try videoCompositionTrack?.insertTimeRange(CMTimeRange(start: .zero, duration: insertVideoTime.duration), of: insertVideoTrack, at: insertTime)
            } catch let e {
                callback(false, e)
                return
            }
        }
        // Append each asset's audio track onto the single composition track (AVMutableCompositionTrack)
        if let insertAudioTrack = audioTrack, let insertAudioTime = audioTimeRange {
            do {
                try audioCompositionTrack?.insertTimeRange(CMTimeRange(start: .zero, duration: insertAudioTime.duration), of: insertAudioTrack, at: insertTime)
            } catch let e {
                callback(false, e)
                return
            }
        }
        insertTime = insertTime + asset.duration
    }
}
// ----- Reading -----
let videoTracks = composition.tracks(withMediaType: .video)
let audioTracks = composition.tracks(withMediaType: .audio)
guard let videoTrack = videoTracks.first, let audioTrack = audioTracks.first else {
    callback(false, nil)
    return
}
// AVAssetReader
do {
    reader = try AVAssetReader(asset: composition)
} catch let e {
    callback(false, e)
    return
}
reader.timeRange = CMTimeRange(start: .zero, duration: composition.duration)
// Uncompressed audio/video settings (AVAssetReaderTrackOutput requires an uncompressed configuration)
let audioOutputSetting = [
    AVFormatIDKey: kAudioFormatLinearPCM
]
let videoOutputSetting = [
    kCVPixelBufferPixelFormatTypeKey as String: UInt32(kCVPixelFormatType_422YpCbCr8)
]
videoOutput = AVAssetReaderTrackOutput(track: videoTrack, outputSettings: videoOutputSetting)
videoOutput.alwaysCopiesSampleData = false
if reader.canAdd(videoOutput) {
    reader.add(videoOutput)
}
audioOutput = AVAssetReaderTrackOutput(track: audioTrack, outputSettings: audioOutputSetting)
audioOutput.alwaysCopiesSampleData = false
if reader.canAdd(audioOutput) {
    reader.add(audioOutput)
}
reader.startReading()
// ----- Writing -----
// AVAssetWriter
do {
    writer = try AVAssetWriter(outputURL: outputUrl, fileType: .mp4)
} catch let e {
    callback(false, e)
    return
}
writer.shouldOptimizeForNetworkUse = true
let videoInputSettings: [String : Any] = [
    AVVideoCodecKey: AVVideoCodecType.h264,
    AVVideoWidthKey: 720,
    AVVideoHeightKey: 1280,
    AVVideoCompressionPropertiesKey: [
        AVVideoAverageBitRateKey: 1000000,
        AVVideoProfileLevelKey: AVVideoProfileLevelH264High40
    ]
]
let audioInputSettings: [String : Any] = [
    AVFormatIDKey: NSNumber(value: kAudioFormatMPEG4AAC),
    AVNumberOfChannelsKey: NSNumber(value: 2),
    AVSampleRateKey: NSNumber(value: 44100),
    AVEncoderBitRateKey: NSNumber(value: 128000)
]
// AVAssetWriterInput
videoInput = AVAssetWriterInput(mediaType: .video, outputSettings: videoInputSettings)
if writer.canAdd(videoInput) {
    writer.add(videoInput)
}
audioInput = AVAssetWriterInput(mediaType: .audio, outputSettings: audioInputSettings)
if writer.canAdd(audioInput) {
    writer.add(audioInput)
}
writer.startWriting()
writer.startSession(atSourceTime: .zero)
// Feed the data
writeGroup.enter()
videoInput.requestMediaDataWhenReady(on: inputQueue) { [weak self] in
    guard let wself = self else {
        callback(false, nil)
        return
    }
    if wself.encodeReadySamples(from: wself.videoOutput, to: wself.videoInput) {
        wself.writeGroup.leave()
    }
}
writeGroup.enter()
audioInput.requestMediaDataWhenReady(on: inputQueue) { [weak self] in
    guard let wself = self else {
        callback(false, nil)
        return
    }
    if wself.encodeReadySamples(from: wself.audioOutput, to: wself.audioInput) {
        wself.writeGroup.leave()
    }
}
writeGroup.notify(queue: inputQueue) {
    self.writer.finishWriting {
        callback(true, nil)
    }
}
Merging multiple videos (with a VideoComposition and AudioMix)
let composition = AVMutableComposition()
guard let videoCompositionTrack = composition.addMutableTrack(withMediaType: .video, preferredTrackID: kCMPersistentTrackID_Invalid) else {
    callback(false, nil)
    return
}
let audioCompositionTrack = composition.addMutableTrack(withMediaType: .audio, preferredTrackID: kCMPersistentTrackID_Invalid)
// layerInstruction adjusts the video layer
let vcLayerInstruction = AVMutableVideoCompositionLayerInstruction(assetTrack: videoCompositionTrack)
var layerInstructions = [vcLayerInstruction]
var audioParameters: [AVMutableAudioMixInputParameters] = []
var insertTime = CMTime.zero
for url in urls {
    autoreleasepool {
        let asset = AVURLAsset(url: url)
        let videoTrack = asset.tracks(withMediaType: .video).first
        let audioTrack = asset.tracks(withMediaType: .audio).first
        let videoTimeRange = videoTrack?.timeRange
        let audioTimeRange = audioTrack?.timeRange
        if let insertVideoTrack = videoTrack, let insertVideoTime = videoTimeRange {
            do {
                try videoCompositionTrack.insertTimeRange(CMTimeRange(start: .zero, duration: insertVideoTime.duration), of: insertVideoTrack, at: insertTime)
                // Adjust the transform for orientation and size
                var trans = insertVideoTrack.preferredTransform
                let size = insertVideoTrack.naturalSize
                let orientation = VideoEditHelper.orientationFromVideo(assetTrack: insertVideoTrack)
                switch orientation {
                case .portrait:
                    let scale = MMAssetExporter.renderSize.height / size.width
                    trans = CGAffineTransform(scaleX: scale, y: scale)
                    trans = trans.translatedBy(x: size.height, y: 0)
                    trans = trans.rotated(by: .pi / 2.0)
                case .landscapeLeft:
                    let scale = MMAssetExporter.renderSize.width / size.width
                    trans = CGAffineTransform(scaleX: scale, y: scale)
                    trans = trans.translatedBy(x: size.width, y: size.height + (MMAssetExporter.renderSize.height - size.height * scale) / scale / 2.0)
                    trans = trans.rotated(by: .pi)
                case .portraitUpsideDown:
                    let scale = MMAssetExporter.renderSize.height / size.width
                    trans = CGAffineTransform(scaleX: scale, y: scale)
                    trans = trans.translatedBy(x: 0, y: size.width)
                    trans = trans.rotated(by: .pi / 2.0 * 3)
                case .landscapeRight:
                    // Default orientation
                    let scale = MMAssetExporter.renderSize.width / size.width
                    trans = CGAffineTransform(scaleX: scale, y: scale)
                    trans = trans.translatedBy(x: 0, y: (MMAssetExporter.renderSize.height - size.height * scale) / scale / 2.0)
                }
                vcLayerInstruction.setTransform(trans, at: insertTime)
                layerInstructions.append(vcLayerInstruction)
            } catch let e {
                callback(false, e)
                return
            }
        }
        if let insertAudioTrack = audioTrack, let insertAudioTime = audioTimeRange {
            do {
                try audioCompositionTrack?.insertTimeRange(CMTimeRange(start: .zero, duration: insertAudioTime.duration), of: insertAudioTrack, at: insertTime)
                let adParameter = AVMutableAudioMixInputParameters(track: insertAudioTrack)
                adParameter.setVolume(1, at: .zero)
                audioParameters.append(adParameter)
            } catch let e {
                callback(false, e)
                return
            }
        }
        insertTime = insertTime + asset.duration
    }
}
let videoTracks = composition.tracks(withMediaType: .video)
let audioTracks = composition.tracks(withMediaType: .audio)
let videoComposition = AVMutableVideoComposition()
// videoComposition must specify a frame rate (frameDuration) and a renderSize
videoComposition.frameDuration = CMTime(value: 1, timescale: 30)
videoComposition.renderSize = MMAssetExporter.renderSize
let vcInstruction = AVMutableVideoCompositionInstruction()
vcInstruction.timeRange = CMTimeRange(start: .zero, duration: composition.duration)
vcInstruction.backgroundColor = UIColor.red.cgColor // the video's background color
vcInstruction.layerInstructions = layerInstructions
videoComposition.instructions = [vcInstruction]
let audioMix = AVMutableAudioMix()
audioMix.inputParameters = audioParameters
// AVAssetReader
do {
    reader = try AVAssetReader(asset: composition)
} catch let e {
    callback(false, e)
    return
}
reader.timeRange = CMTimeRange(start: .zero, duration: composition.duration)
// AVAssetReaderOutput
videoOutput = AVAssetReaderVideoCompositionOutput(videoTracks: videoTracks, videoSettings: nil)
videoOutput.alwaysCopiesSampleData = false
videoOutput.videoComposition = videoComposition
if reader.canAdd(videoOutput) {
    reader.add(videoOutput)
}
audioOutput = AVAssetReaderAudioMixOutput(audioTracks: audioTracks, audioSettings: nil)
audioOutput.alwaysCopiesSampleData = false
audioOutput.audioMix = audioMix
if reader.canAdd(audioOutput) {
    reader.add(audioOutput)
}
if !reader.startReading() {
    callback(false, reader.error)
    return
}
// ----- Writing -----
// AVAssetWriter
do {
    writer = try AVAssetWriter(outputURL: outputUrl, fileType: .mp4)
} catch let e {
    callback(false, e)
    return
}
writer.shouldOptimizeForNetworkUse = true
let videoInputSettings: [String : Any] = [
    AVVideoCodecKey: AVVideoCodecType.h264,
    AVVideoWidthKey: 720,
    AVVideoHeightKey: 1280,
    AVVideoCompressionPropertiesKey: [
        AVVideoAverageBitRateKey: 1000000,
        AVVideoProfileLevelKey: AVVideoProfileLevelH264High40
    ]
]
let audioInputSettings: [String : Any] = [
    AVFormatIDKey: NSNumber(value: kAudioFormatMPEG4AAC),
    AVNumberOfChannelsKey: NSNumber(value: 2),
    AVSampleRateKey: NSNumber(value: 44100),
    AVEncoderBitRateKey: NSNumber(value: 128000)
]
// AVAssetWriterInput
videoInput = AVAssetWriterInput(mediaType: .video, outputSettings: videoInputSettings)
if writer.canAdd(videoInput) {
    writer.add(videoInput)
}
audioInput = AVAssetWriterInput(mediaType: .audio, outputSettings: audioInputSettings)
if writer.canAdd(audioInput) {
    writer.add(audioInput)
}
writer.startWriting()
writer.startSession(atSourceTime: .zero)
// Feed the data
writeGroup.enter()
videoInput.requestMediaDataWhenReady(on: inputQueue) { [weak self] in
    guard let wself = self else {
        callback(false, nil)
        return
    }
    if wself.encodeReadySamples(from: wself.videoOutput, to: wself.videoInput) {
        wself.writeGroup.leave()
    }
}
writeGroup.enter()
audioInput.requestMediaDataWhenReady(on: inputQueue) { [weak self] in
    guard let wself = self else {
        callback(false, nil)
        return
    }
    if wself.encodeReadySamples(from: wself.audioOutput, to: wself.audioInput) {
        wself.writeGroup.leave()
    }
}
writeGroup.notify(queue: inputQueue) {
    self.writer.finishWriting {
        callback(true, nil)
    }
}
When a VideoComposition or AudioMix is set, the AVAssetReader outputs must be AVAssetReaderVideoCompositionOutput and AVAssetReaderAudioMixOutput.
Just as with AVAssetExportSession, setting a VideoComposition and AudioMix lets you adjust the video's size, rotation, background color, and volume, and likewise add watermarks and similar overlays.
Compared with AVAssetExportSession, the work of separating and composing assets and tracks is exactly the same when using AVAssetReader and AVAssetWriter; the pair can be wrapped up into something with an AVAssetExportSession-like interface:
public var composition: AVComposition!
public var videoComposition: AVVideoComposition!
public var audioMix: AVAudioMix!
public var outputUrl: URL!
public var videoInputSettings: [String : Any]?
public var videoOutputSettings: [String : Any]?
public var audioInputSettings: [String : Any]?
public var audioOutputSettings: [String : Any]?

public func exportAsynchronously(completionHandler callback: @escaping VideoResult) {
    let videoTracks = composition.tracks(withMediaType: .video)
    let audioTracks = composition.tracks(withMediaType: .audio)
    do {
        reader = try AVAssetReader(asset: composition)
    } catch let e {
        callback(false, e)
        return
    }
    reader.timeRange = CMTimeRange(start: .zero, duration: composition.duration)
    videoOutput = AVAssetReaderVideoCompositionOutput(videoTracks: videoTracks, videoSettings: videoOutputSettings)
    videoOutput.alwaysCopiesSampleData = false
    videoOutput.videoComposition = videoComposition
    if reader.canAdd(videoOutput) {
        reader.add(videoOutput)
    }
    audioOutput = AVAssetReaderAudioMixOutput(audioTracks: audioTracks, audioSettings: audioOutputSettings)
    audioOutput.alwaysCopiesSampleData = false
    audioOutput.audioMix = audioMix
    if reader.canAdd(audioOutput) {
        reader.add(audioOutput)
    }
    if !reader.startReading() {
        callback(false, reader.error)
        return
    }
    // ----- Writing -----
    do {
        writer = try AVAssetWriter(outputURL: outputUrl, fileType: .mp4)
    } catch let e {
        callback(false, e)
        return
    }
    writer.shouldOptimizeForNetworkUse = true
    // AVAssetWriterInput
    videoInput = AVAssetWriterInput(mediaType: .video, outputSettings: videoInputSettings)
    if writer.canAdd(videoInput) {
        writer.add(videoInput)
    }
    audioInput = AVAssetWriterInput(mediaType: .audio, outputSettings: audioInputSettings)
    if writer.canAdd(audioInput) {
        writer.add(audioInput)
    }
    writer.startWriting()
    writer.startSession(atSourceTime: .zero)
    // Feed the data:
    // videoInput.requestMediaDataWhenReady
    // audioInput.requestMediaDataWhenReady
    // encodeReadySamples
    ...
}
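With that wrapper (called `MMAssetExporter` here, matching the class name used in the listings above), exporting then reads much like AVAssetExportSession. A hypothetical usage sketch:

```swift
// Sketch: driving the wrapper; `composition`, `videoComposition`,
// `audioMix`, and `outputUrl` are assumed to be built as shown earlier.
let exporter = MMAssetExporter()
exporter.composition = composition
exporter.videoComposition = videoComposition
exporter.audioMix = audioMix
exporter.outputUrl = outputUrl
exporter.videoInputSettings = [
    AVVideoCodecKey: AVVideoCodecType.h264,
    AVVideoWidthKey: 720,
    AVVideoHeightKey: 1280
]
exporter.exportAsynchronously { success, error in
    print(success ? "exported" : "failed: \(String(describing: error))")
}
```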
Composing images into a video
do {
    writer = try AVAssetWriter(outputURL: outputUrl, fileType: .mp4)
} catch let e {
    callback(false, e)
    return
}
writer.shouldOptimizeForNetworkUse = true
videoInputSettings = [
    AVVideoCodecKey: AVVideoCodecType.h264,
    AVVideoWidthKey: MMAssetExporter.renderSize.width,
    AVVideoHeightKey: MMAssetExporter.renderSize.height
]
videoInput = AVAssetWriterInput(mediaType: .video, outputSettings: videoInputSettings)
let adaptor = AVAssetWriterInputPixelBufferAdaptor(assetWriterInput: videoInput, sourcePixelBufferAttributes: nil)
if writer.canAdd(videoInput) {
    writer.add(videoInput)
}
writer.startWriting()
writer.startSession(atSourceTime: .zero)
let pixelBuffers = images.map { image in
    self.pixelBuffer(from: image)
}
let seconds = 2    // how long each image is shown, in seconds
let timescale = 30 // 30 frames per second
let frames = images.count * seconds * timescale // total frame count
var frame = 0
videoInput.requestMediaDataWhenReady(on: inputQueue) { [weak self] in
    guard let wself = self else {
        callback(false, nil)
        return
    }
    if frame >= frames {
        // All frames have been written
        wself.videoInput.markAsFinished()
        wself.writer.finishWriting {
            callback(true, nil)
        }
        return
    }
    let imageIndex = frame / (seconds * timescale)
    let time = CMTime(value: CMTimeValue(frame), timescale: CMTimeScale(timescale))
    let pxData = pixelBuffers[imageIndex]
    if let cvbuffer = pxData {
        adaptor.append(cvbuffer, withPresentationTime: time)
    }
    frame += 1
}
This uses AVAssetWriterInputPixelBufferAdaptor. As mentioned earlier, AVAssetWriter's data does not have to come from an AVAssetReader; it can accept data from many sources. AVAssetWriterInputPixelBufferAdaptor is the adapter that lets AVAssetWriter write other kinds of data — here, pixel buffers converted from images.
Note that the AVAssetWriterInputPixelBufferAdaptor must be created before writer.startWriting() is called.
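The `pixelBuffer(from:)` helper used above converts a UIImage into a CVPixelBuffer. One common implementation — a sketch; details such as the pixel format and color space may need adjusting for your images — looks like this:

```swift
import UIKit
import AVFoundation

// Sketch: render a UIImage into a BGRA CVPixelBuffer at the target render size.
func pixelBuffer(from image: UIImage) -> CVPixelBuffer? {
    let size = MMAssetExporter.renderSize // the render size used above
    let attrs = [
        kCVPixelBufferCGImageCompatibilityKey: true,
        kCVPixelBufferCGBitmapContextCompatibilityKey: true
    ] as CFDictionary
    var buffer: CVPixelBuffer?
    let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                     Int(size.width), Int(size.height),
                                     kCVPixelFormatType_32BGRA, attrs, &buffer)
    guard status == kCVReturnSuccess, let pixelBuffer = buffer,
          let cgImage = image.cgImage else { return nil }
    CVPixelBufferLockBaseAddress(pixelBuffer, [])
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, []) }
    // Draw the image into the buffer's backing memory
    guard let context = CGContext(data: CVPixelBufferGetBaseAddress(pixelBuffer),
                                  width: Int(size.width), height: Int(size.height),
                                  bitsPerComponent: 8,
                                  bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer),
                                  space: CGColorSpaceCreateDeviceRGB(),
                                  bitmapInfo: CGImageAlphaInfo.premultipliedFirst.rawValue |
                                              CGBitmapInfo.byteOrder32Little.rawValue)
    else { return nil }
    context.draw(cgImage, in: CGRect(origin: .zero, size: size))
    return pixelBuffer
}
```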
Problems Encountered
-[AVAssetWriterInput appendSampleBuffer:] Cannot append sample buffer: Input buffer must be in an uncompressed format when outputSettings is not nil
Cause: an AVAssetReaderTrackOutput was used with outputSettings that were not an uncompressed configuration (with nil, the source's own format is used). Supplying uncompressed outputSettings fixes it.
[AVAssetReaderTrackOutput copyNextSampleBuffer] cannot copy next sample buffer before adding this output to an instance of AVAssetReader (using -addOutput:) and calling -startReading on that asset reader
Cause: the AVAssetReader was deallocated while reading; keep a strong reference to the AVAssetReader object.
See: https://stackoverflow.com/questions/27608510/avfoundation-add-first-frame-to-video
- reader.startReading() fails with
Error Domain=AVFoundationErrorDomain Code=-11841
Cause: code -11841 is AVErrorInvalidVideoComposition. Check that the videoComposition's renderSize and frameDuration are set and that its instructions cover the composition's full time range without gaps or overlaps.