Exporting Video with AVAssetReader and AVAssetWriter

This post is a sequel to "Processing Video with AVFoundation".

The previous post discussed the limitations of AVAssetExportSession; a better approach is to re-encode the video with AVAssetWriter:

Compared with AVAssetExportSession, AVAssetWriter's advantage is much finer-grained control over compression when encoding the output. You can specify settings such as key frame interval, video bit rate, pixel aspect ratio, clean aperture, and H.264 profile.

Basics

  • AVAssetReader: reads an asset (think of it as the decoder)
  • AVAssetReaderOutput: configures how the asset's data is read
  • AVAssetReaderTrackOutput
  • AVAssetReaderVideoCompositionOutput
  • AVAssetReaderAudioMixOutput
  • AVAssetWriter: writes an asset (think of it as the encoder)
  • AVAssetWriterInput: configures the data fed to the encoder
  • CMSampleBuffer: the buffered media data

AVAssetReader and AVAssetReaderOutput work as a pair and decide how the asset is decoded into buffers;
AVAssetWriter and AVAssetWriterInput work as a pair and decide how that data is encoded into a video.
CMSampleBuffer is the media data itself: AVAssetReader emits CMSampleBuffers, and AVAssetWriter re-encodes CMSampleBuffers into a video file.

AVAssetReader

AVAssetReader provides services for obtaining media data from an asset.

AVAssetReader reads media data from an asset. Each AVAssetReader is tied to a single AVAsset; since an AVAsset may contain multiple tracks, one AVAssetReader can read data from multiple tracks.
If you need an AVAssetReader to read from multiple AVAssets, combine them into a single AVComposition first and attach the AVAssetReader to that composition (AVComposition is an AVAsset subclass, so this just works).

To read data, an AVAssetReader must have outputs (AVAssetReaderOutput) added to it that configure how the media data is read; a different output can be added for each track:

    open var outputs: [AVAssetReaderOutput] { get }
    open func add(_ output: AVAssetReaderOutput)

Once the outputs are set up, call startReading to begin reading:

open func startReading() -> Bool

AVAssetReaderOutput

The base class for read configuration; in practice you use one of the following subclasses:

  • AVAssetReaderTrackOutput
    Reads samples from a single track
  • AVAssetReaderVideoCompositionOutput
    Reads video through an AVVideoComposition; like AVAssetExportSession's videoComposition, it can adjust the video's size, background, and so on
  • AVAssetReaderAudioMixOutput
    Reads audio through an AVAudioMix; like AVAssetExportSession's audioMix, it can adjust the audio

Decoded data is pulled from an output via copyNextSampleBuffer:

open func copyNextSampleBuffer() -> CMSampleBuffer?
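
For example, a minimal read loop might look like this (a sketch: fileURL is assumed to be a local video file, and error handling is reduced to throwing):

import AVFoundation

// Sketch: decode a file's video track and print each frame's timestamp
func dumpFrameTimestamps(of fileURL: URL) throws {
    let asset = AVURLAsset(url: fileURL)
    guard let track = asset.tracks(withMediaType: .video).first else { return }
    
    let reader = try AVAssetReader(asset: asset)
    // Non-nil outputSettings must describe an uncompressed format (see "Problems encountered" below)
    let output = AVAssetReaderTrackOutput(track: track, outputSettings: [
        kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA
    ])
    if reader.canAdd(output) {
        reader.add(output)
    }
    reader.startReading()
    
    // copyNextSampleBuffer() returns nil once the track is exhausted or reading fails
    while let sampleBuffer = output.copyNextSampleBuffer() {
        let pts = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
        print(CMTimeGetSeconds(pts))
    }
    if reader.status == .failed, let error = reader.error {
        throw error
    }
}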

AVAssetWriter

AVAssetWriter provides services for writing media data to a new file

AVAssetWriter writes media data to a single new file, in a specified container format and with specified settings.
Unlike AVAssetReader, an AVAssetWriter is not tied to one AVAsset: it can take data from multiple sources.

To write data, an AVAssetWriter must have inputs (AVAssetWriterInput) added to it that configure how the media data is written:

open var inputs: [AVAssetWriterInput] { get }
open func add(_ input: AVAssetWriterInput)

Once the inputs are set up, call startWriting to begin writing:

open func startWriting() -> Bool

Then a write session must be started:

open func startSession(atSourceTime startTime: CMTime)

When writing is complete, the session must be closed:

// Marks writing as finished; this also ends the session
open func finishWriting() async
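
Putting the writer calls in order (a sketch; outputURL and a configured AVAssetWriterInput named input are assumed to exist):

// Sketch of the AVAssetWriter call order
do {
    let writer = try AVAssetWriter(outputURL: outputURL, fileType: .mp4)
    writer.add(input) // add all inputs before startWriting()
    writer.startWriting()
    writer.startSession(atSourceTime: .zero)
    // ... append sample buffers through the input (see AVAssetWriterInput below) ...
    input.markAsFinished()
    writer.finishWriting {
        // .completed on success; otherwise writer.error describes the failure
        print(writer.status == .completed)
    }
} catch {
    print(error)
}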

AVAssetWriterInput

The input configuration; a separate input can be configured for each media type:

// Whether the input is ready to accept more media data
open var isReadyForMoreMediaData: Bool { get }

// Invokes the block on the given queue whenever the input is ready for more data
open func requestMediaDataWhenReady(on queue: DispatchQueue, using block: @escaping () -> Void)

// Appends a sample buffer to the input
open func append(_ sampleBuffer: CMSampleBuffer) -> Bool

// Marks this input as finished
open func markAsFinished()

Note that AVAssetWriter and AVAssetReader do not have to be used as a pair. AVAssetWriter only needs sample buffers, and those can come from many places: from an AVAssetReader, from a camera's live capture stream, or converted from images. Converting images to video is implemented in detail below.


outputSettings

Both AVAssetReaderOutput and AVAssetWriterInput take an outputSettings dictionary, and these settings are the real heart of controlling how video is decoded and encoded.

AVVideoSettings

  • AVVideoCodecKey: the codec
  • AVVideoWidthKey: width in pixels
  • AVVideoHeightKey: height in pixels
  • AVVideoCompressionPropertiesKey: compression settings, including
    • AVVideoAverageBitRateKey: bits per second; around 3,000,000 suits 720×1280
    • AVVideoProfileLevelKey: H.264 profile; from low to high: Baseline (BP), Extended (EP), Main (MP), High (HP)
    • AVVideoMaxKeyFrameIntervalKey: maximum interval between key frames, in frames

AVAudioSettings

  • AVFormatIDKey: the audio format
  • AVNumberOfChannelsKey: number of channels
  • AVSampleRateKey: sample rate
  • AVEncoderBitRateKey: encoder bit rate
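
For reference, here is a plausible pair of writer settings assembled from these keys (the concrete numbers are illustrative, not canonical):

// Illustrative settings; tune the numbers for your own content
let videoSettings: [String: Any] = [
    AVVideoCodecKey: AVVideoCodecType.h264,
    AVVideoWidthKey: 720,
    AVVideoHeightKey: 1280,
    AVVideoCompressionPropertiesKey: [
        AVVideoAverageBitRateKey: 3_000_000, // ~3 Mbps for 720x1280
        AVVideoProfileLevelKey: AVVideoProfileLevelH264HighAutoLevel,
        AVVideoMaxKeyFrameIntervalKey: 30 // at least one key frame every 30 frames
    ]
]
let audioSettings: [String: Any] = [
    AVFormatIDKey: kAudioFormatMPEG4AAC,
    AVNumberOfChannelsKey: 2,
    AVSampleRateKey: 44_100,
    AVEncoderBitRateKey: 128_000
]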

Implementation

As usual, the UML diagram first (not reproduced here).

Merging multiple videos into one

// Create the composition and its editable tracks
let composition = AVMutableComposition()
// Video track
let videoCompositionTrack = composition.addMutableTrack(withMediaType: .video, preferredTrackID: kCMPersistentTrackID_Invalid)
// Audio track
let audioCompositionTrack = composition.addMutableTrack(withMediaType: .audio, preferredTrackID: kCMPersistentTrackID_Invalid)
var insertTime = CMTime.zero
for url in urls {
    autoreleasepool {
        // Load the asset and separate its video and audio tracks
        let asset = AVURLAsset(url: url)
        let videoTrack = asset.tracks(withMediaType: .video).first
        let audioTrack = asset.tracks(withMediaType: .audio).first
        let videoTimeRange = videoTrack?.timeRange
        let audioTimeRange = audioTrack?.timeRange
        
        // Append each video track onto the single composition track (AVMutableCompositionTrack)
        if let insertVideoTrack = videoTrack, let insertVideoTime = videoTimeRange {
            do {
                // Insert the track's time range at the current insertion point
                try videoCompositionTrack?.insertTimeRange(CMTimeRange(start: .zero, duration: insertVideoTime.duration), of: insertVideoTrack, at: insertTime)
            } catch let e {
                callback(false, e)
                return
            }
        }
        
        // Append each audio track onto the single composition track (AVMutableCompositionTrack)
        if let insertAudioTrack = audioTrack, let insertAudioTime = audioTimeRange {
            do {
                try audioCompositionTrack?.insertTimeRange(CMTimeRange(start: .zero, duration: insertAudioTime.duration), of: insertAudioTrack, at: insertTime)
            } catch let e {
                callback(false, e)
                return
            }
        }
        
        insertTime = insertTime + asset.duration
    }
}
// ----- Read data -----
let videoTracks = composition.tracks(withMediaType: .video)
let audioTracks = composition.tracks(withMediaType: .audio)
guard let videoTrack = videoTracks.first, let audioTrack = audioTracks.first else {
    callback(false, nil)
    return
}
// AVAssetReader
do {
    reader = try AVAssetReader(asset: composition)
} catch let e {
    callback(false, e)
    return
}
reader.timeRange = CMTimeRange(start: .zero, duration: composition.duration)
// Uncompressed audio/video settings (AVAssetReaderTrackOutput requires uncompressed settings when outputSettings is not nil)
let audioOutputSetting = [
    AVFormatIDKey: kAudioFormatLinearPCM
]
let videoOutputSetting = [
    kCVPixelBufferPixelFormatTypeKey as String: UInt32(kCVPixelFormatType_422YpCbCr8)
]
videoOutput = AVAssetReaderTrackOutput(track: videoTrack, outputSettings: videoOutputSetting)
videoOutput.alwaysCopiesSampleData = false
if reader.canAdd(videoOutput) {
    reader.add(videoOutput)
}
audioOutput = AVAssetReaderTrackOutput(track: audioTrack, outputSettings: audioOutputSetting)
audioOutput.alwaysCopiesSampleData = false
if reader.canAdd(audioOutput) {
    reader.add(audioOutput)
}
reader.startReading()
// ----- Write data -----
// AVAssetWriter
do {
    writer = try AVAssetWriter(outputURL: outputUrl, fileType: .mp4)
} catch let e {
    callback(false, e)
    return
}
writer.shouldOptimizeForNetworkUse = true
let videoInputSettings: [String : Any] = [
    AVVideoCodecKey: AVVideoCodecType.h264,
    AVVideoWidthKey: 720,
    AVVideoHeightKey: 1280,
    AVVideoCompressionPropertiesKey: [
        AVVideoAverageBitRateKey: 1000000,
        AVVideoProfileLevelKey: AVVideoProfileLevelH264High40
    ]
]
let audioInputSettings: [String : Any] = [
    AVFormatIDKey: NSNumber(value: kAudioFormatMPEG4AAC),
    AVNumberOfChannelsKey: NSNumber(value: 2),
    AVSampleRateKey: NSNumber(value: 44100),
    AVEncoderBitRateKey: NSNumber(value: 128000)
]
// AVAssetWriterInput
videoInput = AVAssetWriterInput(mediaType: .video, outputSettings: videoInputSettings)
if writer.canAdd(videoInput) {
    writer.add(videoInput)
}
audioInput = AVAssetWriterInput(mediaType: .audio, outputSettings: audioInputSettings)
if writer.canAdd(audioInput) {
    writer.add(audioInput)
}
writer.startWriting()
writer.startSession(atSourceTime: .zero)
// Prepare to write the data
writeGroup.enter()
videoInput.requestMediaDataWhenReady(on: inputQueue) { [weak self] in
    guard let wself = self else {
        callback(false, nil)
        return
    }
    
    if wself.encodeReadySamples(from: wself.videoOutput, to: wself.videoInput) {
        wself.writeGroup.leave()
    }
}
writeGroup.enter()
audioInput.requestMediaDataWhenReady(on: inputQueue) { [weak self] in
    guard let wself = self else {
        callback(false, nil)
        return
    }
    
    if wself.encodeReadySamples(from: wself.audioOutput, to: wself.audioInput) {
        wself.writeGroup.leave()
    }
}
writeGroup.notify(queue: inputQueue) {
    self.writer.finishWriting {
        callback(true, nil)
    }
}
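
Both closures above delegate the actual transfer to an encodeReadySamples(from:to:) helper that the snippets do not show. A minimal sketch of what such a helper could look like (assuming reader is the instance property used above; it returns true once the output is fully drained and its input closed):

// Sketch: pump samples from a reader output into a writer input while the
// input can accept data; returns true when this stream is finished
func encodeReadySamples(from output: AVAssetReaderOutput, to input: AVAssetWriterInput) -> Bool {
    while input.isReadyForMoreMediaData {
        guard reader.status == .reading,
              let sampleBuffer = output.copyNextSampleBuffer() else {
            // No more samples (or reading stopped): close this input
            input.markAsFinished()
            return true
        }
        if !input.append(sampleBuffer) {
            // Append failed; close the input and let writer.error surface the cause
            input.markAsFinished()
            return true
        }
    }
    // The input is temporarily full; requestMediaDataWhenReady will invoke the block again
    return false
}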

Merging multiple videos (with a VideoComposition and AudioMix)

let composition = AVMutableComposition()
guard let videoCompositionTrack = composition.addMutableTrack(withMediaType: .video, preferredTrackID: kCMPersistentTrackID_Invalid) else {
    callback(false, nil)
    return
}
let audioCompositionTrack = composition.addMutableTrack(withMediaType: .audio, preferredTrackID: kCMPersistentTrackID_Invalid)
// The layer instruction is used to transform the video layer
let vcLayerInstruction = AVMutableVideoCompositionLayerInstruction(assetTrack: videoCompositionTrack)
var layerInstructions = [vcLayerInstruction]
var audioParameters: [AVMutableAudioMixInputParameters] = []
var insertTime = CMTime.zero
for url in urls {
    autoreleasepool {
        let asset = AVURLAsset(url: url)
        let videoTrack = asset.tracks(withMediaType: .video).first
        let audioTrack = asset.tracks(withMediaType: .audio).first
        let videoTimeRange = videoTrack?.timeRange
        let audioTimeRange = audioTrack?.timeRange
        
        if let insertVideoTrack = videoTrack, let insertVideoTime = videoTimeRange {
            do {
                try videoCompositionTrack.insertTimeRange(CMTimeRange(start: .zero, duration: insertVideoTime.duration), of: insertVideoTrack, at: insertTime)
                
                // Adjust orientation and size by changing the transform
                var trans = insertVideoTrack.preferredTransform
                let size = insertVideoTrack.naturalSize
                let orientation = VideoEditHelper.orientationFromVideo(assetTrack: insertVideoTrack)
                switch orientation {
                    case .portrait:
                        let scale = MMAssetExporter.renderSize.height / size.width
                        trans = CGAffineTransform(scaleX: scale, y: scale)
                        trans = trans.translatedBy(x: size.height, y: 0)
                        trans = trans.rotated(by: .pi / 2.0)
                    case .landscapeLeft:
                        let scale = MMAssetExporter.renderSize.width / size.width
                        trans = CGAffineTransform(scaleX: scale, y: scale)
                        trans = trans.translatedBy(x: size.width, y: size.height + (MMAssetExporter.renderSize.height - size.height * scale) / scale / 2.0)
                        trans = trans.rotated(by: .pi)
                    case .portraitUpsideDown:
                        let scale = MMAssetExporter.renderSize.height / size.width
                        trans = CGAffineTransform(scaleX: scale, y: scale)
                        trans = trans.translatedBy(x: 0, y: size.width)
                        trans = trans.rotated(by: .pi / 2.0 * 3)
                    case .landscapeRight:
                        // Default orientation
                        let scale = MMAssetExporter.renderSize.width / size.width
                        trans = CGAffineTransform(scaleX: scale, y: scale)
                        trans = trans.translatedBy(x: 0, y: (MMAssetExporter.renderSize.height - size.height * scale) / scale / 2.0)
                }
                
                // vcLayerInstruction is already in layerInstructions, so only record the transform here
                vcLayerInstruction.setTransform(trans, at: insertTime)
            } catch let e {
                callback(false, e)
                return
            }
        }
        if let insertAudioTrack = audioTrack, let insertAudioTime = audioTimeRange {
            do {
                try audioCompositionTrack?.insertTimeRange(CMTimeRange(start: .zero, duration: insertAudioTime.duration), of: insertAudioTrack, at: insertTime)
                
                let adParameter = AVMutableAudioMixInputParameters(track: insertAudioTrack)
                adParameter.setVolume(1, at: .zero)
                audioParameters.append(adParameter)
            } catch let e {
                callback(false, e)
                return
            }
        }
        
        insertTime = insertTime + asset.duration
    }
}
let videoTracks = composition.tracks(withMediaType: .video)
let audioTracks = composition.tracks(withMediaType: .audio)
let videoComposition = AVMutableVideoComposition()
// A videoComposition must specify frameDuration (frame rate) and renderSize
videoComposition.frameDuration = CMTime(value: 1, timescale: 30)
videoComposition.renderSize = MMAssetExporter.renderSize
let vcInstruction = AVMutableVideoCompositionInstruction()
vcInstruction.timeRange = CMTimeRange(start: .zero, duration: composition.duration)
vcInstruction.backgroundColor = UIColor.red.cgColor // The video's background color can be set here
vcInstruction.layerInstructions = layerInstructions
videoComposition.instructions = [vcInstruction]
let audioMix = AVMutableAudioMix()
audioMix.inputParameters = audioParameters
// AVAssetReader
do {
    reader = try AVAssetReader(asset: composition)
} catch let e {
    callback(false, e)
    return
}
reader.timeRange = CMTimeRange(start: .zero, duration: composition.duration)
// AVAssetReaderOutput
videoOutput = AVAssetReaderVideoCompositionOutput(videoTracks: videoTracks, videoSettings: nil)
videoOutput.alwaysCopiesSampleData = false
videoOutput.videoComposition = videoComposition
if reader.canAdd(videoOutput) {
    reader.add(videoOutput)
}
audioOutput = AVAssetReaderAudioMixOutput(audioTracks: audioTracks, audioSettings: nil)
audioOutput.alwaysCopiesSampleData = false
audioOutput.audioMix = audioMix
if reader.canAdd(audioOutput) {
    reader.add(audioOutput)
}
if !reader.startReading() {
    callback(false, reader.error)
    return
}
// ----- Write data -----
// AVAssetWriter
do {
    writer = try AVAssetWriter(outputURL: outputUrl, fileType: .mp4)
} catch let e {
    callback(false, e)
    return
}
writer.shouldOptimizeForNetworkUse = true
let videoInputSettings: [String : Any] = [
    AVVideoCodecKey: AVVideoCodecType.h264,
    AVVideoWidthKey: 720,
    AVVideoHeightKey: 1280,
    AVVideoCompressionPropertiesKey: [
        AVVideoAverageBitRateKey: 1000000,
        AVVideoProfileLevelKey: AVVideoProfileLevelH264High40
    ]
]
let audioInputSettings: [String : Any] = [
    AVFormatIDKey: NSNumber(value: kAudioFormatMPEG4AAC),
    AVNumberOfChannelsKey: NSNumber(value: 2),
    AVSampleRateKey: NSNumber(value: 44100),
    AVEncoderBitRateKey: NSNumber(value: 128000)
]
// AVAssetWriterInput
videoInput = AVAssetWriterInput(mediaType: .video, outputSettings: videoInputSettings)
if writer.canAdd(videoInput) {
    writer.add(videoInput)
}
audioInput = AVAssetWriterInput(mediaType: .audio, outputSettings: audioInputSettings)
if writer.canAdd(audioInput) {
    writer.add(audioInput)
}
writer.startWriting()
writer.startSession(atSourceTime: .zero)
// Prepare to write the data
writeGroup.enter()
videoInput.requestMediaDataWhenReady(on: inputQueue) { [weak self] in
    guard let wself = self else {
        callback(false, nil)
        return
    }
    
    if wself.encodeReadySamples(from: wself.videoOutput, to: wself.videoInput) {
        wself.writeGroup.leave()
    }
}
writeGroup.enter()
audioInput.requestMediaDataWhenReady(on: inputQueue) { [weak self] in
    guard let wself = self else {
        callback(false, nil)
        return
    }
    if wself.encodeReadySamples(from: wself.audioOutput, to: wself.audioInput) {
        wself.writeGroup.leave()
    }
}
writeGroup.notify(queue: inputQueue) {
    self.writer.finishWriting {
        callback(true, nil)
    }
}

When a VideoComposition and AudioMix are set, the AVAssetReader's outputs must be AVAssetReaderVideoCompositionOutput and AVAssetReaderAudioMixOutput.
As with AVAssetExportSession, it is ultimately the VideoComposition and AudioMix that adjust the video's size, rotation, background color, and volume; they can likewise be used to add a watermark to the video.

Compared with AVAssetExportSession, the asset and track separation/composition work is exactly the same with AVAssetReader and AVAssetWriter, and the pair can be wrapped into a class that is used much like AVAssetExportSession:

public var composition: AVComposition!
public var videoComposition: AVVideoComposition!
public var audioMix: AVAudioMix!
public var outputUrl: URL!
public var videoInputSettings: [String : Any]?
public var videoOutputSettings: [String : Any]?
public var audioInputSettings: [String : Any]?
public var audioOutputSettings: [String : Any]?

public func exportAsynchronously(completionHandler callback: @escaping VideoResult) {
    let videoTracks = composition.tracks(withMediaType: .video)
    let audioTracks = composition.tracks(withMediaType: .audio)
    
    do {
        reader = try AVAssetReader(asset: composition)
    } catch let e {
        callback(false, e)
        return
    }
    reader.timeRange = CMTimeRange(start: .zero, duration: composition.duration)
    
    videoOutput = AVAssetReaderVideoCompositionOutput(videoTracks: videoTracks, videoSettings: videoOutputSettings)
    videoOutput.alwaysCopiesSampleData = false
    videoOutput.videoComposition = videoComposition
    if reader.canAdd(videoOutput) {
        reader.add(videoOutput)
    }
    audioOutput = AVAssetReaderAudioMixOutput(audioTracks: audioTracks, audioSettings: audioOutputSettings)
    audioOutput.alwaysCopiesSampleData = false
    audioOutput.audioMix = audioMix
    if reader.canAdd(audioOutput) {
        reader.add(audioOutput)
    }
    
    if !reader.startReading() {
        callback(false, reader.error)
        return
    }
    
    // ----- Write data -----
    do {
        writer = try AVAssetWriter(outputURL: outputUrl, fileType: .mp4)
    } catch let e {
        callback(false, e)
        return
    }
    writer.shouldOptimizeForNetworkUse = true
    
    // AVAssetWriterInput
    videoInput = AVAssetWriterInput(mediaType: .video, outputSettings: videoInputSettings)
    if writer.canAdd(videoInput) {
        writer.add(videoInput)
    }
    
    audioInput = AVAssetWriterInput(mediaType: .audio, outputSettings: audioInputSettings)
    if writer.canAdd(audioInput) {
        writer.add(audioInput)
    }
    
    writer.startWriting()
    writer.startSession(atSourceTime: .zero)
    
    // Prepare to write the data
// videoInput.requestMediaDataWhenReady
// audioInput.requestMediaDataWhenReady
// encodeReadySamples
   ...
}

Synthesizing a video from images

do {
    writer = try AVAssetWriter(outputURL: outputUrl, fileType: .mp4)
} catch let e {
    callback(false, e)
    return
}
writer.shouldOptimizeForNetworkUse = true
videoInputSettings = [
    AVVideoCodecKey: AVVideoCodecType.h264,
    AVVideoWidthKey: MMAssetExporter.renderSize.width,
    AVVideoHeightKey: MMAssetExporter.renderSize.height
]
videoInput = AVAssetWriterInput(mediaType: .video, outputSettings: videoInputSettings)
let adaptor = AVAssetWriterInputPixelBufferAdaptor(assetWriterInput: videoInput, sourcePixelBufferAttributes: nil)
if writer.canAdd(videoInput) {
    writer.add(videoInput)
}
writer.startWriting()
writer.startSession(atSourceTime: .zero)
let pixelBuffers = images.map { image in
    self.pixelBuffer(from: image)
}
let seconds = 2 // How long each image is shown, in seconds
let timescale = 30 // 30 frames per second
let frames = images.count * seconds * timescale // Total frame count
var frame = 0
videoInput.requestMediaDataWhenReady(on: inputQueue) { [weak self] in
    guard let wself = self else {
        callback(false, nil)
        return
    }
    
    if frame >= frames {
        // All frames have been written
        wself.videoInput.markAsFinished()
        wself.writer.finishWriting {
            callback(true, nil)
        }
        return
    }
    
    let imageIndex = frame / (seconds * timescale)
    let time = CMTime(value: CMTimeValue(frame), timescale: CMTimeScale(timescale))
    let pxData = pixelBuffers[imageIndex]
    if let cvbuffer = pxData {
        adaptor.append(cvbuffer, withPresentationTime: time)
    }
    
    frame += 1
}

This example uses AVAssetWriterInputPixelBufferAdaptor. As noted earlier, AVAssetWriter's data does not have to come from an AVAssetReader; it accepts data from all kinds of sources, and AVAssetWriterInputPixelBufferAdaptor plays the adapter role that lets AVAssetWriter write pixel-buffer data.
Note that the adaptor must be created before writer.startWriting() is called.
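
The pixelBuffer(from:) helper used above is not shown in the snippet either; here is a minimal sketch of a UIImage-to-CVPixelBuffer conversion (assuming frames are rendered at MMAssetExporter.renderSize):

import UIKit
import CoreVideo

// Sketch: draw a UIImage into a newly created 32BGRA pixel buffer
func pixelBuffer(from image: UIImage) -> CVPixelBuffer? {
    let size = MMAssetExporter.renderSize
    let attrs: [String: Any] = [
        kCVPixelBufferCGImageCompatibilityKey as String: true,
        kCVPixelBufferCGBitmapContextCompatibilityKey as String: true
    ]
    var buffer: CVPixelBuffer?
    let status = CVPixelBufferCreate(kCFAllocatorDefault, Int(size.width), Int(size.height),
                                     kCVPixelFormatType_32BGRA, attrs as CFDictionary, &buffer)
    guard status == kCVReturnSuccess, let pixelBuffer = buffer, let cgImage = image.cgImage else {
        return nil
    }
    CVPixelBufferLockBaseAddress(pixelBuffer, [])
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, []) }
    guard let context = CGContext(data: CVPixelBufferGetBaseAddress(pixelBuffer),
                                  width: Int(size.width), height: Int(size.height),
                                  bitsPerComponent: 8,
                                  bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer),
                                  space: CGColorSpaceCreateDeviceRGB(),
                                  bitmapInfo: CGImageAlphaInfo.premultipliedFirst.rawValue |
                                              CGBitmapInfo.byteOrder32Little.rawValue) else {
        return nil
    }
    // Note: the image is simply stretched to the render size; aspect fitting is omitted
    context.draw(cgImage, in: CGRect(origin: .zero, size: size))
    return pixelBuffer
}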

Problems encountered

-[AVAssetWriterInput appendSampleBuffer:] Cannot append sample buffer: Input buffer must be in an uncompressed format when outputSettings is not nil

Cause: an AVAssetReaderTrackOutput was used and its outputSettings did not describe an uncompressed format (with nil settings, samples are vended in the source's own format). Supplying uncompressed outputSettings fixes it.

[AVAssetReaderTrackOutput copyNextSampleBuffer] cannot copy next sample buffer before adding this output to an instance of AVAssetReader (using -addOutput:) and calling -startReading on that asset reader

Cause: the AVAssetReader was deallocated while reading was still in progress; keep a strong reference to the AVAssetReader object.
See: https://stackoverflow.com/questions/27608510/avfoundation-add-first-frame-to-video
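
In practice this means holding the reader (and writer) as instance properties rather than locals, as the examples above do (a sketch):

// Keep strong references for the lifetime of the export
private var reader: AVAssetReader!
private var writer: AVAssetWriter!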

reader.startReading() fails with Error Domain=AVFoundationErrorDomain Code=-11841

Cause: -11841 is AVErrorInvalidVideoComposition; double-check the videoComposition (its renderSize, frameDuration, and that the instructions cover the whole duration).
See: https://www.cnblogs.com/song-jw/p/9530249.html
