Processing Video with AVFoundation

Our project includes video-editing features that were built directly on AVFoundation, and the results have been good.
This post takes the time to summarize the common usage patterns.

Basics

  • CMTime: media time (for video)
  • AVAsset: asset information
  • AVURLAsset: an asset created from a URL
  • AVAssetTrack: an asset track, e.g. an audio or video track
  • AVMutableComposition: a collection of tracks forming an asset; tracks can be added and removed
  • AVMutableCompositionTrack: a mutable track used inside a composition
  • AVMutableVideoComposition: a collection of video-processing instructions
  • AVMutableVideoCompositionInstruction: a single video-processing instruction
  • AVMutableVideoCompositionLayerInstruction: a video-track instruction; it must be added to an AVMutableVideoCompositionInstruction
  • AVMutableAudioMix: audio configuration
  • AVMutableAudioMixInputParameters: audio-processing parameters
  • AVPlayerItem: manages a media asset's basic information and playback state
  • AVPlayer: plays video; it does not display anything itself, so you create an AVPlayerLayer and add it to a view
  • AVAssetExportSession: exports an asset

A video's information is structured roughly as shown in the figure.

CMTime

To guarantee timing precision, AVFoundation measures time with CMTime (from Core Media). A CMTime is represented by a value and a timescale:

public var value: CMTimeValue
public var timescale: CMTimeScale

timescale can be read as the number of segments one second is divided into, and value as the number of those segments; the resulting time is value / timescale:

let time3 = CMTime(value: 1, timescale: 30)
print(time3.seconds) // 0.033

A CMTime can also be created from a duration in seconds:

let time2 = CMTime(seconds: 1, preferredTimescale: 2)
print(time2.value) // 2

CMTime also supports addition and subtraction:

let time = time2 + time3 

The result is computed as a fraction sum or difference; the resulting denominator (timescale) is the least common multiple of the timescales of all the operands.
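
For example, with time2 (= 2/2) and time3 (= 1/30) from above, the result of the addition is 31/30:

print(time.value, time.timescale) // 31 30 (LCM of 2 and 30 is 30)
print(time.seconds) // 1.0333...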

For video work, timescale is usually set to 600.
This is because frame rates differ between videos: some are 30 fps, some 24 fps, some 60 fps. 600 is a common multiple of these rates (and of 25 fps), so a single frame at any of them is an exact number of ticks.
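
A quick check that one frame at each of these rates is exact at timescale 600:

print(CMTime(seconds: 1.0 / 24.0, preferredTimescale: 600).value) // 25
print(CMTime(seconds: 1.0 / 30.0, preferredTimescale: 600).value) // 20
print(CMTime(seconds: 1.0 / 60.0, preferredTimescale: 600).value) // 10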

AVAsset

Represents an asset, in the broad sense: a video, an audio file, an image, and so on.
It exposes the asset's tracks, duration, type, and other information.

AVURLAsset is an AVAsset subclass for assets created from a URL; the URL can be either a remote URL or a local file URL:

let asset = AVURLAsset(url: videoUrl)
let duration = asset.duration
CMTimeShow(duration)

let tracks = asset.tracks

AVAssetTrack

Track information; a track can be thought of as the smallest representational unit of an asset, and an asset is made up of multiple tracks.
The track media types:

public static let video: AVMediaType
public static let audio: AVMediaType
public static let text: AVMediaType
public static let closedCaption: AVMediaType
public static let subtitle: AVMediaType
public static let timecode: AVMediaType
public static let metadata: AVMediaType
public static let muxed: AVMediaType
public static let metadataObject: AVMediaType
public static let depthData: AVMediaType

A typical video contains video, audio, and possibly subtitle tracks.
The main information a track carries:

open var timeRange: CMTimeRange { get } // time range within the asset
open var naturalSize: CGSize { get } // natural width and height
open var minFrameDuration: CMTime { get } // duration of a single frame (1/fps)
open var preferredVolume: Float { get } // preferred volume
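
A quick sketch reading these properties (assuming videoUrl points at a local video file, as in the earlier snippet):

let asset = AVURLAsset(url: videoUrl)
if let videoTrack = asset.tracks(withMediaType: .video).first {
    print(videoTrack.naturalSize)                    // e.g. (1920.0, 1080.0)
    print(videoTrack.timeRange.duration.seconds)     // track length in seconds
    print(1.0 / videoTrack.minFrameDuration.seconds) // highest frame rate
}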

AVMutableComposition

An AVAsset subclass that also represents an asset; think of it as an editable asset whose tracks can be combined, removed, and so on.

AVMutableCompositionTrack

An AVAssetTrack subclass that also represents a track; think of it as an editable track into which other tracks can be inserted.

AVMutableVideoComposition

Holds video-processing information: the video's background color, render size, etc. It is required when adding watermarks or transition animations.

AVMutableAudioMix

Audio configuration; it can be used to change a video's volume, among other things.


The relationships between assets, tracks, and the other classes are illustrated in the official documentation's diagram.

AVAssetExportSession

After the track, audio, and video configuration above, an export session produces a new asset.

Implementation

These classes are easy to mix up in practice; to make the relationships more intuitive, I drew a UML diagram of them.

Because assets and tracks are being modified, the mutable types AVMutableComposition and AVMutableCompositionTrack are used throughout (each has its own immutable base class).

Merging multiple videos into one

Approach:
In the desired merge order, separate each source video's original video and audio tracks, then append them along the timeline into one new video track and one new audio track; those new tracks form a new asset (AVMutableComposition).

// Create the composition and its editable tracks
let composition = AVMutableComposition()
// Video track (kCMPersistentTrackID_Invalid lets the framework generate a unique ID)
let videoCompositionTrack = composition.addMutableTrack(withMediaType: .video, preferredTrackID: kCMPersistentTrackID_Invalid)
// Audio track
let audioCompositionTrack = composition.addMutableTrack(withMediaType: .audio, preferredTrackID: kCMPersistentTrackID_Invalid)
var insertTime = CMTime.zero
for url in urls {
    // `return` inside the autoreleasepool closure only exits the closure,
    // so the closure returns a Bool to let the outer loop stop on failure
    let succeeded: Bool = autoreleasepool {
        // Load the asset and separate its video and audio tracks
        let asset = AVURLAsset(url: url)
        let videoTrack = asset.tracks(withMediaType: .video).first
        let audioTrack = asset.tracks(withMediaType: .audio).first
        let videoTimeRange = videoTrack?.timeRange
        let audioTimeRange = audioTrack?.timeRange
        
        // Append this video track to the single composition video track (AVMutableCompositionTrack)
        if let insertVideoTrack = videoTrack, let insertVideoTime = videoTimeRange {
            do {
                // Insert the time range at the current point on the timeline
                try videoCompositionTrack?.insertTimeRange(CMTimeRange(start: .zero, duration: insertVideoTime.duration), of: insertVideoTrack, at: insertTime)
            } catch let e {
                callback(false, e)
                return false
            }
        }
        
        // Append this audio track to the single composition audio track (AVMutableCompositionTrack)
        if let insertAudioTrack = audioTrack, let insertAudioTime = audioTimeRange {
            do {
                try audioCompositionTrack?.insertTimeRange(CMTimeRange(start: .zero, duration: insertAudioTime.duration), of: insertAudioTrack, at: insertTime)
            } catch let e {
                callback(false, e)
                return false
            }
        }
        
        /* Inserting the whole asset with insertTimeRange creates multiple
           video tracks, and every video after the first plays back black:
          let assetTimeRange = CMTimeRange(start: .zero, duration: asset.duration)
          do {
              try composition.insertTimeRange(assetTimeRange, of: asset, at: insertTime)
          } catch let e {
              callback(false, e)
              return false
          }
        */

        insertTime = insertTime + asset.duration
        return true
    }
    if !succeeded { return }
}

This produces the merged composition; exporting it yields the final video data:

guard let exportSession = AVAssetExportSession(asset: composition, presetName: AVAssetExportPresetPassthrough) else {
    callback(false, nil)
    return
}
exportSession.outputFileType = .mp4
exportSession.outputURL = outputUrl
exportSession.shouldOptimizeForNetworkUse = true
exportSession.exportAsynchronously {
    switch exportSession.status {
        case .completed:
            callback(true, nil)
        default:
            callback(false, exportSession.error)
    }
}

preset:
The preset determines the encoding and quality of the exported video.
AVAssetExportPresetPassthrough keeps the source's own encoding; with this preset, audioMix and videoComposition have no effect.
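
If you are unsure which presets a given source supports, AVAssetExportSession can list the compatible ones before you create a session (a quick sketch):

let asset = AVURLAsset(url: videoUrl)
let presets = AVAssetExportSession.exportPresets(compatibleWith: asset)
print(presets) // e.g. ["AVAssetExportPreset1920x1080", "AVAssetExportPresetHighestQuality", ...]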

P.S.
AVMutableComposition provides the method:

- (BOOL)insertTimeRange:(CMTimeRange)timeRange ofAsset:(AVAsset *)asset atTime:(CMTime)startTime error:(NSError * _Nullable * _Nullable)outError;

which inserts an entire asset directly, without separating the tracks. I tried it for merging multiple videos, but only the first video displayed correctly and the rest were black (presumably because multiple video tracks are created and playback cannot switch between them).

Adjusting video orientation and unifying video size

Merging multiple videos as above leaves two problems:

  1. The phone's orientation while recording determines the video's orientation; any video whose orientation is not the default landscapeRight comes out rotated in the export. Video orientation mirrors device orientation:
  • portrait: home button at the bottom, shot vertically; the video is rotated 90°
  • landscapeLeft: home button on the left, shot horizontally; the video is rotated 180°
  • portraitUpsideDown: home button at the top, shot vertically; the video is rotated 270°
  • landscapeRight: home button on the right, shot horizontally (the default display orientation)
  2. The source videos have different sizes, and merging them usually requires one unified size.

Both points are handled by configuring an AVMutableVideoComposition:

let composition = AVMutableComposition()
guard let videoCompositionTrack = composition.addMutableTrack(withMediaType: .video, preferredTrackID: kCMPersistentTrackID_Invalid) else {
    callback(false, nil)
    return
}
let audioCompositionTrack = composition.addMutableTrack(withMediaType: .audio, preferredTrackID: kCMPersistentTrackID_Invalid)
// The layerInstruction modifies the video layer; one instruction covers the
// single composition video track, with a transform set per inserted segment
let vcLayerInstruction = AVMutableVideoCompositionLayerInstruction(assetTrack: videoCompositionTrack)
let layerInstructions = [vcLayerInstruction]
var insertTime = CMTime.zero
for url in urls {
    let succeeded: Bool = autoreleasepool {
        let asset = AVURLAsset(url: url)
        let videoTrack = asset.tracks(withMediaType: .video).first
        let audioTrack = asset.tracks(withMediaType: .audio).first
        let videoTimeRange = videoTrack?.timeRange
        let audioTimeRange = audioTrack?.timeRange
        
        if let insertVideoTrack = videoTrack, let insertVideoTime = videoTimeRange {
            do {
                try videoCompositionTrack.insertTimeRange(CMTimeRange(start: .zero, duration: insertVideoTime.duration), of: insertVideoTrack, at: insertTime)
                
                // Change the transform to fix orientation and size, scaling
                // every segment to the shared renderSize
                var trans = insertVideoTrack.preferredTransform
                let size = insertVideoTrack.naturalSize
                let orientation = orientationFromVideo(assetTrack: insertVideoTrack)
                switch orientation {
                    case .portrait:
                        let scale = renderSize.height / size.width
                        trans = CGAffineTransform(scaleX: scale, y: scale)
                        trans = trans.translatedBy(x: size.height, y: 0)
                        trans = trans.rotated(by: .pi / 2.0)
                    case .landscapeLeft:
                        let scale = renderSize.width / size.width
                        trans = CGAffineTransform(scaleX: scale, y: scale)
                        trans = trans.translatedBy(x: size.width, y: size.height + (renderSize.height - size.height * scale) / scale / 2.0)
                        trans = trans.rotated(by: .pi)
                    case .portraitUpsideDown:
                        let scale = renderSize.height / size.width
                        trans = CGAffineTransform(scaleX: scale, y: scale)
                        trans = trans.translatedBy(x: 0, y: size.width)
                        trans = trans.rotated(by: .pi / 2.0 * 3)
                    case .landscapeRight:
                        // default orientation
                        let scale = renderSize.width / size.width
                        trans = CGAffineTransform(scaleX: scale, y: scale)
                        trans = trans.translatedBy(x: 0, y: (renderSize.height - size.height * scale) / scale / 2.0)
                }
                
                vcLayerInstruction.setTransform(trans, at: insertTime)
            } catch let e {
                callback(false, e)
                return false
            }
        }
        if let insertAudioTrack = audioTrack, let insertAudioTime = audioTimeRange {
            do {
                try audioCompositionTrack?.insertTimeRange(CMTimeRange(start: .zero, duration: insertAudioTime.duration), of: insertAudioTrack, at: insertTime)
            } catch let e {
                callback(false, e)
                return false
            }
        }
        
        insertTime = insertTime + asset.duration
        return true
    }
    if !succeeded { return }
}
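
The orientationFromVideo(assetTrack:) helper used above is not an AVFoundation API; it infers the capture orientation from the track's preferredTransform. A minimal sketch, assuming the transforms typically recorded by the iPhone camera:

enum VideoOrientation {
    case portrait, landscapeLeft, portraitUpsideDown, landscapeRight
}

func orientationFromVideo(assetTrack: AVAssetTrack) -> VideoOrientation {
    let t = assetTrack.preferredTransform
    switch (t.a, t.b, t.c, t.d) {
    case (0, 1, -1, 0):  return .portrait            // rotated 90°
    case (-1, 0, 0, -1): return .landscapeLeft       // rotated 180°
    case (0, -1, 1, 0):  return .portraitUpsideDown  // rotated 270°
    default:             return .landscapeRight      // identity / default
    }
}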

When exporting, set videoComposition on the export session:

let videoComposition = AVMutableVideoComposition()
// A videoComposition must specify both frameDuration (frame rate) and renderSize
videoComposition.frameDuration = CMTime(value: 1, timescale: 30)
videoComposition.renderSize = renderSize
let vcInstruction = AVMutableVideoCompositionInstruction()
vcInstruction.timeRange = CMTimeRange(start: .zero, duration: composition.duration)
vcInstruction.backgroundColor = UIColor.red.cgColor // the video's background color can be set here
vcInstruction.layerInstructions = layerInstructions
videoComposition.instructions = [vcInstruction]
guard let exportSession = AVAssetExportSession(asset: composition, presetName: AVAssetExportPreset1280x720) else {
    callback(false, nil)
    return
}
exportSession.videoComposition = videoComposition
exportSession.outputFileType = .mp4
exportSession.outputURL = outputUrl
exportSession.shouldOptimizeForNetworkUse = true
exportSession.exportAsynchronously {
    switch exportSession.status {
        case .completed:
            callback(true, nil)
        default:
            callback(false, exportSession.error)
    }
}

Scaling, rotating, and translating the video is done by modifying the CGAffineTransform on the AVMutableVideoCompositionLayerInstruction. A video's CGAffineTransform works exactly like a UIView's: it is a matrix, and changing its values transforms the rendered video.

A CGAffineTransform holds six values a, b, c, d, tx, ty; a point (x, y) is mapped by matrix multiplication to x' = a·x + c·y + tx, y' = b·x + d·y + ty.
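
A quick check of that mapping, rotating the point (1, 0) by 90°:

let t = CGAffineTransform(rotationAngle: .pi / 2) // a = 0, b = 1, c = -1, d = 0
let p = CGPoint(x: 1, y: 0).applying(t)
print(p) // (0.0, 1.0): the x axis is mapped onto the y axis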

For more on how CGAffineTransform works, see: https://www.jianshu.com/p/ca7f9bc62429
and https://www.jianshu.com/p/a848d6b5a4b5

After this processing, a previously landscape video displays correctly in portrait, as shown in the result figure.

Adding audio

Adding audio to an existing video is useful for narration, background music, and so on.
The approach is the same as before: add a new audio track alongside the existing ones:

var audioParameters: [AVMutableAudioMixInputParameters] = []
let asset = AVURLAsset(url: videoUrl)
let composition = AVMutableComposition()
do {
    try composition.insertTimeRange(CMTimeRange(start: .zero, duration: asset.duration), of: asset, at: .zero)
} catch let e {
    callback(false, e)
    return
}
let audioAsset = AVURLAsset(url: audioUrl)
let audioCompositionTrack = composition.addMutableTrack(withMediaType: .audio, preferredTrackID: kCMPersistentTrackID_Invalid)
let audioTracks = audioAsset.tracks(withMediaType: .audio)
for audioTrack in audioTracks {
    let adParameter = AVMutableAudioMixInputParameters(track: audioTrack)
    adParameter.setVolume(1, at: .zero)
    audioParameters.append(adParameter)
    
    do {
        try audioCompositionTrack?.insertTimeRange(audioTrack.timeRange, of: audioTrack, at: .zero)
    } catch let e {
        callback(false, e)
        return
    }
}
// AVAssetExportPresetPassthrough fails here with Code=-11838 "Operation Stopped"
guard let exportSession = AVAssetExportSession(asset: composition, presetName: AVAssetExportPresetMediumQuality) else {
    callback(false, nil)
    return
}
// Adjust the audio via an audio mix
let audioMix = AVMutableAudioMix()
audioMix.inputParameters = audioParameters
exportSession.audioMix = audioMix

An AVMutableAudioMix is configured while adding the audio, so the audio's volume can be adjusted through the mix; volume ranges from 0 to 1 (silent to full).
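
Besides a fixed volume, the same parameters object also supports volume ramps; a sketch of a 3-second fade-out, reusing adParameter and audioAsset from above:

let fadeDuration = CMTime(seconds: 3, preferredTimescale: 600)
let fadeStart = audioAsset.duration - fadeDuration
adParameter.setVolumeRamp(fromStartVolume: 1, toEndVolume: 0,
                          timeRange: CMTimeRange(start: fadeStart, duration: fadeDuration))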

Removing track data

The operations above all add tracks; sometimes a track needs to be removed instead, e.g. stripping a video's original audio:

let asset = AVURLAsset(url: url)
let composition = AVMutableComposition()
do {
    try composition.insertTimeRange(CMTimeRange(start: .zero, duration: asset.duration), of: asset, at: .zero)
} catch let e {
    exportback?(false, e)
    return nil
}
let tracks = composition.tracks(withMediaType: type)
for track in tracks {
    composition.removeTrack(track)
}

Trimming video

Trimming a video to a given time range:

// `timescale` is a file-level constant here, e.g. 600
public static func cutVideo(url: URL, outputUrl: URL, secondsRange: ClosedRange<Double>, callback: @escaping VideoResult) {
    let asset = AVURLAsset(url: url)
    let composition = AVMutableComposition()
    do {
        let timeRange = CMTimeRange(start: CMTime(seconds: secondsRange.lowerBound, preferredTimescale: timescale), end: CMTime(seconds: secondsRange.upperBound, preferredTimescale: timescale))
        try composition.insertTimeRange(timeRange, of: asset, at: .zero)
    } catch let e {
        callback(false, e)
        return
    }
    
    exportVideo(composition, AVAssetExportPresetPassthrough, outputUrl, callback)
}

Grabbing the image for a specified frame

This is typically used to set a video's cover image.
AVAssetImageGenerator extracts the frame at a given CMTime from an asset:

let imageGenerator = AVAssetImageGenerator(asset: asset)
imageGenerator.appliesPreferredTrackTransform = true
// .zero tolerance extracts the exact frame
imageGenerator.requestedTimeToleranceAfter = .zero
imageGenerator.requestedTimeToleranceBefore = .zero
var actualTime: CMTime = .zero
do {
    let time = CMTime(seconds: seconds, preferredTimescale: timescale)
    let imageRef = try imageGenerator.copyCGImage(at: time, actualTime: &actualTime)
    print(actualTime)
    return UIImage(cgImage: imageRef)
} catch {
    return nil
}

requestedTimeToleranceAfter and requestedTimeToleranceBefore specify the allowed error range; setting both to .zero extracts the exact frame, but takes comparatively longer.
actualTime receives the actual video time of the extracted frame; with both tolerances at .zero, actualTime equals the requested time.
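
For many frames at once (e.g. a thumbnail strip), the asynchronous API avoids blocking the calling thread; a sketch, reusing asset and imageGenerator from above:

// Requested times are CMTimes wrapped in NSValue, one per second here
let times = stride(from: 0.0, to: asset.duration.seconds, by: 1.0)
    .map { NSValue(time: CMTime(seconds: $0, preferredTimescale: 600)) }
imageGenerator.generateCGImagesAsynchronously(forTimes: times) { _, cgImage, actual, result, _ in
    if result == .succeeded, let cgImage = cgImage {
        let frame = UIImage(cgImage: cgImage) // the frame at `actual`
        // hand `frame` off to the UI on the main queue
    }
}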

Saving video to the system photo library

This requires the Photos framework:

import Photos

let photoLibrary = PHPhotoLibrary.shared()
photoLibrary.performChanges {
    PHAssetChangeRequest.creationRequestForAssetFromVideo(atFileURL: url)
} completionHandler: { success, error in
    callback(success, error)
}
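
Saving fails without photo-library permission; requesting access first (plus the NSPhotoLibraryAddUsageDescription key in Info.plist) is required. A sketch:

PHPhotoLibrary.requestAuthorization { status in
    guard status == .authorized else {
        callback(false, nil) // permission denied
        return
    }
    // perform the performChanges call above
}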

Adding a watermark to video

Watermarking uses the videoComposition.animationTool mentioned earlier:

// instructions
var trans = videoCompositionTrack.preferredTransform
let size = videoCompositionTrack.naturalSize
let orientation = orientationFromVideo(assetTrack: videoCompositionTrack)
switch orientation {
    case .portrait:
        trans = CGAffineTransform(translationX: size.height, y: 0)
        trans = trans.rotated(by: .pi / 2.0)
    case .landscapeLeft:
        trans = CGAffineTransform(translationX: size.width, y: size.height)
        trans = trans.rotated(by: .pi)
    case .portraitUpsideDown:
        trans = CGAffineTransform(translationX: 0, y: size.width)
        trans = trans.rotated(by: .pi / 2.0 * 3)
    case .landscapeRight:
        // default orientation
        break
}
let vcLayerInstruction = AVMutableVideoCompositionLayerInstruction(assetTrack: videoCompositionTrack)
vcLayerInstruction.setTransform(trans, at: .zero)
let videoComposition = AVMutableVideoComposition()
videoComposition.frameDuration = CMTime(value: 1, timescale: 30)
videoComposition.renderSize = composition.naturalSize
let vcInstruction = AVMutableVideoCompositionInstruction()
vcInstruction.timeRange = CMTimeRange(start: .zero, duration: composition.duration)
vcInstruction.layerInstructions = [vcLayerInstruction]
videoComposition.instructions = [vcInstruction]
// animationTool
let renderFrame = CGRect(origin: .zero, size: size)
let imageLayer = CALayer()
let textLayer = CATextLayer()
let watermarkLayer = CALayer()
let videoLayer = CALayer()
let animationLayer = CALayer()
// The watermark layer can contain multiple sublayers (an image and text here)
watermarkLayer.frame = wmframe
imageLayer.frame = watermarkLayer.bounds
imageLayer.contents = wmImage.cgImage
textLayer.frame = watermarkLayer.bounds
textLayer.string = wmText
textLayer.foregroundColor = UIColor.red.cgColor
textLayer.fontSize = 30
watermarkLayer.addSublayer(imageLayer)
watermarkLayer.addSublayer(textLayer)
watermarkLayer.masksToBounds = true
watermarkLayer.backgroundColor = UIColor.red.cgColor
// The video layer: video frames are rendered into it
videoLayer.frame = renderFrame
// The parent Core Animation layer
animationLayer.frame = renderFrame
animationLayer.addSublayer(videoLayer)
animationLayer.addSublayer(watermarkLayer)
let animationTool = AVVideoCompositionCoreAnimationTool(postProcessingAsVideoLayer: videoLayer, in: animationLayer)
videoComposition.animationTool = animationTool
guard let exportSession = AVAssetExportSession(asset: composition, presetName: AVAssetExportPreset1280x720) else {
    callback(false, nil)
    return
}
exportSession.videoComposition = videoComposition
exportVideo(exportSession, outputUrl, callback)

Adding the watermark involves three layers:

  • animationLayer: the parent layer; both the videoLayer and the watermark layer are added to it
  • videoLayer: the layer the video frames are rendered into; it needs no contents, only a frame
  • watermarkLayer: the watermark layer itself, holding the image and text sublayers

videoComposition.animationTool works at the layer level, so besides watermarks it can also implement all kinds of transition animations, for example the fade-in sketched below.
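
A sketch fading the watermark in over the first 2 seconds. One caveat: layer animations used with animationTool run on the video's timeline, so beginTime must be AVCoreAnimationBeginTimeAtZero (a literal 0 would mean "now"):

let fadeIn = CABasicAnimation(keyPath: "opacity")
fadeIn.fromValue = 0
fadeIn.toValue = 1
fadeIn.duration = 2
fadeIn.beginTime = AVCoreAnimationBeginTimeAtZero
fadeIn.isRemovedOnCompletion = false
watermarkLayer.add(fadeIn, forKey: "fadeIn")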

The final watermarked result is shown in the figure.

Errors encountered when exporting with AVAssetExportSession

Exporting with AVAssetExportSession often fails with puzzling errors, such as the Code=-11838 "Operation Stopped" mentioned earlier.

Most of these errors stem from mismatches between the chosen preset and the source video's resolution or encoding; beyond selecting a preset, AVAssetExportSession offers no finer control over the output encoding.
When the output encoding needs detailed configuration, AVFoundation's AVAssetWriter is the tool to use; I plan to cover it in a separate write-up.


Full source code: https://github.com/momoAI/VideoComposition-AV



References:
https://developer.apple.com/library/archive/documentation/AudioVideo/Conceptual/AVFoundationPG/Articles/03_Editing.html#//apple_ref/doc/uid/TP40010188-CH8-SW1
