Multimedia Files
----------------

Many multimedia files that carry both audio and video bear extensions such as .avi (Microsoft AVI files), .asf (a.k.a. .wmv and .wma, collectively known as Microsoft ASF files), .mov (Apple QuickTime files), and .rm (RealMedia files). Confusion often arises when one wonders what application can, for example, "play .mov files". That is a very difficult question to answer, and here is why:

All of the formats mentioned in the preceding paragraph are also referred to as multimedia container formats. All they do is pack chunks of audio and video data together, interleaved, along with some instructions that tell a playback application how the data is to be decoded and presented to the user. This is the typical layout of many multimedia file formats:

  file header
    title, creator, other meta-info
    video header
      video codec FourCC
      width, height, colorspace, playback framerate
    audio header
      audio codec FourCC
      bits/sample, playback frequency, channel count
  file data
    encoded audio chunk #0
    encoded video chunk #0
    encoded audio chunk #1
    encoded video chunk #1
    encoded audio chunk #2
    encoded video chunk #2
    encoded audio chunk #3
    encoded video chunk #3
    ..
    ..

Those audio and video chunks can be encoded with any number of audio or video codecs, the FourCCs of which are specified in the file header. This is why the file extension alone does not tell you whether an application can play a file: the application must understand both the container and the codecs used inside it. See The Almost Definitive FourCC Definition List, listed in the references, for more information on the jungle of FourCCs out there and where they commonly appear.

Interleaving
------------

Interleaving is the process of storing alternating audio and video chunks in the data section of a multimedia file:

  encoded audio chunk #0
  encoded video chunk #0
  encoded audio chunk #1
  encoded video chunk #1
  encoded audio chunk #2
  encoded video chunk #2
  ..
  ..
  encoded audio chunk #n
  encoded video chunk #n

Why is this done? Why not just place all of the video data in the file, followed by all of the audio data?
For example:

  encoded video chunk #0
  encoded video chunk #1
  encoded video chunk #2
  ..
  ..
  encoded video chunk #n
  encoded audio chunk #0
  encoded audio chunk #1
  encoded audio chunk #2
  ..
  ..
  encoded audio chunk #n

Conceptually, this appears to be a valid solution. In practice, however, it falls over. A player needs the audio and the video for the same moment at roughly the same time. Assuming the audio and video streams are part of the same file on the same disk (almost always the case), the disk read head, which is a physical mechanism, has to leap constantly between two different positions on the disk: one in the video region and one in the audio region. When the chunks are interleaved, the read head does not need to seek back and forth during playback; it can read all the data off in a contiguous fashion.
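
The difference between the two layouts can be made concrete with a small model. The following Python sketch is only an illustration under simplifying assumptions that are not part of the text above: every chunk occupies exactly one consecutive "slot" on disk, the player reads audio chunk #i and then video chunk #i, and a "seek" is any read that does not start where the previous read ended. The function names (interleaved_layout, sequential_layout, count_seeks) are hypothetical.

```python
from typing import List, Tuple

# A chunk is identified by its stream kind and index, e.g. ("audio", 0).
Chunk = Tuple[str, int]


def interleaved_layout(n: int) -> List[Chunk]:
    """On-disk order: audio #0, video #0, audio #1, video #1, ..."""
    layout: List[Chunk] = []
    for i in range(n):
        layout.append(("audio", i))
        layout.append(("video", i))
    return layout


def sequential_layout(n: int) -> List[Chunk]:
    """On-disk order: all video chunks first, then all audio chunks."""
    video = [("video", i) for i in range(n)]
    audio = [("audio", i) for i in range(n)]
    return video + audio


def count_seeks(layout: List[Chunk], n: int) -> int:
    """Count non-contiguous reads when playing chunk pairs 0..n-1 in order.

    Assumption (toy model): each chunk fills one slot, and the read head
    starts at slot 0, the beginning of the data section.
    """
    slot_of = {chunk: slot for slot, chunk in enumerate(layout)}
    seeks = 0
    next_slot = 0  # where the head sits after the previous read
    for i in range(n):
        for kind in ("audio", "video"):
            slot = slot_of[(kind, i)]
            if slot != next_slot:
                seeks += 1  # the head has to jump to reach this chunk
            next_slot = slot + 1
    return seeks


if __name__ == "__main__":
    n = 4
    print("interleaved:", count_seeks(interleaved_layout(n), n))  # 0
    print("sequential: ", count_seeks(sequential_layout(n), n))   # 8
```

In this toy model the interleaved layout needs no seeks, while the video-then-audio layout needs one seek for every chunk read, 2n in total. Real disks, file systems, and read-ahead buffering are more complicated, so the numbers are only meant to show the trend.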