DVD Technical Guide - DVD Video Format
1.1 Video Format Overview

DVD-Video contains not only the actual video and audio content but also control information that enables features specific to the DVD format, such as multi-angle viewing, parental lock, and random or shuffle playback, and that supports special playback modes such as fast forward and reverse. In this chapter, the actual video and audio content is called the "presentation data," and the extra control information is called the "navigation data."

1.2 VMG and VTS


The DVD-Video zone contains all the files necessary for playback of DVD-Video, and is made up of one Video Manager (VMG) and multiple Video Title Sets (VTS). The VMG is composed of VMGI (Video Manager Information), VMGM_VOBS (Video Object Set for VMG Menu), and the backup VMGI(BUP).

The VMGI consists of control information for the entire DVD-Video zone, and comprises a single file named VIDEO_TS.IFO.
The VMGM_VOBS contains the content necessary for the title selection menu, and comprises a single file named VIDEO_TS.VOB.
The VMGI(BUP) is a complete copy of the VMGI, and comprises a single file named VIDEO_TS.BUP.
VMGM_VOBS may or may not exist, but the other two types of information are required.

Each VTS is composed of VTSI (Video Title Set Information), VTSM_VOBS (Video Object Set for the VTS Menu), VTSTT_VOBS (Video Object Set for Titles in a VTS), and the backup VTSI(BUP).
The VTSI is control information for the VTS, and comprises a single file named VTS_##_0.IFO.
The VTSM_VOBS contains the content for all types of menus within the VTS, and comprises a single file named VTS_##_0.VOB.
The VTSTT_VOBS contains the content needed for title playback, and comprises multiple files, named VTS_##_@.VOB.
The VTSI(BUP) is a complete copy of the VTSI, and comprises a single file named VTS_##_0.BUP.
VTSM_VOBS may or may not exist, but the other three types of information are required. In the file names above, ## represents a two-digit number between 01 and 99, and @ represents a single-digit number between 1 and 9.
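
The naming rules above can be checked mechanically. The following is a minimal sketch in Python, assuming only the file names and ranges stated in this section; the function name `classify` and the returned role labels are illustrative choices, not part of the DVD specification.

```python
import re

# Role labels are the abbreviations used in this guide.
_VMG_ROLES = {"IFO": "VMGI", "VOB": "VMGM_VOBS", "BUP": "VMGI(BUP)"}
_VTS_PATTERN = re.compile(r"^VTS_(\d{2})_(\d)\.(IFO|VOB|BUP)$")


def classify(name):
    """Return (role, vts_number) for a DVD-Video zone file name, or None."""
    name = name.upper()
    if name.startswith("VIDEO_TS."):
        role = _VMG_ROLES.get(name[len("VIDEO_TS."):])
        return (role, None) if role else None
    match = _VTS_PATTERN.match(name)
    if not match:
        return None
    vts, part, ext = int(match.group(1)), int(match.group(2)), match.group(3)
    if vts == 0:
        return None  # ## runs from 01 to 99
    if ext == "VOB":
        # Suffix 0 is the menu VOBS; suffixes 1 to 9 hold the title content.
        return ("VTSM_VOBS" if part == 0 else "VTSTT_VOBS", vts)
    if part != 0:
        return None  # IFO and BUP files exist only with suffix 0
    return ("VTSI" if ext == "IFO" else "VTSI(BUP)", vts)
```

For example, `classify("VTS_12_3.VOB")` reports title content belonging to VTS 12, while `classify("VTS_00_1.VOB")` is rejected because 00 is outside the allowed range.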

1.3 Presentation Data

Within the presentation data, video, audio, and sub-picture data are multiplexed with a portion of the navigation data in conformance with the MPEG-2 program stream specification. The pack and packet structures comply with this specification, and each pack is 2048 bytes long. The multiplex rate (mux_rate) is 10.08 Mbps.
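
As a rough illustration of what these two figures imply, the sketch below derives how many packs are delivered per second and how long one pack occupies the stream. It uses only the pack size and multiplex rate stated above; the function names are made up for this example.

```python
PACK_BYTES = 2048           # pack size stated above
MUX_RATE_BPS = 10_080_000   # 10.08 Mbps multiplex rate stated above


def packs_per_second():
    """Number of 2048-byte packs delivered per second at the multiplex rate."""
    return MUX_RATE_BPS / (PACK_BYTES * 8)


def pack_duration_ms():
    """Time one pack occupies in the multiplexed stream, in milliseconds."""
    return PACK_BYTES * 8 / MUX_RATE_BPS * 1000
```

At these values the stream carries about 615 packs per second, so each pack corresponds to roughly 1.6 ms of multiplex time.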

1.3.1 Video

data compression method   MPEG-2, MPEG-1
bit rate                  9.8 Mbps max. (MPEG-2), 1.856 Mbps max. (MPEG-1)
GOP size                  36 fields max.
screen display
  TV systems              525/60, 625/50
  aspect ratios           4:3, 16:9
  modes                   pan & scan, letterbox
user data                 closed captions

Video data exists as one stream of data compressed according to the MPEG-2 video format. The maximum bit rate is 9.8 Mbps, and the stream supports variable bit rate to provide high-quality video.
DVD-Video is compatible with both NTSC and PAL formats, and supports both 4:3 and 16:9 aspect ratios. To show 16:9 content on a 4:3 display, the title creator can specify either "pan & scan" (cutting off a portion of the image) or "letterbox" (showing the entire image with black bands at the top and bottom of the screen).
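
The geometry behind the two modes can be worked out with simple ratios. The sketch below is an idealized calculation, assuming a 16:9 source shown on a 4:3 display and ignoring how any particular player scales or filters the picture; the function names are illustrative.

```python
from fractions import Fraction

SOURCE = Fraction(16, 9)   # aspect ratio of the content
DISPLAY = Fraction(4, 3)   # aspect ratio of the screen


def letterbox(display_lines):
    """Return (active_lines, bar_lines) for the image and for each black band."""
    active = int(display_lines * DISPLAY / SOURCE)
    bar = (display_lines - active) // 2
    return active, bar


def pan_scan_width(source_width):
    """Return how many source pixels per line remain visible after cropping."""
    return int(source_width * DISPLAY / SOURCE)
```

With 480 display lines (525/60), letterboxing leaves 360 lines of picture with a 60-line band above and below; with 576 lines (625/50), it leaves 432 lines with 72-line bands. Pan & scan keeps 540 of the 720 pixels in each line.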

1.3.2 Audio
                          Linear PCM          Dolby Digital*   MPEG Audio
Fs (sampling frequency)   48 kHz, 96 kHz      48 kHz           48 kHz
Qb (quantization bits)    16 / 20 / 24 bits   Compressed       Compressed
Bit rate (per stream)     6.144 Mbps max.     448 kbps max.    912 kbps max.
* Trademark of Dolby Laboratories Licensing Corporation

Three audio formats are allowed by the DVD specification: linear PCM, Dolby Digital, and MPEG audio. Each title can have up to eight audio streams. The streams are distinguished by attributes such as language. Each stream can carry multiple channels. For instance, the Dolby Digital format supports 5.1 channels.

When using the linear PCM format, DVD audio can support a sampling rate of up to 96 kHz with up to 24 bits per sample, providing audio quality which far surpasses that of CDs. For Dolby Digital or MPEG audio, the sampling rate is 48 kHz. MPEG audio supports MPEG-2 audio with multi-channel capability.
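
For linear PCM, the bit rate of a stream is the product of sampling rate, bits per sample, and channel count, so the 6.144 Mbps ceiling in the table limits which combinations fit. The sketch below works this out from the table values only; the specification may impose further limits that this arithmetic does not capture, and the function names are illustrative.

```python
MAX_PCM_BPS = 6_144_000   # per-stream maximum for linear PCM, from the table


def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def max_channels(sample_rate_hz, bits_per_sample):
    """Largest channel count that fits the per-stream limit by bit rate alone."""
    return MAX_PCM_BPS // (sample_rate_hz * bits_per_sample)
```

By this arithmetic, 48 kHz at 16 bits fits eight channels exactly, while 96 kHz at 24 bits fits two (4.608 Mbps). For comparison, CD audio at 44.1 kHz, 16 bits, and two channels comes to about 1.41 Mbps.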

1.3.3 Sub-picture

data format               run-length encoding, two bits per pixel
data size per picture     52 kB max.
resolution                720x480 (525/60), 720x576 (625/50)
display color             16 colors (specified per PGC)
display control           change pixel contrast and color
                          change display area (move)
                          change display data (scroll up/down)
                          force display

Sub-picture data is a concept specific to DVD. It carries content such as subtitles, menus, and karaoke lyrics, which is overlaid as a bitmap onto the main video content. This data is compressed using run-length encoding. Up to 32 streams of sub-picture data can exist for each title. Sub-picture streams are distinguished by attributes such as language.

Sub-picture data can be displayed in up to 16 different colors. For applications such as subtitles, the user controls the display of sub-picture data. DVD also supports the forcing of sub-picture data display, for example if the title creator wants to force a menu to be displayed at a particular point in the content stream.
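
To show why run-length encoding suits this kind of bitmap, here is a generic sketch of the technique applied to rows of two-bit pixel values. It is not the actual DVD sub-picture bitstream layout, which this guide does not describe; the `(run_length, value)` pair representation and the function names are assumptions made for illustration.

```python
def rle_encode(row):
    """Collapse a row of 2-bit pixel values (0 to 3) into (run_length, value) pairs."""
    runs = []
    for pixel in row:
        if not 0 <= pixel <= 3:
            raise ValueError("pixel values must fit in two bits")
        if runs and runs[-1][1] == pixel:
            runs[-1] = (runs[-1][0] + 1, pixel)   # extend the current run
        else:
            runs.append((1, pixel))               # start a new run
    return runs


def rle_decode(runs):
    """Expand (run_length, value) pairs back into a row of pixel values."""
    row = []
    for length, value in runs:
        row.extend([value] * length)
    return row


def raw_bytes(width, height, bits_per_pixel=2):
    """Size of an uncompressed bitmap at the given depth, in bytes."""
    return width * height * bits_per_pixel // 8
```

A subtitle row is mostly background, so a 720-pixel line often collapses to a handful of runs. Uncompressed, a full 720x480 bitmap at two bits per pixel takes 86,400 bytes, which already exceeds the 52 kB per-picture limit in the table, so compression is needed for a full-screen picture.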

1.4 Navigation Data

1.4.1 Cells and PGCs




A cell is a unit of playback of real-time data. Each cell is identified by a fixed ID number. A Program Chain (PGC) defines the order in which cells are played back, numbering the cells in that order. A title consists of one or more linked PGCs.

In a simple case such as a movie, where one title consists of one PGC, the cells recorded on the disc are played back in order, so the cell numbers and cell ID numbers are the same. If multiple titles with different stories in a title set are each defined by their own PGC, then each PGC specifies the cells to be played for that title and the order in which they are played, and the cell numbers and cell ID numbers will not be the same. In this way, the DVD specification defines PGCs and cells to allow the order and time relationship of the real-time data playback to be essentially arbitrary. This structure can be used to provide playback options such as parental level selection, angle selection, and story selection.

Each PGC may also contain a pre-command, which is executed before playing back the first cell, and a post-command, which is executed after playing back the last cell. The PGC may also contain button or cell commands, which can be executed each time a cell is played. Through these commands and user operation, one PGC can branch into multiple PGCs, multiple PGCs can branch into the same PGC, and so on, making many types of interactive playback possible.
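
The difference between cell numbers and cell ID numbers is easiest to see with a small model. The sketch below is a hypothetical data structure, not the on-disc PGC layout: a PGC is reduced to an ordered list of cell IDs, and the command fields hold placeholder text rather than real DVD command syntax.

```python
from dataclasses import dataclass


@dataclass
class ProgramChain:
    """Hypothetical model of a PGC: an ordered list of cell ID numbers."""
    cell_ids: list
    pre_command: str = ""    # placeholder text, not real DVD command syntax
    post_command: str = ""   # placeholder text, not real DVD command syntax

    def playback_order(self):
        """Return (cell_number, cell_id) pairs; cell numbers count from 1."""
        return list(enumerate(self.cell_ids, start=1))

    def numbers_match_ids(self):
        """True when every cell number equals the ID of the cell it plays."""
        return all(number == cell_id for number, cell_id in self.playback_order())


# A simple movie: one PGC plays the recorded cells in order.
movie = ProgramChain(cell_ids=[1, 2, 3, 4])

# Another title in the same title set reuses some cells in a different order.
alternate = ProgramChain(cell_ids=[1, 4, 2], pre_command="(set up title)")
```

Here `movie.numbers_match_ids()` is true, while `alternate` plays cell ID 4 as its cell number 2, so its numbers and IDs diverge.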

1.4.2 Programs and PTTs


A sequence of one or more cells with consecutive numbers within a PGC can be defined as a program. Programs may be used as units of playback for random or shuffle playback, or to be accessed via commands.
Further, a sequence of one or more programs with consecutive numbers within a PGC can be defined as a PTT (Part of Title). PTTs correspond to chapters, and are one unit of access provided to the user.
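
The two levels of grouping can be sketched as follows. This example assumes each group is described by the number of its first member, which is a convenient representation chosen for illustration and not necessarily how the disc stores it; the function names are likewise made up.

```python
def group_consecutive(count, starts):
    """Split items numbered 1..count into consecutive groups.

    starts lists the first item number of each group and must begin with 1.
    """
    starts = list(starts)
    increasing = all(a < b for a, b in zip(starts, starts[1:]))
    if not starts or starts[0] != 1 or not increasing or starts[-1] > count:
        raise ValueError("starts must begin at 1, increase, and stay within count")
    ends = starts[1:] + [count + 1]
    return [list(range(first, end)) for first, end in zip(starts, ends)]


def cells_of_ptt(ptt_number, programs, ptts):
    """Return the cell numbers covered by a PTT (a chapter)."""
    return [cell for program in ptts[ptt_number - 1] for cell in programs[program - 1]]


# Seven cells grouped into four programs, which are grouped into two PTTs.
programs = group_consecutive(7, [1, 3, 4, 6])
ptts = group_consecutive(len(programs), [1, 3])
```

In this layout, chapter 2 consists of programs 3 and 4, which cover cells 4 through 7.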

1.4.3 PCI and DSI


A cell consists of one or more Video Object Units (VOBU). Each VOBU represents 0.4 seconds to 1 second of playback time. Each VOBU begins with a Navigation Pack (NV_PCK), which is followed by several Group of Pictures (GOP) structures containing video, audio, sub-picture, and other data in a packetized, time-division multiplexed fashion. However, a VOBU is not required to contain any data other than the NV_PCK, and thus the content within a VOBU may be shorter than the playback time of the VOBU itself. Further, the number of frames per GOP is not fixed, and if a GOP ends with an MPEG sequence end code, playback pauses on the last frame of that GOP. This makes it possible to include still frames displayed for an arbitrary length of time at arbitrary points within video playback. Audio information may also be added to such sequences.
The NV_PCK is comprised of two packets, called Presentation Control Information (PCI) and Data Search Information (DSI).
To support variable-rate playback and seamless playback, DVD players have a large memory between the pickup and the decoder, called a track buffer. As a result, there is a time delay between the signal being read by the pickup and the video and audio being decoded and played. Therefore, real-time control information is divided between the PCI and DSI packets, and the player checks and uses that information after (PCI) and before (DSI) the cell passes through the track buffer.
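
Because every VOBU starts with exactly one NV_PCK, the stated VOBU length range bounds how many navigation packs a stretch of content contains. The sketch below takes the 0.4 to 1 second range above at face value and works in integer milliseconds to avoid rounding surprises; the function name is illustrative.

```python
def vobu_count_range(duration_ms, min_vobu_ms=400, max_vobu_ms=1000):
    """Return (fewest, most) VOBUs that can cover the given playback time."""
    fewest = -(-duration_ms // max_vobu_ms)   # ceiling division
    most = duration_ms // min_vobu_ms
    return fewest, most
```

Under this assumption, a two-hour title (7,200,000 ms) contains between 7,200 and 18,000 VOBUs, and therefore the same number of NV_PCKs.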

2008-04-14