H.261, Picturetel, Compression Labs, and Indeo are basically videoconferencing protocols. They are generally useful for remote meetings but do not deliver broadcast quality. The vendor protocols do not interoperate directly; instead, vendors use H.261 as a least common denominator to interoperate with each other's equipment.
The advantage of progressive scanning is that it is relatively simple to compress a single frame. For any pixel at a particular location, there is a high probability that all eight contiguous pixels have the same value. This information is used when compressing a single frame (spatial compression).
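The neighbor-similarity argument can be made concrete with a small sketch. The Python example below is not how JPEG or MPEG actually code a frame. It is only a run-length encoding of a single scan line, using a made-up line of pixel values, to show why runs of identical neighboring pixels compress well.

```python
def run_length_encode(line):
    """Collapse runs of identical neighboring pixels into [value, count] pairs."""
    runs = []
    for pixel in line:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1          # same value as the previous pixel: extend the run
        else:
            runs.append([pixel, 1])   # value changed: start a new run
    return runs


# Hypothetical 16-pixel scan line (values are illustrative, not from any real image).
scan_line = [12] * 6 + [200] * 7 + [12] * 3
runs = run_length_encode(scan_line)
print(len(scan_line), "pixels ->", len(runs), "runs:", runs)
# 16 pixels -> 3 runs: [[12, 6], [200, 7], [12, 3]]
```

The more often adjacent pixels agree, the fewer runs are needed to describe the line.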
Progressive scanning is also used for videoconferencing, computer monitors, and motion pictures. MPEG-1 was designed for use by progressively scanned media, such as CD-ROM.
The problem with interlacing is that it is more difficult to compress spatially. For any pixel at a particular position, only the pixels before and after it on the same line are displayed within the same scan of 1/30th of a second. The other six pixels will be displayed 1/30th of a second later. So normal spatial compression algorithms are a bit more complicated, although not impossibly so.
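The effect on neighboring pixels can be illustrated with a short sketch. It assumes a frame is simply a list of scan lines and that alternate lines belong to alternate scans. The function names and the tiny 6-line frame are hypothetical and chosen only for illustration.

```python
def split_fields(frame):
    """Split a frame (a list of scan lines) into the two sets of alternating lines."""
    top_field = frame[0::2]      # lines 0, 2, 4, ...
    bottom_field = frame[1::2]   # lines 1, 3, 5, ...
    return top_field, bottom_field


def weave(top_field, bottom_field):
    """Re-interleave the two fields into a full frame."""
    frame = []
    for i in range(len(top_field) + len(bottom_field)):
        source = top_field if i % 2 == 0 else bottom_field
        frame.append(source[i // 2])
    return frame


# Hypothetical 6-line frame; every pixel on a line holds that line's number.
frame = [[row] * 4 for row in range(6)]
top, bottom = split_fields(frame)
print([line[0] for line in top])     # [0, 2, 4]
print([line[0] for line in bottom])  # [1, 3, 5]
```

Within one field, the line directly above or below a pixel is two frame lines away, so the vertical neighbors that spatial compression would like to exploit arrive in a different scan.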
MPEG-2 is designed, among other things, for compression of interlaced displays. MPEG-1 is not suitable for television broadcast.
Industry views about HDTV have changed recently. It is no longer viewed as an option for consumers. The television sets would be too expensive for too long, viewers cannot differentiate the quality on anything less than a 50-inch monitor, and programmers would rather sell more channels than better quality. If HDTV is used at all, it will be in commercial applications where large screens are required, such as air traffic control, network management monitors, baseball/football stadiums, and the like.
MPEG-2 was chosen as the encoding, compression, and transmission format for HDTV, in part because of its multiplexing and encryption features.
More on MPEG
It is possible to display a video with a sequence of JPEG pictures at 24 pictures per second (the rate used in motion pictures) or 30 frames per second (the rate used in U.S. broadcast television). However, this would not provide optimal compression. At these rates, any picture is very much like the picture either immediately preceding or immediately following it. This information should be used to transmit (and store) fewer bits. MPEG provides this interframe compression, called temporal compression.
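As a rough illustration of why consecutive pictures need fewer bits, the sketch below sends only the pixels that changed since the previous frame. This is a deliberate simplification: real MPEG encoders predict blocks of pixels with motion compensation rather than comparing individual pixels, and the frames and function names here are invented for the example.

```python
def frame_delta(previous, current):
    """Return (index, new_value) pairs for the pixels that differ between two frames."""
    return [(i, new) for i, (old, new) in enumerate(zip(previous, current)) if old != new]


def apply_delta(previous, delta):
    """Rebuild the current frame from the previous frame plus the delta."""
    frame = list(previous)
    for i, value in delta:
        frame[i] = value
    return frame


# Two hypothetical 20-pixel frames that differ in only two positions.
frame_a = [10] * 20
frame_b = list(frame_a)
frame_b[7] = 99
frame_b[8] = 98

delta = frame_delta(frame_a, frame_b)
print(delta)  # [(7, 99), (8, 98)]
```

Two changed pixels are transmitted instead of twenty, and the receiver can still reconstruct the second frame exactly from the first.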
To achieve temporal compression, some frames are computed from other frames. The technique is to define three different kinds of frames. First, there are Intraframes, or I frames. These are much like fully coded JPEG pictures. Next, there are Predicted frames, or P frames. These are predicted from I frames or other P frames. Finally, there are Bidirectional frames, or B frames. B frames are interpolated from I and/or P frames.
The process is as follows. The encoder sends an I frame. Then a P frame is sent, perhaps 100 ms later. The time interval is set by configuration. The decoder cannot display the two pictures consecutively, because a 100-ms gap would not provide a smooth picture. So the pictures in between are computed (interpolated) from the two. The sequence of frames in a video may be similar to the following:
------------------------------
Time (ms)    Frame
==============================
0            I
30           B
60           B
90           P
120          B
150          B
180          P
210          I
Repeat ...
------------------------------

This example is for illustration purposes only. By convention, I frames are sent roughly every 400 ms. Also by convention, there are generally 10 to 12 frames between I frames. The mix of B frames and P frames is variable. Some users have elected not to use B frames at all but to use more P frames instead.
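The timing figures above can be checked with a little arithmetic. The sketch below regenerates the example schedule from a repeating pattern of frame types and computes the spacing between I frames. The pattern strings and function names are assumptions made for illustration; the 30-ms step is the rounded frame interval used in the example table.

```python
def frame_schedule(pattern, frame_interval_ms, repeats=1):
    """Return (time_ms, frame_type) pairs for a repeating pattern of frame types."""
    schedule = []
    for n in range(len(pattern) * repeats):
        schedule.append((n * frame_interval_ms, pattern[n % len(pattern)]))
    return schedule


def i_frame_interval_ms(pattern, frame_interval_ms):
    """Time between successive I frames for a pattern holding exactly one I frame."""
    if pattern.count("I") != 1:
        raise ValueError("pattern must contain exactly one I frame")
    return len(pattern) * frame_interval_ms


# The illustrative sequence from the table: I B B P B B P, then the next I frame.
for time_ms, frame_type in frame_schedule("IBBPBBP", 30):
    print(time_ms, frame_type)
print(i_frame_interval_ms("IBBPBBP", 30))  # 210

# A hypothetical 12-frame group at 30 frames per second.
print(round(i_frame_interval_ms("IBBPBBPBBPBB", 1000 / 30)))  # 400
```

A group of 12 frames at 30 frames per second spans 400 ms, which agrees with the convention of sending an I frame roughly every 400 ms.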
B frames tend to make pictures smoother on playback while consuming less bandwidth. The problem is that they force the decoder to buffer P frames and compute B frames. This requirement increases decoder costs, which is a particular problem for cable TV set top manufacturers. General Instrument's Digicipher protocol has specifically excluded B frames in an effort to keep costs down.
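The buffering requirement follows from the order in which frames must arrive. A B frame cannot be computed until the later I or P frame it depends on has been received, so that frame has to be sent ahead of the B frames and held by the decoder. The sketch below shows one way to derive such a sending order. It assumes each B frame depends on the nearest I or P frame on either side, and the frame labels (type plus display position) are hypothetical.

```python
def transmission_order(display_order):
    """Reorder frames so every I or P frame precedes the B frames that depend on it."""
    sent = []
    pending_b = []
    for frame in display_order:
        if frame[0] == "B":
            pending_b.append(frame)   # hold back until the next I or P frame is sent
        else:
            sent.append(frame)        # send the I or P frame first ...
            sent.extend(pending_b)    # ... then the B frames displayed before it
            pending_b = []
    sent.extend(pending_b)            # trailing B frames with no later I or P frame
    return sent


display = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]
print(transmission_order(display))
# ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']
```

In this ordering the decoder must keep P3 in memory while it computes and displays B1 and B2, which is the extra buffering that raises decoder cost. A stream with no B frames needs no reordering at all.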
This architecture also has implications for networking. I frames anchor picture quality, because ultimately P and B frames are derived from them. Therefore, it is important that I frames be transmitted with higher reliability than P or B frames. Thus, when transmitting MPEG frames over ATM or Frame Relay, it is advisable that I frames be given priority.
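One way to picture this prioritization is a sender or switch that sheds the least important frames first when capacity is short. The sketch below is a toy model, not an ATM or Frame Relay implementation. The text only says that I frames should be favored; the assumption that B frames are dropped before P frames (because no other frame is derived from a B frame) is an addition made for this example. In practice, mechanisms such as the ATM cell loss priority bit or the Frame Relay discard eligibility bit could carry a marking of this kind.

```python
def shed_load(frames, capacity):
    """Keep at most `capacity` frames, dropping B, then P, then I frames; order is preserved."""
    drop_rank = {"B": 0, "P": 1, "I": 2}   # lower rank is dropped first (assumed policy)
    by_importance = sorted(
        range(len(frames)),
        key=lambda i: (-drop_rank[frames[i]], i),
    )
    keep = sorted(by_importance[:capacity])
    return [frames[i] for i in keep]


group = list("IBBPBBP")
print(shed_load(group, 5))  # ['I', 'B', 'B', 'P', 'P']
print(shed_load(group, 3))  # ['I', 'P', 'P']
print(shed_load(group, 1))  # ['I']
```

Under this policy the I frame survives until nothing else is left to drop, so the frames that anchor picture quality are the last to be lost.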
Video encoders should react to network congestion by dynamically altering the mix of I and B frames.
MPEG Types
There are three types of MPEG: numbers 1, 2, and 4. They address different issues. MPEG-1 video is optimized for T1/E1 speeds, single programs in a stream, and progressive scanning. MPEG-1 audio provides CD-ROM-quality stereo sound.
MPEG-2 is enhanced to handle HDTV. It supports higher speeds, multiple programs in a single stream, and interlaced as well as progressive images. MPEG-2 multiplexing provides data transmission, which may be necessary for home shopping. MPEG-2 audio supports MPEG-1 and has options for lower-quality sound, such as secondary audio channels for television broadcast.
MPEG-4 is designed for DS0 audio/video, such as MIME messages. MPEG-4 work is still in progress.
MPEG-2 is rapidly assuming a centerpiece role in broadband networking. It is backward compatible with MPEG-1, which means that MPEG-2 decoders can display MPEG-1 encoded files. It has full functionality for video on demand, television broadcast, and Mosaic-type data services. MPEG-2 chips exist that permit real-time encoding, and there is a specification for MPEG-2 adaptation over ATM AAL5.