
The Quality-Affecting Parameters

To generate the distorted video sequences, we used a tool that encodes a real-time video stream into H263 format [69], simulates the packetization of the video stream for transport over an IP network, decodes the received stream and lets us control the simulated loss model (see the previous Section). We used a standard video sequence called stefan, which is commonly used to test the performance of H26x and MPEG4 codecs. It contains 300 frames encoded at 30 frames per second and lasts 10 seconds. The encoded sequence format is CIF (352 pixels x 288 lines). The maximum allowed packet length is 536 bytes, in order to avoid the fragmentation of packets between routers. We present here the quality-affecting parameters that we consider as having the highest impact on quality:

The Bit Rate (BR): this is the actual output rate of the encoder. It takes one of four values: 256, 512, 768 and 1024 Kbytes/sec. It is known that not all scenes compress at the same ratio for the same video quality: the ratio depends on the amount of spatial and temporal redundancy in the scene, as well as on the image dimensions. All video encoders use a mixture of lossless and lossy compression techniques. Lossless compression does not degrade the quality, as the process is reversible. Lossy compression degrades the quality to a degree that depends on the required compression ratio, which is set by changing the quantization parameter applied to the Discrete Cosine Transform (DCT) coefficients. For details about video encoding, see Section 3.4. If the video is encoded using only lossless compression, the decoded video will have the same quality as the original one, provided that there is no other quality degradation. In our method, we normalize the encoder's output in the following way. If BRmax denotes the bit rate after the lossless compression process and BRout is the final output bit rate of the encoder, we select the scaled parameter BR = BRout/BRmax. In our environment, we have BRmax = 1430 Kbytes/sec.
The Frame Rate (FR), or the number of frames per second: the original video sequence is encoded at 30 frames per second. This parameter takes one of four values (5, 10, 15 and 30 fps). The lower rates are obtained in the encoder by dropping frames uniformly. A complete study of the effect of this parameter on quality is given in [53].
The ratio of the encoded intra macro-blocks to inter macro-blocks (RA): this ratio is set by the encoder, which changes the refresh rate of the intra macro-blocks in order to make the encoded sequence more or less sensitive to packet loss [84]. This parameter takes values that vary between 0.053 and 0.4417, depending on the BR and the FR for the given sequence. We selected five values for it.
The packet loss rate (LR): the simulator can drop packets randomly and uniformly to satisfy a given percentage loss. This parameter takes five values (0, 1, 2, 4, and 8 %). It is generally accepted that a loss rate higher than 8 % will drastically reduce video quality. In networks where the LR is expected to be higher than this value, some kind of forward error correction (FEC) [131] should be used to reduce the effect of losses. There are many studies analyzing the impact of this parameter on quality; see for example [4,53,58].
The number of consecutively lost packets (CLP): we chose to drop packets in bursts of 1 to 5 packets. These values come from real measurements that we performed previously [92]. A study of the effect of this parameter on quality can be found, for instance, in [58].
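As an illustration of how the LR and CLP parameters above could interact, here is a minimal sketch of a bursty loss generator. It is not the simulator used in this work: the function name `burst_loss_mask`, the way bursts are placed, and the rounding of the number of bursts are all assumptions made for the example.

```python
import random

def burst_loss_mask(n_packets, loss_rate, burst_len, seed=None):
    """Return a list of booleans, True where a packet is dropped.

    Hypothetical helper: drops packets in non-overlapping bursts of
    `burst_len` (the CLP value) so that the overall fraction of dropped
    packets is close to `loss_rate` (the LR value, as a fraction).
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be in [0, 1]")
    if burst_len < 1:
        raise ValueError("burst_len must be >= 1")
    rng = random.Random(seed)

    # Number of bursts needed to approach the target loss rate,
    # capped so that all bursts fit in the sequence.
    n_bursts = min(round(n_packets * loss_rate / burst_len),
                   n_packets // burst_len)

    # Pick distinct slots in a shortened range, then shift the i-th
    # burst by i * (burst_len - 1) so that bursts never overlap.
    free = n_packets - n_bursts * (burst_len - 1)
    slots = sorted(rng.sample(range(free), n_bursts))

    mask = [False] * n_packets
    for i, slot in enumerate(slots):
        start = slot + i * (burst_len - 1)
        for k in range(start, start + burst_len):
            mask[k] = True
    return mask

# Example: 8 % loss in bursts of 4 over 1000 packets.
mask = burst_loss_mask(1000, 0.08, 4, seed=1)
print(sum(mask) / len(mask))
```

In this sketch two bursts may land next to each other and form a longer run of losses; a stricter generator would enforce a gap of at least one received packet between bursts.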

The delay and the delay variation are considered indirectly: they are included in the LR parameter. Indeed, if a dejittering mechanism with an adaptive playback buffer size is used, all the packets arriving after a predefined threshold are considered as lost. In this way, all delays and delay variations are mapped into loss. The receiver holds the first packet and some of the following packets in the buffer for a while before playing them out. Many studies seek the optimal buffer size, the number of packets to be held in the dejittering buffer and the time at which the first packet in the buffer is to be played out [37]. The hold time is a measure of the size of the jitter buffer. In [159], the authors argue that the effect of delay and delay jitter can be reduced by the MPEG decoder buffer at the receiver. This is also valid for the H263 decoder buffer. A dejittering scheme for the transport of video over ATM is given in [122], whose author argues that the scheme can easily be adapted to IP networks. Another study of jitter is given in [162], where it is stated that jitter affects the decoder buffer size and the loss ratio in a significant way.

If we choose to consider all the combinations of these parameters' values, we have to take into account 4x4x5x5x5 = 2000 different combinations. Rather than evaluating all of them, we rely on the NN to interpolate the quality scores for the missing parts of this potentially large input space. We followed the procedure given in Section 4.3 to find the minimum number of combinations to use. Hence, we chose to give default values to the parameters and to compose different combinations by changing only two parameters at a time. This led to 94 combinations (as shown in Table 6.2). We will see in the next Chapter that this number is sufficient to train and test the NN.
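The mapping of delay and delay variation into loss, described above, can be sketched as follows. This is a simplified illustration with a fixed hold time rather than an adaptive buffer; the function name `effective_loss_rate` and the timing values are assumptions made for the example.

```python
def effective_loss_rate(send_times, arrival_times, hold_time):
    """Fraction of packets lost once late arrivals are counted as losses.

    Hypothetical helper. arrival_times[i] is None when the network
    dropped packet i. Playout starts `hold_time` after the first packet
    that actually arrives, and each later packet must arrive before its
    own playout instant. All times share one unit (for example ms).
    """
    if not send_times or len(send_times) != len(arrival_times):
        raise ValueError("need two non-empty lists of equal length")

    # Index of the first packet that reached the receiver.
    first = next((i for i, a in enumerate(arrival_times) if a is not None),
                 None)
    if first is None:
        return 1.0  # nothing arrived at all

    playout_start = arrival_times[first] + hold_time
    lost = 0
    for sent, arrived in zip(send_times, arrival_times):
        # Playout instant of this packet, relative to the first one.
        deadline = playout_start + (sent - send_times[first])
        if arrived is None or arrived > deadline:
            lost += 1
    return lost / len(send_times)

# Five packets sent every 20 ms; the third one is dropped by the network.
send = [0, 20, 40, 60, 80]
arrival = [50, 75, None, 140, 131]
print(effective_loss_rate(send, arrival, hold_time=30))
print(effective_loss_rate(send, arrival, hold_time=0))
```

A longer hold time turns fewer late packets into losses, at the cost of a larger playout delay, which is the trade-off studied in the buffer-sizing literature cited above.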
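The size of the input space discussed above can be checked with a short enumeration. This sketch uses the parameter values listed in this section; since RA is given only as a range, its five levels are represented by placeholder indices, and the default values are hypothetical. It does not implement the procedure of Section 4.3 and is not meant to reproduce the 94 combinations of Table 6.2.

```python
from itertools import combinations, product

# Values from the text. RA levels are placeholder indices (assumption),
# because only the range 0.053 to 0.4417 is given.
levels = {
    "BR": [256, 512, 768, 1024],   # encoder output rate
    "FR": [5, 10, 15, 30],         # frames per second
    "RA": [0, 1, 2, 3, 4],         # five intra/inter ratio levels
    "LR": [0, 1, 2, 4, 8],         # packet loss rate in percent
    "CLP": [1, 2, 3, 4, 5],        # consecutively lost packets
}
BR_MAX = 1430.0  # bit rate after lossless compression, from the text

def normalized_br(br_out):
    """Scaled parameter BR = BRout / BRmax."""
    return br_out / BR_MAX

# Full factorial design: every value of every parameter.
full_grid = list(product(*levels.values()))

# Hypothetical default values, chosen only for illustration.
defaults = {"BR": 768, "FR": 30, "RA": 2, "LR": 0, "CLP": 1}

def two_at_a_time(levels, defaults):
    """All points that differ from the defaults in at most two parameters."""
    names = list(levels)
    points = set()
    for a, b in combinations(names, 2):
        for va, vb in product(levels[a], levels[b]):
            point = dict(defaults)
            point[a], point[b] = va, vb
            points.add(tuple(point[n] for n in names))
    return points

subset = two_at_a_time(levels, defaults)
print(len(full_grid))   # 2000
print(len(subset))
print([round(normalized_br(b), 3) for b in levels["BR"]])
```

Under these assumptions the naive two-at-a-time enumeration still yields 148 points, so the selection in Table 6.2 clearly involves further choices that this sketch does not model.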


Samir Mohamed 2003-01-08