1. Introduction
Objective video quality assessment has become essential, and it increasingly draws on the characteristics of the human visual system (HVS). Features of the HVS such as contrast sensitivity, orientation sensitivity, spatial and temporal masking effects, color perception, and frequency selectivity are candidates for incorporation into real-time metrics. Incorporating these aspects is computationally very complex, yet a metric that correlates well with human perception is useful for a wide range of real-time applications. The visibility of impairments depends on the spatial and temporal properties of the visual content; however, modeling spatial and temporal complexity within a full HVS framework remains an expensive and time-consuming approach.
2. Reference Works of ITU Recommendations
Our data-screening methodology is limited to visual content and excludes audio, because the test video sequences are analyzed at the frame or macroblock level, where the complexity of the motion content is essential and audiovisual interaction is negligible. The results of a subjective assessment depend largely on factors such as the selection of test video sequences and a well-defined evaluation procedure. In our research work, we carefully followed the specifications of Recommendation ITU-R BT.500-12 [1] and the VQEG hybrid test plan, which are explained briefly in the following sections.
Test Video Sequences
Shahid et al. [2] considered six different video sequences of CIF and QCIF spatial resolution, selected in raw progressive format on the basis of their motion content and covering various levels of spatial-temporal complexity, as recommended by ITU-T P.910.
Specifications of Data Screening Methodology
Human observers participated in a laboratory viewing environment as specified by ITU-R BT.500-12. Of the single-stimulus methods, which stand alongside stimulus-comparison quality evaluation, the Single Stimulus Continuous Quality Evaluation (SSCQE) procedure was used.
3. Estimation of Motion Dynamics while Considering User Experience (UX)
Unidentified errors at the decoder side were taken into account because missing motion vectors in reconstructed frames increase the computational complexity of motion-vector processing; these errors are not due to poor coding or compression, nor to delay. Amitesh et al. [3] proposed an approach to identifying the tradeoff between image compression and quality estimation. Under their assumptions, if the B-frame size is less than a predetermined threshold, the motion-intensity feature, traced through user experience, must not be considered; a decision tree therefore decides, by conditional acceptance, whether or not to include the motion-intensity feature among the candidate features.
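As an illustration only (not the authors' implementation), the conditional-acceptance rule described above can be sketched as follows; the threshold value and feature names are hypothetical:

```python
def select_features(b_frame_bytes, candidate_features, threshold=1024):
    """Decision-tree branch: drop the motion-intensity feature when the
    B-frame size falls below a predetermined threshold (hypothetical
    value here); otherwise keep all candidate features."""
    if b_frame_bytes < threshold:
        return [f for f in candidate_features if f != "motion_intensity"]
    return list(candidate_features)
```

For example, with candidate features `["motion_intensity", "si", "ti"]`, a 512-byte B-frame would yield only `["si", "ti"]`, while a 4096-byte B-frame keeps all three.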
4. Translations of Recurrent Neural Networks
Measuring spatial-temporal information is essential, because the quality of a transmitted video sequence depends strongly on it. The formulations for quantifying the spatial and temporal perceptual information of the test sequences are given in the following sections.
Spatial Information
SI is calculated on the luminance plane of each frame of the video sequence, yielding the spatial information over n frames, and it is expressed as

SI = max_time { std_space [ Sobel(F_n) ] }

where F_n is the frame at time n filtered with the Sobel operator, std_space is the standard deviation over the pixels of the filtered frame, and max_time is the maximum value over the time series.
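The SI definition above can be sketched in pure Python; this is a minimal illustration of the ITU-T P.910 formulation, not the code used in this study (the convolution and standard-deviation helpers are our own):

```python
import math

# 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def _conv_at(img, i, j, k):
    # 3x3 correlation of the kernel k at position (i, j).
    return sum(img[i + a][j + b] * k[a][b] for a in range(3) for b in range(3))

def _std(values):
    # Population standard deviation.
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def spatial_information(frames):
    """ITU-T P.910 SI: for each luminance frame, filter with the Sobel
    operator, take the standard deviation over pixels, then take the
    maximum over the time series."""
    per_frame = []
    for f in frames:
        h, w = len(f), len(f[0])
        mags = [math.hypot(_conv_at(f, i, j, SOBEL_X),
                           _conv_at(f, i, j, SOBEL_Y))
                for i in range(h - 2) for j in range(w - 2)]
        per_frame.append(_std(mags))
    return max(per_frame)
```

A flat frame has zero gradient everywhere and contributes SI = 0, while a frame containing an intensity edge produces a positive SI.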
Temporal Information
TI measures the amount of temporal change of a video sequence on the luminance plane, and the temporal perceptual information is computed as

TI = max_time { std_space [ M_n(i, j) ] }

where M_n(i, j) is the motion difference between pixel values at the same spatial location in sequential frames on the luminance plane. As a function of the frame index n, M_n(i, j) is expressed as

M_n(i, j) = F_n(i, j) - F_{n-1}(i, j)

where F_n(i, j) is the pixel at the ith row and jth column of the nth frame of the video sequence and F_{n-1}(i, j) is the pixel at the ith row and jth column of frame (n - 1). Higher motion content in successive frames leads to higher temporal information. The scatter plot below shows the spatial and temporal perceptual content obtained from the above equations, computed on the luminance plane for all CIF and QCIF test sequences.
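The TI definition above admits an equally short sketch (again a minimal illustration of the ITU-T P.910 formulation, not the study's code):

```python
import math

def temporal_information(frames):
    """ITU-T P.910 TI: the standard deviation over space of the
    inter-frame difference M_n(i, j) = F_n(i, j) - F_{n-1}(i, j),
    maximized over the time series."""
    def std(values):
        m = sum(values) / len(values)
        return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

    ti = []
    for prev, cur in zip(frames, frames[1:]):
        diffs = [cur[i][j] - prev[i][j]
                 for i in range(len(cur)) for j in range(len(cur[0]))]
        ti.append(std(diffs))
    return max(ti)
```

Two identical frames give TI = 0, while any change between successive frames yields a positive TI, matching the observation that higher motion content leads to higher temporal information.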
Figure 1.
Spatial and Temporal information computed for luminance component of selected CIF and QCIF videos.
4.1. Principles Based on Translations
Structural Information and Motion Content: Among the features available in the structural information of the bit-stream data, the motion vector plays an essential role in quantifying dedicated features such as motion intensity; moreover, motion-vector complexity is quite high at the macroblock layer, as mentioned in [4].
Coding Distortion: The effectiveness of detecting errors in transmitted data caused by signal interruption in a channel rests entirely on coding theory; Amitesh et al. [5] explain rate-distortion control and the relevant information theory in detail.
5. Confidence Interval of Observations and Consistency Based on Decision Making Tree
The box plot in Figure 2 shows the inconsistency between the outcomes of the two decision-tree branches; it indicates a large difference in prediction among observers when they are compared on the basis of consistency.
Figure 2.
99 % Confidence Interval between observations and consistency for Two Decisions
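As a minimal sketch (not the screening code used in the study), a 99% confidence interval over raw observer scores can be computed with the normal approximation:

```python
import math

def confidence_interval_99(scores):
    """99% confidence interval for the mean opinion score, using the
    normal approximation (z = 2.576), in the spirit of the ITU-R BT.500
    screening procedure."""
    n = len(scores)
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)  # sample variance
    half = 2.576 * math.sqrt(var / n)
    return mean - half, mean + half
```

When all observers agree exactly, the interval collapses to a point; greater disagreement among observers widens it.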
6. Conclusion
Statistical analysis shows that our principles based on recurrent neural networks predict with acceptable consistency, as expected. All 120 video sequences generated for this research work were encoded with the JM 16.1 H.264/AVC reference software, which uses a rate-distortion optimization algorithm to improve video quality during compression; the default distortion measure in the JM encoder is MSE [6]. We conclude that our assumptions based on RNN principles are within reach of the level of human perception, because the subjective scores are taken as the true, or reference, scores of video quality.
References
- ITU Radiocommunication Sector, Recommendation ITU-R BT.500-12, 2009. http://www.itu.int/.
- Shahid, M.; Singam, A.K.; Rossholm, A.; Lovstrom, B. Subjective quality assessment of H.264/AVC encoded low resolution videos. 2012 5th International Congress on Image and Signal Processing 2012, pp. 63–67.
- Singam, A.K. Peer Review of Tradeoff between Image Compression and Quality Estimation. Global Journal for Research Analysis 2023, pp. 74–75.
- Singam, A.K.; Wlode, K. Revised One, a Full Reference Video Quality Assessment Based on Statistical Based Transform Coefficient. SSRN Electronic Journal 2023.
- Singam, A.K. Coding Estimation based on Rate Distortion Control of H.264 Encoded Videos for Low Latency Applications. arXiv 2023, arXiv:2306.16366.
- Singam, A.; Wlodek, K.; Lövström, B. Classification Review of Raw Subjective Scores towards Statistical Analysis. SSRN 2023.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).