VMAF (Video Multi-method Assessment Fusion) is a Full Reference (FR) tool that calculates MOS and auxiliary metrics for video data transmitted over WiFi or 4G/5G mobile networks. It calculates the metrics by applying the Reference (original) video file (signal) and the Degraded video signal to the VMAF algorithm. This means that the Reference file must be known to the side receiving the video traffic. A two-armed test model is therefore expected: either two subscribers emulated by Landslide (LS) (if a SIP proxy is used), or one subscriber plus one endpoint (if LS emulates a VoLTE Nodal and an IMS Node/Endpoint).
The entire VMAF Calculation Model is shown below:
For simplicity, the flowchart shows UE1 as the data transmitter, while UE2 receives the video data and performs the VMAF calculation. The scheme is fully symmetrical, however: both sides, UE1 as well as UE2, can transmit, receive, and perform VMAF Quality measurements.
The .AVI (rawvideo) format provides a more accurate estimate of video quality. If the rtpvideo DMF provides encoded video, Landslide decodes it to a rawvideo representation and uses that as the Reference Video Signal. This slightly lowers the quality metrics. However, using the encoded format makes sense because it reduces the input video file size significantly (by a factor of at least 50) compared with the rawvideo format. For video files of higher resolution (VGA, HD, and beyond), the encoded format is the only option, because the rawvideo size exceeds 100 MB even for VGA. For example, a rawvideo file of 640x480 at 25 fps with a duration of 8.8 seconds is 194 MB. For 1080i/1080p, the rawvideo file is almost 600 MB. With files of this size, the TAC Reports would become oversized and unmanageable.
For VMAF Quality measurement, Landslide will support the following video formats:
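The size figures above can be checked with a rough calculation. The sketch below assumes 3 bytes per pixel (24-bit packed RGB) and ignores container overhead; that pixel layout is an assumption, chosen because it comes close to the 194 MB quoted for VGA. The function name is illustrative and not part of Landslide.

```python
MIB = 1024 * 1024


def raw_video_size_bytes(width, height, fps, duration_s, bytes_per_pixel=3.0):
    """Approximate size of uncompressed video, ignoring container overhead.

    bytes_per_pixel=3.0 is an assumption (24-bit RGB); planar YUV 4:2:0
    would use 1.5 bytes per pixel instead.
    """
    frames = round(fps * duration_s)
    return int(width * height * bytes_per_pixel * frames)


# VGA clip from the text: 640x480, 25 fps, 8.8 seconds.
vga_bytes = raw_video_size_bytes(640, 480, 25, 8.8)
print(f"VGA rawvideo: {vga_bytes / MIB:.1f} MiB")

# CIF clip of the same duration, for comparison.
cif_bytes = raw_video_size_bytes(352, 288, 25, 8.8)
print(f"CIF rawvideo: {cif_bytes / MIB:.1f} MiB")
```

Under these assumptions the VGA clip comes out at roughly 193 MiB, while the CIF clip stays well below 100 MiB, which is consistent with restricting rawvideo to CIF and QCIF.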
# | Format | Codec | Comment |
---|---|---|---|
1 | rawvideo | N/A | Applicable to CIF (352x288) and QCIF (176x144) video formats only. It is used as the Reference Video Signal. |
2 | encoded | H.263 | The video files will be decoded to rawvideo format, and it will be used as the Reference Video Signal. |
3 | encoded | H.264 | |
4 | encoded | H.265 | |
5 | encoded | VP8 | |
Unlike PEVQ, for VMAF the Media TDF can be configured with either a rawvideo file or an encoded one. The file qqq.mp4 is shown as an example of an encoded file.
The input to the VMAF algorithm is the Reference and Degraded/Distorted files in YUV format. This is a flavor of rawvideo; the implementation uses it internally and does not expose it at the GUI or any other level.
The Landslide PEVQ implementation makes extensive use of FFMPEG for video encoding/decoding and packetization/depacketization. To make this possible, FFMPEG is built with support for the specific video codecs.
The VMAF library must also be added to FFMPEG through the same integration mechanism. As a result, the FFMPEG executable becomes a universal tool that performs not only video data transformation but also VMAF Quality measurements.
The following GUI components are involved:
1) CPU core reservation for VMAF calculation: “Test Server | Configuration” (see About Test Server Configuration)
2) “System Status”, which reports “Remaining VMAF ECs” (see About the System Status Window)
3) The actual VMAF enabler (see Media Qos Settings)
TC Nodal side:
IMS Node side:
Additional Quality Metrics is introduced to give the customer more control over VMAF execution time versus the types of metrics calculated:
POLQA and VMAF share the same Resource Reservation mechanism, which allocates CPU cores for POLQA and/or VMAF calculation (see About Test Server Configuration).
System Status shall include “Remaining VMAF ECs” - About the System Status Window
VMAF measurements are added to L5-7 Client | Video (traf) and L5-7 Server | Video (host):
Name | Variable (prefix = traf, host) | Highlighting | Value Range | Description |
---|---|---|---|---|
RTP VMAF-MOS Successes | prefixRtpVideoVmafMosSuccess | | 0-2^64-1 | Counter of successful measurements. |
RTP VMAF-MOS Failures | prefixRtpVideoVmafMosFailed | Yes | 0-2^64-1 | Counter of failed measurements. |
RTP VMAF-MOS Overloaded | prefixRtpVideoVmafMosOverloaded | Yes | 0-2^64-1 | Counter of measurements that cannot be performed because of the overloading condition: the VMAF calculation for the previous video stream is not completed, but the current stream is fully collected and ready for VMAF. |
RTP VMAF-MOS Average | prefixRtpVideoVmafMosValue | | 0.0-100.0 | Average/Min/Max MOS: 0.0 is worst; 100.0 is best. Reported to GUI as (int64) (float-value*1000). |
RTP VMAF-MOS Minimum | prefixRtpVideoVmafMosMin | | | |
RTP VMAF-MOS Maximum | prefixRtpVideoVmafMosMax | | | |
RTP VMAF-MOS Duration Average (s) | prefixRtpVideoVmafMosTimeValue | | 0.0-100.0 | Average/Min/Max time (sec) consumed for the calculation of VMAF MOS and supplemental metrics. Reported to GUI as (int64) (float-value*1000). |
RTP VMAF-MOS Duration Minimum (s) | prefixRtpVideoVmafMosTimeMin | | | |
RTP VMAF-MOS Duration Maximum (s) | prefixRtpVideoVmafMosTimeMax | | | |
RTP VMAF PSNR-Y Average (dB) | prefixRtpVideoVmafPsnrYValue | | 0.0-50.0 | Average/Min/Max PSNR-Y (Luma). Bad: 30.00-; Poor: 30.0-33.0; Fair: 33.0-38.0; Good/Excellent: 38.0+. Reported to GUI as (int64) (float-value*1000). |
RTP VMAF PSNR-Y Min (dB) | prefixRtpVideoVmafPsnrYMin | | | |
RTP VMAF PSNR-Y Max (dB) | prefixRtpVideoVmafPsnrYMax | | | |
RTP VMAF PSNR-Cb Average (dB) | prefixRtpVideoVmafPsnrCbValue | | 0.0-50.0 | Average/Min/Max PSNR-Cb (U - Blue Chroma). Bad: 30.00-; Poor: 30.0-33.0; Fair: 33.0-38.0; Good/Excellent: 38.0+. Reported to GUI as (int64) (float-value*1000). |
RTP VMAF PSNR-Cb Min (dB) | prefixRtpVideoVmafPsnrCbMin | | | |
RTP VMAF PSNR-Cb Max (dB) | prefixRtpVideoVmafPsnrCbMax | | | |
RTP VMAF PSNR-Cr Average (dB) | prefixRtpVideoVmafPsnrCrValue | | 0.0-50.0 | Average/Min/Max PSNR-Cr (V - Red Chroma). Bad: 30.00-; Poor: 30.0-33.0; Fair: 33.0-38.0; Good/Excellent: 38.0+. Reported to GUI as (int64) (float-value*1000). |
RTP VMAF PSNR-Cr Min (dB) | prefixRtpVideoVmafPsnrCrMin | | | |
RTP VMAF PSNR-Cr Max (dB) | prefixRtpVideoVmafPsnrCrMax | | | |
RTP VMAF SSIM Average | prefixRtpVideoVmafSsimValue | | -1.0 - +1.0 | Average/Min/Max SSIM: All other values are within the range. Reported to GUI as (int64) (float-value*1000). |
RTP VMAF SSIM Min | prefixRtpVideoVmafSsimMin | | | |
RTP VMAF SSIM Max | prefixRtpVideoVmafSsimMax | | | |
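Because the float measurements are reported to the GUI as (int64) (float-value*1000), a consumer of these values has to scale them back. A minimal sketch of that conversion follows, together with the PSNR quality bands from the table. The function names are hypothetical, and the handling of the exact band boundaries (30.0, 33.0, 38.0) is an assumption, since the table lists them as shared edges.

```python
def decode_reported(raw_value):
    """Convert a GUI-reported int64 (float-value*1000) back to a float."""
    return raw_value / 1000.0


def psnr_rating(psnr_db):
    """Map a PSNR value in dB to the quality bands listed in the table.

    Assumption: a value exactly on a boundary belongs to the higher band.
    """
    if psnr_db < 30.0:
        return "Bad"
    if psnr_db < 33.0:
        return "Poor"
    if psnr_db < 38.0:
        return "Fair"
    return "Good/Excellent"


# Example: a reported PSNR-Y of 35250 corresponds to 35.25 dB.
psnr_y = decode_reported(35250)
print(psnr_y, psnr_rating(psnr_y))
```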
Actual Capacity/Performance of the system depends on:
The actual number of concurrent UEs that can perform VMAF will be known once the implementation is complete and a VMAF benchmark can be run.
Keep in mind the following rule: if the video file duration is Tvideo and the VMAF calculation time is Tvmaf, then 'Number of UEs' = Tvideo / Tvmaf can be supported without the Overload condition (see the description of the RTP VMAF-MOS Overloaded OM) being reported.
If overloading is acceptable, the number of UEs performing VMAF can be increased.
For example: if Tvideo = 9 seconds and Tvmaf = 3 seconds, then 'Number of UEs' = 9 / 3 = 3.
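The capacity rule above can be expressed as a short calculation. This is only a sketch: the function name is hypothetical, and rounding a fractional ratio down to a whole number of UEs is an assumption that the text does not state.

```python
import math


def max_ues_without_overload(t_video_s, t_vmaf_s):
    """Number of UEs = Tvideo / Tvmaf, rounded down (rounding is an assumption)."""
    if t_video_s <= 0 or t_vmaf_s <= 0:
        raise ValueError("durations must be positive")
    return math.floor(t_video_s / t_vmaf_s)


# Values from the example: Tvideo = 9 s, Tvmaf = 3 s.
print(max_ues_without_overload(9, 3))  # 3
```

Adding UEs beyond this number means that some streams finish collecting before the previous VMAF calculation has completed, which is what the RTP VMAF-MOS Overloaded counter reports.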
FFMPEG is a universal tool that performs all the required video data transformations and also produces the VMAF MOS score along with the auxiliary metrics (PSNR, SSIM).
Assume for simplicity:
ffmpeg -i Ref.avi -pix_fmt yuv420p -vcodec h264 -y Ref.mp4
ffmpeg -v trace -i Ref.mp4 -f rtp -vcodec copy -an "rtp://127.0.0.1:7000?pkt_size=1024" 1>Ref.sdp
This packetization method assumes that the Landslide application listens on 127.0.0.1/7000 and receives the video data in the form of RTP packets.
ffmpeg -v trace -stdin -threads 1 -buffer_size 500000 -reorder_queue_size 40 \
-protocol_whitelist "file,crypto,udp,rtp" -i Ref.sdp \
-pix_fmt yuv420p -local_rtpport 7000 -local_rtcpport 7001 \
-y Deg.mp4 2>/dev/null
On the receive side, after the Jitter Buffer delay, the RTP packets are sent to 127.0.0.1/7000. Once all the RTP packets that make up the video file have been sent, the Landslide application sends an RTCP BYE to FFMPEG to signal completion. Deg.mp4 is the result of this step.
ffmpeg -v trace -i Deg.mp4 -vcodec h264 -pix_fmt yuv420p -y Deg.avi 2>/dev/null
After this the final step of getting the VMAF score can be accomplished.
ffmpeg -i Ref.avi -i Deg.avi -lavfi \
" \
[0:v]trim=start=0.080:duration=8.000,setpts=PTS-STARTPTS[reference]; \
[1:v]trim=start=0.000:duration=8.000,setpts=PTS-STARTPTS[degraded]; \
[degraded][reference]libvmaf='log_fmt=xml:log_path=vmafMeas.xml: \
feature=name=psnr|name=ssim' \
" -f null -
Per-frame results as well as averaged values are written to the vmafMeas.xml file.
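The measurement file can then be post-processed to extract the values that feed the measurements. The sketch below assumes the XML layout written by recent libvmaf versions: per-frame `frame` elements carrying metric attributes, and `metric` elements with `min`, `max`, and `mean` attributes under `pooled_metrics`. The layout, the attribute names, and the inline sample (with made-up placeholder values) are assumptions to check against the vmafMeas.xml actually produced by the FFMPEG build in use.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only; values are placeholders, not real measurements.
SAMPLE_XML = """
<VMAF version="assumed">
  <frames>
    <frame frameNum="0" psnr_y="36.0" vmaf="80.0" />
    <frame frameNum="1" psnr_y="38.0" vmaf="90.0" />
  </frames>
  <pooled_metrics>
    <metric name="psnr_y" min="36.0" max="38.0" mean="37.0" />
    <metric name="vmaf" min="80.0" max="90.0" mean="85.0" />
  </pooled_metrics>
</VMAF>
"""


def pooled_stats(xml_text):
    """Return {metric name: (min, mean, max)} from the pooled metrics."""
    root = ET.fromstring(xml_text)
    stats = {}
    for metric in root.iter("metric"):
        stats[metric.get("name")] = (
            float(metric.get("min")),
            float(metric.get("mean")),
            float(metric.get("max")),
        )
    return stats


def per_frame_values(xml_text, key="vmaf"):
    """Return the per-frame values of one metric, skipping frames without it."""
    root = ET.fromstring(xml_text)
    return [float(f.get(key)) for f in root.iter("frame") if f.get(key) is not None]


print(pooled_stats(SAMPLE_XML)["vmaf"])
print(per_frame_values(SAMPLE_XML, "psnr_y"))
```

To read a real file, pass its contents to the same functions, for example `pooled_stats(open("vmafMeas.xml").read())`.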