Wednesday, November 16, 2005

480x352

Video     MPEG-2 SSIM  MPEG-2 PSNR  Encode      SSIM   PSNR   Notes
Bike      45.27        32.53        480-400-20  48.51  35.73
Dance     56.48        34.66        480-400-20  70.24  38.23
Gazon     61.54        35.24        480-400-20  51.99  38.82
Udarenia  61.98        32.00        480-400-20  67.36  36.20
Podcast   61.98        34.41
And Dance at different framerates and different bandwidths:
Encode        SSIM   PSNR
Dance-500-25  69.98  38.21
Dance-400-20  69.89  38.01
Dance-300-15  66.42  38.17

Friday, November 11, 2005

MPEG-2 SSIM as a gauge for MPEG-4 encoding

Different videos are, well, different. Even when encoded to MPEG-2 with minimal quality loss, they have different SSIM and PSNR. When further encoded to MPEG-4, those that have higher SSIM and PSNR as MPEG-2 "encode better". This dependency is not linear: videos with higher MPEG-2 SSIM and PSNR do not necessarily encode better to MPEG-4 than those with slightly lower values, and it could be the other way around. However, videos with "way lower" MPEG-2 SSIM and PSNR encode to MPEG-4 a lot worse than those with "way higher" values.

For example, as shown in the table below, the "Infiniti" video has a higher MPEG-2 SSIM and PSNR than the "Budweiser" video, but "Budweiser" encodes better to MPEG-4: it can be squeezed all the way below 250kbps at 15 fps at SD resolution (480p), while "Infiniti" needs 300kbps for the same profile. (Visually, Infiniti at 300kbps looks about the same as Budweiser at 250kbps, with very few noticeable artifacts, while Infiniti at 250kbps is about the same as Budweiser at 200kbps, with some noticeable artifacts.)

Infiniti                       Budweiser
Encode         SSIM   PSNR     Encode         SSIM   PSNR
MPEG-2         86.98  41.91    MPEG-2         74.54  37.74
-              -      -        153kbps@15fps  71.14  33.82
-              -      -        203kbps@15fps  74.41  34.06
257kbps@15fps  73.97  40.01    254kbps@15fps  76.33  34.18
309kbps@15fps  75.70  40.47    306kbps@15fps  77.68  40.74
413kbps@15fps  78.35  41.17    511kbps@15fps  81.01  34.46
On the other hand, the "Dance" video has "way lower" SSIM and PSNR when encoded to MPEG-2 than either Budweiser or Infiniti, and when encoded to MPEG-4 it cannot be squeezed below 500 kbps at 15fps at 480p at all! The following table lists SSIM and PSNR for the video encoded using different methods at 500 kbps at 15 fps; all have unacceptable video quality.
Encode  SSIM   PSNR
MPEG-2  56.48  34.66
x264    35.19  27.37
divx6   35.32  27.37
divx    5.66   22.59
xvid    5.43   22.59
A similar pattern can be observed with CNN stock footage: videos with higher MPEG-2 SSIM/PSNR encode fine down to 300 kbps at 15fps. The possible exception is the 035-airplane video, which, despite an SSIM similar to that of the "Dance" video, could be pushed all the way to 300 kbps, albeit with noticeable artifacts.
Video        MPEG-2 SSIM  MPEG-2 PSNR  Encode         SSIM   PSNR   Notes
021-war      79.74        39.26        300kbps@15fps  76.07  41.29  Fine
080-reuters  85.28        37.74        500kbps@25fps  62.41  36.35  Fine
035-aero     53.12        32.00        300kbps@15fps  39.13  33.89  Blockiness
And finally, even home videos exhibit a similar pattern (MPEG-4 encode at 500kbps@15fps):
Video     MPEG-2 SSIM  MPEG-2 PSNR  Encode         SSIM   PSNR   Notes
Bike      45.27        32.53
Dance     56.48        34.66        500kbps@15fps  35.19  27.37  Worst, 15fps are not enough for motion
Gazon     61.54        35.24        500kbps@15fps  54.45  39.23  Some artifacts, 15 fps are not enough
Udarenia  61.98        32.00        300kbps@15fps  56.52  31.19  Almost fine. 400-500kbps almost do it
Podcast   61.98        34.41
So what is the point of this exercise, especially at half the framerate?
  • "Super clean" videos can be encoded at 300kbps@15fps at 480p, or at 500-700 kbps at normal framerates.
  • Videos with SSIM below 50-60, or with a lot of motion or a black background (like home videos), cannot be encoded at 480p resolution below 1Mbps and need to be scaled to 480x360 to be squeezed into 500 kbps.
  • Reducing the framerate to 15-20 fps works for "talking head" video but is not acceptable for videos with motion; thus, once again, the only option is scaling to 480x360.
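
One way to see why scaling the picture helps is to look at how many encoded bits are available per pixel. The sketch below is only a back-of-the-envelope illustration, not something measured in this post: "bits per pixel" is a generic rule-of-thumb measure, the helper name is made up, and it assumes that "480p" means a 720x480 frame.

```python
def bits_per_pixel(bitrate_kbps, width, height, fps):
    """Average number of encoded bits available per pixel per frame."""
    return bitrate_kbps * 1000 / (width * height * fps)


# Assumption: "480p" here is 720x480; the post does not state the frame size.
full = bits_per_pixel(500, 720, 480, 15)
scaled = bits_per_pixel(500, 480, 360, 15)

print(f"720x480 @ 500kbps, 15fps: {full:.3f} bits/pixel")
print(f"480x360 @ 500kbps, 15fps: {scaled:.3f} bits/pixel")
print(f"ratio: {scaled / full:.1f}x")
```

Under that assumption 480x360 has exactly half the pixels of 720x480, so the same 500 kbps gives each pixel twice the bit budget.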

Tuesday, November 08, 2005

MPEG-2 SSIM and PSNR

On a D600 with a 1.6 GHz Pentium-M CPU, mencoder could do MPEG-2 at 15fps with an avisynth script and 33fps without (CPU <80%), but the result is bad. Apart from the video getting stuck (probably because mencoder could only do I and P frames and no B frames), SSIM came out at 0.59 or 0.49, and PSNR at 11.89 both times.

At the same time, TMPGEnc could do 14.5 fps when encoding with "Constant Quality" = 100% and an average bitrate of 6Mbps, which is the same quality- and size-wise as "2-pass VBR" (labeled VRB in the table) with a 6Mbps average and 8Mbps maximum bitrate. The following table demonstrates that on the "Dance" video, which has 2658 frames, i.e. 88.7 sec or 01:28.

Encode            Time   FPS       kbps  Size  SSIM   PSNR
CQ100-8-ME        3:17   13.5 fps  7718  83M   45.24  31.46
CQ80-8-ME         3:07   14.2 fps  6720  73M   44.95  31.45
CQ100-6-ME        3:03   14.5 fps  5819  63M   43.42  31.40
VRB-8-6-ME        8:12   5.40 fps  5820  63M   44.08  31.44
VRB-8-6-HQ        10:41  4.15 fps  5816  63M   44.29  31.45
VRB-8-6-HighestQ  24:16  1.83 fps  5815  63M   44.51  31.46
VRB-8-6-ME (NR)   45:28  0.97 fps  5806  63M   43.90  31.31
From the table it is clear that neither the "High Quality" setting, nor "2-pass VBR", nor especially the TMPGEnc noise reduction filter really matters; the most efficient encode is CQ100-6-ME or, if time is of no importance, VRB-8-6-ME.
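
The FPS column can be re-derived from the frame count and the wall-clock encode times. The snippet below is a small sanity check with made-up helper names; the only assumption beyond the table is that the source runs at the NTSC rate of 30000/1001 (about 29.97) fps, which is what makes 2658 frames come out at 88.7 seconds.

```python
def to_seconds(clock):
    """Convert an 'M:SS' or 'H:MM:SS' string to seconds."""
    total = 0
    for part in clock.split(":"):
        total = total * 60 + int(part)
    return total


def encode_fps(frames, clock):
    """Encoding speed: frames processed per second of wall-clock time."""
    return frames / to_seconds(clock)


FRAMES = 2658                # frame count of the "Dance" clip, from the post
SOURCE_FPS = 30000 / 1001    # assumption: NTSC 29.97 fps source

print(f"clip duration: {FRAMES / SOURCE_FPS:.1f} s")
for label, clock in [("CQ100-8-ME", "3:17"),
                     ("CQ100-6-ME", "3:03"),
                     ("VRB-8-6-ME", "8:12"),
                     ("VRB-8-6-ME (NR)", "45:28")]:
    print(f"{label:16s} {clock:>6s}  {encode_fps(FRAMES, clock):5.2f} fps")
```

The computed speeds agree with the table to the precision shown there, so the Time and FPS columns are consistent with each other.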

TMPGEnc also has a built-in deinterlacer. At VRB-8-6-ME with the "Deinterlace Even-Odd - field, adaptation" filter it encoded in 6:42 to the same 63M, but the video became noticeably jerky.

Wednesday, November 02, 2005

PSNR and compressibility

Some videos just don't seem to compress well. Namely, only video with PSNR over 30 compresses well below 500kbps at full resolution. But some video has PSNR close to 20 even when encoded to MPEG-2 at close to 8Mbps with no visible defects. Those streams just don't compress well in MPEG-4, and thus the only solution is to make the picture smaller.
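
For reference, this is the textbook definition of PSNR for 8-bit samples. It is only a sketch: the measurement tool used for the numbers in this post may average per frame or handle planes differently, and the sample values below are invented for illustration. In practice it would be applied separately to the Y, U and V planes.

```python
import math


def psnr(reference, encoded, peak=255):
    """PSNR in dB between two equal-length sequences of sample values."""
    if len(reference) != len(encoded):
        raise ValueError("inputs must have the same number of samples")
    # Mean squared error between the original and the encoded samples.
    mse = sum((r - e) ** 2 for r, e in zip(reference, encoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: no error at all
    return 10 * math.log10(peak ** 2 / mse)


# Invented luma samples, purely for illustration.
original = [16, 64, 128, 200, 235, 90]
mild = [17, 63, 129, 199, 235, 91]
heavy = [26, 50, 140, 185, 220, 105]

print(f"mild distortion:  {psnr(original, mild):.2f} dB")
print(f"heavy distortion: {psnr(original, heavy):.2f} dB")
```

Because PSNR is logarithmic in the mean squared error, every 10 dB corresponds to a tenfold change in that error.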

For example, both the Budweiser and Infiniti commercials have PSNR 32+ when encoded to MPEG-2 and could be "easily" encoded below 500kbps with similar PSNR:

Clip              PSNR Y  PSNR U  PSNR V
Budweiser
  MPEG-2          32.09   41.40   41.64
  15fps@300 kbps  31.71   42.30   41.79
  15fps@150 kbps  31.18   41.98   41.31
Infiniti
  MPEG-2          32.17   45.54   45.29
  15fps@300 kbps  31.57   47.08   46.80
  15fps@250 kbps  31.43   46.64   46.42
My home videos, by contrast, have PSNR around 20 and cannot be encoded below 500kbps@15fps (or 1Mbps at regular framerates) without significant artifacts.
Clip                                   PSNR Y  PSNR U  PSNR V
Dance
  MPEG-2                               22.03   25.29   26.52
  15fps@500 kbps                       21.14   37.53   45.03
  15fps@300 kbps                       21.14   37.35   44.17
  xvid-15fps@346 kbps Adv Simple@L5    21.12   37.39   44.55
  divx-15fps@388 kbps Simple@L3        21.11   37.19   43.87
  divx6-15fps@375 kbps Simple@L3       20.22   36.69   43.67
Udarenia
  MPEG-2                               24.28   31.65   37.31
  15fps@500 kbps                       24.88   35.89   41.01
  15fps@400 kbps                       24.15   34.98   40.36
  15fps@300 kbps                       24.14   34.85   39.99
BTW, none of xvid, divx5, or divx6 played in QuickTime. Also, Gazon (4:20) turned out to be too big for vacpsnr (4+GB in YUV), so no data... but 15fps@300kbps (9.6M+audio) is definitely not enough bandwidth, while 15fps@500kbps (15.9M+audio) is watchable.
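
A rough size estimate suggests why the raw YUV dump of Gazon got so large. The sketch below assumes 720x480 frames, 4:2:0 chroma subsampling (1.5 bytes per pixel) and a 29.97 fps source; none of these are stated in the post, and the function names are made up for illustration.

```python
def raw_yuv420_bytes(width, height, fps, seconds):
    """Size of an uncompressed 8-bit YUV 4:2:0 dump in bytes."""
    bytes_per_frame = width * height * 3 // 2  # full Y plane + quarter-size U and V
    frames = round(fps * seconds)
    return bytes_per_frame * frames


def stream_megabytes(kbps, seconds):
    """Size of a constant-bitrate video stream in decimal megabytes."""
    return kbps * 1000 * seconds / 8 / 1e6


GAZON_SECONDS = 4 * 60 + 20  # 4:20, from the post

# Assumptions: 720x480 frame, NTSC 30000/1001 fps source.
raw = raw_yuv420_bytes(720, 480, 30000 / 1001, GAZON_SECONDS)
print(f"raw YUV: {raw / 1e9:.2f} GB")
print(f"300 kbps video stream: {stream_megabytes(300, GAZON_SECONDS):.2f} MB")
print(f"500 kbps video stream: {stream_megabytes(500, GAZON_SECONDS):.2f} MB")
```

Under these assumptions the raw dump comes to a little over 4 GB, in line with the "4+" figure mentioned above, and the two stream sizes land in the same ballpark as the 9.6M and 15.9M quoted.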

Another interesting observation is that raising the framerate at a fixed bitrate increases PSNR while visual quality suffers. For example, here is the PSNR for the "airplane" clip:

Clip                PSNR Y  PSNR U  PSNR V
MPEG-2              22.49   34.22   38.35
15fps@300 kbps      22.84   36.06   39.57
20fps@300 kbps      24.27   36.47   40.08
25fps@300 kbps      25.06   36.57   40.26
29.97fps@300 kbps   25.56   36.53   40.37
All in all, it looks like for all but the cleanest footage (like commercials) 500kbps is not enough even at half the framerate. Only videos with MPEG-2 PSNR over 25 can be pushed to around or below 500kbps; others need to be resized below 480p to, let's say, 480x360.
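
The rule of thumb above can be written down as a tiny helper. This is only a sketch of the conclusion as stated: the function name is hypothetical, and it assumes that the "PSNR" in the rule refers to the luma (PSNR Y) column of the MPEG-2 rows in the tables.

```python
PSNR_THRESHOLD = 25.0  # "MPEG-2 PSNR over 25", from the conclusion above


def suggest_target(mpeg2_psnr_y, threshold=PSNR_THRESHOLD):
    """Apply the post's rule of thumb to a clip's MPEG-2 PSNR."""
    if mpeg2_psnr_y > threshold:
        return "try around 500kbps at full resolution"
    return "resize to 480x360"


# MPEG-2 PSNR Y values copied from the tables in this post.
clips = {
    "Budweiser": 32.09,
    "Infiniti": 32.17,
    "Dance": 22.03,
    "Udarenia": 24.28,
}

for name, value in clips.items():
    print(f"{name:10s} {value:5.2f}  ->  {suggest_target(value)}")
```

With the table values, the two commercials fall on the "full resolution" side and both home videos on the "resize" side.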
