Tuesday, June 12, 2007

ATSC HD – The end of the story

Boosting D 805 performance by 25+% yielded about a 15% increase in speed when transcoding SD tape captures to 640x480@29.97 at around 750kbps AVC. The following table summarizes different encodes on different computers (italics mark expected times).
D 805 (2x3.4GHz) | T2300 (2x1.6GHz) | D 805 (2x2.6GHz) | M 725 (1.6GHz) | M 360 (1.6GHz) | Dual Xeon (4x3.4GHz)
VTR (640x480@25) 600kbps: 35+/21, 39?/19.5, 33/18, 18/9
VTR (640x480@29.97) 750kbps: 35+/21, 39?/19.5, 33/18, 17/9, 45/30.5
mencoder 25fps 750kbps: 21.7 (26)
mencoder 29.97fps 750kbps: 28.48
DVD (interleave) 600kbps: 40/20, 35/18, 18/9, 17/9
480pSD 500kbps: 23/15
720pSD 800kbps: 17/10, 19/11, 13/8, 13.5/13
720pSD 1Mbps: 10/7, 19/11, 8/5.5
mencoder 720pSD 1Mbps: 97
720pHD 1Mbps: 11.5/7, 11.0/7, 8.5/5.8, 20/14
mencoder 720pHD 1.2Mbps: 7.4 of 25 (9 of 30), 4.5
1080pHD 1.8Mbps: 7.3/4, 7.2/4, 6/???
mencoder 1080pHD 2Mbps: 4 of 25 (5 of 29.97), 4.5
It pretty much boils down to the following
On performance
  1. 480p SD is real-time on the current generation of processors (Core Duo, Core 2 Duo) regardless of where it comes from. By real-time I mean either a one-pass encode or the second pass of a two-pass encode. Two-pass encodes would go faster than 2xRT on faster processors, but the second pass would still be around RT. A one-pass encode is slower because it needs more bandwidth (750 kbps vs. 600 kbps on good sources, or 800+ vs. 750 on noisy sources), but it would still be faster than two-pass with filtering and would produce the same result. 480p SD is real-time not because processors got that much faster, but because there are now two of the same processors as before, i.e. for the most part the improvement in processor efficiency got eaten up by scheduling between 2 threads.
  2. 720p SD is 2+xRT at 800kbps, but it needs 1000kbps in one pass, thus it is at least 2.5xRT. Since the first pass takes the same 2xRT, a two-pass encode would take 4.5xRT, so it makes no sense to two-pass encode 720p SD on the current generation of processors. It is questionable whether it makes sense to encode 720p SD at all – 480p would work fine for SD and could be done twice as fast.
  3. 720p HD is 3.5-5xRT. The second pass could be done in 3.5xRT at around 1Mbps. One-pass would need 1.25Mbps and thus could go as slow as 5xRT. A quad-core Core 2 is supposedly more than 2 times faster (2 times for 4 cores, “more” for Core 2), thus 720p HD would take 2-3xRT, i.e. one-pass should clock 10+ fps – still more than two times real-time. Even 4 cores cranking the best they could would only bring 720p HD close to what a Pentium M 360 (as in the Toshiba) has been able to do with DVD since 2005.
  4. 1080p HD is 5+xRT – it is not happening until a $350 computer has 10 Dell D600s inside. The D 805 and Yonah sometimes have difficulty playing 1080p since WMP 11 cannot thread across cores.
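The one-pass versus two-pass trade-off in the list above is plain addition of xRT factors, and a rough Python sketch makes it easy to replay with other numbers. The function names are made up for illustration, and the inputs are the approximate 720p SD figures quoted in item 2 (a 2xRT first pass, 2.5xRT for the pass at the final bitrate), not fresh measurements.

```python
def two_pass_xrt(first_pass_xrt: float, second_pass_xrt: float) -> float:
    """Total cost of a two-pass encode, in multiples of real time (xRT)."""
    return first_pass_xrt + second_pass_xrt


def hours_to_encode(show_hours: float, xrt: float) -> float:
    """Wall-clock hours needed to encode a show of the given length."""
    return show_hours * xrt


# Approximate 720p SD figures taken from item 2 (assumed, not re-measured).
one_pass = 2.5
two_pass = two_pass_xrt(2.0, 2.5)

print(f"one-pass: {one_pass}xRT, two-pass: {two_pass}xRT")
print(f"1-hour show: {hours_to_encode(1.0, one_pass)} h one-pass, "
      f"{hours_to_encode(1.0, two_pass)} h two-pass")
```

With these inputs the two-pass encode costs nearly twice the wall-clock time of the one-pass encode, which is the whole argument against two-pass 720p SD.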
So as far as performance is concerned, when it comes to ATSC, until every show on the air is in HD, it is 640x480@25fps single pass at around 700kbps – no need for a widescreen TV, the Xbox should work, and the D 805 should do it in 1.33xRT, i.e. 22.5 fps for 29.97 stations, 45 fps for 59.94 stations, 18 fps/25 fps for x264. Movies are still best from DVD and, if need be, could be upconverted to 720p.
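For anyone who wants to redo the xRT-to-fps conversion used in that paragraph, here is a small sketch. It assumes xRT simply means source frame rate divided by encoding frame rate; the helper names are mine and not part of any tool.

```python
def fps_at(xrt: float, source_fps: float) -> float:
    """Encoding speed in fps implied by an xRT factor for a given source rate."""
    return source_fps / xrt


def xrt_at(encode_fps: float, source_fps: float) -> float:
    """xRT factor implied by a measured encoding speed."""
    return source_fps / encode_fps


# The 1.33xRT estimate applied to the two ATSC frame rates mentioned above.
for source_fps in (29.97, 59.94):
    print(f"{source_fps} fps source at 1.33xRT -> {fps_at(1.33, source_fps):.1f} fps")
```

The same `xrt_at` helper can be pointed at any fps figure in the table, as long as the source frame rate of that encode is known.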
On encoding
The MainConcept MP2 decoder (ad2mcdsmpeg.ax), which comes with pretty much every piece of software these days, is the best decoder because it can deinterlace and output YV12.
  • 480p from clean sources (DVD, ATSC) could be encoded at around 600kbps at 25 fps in two passes at 2xRT, and at 700kbps in a single pass at around RT (1.33xRT for ATSC). Below 600kbps is pushing it, but it is possible for heights under 480 (as I did with China @500kbps at 360 and, better yet, 368 height).
  • 480p from bad sources (tapes) is better left at the original 29.97 frame rate and encoded at 750kbps in two passes at less than 2xRT. Deinterlacing should be done by the MainConcept decoder, so the only filters are DeGrainMedian(limitY=5,limitUV=5,mode=3) and Undot. Nothing can be done about bad sources – junk in, junk out.
  • 720p SD is not worth it, but since I played with it… Two-pass encodes need 700-800kbps at 25 fps. One-pass needs 800+ (really around 1Mbps). It goes at around 2.75xRT. Doesn’t work on Xbox.
  • 720p HD – needs 1.25Mbps for 2-pass at 25 fps. For single-pass, 1.25Mbps at 25 fps could be OK, but 1Mbps is definitely not enough.
  • 1080p HD – would need 1.75-2Mbps, but I can’t really tell since I cannot even play it.
This is as far as the 32-bit architecture will go – no need for a 24” widescreen monitor, no need for HD-DVD. What is needed is 64-bit software that maybe would speed things up 2 times. For now the following is left to be tested on 32-bit:
  • SD show as one-pass at 480p on the eMachine at 700kbps for 25 fps
  • Tape as one-pass at 480p on the eMachine at 29.97 at 750kbps
  • Upconvert widescreen DVD to 720p and 2-pass at 1Mbps. Time it and log PSNR and SSIM
  • 1080p on the Lenovo (and the upclocked eMachine) – would it play? It cannot play on the old eMachine.
  • DVRMSToolbox to automatically transcode
Oh and BTW, 32-bit Vista started to rank computers. This is what I’ve got
Machine | CPU | RAM | Aero | GPU | Disk
Lenovo | 4.6 | 4.5 | 3.3 | 3.0 | 4.1
eMachine | 4.6 | 3.9 | 2.2 | 3.0 | 5.0
D805+GMA950@3.3GHz | 4.9 | 4.5 | 3.0 | 3.0 | 5.2
D600 | 3.4 | 4.0 | 1.9 | 1.0 | 3.7
P.S. Did a bit more testing... more research actually... and I cannot compile mencoder on OSX, and they stopped pre-building it in 2006, so on OSX there is no support for either dvr-ms or recent x264 options... Pretty much a dead end trying to test the 2.33GHz Core 2 Duo I've got in the MacBook. Parallels loads the CPU to just about 70% and at that transcodes 720pSD 1Mbps at 9 of 30 or 7.5 fps, the same as the overclocked D 805. So theoretically the 2.33GHz Core 2 Duo (4MB L2 on 667MHz) could be 30% faster, but practically there is no software for OSX, and even 30% faster would still be worse than 2xRT single pass for 720pSD and 720pHD, and for 1080pHD it would still be around 5-6xRT. So as expected Core 2 Duo will not make any difference as far as HD goes. Would 64-bit make any difference?
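A quick way to sanity-check that extrapolation is sketched below. It assumes encoding speed scales linearly with CPU load (optimistic), and that the 7.5 fps figure is measured against a 25 fps source; the function names are only for illustration.

```python
def scale_to_full_load(measured_fps: float, cpu_load: float) -> float:
    """Extrapolate fps to 100% CPU load, assuming linear scaling with load."""
    return measured_fps / cpu_load


def xrt(encode_fps: float, source_fps: float) -> float:
    """How many times slower than real time an encode runs."""
    return source_fps / encode_fps


measured = 7.5   # fps under Parallels at roughly 70% CPU load
source = 25.0    # assumed source frame rate for the 7.5 fps figure

thirty_percent_faster = measured * 1.3            # the estimate used in the text
full_load = scale_to_full_load(measured, 0.70)    # straight linear extrapolation

for label, fps in (("+30%", thirty_percent_faster), ("full load", full_load)):
    print(f"{label}: {fps:.2f} fps, {xrt(fps, source):.2f}xRT")
```

The straight linear extrapolation comes out somewhat better than the 30% used in the text, but either way the result stays slower than 2xRT, so the conclusion does not change.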

Since Parallels on the MacBook will not fully load the CPU, the only way to find out is to install either Vista x64 or Ubuntu on the D 805. There is no overclocking software for Linux, so in the end it would be Vista x64, for which I would have to build mencoder myself, and that would take too much time. So until somebody builds a 64-bit version of mencoder I am done playing with HD, and yeah, there is no way I am buying overpriced Apple hardware, 64-bit Leopard or not. Only a quad would make me happy, and that means an LGA775 socket.

P.P.S. Automating DVRMSToolbox would mean writing a custom action in C# to call different mencoder profiles (different crop options), which is not a big deal but would take time... Since most of the programming is crap it ain't worth my time either. I will get back to this only once there is a 64-bit mencoder for Windows, and do it by writing a shell script to run overnight through Scheduled Tasks. BTW, there are 64-bit versions of x264 to test those DVD upconverts.
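As a rough idea of what such a profile-driven script could look like, here is a sketch in Python rather than C# or a shell script. It only builds and prints mencoder command lines and does not run anything. The profile names, file names and crop values are hypothetical; the bitrates and frame rates are the ones settled on above, and the mencoder options used (-ovc x264, -x264encopts bitrate=, -vf crop/scale, -ofps, -oac copy) are standard ones, but the exact combination is untested.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Profile:
    """One encoding profile: target size, frame rate, bitrate, optional crop."""
    width: int
    height: int
    fps: str
    bitrate_kbps: int
    crop: Optional[Tuple[int, int, int, int]] = None  # (w, h, x, y), hypothetical


# Hypothetical profile table; crop values are placeholders, not measured.
PROFILES = {
    "atsc_sd": Profile(640, 480, "25", 700),
    "tape": Profile(640, 480, "29.97", 750, crop=(704, 464, 8, 8)),
}


def mencoder_args(src: str, dst: str, profile: Profile) -> List[str]:
    """Build a single-pass mencoder/x264 command line for the given profile."""
    filters = []
    if profile.crop is not None:
        w, h, x, y = profile.crop
        filters.append(f"crop={w}:{h}:{x}:{y}")
    filters.append(f"scale={profile.width}:{profile.height}")
    return [
        "mencoder", src, "-o", dst,
        "-ovc", "x264",
        "-x264encopts", f"bitrate={profile.bitrate_kbps}",
        "-vf", ",".join(filters),
        "-ofps", profile.fps,
        "-oac", "copy",
    ]


if __name__ == "__main__":
    # Print the commands instead of running them; file names are made up.
    for name, profile in PROFILES.items():
        print(" ".join(mencoder_args(f"{name}.dvr-ms", f"{name}.avi", profile)))
```

Keeping the profiles in a table means a new crop variant is one more entry rather than another copy of the command line, and the printed commands can be pasted into whatever Scheduled Tasks ends up running overnight.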
