    Old 13th March 2010, 18:59   #141  |  Link
    Registered User
     
    Join Date: Feb 2010
    Posts: 84
    Quote:
    Originally Posted by turbojet View Post
    JoshyD: The SSE2 build works. I don't think SSE2 is what AviSynth 2.60 is using during resize judging by these results

    2.58-x86: 51.51
    2.60a2-x86: 57.77
    +12%

    3-1-10-x64: 49.25
    SSE2-x64: 51.07
    +3%

    If you plan on suggesting the SSE2 build for AMD CPUs for now, you might want to link libiomp5md.dll if you can; it's not an easy DLL to find.
    It isn't . . . it uses the same opcodes my resizers use, which is why I'm at a loss as to why my resizers cause your computer to error.

    Avisynth 2.6 uses SSE3 as long as the FIR filter size is below 8, which is a pretty decent cutoff. Resizers that need larger filter sizes are normally pretty rare. TGMC beta 2 goes up to size 9 for my test vectors.

    Also, my compiler settings shouldn't allow your Athlon down any Intel code paths. The Intel-specific code paths, from what I can gather by reading icc vs gcc posts, are only executed when an Intel-family processor is detected. Any of Intel's "special" code should look at your AMD processor, read the vendor ID, and send it down a generic code path.

    Also, aegisofrime seems to indicate that he has it running on his Phenom II x4 in the post above. kemuri-_9's system specs indicate he's running a Phenom II x4 as well, and he hasn't voiced any resize specific problems as of yet. Your Athlon II is a stripped down phenom II core, I believe. I'm a bit baffled.

    I usually link OpenMP statically; that SSE2-only build is really old and I probably omitted the compiler flag for it. It has to be entered manually, because OpenMP, when used by multiple plugins, will error if they're all statically linked. For this reason, I haven't built any of the plugins with OpenMP directives. EEDI2 in particular enjoys a massive speed gain if you let it run with multiple threads.

    Quote:
    you seem to have simply ported over the stack corruption checking that's supposed to check for the plugin being stdcall (and following that convention)
    which you had already pointed out that win x64 does not follow this convention so what's going on here?
    You are correct, I didn't touch any of these routines, they'll definitely need to be changed. I hadn't even thought about loading C plugins with this dll until you expressed interest in getting a 64bit port of FFMS2 working. I was updating on an "as needed" basis, it's just a lot of code to sift through, I can't get it all in one shot. I grabbed the FFMS2 source, are you building it with MinGW? There's a MSVS project in the svn checkout, but even with C99 support, I'm missing some headers, and wanted to be consistent with your build environment, for testing/debugging and such. A compiled x64 dll would let me run through my source to tie any loose ends up, if you could either a) just link an x64 dll or b) fill me in on how you're compiling, I'd greatly appreciate it, and can get the avisynth core changed ASAP.
    JoshyD is offline   Reply With Quote
    Old 13th March 2010, 19:33   #142  |  Link
    Compiling Encoder
     
    kemuri-_9's Avatar
     
    Join Date: Jan 2007
    Posts: 1,348
    Quote:
    Originally Posted by JoshyD View Post
    Also, aegisofrime seems to indicate that he has it running on his Phenom II x4 in the post above. kemuri-_9's system specs indicate he's running a Phenom II x4 as well, and he hasn't voiced any resize specific problems as of yet. Your Athlon II is a stripped down phenom II core, I believe. I'm a bit baffled.
    I've mostly been sitting on the sidelines and not actively testing, so don't pull me into any arguments as proof of something!
    (I only got the binary and started working with it yesterday, to try the ffms2 plugin!)
    I have 3 PCs (a Phenom II x4, a Phenom x4, and an Athlon 64 x2), all running x64 versions of Windows, so I can do testing from the AMD side of things if the need arises.
    (I usually do this for x264, as the other x264 devs mostly use Intel and/or Linux.)

    Quote:
    You are correct, I didn't touch any of these routines, they'll definitely need to be changed. I hadn't even thought about loading C plugins with this dll until you expressed interest in getting a 64bit port of FFMS2 working. I was updating on an "as needed" basis, it's just a lot of code to sift through, I can't get it all in one shot. I grabbed the FFMS2 source, are you building it with MinGW? There's a MSVS project in the svn checkout, but even with C99 support, I'm missing some headers, and wanted to be consistent with your build environment, for testing/debugging and such. A compiled x64 dll would let me run through my source to tie any loose ends up, if you could either a) just link an x64 dll or b) fill me in on how you're compiling, I'd greatly appreciate it, and can get the avisynth core changed ASAP.
    Yes, I'm building entirely with MinGW, per the reasoning in //doom10.org/index.php?topic=25.msg1730#msg1730
    ffms2.dll: x64 testing binary
    aforementioned LoadCPlugin plugin: x64 binary src
    kemuri-_9 is offline   Reply With Quote
    Old 13th March 2010, 21:28   #143  |  Link
    Registered User
     
    Join Date: Feb 2010
    Posts: 84
    @kemuri-_9
    Being the king of AMD PCs that you are, would you mind running a quick vertical resize test on any source to see if it's broken across the board for AMD users? If you could pull the latest binary (I uploaded a new one today), it'd be helpful to see if something funny is happening on the AMD side of things. You'll need it to properly test ffms2 anyway.

    Giving a quick look at your loadCplugin vs the current one that checks stack corruption, I decided to just drop your LoadCPlugin function in as a replacement. As a result, I've got your ffms2.dll loading and the few tests I've run (various sources with some post processing effects, etc) have it running, with some oddities. Running a debug build of avisynth through MSVS's debugger points to some code in ffms2.dll that is making illegal memory accesses. However, frames come through and appear correctly when running a release build of avisynth. There are no memory access violations thrown.

    The oddity is that the 1st frame (frame 0) comes through as garbage. Moving around the source a bit, and then coming back to frame 0 has the frame rendering correctly. I figured I'd let you look it over, it may be my avisynth not initializing your plugin correctly. As you've probably gathered, I'm not too familiar with the plugin loading code of avisynth. My main focus has been on optimizing the calculation and memory heavy routines, which aren't usually hanging out in the core code.

    Oh, and in case you needed to see what I was building: a quick snapshot of my source. The C plugin routines still check for/specify fastcall and stdcall in places, but icc ignores these types when compiling 64bit. I thought it would be safe to leave them in if nothing was breaking as a result.

    It is very cool to have ffms2 working somewhat correctly though, it can't be far off from having full blown functionality.

    Last edited by JoshyD; 13th March 2010 at 21:38.
    JoshyD is offline   Reply With Quote
    Old 14th March 2010, 02:15   #144  |  Link
    Compiling Encoder
     
    kemuri-_9's Avatar
     
    Join Date: Jan 2007
    Posts: 1,348
    Quote:
    Originally Posted by JoshyD View Post
    @kemuri-_9
    Being the king of AMD PCs that you are, would you mind running a quick vertical resize test on any source to see if it's broken across the board for AMD users? If you could pull the latest binary (I uploaded a new one today), it'd be helpful to see if something funny is happening on the AMD side of things. You'll need it to properly test ffms2 anyway.
    Har har, I have SWScale to resize with instead!
    But yes, trying something like Lanczos4Resize(Width(last),Height(last)*2) throws
    "Avisynth Unknown exceptions" errors here on my Phenom II and Athlon 64 machines with this new build.

    Quote:
    Giving a quick look at your loadCplugin vs the current one that checks stack corruption, I decided to just drop your LoadCPlugin function in as a replacement. As a result, I've got your ffms2.dll loading and the few tests I've run (various sources with some post processing effects, etc) have it running, with some oddities. Running a debug build of avisynth through MSVS's debugger points to some code in ffms2.dll that is making illegal memory accesses. However, frames come through and appear correctly when running a release build of avisynth. There are no memory access violations thrown.

    The oddity is that the 1st frame (frame 0) comes through as garbage. Moving around the source a bit, and then coming back to frame 0 has the frame rendering correctly. I figured I'd let you look it over, it may be my avisynth not initializing your plugin correctly. As you've probably gathered, I'm not too familiar with the plugin loading code of avisynth. My main focus has been on optimizing the calculation and memory heavy routines, which aren't usually hanging out in the core code.
    I'm not experiencing either of these, though I can't compile the source code you provided to make a debug build (I do have ICL 11.1.051) because convert_a64.asm is missing.

    Last edited by kemuri-_9; 14th March 2010 at 02:20.
    kemuri-_9 is offline   Reply With Quote
    Old 14th March 2010, 03:12   #145  |  Link
    Registered User
     
    Join Date: Apr 2009
    Posts: 453
    JoshyD, sorry for any confusion but the tests were on my Core 2 Duo machine, not my Phenom II machine. I do intend to test it on my Phenom II rig after my current encoding run is completed.
    aegisofrime is offline   Reply With Quote
    Old 14th March 2010, 04:09   #146  |  Link
    Registered User
     
    Join Date: Feb 2010
    Posts: 84
    kemuri-_9
    Whoops, I keep a CVS server running for my own code and forgot to add it to the tree. Here it is on its own. Here's the source again, just to be on the safe side.

    If you can point me to the code making AMD processors so unhappy, I'd really really appreciate it.
    JoshyD is offline   Reply With Quote
    Old 14th March 2010, 06:19   #147  |  Link
    Registered User
     
    Join Date: May 2008
    Posts: 1,840
    Thanks levi for pointing to a newer RePAL version that outputs 25fps. About a year ago I had a blended pal->ntsc dvd that repal didn't handle all that well and tried SRestore to find worse results but maybe SRestore has improved since then. I agree repal should be low priority considering the low percentage of pal->ntsc sources.

    Another filter that I've started to use lately is autocrop, which would be nice for avisynth64 and, imo, is required for x264 input: without it you need to depend on external programs to get crop values. Maybe something like --autocrop:width(none = no resize):resizer(lanczos default):mod(default 16). Some examples with a 1920x1080 2.35:1 source:
    --autocrop:1280 = lanczosresize(1280,544,0,132,0,-132)
    --autocrop:1920 = --autocrop = crop(0,132,0,-132) (this should undercrop to the mod that is set, since that's still higher quality than resizing)
    If the source is 1920x1080 1.78:1, --autocrop:1920 = 1920x1080 with no crop/resize (mod8 input with no crop/resize would always be mod8 output, same for mod4/2?)
    Something like --crop and --resize would still be helpful in cases where autocrop doesn't work (which I haven't run into yet). I really don't understand the need for the --cli-filter prefix, however.
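
The arithmetic behind those examples can be sketched in a few lines of Python. This is only an illustration of the proposal, not an existing x264 option: the function name `autocrop_resize` and its parameters are hypothetical, and the crop values are taken as given (detecting the black bars is out of scope here).

```python
def autocrop_resize(src_w, src_h, top, bottom, target_w=None, mod=16):
    """Return (width, height) after cropping rows and optionally resizing.

    Hypothetical helper: top/bottom are the rows removed from each edge,
    target_w is the requested output width (None means no resize), and
    mod is the multiple the resized height is rounded to.
    """
    cropped_h = src_h - top - bottom
    if target_w is None or target_w == src_w:
        # Width unchanged: crop only, never resize.
        return src_w, cropped_h
    # Scale the height by the same factor as the width, then round to
    # the nearest multiple of mod.
    exact_h = cropped_h * target_w / src_w
    out_h = int(exact_h / mod + 0.5) * mod
    return target_w, out_h


# The 1920x1080 2.35:1 source from the post, with 132 rows cropped top and bottom.
print(autocrop_resize(1920, 1080, 132, 132, target_w=1280))
print(autocrop_resize(1920, 1080, 132, 132, target_w=1920))
```

With a 1.78:1 source and no bars, passing zero for both crop values and the source width as the target leaves the frame untouched, which matches the "no crop/resize" case described above.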

    Thanks JoshyD for tivtc I'll try it on some sources next week.

    Last edited by turbojet; 14th March 2010 at 06:31.
    turbojet is offline   Reply With Quote
    Old 14th March 2010, 07:15   #148  |  Link
    Registered User
     
    Stephen R. Savage's Avatar
     
    Join Date: Nov 2009
    Posts: 341
    I just tried the new plugins, and I can say that all of them passed a quick test without issues. However, I do have a question regarding the "threads" parameter to some of these filters. Does the threads parameter get ignored on your Avs64 build, JoshyD?

    Now if only we had nnedi2

    Last edited by Stephen R. Savage; 14th March 2010 at 07:17.
    Stephen R. Savage is offline   Reply With Quote
    Old 14th March 2010, 08:01   #149  |  Link
    Registered User
     
    Join Date: Feb 2010
    Posts: 84
    Nope, they'll thread themselves independently of the main avisynth dll. It's a bit of a balancing act that the user has to do to make all processors stay busy, while not choking them with too many threads. NNEDI2 would be nice, but I don't think we're going to see that anytime soon.

    For now, I really want to figure out why AMD processors can't run the resizers in the current binary . . . it has to be something really simple that I'm missing.

    Last edited by JoshyD; 14th March 2010 at 08:07.
    JoshyD is offline   Reply With Quote
    Old 14th March 2010, 09:01   #150  |  Link
    Registered User
     
    Join Date: Mar 2007
    Posts: 35
    Hi,


    The thread title says x86_64, so can I assume that it works on 32-bit versions of AviSynth as well? Can anyone confirm, please?

    Thanks,
    mavinashbabu is offline   Reply With Quote
    Old 14th March 2010, 09:16   #151  |  Link
    Registered User
     
    Join Date: Apr 2009
    Posts: 453
    Quote:
    Originally Posted by mavinashbabu View Post
    Hi,


    The thread title says x86_64, so can I assume that it works on 32-bit versions of AviSynth as well? Can anyone confirm, please?

    Thanks,
    Nope, it doesn't. You can't mix 64-bit filters with 32-bit filters, AFAIK.
    aegisofrime is offline   Reply With Quote
    Old 14th March 2010, 10:36   #152  |  Link
    Registered User
     
    Join Date: Feb 2009
    Location: USA
    Posts: 651
    I did a clean download last night and grabbed everything again, to make sure I had all the recent updates.

    Did a test last night. It took 2h:37m, with no crash. Let's hope it stays that way.

    The problem did not appear to be source access either; I think it was something like Trim or Decimate. I'm guessing Trim. I set Mode 3 after TGMC.

    Code:
    LoadPlugin("C:\yatta\plugins64\decomb.dll")
    LoadPlugin("C:\yatta\plugins64\dgdecode.dll")
    LoadPlugin("C:\yatta\plugins64\telecidehints.dll")
    LoadPlugin("C:\yatta\plugins64\fieldhint.dll")
    
    
    function Preset0(clip c) {
    #Name: Default
    c
    return last
    }
    SetMTMode(2,0)
    DGDecode_Mpeg2Source("L:\Ep 01\VTS_01_1.d2v")
    
    
    
    FieldHint(ovr="L:\Ep 01\VTS_01_1.d2v.fh.txt")
    
    #MT("TempGaussMC_beta2().SelectEven()",threads=2,overlap=4)
    TempGaussMC_beta2().SelectEven()
    SetMTMode(3)
    
    PresetClip0=Preset0()
    
    PresetClip0.Trim(0,41023)
    
    
    DClip = Decimate(cycle=5,quality=3,ovr="L:\Ep 01\VTS_01_1.d2v.dec.txt").assumefps(last.framerate)
    osgZach is offline   Reply With Quote
    Old 14th March 2010, 11:47   #153  |  Link
    Registered User
     
    Join Date: Dec 2004
    Location: Melbourne, AU
    Posts: 1,963
    Are you perhaps confusing SSSE3 instructions with SSE3? I think you'll find the pmulhrsw instructions are the issue.
    squid_80 is offline   Reply With Quote
    Old 14th March 2010, 15:29   #154  |  Link
    Registered User
     
    Join Date: Mar 2003
    Posts: 116
    Xeon quad core E5530 2.40 ghz w/ turbo(hyperthreading)

    x264(x86) + set avisynth 2.6(x86)
    First Pass Output to null = 27.18

    x264(x64) + JoshyD avisynth 3-13-10(x64)
    First Pass Output to null = 29.54

    I've seen an 8% speed improvement

    Code:
    SetMTmode(3,3)
    mpeg2source("my.d2v")
    SetMTmode(2,3)
    tdeint()
    crop(4,4,1916,1076)
    LanczosResize(1280,720)
    levi is offline   Reply With Quote
    Old 14th March 2010, 17:29   #155  |  Link
    Registered User
     
    Join Date: Feb 2010
    Posts: 84
    @Squid_80
    You've got it. It's pshufb that throws the first error on my (really old) Athlon64. Then those pmulhrsw's probably cause an issue as well. Reading some programming message boards, apparently pshufb has been giving AMD developers trouble. I'd guess the subtle difference between SSE3 and SSSE3 does as well. Back to the drawing board for AMD people. Thanks for taking a look; being the only eyes looking over your own source can drive a person a bit crazy.
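
To make the SSE3 vs SSSE3 distinction concrete: both pshufb and pmulhrsw belong to SSSE3, which CPUID reports in a separate feature bit from SSE3 (leaf 1, ECX bit 9 for SSSE3, bit 0 for SSE3). A minimal Python sketch of dispatching on those bits follows. The path names and the function `pick_resizer_path` are hypothetical, and reading the actual ECX value from the CPU is outside the sketch.

```python
# Feature bits in ECX returned by CPUID leaf 1.
SSE3_BIT = 1 << 0    # SSE3 (shown as "pni" in Linux /proc/cpuinfo)
SSSE3_BIT = 1 << 9   # SSSE3: pshufb, pmulhrsw, ...


def pick_resizer_path(cpuid1_ecx):
    """Choose a (hypothetical) resizer code path from the CPUID feature bits."""
    if cpuid1_ecx & SSSE3_BIT:
        return "ssse3"      # safe to execute pshufb / pmulhrsw
    if cpuid1_ecx & SSE3_BIT:
        return "sse3"       # SSE3 only: must avoid SSSE3 opcodes
    return "generic"


# A CPU that reports SSE3 but not SSSE3 must not take the SSSE3 path.
print(pick_resizer_path(SSE3_BIT))
print(pick_resizer_path(SSE3_BIT | SSSE3_BIT))
```

Checking only the SSE3 bit before running code that contains SSSE3 opcodes would produce exactly the kind of illegal-instruction failure described in this thread.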

    @levi
    Why only 3 threads? Are you accounting for x264 also taking up some cores? I think x264 defaults to creating 1.5x as many threads as the cores it detects on your system. I was hoping for some more speed gains. Any chance of seeing some straight single-threaded tests on the same system? I want to rebuild tdeint to see if I can't eke out some more performance as well.
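
As a rough illustration of the thread budget in question, here is a small Python sketch. It takes the 1.5x figure from the post at face value, and it assumes the hyperthreaded quad core in levi's test exposes 8 logical cores; the function names are hypothetical.

```python
def x264_auto_threads(logical_cores):
    """Thread count under the 1.5x rule mentioned in the post (an assumption)."""
    return logical_cores * 3 // 2


def total_worker_threads(logical_cores, avisynth_threads):
    """Encoder threads plus the threads requested via SetMTMode."""
    return x264_auto_threads(logical_cores) + avisynth_threads


# Assumed: 8 logical cores, and the 3 AviSynth threads from SetMTmode(3,3).
print(x264_auto_threads(8))
print(total_worker_threads(8, 3))
```

Under these assumptions the encoder alone already asks for more threads than there are logical cores, which is one reason adding AviSynth threads does not always translate into proportional speed gains.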

    @osgZach
    It's great that there aren't any crashes anymore! I don't think trimming frames from a source should kill multithreading, but decimate has an unusual access pattern. I guess you can't win them all.

    Last edited by JoshyD; 14th March 2010 at 17:36.
    JoshyD is offline   Reply With Quote
    Old 14th March 2010, 17:37   #156  |  Link
    Registered User
     
    Join Date: Feb 2009
    Location: USA
    Posts: 651
    Crap keyboard.. lost my post to ill-placed Forward/Backward keys...

    Anyway, it's technically 2 different sources being mixed in, so whatever the case, the problem probably lies somewhere in there, with one or both of them. I know Mode 3 was recommended for Trim at the very least.

    Any chance of grabbing the latest DGIndex/DGDecode source (1.5.8) and seeing if it will compile for x64? I'd give it a shot, but if it needed changes I'd be clueless about that stuff.

    It would certainly make managing Yatta a lot easier (right now I have to set up two copies, using older x86 source to make projects, since the x64 DLL we have is only 1.4.6).
    If anyone that has time could look into it for that matter, it'd be great.

    Last edited by osgZach; 14th March 2010 at 17:40.
    osgZach is offline   Reply With Quote
    Old 14th March 2010, 18:08   #157  |  Link
    Registered User
     
    Join Date: Dec 2004
    Location: Melbourne, AU
    Posts: 1,963
    I just had a look on my HDD and it seems at some point I did make an x64 build of DGDecode 1.5.4: //www.mediafire.com/?dl4fc2yyyzz

    It seems to work but I have no idea if it's faster than the old build or if anything apart from the .d2v identifier was changed. Too bad the author has no regard for backwards compatibility.
    squid_80 is offline   Reply With Quote
    Old 14th March 2010, 18:43   #158  |  Link
    Registered User
     
    Join Date: Jan 2007
    Posts: 530
    FWIW, just did a test with an SD MPEG2 source, encoding with x264, using the 3/13 AVISynth64:

    AVS:
    Code:
    #LoadPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins\mvtools2.dll")
    LoadPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins\DGDecode.dll")
    LoadPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins\tivtc.dll")
    SetMTmode(2,8)
    Mpeg2Source("lotr.d2v")
    #Insert Deinterlacer
    tfm(last,d2v="lotr.d2v").tdecimate()
    #Applying Resizing
    LanczosResize(720,352,0,62,-0,-66)
    x264 v1471
    64: x264-64bit.exe --crf 20 --preset medium --threads auto --tune film --sar 32:27 --output "C:\Temp\lotr64.mkv" lotr64.avs
    32: x264.exe --crf 20 --preset medium --threads auto --tune film --sar 32:27 --output "C:\Temp\lotr.mkv" lotr.avs

    64bit chain: encoded 16356 frames, 79.40 fps, 1345.89 kb/s
    32bit chain: encoded 16356 frames, 70.57 fps, 1345.89 kb/s

    Using Athlon II (620) O/C'd to 3.5Ghz
    Win7 Ult x64
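
For comparing the numbers posted in this thread, the relative gain of the 64-bit chain over the 32-bit chain can be computed as below. This is a small Python sketch; it assumes levi's "First Pass Output to null" figures are frames per second, like the fps values above.

```python
def percent_gain(baseline_fps, new_fps):
    """Relative speed gain of new_fps over baseline_fps, in percent."""
    return (new_fps / baseline_fps - 1.0) * 100.0


# (32-bit fps, 64-bit fps) pairs as reported in the thread.
results = {
    "levi, Xeon E5530": (27.18, 29.54),
    "noee, Athlon II 620": (70.57, 79.40),
}

for name, (fps_x86, fps_x64) in results.items():
    print(f"{name}: {percent_gain(fps_x86, fps_x64):+.1f}%")
```

Both results land in the high single digits to low teens, which is the range to keep in mind when reading the speed claims in the surrounding posts.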
    noee is offline   Reply With Quote
    Old 14th March 2010, 19:14   #159  |  Link
    Registered User
     
    Join Date: Feb 2009
    Location: USA
    Posts: 651
    Thanks Squid, I suppose it's better than nothing

    I don't have any issues with using old versions, as long as nothing major has changed since then.. But when I open the D2V files from two different versions and they look different it kind of makes me nervous I'm not getting the best indexing of my source.
    osgZach is offline   Reply With Quote
    Old 14th March 2010, 19:47   #160  |  Link
    Registered User
     
    Join Date: Feb 2010
    Posts: 84
    @noee
    I'm guessing this means that you've gotten tivtc working with correct results? If so, that's great news. Those results seem about in line with what I expected: somewhere between 10-20% faster when using x64 code.

    @Squid80
    You've got a DGDecode listed on your webpage along with source, but checking the version info indicates it's 1.4.6. Any chance you have the 1.5.6 source on hand and I could take a peek at it? Also, may I add that to the first post?

    @kemuri-_9
    Can I link your FFMS2.dll on the first post? I can move it over to mediafire if you don't want to waste bandwidth on hosting it locally.

    @turbojet
    Autocrop is built and working for me. Link is on the first post.

    Last edited by JoshyD; 14th March 2010 at 20:20.
    JoshyD is offline   Reply With Quote