    Old 13th March 2010, 18:59   #141  |  Link
    Registered User
     
    Join Date: Feb 2010
    Posts: 84
    Quote:
    Originally Posted by turbojet View Post
    JoshyD: The SSE2 build works. I don't think SSE2 is what AviSynth 2.60 is using during resize judging by these results

    2.58-x86: 51.51
    2.60a2-x86: 57.77
    +12%

    3-1-10-x64: 49.25
    SSE2-x64: 51.07
    +3%

    If you plan on suggesting the SSE2 build for AMD CPUs for now, you might want to link libiomp5md.dll if you can; it's not an easy DLL to find.
    It isn't . . . it uses the same opcodes my resizers use, which is why I'm at a loss as to why my resizers cause your computer to error.

    Avisynth 2.6 uses SSE3 as long as the FIR filter size is below 8, which is a pretty decent cutoff. Resizers that need larger filter sizes are normally pretty rare. TGMC beta 2 goes up to size 9 for my test vectors.

    Also, my compiler settings shouldn't send your Athlon down any Intel code paths. The Intel-specific code paths, from what I can gather by reading icc vs gcc posts, are only executed when an Intel-family processor is detected. Any of Intel's "special" code should look at your AMD processor, read the vendor ID, and send it down a generic code path.
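For anyone following along, the vendor check described here comes down to reading the 12-byte vendor string that CPUID leaf 0 returns in EBX, EDX and ECX. Below is a minimal Python sketch of that decoding. It assumes the register values were already obtained some other way (Python can't issue CPUID itself), and `decode_vendor` and `pick_code_path` are hypothetical names for illustration, not anything taken from icc's runtime.

```python
import struct

def decode_vendor(ebx, edx, ecx):
    """Rebuild the CPUID leaf-0 vendor string from its three registers.

    The 12 ASCII bytes are stored little-endian in the order EBX, EDX, ECX.
    """
    return struct.pack("<III", ebx, edx, ecx).decode("ascii")

def pick_code_path(vendor):
    # Hypothetical dispatcher: only an Intel vendor ID gets the tuned path,
    # every other vendor falls through to the generic one.
    return "intel" if vendor == "GenuineIntel" else "generic"

# Register values that spell out the two well-known vendor strings.
INTEL_REGS = (0x756E6547, 0x49656E69, 0x6C65746E)
AMD_REGS = (0x68747541, 0x69746E65, 0x444D4163)

for regs in (INTEL_REGS, AMD_REGS):
    vendor = decode_vendor(*regs)
    print(vendor, "->", pick_code_path(vendor))
```

Note that a vendor check like this only decides which path gets picked; it says nothing about whether the instructions on that path are actually supported by the CPU.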

    Also, aegisofrime seems to indicate that he has it running on his Phenom II x4 in the post above. kemuri-_9's system specs indicate he's running a Phenom II x4 as well, and he hasn't voiced any resize-specific problems as of yet. Your Athlon II is a stripped-down Phenom II core, I believe. I'm a bit baffled.

    I usually link OpenMP statically; that SSE2-only build is really old and I probably omitted the compiler flag for it. The flag has to be entered manually, because OpenMP, when used by multiple plugins, will error if they're all statically linked. For this reason, I haven't built any of the plugins with OpenMP directives. EEDI2 in particular enjoys a massive speed gain if you let it run with multiple threads.

    Quote:
    you seem to have simply ported over the stack corruption checking that's supposed to check for the plugin being stdcall (and following that convention)
    which you had already pointed out that win x64 does not follow this convention so what's going on here?
    You are correct, I didn't touch any of these routines; they'll definitely need to be changed. I hadn't even thought about loading C plugins with this dll until you expressed interest in getting a 64-bit port of FFMS2 working. I was updating on an "as needed" basis. It's just a lot of code to sift through, and I can't get it all in one shot. I grabbed the FFMS2 source. Are you building it with MinGW? There's an MSVS project in the svn checkout, but even with C99 support I'm missing some headers, and I wanted to be consistent with your build environment for testing, debugging and such. A compiled x64 dll would let me run through my source to tie up any loose ends. If you could either a) just link an x64 dll or b) fill me in on how you're compiling, I'd greatly appreciate it, and I can get the avisynth core changed ASAP.
    JoshyD is offline   Reply With Quote
    Old 13th March 2010, 19:33   #142  |  Link
    Compiling Encoder
     
    kemuri-_9's Avatar
     
    Join Date: Jan 2007
    Posts: 1,348
    Quote:
    Originally Posted by JoshyD View Post
    Also, aegisofrime seems to indicate that he has it running on his Phenom II x4 in the post above. kemuri-_9's system specs indicate he's running a Phenom II x4 as well, and he hasn't voiced any resize-specific problems as of yet. Your Athlon II is a stripped-down Phenom II core, I believe. I'm a bit baffled.
    I've been mostly sitting on the sidelines on this and not actively testing so don't pull me into any arguments as proof of something!
    (I only got the binary and started working with it for trying the ffms2 plugin just yesterday!)
    I have 3 PCs (Phenom II x4, Phenom x4, and Athlon 64 x2), all running x64 versions of Windows, so I can do testing from the AMD side of things if the need arises....
    (as i usually do this for x264 as the other x264 devs mostly use Intel and/or linux)

    Quote:
    You are correct, I didn't touch any of these routines; they'll definitely need to be changed. I hadn't even thought about loading C plugins with this dll until you expressed interest in getting a 64-bit port of FFMS2 working. I was updating on an "as needed" basis. It's just a lot of code to sift through, and I can't get it all in one shot. I grabbed the FFMS2 source. Are you building it with MinGW? There's an MSVS project in the svn checkout, but even with C99 support I'm missing some headers, and I wanted to be consistent with your build environment for testing, debugging and such. A compiled x64 dll would let me run through my source to tie up any loose ends. If you could either a) just link an x64 dll or b) fill me in on how you're compiling, I'd greatly appreciate it, and I can get the avisynth core changed ASAP.
    Yes, I'm building entirely with MinGW, for the reasons given in //doom10.org/index.php?topic=25.msg1730#msg1730
    ffms2.dll: x64 testing binary
    aforementioned LoadCPlugin plugin: x64 binary src
    kemuri-_9 is offline   Reply With Quote
    Old 13th March 2010, 21:28   #143  |  Link
    Registered User
     
    Join Date: Feb 2010
    Posts: 84
    @kemuri-_9
    Being the king of AMD PCs that you are, would you mind running a quick vertical resize test on any source to see if it's broken across the board for AMD users? If you could pull the latest binary (I uploaded a new one today), it'd be helpful to see if something funny is happening on the AMD side of things. You'll need it to properly test ffms2 anyway.

    After a quick look at your LoadCPlugin vs the current one that checks stack corruption, I decided to just drop your LoadCPlugin function in as a replacement. As a result, I've got your ffms2.dll loading, and the few tests I've run (various sources with some post-processing effects, etc.) have it running, with some oddities. Running a debug build of avisynth through MSVS's debugger points to some code in ffms2.dll that is making illegal memory accesses. However, frames come through and appear correctly when running a release build of avisynth. There are no memory access violations thrown.

    The oddity is that the 1st frame (frame 0) comes through as garbage. Moving around the source a bit, and then coming back to frame 0 has the frame rendering correctly. I figured I'd let you look it over, it may be my avisynth not initializing your plugin correctly. As you've probably gathered, I'm not too familiar with the plugin loading code of avisynth. My main focus has been on optimizing the calculation and memory heavy routines, which aren't usually hanging out in the core code.

    Oh, and in case you needed to see what I was building: a quick snapshot of my source. The C plugin routines still check for/specify fastcall and stdcall in places, but icc ignores these calling-convention specifiers when compiling 64-bit. I thought it would be safe to leave them in if nothing was breaking as a result.

    It is very cool to have ffms2 working somewhat correctly though, it can't be far off from having full blown functionality.

    Last edited by JoshyD; 13th March 2010 at 21:38.
    JoshyD is offline   Reply With Quote
    Old 14th March 2010, 02:15   #144  |  Link
    Compiling Encoder
     
    kemuri-_9's Avatar
     
    Join Date: Jan 2007
    Posts: 1,348
    Quote:
    Originally Posted by JoshyD View Post
    @kemuri-_9
    Being the king of AMD PCs that you are, would you mind running a quick vertical resize test on any source to see if it's broken across the board for AMD users? If you could pull the latest binary (I uploaded a new one today), it'd be helpful to see if something funny is happening on the AMD side of things. You'll need it to properly test ffms2 anyway.
    Har har, I have SWScale to resize with instead!
    But yes, trying something like Lanczos4Resize(Width(last),Height(last)*2) throws
    "Avisynth Unknown exceptions" errors here on my Phenom II and Athlon 64 machines with this new build.

    Quote:
    After a quick look at your LoadCPlugin vs the current one that checks stack corruption, I decided to just drop your LoadCPlugin function in as a replacement. As a result, I've got your ffms2.dll loading, and the few tests I've run (various sources with some post-processing effects, etc.) have it running, with some oddities. Running a debug build of avisynth through MSVS's debugger points to some code in ffms2.dll that is making illegal memory accesses. However, frames come through and appear correctly when running a release build of avisynth. There are no memory access violations thrown.

    The oddity is that the 1st frame (frame 0) comes through as garbage. Moving around the source a bit, and then coming back to frame 0 has the frame rendering correctly. I figured I'd let you look it over, it may be my avisynth not initializing your plugin correctly. As you've probably gathered, I'm not too familiar with the plugin loading code of avisynth. My main focus has been on optimizing the calculation and memory heavy routines, which aren't usually hanging out in the core code.
    I'm not experiencing either of these, though I can't compile the source you provided to make a debug build (I do have ICL 11.1.051), because convert_a64.asm is missing.

    Last edited by kemuri-_9; 14th March 2010 at 02:20.
    kemuri-_9 is offline   Reply With Quote
    Old 14th March 2010, 03:12   #145  |  Link
    Registered User
     
    Join Date: Apr 2009
    Posts: 453
    JoshyD, sorry for any confusion but the tests were on my Core 2 Duo machine, not my Phenom II machine. I do intend to test it on my Phenom II rig after my current encoding run is completed.
    aegisofrime is offline   Reply With Quote
    Old 14th March 2010, 04:09   #146  |  Link
    Registered User
     
    Join Date: Feb 2010
    Posts: 84
    kemuri-_9
    Whoops, I keep a CVS server running for my own code and forgot to add it to the tree. Here it is on its own. Here's the source again, just to be on the safe side.

    If you can point me to the code making AMD processors so unhappy, I'd really really appreciate it.
    JoshyD is offline   Reply With Quote
    Old 14th March 2010, 06:19   #147  |  Link
    Registered User
     
    Join Date: May 2008
    Posts: 1,840
    Thanks levi for pointing to a newer RePAL version that outputs 25fps. About a year ago I had a blended PAL->NTSC DVD that RePAL didn't handle all that well. I tried SRestore and got worse results, but maybe SRestore has improved since then. I agree RePAL should be low priority considering the low percentage of PAL->NTSC sources.

    Another filter that I've started to use lately is autocrop, which would be nice for avisynth64 and, IMO, required for x264 input; without it you need to depend on external programs to get crop values. Maybe something like --autocrop:width(none = no resize):resizer(lanczos default):mod(default 16). Some examples with a 1920x1080 2.35:1 source:
    --autocrop:1280 = lanczosresize(1280,544,0,132,0,-132)
    --autocrop:1920 = --autocrop = crop(0,132,0,-132) (this should undercrop to the mod that is set, since that's still higher quality than resizing)
    if the source is 1920x1080 1.78:1 --autocrop:1920 = 1920x1080 with no crop/resize (mod8 input with no crop/resize would always be mod8 output, same for mod4/2?)
    though something like --crop and --resize would be helpful in cases where autocrop doesn't work (which I haven't run into yet). I really don't understand the need for the --cli-filter prefix, however.
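The numbers in those examples can be checked with a few lines of Python. This is only a sketch of the arithmetic: the 132-line bars are taken from the post as given (detecting them is the hard part and isn't shown), `crop_then_resize` is a made-up helper name, and the proposed --autocrop switch itself is hypothetical.

```python
def crop_then_resize(src_w, src_h, top, bottom, dst_w=None, mod=16):
    """Return (width, height) after cropping bars and optionally resizing.

    top/bottom are the bar heights to remove, dst_w is the target width
    (None means keep the source width), and the resized height is snapped
    to the nearest multiple of mod.
    """
    kept_h = src_h - top - bottom
    if dst_w is None or dst_w == src_w:
        return src_w, kept_h
    scaled_h = kept_h * dst_w / src_w
    return dst_w, int(round(scaled_h / mod)) * mod

# 1920x1080 source with 132-line bars top and bottom (the 2.35:1 example)
print(crop_then_resize(1920, 1080, 132, 132))              # (1920, 816)
print(crop_then_resize(1920, 1080, 132, 132, dst_w=1280))  # (1280, 544)
# 1.78:1 source with no bars: nothing to crop, nothing to resize
print(crop_then_resize(1920, 1080, 0, 0, dst_w=1920))      # (1920, 1080)
```

Both 816 and 544 are already multiples of 16, which is why the examples need no further rounding.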

    Thanks JoshyD for tivtc I'll try it on some sources next week.

    Last edited by turbojet; 14th March 2010 at 06:31.
    turbojet is offline   Reply With Quote
    Old 14th March 2010, 07:15   #148  |  Link
    Registered User
     
    Stephen R. Savage's Avatar
     
    Join Date: Nov 2009
    Posts: 341
    I just tried the new plugins, and I can say that all of them passed a quick test without issues. However, I do have a question regarding the "threads" parameter to some of these filters. Does the threads parameter get ignored on your Avs64 build, JoshyD?

    Now if only we had nnedi2

    Last edited by Stephen R. Savage; 14th March 2010 at 07:17.
    Stephen R. Savage is offline   Reply With Quote
    Old 14th March 2010, 08:01   #149  |  Link
    Registered User
     
    Join Date: Feb 2010
    Posts: 84
    Nope, they'll thread themselves independently of the main avisynth dll. It's a bit of a balancing act that the user has to do to make all processors stay busy, while not choking them with too many threads. NNEDI2 would be nice, but I don't think we're going to see that anytime soon.

    For now, I really want to figure out why AMD processors can't run the resizers in the current binary . . . it has to be something really simple that I'm missing.

    Last edited by JoshyD; 14th March 2010 at 08:07.
    JoshyD is offline   Reply With Quote
    Old 14th March 2010, 09:01   #150  |  Link
    Registered User
     
    Join Date: Mar 2007
    Posts: 35
    Hi,


    The thread title says x86_64, so can I assume that it works on 32-bit versions of AviSynth as well? Can anyone confirm, please?

    Thanks,
    mavinashbabu is offline   Reply With Quote
    Old 14th March 2010, 09:16   #151  |  Link
    Registered User
     
    Join Date: Apr 2009
    Posts: 453
    Quote:
    Originally Posted by mavinashbabu View Post
    Hi,


    The thread title says x86_64, so can I assume that it works on 32-bit versions of AviSynth as well? Can anyone confirm, please?

    Thanks,
    Nope, it doesn't. You can't mix 64-bit filters with 32-bit filters, AFAIK.
    aegisofrime is offline   Reply With Quote
    Old 14th March 2010, 10:36   #152  |  Link
    Registered User
     
    Join Date: Feb 2009
    Location: USA
    Posts: 651
    I did a clean download last night and grabbed everything again to make sure I had all the recent updates.

    Did a test last night. Took 2h:37m, and no crash. Let's hope it stays that way.

    The problem did not appear to be source access either, I think it was something like Trim or Decimate.. I'm guessing trim. I set Mode 3 after TGMC

    Code:
    LoadPlugin("C:\yatta\plugins64\decomb.dll")
    LoadPlugin("C:\yatta\plugins64\dgdecode.dll")
    LoadPlugin("C:\yatta\plugins64\telecidehints.dll")
    LoadPlugin("C:\yatta\plugins64\fieldhint.dll")
    
    
    function Preset0(clip c) {
    #Name: Default
    c
    return last
    }
    SetMTMode(2,0)
    DGDecode_Mpeg2Source("L:\Ep 01\VTS_01_1.d2v")
    
    
    
    FieldHint(ovr="L:\Ep 01\VTS_01_1.d2v.fh.txt")
    
    #MT("TempGaussMC_beta2().SelectEven()",threads=2,overlap=4)
    TempGaussMC_beta2().SelectEven()
    SetMTMode(3)
    
    PresetClip0=Preset0()
    
    PresetClip0.Trim(0,41023)
    
    
    DClip = Decimate(cycle=5,quality=3,ovr="L:\Ep 01\VTS_01_1.d2v.dec.txt").assumefps(last.framerate)
    osgZach is offline   Reply With Quote
    Old 14th March 2010, 11:47   #153  |  Link
    Registered User
     
    Join Date: Dec 2004
    Location: Melbourne, AU
    Posts: 1,963
    Are you perhaps confusing SSSE3 instructions with SSE3? I think you'll find the pmulhrsw instructions are the issue.
    squid_80 is offline   Reply With Quote
    Old 14th March 2010, 15:29   #154  |  Link
    Registered User
     
    Join Date: Mar 2003
    Posts: 116
    Xeon quad core E5530 2.40 GHz w/ turbo (hyperthreading)

    x264(x86) + set avisynth 2.6(x86)
    First Pass Output to null = 27.18

    x264(x64) + JoshyD avisynth 3-13-10(x64)
    First Pass Output to null = 29.54

    I've seen an 8% speed improvement

    Code:
    SetMTmode(3,3)
    mpeg2source("my.d2v")
    SetMTmode(2,3)
    tdeint()
    crop(4,4,1916,1076)
    LanczosResize(1280,720)
    levi is offline   Reply With Quote
    Old 14th March 2010, 17:29   #155  |  Link
    Registered User
     
    Join Date: Feb 2010
    Posts: 84
    @Squid_80
    You've got it. It's pshufb that throws the first error on my (really old) Athlon64. Then those pmulhrsw's probably cause an issue as well. Reading some programming message boards, apparently pshufb has been giving AMD developers trouble. I'd guess the subtleties between SSE3 and SSSE3 do as well. Back to the drawing board for AMD people. Thanks for taking a look; being the only eyes looking over your own source can drive a person a bit crazy.
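For reference, SSE3 and SSSE3 are reported by separate bits of ECX from CPUID leaf 1 (bit 0 and bit 9 respectively), and both pshufb and pmulhrsw belong to SSSE3. A small Python sketch of the check a dispatcher would need is below. The register values are made up for illustration rather than dumped from real hardware, and `features_from_ecx` / `can_run` are hypothetical helper names.

```python
# Feature bits in ECX from CPUID leaf 1, as documented in the vendor manuals.
ECX_FEATURES = {"sse3": 0, "ssse3": 9, "sse4.1": 19, "sse4.2": 20}

def features_from_ecx(ecx):
    """Return the set of feature names whose bit is set in ecx."""
    return {name for name, bit in ECX_FEATURES.items() if (ecx >> bit) & 1}

def can_run(required, ecx):
    """True if every required feature is reported in ecx."""
    return set(required) <= features_from_ecx(ecx)

# Illustrative values only: a CPU reporting SSE3 but not SSSE3, and one
# reporting both.
SSE3_ONLY = 1 << 0
SSE3_AND_SSSE3 = (1 << 0) | (1 << 9)

# A code path using pshufb/pmulhrsw needs SSSE3, not just SSE3.
print(can_run({"sse3"}, SSE3_ONLY))        # True
print(can_run({"ssse3"}, SSE3_ONLY))       # False
print(can_run({"ssse3"}, SSE3_AND_SSSE3))  # True
```

A path gated only on the SSE3 bit would pass the first check and then fault on the first SSSE3 instruction, which matches the behaviour described here.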

    @levi
    Why only 3 threads? Are you accounting for x264 also taking up some cores? I think x264 defaults to creating 1.5x the number of threads it detects your system can run. I was hoping for some more speed gains. Any chance of seeing some straight single-threaded tests on the same system? I want to rebuild tdeint to see if I can't eke out some more performance as well.

    @osgZach
    It's great that there aren't any crashes anymore! I don't think trimming frames from a source should kill multithreading, but decimate has an unusual access pattern. I guess you can't win them all.

    Last edited by JoshyD; 14th March 2010 at 17:36.
    JoshyD is offline   Reply With Quote
    Old 14th March 2010, 17:37   #156  |  Link
    Registered User
     
    Join Date: Feb 2009
    Location: USA
    Posts: 651
    Crap keyboard.. lost my post to ill-placed Forward/Backward keys...

    Anyway, it's technically 2 different sources being mixed in, so whatever the case, the problem probably lies somewhere in there, with one or both of them. I know Mode 3 was recommended for Trim at the very least.

    Any chance of grabbing the latest DGindex/decode source (1.5.8) and seeing if it will compile for x64? I'd give it a shot but if it needed changes I'd be clueless about that stuff..

    It would certainly make managing Yatta a lot easier (right now I have to set up two copies, using older x86 source to make projects, since the x64 DLL we have is only 1.4.6).
    If anyone that has time could look into it for that matter, it'd be great.

    Last edited by osgZach; 14th March 2010 at 17:40.
    osgZach is offline   Reply With Quote
    Old 14th March 2010, 18:08   #157  |  Link
    Registered User
     
    Join Date: Dec 2004
    Location: Melbourne, AU
    Posts: 1,963
    I just had a look on my HDD and it seems at some point I did make an x64 build of DGDecode 1.5.4: //www.mediafire.com/?dl4fc2yyyzz

    It seems to work but I have no idea if it's faster than the old build or if anything apart from the .d2v identifier was changed. Too bad the author has no regard for backwards compatibility.
    squid_80 is offline   Reply With Quote
    Old 14th March 2010, 18:43   #158  |  Link
    Registered User
     
    Join Date: Jan 2007
    Posts: 530
    FWIW, just did a test with an SD MPEG2 source, encoding with x264, using the 3/13 AVISynth64:

    AVS:
    Code:
    #LoadPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins\mvtools2.dll")
    LoadPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins\DGDecode.dll")
    LoadPlugin("C:\Program Files (x86)\AviSynth 2.5\plugins\tivtc.dll")
    SetMTmode(2,8)
    Mpeg2Source("lotr.d2v")
    #Insert Deinterlacer
    tfm(last,d2v="lotr.d2v").tdecimate()
    #Applying Resizing
    LanczosResize(720,352,0,62,-0,-66)
    x264 v1471
    64: x264-64bit.exe --crf 20 --preset medium --threads auto --tune film --sar 32:27 --output "C:\Temp\lotr64.mkv" lotr64.avs
    32: x264.exe --crf 20 --preset medium --threads auto --tune film --sar 32:27 --output "C:\Temp\lotr.mkv" lotr.avs

    64bit chain: encoded 16356 frames, 79.40 fps, 1345.89 kb/s
    32bit chain: encoded 16356 frames, 70.57 fps, 1345.89 kb/s

    Using Athlon II (620) O/C'd to 3.5 GHz
    Win7 Ult x64
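To put the two fps figures on a common footing, the relative gain can be worked out as below. This is just the arithmetic on the numbers quoted above; `percent_gain` is a throwaway helper name.

```python
def percent_gain(baseline_fps, new_fps):
    """Relative speed-up of new_fps over baseline_fps, in percent."""
    return (new_fps / baseline_fps - 1.0) * 100.0

# 32-bit chain vs 64-bit chain from this test
print(f"{percent_gain(70.57, 79.40):.1f}%")  # 12.5%
```

The same helper gives about 8.7% for the 27.18 vs 29.54 first-pass numbers posted earlier in the thread.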
    noee is offline   Reply With Quote
    Old 14th March 2010, 19:14   #159  |  Link
    Registered User
     
    Join Date: Feb 2009
    Location: USA
    Posts: 651
    Thanks Squid, I suppose it's better than nothing

    I don't have any issues with using old versions, as long as nothing major has changed since then.. But when I open the D2V files from two different versions and they look different it kind of makes me nervous I'm not getting the best indexing of my source.
    osgZach is offline   Reply With Quote
    Old 14th March 2010, 19:47   #160  |  Link
    Registered User
     
    Join Date: Feb 2010
    Posts: 84
    @noee
    I'm guessing this means that you've got tivtc working? If so, that's great news. Those results seem about in line with what I expected: somewhere between 10-20% faster when using x64 code.

    @squid_80
    You've got a DGDecode listed on your webpage along with source, but checking the version info indicates it's 1.4.6. Any chance you have the 1.5.6 source on hand and I could take a peek at it? Also, may I add that to the first post?

    @kemuri-_9
    Can I link your FFMS2.dll on the first post? I can move it over to mediafire if you don't want to waste bandwidth on hosting it locally.

    @turbojet
    Autocrop is built and working for me. Link is on the first post.

    Last edited by JoshyD; 14th March 2010 at 20:20.
    JoshyD is offline   Reply With Quote