4x AMD Radeon R9 Fury X
Author | Message |
---|---|
Louis Loria II Send message Joined: 20 Oct 03 Posts: 259 Credit: 9,208,040 RAC: 24 |
okie dokie... installed the BETA. No problems as far as I can tell at this point. I am still running multiple WUs with no hiccups. We'll see what happens over the next day or so... AMD FX-8350 at 4300 MHz, 16 GB of G-Skill Ripjaws RAM at 1600 MHz, 2x PowerColor R9 280X at 1030/1500 MHz, Gigabyte GA970-UD3 motherboard, Samsung EVO 850 SSD, EVGA Supernova P2 1200W PSU |
Sutaru Tsureku Send message Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
About bench test runs... I have no idea when AMD will fix the newly added bug (for the new chips). AstroPulse... 'Windows AP bench 211 minimal' and... In the past I used the '2LC67' AP WU from 'Zblank shortened WUs' (for the J1900 iGPU and NV GT730). For the R9 Fury X I should use the '9LC67' AP WU, right? Same execution as last time? Or are there newly added cmdline params? MultiBeam... 'MBbench 2.10'? And which WU? Just the following cmdline params, or more? -sbs N -period_iterations_num N -spike_fft_thresh N -tune 1 N N N -oclfft_tune_gr N -oclfft_tune_lr N -oclfft_tune_wg N -oclfft_tune_ls N -oclfft_tune_bn N -oclfft_tune_cw N [EDIT: What is the valid range of values for each param?] Are they all independent, or are some cmdline params connected (like AP -ffa_block N and -ffa_block_fetch N)? I now have 4 identical VGA cards. If I execute one 'bench .cmd tool', it will run on GPU#0, right? If I execute a 2nd instance of the 'bench .cmd tool', will it run on GPU#1, or also on GPU#0? (I could speed up the bench test runs if GPU#0, #1 and #2 were used simultaneously (3 times the same cmdline params).) Thanks. |
Sutaru Tsureku Send message Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
BTW. What are the default cmdline settings for AP and MB for a 64 compute units VGA card? |
Mike Send message Joined: 17 Feb 01 Posts: 34253 Credit: 79,922,639 RAC: 80 |
BTW. Just run a bench without command line settings and check stderr.txt. With each crime and every kindness we birth our future. |
Mike Send message Joined: 17 Feb 01 Posts: 34253 Credit: 79,922,639 RAC: 80 |
AstroPulse... 'Windows AP bench 211 minimal' and... Bench script for both is 2.13. Task for AP is 9LC67. For MB use PG0395, PG444 and PG1327. You can make 4 folders for bench runs, one for each GPU. In the command line just add -device 0 to -device 3 to use all GPUs. To test oclFFT planning and tune params you need to understand how it works. I certainly won't give lessons in FFT kernels. There are hundreds of possibilities. Here is a small example of one of my benches. #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 1 1 2 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 1 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 1 1 8 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 32 -oclfft_tune_cw 32 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 1 1 16 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 32 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 1 1 32 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 1 1 64 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 16 -oclfft_tune_cw 32 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 1 1 128 
-oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 16 -oclfft_tune_cw 16 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 1 1 256 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 2 1 1 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 4 1 1 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 8 1 1 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 16 1 1 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 32 1 1 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 64 1 1 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 128 1 1 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 256 1 1 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 
#MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 2 1 2 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 2 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 2 1 8 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 2 1 16 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 2 1 32 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 2 1 64 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 2 1 128 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 4 1 2 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 4 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 4 1 8 -oclfft_tune_gr 256 -oclfft_tune_lr 16 
-oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 4 1 16 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 4 1 32 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 4 1 64 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 8 1 2 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 8 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 8 1 16 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 8 1 32 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 8 1 64 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 16 1 2 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 
-spike_fft_thresh 2048 -tune 1 16 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 16 1 8 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 16 1 16 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 32 1 2 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 32 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 32 1 8 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 64 1 2 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 #MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe -device 0 -spike_fft_thresh 2048 -tune 1 128 1 2 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 With each crime and every kindness we birth our future. |
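A list like the one above can be generated rather than typed by hand. The following is only a minimal sketch: it reuses the executable name and the oclFFT values from the first example line, varies the second and fourth `-tune` numbers while the third stays at 1 (as in the examples), and sets `-device` per GPU. The function name `bench_lines` and the one-list-per-folder layout are hypothetical, and the lines are emitted without the leading `#` that the example uses, since its meaning for the bench script is not stated here.

```python
from itertools import product

# Executable name taken from the example above; replace with your own build.
APP = "MB7_win_x86_SSE_OpenCL_ATi_HD5_r2889.exe"

# oclFFT values copied from the first example line; held fixed in this sketch.
OCLFFT = {"gr": 256, "lr": 16, "wg": 256, "ls": 512, "bn": 64, "cw": 64}


def bench_lines(device, tune_values, spike_fft_thresh=2048):
    """Return one bench line per pair of (2nd, 4th) -tune values."""
    fft = " ".join(f"-oclfft_tune_{key} {value}" for key, value in OCLFFT.items())
    lines = []
    for x, z in product(tune_values, repeat=2):
        lines.append(
            f"{APP} -device {device} -spike_fft_thresh {spike_fft_thresh} "
            f"-tune 1 {x} 1 {z} {fft}"
        )
    return lines


if __name__ == "__main__":
    # Four GPUs -> four lists, one per bench folder (hypothetical layout).
    for device in range(4):
        lines = bench_lines(device, [1, 2, 4, 8])
        print(f"device {device}: {len(lines)} lines, first: {lines[0]}")
```

Whether every generated combination is valid for a given card is not checked here; that still has to come from the application's own output in stderr.txt.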
Sutaru Tsureku Send message Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
AstroPulse... 'Windows AP bench 211 minimal' and... Where can I download the 'Bench v2.13'? (http://lunatics.kwsn.info/index.php?module=Downloads) In which download folder are the three WUs PG0395, PG444 & PG1327? (Are the fastest settings the ones with which all three WUs are calculated fastest? (Mix of WUs in real life.)) As I understand it, these cmdline params are 'connected': -oclfft_tune_gr N -oclfft_tune_lr N -oclfft_tune_wg N -oclfft_tune_ls N -oclfft_tune_bn N -oclfft_tune_cw N What is the min value and what is the max value? min = 16 max = 512 So I can test 16, 32, 48, 64, 80 (always +16) ... up to 512 for all of the above-mentioned -oclfft_tune_* params? Example: 1.) -oclfft_tune_gr 16 -oclfft_tune_lr 16 -oclfft_tune_wg 16 -oclfft_tune_ls 16 -oclfft_tune_bn 16 -oclfft_tune_cw 16 2.) -oclfft_tune_gr 32 -oclfft_tune_lr 32 -oclfft_tune_wg 32 -oclfft_tune_ls 32 -oclfft_tune_bn 32 -oclfft_tune_cw 32 3.) -oclfft_tune_gr 48 -oclfft_tune_lr 48 -oclfft_tune_wg 48 -oclfft_tune_ls 48 -oclfft_tune_bn 48 -oclfft_tune_cw 48 (...) 32.) -oclfft_tune_gr 512 -oclfft_tune_lr 512 -oclfft_tune_wg 512 -oclfft_tune_ls 512 -oclfft_tune_bn 512 -oclfft_tune_cw 512 1.) -oclfft_tune_gr 16 -oclfft_tune_lr 16 -oclfft_tune_wg 16 -oclfft_tune_ls 16 -oclfft_tune_bn 16 -oclfft_tune_cw 16 2.) -oclfft_tune_gr 32 -oclfft_tune_lr 16 -oclfft_tune_wg 16 -oclfft_tune_ls 16 -oclfft_tune_bn 16 -oclfft_tune_cw 16 3.) -oclfft_tune_gr 48 -oclfft_tune_lr 16 -oclfft_tune_wg 16 -oclfft_tune_ls 16 -oclfft_tune_bn 16 -oclfft_tune_cw 16 (...) 32.)-oclfft_tune_gr 512 -oclfft_tune_lr 16 -oclfft_tune_wg 16 -oclfft_tune_ls 16 -oclfft_tune_bn 16 -oclfft_tune_cw 16 1.) -oclfft_tune_gr 16 -oclfft_tune_lr 16 -oclfft_tune_wg 16 -oclfft_tune_ls 16 -oclfft_tune_bn 16 -oclfft_tune_cw 16 2.) -oclfft_tune_gr 16 -oclfft_tune_lr 32 -oclfft_tune_wg 16 -oclfft_tune_ls 16 -oclfft_tune_bn 16 -oclfft_tune_cw 16 3.) 
-oclfft_tune_gr 16 -oclfft_tune_lr 48 -oclfft_tune_wg 16 -oclfft_tune_ls 16 -oclfft_tune_bn 16 -oclfft_tune_cw 16 (...) 32.)-oclfft_tune_gr 16 -oclfft_tune_lr 512 -oclfft_tune_wg 16 -oclfft_tune_ls 16 -oclfft_tune_bn 16 -oclfft_tune_cw 16 1.) -oclfft_tune_gr 16 -oclfft_tune_lr 16 -oclfft_tune_wg 16 -oclfft_tune_ls 16 -oclfft_tune_bn 16 -oclfft_tune_cw 16 2.) -oclfft_tune_gr 16 -oclfft_tune_lr 16 -oclfft_tune_wg 32 -oclfft_tune_ls 16 -oclfft_tune_bn 16 -oclfft_tune_cw 16 3.) -oclfft_tune_gr 16 -oclfft_tune_lr 16 -oclfft_tune_wg 48 -oclfft_tune_ls 16 -oclfft_tune_bn 16 -oclfft_tune_cw 16 (...) 32.)-oclfft_tune_gr 16 -oclfft_tune_lr 16 -oclfft_tune_wg 512 -oclfft_tune_ls 16 -oclfft_tune_bn 16 -oclfft_tune_cw 16 1.) -oclfft_tune_gr 16 -oclfft_tune_lr 16 -oclfft_tune_wg 16 -oclfft_tune_ls 16 -oclfft_tune_bn 16 -oclfft_tune_cw 16 2.) -oclfft_tune_gr 16 -oclfft_tune_lr 16 -oclfft_tune_wg 16 -oclfft_tune_ls 32 -oclfft_tune_bn 16 -oclfft_tune_cw 16 3.) -oclfft_tune_gr 16 -oclfft_tune_lr 16 -oclfft_tune_wg 16 -oclfft_tune_ls 48 -oclfft_tune_bn 16 -oclfft_tune_cw 16 (...) 32.)-oclfft_tune_gr 16 -oclfft_tune_lr 16 -oclfft_tune_wg 16 -oclfft_tune_ls 512 -oclfft_tune_bn 16 -oclfft_tune_cw 16 1.) -oclfft_tune_gr 16 -oclfft_tune_lr 16 -oclfft_tune_wg 16 -oclfft_tune_ls 16 -oclfft_tune_bn 16 -oclfft_tune_cw 16 2.) -oclfft_tune_gr 16 -oclfft_tune_lr 16 -oclfft_tune_wg 16 -oclfft_tune_ls 16 -oclfft_tune_bn 32 -oclfft_tune_cw 16 3.) -oclfft_tune_gr 16 -oclfft_tune_lr 16 -oclfft_tune_wg 16 -oclfft_tune_ls 16 -oclfft_tune_bn 48 -oclfft_tune_cw 16 (...) 32.)-oclfft_tune_gr 16 -oclfft_tune_lr 16 -oclfft_tune_wg 16 -oclfft_tune_ls 16 -oclfft_tune_bn 512 -oclfft_tune_cw 16 1.) -oclfft_tune_gr 16 -oclfft_tune_lr 16 -oclfft_tune_wg 16 -oclfft_tune_ls 16 -oclfft_tune_bn 16 -oclfft_tune_cw 16 2.) -oclfft_tune_gr 16 -oclfft_tune_lr 16 -oclfft_tune_wg 16 -oclfft_tune_ls 16 -oclfft_tune_bn 16 -oclfft_tune_cw 32 3.) 
-oclfft_tune_gr 16 -oclfft_tune_lr 16 -oclfft_tune_wg 16 -oclfft_tune_ls 16 -oclfft_tune_bn 16 -oclfft_tune_cw 48 (...) 32.)-oclfft_tune_gr 16 -oclfft_tune_lr 16 -oclfft_tune_wg 16 -oclfft_tune_ls 16 -oclfft_tune_bn 16 -oclfft_tune_cw 512 Then: 1.) -oclfft_tune_gr 16 -oclfft_tune_lr 32 -oclfft_tune_wg 32 -oclfft_tune_ls 32 -oclfft_tune_bn 32 -oclfft_tune_cw 32 2.) -oclfft_tune_gr 32 -oclfft_tune_lr 32 -oclfft_tune_wg 32 -oclfft_tune_ls 32 -oclfft_tune_bn 32 -oclfft_tune_cw 32 3.) -oclfft_tune_gr 48 -oclfft_tune_lr 32 -oclfft_tune_wg 32 -oclfft_tune_ls 32 -oclfft_tune_bn 32 -oclfft_tune_cw 32 (...) 32.)-oclfft_tune_gr 512 -oclfft_tune_lr 32 -oclfft_tune_wg 32 -oclfft_tune_ls 32 -oclfft_tune_bn 32 -oclfft_tune_cw 32 (...) 1.) -oclfft_tune_gr 16 -oclfft_tune_lr 48 -oclfft_tune_wg 48 -oclfft_tune_ls 48 -oclfft_tune_bn 48 -oclfft_tune_cw 48 2.) -oclfft_tune_gr 32 -oclfft_tune_lr 48 -oclfft_tune_wg 48 -oclfft_tune_ls 48 -oclfft_tune_bn 48 -oclfft_tune_cw 48 3.) -oclfft_tune_gr 48 -oclfft_tune_lr 48 -oclfft_tune_wg 48 -oclfft_tune_ls 48 -oclfft_tune_bn 48 -oclfft_tune_cw 48 (...) 32.)-oclfft_tune_gr 512 -oclfft_tune_lr 48 -oclfft_tune_wg 48 -oclfft_tune_ls 48 -oclfft_tune_bn 48 -oclfft_tune_cw 48 (...) So that I would have tested 16 (in +16 steps) up to 512 in all 6 -oclfft_tune_* positions, with all possible combinations? Then the (?) independent cmdline params: -tune 1 N N N = all possible entries, as for AstroPulse; -sbs N = 32, 64, 128, 256, 512; -period_iterations_num N = 10 (in +1 steps) up to 40; -spike_fft_thresh N = 512, 1024, 2048, 4096 ...yes? For AstroPulse I have already made bench test runs... For MultiBeam it will be the first time. Maybe there is already a manual how to make MB bench test runs? Thanks. |
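For scale, the arithmetic behind the plan above can be checked quickly. This sketch only counts runs for the six `-oclfft_tune_*` params with the proposed values 16, 32, ... 512. It compares varying one param at a time against the full grid. The one-minute-per-run figure is purely an assumption for illustration, not a measured bench time, and the count says nothing about which combinations the application actually accepts.

```python
# Values proposed in the post: 16, 32, 48, ... 512 for each oclFFT parameter.
values = list(range(16, 513, 16))
params = ["gr", "lr", "wg", "ls", "bn", "cw"]

# Vary one parameter at a time, the rest held at a shared baseline
# (the baseline itself is counted once).
one_at_a_time = len(params) * (len(values) - 1) + 1

# Every combination of every value in every position.
full_grid = len(values) ** len(params)

# Assumed for illustration only: one minute per bench run.
MINUTES_PER_RUN = 1
years = full_grid * MINUTES_PER_RUN / (60 * 24 * 365)

if __name__ == "__main__":
    print(f"values per param: {len(values)}")
    print(f"one-at-a-time runs: {one_at_a_time}")
    print(f"full grid runs: {full_grid}")
    print(f"full grid at {MINUTES_PER_RUN} min/run: about {years:.0f} years")
```

The full grid is 32^6, a little over a billion runs, so testing all combinations is not practical; restricting the search to a small set of candidate rows is the realistic option.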
Louis Loria II Send message Joined: 20 Oct 03 Posts: 259 Credit: 9,208,040 RAC: 24 |
okie dokie... installed the BETA. No problems as far as I can tell at this point. I am still running multiple WUs with no hiccups. We'll see what happens over the next day or so... Running the BETA for 3 days now, no problems here. <app_config> <app> <name>astropulse_v7</name> <max_concurrent>12</max_concurrent> <gpu_versions> <gpu_usage>.33</gpu_usage> <cpu_usage>.5</cpu_usage> </gpu_versions> </app> <app> <name>setiathome_v7</name> <max_concurrent>12</max_concurrent> <gpu_versions> <gpu_usage>.20</gpu_usage> <cpu_usage>.10</cpu_usage> </gpu_versions> </app> </app_config> |
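To read the app_config above: in BOINC, `gpu_usage` is the fraction of a GPU one task claims, so .33 allows three AstroPulse tasks per GPU and .20 allows five MultiBeam tasks per GPU, while `max_concurrent` caps the total per application. The sketch below works that out from the same XML, laid out over several lines. The helper name `summarize` is hypothetical, and the count of 2 GPUs is taken from the poster's signature earlier in the thread.

```python
import math
import xml.etree.ElementTree as ET

# Same settings as in the post, laid out for readability.
APP_CONFIG = """
<app_config>
  <app>
    <name>astropulse_v7</name>
    <max_concurrent>12</max_concurrent>
    <gpu_versions>
      <gpu_usage>.33</gpu_usage>
      <cpu_usage>.5</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>setiathome_v7</name>
    <max_concurrent>12</max_concurrent>
    <gpu_versions>
      <gpu_usage>.20</gpu_usage>
      <cpu_usage>.10</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
"""


def summarize(xml_text, gpu_count):
    """Map app name -> (tasks per GPU, total GPU tasks, CPU cores reserved)."""
    root = ET.fromstring(xml_text)
    summary = {}
    for app in root.findall("app"):
        gpu_usage = float(app.findtext("gpu_versions/gpu_usage"))
        cpu_usage = float(app.findtext("gpu_versions/cpu_usage"))
        limit = int(app.findtext("max_concurrent"))
        # Small epsilon guards against 1/0.2 landing just below 5.
        per_gpu = math.floor(1.0 / gpu_usage + 1e-9)
        total = min(limit, per_gpu * gpu_count)
        summary[app.findtext("name")] = (per_gpu, total, total * cpu_usage)
    return summary


if __name__ == "__main__":
    for name, (per_gpu, total, cpus) in summarize(APP_CONFIG, gpu_count=2).items():
        print(f"{name}: {per_gpu}/GPU, {total} total, {cpus:.1f} CPU cores")
```

Each total assumes only that application is running on the GPUs; with a mix of both, the two share the same cards, so the real counts are lower.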
Sutaru Tsureku Send message Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
Could someone answer the questions in my last message? Thanks. I can't make bench test runs until someone answers them... (Maybe someone from the Lunatics crew could start a thread about which cmdline settings are possible (for MB and AP (ATI)) and which values should be tested for slow and fast GPUs (by compute units)? This would be very helpful.) OK, installed Catalyst v15.10 Beta. MB with '-v 0 -hp -no_cpu_lock' and AP with '-v 0 -hp' in the cmdline.txt files. Count 0.5 -> 2 WUs/GPU. MB: after a short time, a few computation errors. AP: I started 8 WUs, all ended successfully. So does this mean that AP works well with 2 WUs/GPU, but MB doesn't? What's different? |
Sutaru Tsureku Send message Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
If I would like to let run SETI stock, without app_info.xml file, BOINC v7.6.9 would get all the apps (?): SETI@home v7 7.07 (opencl_ati5_cat132), 7.07 (opencl_ati5_nocal), 7.07 (opencl_ati5_sah), 7.07 (opencl_ati_cat132), 7.07 (opencl_ati_nocal), 7.07 (opencl_ati_sah); AstroPulse v7 7.09 (opencl_ati_100). BOINC says just: [4] AMD Fiji (4096MB) OpenCL: 2.0 (AFAIK, the VGA cards have 3072 MB VRAM.) No driver version. So does BOINC need an upgrade? Maybe BOINC will not get the *_cat132 apps? Thanks. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Maybe there is already a manual how to make MB bench test runs? Nope, you could be its author ;) To start with, look at these tables I created during oclFFT settings development. They could give some impression of what is possible and what isn't, using the C-60 APU's capabilities as a case study. Only green lines should be used. https://drive.google.com/file/d/0BwjTLNvsJmLBcEtaNG5xTUc4TDA/view?usp=sharing https://docs.google.com/spreadsheets/d/1bywjOlnPhTcpzk7UFl4T4ZPb2T0uS19l-ILQBQR3OS4/edit?usp=sharing https://docs.google.com/spreadsheets/d/1hgMumRHrYJ-R35xixyCI1_CHlczUE2f_M75jQ9euMJ8/edit?usp=sharing |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
If I would like to let run SETI stock, without app_info.xml file, BOINC v7.6.9 would get all the apps (?): It will get all allowed plan classes. Assuming the SETI main plan classes are the same as beta, look here: http://setiweb.ssl.berkeley.edu/beta/plan_class_spec.xml |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
What's different? The application used. The algorithms used inside the application. The exact sequence of GPU machine code inside the applications. Or do you think that if both apps use the GPU for computations they are identical? It would definitely be good to narrow the area of differences down to particular parts of the algorithm. This could help AMD fix driver issues. As an example of such an approach, look at the posts by Jacob Klein, who (despite my skepticism) was able to drive nVidia to fix a bug. IMHO the most important part of his endeavour was the ability to demonstrate the bug on nVidia's own SDK samples. Working with known code is usually much easier than debugging someone else's creation. So, at the very beginning of this thread I proposed that you try AMD's own SDK samples in an attempt to re-create similar invalid computations in their own code running simultaneously. This would make AMD more convinced of the bug's existence. Unfortunately, my proposal seems to have been ignored so far. So, I can only propose this "deal" to you: either you do such tests on hardware you own, or you buy me such hardware and I do those tests ;) Good deal? ;) ;) |
Sutaru Tsureku Send message Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
OK, there: http://developer.amd.com/tools-and-sdks/opencl-zone/amd-accelerated-parallel-processing-app-sdk AMD-SDK-InstallManager-v1.4.84.exe ? http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/12/AMD_APP_SDK_GettingStartedGuide.pdf (.PDF file) It looks like I download the InstallManager and execute the samples from the command prompt. No additional software/tools needed? This is just a quick message. I will look more deeply later at how to execute the samples correctly and get results. Is there something I shouldn't forget or should pay attention to? - - - - - - - - - - Maybe it would also be good if you owned such a Fury X VGA card? SETI would profit from it (app development)? Maybe a few SETI members could donate money so that you could buy such a VGA card? I paid €700 for one Fury X. I looked, now it is €650... If 13 members pay €50 each... But currently I have no idea how, in which way. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
AFAIK many tests have CPU-based verification. You should check that the GPU results pass that verification. Also, you should run a few samples simultaneously, or run a sample with BOINC GPU computation enabled (to emulate several GPU apps running simultaneously). - - - - - - - - - -
LoL, maybe, indeed :D Especially taking into account what today's change rate is... |
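The "run a few samples simultaneously" advice can be scripted. This is a minimal sketch that starts several commands at once and collects each exit code and its output, so a failed verification in any instance shows up. The command used here is a stand-in so the sketch runs anywhere; the path to an actual SDK sample and its verification option are not assumed and have to come from the SDK's own documentation. The function name `run_simultaneously` is hypothetical.

```python
import subprocess
import sys


def run_simultaneously(commands, timeout=600):
    """Start every command at once, then collect (exit code, output) for each."""
    # All processes are launched before any of them is waited on.
    procs = [
        subprocess.Popen(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
        )
        for cmd in commands
    ]
    results = []
    for proc in procs:
        # Outputs are read one process after another, which is fine for
        # short sample output but not for very chatty programs.
        out, _ = proc.communicate(timeout=timeout)
        results.append((proc.returncode, out))
    return results


if __name__ == "__main__":
    # Stand-in command; replace with the path to an SDK sample plus whatever
    # verification option that sample documents.
    stand_in = [sys.executable, "-c", "print('Passed!')"]
    for code, out in run_simultaneously([stand_in] * 3):
        print(code, out.strip())
```

Comparing the outputs of the simultaneous instances against a single-instance run of the same sample is one way to spot results that only go wrong under concurrent GPU load.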