Any hazard in period_iterations_num too small?

Message boards : Number crunching : Any hazard in period_iterations_num too small?

Gene Project Donor

Joined: 26 Apr 99
Posts: 150
Credit: 48,393,279
RAC: 118
United States
Message 1831246 - Posted: 18 Nov 2016, 17:53:27 UTC

It is only a modest 750Ti GPU, but I like to play with the command line parameters to get the best I can from it. I am running the nvidia_opencl_SoG application in an isolated "benchmark" configuration and trying various parameters on the same work units. With the other parameters (of the oclfft_tune_* set) copied from other threads, I am exploring the effect of the -period_iterations_num value. Lowering that parameter to 40, from the default 50, helps about 16%. Below 40 there's not much further improvement (less than 10%), although I've only tried it down to 10.
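
A sweep like the one described here is easy to script. The following is only a rough sketch under stated assumptions, not the actual benchmark tooling: `./bench_app` is a hypothetical launcher standing in for whatever starts the application on a fixed work unit, and only the -period_iterations_num flag itself comes from this thread.

```python
import subprocess
import time


def timed_run(cmd):
    """Run cmd to completion and return the wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def example_cmd(value):
    # HYPOTHETICAL command line: "./bench_app" is a placeholder, not a real
    # SETI@home binary. Substitute your own benchmark launcher here.
    return ["./bench_app", "-period_iterations_num", str(value)]


def sweep(values, build_cmd=example_cmd, runner=timed_run):
    """Time one run per candidate value; returns {value: elapsed_seconds}."""
    return {v: runner(build_cmd(v)) for v in values}


# Example (commented out because it would launch real processes):
# results = sweep([50, 40, 20, 10, 7, 3, 1])
```

Passing the runner in as a parameter keeps the timing logic separate from the launcher, so the same loop can be reused with a different application or with a dry-run stand-in.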

The question: for just benchmark testing, is there any (software) hazard in trying period_iterations_num down to 9...8...7... ? Is there a driver limit beyond which the GPU might freeze up? I am guessing (hoping?) that the driver, and the application code, will override a value that is beyond the hardware capability and just do the best they can, with a possible loss of performance as a by-product. That would show up in the benchmark times and tell me clearly that I've gone too far. (Slow screen refresh, and other artifacts of graphics resource contention, are not an issue at this point.)

-Gene-
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1831270 - Posted: 18 Nov 2016, 19:33:04 UTC

[joking mode on]Yes, if you set it too small, you could develop a black hole in your RAM, which will swallow up your computer and make it invisible.[/joking mode off]
"Freedom is just Chaos, with better lighting." Alan Dean Foster

Shaggie76
Joined: 9 Oct 09
Posts: 282
Credit: 271,858,118
RAC: 196
Canada
Message 1831296 - Posted: 18 Nov 2016, 21:33:28 UTC

With lower values I've found I had to disable TDR checking in Windows. Even with that turned off I've had system lockups when I go below 10 in multi-gpu setups.

I cannot testify to Linux, although I'll be switching one of my crunchers soon, so I'll find out, I guess!
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1831316 - Posted: 18 Nov 2016, 23:06:16 UTC - in response to Message 1831296.  


I'm running my i7 with 1* GTX750Ti & 1* GTX1070 under Win10 with period_iterations_num set to 3.
Other than some screen/keyboard lag (particularly on the monitor connected to the GTX 750Ti, which I can live with), I've had no issues. I did have it set to 1, but the lag was too much to tolerate; it wouldn't be an issue for a dedicated cruncher.
Grant
Darwin NT
bluestar

Joined: 5 Sep 12
Posts: 7015
Credit: 2,084,789
RAC: 3
Message 1831331 - Posted: 19 Nov 2016, 0:42:01 UTC

Try that with a .vlar task.
Rune Bjørge

Joined: 5 Feb 00
Posts: 45
Credit: 30,508,204
RAC: 5
Norway
Message 1831418 - Posted: 19 Nov 2016, 14:08:50 UTC - in response to Message 1831331.  

Running with a low number for period_iterations_num does not cause any damage to your hardware. The only thing you might run into is the Timeout Detection and Recovery (TDR) problem with its bluescreens. You could also experience display lag if the number is too low.

I've pushed my GTX Titans to run with a number as low as 2 in a rig running 3 Titans. I had to do some tweaks to the TDR system to get it running stably, but now it is crunching like a dream. Lag is also within what I can live with.
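
For anyone unsure what the TDR tweaks refer to: on Windows, TDR resets the graphics driver when the GPU does not respond within a timeout, and to my knowledge its behaviour is governed by the TdrLevel and TdrDelay values under the GraphicsDrivers registry key (the documented default delay is 2 seconds). The sketch below only reads those values and changes nothing. It is an illustrative assumption about how one might check the current state, not a record of the tweaks actually made on this rig.

```python
import sys

# Registry path (under HKEY_LOCAL_MACHINE) that holds the TDR settings.
TDR_KEY = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def read_tdr_settings():
    """Return {'TdrLevel': ..., 'TdrDelay': ...}, or None when not on Windows.

    A value of None inside the dict means the registry value is not set,
    in which case Windows falls back to its built-in default.
    """
    if sys.platform != "win32":
        return None
    import winreg  # Windows-only standard library module

    values = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TDR_KEY) as key:
        for name in ("TdrLevel", "TdrDelay"):
            try:
                values[name], _value_type = winreg.QueryValueEx(key, name)
            except FileNotFoundError:
                values[name] = None
    return values


print(read_tdr_settings())
```

Changing these values requires administrator rights and a reboot, and raising the timeout trades responsiveness for fewer driver resets, so it is worth noting the original values before editing anything.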
Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1831513 - Posted: 20 Nov 2016, 0:29:33 UTC - in response to Message 1831316.  



. . Hi Grant,

. . Did you try swapping the cards around and running the monitor off the 1080? I would think it would be less affected by lag at the lower values of period_iterations_num.

Stephen

.
Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1831515 - Posted: 20 Nov 2016, 0:41:49 UTC - in response to Message 1831246.  



. . Hi Gene

. . Just for a comparison: I am running two GTX950s on an old Pentium D 930 rig, with -period_iterations_num set to 5 and no problems. I did have it set to 3, but the lag became annoying. I don't even know what TDR is or how to set it, but that rig just keeps on chugging away for me. To be honest, I don't know that there was any significant improvement in runtimes when it was set to 3, so I am happy where it is.

Stephen

.
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1831516 - Posted: 20 Nov 2016, 0:48:05 UTC - in response to Message 1831513.  

. . Did you try swapping the cards around and running the monitor off the 1080?

One monitor on the GTX 1070, another on the GTX 750Ti. The effect on the GTX 1070 is minimal.
Grant
Darwin NT
Gene Project Donor

Joined: 26 Apr 99
Posts: 150
Credit: 48,393,279
RAC: 118
United States
Message 1831861 - Posted: 22 Nov 2016, 5:51:09 UTC - in response to Message 1831316.  

Encouraged by the replies, I went ahead with benchmark runs on three different work units, one AR=11.9 and two AR=0.01, using the Linux x86_64 opencl_nvidia_SoG application. I tested -period_iterations_num all the way down to "1" with no adverse consequences. Below 7, however, there was no improvement in run times, so I've settled on 7 as the configuration to use. For the AR=11.9 work unit there was NO IMPROVEMENT for any value smaller than the default. Its result file showed no "pulse" detections, so I guess the "period iterations" part of the code was never executed, or was of trivial length.
In contrast, for the longer-running vlar work (27no15ac...vlar), which reported 7 pulse detections, here are a few benchmark numbers:

per_it_num      elapsed seconds   relative speed
50 (default)    1996              100%
40              1747              114%
20              1672              119%
10              1611              124%
7               1569              127%
3               1578              126%
1               1568              127%
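
The relative speed column is just the default run time divided by each run time. A small Python sketch that recomputes it from the elapsed seconds listed above (the function name is mine, chosen for illustration):

```python
# Elapsed seconds per -period_iterations_num value, copied from the table.
elapsed = {50: 1996, 40: 1747, 20: 1672, 10: 1611, 7: 1569, 3: 1578, 1: 1568}


def relative_speed(times, baseline=50):
    """Speed relative to the baseline setting, as a whole-number percentage."""
    base = times[baseline]
    return {n: round(100 * base / t) for n, t in times.items()}


for n, pct in relative_speed(elapsed).items():
    print(f"{n:>3}  {elapsed[n]:>5} s  {pct}%")
```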

Yeah, o.k., so 1 tick better at num=1 but it was 1 tick worse on a blc work unit, thus, to my mind, not worth the potential screen lag impact.
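
One way to make that judgment call mechanical, offered only as a sketch: take the largest setting (the one least likely to cause lag) whose run time is within some tolerance of the best time. The 1% tolerance is an arbitrary assumption, not something measured here.

```python
# Elapsed seconds per -period_iterations_num value, from the vlar runs.
elapsed = {50: 1996, 40: 1747, 20: 1672, 10: 1611, 7: 1569, 3: 1578, 1: 1568}


def pick_setting(times, tolerance=0.01):
    """Largest setting whose time is within `tolerance` of the fastest run."""
    best = min(times.values())
    close_enough = [n for n, t in times.items() if t <= best * (1 + tolerance)]
    return max(close_enough)


print(pick_setting(elapsed))
```

With these numbers the function lands on 7, which matches the value chosen by eye.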
And, for those who might be curious, the CPU is an AMD FX-4300 with its "handicapped" floating-point design; I don't know if that is significant in this context. But isn't that the point of benchmarking? I.e., to see what's best for the actual hardware in use?

@kittyman
[joke_mode] I feared the keyboard LEDs might revert to DEDs (Dark Emitting Diodes) and black out the entire room! [/joke_mode]
