Weird: AP6_win_x86_SSE2_OpenCL_NV_r2058.exe


Message boards : Number crunching : Weird: AP6_win_x86_SSE2_OpenCL_NV_r2058.exe

Ulrich Metzner
Volunteer tester
Joined: 3 Jul 02
Posts: 984
Credit: 8,479,995
RAC: 3,993
Germany
Message 1467970 - Posted: 24 Jan 2014, 2:23:13 UTC
Last modified: 24 Jan 2014, 2:24:07 UTC

Hi everyone,

As the title says, this is weird: the "AP6_win_x86_SSE2_OpenCL_NV_r2058.exe" executable takes next to nothing (I mean really nearly *nothing*) in CPU share on my Windows XP Pro 32-bit system, whereas the very same executable takes a *full* core on my Windows 8.1 Pro 64-bit machine. Both systems run the very same Nvidia driver version, 332.21, and I don't have a clue what's going on.
Is Windows XP still the Windows version to choose for really effective crunching?
I thought Windows 8.1 was mostly written in C/C++ again and therefore as fast as it could be?! :? :?

Any help/hints very much appreciated!
____________
Aloha, Uli

arkayn (Project donor)
Volunteer tester
Joined: 14 May 99
Posts: 3632
Credit: 48,591,178
RAC: 10,193
United States
Message 1467990 - Posted: 24 Jan 2014, 3:29:40 UTC - in response to Message 1467970.


From the XP machine...

Running on device number: 0
DATA_CHUNK_UNROLL set to:4
FFA thread block override value:2048
FFA thread fetchblock override value:1024
Maximum single buffer size set to:256MB
Sleep() & wait for event loops will be used in some places
Priority of worker thread raised successfully
Priority of process adjusted successfully, below normal priority class used


From the 8.1 machine...

Running on device number: 0
Priority of worker thread raised successfully
Priority of process adjusted successfully, below normal priority class used


Looks like you forgot to set the parameters on the 8.1 machine.
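
The comparison can also be done mechanically. The following is only a minimal sketch: the two excerpts are copied from the stderr output quoted in this post, while the `parse_settings` helper and the key names it produces are made up for illustration and are not part of BOINC or the AstroPulse application.

```python
# Stderr excerpts as quoted in the post above.
XP_LOG = """\
Running on device number: 0
DATA_CHUNK_UNROLL set to:4
FFA thread block override value:2048
FFA thread fetchblock override value:1024
Maximum single buffer size set to:256MB
Sleep() & wait for event loops will be used in some places
Priority of worker thread raised successfully
Priority of process adjusted successfully, below normal priority class used
"""

WIN81_LOG = """\
Running on device number: 0
Priority of worker thread raised successfully
Priority of process adjusted successfully, below normal priority class used
"""


def parse_settings(log: str) -> dict:
    """Hypothetical helper: collect 'name:value' lines and the Sleep() notice."""
    settings = {}
    for line in log.splitlines():
        if line.startswith("Sleep()"):
            settings["use_sleep"] = "on"
            continue
        name, sep, value = line.rpartition(":")
        if not sep:
            continue  # status lines without a colon carry no setting
        for suffix in (" set to", " value"):
            if name.endswith(suffix):
                name = name[: -len(suffix)]
        settings[name.strip()] = value.strip()
    return settings


xp = parse_settings(XP_LOG)
win81 = parse_settings(WIN81_LOG)

# Settings reported on the XP machine but absent on the 8.1 machine.
missing = sorted(set(xp) - set(win81))
for name in missing:
    print(f"{name}: {xp[name]} on XP, not set on 8.1")
```

Every tuning line, including the Sleep() notice, shows up only in the XP excerpt.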
____________

Ulrich Metzner
Volunteer tester
Joined: 3 Jul 02
Posts: 984
Credit: 8,479,995
RAC: 3,993
Germany
Message 1468001 - Posted: 24 Jan 2014, 4:10:13 UTC - in response to Message 1467990.
Last modified: 24 Jan 2014, 4:40:57 UTC

(...)
Looks like you forgot to set the parameters on the 8.1 machine.

Do you really think it depends on those parameters?
I only set these parameters on the XP machine because I want to watch DVB-C television on that machine from time to time, and if I omit the parameters the video stream stutters awfully. :? But OK, thanks, I'll give it a try on the 8.1 machine, let's see...

[edit]
You are right, the "-use_sleep" parameter is the culprit.
If I use this parameter on Windows 8.1, the processor usage drops from a full core to nearly zero! :O
GPU usage also drops from ~95% to 50-60%, but the overall performance seems to be the same! :O²
On Windows XP, GPU usage stays at ~95% regardless of this parameter! :O³
I'm puzzled, but I'll let it run for some time to see how it develops...
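
For readers wondering why a sleep option changes CPU usage so drastically, here is a toy simulation of the two waiting styles. It is not the application's actual code: the "kernel" is just a deadline on the host clock, and `wait_for_gpu` and `KERNEL_SECONDS` are hypothetical names chosen for this sketch.

```python
import time


def wait_for_gpu(done_at: float, use_sleep: bool) -> float:
    """Wait until the simulated kernel is done; return CPU seconds spent waiting."""
    cpu_start = time.process_time()
    while time.perf_counter() < done_at:
        if use_sleep:
            # Give the core back to the OS; the wake-up may come a bit late.
            time.sleep(0.001)
        # Otherwise keep polling as fast as the core allows (busy-wait).
    return time.process_time() - cpu_start


KERNEL_SECONDS = 0.2  # made-up stand-in for the run time of one GPU kernel

spin_cpu = wait_for_gpu(time.perf_counter() + KERNEL_SECONDS, use_sleep=False)
sleep_cpu = wait_for_gpu(time.perf_counter() + KERNEL_SECONDS, use_sleep=True)
print(f"busy-wait: {spin_cpu:.3f} s CPU, sleep loop: {sleep_cpu:.3f} s CPU")
```

The busy-wait burns roughly the whole waiting period as CPU time, while the sleeping loop uses almost none, at the price of noticing the finished kernel slightly later.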
____________
Aloha, Uli

juan BFB (Project donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 5265
Credit: 291,226,630
RAC: 465,833
Brazil
Message 1468042 - Posted: 24 Jan 2014, 6:26:44 UTC
Last modified: 24 Jan 2014, 6:30:24 UTC

With r2058 you could easily run 2 AP WUs at a time on your Fermi/Kepler GPUs with no pain, which will raise your GPU usage again. Of course YMMV, so you need to test whether crunching 2 at a time is better than 1 (it usually is).
____________

Claggy (Project donor)
Volunteer tester
Joined: 5 Jul 99
Posts: 4084
Credit: 32,976,540
RAC: 5,215
United Kingdom
Message 1468085 - Posted: 24 Jan 2014, 8:26:20 UTC - in response to Message 1468001.


With the "-use_sleep" parameter you've got to up the unroll and FFA thread block parameters to claw back GPU usage.
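
One plausible reading of this advice, not confirmed anywhere in the thread, is that larger unroll and block values give the GPU more work per launch, so the delay added by each sleep matters less. The toy model below makes that arithmetic concrete. It assumes the host only notices a finished kernel on its next sleep wake-up; `gpu_utilization` and all the numbers are invented for illustration.

```python
import math


def gpu_utilization(kernel_ms: float, sleep_tick_ms: float) -> float:
    """Busy fraction if a finished kernel is only noticed at the next sleep tick."""
    # The host wakes up every sleep_tick_ms; the GPU idles until the first
    # wake-up that falls at or after the kernel's completion.
    wall_ms = math.ceil(kernel_ms / sleep_tick_ms) * sleep_tick_ms
    return kernel_ms / wall_ms


TICK_MS = 1.0  # assumed sleep granularity, purely illustrative

for kernel_ms in (0.6, 4.8):  # made-up short kernel vs. one with 8x the work
    share = gpu_utilization(kernel_ms, TICK_MS)
    print(f"kernel {kernel_ms} ms -> GPU busy {share:.0%} of the time")
```

In this model the idle gap per launch stays below one tick, so stretching each kernel shrinks the idle share.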

Claggy

Ulrich Metzner
Volunteer tester
Joined: 3 Jul 02
Posts: 984
Credit: 8,479,995
RAC: 3,993
Germany
Message 1468089 - Posted: 24 Jan 2014, 8:34:45 UTC - in response to Message 1468085.


With the "-use_sleep" parameter you've got to up the unroll and FFA thread block parameters to claw back GPU usage.


Thank you, and yes, I came to the same conclusion. Here are the new parameters:
DATA_CHUNK_UNROLL set to:8
FFA thread block override value:6144
FFA thread fetchblock override value:1536
Maximum single buffer size set to:256MB
Sleep() & wait for event loops will be used in some places
Priority of worker thread raised successfully
Priority of process adjusted successfully, below normal priority class used


from here: http://setiathome.berkeley.edu/result.php?resultid=3347672063
You have to scroll down to the last restart of the executable.
____________
Aloha, Uli

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3409
Credit: 46,456,319
RAC: 8,069
Russia
Message 1468099 - Posted: 24 Jan 2014, 9:10:15 UTC
Last modified: 24 Jan 2014, 9:10:58 UTC

-use_sleep was introduced long before r2058, but because of a nasty NV OpenCL runtime bug it was almost non-functional.
In the latest builds that bug was identified and a workaround implemented.
The bug was filed with NV; they said they "will look", and there has been no response for more than 2 months now.
____________

Ulrich Metzner
Volunteer tester
Joined: 3 Jul 02
Posts: 984
Credit: 8,479,995
RAC: 3,993
Germany
Message 1468102 - Posted: 24 Jan 2014, 9:25:32 UTC

BTW:
On Mike's site (http://mikesworldnet.de/produkte.html) there are r2083 executables for AMD/ATI and for the CPU, but for Nvidia there is only version r2058. Did I miss something?
____________
Aloha, Uli

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3409
Credit: 46,456,319
RAC: 8,069
Russia
Message 1468180 - Posted: 24 Jan 2014, 13:12:36 UTC - in response to Message 1468102.

Perhaps not. Work in progress. Rebuilding the whole app set, especially with the new BOINC APIs, takes time, so sometimes it is better to skip a few iterations for the less essential flavours.
____________

jason_gee (Project donor)
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 4978
Credit: 73,314,594
RAC: 16,278
Australia
Message 1468185 - Posted: 24 Jan 2014, 13:26:07 UTC - in response to Message 1468180.
Last modified: 24 Jan 2014, 13:26:53 UTC

Perhaps not. Work in progress. Rebuilding the whole app set, especially with the new BOINC APIs, takes time, so sometimes it is better to skip a few iterations for the less essential flavours.


If you're currently rebuilding applications, you might want to consider linking to commode.obj (documented and supplied with the MS stuff) just to work around the stderr truncation issues (and possibly others). It's not a fix for BOINC's dated practices, but for now it seems to work around at least the stderr truncation issue.
____________
"It is not the strongest of the species that survives, nor the most intelligent that survives. It is the one that is the most adaptable to change."
Charles Darwin

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 3409
Credit: 46,456,319
RAC: 8,069
Russia
Message 1468212 - Posted: 24 Jan 2014, 14:40:00 UTC - in response to Message 1468185.

Thanks, Jason, will try it on the next build.
____________
